This repository has been archived by the owner on Dec 8, 2022. It is now read-only.

Fetching data with PHP #288

Closed
cfecherolle opened this issue Jan 8, 2013 · 32 comments
@cfecherolle

Hi!

I've been really impressed by your app so far and by what it's able to do, and I thought that maybe you could give me some advice.
If you can't, that's okay :)

I've been looking for a solution all over the internet, and you are pretty much my last hope for getting this project done.

I'm currently developing, for private use, a website that would let my friends (or other people) view the statistics of the mobile applications I have developed. The statistics would be fetched from iTunes Connect and the Google Developer Console.

So far, I've managed to do interesting things for the iTunes part of the work (fetching data, getting it into a database... basic stuff, you might say) because there are existing scripts and APIs that I can tweak to fit my needs. But I can't figure out how to gain access to the Developer Console's data.

Right now, I'm looking for guidelines on how to get data from the Google Developer Console, which doesn't seem very friendly for this kind of use.

I already tried to send POST requests (based on what I sniffed while looking at the console) after authenticating in PHP via the OAuth 2 process. I couldn't test my authentication because I didn't know what requests I should use. And since there isn't any API that lets me communicate with Google for this developer console, I'm stuck.

I've read your information/TODO page about it, but it seems like the login part is still on the high-level to-do list... :(

I don't know if you've made recent progress on all this, but I'd be glad to have any further information, since you seem well informed on the subject.

Keep up the good work! :D

@willlunniss
Contributor

That wiki article is very out of date; we need to find some time to update it. All of the items on the to-do list are now sorted except for multi-connected developer consoles, and we are using the new dev console in our current release.

See #285 for a related question

You would need to use the PasswordAuthenticator, or similar code, to log in and get a token. Have a look through the console.v2 classes, then come back and ask questions as needed. I don't really know any PHP, so I cannot help you with that side of things, but I can help with the rest.

@cfecherolle
Author

Thank you for your quick answer. I have started to study the code from PasswordAuthenticator, DevConsoleV2 and BaseAuthenticator, but I don't understand what GALX represents, as I see it several times in the code but can't figure out what kind of data it is.

@nelenkov
Contributor

nelenkov commented Jan 8, 2013

It is just a temporary cookie you get from Google. You need to pass it along when authenticating.

@cfecherolle
Author

Okay, thanks. I was pretty sure it would contain some useful auth information. (I'm curious btw, what do those letters stand for?)

I'm trying to follow the approach of your PasswordAuthenticator. For now, I have made the same GET request (I'm using cURL on the PHP side), but the only cookie I seem to get from the response is named GAPS... not GALX. Maybe there is something I didn't understand.

Also, I saw that you often use the CookieStore Java class. There is no such class in PHP, but with the extension I'm using for HTTP requests, I might be able to put the cookies in a file after each request and then retrieve them when needed. I would rather not have to parse this file, to save some time, but if it turns out to be necessary I'll do it. At this point, I'm not sure.

So, is it really important to have them stored in such a way (similar to CookieStore in Java), or can I simply use variables?

@willlunniss
Contributor

Not sure what GALX stands for...

Check https://github.com/AndlyticsProject/andlytics/blob/master/src/com/github/andlyticsproject/console/v2/HttpClientFactory.java which sets a number of properties used for the GET request.

I don't think you need to use a CookieStore (which is probably just backed by a hashmap). It's just a convenience class, so variables would be fine (and that is what the old version uses).

@cfecherolle
Author

I've been trying to get this GALX value by parsing the response of my GET request, but I can't find how to retrieve it. I can see it under the cookie: header in the network panel of Chrome's debug console, though.

I noticed there are two requests upon loading this page: https://accounts.google.com/ServiceLogin?service=androiddeveloper

One of them has the GALX value I'm looking for, but under the cookie: header.
This cookie: header is in the Request header, not the Response header (according to what I observed with Chrome).

When I check the headers fetched after executing the GET request in my page, it looks like I only have the Response Header part available... For example I can easily get the set-cookie: header which starts like "GAPS=" but not this GALX value...

I don't have much experience with HTTP requests, so I'm sorry if my question is, in fact, dumb or trivial.

@nelenkov
Contributor

nelenkov commented Jan 8, 2013

Cookie names, etc. are not documented so you have to guess. If you want to do this you have to get familiar with how HTTP, cookies, etc. work. Generally, you get cookies from the server in the Set-Cookie header(s), and the client sends them back with Cookie header(s).
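
The Set-Cookie / Cookie round trip described above can be sketched in a few lines of Python with the standard library's http.cookies module. This is only an illustration: the header values below are made up, and the helper name cookie_header_from is hypothetical. Real Google cookies will look different.

```python
from http.cookies import SimpleCookie


def cookie_header_from(set_cookie_values):
    """Build the value of a Cookie request header from Set-Cookie response values."""
    pairs = []
    for raw in set_cookie_values:
        parsed = SimpleCookie()
        parsed.load(raw)  # parses "NAME=value; Path=/; Secure" style strings
        for name, morsel in parsed.items():
            # Only the name/value pair is sent back; attributes such as Path are dropped.
            pairs.append(f"{name}={morsel.value}")
    return "; ".join(pairs)


# Made-up Set-Cookie values, as a server might send them in a response.
example_response_cookies = [
    "GAPS=abc123; Path=/; Secure; HttpOnly",
    "GALX=xyz789; Path=/; Secure",
]

print(cookie_header_from(example_response_cookies))
```

An HTTP client with a cookie store does this bookkeeping for you on every request, which is why using one is usually less error-prone than copying values by hand.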

Besides setting the proper headers, you also need to make sure your HTTP client follows redirects automatically, or handle them properly yourself. cURL should be able to handle this, as well as maintain the session cookie store for you. If not, try to find an HTTP library that does; otherwise you have to implement the whole thing yourself.
Here's some auth code in Python, which should be more readable than the Java version (urllib.request does all of the things listed above):

import getpass
import re
import urllib.parse
import urllib.request

email = input("Enter your Google username: ")
password = getpass.getpass("Enter your password: ")

# The cookie processor stores session cookies between requests;
# the opener also follows redirects automatically.
cookie_processor = urllib.request.HTTPCookieProcessor()
opener = urllib.request.build_opener(cookie_processor)
urllib.request.install_opener(opener)

# Define URLs
login_page_url = 'https://accounts.google.com/ServiceLogin?service=androiddeveloper'
authenticate_url = 'https://accounts.google.com/ServiceLoginAuth?service=androiddeveloper'
dev_console_url = 'https://play.google.com/apps/publish/v2/'

# Load sign in page
login_page_contents = opener.open(login_page_url).read().decode('utf-8', 'replace')

# Find GALX value (a hidden form field in the sign in page)
galx_match_obj = re.search(r'name="GALX"\s*value="([^"]+)"', login_page_contents, re.IGNORECASE)

# Fall back to an empty string if the field was not found
galx_value = galx_match_obj.group(1) if galx_match_obj is not None else ''
print("GALX: " + galx_value)

# Set up login credentials (POST data must be bytes)
login_params = urllib.parse.urlencode({
   'Email' : email,
   'Passwd' : password,
   'continue' : dev_console_url,
   'GALX': galx_value
}).encode('utf-8')

# Login
auth_contents = opener.open(authenticate_url, login_params).read()
print(auth_contents.decode('utf-8', 'replace'))
with open("dev-console.html", "wb") as f:
    f.write(auth_contents)

# Show the cookies collected during the session
for c in cookie_processor.cookiejar:
    print(c)
    print("\n")

#dev_console_contents = opener.open(dev_console_url).read()
#print(dev_console_contents)

@cfecherolle
Author

Thanks a lot! I've been able to get the GALX value (by finally using cURL's cookie jar feature and parsing the resulting file; I couldn't find another way...)
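
For anyone taking the same route, here is a rough Python sketch of reading a cookie jar file of the kind cURL writes (Netscape format: seven tab-separated fields per cookie, with an expiry of 0 for session cookies). The function name read_cookie_jar and the sample values are made up for illustration.

```python
def read_cookie_jar(text):
    """Return {name: value} for every cookie in Netscape/cURL cookie-jar text."""
    cookies = {}
    for line in text.splitlines():
        # cURL marks HttpOnly cookies with this prefix; strip it so the line parses normally.
        if line.startswith("#HttpOnly_"):
            line = line[len("#HttpOnly_"):]
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # not a cookie line
        # Fields: domain, subdomain flag, path, secure flag, expiry, name, value
        name, value = fields[5], fields[6]
        cookies[name] = value
    return cookies


# Made-up jar contents; an expiry of 0 marks a session cookie.
sample_jar = (
    "# Netscape HTTP Cookie File\n"
    "\n"
    "accounts.google.com\tFALSE\t/\tTRUE\t0\tGALX\texample-galx-value\n"
    "#HttpOnly_accounts.google.com\tFALSE\t/\tTRUE\t0\tGAPS\texample-gaps-value\n"
)

print(read_cookie_jar(sample_jar)["GALX"])
```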

I've seen no expiration date for this cookie, so I'm wondering, does it only refresh its value when the session ends? (seems like it, from what I've read about the 0 expiration date)

Now, I'll try to do the POST with my auth parameters + GALX, and the Python code will certainly be much clearer for me to understand (I've never used Java, in fact).

@cfecherolle
Author

I've done the POST request, and the result is the whole source code of the developer console's homepage, the same one I usually get when logging in from my browser, so the job seems done here! :)

EDIT: Off-topic bit removed, so it won't get confusing for anyone else who reads this page

@nelenkov
Contributor

nelenkov commented Jan 9, 2013

This is getting quite a bit off topic. We can help with understanding the protocol, but not really with your program, especially without seeing it. It's also not exactly clear what you are trying to do. BTW, all cookies used are session cookies, so there is no expiration date. They will be gone once you close/destroy your HTTP client instance.

@cfecherolle
Author

I understand. Actually, I should have been more precise while asking. (comment edited)

In fact, it looks like I'm authenticated correctly now (with the right cookies), so the next step is to send basic requests, such as getting the apps list.

I've been trying to do so by studying the related Andlytics files (mainly DevConsoleV2 and DevConsoleV2Protocol), and I decided to start with fetchAppInfos() (DevConsoleV2).

But there is a part which I don't understand. I need an XSRF token to put into a template at some point, in createFetchAppInfosRequest.

The function returns: String.format(FETCH_APPS_TEMPLATE, sessionCredentials.getXsrfToken());

I can't see where the xsrftoken property of these sessionCredentials has been set before, so that it can be returned and used to format the string with this template.

Sorry for the previous off-topic bit, and thanks again for helping me.

@nelenkov
Contributor

nelenkov commented Jan 9, 2013

The token and developer ID are extracted from the first response and used for all subsequent ones. That's towards the end of PasswordAuthenticator#authenticate(). The actual implementation is in BaseAuthenticator.
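
As a rough illustration of that flow (extract two values from the first response, then format the token into a request template), here is a Python sketch. Everything concrete in it is an assumption: the page fragment, the field names XsrfToken and DeveloperId, and the template are placeholders, not the actual console format. Check BaseAuthenticator and DevConsoleV2Protocol for the real patterns.

```python
import json
import re

# Hypothetical fragment of the first response; the real page layout is undocumented.
sample_page = 'startupData = {"XsrfToken":"example-token","DeveloperId":"1234567890"};'

# Hypothetical request template with one placeholder for the token.
# Doubled braces are literal braces in str.format().
FETCH_APPS_TEMPLATE = '{{"method":"fetch","params":{{}},"xsrf":"{token}"}}'


def extract(pattern, text):
    """Return the first capture group of pattern in text, or None if it is absent."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


xsrf_token = extract(r'"XsrfToken":"([^"]+)"', sample_page)
developer_id = extract(r'"DeveloperId":"([^"]+)"', sample_page)

# Same idea as String.format(FETCH_APPS_TEMPLATE, token) in the Java code.
request_body = FETCH_APPS_TEMPLATE.format(token=xsrf_token)

print(developer_id)
print(json.loads(request_body)["xsrf"])
```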

@cfecherolle
Author

I've now managed to generate fetchAppsInfosRequest with the xsrfToken, as a JSON string, along with the URL for this request (fetchAppsUrl). I also have my cURL cookie jar file, which still contains all the cookies from the requests I've made so far (including the AD cookie, which I saw mentioned in your code even though I have no idea of its actual purpose).

Yet when I send this POST, Google understands my request but refuses to give me the data.

I get this response : {"error":{"data":[null,-1] ,"code":-1}}

I'm asking you in case you already got this error before, and in case it has a precise cause. I'm quite stuck with this.

@nelenkov
Contributor

Not sure about the exact cause, but you are probably missing some parameter. Did you append the developer ID to the URL? Compare with identical request in Chrome/FF, etc. Rinse and repeat :)
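
A sketch of what "append the developer ID to the URL" might look like in Python, building the request without sending it so it can be compared with what the browser sends. The query parameter name dev_acc, the Content-Type value, and the example.invalid URL are assumptions for illustration only; take the real ones from the Andlytics source or from the browser's network panel.

```python
import urllib.parse
import urllib.request


def build_console_request(base_url, developer_id, body, param_name="dev_acc"):
    """Build (but do not send) a POST request with the developer ID in the query string."""
    query = urllib.parse.urlencode({param_name: developer_id})
    return urllib.request.Request(
        f"{base_url}?{query}",
        data=body.encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "application/json; charset=utf-8"},  # assumed header
    )


request = build_console_request(
    "https://example.invalid/console/apps",  # placeholder, not a real endpoint
    "1234567890",
    '{"xsrf":"example-token"}',
)

print(request.get_method(), request.full_url)
```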

@cfecherolle
Author

I did it! I've managed to get the list of the apps data! Thank you for your advice!
Actually, some of the headers I was sending were not quite right. I did some testing and now it's working.

Next step: I'll try to write a parser much like yours and extract some things out of this JSON :)

Still, I think the hardest part is done. :D

@nelenkov
Contributor

Congratulations :) JSON parsing is indeed somewhat easier, but also tricky because you are almost never sure which parts are always there and which can be omitted (null, etc.).
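
One way to cope with parts that may be missing or null is a small lookup helper that walks the nested structure and falls back to a default. This is a generic sketch, not how Andlytics does it; the helper name dig is made up, and the sample is the error response quoted earlier in this thread.

```python
import json


def dig(data, *path, default=None):
    """Follow keys/indexes through nested dicts and lists.

    Returns default if any step is missing, out of range, or null.
    """
    current = data
    for step in path:
        if isinstance(current, dict):
            current = current.get(step)
        elif (isinstance(current, list) and isinstance(step, int)
              and -len(current) <= step < len(current)):
            current = current[step]
        else:
            return default
        if current is None:
            return default
    return current


response = json.loads('{"error":{"data":[null,-1],"code":-1}}')

print(dig(response, "error", "code"))                  # value is present
print(dig(response, "error", "data", 0))               # null node, gives the default
print(dig(response, "result", "apps", 0, default=[]))  # missing branch, gives the default
```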

@cfecherolle
Author

Okay, so I managed to get some data about the app (ratings, active installs, etc).

However, I'm going to need to store historical data as well. I saw that you have written a function for this (DevConsoleV2#fetchStatistics; thank you for this, I think it's going to help me a lot), but you are not using it at the moment.

I would like to put historical statistics into a database, so I'll need to fetch some. I want to give it a try, but I'm missing some information about one parameter: statsType.
Since you are not currently using that function in your app, I can't check the needed values by running in debug mode. I don't understand this parameter, since you already have a hard-coded STATS_BY_ANDROID_VERSION (= 1) in the format string that generates the request. What's the difference between those two integers?

@willlunniss
Contributor

You can get full historical stats by type (e.g. active installs, daily installs) and by breakdown (e.g. Android version, app version). However, you can also get simple active installs and total downloads in the main app info request, which is why we don't need to use the full one yet.

See https://github.com/AndlyticsProject/andlytics/blob/master/src/com/github/andlyticsproject/console/v2/DevConsoleV2Protocol.java#L40:L51 for the constants

Also note that at the moment the full statistics parsing just jumps to the last entry, so you will need to iterate over all entries.

@cfecherolle
Author

Okay, so I gave it a try. It's true that these JSON responses are hell to parse: null nodes everywhere, data scattered between nested arrays... Plus, the data keeps changing (not every app has this or that Android version), but I managed to create objects to hold it.

For now, I've done it by Android Version, and by Device but the method should stay the same for the other ones.

For example, if I want to get updates by Android version, I return an associative array with one entry per timestamp (converted to yyyy-mm-dd format) and with one property per entry for each Android Version. When you get the property's value, you have the number of downloads/updates/whatever.
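
The structure described above, sketched in Python rather than PHP. The input rows are made up, and the assumption that timestamps are seconds since the epoch in UTC is mine; the real response layout is not shown in this thread.

```python
from collections import defaultdict
from datetime import datetime, timezone


def pivot_by_day(rows):
    """Turn (timestamp, version, count) rows into {"yyyy-mm-dd": {version: count}}."""
    table = defaultdict(dict)
    for timestamp, version, count in rows:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
        table[day][version] = count
    return dict(table)


# Made-up rows; timestamps are assumed to be seconds since the epoch, in UTC.
sample_rows = [
    (1357603200, "4.1", 12),
    (1357603200, "2.3", 30),
    (1357689600, "4.1", 15),
]

updates = pivot_by_day(sample_rows)
print(updates["2013-01-08"]["4.1"])
```

Versions that are absent on a given day simply have no key for that day, so lookups should use .get() with a default instead of assuming every version is always present.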

Now I'll retrieve the data in the same way for the other types (Country etc.) and then see what I'm gonna do to put this into a database. I thought about one table with general app infos, and several others for version, device, etc (with a key which would be the packageName of the app).
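
A minimal sketch of that table layout, using Python's built-in sqlite3 module and an in-memory database. Table and column names are hypothetical, and only one breakdown table (by Android version) is shown; the others would follow the same pattern.

```python
import sqlite3

# Hypothetical schema: one general table plus one table per breakdown,
# all keyed by the app's package name.
SCHEMA = """
CREATE TABLE app (
    package_name TEXT PRIMARY KEY,
    name         TEXT NOT NULL
);
CREATE TABLE stats_by_android_version (
    package_name    TEXT NOT NULL REFERENCES app(package_name),
    day             TEXT NOT NULL,      -- yyyy-mm-dd
    android_version TEXT NOT NULL,
    active_installs INTEGER,
    PRIMARY KEY (package_name, day, android_version)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

conn.execute("INSERT INTO app VALUES (?, ?)", ("com.example.app", "Example"))
conn.execute(
    "INSERT INTO stats_by_android_version VALUES (?, ?, ?, ?)",
    ("com.example.app", "2013-01-08", "4.1", 12),
)

rows = conn.execute(
    "SELECT day, android_version, active_installs "
    "FROM stats_by_android_version WHERE package_name = ?",
    ("com.example.app",),
).fetchall()
print(rows)
```

The composite primary key keeps a repeated fetch of the same day from creating duplicate rows; whether to overwrite or skip existing rows on re-import is a separate choice.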

@cfecherolle
Author

Hi, it's me again. In the meantime, I've fetched every kind of historical data into associative arrays, and the functions are working fine.
Something bothers me, though. I don't know whether this happens to anybody else, but while the active installs count for my app (the only global stat, actually) shows every Android version/app version/carrier/whatever, I can only see a few of them in my daily stats (installs/uninstalls/updates). I can't understand why, and it makes those stats nearly unusable for me right now: the daily ones show only about 2 Android versions for the past few months (day by day), while active installs have been rising for at least 4 versions.

I encounter this weird behaviour in any app in the developer console for almost every sorting parameter. I should add that it is visible in the GUI, so it's not a fetching problem.

I can't find the logic of it. Any clues?

@nelenkov
Contributor

If it is the same in the console, there is not much you can do about it. And you should really treat those numbers as a reference rather than as absolutes; they are known to fluctuate or be outright wrong sometimes (they eventually get fixed/normalized).

As for using a DB, look at the *Table.java files in andlytics for some hints. Depending on your purposes though, this may not be the best schema design for you.

@AndyScherzinger
Member

I am closing this issue to keep the issue log as short as possible.

Discussion on this topic/issue can still go on :)

@cfecherolle
Author

So. Hello again, I'm still working on my project, and I have the weirdest error right now.

Since this morning (GMT +1) I can't seem to make successful requests to Google anymore.
And... the JSON response I have is : http://img11.hostingpics.net/pics/553207what.png (screenshot)

What the hell is happening? Is Google trolling me or something...? I'm quite worried now.

@AndyScherzinger
Member

Usual but painful API changes, see #314

@cfecherolle
Author

Sorry I'm posting this so late, but I've made a tutorial to fetch data with PHP, if anyone is interested in using/studying it, here's the link:

http://neko-spirit.fr/public/tutorial/tuto.php

I also wanted to thank you again for your help guys, I really appreciated it.

@nelenkov
Contributor

nelenkov commented May 9, 2013

You should set your client to follow redirects automatically, it seems this is not the default in newer versions.

@weasr

weasr commented Oct 9, 2013

Hello cfecherolle,
can you please share a different tutorial link? The one you posted is not working.

Thanks a lot

@weasr

weasr commented Oct 9, 2013

@cfecherolle Hello cfecherolle, can you please share a different tutorial link? The one you posted is not working.

@cfecherolle
Author

Hi, the server on which I was hosting it has now been down for a few months, and I currently have no other place to put it online... Plus, the version I had must be outdated now. If you have any ideas about what to do with it, just tell me and I'll try!


@weasr

weasr commented Oct 14, 2013

@cfecherolle thanks a lot for your response. I would appreciate it if you could send the file to my email address, since I'm currently working on a similar project and would like to make some progress on it. Many thanks in advance. My email is: axel.rewdas@gmail.com

@cfecherolle
Author

Hello again @weasr ! I've put it back online on my own student web space (I hadn't thought of this possibility before!)
You can find this (outdated) tutorial here now: http://wwwetu.utc.fr/~cfechero/developerconsole_tutorial/

It might help others, so instead of giving it to you by email, I'd rather post it here :)

@weasr

weasr commented Oct 15, 2013

Thanks a lot cfecherolle, much appreciated that you took the time to put the tutorial online again. I agree it might help others as well :)
