Add feature to get historical data for a particular location #3
Comments
I think there may be a way to do this by using Requests to fill up the form on this page, and then somehow programmatically clicking the download button to get the CSV locally onto the user's computer, into the same directory as their project. |
Hello this seems interesting can I work on it? |
@Sam-damn yeah sure! That'd be awesome💯 This is a pretty major feature to add, so I think a new feature branch is probably a good idea for it. |
@Sam-damn I've created a new branch - Good luck! |
Alrighty, I’ll check out the branching model so I can familiarize myself with the process and begin! Thanks for the info. |
So I have been researching how this can be accomplished using the requests library. In order to submit the form and get the historical data file, you have to emulate the requests your own browser sends when you press the submit button. I inspected the network tab in my browser's developer tools, found the POST request the browser sends on clicking submit, and tried to emulate it exactly using requests. The problem is that all it sends back as a response is an IP and a status, whereas in a browser it generates a downloadable file with a save-as dialog. Do you have any idea whether I should be performing a GET request on the IP that is returned to me? (I'm not sure that is even possible, since GET requests are usually performed on a URL.) Overall I feel this task could be accomplished with a headless browser tool such as selenium, but selenium requires other dependencies that cannot be listed as Python packages. What are your thoughts on this? |
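For reference, a minimal sketch of the approach described above: POST the form fields with requests, then check whether the response is a file attachment before saving it. The URL and field names are not given in this thread, so `url` and `form_fields` are hypothetical parameters, and the helper names are made up for illustration. It is an assumption that the server marks the download with a `Content-Disposition: attachment` header; if it does not, the function simply returns `None`.

```python
import os
from email.message import Message


def attachment_filename(headers):
    """Return the filename from a Content-Disposition header, or None.

    `headers` is any mapping of response header names to values.
    """
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    disposition = lowered.get("content-disposition")
    if not disposition:
        return None
    # Let the stdlib parse the header instead of splitting it by hand.
    message = Message()
    message["Content-Disposition"] = disposition
    if message.get_content_disposition() != "attachment":
        return None
    return message.get_filename()


def post_form_and_save(url, form_fields, dest_dir="."):
    """POST the form; save the body if it is an attachment.

    Returns the saved path, or None if the response was not a file.
    `url` and `form_fields` are hypothetical: copy the real values from
    the browser's network tab.
    """
    import requests  # third-party; imported lazily so the helper above stays dependency-free

    # A Session keeps any cookies the site sets between requests.
    with requests.Session() as session:
        response = session.post(url, data=form_fields, timeout=30)
        response.raise_for_status()
        name = attachment_filename(response.headers)
        if name is None:
            return None
        path = os.path.join(dest_dir, os.path.basename(name))
        with open(path, "wb") as handle:
            handle.write(response.content)
        return path
```

If the function returns `None`, printing `response.headers` and `response.text` inside it is a quick way to see what the server actually sent back.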
@Sam-damn I've used selenium before, and I'm not opposed to it being used for this feature. What do you mean by 'other dependencies that cannot be listed as python packages' ? Meanwhile, I'll do some research into whether Requests can actually be used for this at all. |
@Milind220 One of selenium's dependencies is a web driver, which is usually a binary that needs to be installed manually, so it cannot be listed as a pip package. Nevertheless, I think there’s a Python package that helps with this. If you find anything about whether requests can be used for this, do let me know 😁. |
@Sam-damn After doing some research I'm confident that requests could be used for this. Here are some links to YouTube videos that do similar things (you can refer to them if you need to). To fill in the form to access the downloads, this video on logging into websites using Requests would help, since it's a similar task to ours: https://www.youtube.com/watch?v=bM50i7sKwwM To download the files: https://www.youtube.com/watch?v=UMuO2_BVFwY Lemme know what you find when you try it out! Also, did you have any luck with that other Python package? |
I have made a lot of progress using selenium. However, much like with requests, I got stuck at the same stage, where a save-as dialog appears (“you have chosen to save this file”). This dialog is an operating system window, and since it’s not an element within the browser it cannot be accessed using selenium. I have tried to bypass this by changing the settings of the web driver profile to suppress the dialog box and allow an automatic save to a custom location, but this doesn’t work for some reason. PS: this dialog box seems to be a common problem, as I observed from many Stack Overflow questions.
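A sketch of the profile settings being described, assuming Firefox (the quoted dialog text is Firefox's) and Selenium 4. The three preference names are real Firefox preferences; the function names and the `text/csv` default are assumptions. One possible reason such settings appear not to work is that the MIME type listed here does not match the `Content-Type` the server actually sends, so that value may need adjusting.

```python
def firefox_download_prefs(download_dir, mime_types=("text/csv",)):
    """Build Firefox preferences that save matching downloads without a dialog."""
    return {
        # 2 = save to the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.dir": str(download_dir),
        # Comma-separated MIME types to save silently; must match what the
        # server sends (assumed to be text/csv here).
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


def make_firefox(download_dir):
    """Start Firefox with the preferences above (needs selenium and geckodriver)."""
    # Imported lazily so the preference helper can be used without selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in firefox_download_prefs(download_dir).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```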
As for using the requests library, this same issue becomes even harder to solve, because in order to download a file using requests you need a URL to perform a GET request on, which we don’t have. Unfortunately the links you sent me do not tackle this issue. Nevertheless, I will keep trying with selenium and keep you updated. If downloading the actual CSV file doesn’t work, as a last resort we can web-scrape the data from the table element (which appears after filling in the search bar and before submitting the form), construct a CSV file from that data, and then pass it to pandas or do whatever we want with it. However, I’m not sure whether that data table is complete or not. |
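The fallback described above (scrape the table element and build a CSV from it) could look roughly like this. It is a sketch using only the standard library; the class and function names are made up for illustration, and it assumes a plain table made of `<tr>`, `<th>` and `<td>` elements with no nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_html_to_csv(html):
    """Convert the HTML of one table into CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()
```

With Selenium, the input string could come from `element.get_attribute("outerHTML")` on the table element, and the resulting text can be written to a file or handed to pandas via `io.StringIO`.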
@Sam-damn Now that you mention it, I think webscraping the table element is genius! It appears to be the easiest solution to this problem. I checked it out for a few locations, and the table is 100% complete for all the parameters. Great thinking man! Let's try this: |
@Sam-damn Actually, if you manage to webscrape the table with Selenium, that's fine too. I suppose we can ask users to download the WebDriver on their own, or perhaps set up a shell script to download it separately (idk if that's possible, but just an idea). EDIT: I found this package which could help us out with the WebDriver part. It downloads the WebDriver on the spot, which would allow us to add selenium as a regular dependency. |
Alright, I will focus on the scraping then and keep you updated. Also, what a coincidence: I actually came across that package two days ago and have been using it. It's quite handy! |
@Sam-damn hahaha that's great. Let me know how it goes! |
@Sam-damn Any progress with that? This is a pretty exciting feature for us to add - it would create a lot of opportunity for expansion and usage of the package. Historical data is very important for researchers, and this would make it really simple for them to get data. I'm hoping to get some professors from my university to use it if we can get this to work! |
@Milind220 It’s almost finished! I got it working nicely now, and I tested it a lot, so hopefully it will work for everyone. Currently I am just organizing the file, making it more readable, and adding the docs (for the class and its methods), the packages, and so on. Also, I apologize for the delay. We have a pretty bad electricity situation here, so I have been working on it whenever I could 👍🏻👍🏻 |
@Sam-damn No problem at all! Your work has been top notch :) |
@Sam-damn Hey, check your email! |
@Milind220 i sent a reply 😁 |
Currently only live data can be fetched for a given location. Most users would find more utility in large datasets of historical data.
I don't think that the WAQI API is capable of providing historical data, but the WAQI website does have a resource for downloading CSVs and Excel sheets of historical data. Perhaps it would be possible to download and read the CSV from there programmatically.
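A minimal sketch of the "read the CSV programmatically" half of this idea, using only the standard library. The column names in the usage example (`date`, `pm25`, `pm10`) are hypothetical, since the layout of the downloadable file is not described here; the function treats a `date` column as text and every other column as a numeric reading, with blanks mapped to `None`.

```python
import csv
import io


def read_history(csv_text):
    """Parse CSV text into a list of dicts, converting readings to float.

    Assumes a header row containing a "date" column (hypothetical layout);
    every other column is treated as a numeric reading.
    """
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    records = []
    for row in reader:
        record = {}
        for key, value in row.items():
            if key is None:
                # Extra cells beyond the header row: ignore them.
                continue
            key = key.strip()
            value = (value or "").strip()
            if key == "date":
                record[key] = value
            else:
                # Missing readings become None instead of raising.
                record[key] = float(value) if value else None
        records.append(record)
    return records


sample = "date, pm25, pm10\n2021/1/1, 153, 98\n2021/1/2, , 101\n"
for record in read_history(sample):
    print(record)
```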