Implemented _get_time_for_url in GenericClient #3863

Closed
wants to merge 11 commits

Conversation

abhijeetmanhas
Contributor

Description

Fixes #3715
After this PR, there is no need to implement the _get_time_for_url() function separately in most of the individual clients.
I have implemented _get_time_for_url() in GenericClient so that it does not need to be re-implemented in every client.
Added a new member, crawler, to GenericClient, which is used to extract dates.

It might not be the best way, so I need suggestions from maintainers.
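To make the idea concrete, here is a minimal, self-contained sketch of the shared approach the PR describes. It is not sunpy code: extract_date_from_url is a hypothetical stand-in for the Scraper's date extraction, and plain datetime tuples stand in for sunpy TimeRange objects.

```python
from datetime import datetime, timedelta

def extract_date_from_url(url, pattern="%Y%m%d"):
    # Hypothetical stand-in for Scraper._extractDateURL: pull a
    # YYYYMMDD date out of the file-name portion of the URL.
    stem = url.rsplit("/", 1)[-1].split(".")[0]
    digits = "".join(ch for ch in stem if ch.isdigit())[:8]
    return datetime.strptime(digits, pattern)

def get_time_for_url(urls):
    # One shared implementation in the base class: each URL is mapped
    # to the full day implied by the date embedded in it, so individual
    # clients no longer need their own _get_time_for_url().
    almost_day = timedelta(days=1) - timedelta(milliseconds=1)
    starts = [extract_date_from_url(u) for u in urls]
    return [(t0, t0 + almost_day) for t0 in starts]
```

The one-day span baked into this sketch is exactly the assumption questioned later in the review.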

@pep8speaks

pep8speaks commented Mar 5, 2020

Hello @abhijeetmanhas! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-03-11 15:00:43 UTC

@abhijeetmanhas
Contributor Author

abhijeetmanhas commented Mar 5, 2020 via email

@nabobalis
Contributor

autopep8. I will not use it in the future.

On Thu, 5 Mar 2020 at 22:03, Nabil Freij wrote, commenting on sunpy/net/dataretriever/client.py in #3863: "What reformatter?"

I was not aware that autopep8 also touches documentation strings.

@abhijeetmanhas abhijeetmanhas force-pushed the bye_get_time branch 3 times, most recently from 356a976 to 2764016 Compare March 8, 2020 14:58
@abhijeetmanhas
Contributor Author

abhijeetmanhas commented Mar 8, 2020

Implemented this for all clients that use the scraper. Fixes #3814 too.

@abhijeetmanhas abhijeetmanhas marked this pull request as ready for review March 8, 2020 15:03
@abhijeetmanhas abhijeetmanhas requested review from a team as code owners March 8, 2020 15:03
@abhijeetmanhas
Contributor Author

@nabobalis I have made the changes you suggested, and the PR has passed all checks.

Member

@dpshelio dpshelio left a comment


For consistency, I would keep scraper instead of crawler, so it is clearer where the object comes from.

@abhijeetmanhas
Contributor Author

abhijeetmanhas commented Mar 9, 2020 via email

@abhijeetmanhas
Contributor Author

@dpshelio I have made the changes you suggested. Are any other changes expected at this point?

@abhijeetmanhas
Contributor Author

@dpshelio I have made the suggested changes.

```python
else:
    scraper = self.scraper
almost_day = TimeDelta(1 * u.day - 1 * u.millisecond)
times = [TimeRange(t0, t0 + almost_day)
         for t0 in map(scraper._extractDateURL, urls)]
```
Member


Why is this assuming one day? The base class should not be making assumptions about the form of the results returned by the clients.

Contributor Author


Yes, my doubts were exactly about this. What do the start time and end time in the response table represent? The data files returned by a query are always associated with timestamps, but if there are multiple files on the same date, what should the start and end times be set to? Should each file get a timerange of one day, or should the day be divided into timeranges (say there are 10 files for one day, so the day is split into 10 timeranges)?
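The second option raised in this question can be sketched as follows. This is purely illustrative (split_day_into_ranges is a hypothetical helper, not sunpy API): if n files share one date, divide the 24-hour span into n equal sub-ranges instead of giving every file the same full-day range.

```python
from datetime import datetime, timedelta

def split_day_into_ranges(day, n_files):
    # Divide the 24-hour span starting at `day` into n_files equal,
    # contiguous (start, end) sub-ranges, one per file.
    step = timedelta(days=1) / n_files
    return [(day + i * step, day + (i + 1) * step) for i in range(n_files)]
```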

Member


Yes, I don't think it should assume one day. I think the proper way to fix this is to modify the individual _extractDateURL methods to return an interval (ie. start time and end time) instead of currently just returning a start time. That way the length of an interval can be specified on a per-client basis.
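A rough sketch of this suggestion, with stdlib datetimes standing in for sunpy objects and hypothetical function names (these are not the actual _extractDateURL implementations): each client returns a (start, end) interval whose length matches its own cadence, so the base class never has to assume one day.

```python
from datetime import datetime, timedelta

def extract_interval_daily(url):
    # Hypothetical daily-cadence client: the URL names a date,
    # so the interval is that whole day.
    date = datetime.strptime(url.rsplit("/", 1)[-1][:8], "%Y%m%d")
    return date, date + timedelta(days=1)

def extract_interval_hourly(url):
    # Hypothetical hourly-cadence client: the URL names a date and
    # hour, so the interval is one hour; no one-day assumption is
    # needed in the base class.
    stamp = datetime.strptime(url.rsplit("/", 1)[-1][:11], "%Y%m%d_%H")
    return stamp, stamp + timedelta(hours=1)
```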

@ayshih ayshih added dataretriever net Affects the net submodule labels Apr 13, 2020
@nabobalis nabobalis added this to the 2.1 milestone Apr 15, 2020

Successfully merging this pull request may close these issues.

GenericClient.search can't know URLs' timeranges without additional function
7 participants