Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Great RHESSI Cleanup of 1.0 #2808

Merged
merged 28 commits into from Nov 13, 2018

Conversation

@Cadair
Copy link
Member

commented Oct 25, 2018

I noticed that importing sunpy.timeseries was importing sunpy.map (which is stupid) and I traced that to sunpy.instr.rhessi, which was also being imported by sunpy.net which meant that sunpy.net was also importing sunpy.map!! This lead me to completely refactor sunpy.instr.rhessi so that all the data download code was in sunpy.net.dataretriever.sources.rhessi (i.e. only exposed through Fido) and then I removed all the code from sunpy.instr.rhessi.

fixes #2382, fixes #215, fixes #2672 and fixes #2231

[This also removes a few functions in sunpy.instr.rhessi or moves them into net. I think we should probably deprecate everything as a removal in our next 0.9 release]

Think the best option to remove the python-dateutil module at this point is to put dateutil.rrule into extern/. We should decide if that is something we want to do, or if we are happy with the dep.

@pep8speaks

This comment has been minimized.

Copy link

commented Oct 25, 2018

Hello @Cadair! Thanks for submitting the PR.

@sunpy-bot

This comment has been minimized.

Copy link

commented Oct 25, 2018

Thanks for the pull request @Cadair! Everything looks great!

@Cadair Cadair added this to the 1.0 milestone Oct 25, 2018

@Cadair Cadair referenced this pull request Oct 25, 2018

Merged

A dependency cleanup #2807

from sunpy.extern.six.moves.urllib.error import URLError


__all__ = ['get_obssumm_dbase_file', 'parse_obssumm_dbase_file',

This comment has been minimized.

Copy link
@Cadair

Cadair Oct 25, 2018

Author Member

get_obsumm_dbase_file, get_obssum_filename and get_obssumm_file should all be marked as deprecated in 0.9 and say "use Fido".

This comment has been minimized.

Copy link
@nabobalis

nabobalis Oct 25, 2018

Contributor

I can make a note and do a PR for that unless you really want to.

This comment has been minimized.

Copy link
@Cadair

Cadair Oct 25, 2018

Author Member

@ehsteve Does this seem reasonable to you, or are there reasons to keep these as independent functions?

This comment has been minimized.

Copy link
@ehsteve

ehsteve Oct 30, 2018

Member

I think we should keep that functionality here for two reasons with Fido calling it.

  1. It is easier for someone to find the code for a particular instrument rather than it being buried in Fido.
  2. I think there is more information in the dbase files that could be used for other reasons besides Fido.
@Cadair

This comment has been minimized.

Copy link
Member Author

commented Oct 26, 2018

I am really of the opinion that parse_observing_summary_hdulist should be in RHESSITimeseries at this point, seeing how that's what it is written for. Which does make me think we should move the summary base parser into net so that instr.rhessi can just be the backprojection stuff.

@ehsteve @ayshih @samaloney opinions?

@samaloney

This comment has been minimized.

Copy link
Contributor

commented Oct 27, 2018

Would agree with moving parse_observing_summary_hdulist to RHESSITimeseries. Not sure about moving the summary base parser to net, while it's not used in another way, does it really belong in net?

yashkgp and others added some commits Jan 22, 2018

All the data search and download code in instr.rhessi is now in net
This makes things a lot cleaner and prevents sunpy.net importing sunpy.instr
@Cadair

This comment has been minimized.

Copy link
Member Author

commented Oct 30, 2018

This now also includes #2714

@Cadair Cadair changed the title All the data search and download code in instr.rhessi is now in net The Great RHESSI Cleanup of 1.0 Oct 30, 2018

@Cadair

This comment has been minimized.

Copy link
Member Author

commented Oct 30, 2018

Would agree with moving parse_observing_summary_hdulist to RHESSITimeseries. Not sure about moving the summary base parser to net, while it's not used in another way, does it really belong in net?

Probably not, I think I am happy enough to keep them both in instr.rhessi for now.

@Cadair Cadair requested review from ayshih, ehsteve and samaloney Oct 30, 2018

@ehsteve

This comment has been minimized.

Copy link
Member

commented Oct 30, 2018

As I noted above my opinion is that we keep all instrument specific code in instr and other parts of SunPy make use of it. I think it would be easier from a developer perspective to ask someone from an instrument to fix a particular instrument specific problem and not have them have to learn how Fido works (or other part of SunPy).

@Cadair

This comment has been minimized.

Copy link
Member Author

commented Oct 30, 2018

A general point on this:

As I noted above my opinion is that we keep all instrument specific code in instr and other parts of SunPy make use of it.

This is not the case at the moment, and probably will never completely be the case. I.e all the map sources, timeseries sources and net sources are all "instrument specific" (especially timeseries which has a bunch of file parsing routines in it).

I feel like the instr subpackage is really poorly defined, it has a bunch of disparate code in it that feels like it doesn't really belong. The two bits of instr I can see are the fermi rhessi and lyra code for parsing event lists, and the goes emission measure stuff. Then a few extras like aia prep and a weird iris helper.

I half feel that we should have a better idea of what we want to allow in instr and what should go in other places and what should go in affiliated packages.


A little more specific to this PR:

I am happy to have the file parsing code live in instr because I can see that a) sometimes we might need to parse a file not into a map or a timeseries and b) sometimes you might want to parse a file for other purposes other than creating a map or timeseries.

However, I strongly feel that file download should be handled exclusively by net (having just looked through instr, I now realise this is a bigger task than just rhessi). Partly this is a dependency issue, in my mind it is the net submodule that should be the one that has dependencies related to network access and file download, not instr or anywhere else. Secondly, I think it just makes conceptual sense!!

Therefore, on this specific PR, I am happy to move parse_obssumm_dbase back into instr. However, functions starting with get_ I really do think should live in the submodule designed for the downloading of data. The way I have implemented get_observing_summary_dbase_file is as a static method, so it is trivial to still use it in isolation if that is what people want to do. (it's just RHESSIClient.get_observing_summary_dbase_file("2011/01/01") for instance).

Edit: parse_obsumm_dbase never moved out of instr.

@ehsteve

This comment has been minimized.

Copy link
Member

commented Oct 30, 2018

@Cadair all good points and I'm fine with proceeding with this PR as it stands but the other consideration is how we might support future affiliated packages. Say there is a rhessi package. it would make more sense for the RHESSI team to proivde all of the baseline download codes (and maintain them) and we just use them in SunPy. Having things in instr is a way to model that potential future.

@Cadair

This comment has been minimized.

Copy link
Member Author

commented Oct 30, 2018

With regards to net, affiliated packages can happily register clients with Fido. This is something I am planning on doing with the DKIST user tools early next year, so I imagine it will motivate me to make sure it is well supported. Also I believe it is something heliopy are considering doing as well.

@Cadair

This comment has been minimized.

Copy link
Member Author

commented Oct 31, 2018

@ehsteve if you are ok with this as it is can you review and approve? 👼

@ehsteve

ehsteve approved these changes Nov 7, 2018

Cadair added some commits Nov 7, 2018

@Cadair

This comment has been minimized.

Copy link
Member Author

commented Nov 7, 2018

Ok, I think this is ready for a final review...

@Cadair Cadair added the [Review] label Nov 8, 2018

@dpshelio
Copy link
Member

left a comment

All good with me. I've had just a couple of minor points.

This method allows clients to customise the timerange displayed for
each URL.
It should return a sunpy.time.TimeRange object per URL.

This comment has been minimized.

Copy link
@dpshelio

dpshelio Nov 10, 2018

Member

a TimeRange? what if you don't know from the URL what's the obs-end?

This comment has been minimized.

Copy link
@Cadair

Cadair Nov 12, 2018

Author Member

🤣 you wrote this function, I just added the docstring!

If you don't know the end time, I guess you make it up! It makes no sense to return anything other than a range here, all files will have some range, I would really hope you know what it is.

This comment has been minimized.

Copy link
@dpshelio

dpshelio Nov 12, 2018

Member

Really?? Then of course we know it!!
I'll have to look into it with more detail... but not know. Let's merge this and one day I'll get my head around the scrapper again.


class RHESSIClient(GenericClient):

def get_observing_summary_filename(self, time_range):

This comment has been minimized.

Copy link
@dpshelio

dpshelio Nov 10, 2018

Member

I thought we had this done using the scraper..

This comment has been minimized.

Copy link
@Cadair

Cadair Nov 12, 2018

Author Member

We had? I don't remember that. You got a link?

This comment has been minimized.

Copy link
@dpshelio

dpshelio Nov 12, 2018

Member

I think it's on one of the automated closed PRs... I got a list to work on these... so one day (hopefully before 1.0).

from sunpy.net.fido_factory import UnifiedResponse
from sunpy.net.dataretriever.client import QueryResponse

LCClient = rhessi.RHESSIClient()

This comment has been minimized.

Copy link
@dpshelio

dpshelio Nov 10, 2018

Member

Since we don't hae LC anymore... shouldn't we call this differently?

@dpshelio

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

More conflicts!! I won't be able to press any merge button today?? 😿

@nabobalis

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

ITS ALL ALIVE (BAR THE COVERAGE)

@nabobalis nabobalis merged commit ba9820c into sunpy:master Nov 13, 2018

9 of 10 checks passed

codecov/patch 36.84% of diff hit (target 43.78%)
Details
ci/circleci: egg-info-36 Your tests passed on CircleCI!
Details
ci/circleci: egg-info-37 Your tests passed on CircleCI!
Details
ci/circleci: figure-tests-36 Your tests passed on CircleCI!
Details
ci/circleci: html-docs Your tests passed on CircleCI!
Details
codecov/project 43.8% (+0.02%) compared to 03add11
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
giles Click details to preview the documentation build
Details
sunpy-bot All checks passed

@Cadair Cadair deleted the Cadair:rhessi_refactor branch Nov 14, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.