Some JSOC data fails to download using Fido #3735
Comments
yeah looks like we should be skipping over missing files rather than raising an Exception and stopping. |
Hey , I would like to work on this. |
Go ahead @twentyse7en |
I don't think we can skip over files as we create a drms export request using the entire time range and other arguments, also I don't think jsoc even supports this. I do know that if you include an extra query argument in the export request |
Not a good idea but it does 'fix' the problem |
Maybe this hack is enough, thanks for mentioning. Waiting until upstream has any other suggestions |
Do we have the ability to search by Quality? if not we could add it as an |
Yes, ideally that skipping should be done on the VSO side in the GetData function that generates the list of URLs. Better also would be more info returned as to why files are marked as "missing". |
@ejm4567 To be clear this is our direct JSOC / drms client here, not going through VSO at all. The segments here: segments = a.jsoc.Segment('field') & a.jsoc.Segment('inclination') & a.jsoc.Segment('azimuth') & a.jsoc.Segment('disambig') are being handled as logical AND ( |
Thanks, I figured it was using the drms module. Given that the query was for an HMI series, the string of segment field names asked for implies to me that one, or more, of them are either not populated, or missing in the DRMS DB itself. The fix is best in the C code in DRMS before it returns to drms. |
One should always include a query clause: [?QUALITY >= 0?] to avoid records with no files associated. |
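As a minimal sketch of what that clause looks like when appended to a DRMS record-set string: the helper name `with_quality_filter` and the example time range are illustrative assumptions, not part of drms or sunpy.

```python
def with_quality_filter(recordset: str, clause: str = "QUALITY >= 0") -> str:
    """Append a DRMS keyword-query clause, written as [? ... ?], to a record set."""
    return f"{recordset}[? {clause} ?]"


# Example record set for the series discussed in this issue (time range is illustrative).
base = "hmi.B_720s[2017.09.06_05:40:00_TAI-2017.09.06_06:30:00_TAI]"
query = with_quality_filter(base)
print(query)
```

The resulting string is what would be handed to an export request, so records with no associated files are never asked for in the first place.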
Thanks for the advice @pscherrer 😄 I think that the best way to solve this would be to:
* Add a new Quality attr to jsoc.attrs
* Add a default equivalent to Quality('>=0') to all jsoc queries
We have had discussions about making a quality clause a default but (I) decided it would have side effects that could cause problems for some data series that are seldom used. Not all series have a QUALITY keyword and some of those that do do not use the top bit for a no-file flag. But all SDO normal science products do. So a test can be always there for AIA lev1 and related series and all HMI "observables" like M, B, Ic, Ld, V, sharps, and all MDI that remote users are likely to use. Only add the test in calls to jsoc_fetch and omit it for jsoc_info calls.
We fill gaps in most slotted series so the processing code can tell that all intervals have been processed. The other 31 bits will tell why the record has no data file.
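One way a client could follow that advice is sketched below: add the test only for series assumed to follow the convention, and only for jsoc_fetch calls. The prefix list and the function name `default_quality_clause` are hypothetical; the real set of series that follow the convention would need to be confirmed with JSOC.

```python
# Hypothetical allowlist of series prefixes assumed to use the top QUALITY bit
# as a "no data file" flag (AIA lev1, HMI observables, MDI). Not authoritative.
QUALITY_CONVENTION_PREFIXES = (
    "aia.lev1", "hmi.m_", "hmi.b_", "hmi.ic_", "hmi.ld_", "hmi.v_",
    "hmi.sharp_", "mdi.",
)


def default_quality_clause(series: str, operation: str) -> str:
    """Return the clause to append to a record set, or '' for no default."""
    # Metadata-only calls (jsoc_info) should still see records without files.
    if operation != "jsoc_fetch":
        return ""
    if series.lower().startswith(QUALITY_CONVENTION_PREFIXES):
        return "[? QUALITY >= 0 ?]"
    return ""


print(default_quality_clause("hmi.B_720s", "jsoc_fetch"))
```

Keeping the decision in one small function leaves room for a user-supplied override, which matters for people who want to know why a record has no file.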
…On Feb 5, 2020, 4:35 AM, at 4:35 AM, Stuart Mumford ***@***.***> wrote:
Thanks for the advice @pscherrer 😄
I think that the best way to solve this would be to:
* Implement a new `jsoc.attrs.Quality` attr.
* Add support for that where we build our query, and default it to `>= 0`
* Document this so it doesn't catch people out.
|
I have been working on the bugs in observer location keywords and have some questions about the astropy and sunpy values.
Main error was omission of adding 499 seconds for light travel to 1 AU while subtracting time to SDO. Fixing that will bring errors almost below the 0.01 degree goal.
Will also fix the Stonyhurst coords that were never implemented since we never use them. Offset is 0.015 degrees with 24 hr period.
Less than 1/2 an AIA pixel. We do not even include them in the HMI past lev1.
In my cross checking I found that sunpy only doused the longitude but not the latitude, it just reports Carrington lat. Ok if on Earth.
Our code defaults to on Earth for both. I will fix this when we release fix for CRLN_OBS.
Am trying to find reason that astropy Carrington differs from new JSOC and JPL Horizons values by about 0.0064 degrees with an annual period with peak to peak about same 0.006. So for part, most, of the year the difference is > 0.01.
Who is the best person to contact?
I was browsing in sunpy git issues for discussion of these coordinates when I saw the post that I commented on.
I see that there have been discussions about SDO location errors for several years
I was really annoyed that nobody bothered to tell us. I found the problem last July and have been sorting it out when I get a chance.
We are still discussing how to efficiently fix a billion records.
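For reference, the 499 second light-travel figure quoted above follows directly from the definitions of the astronomical unit and the speed of light. This is only a quick arithmetic check, not the JSOC correction code.

```python
AU_M = 149_597_870_700    # astronomical unit in metres (IAU 2012 definition)
C_M_PER_S = 299_792_458   # speed of light in metres per second (exact)

# Light travel time over 1 AU, in seconds.
light_time_1au_s = AU_M / C_M_PER_S
print(round(light_time_1au_s, 3))  # 499.005
```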
|
Me! I'll write you an email. |
* Add new Quality attr to jsoc.attrs * Add default equivalent to Quality('>=0') to all jsoc queries
Some very applicable information on sunpy/drms#37 |
I think the only way we can handle this cleanly is to have instrument specific missing data attributes e.g |
This message has no context for me.
but we do always suggest that unless you are interested in the reason
some data is not available that the query clause [?QUALITY>=0?] always
be included in a JSOC export request.
We have not done this automatically because some need the information
about why some data is not available and we have not agreed upon
a way to disable this clause in cases where someone wants a more
specific limit to data requested. Some want data with some QUALITY
bits ignored but some a cause for rejection. Since some outside
users bypass exportdata.html and use the cgi.bin API we can not
solve the problem by only doing it in exportdata.html. And
jsoc_fetch is used for many series that do not have a QUALITY keyword
or have a QUALITY but not using the present HMI and AIA conventions.
We resist putting special cases in the general code.
If remote users use jsoc_info FIRST to fetch the metadata and then
use direct access using the URLs that can be constructed from
the information in the metadata, so that jsoc_fetch can be
simply bypassed for the full resolution files desired there
would be no reason to request files that are not present.
If on-export processing is needed however jsoc_fetch must still be
used, but that is usually not done via a standard API in a
SSW, Python, or Matlab query.
If you want a specific
"data is missing" quantity just use a bit match to bit #31
in QUALITY on all requests and not hinder specific queries
that care about other bits in the case that the data is not
missing. It might make sense to simply add such a test in
all HMI and AIA queries from sunpy.
Depending on how the metadata is fetched in Python it may be
that QUALITY does not end up as a negative int unless the
metadata is consulted where one finds that QUALITY
is a 32-bit value to be shown as Hex.
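A rough sketch of that bit test on the Python side, assuming QUALITY arrives either as a hex string such as `0x80000000` or as a plain int. The function names are illustrative and not part of drms or sunpy.

```python
MISSING_BIT = 1 << 31  # top bit (bit #31) of the 32-bit QUALITY word


def parse_quality(value) -> int:
    """Return QUALITY as an unsigned 32-bit int from a hex string or an int."""
    if isinstance(value, str):
        value = int(value, 16)  # accepts an optional "0x" prefix
    return value & 0xFFFFFFFF   # also maps negative ints onto their 32-bit pattern


def is_missing(value) -> bool:
    """True when bit #31 is set, i.e. the record has no data file."""
    return bool(parse_quality(value) & MISSING_BIT)


def as_signed(value) -> int:
    """Reinterpret QUALITY as a signed 32-bit int, so missing records are < 0."""
    q = parse_quality(value)
    return q - (1 << 32) if q & MISSING_BIT else q


print(is_missing("0x80000000"), as_signed("0x80000000"))
```

Testing the bit directly avoids depending on whether the value was read as signed or unsigned, and it leaves the other 31 bits free for more specific queries.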
…On 7/15/2020 9:27 AM, Shane Maloney wrote:
I think the only way we can handle this cleanly is to have instrument
specific missing data attributes e.g. `attrs.jsoc.SDO_AIA_MISSING` and
`attrs.jsoc.SDO_HMI_MISSING` as I don't think it's possible to support
general queries on the quality flag with the supported operations.
|
The code I've pasted below works for me, so I think this is no longer an issue. If anyone disagrees, feel free to comment/re-open, or open a fresh issue.
import astropy.units as u
import astropy.time
import sunpy
from sunpy.net import jsoc, fido_factory, Fido, attrs as a
import drms
series_name = "hmi.B_720s"
notifier = a.jsoc.Notify("d.stansby@ucl.ac.uk")
series = a.jsoc.Series(series_name)
segments = a.jsoc.Segment('field') & a.jsoc.Segment('inclination') & a.jsoc.Segment('azimuth') & a.jsoc.Segment('disambig')
attrs_time = a.Time('2017/09/06 05:40', '2017/09/06 06:30')
res = Fido.search(attrs_time, series, notifier, segments)
print(res)
# Takes a while and downloads ~180MB
dl_files = Fido.fetch(res) |
Description
I'm trying to download SDO HMI magnetic data for selected time ranges. Sometimes there are MISSING entries in the search results. If you try to download those "corrupted" UnifiedResponse objects, Fido fails.
Expected behavior
Working download without any corrupted objects or ability to filter them out.
Actual behavior
Fido.fetch() crashes with an error (see the gist below).
Steps to Reproduce
This gist: https://gist.github.com/vit1-irk/19d9ebbc69281fd142316f52a2f019ce
Errors also happen when downloading from website
See: https://jsoc.stanford.edu/ajax/lookdata.html
Looks like the problem is on JSOC website, but maybe Sunpy should handle these MISSING entries itself
System Details
Included in the gist. Btw, sorry, but this time additional debugging info (like query dumps) didn't work out properly