Error in requesting specific RGTs and cycles #166

Closed
icetianli opened this issue Jan 27, 2021 · 10 comments

@icetianli
Contributor

icetianli commented Jan 27, 2021

I was trying to download different cycles of ground tracks of interest by passing a cycle list and a track list. I know a similar issue has been discussed before, but when I tried to get multiple RGTs of one cycle, the returned dataset contains all the tracks in between the RGT numbers of this track list, instead of only returning the two RGTs I am interested in, e.g. tracks '0415', '0598' in cycle 7:

import icepyx as ipx

short_name = 'ATL06'
spatial_extent = [-61.7189, -83.5349, -60.7723, -83.3023]
date_range = ['2019-07-30','2020-11-28']
# region_a = ipx.Query(short_name, spatial_extent, date_range)
ipx.Query(short_name, spatial_extent, date_range, cycles='07', tracks=['0415','0598']).avail_granules(ids=True)

The above code will return:

[['ATL06_20200422180234_04150711_003_02.h5',
  'ATL06_20200426034538_04670711_003_02.h5',
  'ATL06_20200426175414_04760711_003_02.h5',
  'ATL06_20200430033716_05280711_003_02.h5',
  'ATL06_20200430174552_05370711_003_02.h5',
  'ATL06_20200504173733_05980711_003_02.h5']] 

When querying multiple RGTs from different cycles, I receive KeyError: 'feed':

ipx.Query(short_name, spatial_extent, date_range, cycles=['06', '07'], tracks=['0415','0598']).avail_granules(ids=True)

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-5-b506fd9b44e5> in <module>
----> 1 ipx.Query(short_name, spatial_extent, date_range, cycles=['06', '07'], tracks=['0415','0598']).avail_granules(ids=True)

/anaconda3/envs/icepyx/lib/python3.7/site-packages/icepyx/core/query.py in __init__(self, dataset, spatial_extent, date_range, start_time, end_time, version, cycles, tracks, orbit_number, files)
    167             self._reqparams.build_params()
    168             # update the list of available granules
--> 169             self.granules.get_avail(self.CMRparams, self.reqparams)
    170 
    171     # ----------------------------------------------------------------------

/anaconda3/envs/icepyx/lib/python3.7/site-packages/icepyx/core/granules.py in get_avail(self, CMRparams, reqparams)
    165             # print(results)
    166 
--> 167             if len(results["feed"]["entry"]) == 0:
    168                 # Out of results, so break out of loop
    169                 break

KeyError: 'feed'

Any idea why? I was also wondering if it’s possible to only return the tracks with specific cycle numbers and RGT numbers that I passed in the Query class. Thanks!

@JessicaS11
Member

JessicaS11 commented Jan 28, 2021

The KeyError is coming from the lack of the appropriate key in the dictionary of results returned from NSIDC. I started tackling #79 accidentally in working through your issue, and if you run:
reg_c=ipx.Query(short_name, spat_ext, date_rng, cycles=['06','07'], tracks=['0415','0598']) on the surf_errs branch you'll at least now get a ValueError that surfaces the error coming from NSIDC:
["orbit_number must be an number or a range in the form 'number,number'.", '[7551,7734,8938] is not a valid class java.lang.Double'].

I'm not entirely sure what to make of it, since on the surface (CMRparams dict prints as: {'short_name': 'ATL06', 'version': '003', 'temporal': '2019-07-30T00:00:00Z,2020-11-28T23:59:59Z', 'bounding_box': '-61.7189,-83.5349,-60.7732,-83.3023', 'orbit_number': '7551,7734,8938,9121'}) it appears that the submission is in the correct format. I'm hoping @nicholas-kotlinski @asteiker @wallinb @tsutterley or someone else at NSIDC can provide some insight. Otherwise I will try to dig into this more tomorrow/early next week.

A favor to ask: in the future, if you could copy-paste the code blocks directly into GitHub (an image of the error trace is fine) it would be super helpful for recreating the problem locally without having to retype everything. I do greatly appreciate you including a MWE to reproduce the problem!

@icetianli
Contributor Author

icetianli commented Jan 29, 2021

Thanks @JessicaS11 for checking on this issue and I have updated the code blocks! Just tried surf_errs branch and it indeed shows ValueError: An error was returned from NSIDC in regards to your query: ["orbit_number must be an number or a range in the form 'number,number'.", '[7551,7734,8938] is not a valid class java.lang.Double']. It's interesting that it works fine if I only pass one cycle '06' or '07' each time, but not for both cycles. Looks like it's something to do with orbit_number calculation?

@dpyles97

I am having issues similar to @icetianli's. Both of these issues also extend to queries for a single RGT, like so:

import icepyx as ipx

short_name = 'ATL06'
spatial_extent = [-51.5122, 71.7328, -51.2279, 71.8399]
date_range = ['2018-10-28', '2020-10-27']
region_a = ipx.Query(short_name, spatial_extent, date_range, cycles=['03','04'], tracks=['0468'])
region_a.avail_granules(ids=True)

This query for RGT '0468' returns cycles '03' and '04', but also many other unwanted RGTs:

[['ATL06_20190428215904_04680305_003_01.h5',
  'ATL06_20190523075828_08410303_003_01.h5',
  'ATL06_20190527203501_09100305_003_01.h5',
  'ATL06_20190531202641_09710305_003_01.h5',
  'ATL06_20190621063429_12830303_003_01.h5',
  'ATL06_20190625062608_13440303_003_01.h5',
  'ATL06_20190625191102_13520305_003_01.h5',
  'ATL06_20190720051023_03380403_003_01.h5',
  'ATL06_20190724050206_03990403_003_01.h5',
  'ATL06_20190724174700_04070405_003_01.h5',
  'ATL06_20190728173843_04680405_003_01.h5']]

Similarly, a query containing more than two cycles returns the KeyError: ‘feed’:

short_name = 'ATL06'
spatial_extent = [-51.5122, 71.7328, -51.2279, 71.8399]
date_range = ['2018-10-28', '2020-10-27']
region_a = ipx.Query(short_name, spatial_extent, date_range, cycles=['03','04','05'], tracks=['0468'])
region_a.avail_granules(ids=True)
KeyError                                  Traceback (most recent call last)
<ipython-input-240-5996c2b70974> in <module>
      2 spatial_extent = [-51.5122, 71.7328, -51.2279, 71.8399]
      3 date_range = ['2018-10-28', '2020-10-27']
----> 4 region_a = ipx.Query(short_name, spatial_extent, date_range, cycles=['03','04','05','06'], tracks=['0468'])
      5 region_a.avail_granules(ids=True)

~\Anaconda3\lib\site-packages\icepyx\core\query.py in __init__(self, dataset, spatial_extent, date_range, start_time, end_time, version, cycles, tracks, orbit_number, files)
    167             self._reqparams.build_params()
    168             # update the list of available granules
--> 169             self.granules.get_avail(self.CMRparams, self.reqparams)
    170 
    171     # ----------------------------------------------------------------------

~\Anaconda3\lib\site-packages\icepyx\core\granules.py in get_avail(self, CMRparams, reqparams)
    165             # print(results)
    166 
--> 167             if len(results["feed"]["entry"]) == 0:
    168                 # Out of results, so break out of loop
    169                 break

KeyError: 'feed'

From my experience, KeyError: 'feed' is raised by the following three general queries:

  1. A query of 3 cycles and 1 RGT
  2. A query of 2 cycles and 2 RGTs
  3. A query of 1 cycle and 3 RGTs

It is also raised if the number of cycles or RGTs is increased in any of these three scenarios (e.g. 4 cycles and 2 RGTs, or 2 cycles and 4 RGTs).
I hope these examples are helpful.

@JessicaS11
Member

Thanks for all the detailed examples @icetianli and @dpyles97. I've figured out what the issue is, with the help of a previous error post on Discourse. Unfortunately, right now the limitation is from the CMR (Common Metadata Repository) end (one of the underlying tools that icepyx uses to actually order the correct data from NSIDC).

Behind the scenes, icepyx turns the cycles and tracks you submit into orbit numbers, which are submitted to CMR to find the right granules. Unfortunately, CMR only accepts single value or range inputs for orbit_number. Thus, a single cycle plus one or two tracks produces valid input from CMR's perspective. However, if the two tracks are not contiguous, you get all of the granules between them, as Tian noted in her initial post:

the returned dataset contains all the tracks in between the RGT numbers of this track list, instead of only returning the two RGTs I am interested in, e.g. tracks '0415', '0598' in cycle 7.

and as Dakota noted in her examples.

Similarly, if you try to submit any combination of tracks * cycles that results in >2 orbit numbers (as helpfully noted in @dpyles97's generalized query notes), you get a feed error because CMR cannot handle the list of orbit numbers.
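
To make the orbit-number limitation concrete, here is a minimal sketch of the conversion. It is not icepyx's actual code: the helper names are hypothetical, 1387 is the number of reference ground tracks in one ICESat-2 cycle, and the offset of 201 is an assumption inferred from the orbit numbers in the CMRparams dict printed earlier in this thread.

```python
from itertools import product

RGTS_PER_CYCLE = 1387  # reference ground tracks in one ICESat-2 cycle
ORBIT_OFFSET = 201     # assumption: inferred from the orbit numbers quoted in this thread


def orbit_numbers(cycles, tracks):
    """Return the sorted orbit numbers for every cycle/track combination."""
    return sorted(
        (int(cycle) - 1) * RGTS_PER_CYCLE + int(track) + ORBIT_OFFSET
        for cycle, track in product(cycles, tracks)
    )


def cmr_interpretation(cycles, tracks):
    """Describe how CMR would treat the resulting orbit_number parameter."""
    orbits = orbit_numbers(cycles, tracks)
    if len(orbits) == 1:
        return "single orbit"
    if len(orbits) == 2:
        return "range"   # every orbit between the two values is matched
    return "rejected"    # more than two values is neither a number nor a range


print(orbit_numbers(["06", "07"], ["0415", "0598"]))
print(cmr_interpretation(["07"], ["0415", "0598"]))
print(cmr_interpretation(["03", "04", "05"], ["0468"]))
```

Under these assumptions, cycles 06 and 07 with tracks 0415 and 0598 give 7551, 7734, 8938, and 9121, matching the CMRparams dict above. For cycle 07 alone, the two values 8938 and 9121 are read as a range, which spans every track from 0415 through 0598.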

As you'll note from the Discourse post, as of October 2020 the CMR team was considering expanding orbit_numbers to accept list inputs. I've reached out to get an update on whether or not this is moving forward and if so, when the release might be. Once we hear back from them, we can decide how we'd like to move forward (obviously we'll need to stop submitting invalid requests; one option could be to create an internal workaround - this would work well with @tsutterley and @andypbarrett's work on #148, which will allow ordering by granule name).
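
Until CMR accepts lists, one possible client-side workaround is to filter the returned granule IDs by name. This is only a sketch: `filter_granules` is a hypothetical helper, not part of icepyx, and it assumes the third underscore-separated field of each file name holds the 4-digit RGT, 2-digit cycle, and 2-digit segment, as in the IDs listed in this thread.

```python
def filter_granules(granule_ids, cycles, tracks):
    """Keep only granule IDs whose RGT and cycle were explicitly requested."""
    wanted_cycles = {int(c) for c in cycles}
    wanted_tracks = {int(t) for t in tracks}
    kept = []
    for name in granule_ids:
        # Assumed layout of the third field: ttttccss (RGT, cycle, segment)
        field = name.split("_")[2]
        track, cycle = int(field[:4]), int(field[4:6])
        if track in wanted_tracks and cycle in wanted_cycles:
            kept.append(name)
    return kept


# Granule IDs returned for cycle '07' and tracks '0415', '0598' in the original report
returned = [
    "ATL06_20200422180234_04150711_003_02.h5",
    "ATL06_20200426034538_04670711_003_02.h5",
    "ATL06_20200426175414_04760711_003_02.h5",
    "ATL06_20200430033716_05280711_003_02.h5",
    "ATL06_20200430174552_05370711_003_02.h5",
    "ATL06_20200504173733_05980711_003_02.h5",
]
print(filter_granules(returned, cycles=["07"], tracks=["0415", "0598"]))
```

Since avail_granules(ids=True) returns a nested list in the outputs shown in this thread, the inner list would be passed in. This only helps for queries CMR accepts in the first place (at most two orbit numbers); it does not avoid the feed error.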

@icetianli
Contributor Author

Great, thanks @JessicaS11 for keeping us updated! I am happy to help with this once the orbit_number limitation is resolved by the CMR team.

@dpyles97

dpyles97 commented Feb 1, 2021

Thanks for the update, @JessicaS11! Good to know - I hope we can get a work-around for these issues in the not too distant future.

@JessicaS11
Member

Hello @dpyles97 and @icetianli. I just merged #148 into development, which allows searches by orbital parameters. I haven't tried these updates with your specific use cases, but I wanted to reopen the conversation to let you know this functionality is available, in case we'd like to think about implementing a way to query/order data as you'd like.

@icetianli
Contributor Author

This is awesome! Thanks @JessicaS11, will check this out! Now we can add orbital parameters to the Visualization object as well.

@dpyles97

Good to know @JessicaS11. This is great! Thanks for working on this.

@JessicaS11
Member

@icetianli @dpyles97 I was looking for an example where I got a feed error to test a new error message. The bad news is I can't test the error message. The good news is that I was able to run all of the problem examples you provided with expected results, so I'm closing this issue. : )
