Loading Spaxels using .getSpaxels #725

Open
manduhmia opened this issue May 3, 2021 · 14 comments
@manduhmia

Describe the bug
An error occurs when loading a large number (~100) of spaxels at once with .getSpaxels(threshold=0.8, lazy=False). It is usually an HTTP error reporting that the data for a specific spaxel cannot be found. However, if I load that problem spaxel on its own with .load(), outside the .getSpaxels call, it loads with no error. So the problem only occurs when loading a large number of spaxels on instantiation.

To Reproduce
Steps to reproduce the behaviour:

  1. Gather an aperture of over 100 spaxels (referred to as AP)
  2. Attempt "AP.getSpaxels(threshold=0.8, lazy=False)"
  3. See error

Expected behaviour
All spaxel data is loaded on instantiation.

Screenshots
Screen Shot 2021-05-03 at 10 05 08 AM
Screen Shot 2021-05-03 at 10 07 36 AM
Screen Shot 2021-05-03 at 10 07 48 AM
Screen Shot 2021-05-03 at 10 07 57 AM

Desktop (please complete the following information):

  • OS: macOS Catalina version 10.15.7
  • Browser : Chrome
  • Version of Marvin : 2.6.0


albireox self-assigned this May 3, 2021
@albireox
Member

albireox commented May 3, 2021

Thanks for the report @manduhmia. I can have a look at this. Could you please send a minimal working example that causes the issue? I.e., a small script or full set of lines of code that triggers this problem? What version of Python are you using, and what DR/MPL?

@manduhmia
Author

manduhmia commented May 3, 2021 via email

@albireox
Member

albireox commented May 3, 2021

I can reproduce the error. The initial issue seems to be

[ERROR]: Traceback (most recent call last):
  File "/Users/albireo/.pyenv/versions/3.7.6/envs/marvin-test/lib/python3.7/site-packages/brain/api/api.py", line 201, in _checkResponse
    isbad = response.raise_for_status()
  File "/Users/albireo/.pyenv/versions/3.7.6/envs/marvin-test/lib/python3.7/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://sas.sdss.org/marvin/api/cubes/10214-9102/quantities/25/22/

which appears to indicate that the spaxel information cannot be found. @havok2063 can you check if there is a problem with the loading of the properties for that cube?

@manduhmia
Author

manduhmia commented May 3, 2021 via email

@albireox
Member

albireox commented May 4, 2021

Does this work when you try using a different cube/plate-ifu? I think it's expected that if there is a problem with the loading of the DAP properties for that plate-ifu it would affect all spaxels.

The error is with the API call to quantities/x/y, which retrieves the DAP properties. When you do cube[x, y] I think you only get the DRP spectrum, so that API call is not issued.

@manduhmia
Author

manduhmia commented May 4, 2021 via email

@havok2063
Collaborator

The API itself looks to be working ok. I can load the maps quantities for a spaxel via Marvin

from marvin.tools.cube import Cube

cube = Cube('10214-9102')
maps = cube.getMaps()
s = maps[22,24]
s.maps_quantities

The API response works when I manually submit a post request

import requests
from marvin import config  # provides config.token for the Authorization header

url = 'https://sas.sdss.org/marvin/api/cubes/10214-9102/quantities/26/22/'
r = requests.post(url, data={'release':"MPL-11"}, headers={'Authorization':f'Bearer {config.token}'})

It also seems to work ok if I reduce the size of the aperture

ap = cube.getAperture((28,28),(2,2,0.785),aperture_type='elliptical')
spax = ap.getSpaxels(threshold=0.8, lazy=False)

So it seems like there's a problem when we loop over a large number of spaxels. I tried the following, where I grab the spaxels lazily and loop over them to load them. It crashed after the 17th spaxel or so; on a second attempt it crashed after spaxel 14. In the traceback you can see a 404 error (URL not found), yet I can access that URL successfully with requests. We disabled the API rate limiting, but maybe something similar is going on? The second error in the traceback looks like a bug in the error handling, probably a legacy Python 2-3 issue.

ap = cube.getAperture((28,28),(8,12,0.785), aperture_type='elliptical')
spax = ap.getSpaxels(threshold=0.8, lazy=True)
len(spax)  # 279 spaxels

for i in spax:
    print(i.plateifu, i.x, i.y)
    i.load('maps')

10214-9102 28 19
10214-9102 29 19
10214-9102 30 19
10214-9102 31 19
10214-9102 32 19
10214-9102 33 19
10214-9102 34 19
10214-9102 35 19
10214-9102 26 20
10214-9102 27 20
10214-9102 28 20
10214-9102 29 20
10214-9102 30 20
10214-9102 31 20
10214-9102 32 20
10214-9102 33 20
10214-9102 34 20
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
~/Work/github_projects/marvin_brain/python/brain/api/api.py in _checkResponse(self, response)
    200         try:
--> 201             isbad = response.raise_for_status()
    202         except requests.HTTPError as http:

~/anaconda3/lib/python3.7/site-packages/requests/models.py in raise_for_status(self)
    940         if http_error_msg:
--> 941             raise HTTPError(http_error_msg, response=self)
    942

HTTPError: 404 Client Error: Not Found for url: https://sas.sdss.org/marvin/api/maps/10214-9102/HYB10/MILESHC-MASTARSSP/quantities/34/20/

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~/Work/github_projects/Marvin/python/marvin/tools/maps.py in _get_spaxel_quantities(self, x, y, spaxel)
    461                                                             template=self.template.name,
--> 462                                                             params=params))
    463             except Exception as ee:

~/Work/github_projects/Marvin/python/marvin/tools/core.py in _toolInteraction(self, url, params)
    133         params = params or {'release': self._release}
--> 134         return marvin.api.api.Interaction(url, params=params)
    135

~/Work/github_projects/marvin_brain/python/brain/api/api.py in __init__(self, route, params, request_type, auth, timeout, headers, stream, datastream, send, base, verify)
     56         if self.url and send:
---> 57             self._sendRequest(request_type)
     58         elif not self.url and send:

~/Work/github_projects/marvin_brain/python/brain/api/api.py in _sendRequest(self, request_type)
    294             # Check the response if it's good
--> 295             self._checkResponse(self._response)
    296

~/Work/github_projects/marvin_brain/python/brain/api/api.py in _checkResponse(self, response)
    228                 self._closeRequestSession()
--> 229                 if str('api_error') in json_data:
    230                     apijson = json_data['api_error']

TypeError: a bytes-like object is required, not 'str'

During handling of the above exception, another exception occurred:

MarvinError                               Traceback (most recent call last)
<ipython-input-72-fb65f4645437> in <module>
      1 for i in origspax:
      2     print(i.plateifu, i.x, i.y)
----> 3     i.load('maps')
      4

~/Work/github_projects/Marvin/python/marvin/tools/spaxel.py in load(self, force)
    285
    286         for tool in ['cube', 'maps', 'modelcube']:
--> 287             self._load_tool(tool, force=(force is not None and force == tool))
    288
    289         self._set_radec()

~/Work/github_projects/Marvin/python/marvin/tools/spaxel.py in _load_tool(self, tool, force)
    331         setattr(self, quantities_dict,
    332                 getattr(getattr(self, '_' + tool), '_get_spaxel_quantities')(self.x, self.y,
--> 333                                                                              spaxel=self))
    334
    335     def getCube(self):

~/Work/github_projects/Marvin/python/marvin/tools/maps.py in _get_spaxel_quantities(self, x, y, spaxel)
    463             except Exception as ee:
    464                 raise marvin.core.exceptions.MarvinError(
--> 465                     'found a problem when checking if remote cube exists: {0}'.format(str(ee)))
    466
    467             data = response.getData()

MarvinError: found a problem when checking if remote cube exists: a bytes-like object is required, not 'str'.
You can submit this error to Marvin GitHub Issues (https://github.com/sdss/marvin/issues/new).
Fill out a subject and some text describing the error that just occurred.
If able, copy and paste the full traceback information into the issue as well.
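
The `TypeError` in the second traceback can be reproduced without Marvin. What follows is a minimal sketch, assuming that `json_data` at line 229 of `api.py` holds the raw response body as `bytes` (the traceback does not show how it is assigned). `has_api_error` is a hypothetical helper written for illustration, not part of brain or Marvin.

```python
import json


def has_api_error(body):
    """Return True if a response body parses to a dict with an 'api_error' key.

    Hypothetical helper: bytes are decoded and parsed first, so the
    membership test runs on a dict rather than on raw bytes.
    """
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON at all (e.g. an HTML error page)
        return False
    return isinstance(data, dict) and "api_error" in data


raw = b'{"api_error": "not found"}'

# On Python 3, testing a str for membership in bytes raises a TypeError.
try:
    "api_error" in raw
except TypeError as err:
    message = str(err)

print(message)
print(has_api_error(raw))
```

If the assumption holds, the original 404 is masked by this secondary error, which would explain why the final `MarvinError` message mentions bytes rather than the missing URL.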

@havok2063
Collaborator

I looked through the web server logs and didn't see any errors, and there are no 404 errors at all. The last run failed (at spaxel item 20 or so):

...
10214-9102 34 20
10214-9102 35 20
10214-9102 36 20
10214-9102 25 21
crashes here

and that spaxel in the log has a 200 success code but just seems to stop

{address space usage: 4116484096 bytes/3925MB} {rss usage: 1036513280 bytes/988MB} [pid: 421880|app: 0|req: 2510/43178] 155.101.19.33 () {50 vars in 1154 bytes} [Tue May  4 08:07:11 2021] POST /marvin/api/maps/10214-9102/HYB10/MILESHC-MASTARSSP/quantities/36/20/ => generated 9445 bytes in 317 msecs (HTTP/1.0 200) 5 headers in 235 bytes (1 switches on core 0)
{address space usage: 4162154496 bytes/3969MB} {rss usage: 1013391360 bytes/966MB} [pid: 211542|app: 0|req: 9155/43179] 2001:1948:414:13::34 () {50 vars in 1115 bytes} [Tue May  4 08:07:12 2021] POST /marvin/api/cubes/10214-9102/quantities/25/21/ => generated 125730 bytes in 114 msecs (HTTP/1.0 200) 5 headers in 237 bytes (1 switches on core 0)
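
A check like the one described above (scanning the server log for non-200 responses) could be scripted roughly as follows. This is only a sketch: the regular expression is inferred from the two log lines quoted here, and `failed_requests` is a hypothetical helper name, so the pattern may need adjusting for other log formats.

```python
import re

# Captures the method, the request path and the HTTP status code from a
# log line shaped like the two examples above (assumed format).
LOG_PATTERN = re.compile(r"(GET|POST) (\S+) => .*?\(HTTP/\d\.\d (\d{3})\)")


def failed_requests(lines):
    """Return (path, status) pairs for requests with a status of 400 or above."""
    failures = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and int(match.group(3)) >= 400:
            failures.append((match.group(2), int(match.group(3))))
    return failures
```

Passing an open log file (or any iterable of lines) to `failed_requests` returns an empty list when every matched request succeeded.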

@manduhmia
Author

manduhmia commented May 10, 2021 via email

@havok2063
Collaborator

I'm trying to reproduce the error. What's weird is on the test server, I do not get the same error. Instead I see this

10214-9102 27 21
10214-9102 28 21
10214-9102 29 21
Sentry responded with an API error: RateLimited(None)
['failed to retrieve data using input parameters.']
10214-9102 30 21
['failed to retrieve data using input parameters.']
10214-9102 31 21
['failed to retrieve data using input parameters.']

which is probably the underlying error that isn't getting caught correctly on the production server. But both the production and test servers are using the same version of Marvin. When I check those spaxels, the quantities are still loaded properly even though there's a "failed to retrieve data" error. So I'm not sure yet exactly why the error is occurring. I'd have to dig into it a bit more.

In the meantime, you could try lazy loading the spaxels, then breaking the list up into chunks of fewer than 100 and loading the spaxels manually, e.g.

spax = ap.getSpaxels(threshold=0.8, lazy=True)

for i in spax:
    i.load('maps')
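
The loop above loads every spaxel in one pass. A chunked version with a pause and a simple retry might look like the sketch below. `load_in_chunks`, its parameters and its default values are all hypothetical choices made for illustration; only `spaxel.load('maps')` comes from the snippets in this thread.

```python
import time


def load_in_chunks(spaxels, chunk_size=50, pause=1.0, retries=3):
    """Load lazily created spaxels in small batches.

    Returns the spaxels that still failed after all retries.
    """
    failed = []
    for start in range(0, len(spaxels), chunk_size):
        for spaxel in spaxels[start:start + chunk_size]:
            for attempt in range(retries):
                try:
                    spaxel.load('maps')
                    break
                except Exception:
                    # Kept broad for the sketch; wait longer after each failure
                    time.sleep(pause * (attempt + 1))
            else:
                # Every attempt raised, so record the spaxel and move on
                failed.append(spaxel)
        # Pause between chunks to ease the load on the server
        time.sleep(pause)
    return failed
```

With the lazily loaded list from above, `failed = load_in_chunks(spax)` would return whichever spaxels could not be loaded, so they can be retried or inspected individually.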

@manduhmia
Author

manduhmia commented May 11, 2021 via email

@havok2063
Collaborator

Interesting. Was it the same galaxy or a different one? Are you still encountering the error?

@manduhmia
Author

manduhmia commented May 12, 2021 via email

@havok2063
Collaborator

Hmm, ok. I find it very strange that some galaxies work ok and others don't, that the errors don't happen as frequently as before, and that it's seemingly random. I'll have to think on it.
