Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datamodel parameter dictionary being mishandled #563

Closed
havok2063 opened this issue Oct 31, 2018 · 5 comments
Closed

datamodel parameter dictionary being mishandled #563

havok2063 opened this issue Oct 31, 2018 · 5 comments
Assignees
Labels
bug a general bug or other breaking feature vetting indicates issues found during dedicated vetting sessions
Milestone

Comments

@havok2063
Copy link
Collaborator

Describe the bug
Bug appears when attempting to load a remote DR15 maps. Initializing and loading datamodel query parameters makes API calls for every release.

To Reproduce

  1. Use the marvindev environment file version of marvin, 2.3.0dev
from marvin.tools import Maps
maps = Maps('7977-12704')

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/tools/mixins/mma.py:139: MarvinUserWarning: local mode failed. Trying remote now.
  warnings.warn('local mode failed. Trying remote now.', MarvinUserWarning)
release MPL-4
making remote call to /public/marvin/api/query/getallparams/ MPL-4
length of keys 1676
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-6-773b22a9e533> in <module>()
----> 1 maps = Maps('7977-12704')

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/tools/maps.py in __init__(self, input, filename, mangaid, plateifu, mode, data, release, drpall, download, nsa_source, bintype, template, template_kin)
     96                                   mangaid=mangaid, plateifu=plateifu,
     97                                   mode=mode, data=data, release=release,
---> 98                                   drpall=drpall, download=download)
     99
    100         NSAMixIn.__init__(self, nsa_source=nsa_source)

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/tools/core.py in __init__(self, input, filename, mangaid, plateifu, mode, data, release, drpall, download)
    123
    124         # Load VACs
--> 125         from marvin.contrib.vacs.base import VACMixIn
    126         self.vacs = VACMixIn.get_vacs(self)
    127

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/contrib/vacs/__init__.py in <module>()
     21 for (module_loader, name, ispkg) in pkgutil.iter_modules([pkg_dir]):
     22     if name != 'base':
---> 23         importlib.import_module('marvin.contrib.vacs.{0}'.format(name), __package__)

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/importlib/__init__.py in import_module(name, package)
    124                 break
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127
    128

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/contrib/vacs/mangahi.py in <module>()
     16 import marvin.tools
     17 from marvin.utils.general.general import get_drpall_table, get_drpall_row
---> 18 from marvin.utils.plot.scatter import plot as scatplot
     19
     20 from .base import VACMixIn

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/utils/plot/scatter.py in <module>()
     13 from marvin.utils.datamodel.dap import datamodel
     14 from marvin.core.exceptions import MarvinUserWarning
---> 15 from marvin.utils.datamodel.query.base import QueryParameter
     16 from marvin.utils.datamodel.dap.base import Property
     17 from marvin.utils.general import invalidArgs, isCallableWithArgs

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/utils/datamodel/query/__init__.py in <module>()
     12
     13 from .base import QueryDataModelList
---> 14 from .MPL import MPL4, MPL5, MPL6, MPL7, DR15
     15
     16 mpllist = [MPL4, MPL5, MPL6, MPL7, DR15]

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/utils/datamodel/query/MPL.py in <module>()
     28 EXCLUDE = ['modelcube', 'modelspaxel', 'redcorr', 'obsinfo', 'dapall'] + BASE_EXCLUDE
     29
---> 30 MPL4 = QueryDataModel(release='MPL-4', groups=groups(), aliases=['MPL4', 'v1_5_1', '1.1.1'], exclude=EXCLUDE, dapdm=datamodel['MPL-4'])
     31
     32

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/utils/datamodel/query/base.py in __init__(self, release, groups, aliases, exclude, **kwargs)
     55         self.bitmasks = get_maskbits(self.release)
     56         self._mode = kwargs.get('mode', config.mode)
---> 57         self._get_parameters()
     58         self._check_datamodels()
     59

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/utils/datamodel/query/base.py in _get_parameters(self)
     87             else:
     88                 self._mode = 'remote'
---> 89             self._get_parameters()
     90
     91     def _get_from_remote(self):

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/utils/datamodel/query/base.py in _get_parameters(self)
     81             self._cleanup_keys()
     82         elif self._mode == 'remote':
---> 83             self._get_from_remote()
     84         elif self._mode == 'auto':
     85             if config.db:

/Users/Brian/anaconda2/envs/marvindev/lib/python3.5/site-packages/marvin/utils/datamodel/query/base.py in _get_from_remote(self)
    122                 PARAM_CACHE.update(ii.getData())
    123                 self._check_aliases()
--> 124                 for key in PARAM_CACHE.keys():
    125                     self._keys = PARAM_CACHE[key] if key in PARAM_CACHE else []
    126                     print('length of keys', len(self._keys))

RuntimeError: dictionary changed size during iteration

Expected behavior
The datamodel should populate the available parameters cleanly.

@havok2063 havok2063 added bug a general bug or other breaking feature vetting indicates issues found during dedicated vetting sessions labels Oct 31, 2018
@havok2063 havok2063 added this to the Vetting milestone Oct 31, 2018
@havok2063 havok2063 self-assigned this Oct 31, 2018
@albireox
Copy link
Member

albireox commented Nov 7, 2018

Tried to reproduce this but was not able to, and the lines that fail doesn't seem to be in master. What branch does this happen in?

@albireox albireox self-assigned this Nov 7, 2018
@havok2063
Copy link
Collaborator Author

This is in the marvindev environment which uses the marvin 2.3.0dev pip release. This release is mapped to the vetround2 branch. In principle, you should be able to recreate it disabling your database and unsetting your MANGA_LOCALHOST environment variable.

@albireox
Copy link
Member

albireox commented Nov 8, 2018

Ok, was able to reproduce. I added PR #570 to try to fix this. Basically the problem is that in Python 3.6+ (or maybe only 3.7+, not sure) PARAM_CACHE.keys() is a dict_keys object that tracks if it has been modified so when you call _remove_query_params(), which changes the dictionary, it fails. By casting it a list it seems to work.

A few caveats:

  • This is a hotfix. Modifying a dictionary on the fly is never a good idea and I'm not sure the code is doing what it's supposed to do.
  • This happens when initialising a Maps. There is no reason for a Maps to have anything to do with queries. The problem seems to be that we are loading the code for all the VACs and one of them is calling something that calls something that eventually calls a query.
  • I'm not sure how the query data model works but it seems we should not need to do a remote query for MPLs we are not using but given that this only happens once at the beginning, and if we can avoid this happening on non-query related imports, I think we can deal with that later.

@havok2063
Copy link
Collaborator Author

Moving my comment here and expanding on your points.

The code might not be working as intended. In theory, the PARAM_CACHE dictionary should be populated once and only once with no modifications being done to it.

As for the Maps, following the traceback, it looks like the query datamodel is being used in the scatter plot code in the manga HI VAC. I use the DAP and Query datamodels to get the proper axis label formatting for scatter plots. I can move those imports into the code itself. Or I can change the VAC to not use the scatter plot at all.

I switched the datamodel to make one remote call to grab all available query parameters for all data releases rather than doing 6 different API calls, which adds to a person's rate limit. This happens when the query datamodel is initialized.

@albireox
Copy link
Member

albireox commented Nov 8, 2018

That sounds good to me. I'd vote to have the scatter import moved inside the code to avoid it being triggered until needed. It seems reasonably clean and simple.

@albireox albireox closed this as completed Nov 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug a general bug or other breaking feature vetting indicates issues found during dedicated vetting sessions
Projects
None yet
Development

No branches or pull requests

2 participants