re.compile not interpreted correctly when passed to the _search.py search method #657

Closed
3 tasks done
wrongkindofdoctor opened this issue Feb 7, 2024 · 2 comments

Comments

wrongkindofdoctor commented Feb 7, 2024

Here's a quick checklist of what to include:

  • Include a detailed description of the bug or suggestion

  • Output of intake_esm.show_versions()

  • Minimal, self-contained copy-pastable example that generates the issue if possible. Please be concise with code posted. See guidelines below on how to provide a good bug report:

Description

I am trying to pass a compiled regular expression (the re.Pattern object returned by re.compile) as the value for one of the column entries in an intake catalog search, following the example in the code comments. However, the search method expects each value in the query dict to be iterable, and it raises a TypeError when it tries to iterate over the re.Pattern object.

What I Did

    import re

    for case_name, case_d in case_dict.items():
        # Search for the case_name group in the path entries
        path_regex = re.compile(r'({})'.format(case_name))
        freq = case_d.varlist.T.frequency
        for v in case_d.varlist.iter_vars():
            # path is passed as a bare compiled pattern, which triggers the error
            cat_subset = cat.search(activity_id=case_d.convention,
                                    standard_name=v.standard_name,
                                    frequency=freq,
                                    realm=v.realm,
                                    path=path_regex
                                    )

The path_regex object passed to catalog _search.search method:

re.compile('(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')

path_regex has the following attributes:

  • flags (int)
  • groupindex (dict)
  • groups (int)
  • pattern (str)

Thus, if values is an re.Pattern object, values.pattern seems to be what the search method should use in the for value in values loop.
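
To make the suggestion above concrete, here is a minimal sketch of one way a search routine could accept a bare pattern: treat a lone str or re.Pattern as a single search term and wrap it before looping. The helper name normalize_query_values is hypothetical and is not part of intake-esm; the snippet also prints the pattern attributes listed above.

```python
import re


def normalize_query_values(values):
    """Hypothetical helper (not intake-esm code): return a list to iterate over."""
    # A bare string or compiled pattern is one search term, so wrap it in a list.
    if isinstance(values, (str, re.Pattern)):
        return [values]
    # Anything else is assumed to already be an iterable of search terms.
    return list(values)


path_regex = re.compile(r"(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)")

print(path_regex.pattern)     # source string of the regular expression
print(path_regex.groups)      # number of capturing groups
print(path_regex.groupindex)  # mapping of named groups to group numbers
print(normalize_query_values(path_regex))
```

Wrapping the pattern keeps the compiled object (and its flags) intact, whereas iterating over values.pattern directly would loop over the individual characters of the string.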
Stack trace


File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 55, in wrapper_function
    return vd.call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 150, in call
    return self.execute(m)
           ^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 222, in execute
    return self.raw_function(**d, **var_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/core.py", line 393, in search
    esmcat_results = self.esmcat.search(require_all_on=require_all_on, query=query)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/cat.py", line 385, in search
    results = search(
              ^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/_search.py", line 46, in search
    for value in values:
TypeError: 're.Pattern' object is not iterable
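
The final frame of the traceback can be reproduced without a catalog. The sketch below only mimics the for value in values step with a hypothetical iterate function; it is not the code in _search.py.

```python
import re

path_regex = re.compile(r"(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)")


def iterate(values):
    """Hypothetical stand-in for the `for value in values` loop in the traceback."""
    return [value for value in values]


try:
    iterate(path_regex)
    error_message = None
except TypeError as exc:
    # A compiled pattern does not implement the iterator protocol.
    error_message = str(exc)

print(error_message)
```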

Version information: output of intake_esm.show_versions()

Paste the output of intake_esm.show_versions() here:

INSTALLED VERSIONS
------------------

cftime: 1.6.2
dask: 2023.9.1
fastprogress: 1.0.3
fsspec: 2024.2.0
gcsfs: None
intake: 0.7.0
intake_esm: 2024.2.6
netCDF4: 1.6.4
pandas: 2.1.0
requests: 2.31.0
s3fs: None
xarray: 2023.8.0
zarr: 2.16.1
Collaborator

mgrover1 commented Feb 28, 2024

@wrongkindofdoctor - can you try passing it as a list? Sorry for the delayed response here.

For example:

    import re

    for case_name, case_d in case_dict.items():
        # Search for the case_name group in the path entries
        path_regex = re.compile(r'({})'.format(case_name))
        freq = case_d.varlist.T.frequency
        for v in case_d.varlist.iter_vars():
            # Wrap the compiled pattern in a list so the search can iterate over it
            cat_subset = cat.search(activity_id=case_d.convention,
                                    standard_name=v.standard_name,
                                    frequency=freq,
                                    realm=v.realm,
                                    path=[path_regex]
                                    )

@wrongkindofdoctor
Author

@mgrover1 sorry for the late response. I just got around to testing this: passing the re.compile object in a list to cat.search resolves the issue. Thanks for your help!
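
For readers who want to see why the list form works, here is a rough sketch of a search loop that iterates over each query entry and applies compiled patterns with Pattern.search. The search_rows function and the sample rows (including their paths) are made up for illustration and are not intake-esm's implementation or real catalog entries.

```python
import re


def search_rows(rows, **query):
    """Hypothetical stand-in for a catalog search; not intake-esm's implementation."""
    matches = []
    for row in rows:
        keep = True
        for column, values in query.items():
            # Each query entry is iterated, so it must be a list or other iterable.
            hit = any(
                value.search(row[column]) is not None
                if isinstance(value, re.Pattern)
                else row[column] == value
                for value in values
            )
            if not hit:
                keep = False
                break
        if keep:
            matches.append(row)
    return matches


# Made-up rows standing in for catalog entries.
rows = [
    {"frequency": "day",
     "path": "/data/CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231/day/tas.nc"},
    {"frequency": "day", "path": "/data/other_case/day/tas.nc"},
]

path_regex = re.compile(r"(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)")
result = search_rows(rows, frequency=["day"], path=[path_regex])
print([row["path"] for row in result])
```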
