500 server error with wildcard facets #54

Closed
pochedls opened this issue Jan 31, 2020 · 7 comments
@pochedls

pochedls commented Jan 31, 2020

I am trying to get started with some simple queries and I noticed that if I don't give a value for "facets" I get a 500 Server Error:

from pyesgf.search import SearchConnection
conn = SearchConnection('https://esgf-node.llnl.gov/esg-search/', distrib=True)
ctx = conn.new_context(variable='tas', time_frequency='mon')
ctx.hit_count
...
HTTPError: 500 Server Error: 500 for url: https://esgf-node.llnl.gov/esg-search/search?format=application%2Fsolr%2Bjson&limit=0&distrib=false&type=Dataset&variable=tas&time_frequency=mon&facets=%2A

But if I set a value for facets (e.g., ctx = conn.new_context(variable='tas', time_frequency='mon', facets='null')), the search is returned successfully.

I think %2A, which appears to be the default value for facets, should be interpreted as a wildcard (*).

Is this expected behavior? Should I just specify some null value for facets (e.g., 0)?
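
For reference, the %2A in the failing URL is simply the percent-encoded form of *. The sketch below uses only the Python standard library, not pyesgf itself, and rebuilds the query string from the parameters visible in the error message. It suggests that the client was sending the wildcard as intended, so the encoding is probably not the cause of the 500.

```python
from urllib.parse import urlencode, unquote

# Parameters copied from the failing request shown in the report above.
params = {
    "format": "application/solr+json",
    "limit": 0,
    "distrib": "false",
    "type": "Dataset",
    "variable": "tas",
    "time_frequency": "mon",
    "facets": "*",
}

# urlencode percent-encodes reserved characters, including "*", "/" and "+".
query_string = urlencode(params)
print(query_string)

# Decoding %2A gives back the literal asterisk.
decoded_facets = unquote("%2A")
print(decoded_facets)
```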

@agstephens
Contributor

Hi @pochedls, thanks for raising this. It looks like the underlying search interface has been modified, and where we used '*' we now need 'null'. You can patch your local copy with:

Lines 200-201 of: pyesgf/search/context.py

        if not ignore_facet_check:
            query_dict['facets'] = 'null'

I'll run the tests to check this doesn't have other unwanted consequences.

@pochedls
Author

Thanks for taking a look @agstephens. I'm wondering if it would be possible to remove facets from the API request if it isn't specified. For example,

from pyesgf.search import SearchConnection
conn = SearchConnection('https://esgf-node.llnl.gov/esg-search/', distrib=True)
ctx = conn.new_context(variable='tas', time_frequency='mon')

would query using the following URL:

https://esgf-node.llnl.gov/esg-search/search?format=application%2Fsolr%2Bjson&limit=0&distrib=false&type=Dataset&variable=tas&time_frequency=mon
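
The suggestion above can be sketched roughly as follows. `build_query_dict` is a hypothetical helper written for illustration, not pyesgf's actual code: it adds a `facets` key only when the caller names some facets, so an unspecified value never reaches the server.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://esgf-node.llnl.gov/esg-search/search"


def build_query_dict(constraints, facets=None):
    """Hypothetical helper (not part of pyesgf): build the query parameters,
    leaving out 'facets' entirely unless the caller supplies some."""
    query = {
        "format": "application/solr+json",
        "limit": 0,
        "distrib": "false",
        "type": "Dataset",
    }
    query.update(constraints)
    if facets:
        # Join the requested facet names into one comma-separated value.
        query["facets"] = ",".join(facets)
    return query


query = build_query_dict({"variable": "tas", "time_frequency": "mon"})
url = SEARCH_URL + "?" + urlencode(query)
print(url)
```

With no facets given, the resulting URL matches the one proposed above.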

@lisi-w

lisi-w commented Jun 10, 2020

Hi, I am also having this issue. I believe it would be helpful not to require facets=null, and to make that the default instead of returning a 500 error whenever facets is not defined in the constructor (the examples posted on readthedocs do not work, for instance). Thanks!

@agstephens
Contributor

@alaniwi: please report here on the issue identified with the Index Node functionality (i.e. the server-side issue).

@cehbrecht
Collaborator

@soay do you have a comment on this?

@soay
Member

soay commented Feb 8, 2021

First of all, we have not changed anything; search still works with facets=*. However, it is not good practice to use the facets parameter with '*', because Solr then tries to load all facets and their values. For CMIP6 we have new facets that are unique for each dataset (e.g. pid, citation), so the query was running into a timeout. That's the reason for the 500.

I modified esg-search some months ago so that it no longer returns pid and citation, so facets=* queries should work now.

In any case, it's still bad practice to use this. This is not a Solr feature; it's just the poor esg-search design that allows it. Given the restructuring of ESGF, I'm not sure if it's worth investing time in changing esg-search ...?
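
One way to follow this advice is to request only the facets you need instead of the wildcard. The sketch below is a minimal illustration, not a tested recipe against a live index node. It assumes the search endpoint accepts a comma-separated list of facet names, and `project` and `experiment_id` are illustrative choices rather than names taken from this thread.

```python
from urllib.parse import urlencode

# Illustrative facet names (an assumption); substitute the ones you need.
wanted_facets = ["project", "experiment_id"]

params = {
    "format": "application/solr+json",
    "limit": 0,
    "type": "Dataset",
    "variable": "tas",
    "time_frequency": "mon",
    # Name the facets explicitly rather than sending facets=*.
    "facets": ",".join(wanted_facets),
}

query_string = urlencode(params)
print(query_string)
```

Naming the facets avoids asking the server to enumerate per-dataset facets such as pid and citation.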

@cehbrecht
Collaborator

@soay thanks for clarifying 😄

I'm closing this.
