Sticky category filter #145

Open
lhannest opened this issue Feb 27, 2018 · 10 comments

@lhannest

The category filter is sticky, and I assume the other filters are as well. Maybe data is being re-ordered each time a query is run? For example, if you run these queries in this order:

https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1
This returns a disease.
https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1&category=gene
This appropriately returns a gene.
https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1
This now returns a gene.
https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1&category=disease
This returns a disease.
https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1
This now returns a disease.

This seems like a bug to me. The same query parameters should return the same data.

@lhannest lhannest added the bug label Feb 27, 2018
@cmungall
Member

Hmm, definitely undesirable. Can you try running the service locally and report what the Solr calls are (in the log)?

@kshefchek
Contributor

This reminds me that @tudorgroza emailed me about the same issue and I meant to turn it into a ticket. From Tudor:

"the search is for some reason 'stateful'. If I search for 'disease', all subsequent calls will return only disease, even if the category is not specified. If I change the category to 'gene', then again, all subsequent calls will return only genes."

@lhannest
Author

http://localhost:5000/api/search/entity/diabetes?rows=1&start=1

2018-02-27 15:49:48,791 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 15:49:48,791 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': [], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 15:49:49,054 - root - INFO - Docs found: 290
2018-02-27 15:49:49,055 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 15:49:49] "GET /api/search/entity/diabetes?rows=1&start=1 HTTP/1.1" 200 -

The entity returned is HP:0005978 with category Phenotype.

http://localhost:5000/api/search/entity/diabetes?rows=1&start=1&category=gene

2018-02-27 16:07:37,428 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:07:37,428 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"gene"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:07:37,662 - root - INFO - Docs found: 44
2018-02-27 16:07:37,663 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:07:37] "GET /api/search/entity/diabetes?rows=1&start=1&category=gene HTTP/1.1" 200 -

The entity returned is MGI:99415 with category gene.

http://localhost:5000/api/search/entity/diabetes?rows=1&start=1

2018-02-27 16:09:05,565 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:09:05,566 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"gene"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:09:05,851 - root - INFO - Docs found: 44
2018-02-27 16:09:05,852 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:09:05] "GET /api/search/entity/diabetes?rows=1&start=1 HTTP/1.1" 200 -

The entity returned is MGI:99415 with category gene.

http://localhost:5000/api/search/entity/diabetes?rows=1&start=1&category=disease

2018-02-27 16:10:42,990 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:10:42,991 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"disease"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:10:43,400 - root - INFO - Docs found: 217
2018-02-27 16:10:43,406 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:10:43] "GET /api/search/entity/diabetes?rows=1&start=1&category=disease HTTP/1.1" 200 -

The entity returned is MONDO:0005148 with category disease.

http://localhost:5000/api/search/entity/diabetes?rows=1&start=1

2018-02-27 16:12:07,989 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:12:07,989 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"disease"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:12:13,232 - root - INFO - Docs found: 217
2018-02-27 16:12:13,235 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:12:13] "GET /api/search/entity/diabetes?rows=1&start=1 HTTP/1.1" 200 -

The entity returned is MONDO:0005148 with category disease.

@lhannest
Author

Yeah, it looks like the filter query is persisting somehow.

@lhannest
Author

lhannest commented Feb 28, 2018

This is odd. I'm stepping through the code in entitysearch.py:

@ns.route('/entity/<term>')
@api.doc(params={'term': 'search string, e.g. shh, parkinson, femur'})
class SearchEntities(Resource):

    @api.expect(simple_parser)

    #@api.marshal_list_with(search_result)
    def get(self, term):
        """
        Returns list of matching concepts or entities using lexical search
        """
        import pudb; pudb.set_trace()
        args = simple_parser.parse_args()
        q = GolrSearchQuery(term,
                            **args)
        results = q.exec()
        return results

PuDB output:

>>> args
{'rows': 1, 'category': None, 'start': 1}
>>> term
'diabetes'

But after stepping into the constructor, PuDB output:

>>> fq
{'category': ['disease']}

@cmungall
Member

ok, this is at the ontobio level

>>> p = {'category':'disease'}
>>> q = GolrSearchQuery('diabetes', **p)
>>> results = q.exec()
>>> q.fq
{'category': 'disease'}
>>> p = {}
>>> q = GolrSearchQuery('diabetes', **p)
>>> results = q.exec()
>>> q.fq
{'category': 'disease'}

@cmungall
Member

class GolrSearchQuery(GolrAbstractQuery):
    """
    Queries over a search document
    """
    def __init__(self,
                 term=None,
[snip]
                 fq={},

I assumed this makes a fresh empty dict each time but apparently not?
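
For anyone who wants to see this outside ontobio, here is a minimal sketch of the same behaviour. `FakeSearchQuery` is a hypothetical stand-in, not the real `GolrSearchQuery`, and it assumes the constructor writes the category into the `fq` dict it was given, which is what the logs above suggest:

```python
class FakeSearchQuery:
    """Hypothetical stand-in for a query class with a mutable default argument."""

    def __init__(self, term=None, category=None, fq={}):
        # The default dict is created once, when the def statement runs,
        # and is then shared by every call that omits fq.
        self.term = term
        self.fq = fq
        if category is not None:
            # Mutating self.fq also mutates the shared default dict.
            self.fq['category'] = category


q1 = FakeSearchQuery('diabetes', category='disease')
q2 = FakeSearchQuery('diabetes')  # no category requested

print(q2.fq)           # the 'disease' filter has leaked in from q1
print(q1.fq is q2.fq)  # both instances hold the same dict object
print(FakeSearchQuery.__init__.__defaults__)  # the stored default now carries the filter
```

The last line is the useful one for debugging: the default values live on the function object itself, so once one call has mutated the dict, every later call that relies on the default sees the change.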

@lhannest
Author

"Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well."

http://docs.python-guide.org/en/latest/writing/gotchas/

I wouldn't have expected that!
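
The usual way around this is a `None` default with the dict built inside the function. A rough sketch of that pattern follows; `FixedSearchQuery` is a hypothetical stand-in, and the real ontobio constructor takes more arguments and may need a different change:

```python
class FixedSearchQuery:
    """Hypothetical stand-in showing the None-sentinel pattern."""

    def __init__(self, term=None, category=None, fq=None):
        self.term = term
        # Build a new dict per instance; copy a caller-supplied dict
        # so that the caller's object is not mutated either.
        self.fq = dict(fq) if fq is not None else {}
        if category is not None:
            self.fq['category'] = category


q1 = FixedSearchQuery('diabetes', category='disease')
q2 = FixedSearchQuery('diabetes')

print(q1.fq)  # {'category': 'disease'}
print(q2.fq)  # {}
```

Copying with `dict(fq)` is a design choice in this sketch: it keeps each query's filters isolated even when a caller reuses one dict across several queries.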

@balhoff
Member

balhoff commented Feb 28, 2018

😱

@deepakunni3
Member

Today I Learned! 🐍
