Strange API discrepancy when lsid is genus #254

Closed
DuncanRowland opened this issue Apr 27, 2017 · 12 comments

Comments

@DuncanRowland

When the lsid is a genus (e.g. Malva), the fq filtering does not work as expected

https://records-ws.nbnatlas.org/occurrences/search?q=lsid:NHMSYS0000460576
Returns the correct value (I think, 19798 totalRecords)
https://records-ws.nbnatlas.org/occurrences/search?q=*:*&fq=lsid:NHMSYS0000460576
Only returns 62.

This is a problem because all easymap calls use fq filtering for lsid.
I could fix this (by making easymap do a straight q query),
but I think it would be better fixed in the API if it's returning the wrong results?

Example that does work (where lsid is a species, e.g. Fox)
https://records-ws.nbnatlas.org/occurrences/search?q=lsid:NHMSYS0000080188
https://records-ws.nbnatlas.org/occurrences/search?q=*:*&fq=lsid:NHMSYS0000080188
both return 165082.
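
A small sketch for reproducing the comparison. It builds the two request URLs for a given lsid so the totalRecords values can be checked side by side. The helper names (`q_url`, `fq_url`, `total_records`) are made up for illustration; only the endpoint and the q/fq parameters come from the URLs above, and the fetch helper assumes the response is JSON with a top-level `totalRecords` field.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://records-ws.nbnatlas.org/occurrences/search"


def q_url(lsid):
    """URL that puts the lsid in the main query (q)."""
    return BASE + "?" + urlencode({"q": "lsid:" + lsid}, safe=":*")


def fq_url(lsid):
    """URL that matches everything in q and puts the lsid in a filter query (fq)."""
    return BASE + "?" + urlencode({"q": "*:*", "fq": "lsid:" + lsid}, safe=":*")


def total_records(url):
    """Fetch a search URL and return totalRecords (assumes a JSON body with that key)."""
    with urlopen(url) as response:
        return json.load(response)["totalRecords"]
```

Calling `total_records(q_url(...))` and `total_records(fq_url(...))` with the Malva lsid should show the gap described above, provided the service still behaves the same way.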

@DuncanRowland
Author

DuncanRowland commented Apr 27, 2017

{"stored":false,"dataType":"string","indexed":true,"multivalue":false,"name":"lsid"}
I'm wondering if it's because lsid is not stored, does that matter?

@DuncanRowland
Author

DuncanRowland commented Apr 27, 2017

And this is a bit unexpected too:
https://records-ws.nbnatlas.org/occurrences/search?q=lsid:NHMSYS0000460576&facets=lsid&pageSize=0
Why would a query specifying a single genus lsid contain multiple lsids in its facets?

@DuncanRowland
Author

DuncanRowland commented Apr 28, 2017

The user who raised this issue adds another url that returns 62 for Malva
https://records.nbnatlas.org/occurrences/search?q=Malva&fq=taxon_name:Malva
Perhaps that sheds some light?

@JimBacon

My passing thought on this is as follows. Suppose you have genus A having species A.x and A.y. There may be records that are just identified to genus level while others are identified to species level. If you request a list of occurrences of genus A there are two logical responses: either just the A records or A + A.x + A.y. If the latter, then facets for each element of the sum would be reasonable.
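
To make the two logical responses concrete, here is a toy model in Python. The records, the `parent` mapping and the function names are all invented for illustration and say nothing about how the API is actually implemented.

```python
from collections import Counter

# Hypothetical records: each one carries the taxon it was identified to.
records = [
    {"id": 1, "taxon": "A"},    # identified to genus level only
    {"id": 2, "taxon": "A.x"},
    {"id": 3, "taxon": "A.x"},
    {"id": 4, "taxon": "A.y"},
    {"id": 5, "taxon": "B.z"},  # belongs to a different genus
]

# Hypothetical taxonomy: child taxon -> parent taxon.
parent = {"A.x": "A", "A.y": "A", "B.z": "B"}


def exact(taxon):
    """Response 1: only records identified to exactly this taxon."""
    return [r for r in records if r["taxon"] == taxon]


def with_children(taxon):
    """Response 2: records for the taxon plus its direct children."""
    return [r for r in records
            if r["taxon"] == taxon or parent.get(r["taxon"]) == taxon]


def facets(result):
    """Count records per taxon, in the manner of a facet on the taxon field."""
    return Counter(r["taxon"] for r in result)
```

With this data, `facets(with_children("A"))` has one entry each for A, A.x and A.y, which is the "facets for each element of the sum" case.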

@DuncanRowland
Author

DuncanRowland commented Apr 28, 2017

Interesting. As I understand it, for lsid:genus:
q=A returns A+A.x+A.y
q=*:* & fq=A returns A
This might be by design? (Although I think it goes against how I understood q and fq to differ.) If so, I can adapt EasyMap so that it does a q query instead of an fq (and hope that doesn't break it in some other way), or leave it and add a caveat for genus maps... I'll wait to see what David says.
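
A minimal sketch of what such an adaptation could look like, assuming request parameters are held as a dict with a `q` string and an `fq` list. The function name `move_lsid_to_q` is hypothetical and this is not EasyMap's actual code. It only handles the simple case from this issue (a match-all q plus exactly one lsid filter) and leaves anything else alone.

```python
def move_lsid_to_q(params):
    """Return a copy of params with a lone `lsid:` filter promoted from fq to q.

    params is a dict with a "q" string and an "fq" list of filter strings.
    The returned dict always has an "fq" list, possibly empty.
    """
    q = params.get("q", "*:*")
    filters = list(params.get("fq", []))
    lsid_filters = [f for f in filters if f.startswith("lsid:")]

    # Only rewrite the simple case: match-all q and exactly one lsid filter.
    if q != "*:*" or len(lsid_filters) != 1:
        return {**params, "fq": filters}

    rewritten = {**params, "q": lsid_filters[0]}
    # Any other filters stay in fq.
    rewritten["fq"] = [f for f in filters if f != lsid_filters[0]]
    return rewritten
```

For example, `{"q": "*:*", "fq": ["lsid:NHMSYS0000460576"]}` becomes `{"q": "lsid:NHMSYS0000460576", "fq": []}`.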

@djtfmartin
Contributor

djtfmartin commented Apr 28, 2017

Thanks @DuncanRowland @JimBacon. Yes, that's correct.

So for q=lsid:NHMSYS0000460576 we are doing a more expansive search which will return A + A.x + A.y as @JimBacon describes (using the nested set query with left/right values).

For fq, this will just do a dumb filter for records assigned the ID of NHMSYS0000460576, so records assigned to the genus only.

So yes, I'd recommend changing EasyMaps to use q instead of fq.
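
For anyone unfamiliar with nested sets, the difference between the two paths can be sketched roughly as follows. The left/right numbers, the records and the function names are invented for illustration. The sketch shows the general nested set idea, not the service's actual query code.

```python
# Hypothetical taxa: name -> (left, right) from a depth-first numbering of the tree.
# A taxon's descendants are exactly the taxa whose interval sits inside its own.
taxa = {
    "A": (1, 6),
    "A.x": (2, 3),
    "A.y": (4, 5),
    "B": (7, 10),
    "B.z": (8, 9),
}

# Hypothetical occurrence records, each assigned to one taxon.
records = ["A", "A.x", "A.x", "A.y", "B.z"]


def expansive(taxon):
    """q-style: records whose taxon's left value lies within [left, right] of the taxon."""
    left, right = taxa[taxon]
    return [r for r in records if left <= taxa[r][0] <= right]


def exact(taxon):
    """fq-style: records assigned to this taxon ID only."""
    return [r for r in records if r == taxon]
```

Here `expansive("A")` picks up the A, A.x and A.y records while `exact("A")` returns only the genus-level record. For a leaf such as A.x the two agree, which is consistent with the species example giving the same count both ways.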

@DuncanRowland
Author

p.s. @JimBacon for the sake of compatibility, I don't suppose you can remember what the original easymap did when presented with a tvk that was a genus?

@DuncanRowland
Author

@djtfmartin OK David, I can do this. :)

@JimBacon

I'm inclined to go with the original poster who said "I noticed that genus taxonomic keys only return a few records, where they used to return all species records". Additional documentation around this issue would be valuable. Thanks for your continuing contribution @DuncanRowland !

@DuncanRowland
Author

OK, this is done.
Hopefully it won't introduce new errors.
I've not deleted the cache, but people with old genus maps will see them update in 30 days.

@DuncanRowland
Author

DuncanRowland commented Apr 28, 2017

p.s.
This is not quite as I document in the QueryPrimer documentation.
https://docs.google.com/document/d/1FiVasGGZ3kRPnu5347GPAef7Tr5LvvghCS6x82xnfu4
Perhaps someone could add it at some point? It's possibly a caveat worth mentioning.
Atb -D.

@DuncanRowland
Author

DuncanRowland commented Apr 28, 2017

Actually, no worries, I can do it. My NBN account is still working :)
-Done.

3 participants