API Search Results Issue #13

Closed
Tantumonium opened this issue Apr 30, 2020 · 10 comments
@Tantumonium

When I search for "jqueryui" using the following URL, "jqueryui" is NOT displayed in the results, and neither are any other libraries where "jqueryui" or "jQuery UI" appears in the keywords/tags or in the library's description.

https://api.cdnjs.com/libraries?search=jqueryui&output=human&fields=filename,homepage,version,keywords,description

However, after removing "description" from the "fields" list and searching for "jqueryui" with the following URL, "jqueryui" IS displayed in the results, as are all the other libraries where "jqueryui" or "jQuery UI" is mentioned.

https://api.cdnjs.com/libraries?search=jqueryui&output=human&fields=filename,homepage,version,keywords

@MattIPv4
Member

@MattIPv4 MattIPv4 self-assigned this Apr 30, 2020
@Tantumonium
Author

Thanks for the info about "output=human".

While we're on the topic of the API... is there any method to perform a more specific search query (e.g. just the name attribute, etc)? Something like the following?

https://api.cdnjs.com/libraries?name=jqueryui&fields=filename,version
https://api.cdnjs.com/libraries?keywords=jqueryui&fields=filename,version
https://api.cdnjs.com/libraries?description=jqueryui&fields=filename,version

I know I can search for and return a specific library (if I already know its exact name)... but otherwise, the API only returns a lot of seemingly unrelated results.

Specifically in regard to "jqueryui": a search on the CDNJS homepage lists jqueryui first, followed by other libraries with jqueryui in the name, followed by results where jqueryui might be in the description/keywords/etc. The API, by contrast, returns everything in seemingly random order. While I can further filter this on my end, it seems like a more robust search query format would be very useful (if it doesn't already exist).
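The ordering described above (exact name match first, then names containing the term, then everything else) can be approximated on the client. The following is only a minimal sketch: `rankResults` is a hypothetical helper, the sample library names are made up, and it assumes each result object has a `name` property.

```javascript
// Hypothetical helper: reorder API results so that exact name matches come
// first, then names that contain the term, then all remaining results.
function rankResults(libraries, term) {
  const needle = term.toLowerCase();

  // A lower tier number sorts earlier.
  const tier = (lib) => {
    const name = (lib.name || "").toLowerCase();
    if (name === needle) return 0;       // exact name match
    if (name.includes(needle)) return 1; // term appears inside the name
    return 2;                            // matched on some other attribute
  };

  // Copy before sorting so the caller's array is left untouched.
  // Array.prototype.sort is stable (ES2019+), so results in the same tier
  // keep the order the API returned them in.
  return [...libraries].sort((a, b) => tier(a) - tier(b));
}

// Made-up sample data, for illustration only.
const sample = [
  { name: "some-other-lib" },
  { name: "jqueryui-example-plugin" },
  { name: "jqueryui" },
];
console.log(rankResults(sample, "jqueryui").map((lib) => lib.name));
// prints [ 'jqueryui', 'jqueryui-example-plugin', 'some-other-lib' ]
```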

Originally posted by @redox in cdnjs/cdnjs#5688 (comment)

Why would searching for "twitter" return hogan.js?

@getsetbro because the search is currently using all attributes (and "twitter" is mentioned in the homepage attribute). We can restrict it to the name + description (or just name) if you guys think it's better. What are the most frequent usages of that API?

@MattIPv4
Member

The API uses the browse endpoint for the index in Algolia to list all libraries. It is intended to act more as a collection of all the libraries we have rather than a powerful search tool.

Looking at the docs, I don’t see any easy way to change what attributes the query is compared against: https://www.algolia.com/doc/api-reference/api-methods/browse/

My recommendation would definitely be to prefer doing filtering yourself with a larger set of results from the API, so that you have more control over what’s happening.

@Tantumonium
Author

Are you currently using the Algolia attributesToRetrieve browse parameter to handle your fields=filename,homepage,version,keywords parameter?

If so, could restrictSearchableAttributes be added to a future release?

@MattIPv4
Member

MattIPv4 commented May 1, 2020

I've just taken a bit more of a look at this having woken up, and the website/API return the exact same results for a query, just in a different order.

The website seems to rank based on relevance to the search query, whereas the API ranks based on the generic ranking score of each library (mostly GitHub stars).

As the API is using browse instead of search, I'm not sure what can be done here to sort the results differently. As mentioned before, the API isn't meant to be a powerful search tool, it is meant to be an index of the libraries we have, which is why it is powered by browse.

@MattIPv4 MattIPv4 transferred this issue from cdnjs/cdnjs May 1, 2020
@Tantumonium
Author

I appreciate you looking into it and for the explanation.

In case anyone else is interested... I ended up adding the following filter to my AJAX call... which cleans things up nicely.

if ( ( obj.name ).toLowerCase().indexOf( search.term ) >= 0 || ( obj.description ).toLowerCase().indexOf( search.term ) >= 0 ) { return obj; }
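For anyone adapting that filter, here is a slightly more defensive variant as a minimal sketch. `matchesTerm` and `filterResults` are hypothetical names, and the code assumes that `name` or `description` may come back as null, so both are guarded before calling string methods. It also lowercases the search term, which the one-liner above leaves to the caller.

```javascript
// Hypothetical helper: true if the term appears in the library's name or description.
function matchesTerm(lib, term) {
  const needle = term.toLowerCase();
  // Fall back to an empty string so a null or missing field cannot throw.
  const name = (lib.name || "").toLowerCase();
  const description = (lib.description || "").toLowerCase();
  return name.includes(needle) || description.includes(needle);
}

// Hypothetical helper: keep only the libraries that match the term.
function filterResults(libraries, term) {
  return libraries.filter((lib) => matchesTerm(lib, term));
}
```

It would be applied to the array of libraries from the API response, for example `filterResults(libraries, "jqueryui")`.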

@MattIPv4
Member

MattIPv4 commented May 1, 2020

I realise I didn't actually respond to your comments:

Are you currently using the Algolia attributesToRetrieve browse parameter to handle your fields=filename,homepage,version,keywords parameter?

We request all fields from Algolia and filter them ourselves, so that we can return null for any extra requested fields and inject our own custom ones.
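As a rough illustration of that "fetch everything, then filter" approach, a field projection could look like the sketch below. This is not the actual API server code: `parseFields` and `pickFields` are hypothetical names, and the sample record is made up.

```javascript
// Hypothetical sketch, not the real cdnjs API implementation.

// Split a comma-separated fields parameter into a clean list of field names.
function parseFields(param) {
  return (param || "")
    .split(",")
    .map((field) => field.trim())
    .filter(Boolean);
}

// Build a response object containing only the requested fields.
// Fields the record does not have are returned as null rather than omitted.
function pickFields(library, requestedFields) {
  const out = {};
  for (const field of requestedFields) {
    const present = Object.prototype.hasOwnProperty.call(library, field);
    out[field] = present ? library[field] : null;
  }
  return out;
}

// Made-up record, for illustration only.
const record = { name: "example-lib", version: "0.0.0" };
console.log(pickFields(record, parseFields("version,keywords")));
// prints { version: '0.0.0', keywords: null }
```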

If so, could restrictSearchableAttributes be added to a future release?

I think that seems sensible to add, perhaps as a search_fields query param? What do you think?

@Tantumonium
Author

I think that seems sensible to add, perhaps as a search_fields query param? What do you think?

I think that would make an excellent addition to the CDNJS API. :)

@MattIPv4
Member

MattIPv4 commented May 3, 2020

I've added this in 8358610, though this hasn't been deployed yet and the new API server is only handling 50% of traffic, so I'm unsure when this will really be available for use.

@MattIPv4
Member

This is now in production at 100% traffic.
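For reference, a request using the new parameter might be built as in the sketch below. `buildSearchUrl` is a hypothetical helper, and the sketch assumes that `search_fields` takes a comma-separated list in the same way `fields` does, which this thread does not spell out.

```javascript
// Hypothetical helper: build a /libraries URL that restricts which
// attributes the search term is matched against.
function buildSearchUrl(term, { fields = [], searchFields = [] } = {}) {
  const url = new URL("https://api.cdnjs.com/libraries");
  url.searchParams.set("search", term);
  if (fields.length > 0) {
    url.searchParams.set("fields", fields.join(","));
  }
  if (searchFields.length > 0) {
    // Assumption: comma-separated, mirroring the existing fields parameter.
    url.searchParams.set("search_fields", searchFields.join(","));
  }
  // Note: URLSearchParams percent-encodes the commas in the final string.
  return url.toString();
}

// Search for "jqueryui" in library names only, returning filename and version.
const requestUrl = buildSearchUrl("jqueryui", {
  fields: ["filename", "version"],
  searchFields: ["name"],
});
console.log(requestUrl);
```

The resulting URL can then be passed to `fetch` or whatever AJAX helper is already in use.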
