Search API not returning exact matches for some queries #972

Closed
Kapeli opened this Issue May 31, 2015 · 11 comments

Comments

9 participants

Kapeli commented May 31, 2015

The Search API doesn't return exact matches for some queries. As far as I can tell, it happens with short queries.

For example, if you search for ib, the ib gem is not part of the search results:

https://rubygems.org/api/v1/search.json?query=ib
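
A minimal sketch for checking this report from a script. It assumes the endpoint returns a JSON array of gem objects that each carry a `name` field; the helper names `search_gems` and `has_exact_match` are made up for illustration, and the sketch only inspects whatever that single request returns.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://rubygems.org/api/v1/search.json"


def search_gems(query, url=SEARCH_URL):
    """Fetch the search results for `query` (performs a network call)."""
    full_url = url + "?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(full_url, timeout=10) as response:
        return json.load(response)


def has_exact_match(results, name):
    """Return True if any result's name equals `name` exactly."""
    return any(gem.get("name") == name for gem in results)
```

Calling `has_exact_match(search_gems("ib"), "ib")` would then report whether the ib gem appears in the response for the URL above.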

ETetzlaff commented Sep 6, 2015

I see this bug in production; however, when I pulled down the repo and tried to reproduce it, I was unable to.

To try to reproduce it, I added the ib gem to the repo's Gemfile, ran the gemcutter task to load gems into the dev db, and then requested your URL locally:
http://localhost:3000/api/v1/search.json?query=ib

Results are in the screenshot below. I'm not sure this is an issue moving forward?

[Screenshot: screen shot 2015-09-05 at 11 08 21 pm]

Author

Kapeli commented Sep 6, 2015

Maybe the production server has some setting that limits the results a search query can return, which the development setup doesn't have?

ETetzlaff commented Sep 6, 2015

It's possible that the production db is doing something, but the query isn't all that glamorous or complicated, so I doubt that a prod db setting is the cause here. After reviewing the logic around it, the only explanation I can come up with is this (assuming the exact same logic is implemented in prod).

The query logic has a base clause that returns 'ib' in this case; however, there is another stipulation: the version of that record HAS to be indexed. If, for whatever reason, the version was not indexed when the gem was created in the prod environment, the gem will not be returned. I'm not 100% sure about the logic surrounding gem creation yet, though.
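
To make the stipulation above concrete, here is a rough sketch using an in-memory SQLite database. Everything in it is an assumption for illustration: the simplified `rubygems` and `versions` tables, the `LIKE` base clause, and the version numbers are hypothetical, and the real rubygems.org query may look quite different.

```python
import sqlite3

# Hypothetical, simplified schema; the real rubygems.org tables and query differ.
SCHEMA = """
CREATE TABLE rubygems (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE versions (
    id INTEGER PRIMARY KEY,
    rubygem_id INTEGER NOT NULL REFERENCES rubygems(id),
    number TEXT NOT NULL,
    indexed INTEGER NOT NULL  -- 1 = indexed, 0 = not indexed
);
"""

SEARCH_SQL = """
SELECT DISTINCT rubygems.name
FROM rubygems
JOIN versions ON versions.rubygem_id = rubygems.id
WHERE rubygems.name LIKE '%' || ? || '%'   -- base clause: name matches the query
  AND versions.indexed = 1                 -- stipulation: an indexed version exists
ORDER BY rubygems.name
"""


def build_demo_db():
    """Create a tiny database where the only version of "ib" is not indexed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO rubygems (id, name) VALUES (?, ?)",
        [(1, "ib"), (2, "ib-ruby")],
    )
    conn.executemany(
        "INSERT INTO versions (rubygem_id, number, indexed) VALUES (?, ?, ?)",
        [(1, "0.1.0", 0), (2, "0.9.2", 1)],  # made-up version numbers
    )
    return conn


def search(conn, query):
    """Return gem names matching `query` that have at least one indexed version."""
    return [row[0] for row in conn.execute(SEARCH_SQL, (query,))]
```

With this demo data, `search(build_demo_db(), "ib")` leaves out "ib" even though the name matches, because its only version has `indexed = 0`; flipping that flag brings the exact match back.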

stratigos commented Oct 15, 2015

This does not seem to be caused by query length: the following single-character gem names all return an exact match when searched against a fresh checkout and a fresh db dump (as of today):

l, Q, h, n, j, f, r, -, v, u, g, _, q, a, c, e, b, p, s, w, i, ... etc

While ib does return an exact match in local dev, the following two-character gem names do not produce an exact match for me:

z1, lt, wb, fx, rm, ko

There may be more, but perhaps these examples will help uncover why exact matches are not being returned by /search. I checked this set against production, and none of these produce an exact match in prod.

I tested a few other two-char searches that I know exist in prod, and they produced the expected exact matches.


Considering: z1, lt, wb, fx, rm, ko

Following what @ETetzlaff found, it appears that the versions associated with these Rubygem instances all have versions.indexed=f. It seems this could happen erroneously during the upload of a new version if S3 fails silently, or within a callback routine in Pusher or Indexer. Reading through this process line by line, though, it appears that versions.indexed should always be set to true.
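
One way to check that observation against a local dump is to list which of the suspect gems have no indexed version at all. The sketch below does this against a hypothetical, simplified SQLite schema (`rubygems` and `versions` with an integer `indexed` flag); the table layout, the helper names, and the demo rows are all assumptions, not the real rubygems.org schema or data.

```python
import sqlite3

# Hypothetical, simplified schema used only for this illustration.
SCHEMA = """
CREATE TABLE rubygems (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE versions (
    id INTEGER PRIMARY KEY,
    rubygem_id INTEGER NOT NULL REFERENCES rubygems(id),
    indexed INTEGER NOT NULL  -- 1 = indexed, 0 = not indexed
);
"""


def make_demo_db():
    """Build made-up rows: z1 and lt have no indexed version, ib has one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO rubygems (id, name) VALUES (?, ?)",
        [(1, "z1"), (2, "lt"), (3, "ib")],
    )
    conn.executemany(
        "INSERT INTO versions (rubygem_id, indexed) VALUES (?, ?)",
        [(1, 0), (2, 0), (2, 0), (3, 1), (3, 0)],
    )
    return conn


def gems_without_indexed_versions(conn, names):
    """Return the given gem names that have versions but none with indexed = 1."""
    names = list(names)
    if not names:
        return []
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT rubygems.name
        FROM rubygems
        JOIN versions ON versions.rubygem_id = rubygems.id
        WHERE rubygems.name IN ({placeholders})
        GROUP BY rubygems.name
        HAVING SUM(versions.indexed) = 0
        ORDER BY rubygems.name
    """
    return [row[0] for row in conn.execute(sql, names)]
```

Against the demo rows, `gems_without_indexed_versions(make_demo_db(), ["z1", "lt", "ib"])` flags z1 and lt but not ib, which mirrors the pattern described above.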

I wonder if the owner of gem ib successfully yanked a version/some versions, then failed during push/update later on, at which point its version was cached as not indexed in production.

Currently I am trying to figure out how to create a user and simulate pushing and re-versioning gems in my local environment. If anyone would like to pair up and assist, I would love a learning experience.

It is possible that the gems in the list above legitimately have versions.indexed=f, in which case they should not be returning exact matches from /search. Please forgive any distractions I may be causing; this is my first day looking at this codebase 😸.

Member

arthurnn commented Oct 17, 2015

We are adding ES to improve our search (#1085). This should be fixed once that is done.

Contributor

vipulnsward commented Apr 5, 2016

Looks like this is still an issue after #1085.

Member

arthurnn commented Apr 5, 2016

Interesting that the web UI returns the right results: https://rubygems.org/search?utf8=%E2%9C%93&query=ib

Member

indirect commented Jun 26, 2016

@sonalkr132 do you think you could look into this?

Member

sonalkr132 commented Jun 27, 2016

Sure 👍

Member

sonalkr132 commented Sep 14, 2016

#1349 was reverted because of #1395

sonalkr132 reopened this Sep 14, 2016

sonalkr132 added bug and removed starter labels Sep 15, 2016

sonalkr132 added this to the Search milestone Sep 15, 2016

sonalkr132 added the backlog label Dec 7, 2016

Member

sonalkr132 commented Mar 12, 2019

Hi, we deployed Elasticsearch for the search API. https://rubygems.org/api/v1/search.json?query=ib now has ib at the top.

Please let us know if this is still an issue.

sonalkr132 closed this Mar 12, 2019
