upgrade to Solr 9.2 #1359

Closed
4 tasks done
rlskoeser opened this issue Apr 19, 2023 · 3 comments
Assignees
Labels
🛠️ chore One-off task or update

Comments

@rlskoeser
Contributor

rlskoeser commented Apr 19, 2023

dev notes

  • depends on parasolr#80 (test parasolr against solr 9.2)
  • test geniza against solr 9 in CI on an upgrade branch
  • update geniza qa ansible playbook for solr 9 staging hosts
  • test geniza qa on solr 9 thoroughly
  • when successful, update geniza production playbook for solr 9
@rlskoeser rlskoeser added the 🛠️ chore One-off task or update label Apr 19, 2023
@blms blms self-assigned this Jun 26, 2023
blms added a commit that referenced this issue Jun 26, 2023
blms added a commit that referenced this issue Jun 26, 2023
- also bump isort in pre-commit hook
- also update comment for unit test workflow
blms added a commit that referenced this issue Jul 11, 2023
@blms
Contributor

blms commented Sep 5, 2023

results from testing:

  • The search seems to work fine, in some cases performing better than the prod site even without the additional feature. (For example, this search is highlighted properly in the test site but not in production.)
  • I did get one strange outcome for shelfmark scoped search: after two correct results, it seems to be evaluating type=edismax literally as part of the query text… is the default parser issue coming up again somehow? (Here’s that search on prod.)

@blms
Contributor

blms commented Sep 14, 2023

@rlskoeser I think I see what's going on with shelfmark scoped search: the search is being evaluated as

'keyword_query': '{!type=edismax qf=$shelfmark_qf}"T-S 8J16.25"'

which would be inserted into

'q': '{!type=edismax qf=$keyword_qf pf=$keyword_pf v=$keyword_query}'

in place of $keyword_query, producing a sort of nested or wrapped query, with two type=edismax and two qfs in different parts of the query.
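
To make the nesting easier to see, here is a minimal Python sketch that textually expands the `$keyword_query` reference using the two parameter values quoted above. The helper `expand_refs` is hypothetical and exists only for illustration; it is not how Solr itself resolves parameter references, and it is not code from geniza or parasolr.

```python
import re

# Request parameters exactly as quoted in this comment.
params = {
    "q": "{!type=edismax qf=$keyword_qf pf=$keyword_pf v=$keyword_query}",
    "keyword_query": '{!type=edismax qf=$shelfmark_qf}"T-S 8J16.25"',
}


def expand_refs(value, params):
    """Replace each $name with params[name] when that name is present.

    Illustration only (hypothetical helper): a single textual pass that
    leaves unknown references such as $keyword_qf untouched.
    """
    return re.sub(
        r"\$(\w+)",
        lambda match: params.get(match.group(1), match.group(0)),
        value,
    )


expanded = expand_refs(params["q"], params)
print(expanded)

# Both the outer and the inner query declare a parser and a qf.
print(expanded.count("type=edismax"), expanded.count("qf=$"))
```

The expanded string contains one local-params block inside the `v=` value of another, which is the wrapped structure described above.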

It seems like this kind of nesting should work according to the docs (under "boost" here), and it does work for the first couple of results, but it then begins to evaluate type as a part of the keyword query. Maybe this kind of wrapping only works with the boost parser and not edismax?

FWIW, I also noticed that on production, that search seems to produce identical results regardless of whether you include shelfmark:, up until roughly result 375, at which point the results are so barely relevant that the difference seems inconsequential:

  1. with shelfmark:
  2. without shelfmark:

Despite that, I would say in both cases this actually performs better than the same search on QA, because it finds other shelfmarks that are similar before producing irrelevant results. Seems like it's mostly just using the boosted shelfmark fields as in a normal keyword query.

Again, odd, as I can't find any documented solr changes that would cause this difference, and the query is exactly the same in the current prod code.

Also, as for your question on Slack about whether this happens for other scoped searches—it does not, as this is the only scoped search where we actually use this kind of syntax. I think we do it because shelfmark_qf refers to multiple fields so we need to use qf. In all the other scoped searches we just replace the term with its solr field name, which is always just a single field.
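
As a rough sketch of the difference between the two scoping strategies described here: the function name `scope_query` and the alias map below are hypothetical, and the field names are placeholders rather than the project's real Solr schema. Only the shelfmark local-params string comes from this thread.

```python
# Hypothetical map from a search scope to a single Solr field.
# These field names are placeholders, not the project's actual schema.
SINGLE_FIELD_ALIASES = {
    "tag": "tags_t",
    "description": "description_t",
}


def scope_query(scope, term):
    """Build the query fragment for a scoped search term (illustrative)."""
    if scope == "shelfmark":
        # shelfmark_qf refers to multiple fields, so the term is wrapped
        # in its own edismax local-params block with a dedicated qf.
        return "{!type=edismax qf=$shelfmark_qf}" + term
    # Every other scope maps to exactly one field: plain field:term syntax.
    return f"{SINGLE_FIELD_ALIASES[scope]}:{term}"


print(scope_query("shelfmark", '"T-S 8J16.25"'))
print(scope_query("tag", "letter"))
```

Only the shelfmark branch produces a `{!type=edismax ...}` prefix, which would explain why the nesting problem is limited to that one scoped search.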

@rlskoeser
Contributor Author

solr upgrade complete!
