-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
9635 solr improvements #10050
9635 solr improvements #10050
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I just added a "closes" tag for #9635 to the PR. Once the pr is merged, I want to hot-patch our prod. with the improvements here, and then close the issue, at least for now. As I said before, there's definitely more to research and investigate and I want to get back to it, but there may be more pressing things r/n. |
This comment has been minimized.
This comment has been minimized.
OK, one test is legitimately failing in this PR, will fix. |
…reindexing the parent dataverse when adding guestbooks, stops the update dataverse command from unnecessarily reindexing the underlying datasets in some cases, makes that reindex truly async. in the onSuccess part of the command. (#9635)
This comment has been minimized.
This comment has been minimized.
…minates reindexing the" (some things got checked in by mistake) This reverts commit 18cd587.
…rk: eliminates reindexing the parent dataverse when adding guestbooks, stops the update dataverse command from unnecessarily reindexing the underlying datasets in some cases, makes that reindex truly async. in the onSuccess part of the command. (#9635)
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
if (settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableSolrFacets, false)) { | ||
return false; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: the following code - is it worth it to query solr just to see if you can query solr? Would a check of the last indexed time be a reasonable alternative? (Not sure how much this would save given it's a simple query).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this looks like it's yet another thing we can save/economize on.
We are already checking that this is the latest released version... so, yes, comparing the version published date against the indexed time stamp should be reliable enough.
These queries that just search for the version id cannot be that expensive; but there's definitely a lot of them in the logs.
guestbook.setEnabled(true); | ||
dataverse.getGuestbooks().add(guestbook); | ||
cmd = new UpdateDataverseCommand(dataverse, null, null, dvRequestService.getDataverseRequest(), null); | ||
commandEngine.submit(cmd); | ||
logger.info("Returned from command"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it just for debugging the async part - probably not needed, but at least should be fine()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually noticed it, and meant to drop it. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I haven't tested everything, but I have cherry-picked the nofollow links to QDR (and have the changes to the Guestbook page running there as well. The rest of the changes make sense.
Cool, thank you. I will refactor the "version searchable" test above, and drop the .info log line quickly. |
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
You can see the above on Dataverse internal at https://dataverse-internal.iq.harvard.edu/dataverse/sek11-15 (I added a couple of objects while facets were unavailable and it worked fine) |
@sekmiller |
This comment has been minimized.
This comment has been minimized.
@sekmiller |
This comment has been minimized.
This comment has been minimized.
@sekmiller and I talked and I just kicked this off for a second opinion on API tests: https://github.com/gdcc/api-test-runner/actions/runs/6935079103 Update: it failed with |
I still need to fix the API test above. |
… ways to improve detecting bad facet queries early on (and trying to decide if it's out of scope or not). #9635
…SS/dataverse into 9635-solr-search-improvements
This comment has been minimized.
This comment has been minimized.
📦 Pushed preview images as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name. |
The tests are passing now. |
What this PR does / why we need it:
The following application-side improvements/workarounds are being added as part of this PR:
nofollow
to all the facet links and such, that we don't want crawlers/bots to touch. This will obviously only help with the nicer bots that take hints and follow rules, but better than nothing.Which issue(s) this PR closes:
Closes #9635
Special notes for your reviewer:
Suggestions on how to test this:
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: