Full-text search does not appear to update, ever #410
Comments
Mini-update here - we are working on establishing some base OS images & pipeline that make it easier for us to spin up new services (like ElasticSearch) -- but more importantly -- maintain and keep them secure in a sustainable way. Thanks for your patience, all, as we keep things rolling. |
It's been a few months. Any updates? |
Server build is in progress. If things go well, we hope to dark launch it tomorrow, wait a bit, then formally announce it. |
Lovely. And just in time for the latest surge of new users. Thx!
|
fyi the full text indices are building now. looks like it will take somewhere between 12-16 hours (best guess) for them to process through the backlog. |
Looks like the index built way faster than expected. Full text search for historical and new toots should be good to go. We'll watch for a few days and then announce if everything continues to look good. |
I can confirm that searching for bulk-loaded content (> 5 days old) works great, but am having trouble searching for more recent posts. How long a built-in delay should we expect between when posts are created vs. when they become searchable? The fact that I can't search for content as much as 3 days old suggests that something may be stalled along the content ingestion pipeline. |
Hrm, I would expect "somewhat immediate", e.g. like maybe 10-15 minutes max. I'll poke and see if we can see what's up. |
To help with your debugging, here's a reproducible test case. ~40 minutes ago, I posted the following: https://hachyderm.io/@pevohr/110702125969608382 Yet I still get only 2 results when searching for the word "teeny":
Hence my hunch that something's going wrong when ingesting new posts to be indexed. |
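The test case above can also be checked programmatically. The following is a minimal sketch, assuming the standard Mastodon `GET /api/v2/search` endpoint with the `q` and `type=statuses` parameters; the helper names (`build_search_url`, `status_is_indexed`, `fetch_search`) and the access token are hypothetical, and full-text status search generally needs an authenticated request.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_search_url(instance: str, query: str) -> str:
    """Build a status-search URL for a Mastodon instance (hypothetical helper)."""
    params = urlencode({"q": query, "type": "statuses"})
    return f"https://{instance}/api/v2/search?{params}"


def status_is_indexed(search_response: dict, status_id: str) -> bool:
    """Return True if the given status ID appears in the search results."""
    statuses = search_response.get("statuses", [])
    return any(status.get("id") == status_id for status in statuses)


def fetch_search(instance: str, query: str, token: str) -> dict:
    """Run the search over the network. `token` is an assumed user access token."""
    request = Request(
        build_search_url(instance, query),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

Calling `status_is_indexed(fetch_search("hachyderm.io", "teeny", token), "110702125969608382")` at intervals after posting would give a rough measure of how long a new post takes to become searchable, if it ever does.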
I'm a bit confused about how text search is currently supposed to work. I'm currently only finding my starred posts, nothing else. UPDATE: I can search some of my posts but not others, not sure what the difference is. |
How it's supposed to work is described accurately in the OP. The bug I've been asking @Preskton to chase is that it currently only seems to work for posts before they backfilled the index in early July. AFAICT, new activity since then isn't getting indexed at all. |
ah nothing since July was the part I didn't know |
Does that fit what you're seeing too? Then it's not just me. 😉 |
After a bit of debugging, that's right, nothing after July 2 |
Full-text search continues to be subject to some gremlin in our system that prevents updates. I'll be looking at this today to double-check configs & firewalls. |
Cross-linking this to mastodon/mastodon#20230, as this seems to be hitting us as well. Although it seems to be all indexing activity. |
Re-running |
I can confirm that content since the last bulk load in early July (including the linked test post above) up through yesterday is all available: https://hachyderm.io/@pevohr/110985396128335885 However, real-time updates are still broken: https://hachyderm.io/@pevohr/110996075411410901 Replaying my "teeny" test search now gets me three posts instead of two, but today's fourth post doesn't appear. |
Sigh. Sometime recently all the bulk-loaded posts stopped being searchable too. Possibly an unintended side effect of the upgrade from 4.1.4 to 4.1.7? |
Hiya, @prohr - we had to disable FT search the other day as it was affecting sidekiq performance pretty badly. We are going to revisit after 4.2.0 which drops next week, as there are some significant changes there. |
Gotcha. Figured it was something like that. Have the core devs given you any guidance on how to tune this feature so it actually has a chance of ... working? |
Hi, wondering if there are any updates to this given that Mastodon 4.2.0 has been implemented now? |
The saga continues - we are attempting to re-enable it today. :) |
We've finished re-indexing -- took about 8 hours. I'm cautiously saying that we've re-enabled full-text search, but we'll watch it over the next week or so to judge impact on sidekiq queues. |
Confirmed that the old behavior seems to be working well (including dynamic updates) if you add the in:library qualifier. Thanks! |
@Preskton thanks to everyone who worked on it, I really find search very useful! |
Background
From the mastodon website:
Previously, we offered full-text search against Hachyderm. Hachydermians have asked if we will reintroduce it. To do so, the infrastructure team will:
Should we choose to reintroduce search, Hachydermians should be able to search for any keyword in toots per the Mastodon manual.
Related Issues
#385
#387
#300
#386