enhance smoosh to clean up search indexes when ddocs change #4718
Conversation
Force-pushed from 7dac5cf to 8ec19b1.
@@ -342,6 +345,8 @@ find_channel(#state{} = State, [Channel | Rest], Object) ->
            find_channel(State, Rest, Object)
    end.

stale_enough({?INDEX_CLEANUP, _}) ->
    true;
ddoc_updated events are rare, unlike database updates, so choosing not to action one would leave unreferenced indexes on disk for a long time, perhaps indefinitely.
We're also triggering these during every shard compaction attempt, proportionally to the number of shards (so we'd roughly trigger once for each clustered db). That should be a low enough rate, but it may be worth checking what it looks like on a busy cluster.
I need to add tests to this and intend to; I just wanted to show the work so far.
Force-pushed from 8ec19b1 to bb020f8.
Overall this looks like what we would want to happen: as soon as we update the ddoc we can clean up the old index data. With the ddoc trigger, one thing I was worried about is the case where we get a ddoc_updated on one node first but not on the others yet: we make a clustered call to get all the ddocs and get the signatures; what if that's stale, since it happens concurrently with the …
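For illustration only, a toy sketch of why a stale signature set would matter (made-up module name and signatures, not this PR's code): the cleanup deletes every on-disk signature that is absent from the active set, so an active set computed from a not-yet-replicated view of the ddocs could delete an index that is still referenced.

```erlang
-module(stale_sig_example).
-export([to_delete/2]).

%% OnDisk and Active are lists of index signature binaries; cleanup would
%% delete every signature found on disk but missing from the active set.
to_delete(OnDisk, Active) ->
    [Sig || Sig <- OnDisk, not lists:member(Sig, Active)].

%% stale_sig_example:to_delete([<<"old">>, <<"new">>], [<<"old">>]).
%%   => [<<"new">>]  stale ddoc view: a still-referenced index gets deleted
%% stale_sig_example:to_delete([<<"old">>, <<"new">>], [<<"old">>, <<"new">>]).
%%   => []           once the ddoc update is visible everywhere
```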
That's an excellent point; we must ensure that cannot happen, at least not with any higher probability than the existing cleanup code for mrview.
cleanup_clouseau_indices(Dbs, ActiveSigs) ->
    Fun = fun(Db) -> clouseau_rpc:cleanup(Db, ActiveSigs) end,
    lists:foreach(Fun, Dbs).

cleanup_nouveau_indices(Dbs, ActiveSigs) ->
    Fun = fun(Db) -> nouveau_api:delete_path(nouveau_util:index_name(Db), ActiveSigs) end,
    lists:foreach(Fun, Dbs).
These shouldn't crash if either of those is not enabled, right? If there is a chance they could, maybe we should wrap them in a try ... catch?
clouseau_rpc:cleanup is a gen_server:cast, so that's fine even if the target node doesn't exist. All the nouveau_api functions have a send_if_enabled check and return {error, nouveau_not_enabled} when nouveau is not enabled. In either case I ignore the function result, but perhaps I should check that it's one of the two expected results?
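For illustration, a hedged sketch of what checking for the two expected results could look like (a sketch only, not the PR code; the combined helper name and the use of couch_log:warning/2 for the unexpected case are my assumptions):

```erlang
cleanup_search_indices(Dbs, ActiveSigs) ->
    Fun = fun(Db) ->
        %% clouseau_rpc:cleanup/2 is a gen_server:cast, so it returns ok even
        %% if the target clouseau node does not exist
        ok = clouseau_rpc:cleanup(Db, ActiveSigs),
        %% nouveau_api:delete_path/2 returns {error, nouveau_not_enabled} when
        %% nouveau is switched off; anything else unexpected is only logged, so
        %% one failure cannot stop the remaining databases from being cleaned up
        case nouveau_api:delete_path(nouveau_util:index_name(Db), ActiveSigs) of
            ok -> ok;
            {error, nouveau_not_enabled} -> ok;
            Else -> couch_log:warning("nouveau index cleanup failed: ~p", [Else])
        end
    end,
    lists:foreach(Fun, Dbs).
```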
Makes sense. The only worry was that we would throw an exception and prevent other indexes from getting cleaned up. Thanks for double-checking; I think it's fine as is, then.
We could punt on changing the trigger mechanism for now and just opt to run the nouveau and dreyfus cleanup as is, alongside the mrview index cleanup?
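If we took that route, a rough sketch of the combined per-database call might look like this (fabric:cleanup_index_files/1 and dreyfus_fabric_cleanup:go/1 are existing clustered cleanup entry points; the nouveau module name below is an assumption on my part):

```erlang
%% Run the mrview, dreyfus/clouseau, and nouveau cleanups together for one
%% clustered database, leaving the existing trigger mechanism untouched.
cleanup_all_indices(DbName) when is_binary(DbName) ->
    fabric:cleanup_index_files(DbName),   %% mrview index files
    dreyfus_fabric_cleanup:go(DbName),    %% clouseau/dreyfus indexes
    nouveau_fabric_cleanup:go(DbName),    %% nouveau indexes (assumed module name)
    ok.
```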
Overview
Automate cleaning up search indexes (clouseau and nouveau) when design documents are updated.
Testing recommendations
TBD
Related Issues or Pull Requests
N/A
Checklist
- [ ] Any new configurable parameters are documented in rel/overlay/etc/default.ini
- [ ] Documentation changes were made in the src/docs folder