Hi,
Problem
At the moment, the validation timeout is hardcoded for the Meilisearch sink. My understanding is that other sinks use synchronous inserts, so the timeout for task success can be set in the UI and increased if inserts take a long time to resolve.
In Meilisearch, tasks are added to a queue and Sequin polls the tasks API to validate. The hardcoded values can be found in the wait_for_task function:
    defp wait_for_task(%MeilisearchSink{} = sink, task_id) do
      ...
      max_retries: 5,
      ...
      delay = Sequin.Time.exponential_backoff(200, count, 10_000)
      ...
    end
That is only 5 retries (actually 4, from what I observed in the logs) and a total validation timeout of 3.2 seconds. Before PR #2038, it was set to 10 retries, so the total backoff delay was 52.6 seconds.
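To make the arithmetic easier to check, here is a minimal sketch of how the total wait adds up. It assumes the delay starts at 200 ms, doubles on each attempt, and is capped at 10,000 ms, with no jitter. That formula is my assumption about what `Sequin.Time.exponential_backoff(200, count, 10_000)` does, not something taken from its source, and `BackoffSketch` is a hypothetical module name.

```elixir
defmodule BackoffSketch do
  # Assumed formula (not taken from Sequin's source): the delay starts at
  # base_ms, doubles on every attempt, and is capped at max_ms. No jitter.
  def delay_ms(attempt, base_ms \\ 200, max_ms \\ 10_000) when attempt >= 1 do
    min(base_ms * Integer.pow(2, attempt - 1), max_ms)
  end

  # Total time spent sleeping if each of `retries` attempts waits once.
  def total_wait_ms(retries, base_ms \\ 200, max_ms \\ 10_000) do
    1..retries
    |> Enum.map(&delay_ms(&1, base_ms, max_ms))
    |> Enum.sum()
  end
end

IO.inspect(BackoffSketch.total_wait_ms(10), label: "10 retries (ms)")
IO.inspect(BackoffSketch.total_wait_ms(4), label: "4 retries (ms)")
```

Under that assumption, 10 retries add up to 52,600 ms, which matches the 52.6 seconds above, and 4 effective retries add up to 3,000 ms, in the same range as the roughly 3 seconds I see in practice.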
Concretely, in my case, task validation worked 100% of the time when adding the first rows to my Meilisearch index. As the index grows, indexing slows down (the rows contain a vector and a chunk of text). The result is an increasing rate of task validation timeouts with retries, and right now I'm at 100% validation timeouts: Meilisearch is unable to complete its tasks in under 3.2 seconds (it currently takes 10-12 seconds to import 50 items).
That means that with Sequin's latest version, the Meilisearch sink should be considered broken, as it only works under ideal conditions with small batch sizes and empty indices.
Suggested solutions
My solution for now has been to fork the repo and restore the 10 retries.
I would like to submit a PR to solve the issue upstream. Here are a few ideas that I would like to discuss first, to make sure the PR can be merged:
1. Change max_retries back from 5 to 10. A quick fix for now that would make almost all configs work, as long as Meilisearch completes tasks in under a minute.
2. Change the exponential backoff mechanism so it uses timeout_seconds as the max value for the total validation timeout. That way, the user can set the timeout value in their config like they would with any other sink. This is the best long-term solution and could be implemented more carefully after (1) is in place as a quick fix.
3. Add a new variable for Meilisearch configurations called task_validation_timeout_seconds. I think it's more ambitious, as the change needs to be propagated to the UI, the docs, and potentially other places I don't know about. I think we should avoid it for now.
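To illustrate the second idea, here is a rough sketch of polling bounded by a deadline instead of a retry count. Everything in it is hypothetical: `DeadlinePollSketch`, `wait_until/4`, and the `:done` / `:pending` / `{:error, reason}` return values of `check_fun` are placeholders for whatever the real tasks API call returns, and the doubling backoff is the same assumed formula as before. In the sink, `timeout_ms` would presumably be derived from `timeout_seconds`.

```elixir
defmodule DeadlinePollSketch do
  # Hypothetical helper, not Sequin's API. `check_fun` stands in for a call
  # to the Meilisearch tasks API and must return :done, :pending, or
  # {:error, reason}.
  def wait_until(check_fun, timeout_ms, base_ms \\ 200, max_ms \\ 10_000) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    poll(check_fun, deadline, 1, base_ms, max_ms)
  end

  defp poll(check_fun, deadline, attempt, base_ms, max_ms) do
    case check_fun.() do
      :done ->
        :ok

      {:error, reason} ->
        {:error, reason}

      :pending ->
        remaining = deadline - System.monotonic_time(:millisecond)

        if remaining <= 0 do
          {:error, :timeout}
        else
          # Assumed doubling backoff, capped at max_ms and at the time left,
          # so the total wait never exceeds the configured timeout.
          delay = min(base_ms * Integer.pow(2, attempt - 1), max_ms)
          Process.sleep(min(delay, remaining))
          poll(check_fun, deadline, attempt + 1, base_ms, max_ms)
        end
    end
  end
end
```

With this shape, the number of polls falls out of the timeout rather than being fixed, so a user who raises the timeout automatically gets more attempts.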
What do you think? I sent a PR for 1 (#2139) and for 2 (#2140).