Benchmark ParadeDB against 1TB Dataset & Update Benchmarks #686

Closed
4 tasks
philippemnoel opened this issue Jan 10, 2024 · 3 comments · Fixed by #1028
Assignees
Labels
ci/tests Issue related to our CI and/or testing frameworks docker Pull requests that update Docker code pg_search Issue related to `pg_search/` priority-2-medium Medium priority issue user-request This issue was directly requested by a user

Comments

@philippemnoel
Collaborator

What
We are starting to get requests from users with larger databases, sometimes 1TB or more. Right now, we only benchmark ParadeDB and pg_bm25 on a fairly small (~8GB) dataset. I propose that we set up a 1TB benchmark, which we could run manually or less frequently than the nightly benchmark runs, to test the system at greater scale.

Here are some relevant links I found:

Why
^

How
^

@philippemnoel philippemnoel added ci/tests Issue related to our CI and/or testing frameworks priority-2-medium Medium priority issue pg_search Issue related to `pg_search/` labels Jan 10, 2024
@GPF199541

Our data use case is 10TB+, and we are looking forward to it very much

@philippemnoel
Collaborator Author

> Our data use case is 10TB+, and we are looking forward to it very much

Sweet! We’ll hopefully get to this before the end of January. I’ll keep you posted. Could you share a bit more about your use case in the meantime?

@philippemnoel philippemnoel added this to the pg_bm25 production ready milestone Feb 5, 2024
@philippemnoel philippemnoel added docker Pull requests that update Docker code pg_analytics user-request This issue was directly requested by a user labels Feb 14, 2024
@philippemnoel philippemnoel changed the title Benchmark ParadeDB against 1TB Dataset Benchmark ParadeDB against 1TB Dataset & Update Benchmarks Mar 5, 2024
@philippemnoel
Collaborator Author

@neilyio is currently looking into this. We'll also update our benchmarks for pg_bm25, which are a bit outdated
