Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Redis table Engine/Function #50150

Merged
merged 35 commits into from Jun 14, 2023
Merged

Conversation

JackyWoo
Copy link
Contributor

@JackyWoo JackyWoo commented May 23, 2023

This PR try to close #46146

Changelog category (leave one):

  • New Feature

Implements storage in the Redis.
Usage : ENGINE = Redis(host:port[, db_index[, password[, pool_size]]]) PRIMARY KEY(key)

In current design, redis table engine/function must have only one column as primary key who is redis key. And the other columns will be serialized together as redis value.

Redis table engine/function now supports

  1. select, insert, update, delete, truncate operations.
  2. equals, in predictes pushdown.

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

  1. Add TableFunctionRedis
  2. Add table engine Redis
  3. Add RedisCommon which contains Redis related tools and types
  4. Support equals and in filter push down into Redis

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Tests

  • Integration tests

@JackyWoo JackyWoo marked this pull request as draft May 23, 2023 14:02
@JackyWoo JackyWoo changed the title Support Redis table Engine/Function Add Redis table Engine/Function May 23, 2023
@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label May 23, 2023
@robot-ch-test-poll3
Copy link
Contributor

robot-ch-test-poll3 commented May 23, 2023

This is an automated comment for commit 49082bf with description of existing statuses. It's updated for the latest CI running
The full report is available here
The overall status of the commit is 🟢 success

Check nameDescriptionStatus
CI runningA meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR🟢 success
Docs CheckBuilds and tests the documentation🟢 success
Fast testNormally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here🟢 success
Mergeable CheckChecks if all other necessary checks are successful🟢 success
Push to DockerhubThe check for building and pushing the CI related docker images to docker hub🟢 success
Style CheckRuns a set of checks to keep the code style clean. If some of tests failed, see the related log from the report🟢 success

@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-feature Pull request with new product feature label May 23, 2023
@helifu
Copy link
Contributor

helifu commented May 30, 2023

Sorry, I'm interested in this feature, but could you please share the usage scenarious of it? Thanks in advance :)

@JackyWoo
Copy link
Contributor Author

@helifu n my production scenario, there is a need for flexibility in querying as well as a requirement for high query throughput. I want to promote query throughput Redis.

Usage:

  1. Insert-into-select data to Redis and query thought Redis directly.
  2. Insert data to Redis and query throught ClickHouse.

@helifu
Copy link
Contributor

helifu commented May 31, 2023

Ah, that's cool. But it seems that redis can't offer high throughput since it's a key-value storage IIRC.

@JackyWoo
Copy link
Contributor Author

JackyWoo commented May 31, 2023

It will work in key based query scenarios.

@JackyWoo JackyWoo force-pushed the support_redis branch 3 times, most recently from b9cf222 to 31d81a5 Compare June 1, 2023 04:16
@pufit pufit self-requested a review June 1, 2023 23:23
@pufit pufit self-assigned this Jun 1, 2023
@JackyWoo JackyWoo marked this pull request as ready for review June 2, 2023 01:57
@JackyWoo
Copy link
Contributor Author

JackyWoo commented Jun 2, 2023

@pufit Please review it when you have a chance. There are some test errors but maybe unrelated with the PR.

@JackyWoo
Copy link
Contributor Author

JackyWoo commented Jun 8, 2023

@pufit May I ask when you have a chance to review th PR?

Copy link
Member

@pufit pufit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides some small questions, LGTM!

src/Storages/StorageRedis.cpp Outdated Show resolved Hide resolved
src/Storages/StorageRedis.cpp Show resolved Hide resolved
@JackyWoo
Copy link
Contributor Author

@pufit for the issue duplicated keys.

To perfectly resolve it, we need additional rehashing stat info from Redis, I submit an issue to Redis for it. And now we have two ways:

  1. Just leave it and add notifications in docs. For duplication is a vary rare case and we can fix it when Redis can provide rehasing stat info.
  2. Keep two scan results in memory and filter them every scan.

I prefer the second one, but any way I will take your opinion?

@pufit
Copy link
Member

pufit commented Jun 12, 2023

@pufit for the issue duplicated keys.

To perfectly resolve it, we need additional rehashing stat info from Redis, I submit an issue to Redis for it. And now we have two ways:

  1. Just leave it and add notifications in docs. For duplication is a vary rare case and we can fix it when Redis can provide rehasing stat info.
  2. Keep two scan results in memory and filter them every scan.

I prefer the second one, but any way I will take your opinion?

Well, actually, the first option looks much more straightforward. All duplications can be handled on ClickHouse side (using DISTINCT, for example). And we can improve this behavior in the future if we consider it necessary.

@JackyWoo
Copy link
Contributor Author

JackyWoo commented Jun 13, 2023

@pufit Thank you, docs is modified.

docs/en/engines/table-engines/integrations/redis.md Outdated Show resolved Hide resolved
docs/en/engines/table-engines/integrations/redis.md Outdated Show resolved Hide resolved
docs/en/engines/table-engines/integrations/redis.md Outdated Show resolved Hide resolved
@pufit pufit merged commit 91d794c into ClickHouse:master Jun 14, 2023
23 of 40 checks passed
@pufit
Copy link
Member

pufit commented Jun 14, 2023

Failed tests:
test_async_query_sending - #50480
02479_mysql_connect_to_self - #50911

@gingerwizard
Copy link
Contributor

@JackyWoo late to see this but started testing it. I see you always use MGET command and never HMGET - the latter feels more natural for table data. Anyway we could support this as MGET means the values must be a string?

@JackyWoo
Copy link
Contributor Author

JackyWoo commented Jul 7, 2023

@gingerwizard Assume that there is a table with schama (key, v1, v2, v3) in which key is ClickHouse primary key. I map key to Redis key and the other columns are packed to Redis value. So MGET is enough. You can start with docs.

PS: key, v1, v2, v3 can be any ClickHouse data type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-feature Pull request with new product feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add redis table function and Redis table engine.
7 participants