feat: v3 write API and line protocol validation #25048

Closed
wants to merge 6 commits into from

@hiltontj hiltontj commented Jun 7, 2024

Closes #25033

Summary

This PR leverages a set of changes introduced experimentally into influxdb3_core to enable the v3 line protocol write API proposed in #24979.

The result is exercised by an end-to-end test that performs a write to the /api/v3/write API, then runs a set of queries to verify that the data can be queried and that the series key in the written line protocol is validated for correctness.

Detailed Changes

Extend the Catalog to support tables with series key

After #25031, the TableDefinition in the catalog is just a wrapper around the Schema type from the core schema crate. Therefore, a lot of the following functionality stems from the changes made to that type to support the series key in influxdb3_core.

  • The catalog can support v1/v2 and v3 tables simultaneously; the difference between them is:
    1. v3 has a schema-level metadata entry that stores the series key members
    2. v3 can contain Key columns, but not Tag columns, while v1/v2 is the inverse
  • A new order is enforced for columns in a schema. For v1/v2:
    tags (in lexicographical order) -> fields -> time
    
    For v3:
    series keys (in user-defined order) -> fields -> time
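
The two orderings above can be sketched as follows. This is a minimal illustration, not the code in this PR: `ColumnKind`, `Column`, `col`, and `order_columns` are hypothetical names, the real types live in the core schema crate, and leaving fields in their incoming order is an assumption, since the description does not say how fields are ordered among themselves.

```rust
use std::cmp::Ordering;

/// Hypothetical column kinds, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnKind {
    Tag,       // allowed in v1/v2 tables only
    SeriesKey, // allowed in v3 tables only
    Field,
    Time,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Column {
    name: String,
    kind: ColumnKind,
}

/// Convenience constructor (hypothetical helper).
fn col(name: &str, kind: ColumnKind) -> Column {
    Column { name: name.to_string(), kind }
}

/// Orders the columns of a table.
///
/// `series_key` is `Some(members in user-defined order)` for a v3 table and
/// `None` for a v1/v2 table. Mixing tag and series key columns is rejected.
fn order_columns(
    mut columns: Vec<Column>,
    series_key: Option<&[&str]>,
) -> Result<Vec<Column>, String> {
    use ColumnKind::*;

    let has_tags = columns.iter().any(|c| c.kind == Tag);
    let has_keys = columns.iter().any(|c| c.kind == SeriesKey);
    match series_key {
        Some(_) if has_tags => {
            return Err("v3 table cannot contain tag columns".to_string())
        }
        None if has_keys => {
            return Err("v1/v2 table cannot contain series key columns".to_string())
        }
        _ => {}
    }

    // Coarse grouping: tags or series keys first, then fields, then time.
    let group = |c: &Column| -> u8 {
        match c.kind {
            Tag | SeriesKey => 0,
            Field => 1,
            Time => 2,
        }
    };
    // Position of a series key member in the user-defined order.
    let key_pos = |name: &str| -> usize {
        series_key
            .and_then(|members| members.iter().position(|m| *m == name))
            .unwrap_or(usize::MAX)
    };

    // `sort_by` is stable, so fields keep their incoming order (an assumption).
    columns.sort_by(|a, b| {
        group(a).cmp(&group(b)).then_with(|| match (a.kind, b.kind) {
            (Tag, Tag) => a.name.cmp(&b.name), // lexicographical
            (SeriesKey, SeriesKey) => key_pos(&a.name).cmp(&key_pos(&b.name)),
            _ => Ordering::Equal,
        })
    });
    Ok(columns)
}
```

Calling `order_columns(columns, None)` applies the v1/v2 ordering, while `order_columns(columns, Some(&["host", "cpu"][..]))` applies the v3 ordering with `host` placed before `cpu`.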
    

Split the write path to enable validation and buffering of v1 and v3 writes

The code that parsed and validated incoming writes was previously written as a series of nested function calls. This PR refactored that code by:

Add a new /api/v3/write API

This is the API that leverages the new write path, and can be used to perform writes using the new write protocol. It accepts the same parameters as the existing /api/v3/write_lp API.

This commit points at the series key branch on influxdb3_core, and refactors
the code to make use of it throughout.

TableDefinitions in the catalog are updated to support the series key
column type, and will cause panics when de/serializing table snapshots
that contain cross-contaminated column types, i.e., that have both tags
and series key columns.

This does not implement the v3 write API or any of the stack that
performs the writes to use these new TableDefinition features.

Several tests are broken by this commit; they may be fixed in a future
commit, or with amendments, but this is a stopping point before pursuing
further changes.
An end-to-end test was added to test writing to the v3
write API, along with tests for failure modes to check
the validation of incoming writes.
@hiltontj

Closed, see #25066

@hiltontj hiltontj deleted the hiltontj/write-v3-lp branch July 12, 2024 17:55
Linked issue: Add an /api/v3/write API to use v3 line protocol