feat(core): data deduplication #3524

Merged
merged 54 commits into from
Jul 19, 2023

@ideoma ideoma commented Jun 29, 2023

Related questdb/roadmap#3

This PR adds insert deduplication at the storage level. The syntax is:

CREATE TABLE no_ts_dups_table (ts timestamp, l long) timestamp(ts) PARTITION BY DAY WAL DEDUPLICATE UPSERT KEYS(ts)

-- or
CREATE TABLE no_ts_dups_table (ts timestamp, l long) timestamp(ts) PARTITION BY DAY WAL DEDUP UPSERT KEYS(ts)

-- disabling
ALTER TABLE no_ts_dups_table DEDUP DISABLE

-- re-enabling
ALTER TABLE no_ts_dups_table DEDUP UPSERT KEYS(ts)

Deduplication in this PR is limited to the designated timestamp column and is supported only for WAL tables. It replaces all matching rows with the latest version from the insert. For example, given the table

ts d
08:00 1
08:00 2
09:00 3

and inserts (in one or several commits)

ts d
08:00 4
08:00 5
09:00 6
10:00 7
10:00 8

The result will be

ts d
08:00 5
08:00 5
09:00 6
10:00 8
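
The example above can be reproduced with a small in-memory model. This is only a sketch of the semantics described in this PR, not the storage-level implementation: the function name `apply_dedup_upsert` and the list-of-tuples row representation are assumptions made for illustration. It assumes that the last inserted row for a timestamp wins, that every existing row with that timestamp takes the winning value, and that a timestamp new to the table contributes a single row.

```python
def apply_dedup_upsert(existing, inserted):
    """Hypothetical model of DEDUP UPSERT KEYS(ts) on (ts, value) rows."""
    # The last row per timestamp in the inserted data wins.
    latest = {}
    for ts, value in inserted:
        latest[ts] = value

    result = []
    seen = set()
    for ts, value in existing:
        # Every existing row with a matching timestamp is replaced.
        result.append((ts, latest.get(ts, value)))
        seen.add(ts)

    # Timestamps not yet in the table add exactly one new row each.
    for ts, value in latest.items():
        if ts not in seen:
            result.append((ts, value))

    # Keep the table ordered by timestamp (stable for equal keys).
    return sorted(result, key=lambda row: row[0])


table = [("08:00", 1), ("08:00", 2), ("09:00", 3)]
inserts = [("08:00", 4), ("08:00", 5), ("09:00", 6), ("10:00", 7), ("10:00", 8)]
print(apply_dedup_upsert(table, inserts))
# [('08:00', 5), ('08:00', 5), ('09:00', 6), ('10:00', 8)]
```

Note that the two rows that already shared the 08:00 timestamp remain two rows in this model, matching the result shown above; only their values are replaced.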

@ideoma ideoma marked this pull request as draft June 29, 2023 10:21
 Conflicts:
	core/src/test/java/io/questdb/test/griffin/O3SquashPartitionTest.java

ideoma commented Jul 19, 2023

[PR Coverage check]

😍 pass : 373 / 404 (92.33%)

file detail

path covered_lines new_lines coverage
🔵 io/questdb/cairo/wal/seq/MetadataServiceStub.java 0 2 00.00%
🔵 io/questdb/PropServerConfiguration.java 0 1 00.00%
🔵 io/questdb/cairo/DefaultCairoConfiguration.java 0 1 00.00%
🔵 io/questdb/cairo/O3OpenColumnJob.java 2 3 66.67%
🔵 io/questdb/cairo/TableWriterMetadata.java 11 14 78.57%
🔵 io/questdb/cairo/O3PartitionJob.java 28 35 80.00%
🔵 io/questdb/cairo/TableWriter.java 103 114 90.35%
🔵 io/questdb/griffin/SqlCompiler.java 48 51 94.12%
🔵 io/questdb/griffin/engine/ops/AlterOperation.java 17 18 94.44%
🔵 io/questdb/griffin/SqlParser.java 39 40 97.50%
🔵 io/questdb/griffin/engine/ops/AlterOperationBuilder.java 15 15 100.00%
🔵 io/questdb/cairo/wal/WalWriterMetadata.java 1 1 100.00%
🔵 io/questdb/cairo/wal/WalWriter.java 1 1 100.00%
🔵 io/questdb/cairo/TableReaderMetadata.java 3 3 100.00%
🔵 io/questdb/cutlass/text/CairoTextWriter.java 1 1 100.00%
🔵 io/questdb/tasks/O3CallbackTask.java 2 2 100.00%
🔵 io/questdb/cairo/wal/seq/TableSequencerImpl.java 1 1 100.00%
🔵 io/questdb/cairo/O3CopyJob.java 12 12 100.00%
🔵 io/questdb/cairo/wal/seq/SequencerMetadata.java 1 1 100.00%
🔵 io/questdb/cairo/TableUtils.java 6 6 100.00%
🔵 io/questdb/cairo/O3CallbackJob.java 1 1 100.00%
🔵 io/questdb/cairo/security/ReadOnlySecurityContext.java 1 1 100.00%
🔵 io/questdb/cutlass/text/ParallelCsvFileImporter.java 1 1 100.00%
🔵 io/questdb/std/Vect.java 7 7 100.00%
🔵 io/questdb/cairo/TableColumnMetadata.java 7 7 100.00%
🔵 io/questdb/cutlass/line/udp/LineUdpParserImpl.java 1 1 100.00%
🔵 io/questdb/cairo/AbstractRecordMetadata.java 1 1 100.00%
🔵 io/questdb/cutlass/line/tcp/TableStructureAdapter.java 1 1 100.00%
🔵 io/questdb/griffin/model/CreateTableModel.java 5 5 100.00%
🔵 io/questdb/cairo/GenericRecordMetadata.java 2 2 100.00%
🔵 io/questdb/griffin/SqlKeywords.java 51 51 100.00%
🔵 io/questdb/cairo/security/AllowAllSecurityContext.java 1 1 100.00%
🔵 io/questdb/tasks/O3PartitionTask.java 3 3 100.00%

@bluestreak01 bluestreak01 merged commit cfb4eb7 into master Jul 19, 2023
21 checks passed
@bluestreak01 bluestreak01 deleted the dedup branch July 19, 2023 17:39