
[IOTDB-3611] Support "Modify Time Series Encoding and Compression Type" interface/command#6884

Closed
lpf4254302 wants to merge 81 commits into apache:master from lpf4254302:support_alter_timeseries

Conversation

@lpf4254302
Contributor

Description

Target:
Support a command that modifies the encoding type and compression type of a time series.
Application scenarios:
In IoTDB deployments, well-chosen encoding and compression algorithms can substantially reduce disk usage and, indirectly, server costs. Modifying the encoding and compression algorithms of an existing time series is a direct way to achieve this.
Example:
The measurement (physical quantity) root.sg1.device_1.m1 initially uses the PLAIN encoding, and its data increases monotonically: 1, 2, 3, 4, 5, ...
After a long period of accumulation, the data occupies about 1 GB of disk space. After the encoding type is changed to TS_2DIFF, the disk usage can drop to less than 100 MB.
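
To illustrate why difference-based encoding suits the monotonic data in the example, here is a minimal Java sketch. It is a simplified illustration, not IoTDB's TS_2DIFF implementation: the class `DeltaSketch` and its methods are hypothetical, and overflow is ignored. It computes the differences between consecutive values and the number of bits each difference would need once the smallest difference in the block is subtracted.

```java
import java.util.Arrays;

// Simplified illustration only; this is NOT IoTDB's TS_2DIFF implementation.
final class DeltaSketch {

  // Returns the differences between consecutive values.
  static long[] deltas(long[] values) {
    if (values.length < 2) {
      return new long[0];
    }
    long[] out = new long[values.length - 1];
    for (int i = 1; i < values.length; i++) {
      out[i - 1] = values[i] - values[i - 1];
    }
    return out;
  }

  // Bits needed per delta after subtracting the smallest delta in the block.
  static int bitsPerDelta(long[] deltas) {
    if (deltas.length == 0) {
      return 0;
    }
    long min = Arrays.stream(deltas).min().getAsLong();
    long max = Arrays.stream(deltas).max().getAsLong();
    long range = max - min;
    // numberOfLeadingZeros(0) is 64, so equal deltas need 0 bits each.
    return 64 - Long.numberOfLeadingZeros(range);
  }

  public static void main(String[] args) {
    long[] values = {1, 2, 3, 4, 5};
    long[] d = deltas(values);
    System.out.println(Arrays.toString(d)); // [1, 1, 1, 1]
    System.out.println(bitsPerDelta(d));    // 0
  }
}
```

For 1, 2, 3, 4, 5 every difference is identical, so in this simplified model only the first value and the common difference need to be stored, whereas PLAIN stores every value in full.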

Plan selection

Single vs. batch modification of measurements

Option 1: A single command modifies the encoding type and compression type of one measurement only
Option 2: A single command supports batch modification of the encoding type and compression type of multiple measurements
Conclusion: The first version uses Option 1. Option 2 will be supported once the command has been verified.

Data range affected by the command

Option 1: Affect only data inserted after the modification
Option 2: Affect all sealed and unsealed data as well as data inserted after the modification
Conclusion: The first version uses Option 2. After the command executes successfully, the change in disk usage is visible immediately.

Merge-related code changes

Before developing this feature, the existing merge code must be modified to accommodate it:
1. When the same measurement has different encoding types or compression types in the tsfile, the merged tsfile will be corrupted; this problem needs to be fixed
2. Because the merge process must be mutually exclusive with this operation, additional lock control is required
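
The lock control in item 2 can be pictured with a small Java sketch. This is an assumption-laden illustration: `RewriteGuard` and `runExclusively` are hypothetical names, and the PR may well block on the lock rather than skip when it is held. The sketch only shows one way to keep a rewrite and a merge from running at the same time.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; the names are illustrative and not taken from the PR.
final class RewriteGuard {
  private final ReentrantLock rewriteLock = new ReentrantLock();

  // Runs the task only if no other thread currently holds the lock.
  // Returns true if the task ran, false if it was skipped.
  boolean runExclusively(Runnable task) {
    if (!rewriteLock.tryLock()) {
      return false;
    }
    try {
      task.run();
      return true;
    } finally {
      // Always release, even if the task throws.
      rewriteLock.unlock();
    }
  }
}
```

A merge task and a rewrite task that share one `RewriteGuard` per storage group would then never overlap; whichever arrives second is told to retry later.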

Normal flow for modifying the encoding type and compression type

1. Verify the request parameters before execution, including non-null checks, cluster status checks, etc.
2. Modify the encoding type and compression type in the schema
3. Invalidate the schema cache
4. Find the storage group
5. Perform the modification per virtual storage group
6. Force-close the working TsFileProcessors
7. Acquire rewriteLock
8. Generate the alter log
9. Rewrite the sequence and unsequence tsfile data separately
9.1. (Un)sequenceListByTimePartition
9.2. Traverse tsFileResource
9.3. Keep only the tsFileResources that have not been rewritten yet (recovery)
9.4. Generate targetTsFileResource
9.5. Rewrite tsFileResource
9.5.1. Acquire the tsFileResource read lock
9.5.2. Read the device list
9.5.3. startChunkGroup
9.5.4. Read chunks
9.5.5. If the measurement is not the target of the modification, write the chunk directly
9.5.6. If the measurement is the target of the modification, read its pages and points one by one, then re-encode and compress them before writing
9.5.7. endChunkGroup
9.5.8. endFile
9.5.9. Release the tsFileResource read lock
9.6. Rename the files: .tsfile -> .alter.old, then .alter -> .tsfile
9.7. Replace tsFileResource with targetTsFileResource
9.8. Delete the files related to the original tsfile (.tsfile .resource .mods)
10. Delete the alter log
11. Release rewriteLock
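
The core decision in steps 9.5.5 and 9.5.6 can be sketched in Java as follows. This is a simplified model under stated assumptions, not the PR's code: `Chunk` is a hypothetical stand-in that holds already-decoded points and an encoding label, whereas the real rewrite works on TsFile chunks, pages and encoders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the per-chunk rewrite decision; not the PR's classes.
final class RewriteSketch {

  // Stand-in for a chunk: measurement name, encoding label, decoded points.
  static final class Chunk {
    final String measurement;
    final String encoding;
    final List<Long> points;

    Chunk(String measurement, String encoding, List<Long> points) {
      this.measurement = measurement;
      this.encoding = encoding;
      this.points = points;
    }
  }

  static List<Chunk> rewrite(List<Chunk> source, String targetMeasurement, String newEncoding) {
    List<Chunk> target = new ArrayList<>();
    for (Chunk chunk : source) {
      if (!chunk.measurement.equals(targetMeasurement)) {
        // Step 9.5.5: untouched measurement, copy the chunk as-is.
        target.add(chunk);
      } else {
        // Step 9.5.6: read the points and write them under the new encoding.
        target.add(new Chunk(chunk.measurement, newEncoding, new ArrayList<>(chunk.points)));
      }
    }
    return target;
  }
}
```

Copying unmodified chunks without decoding them keeps the cost of the rewrite proportional to the data of the altered measurement rather than to the whole file.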

Recovery after schema modification

The mlog adds AlterTimeSeriesPlan, together with the corresponding recovery method

Recovery on service restart

1. Determine whether RecoverAlter is required before RecoverCompaction
2. Execute recoverAlter before initCompaction

recoverAlter method flow
1. Parse alter.log to get the list of unfinished tsfiles
2. Check the list of unfinished tsfiles and perform the pre-recovery actions
3. Rewrite the tsfiles

Pre-recovery action policy
Possible states of an unfinished tsfile:
1. The .tsfile does not exist
1.1. Both .alter.old and .alter exist: wait for completion
1.2. Only .alter.old exists: system exception
1.3. Only .alter exists: system exception
2. The .tsfile exists
2.1. .alter exists: writing
2.2. .alter.old exists: wait for delete
2.3. Neither .alter nor .alter.old exists: not started
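
The state table above maps naturally onto a small classification function. The Java sketch below is illustrative only: `AlterRecoveryState`, `State` and `classify` are hypothetical names. Two combinations are not covered by the list and are resolved here by assumption: no file at all is treated as a system exception, and .tsfile together with both .alter and .alter.old is treated as writing.

```java
// Hypothetical sketch of the pre-recovery state check; not the PR's code.
final class AlterRecoveryState {

  enum State {
    WAIT_FOR_COMPLETION,
    SYSTEM_EXCEPTION,
    WRITING,
    WAIT_FOR_DELETE,
    NOT_STARTED
  }

  // Each flag says whether the file with that suffix exists on disk.
  static State classify(boolean tsfile, boolean alter, boolean alterOld) {
    if (!tsfile) {
      // 1.1: both files exist, the rename sequence only needs finishing.
      // 1.2 / 1.3 (and, by assumption, "nothing exists"): system exception.
      return (alter && alterOld) ? State.WAIT_FOR_COMPLETION : State.SYSTEM_EXCEPTION;
    }
    if (alter) {
      return State.WRITING; // 2.1 (assumed to win if .alter.old also exists)
    }
    if (alterOld) {
      return State.WAIT_FOR_DELETE; // 2.2
    }
    return State.NOT_STARTED; // 2.3
  }
}
```

A caller could obtain the three flags with `java.nio.file.Files.exists` on the corresponding paths; how those paths are derived from the tsfile name is left out here because the description does not spell it out.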

Remaining work

1. Optimize the rewrite of aligned time series data
2. Implement the related methods in RSchemaRegion and SchemaRegionSchemaFileImpl
3. Support clusters
4. Change the tsfile rewrite operation to asynchronous execution
5. Support batch modification of measurements

This PR has:

  • [√ ] been self-reviewed.
    • [√ ] concurrent read
    • [√ ] concurrent write
    • [√ ] concurrent read and write
  • [√ ] added documentation for new or modified features or behaviors.
  • [√ ] added Javadocs for most classes and all non-trivial methods.
  • [√ ] added comments explaining the "why" and the intent of the code wherever it would not be obvious
    for an unfamiliar reader.
  • [√ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold
    for code coverage.
  • [√ ] added integration tests.
  • [ ] been tested in a test IoTDB cluster.

Key changed/added classes (or packages if there are too many classes) in this PR

PlanExecutor
LocalSchemaProcessor
SchemaRegionSchemaFileImpl
DataRegion
org.apache.iotdb.db.engine.alter
IoTDBSqlParser.g4
CrossSpaceCompactionTask.java
SingleSeriesCompactionExecutor.java
InnerSpaceCompactionTask.java
TsFileManager
MeasurementMNode

…eries

# Conflicts:
#	antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/SqlLexer.g4
#	schema-engine-rocksdb/src/main/java/org/apache/iotdb/db/metadata/schemaregion/rocksdb/mnode/RMeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/conf/IoTDBConfig.java
#	server/src/main/java/org/apache/iotdb/db/engine/storagegroup/DataRegion.java
#	server/src/main/java/org/apache/iotdb/db/metadata/idtable/entry/InsertMeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/metadata/mnode/IMeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/metadata/mnode/MeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/metadata/schemaregion/ISchemaRegion.java
#	server/src/main/java/org/apache/iotdb/db/metadata/schemaregion/SchemaRegionMemoryImpl.java
#	server/src/main/java/org/apache/iotdb/db/mpp/plan/constant/StatementType.java
#	server/src/main/java/org/apache/iotdb/db/mpp/plan/parser/ASTVisitor.java
#	server/src/main/java/org/apache/iotdb/db/qp/executor/PlanExecutor.java
#	server/src/main/java/org/apache/iotdb/db/qp/logical/Operator.java
#	server/src/main/java/org/apache/iotdb/db/qp/physical/PhysicalPlan.java
1. The altering record cache needs to be cleared when the task completes
2. The altering record cache can only be cleared by storageGroupName
3. The method getCompactionTaskFutureCheckStatusMayBlock no longer performs a timeout check
…eries

# Conflicts:
#	server/src/main/java/org/apache/iotdb/db/conf/IoTDBConfig.java
#	server/src/main/java/org/apache/iotdb/db/engine/compaction/performer/impl/ReadPointCompactionPerformer.java
#	server/src/main/java/org/apache/iotdb/db/metadata/mnode/IMeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/mpp/plan/constant/StatementType.java
lpf4254302 and others added 5 commits September 30, 2022 10:09
…eries

# Conflicts:
#	antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/IoTDBSqlParser.g4
#	schema-engine-rocksdb/src/main/java/org/apache/iotdb/db/metadata/schemaregion/rocksdb/mnode/RMeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/conf/IoTDBConfig.java
#	server/src/main/java/org/apache/iotdb/db/engine/StorageEngine.java
#	server/src/main/java/org/apache/iotdb/db/engine/compaction/performer/impl/ReadPointCompactionPerformer.java
#	server/src/main/java/org/apache/iotdb/db/engine/storagegroup/DataRegion.java
#	server/src/main/java/org/apache/iotdb/db/engine/storagegroup/dataregion/StorageGroupManager.java
#	server/src/main/java/org/apache/iotdb/db/metadata/LocalSchemaProcessor.java
#	server/src/main/java/org/apache/iotdb/db/metadata/idtable/entry/InsertMeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/metadata/mnode/IMeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/metadata/mnode/MeasurementMNode.java
#	server/src/main/java/org/apache/iotdb/db/metadata/schemaregion/SchemaRegionMemoryImpl.java
#	server/src/main/java/org/apache/iotdb/db/metadata/schemaregion/SchemaRegionSchemaFileImpl.java
#	server/src/main/java/org/apache/iotdb/db/mpp/plan/constant/StatementType.java
#	server/src/main/java/org/apache/iotdb/db/mpp/plan/parser/ASTVisitor.java
#	server/src/main/java/org/apache/iotdb/db/qp/executor/PlanExecutor.java
#	server/src/main/java/org/apache/iotdb/db/qp/sql/IoTDBSqlVisitor.java
#	server/src/test/java/org/apache/iotdb/db/qp/physical/PhysicalPlanSerializeTest.java
@lpf4254302 lpf4254302 closed this by deleting the head repository Nov 13, 2023