IOTDB-1140 optimize regular data encoding #2621

Merged
merged 7 commits into from
Feb 6, 2021

wangchao316
Member

@wangchao316 wangchao316 commented Feb 3, 2021

Current regular data encoding algorithm:

  1. Calculate the difference between each pair of adjacent values. The smallest difference is taken as the regular interval (the "equal frequency").
  2. Determine the range of this batch of data from the difference between the last value and the first value.
  3. Traverse the batch with a BitSet whose bits default to true, comparing the difference between each pair of adjacent values with the regular interval. If a difference is not equal to the interval, calculate how many intervals it spans and set the corresponding positions to false, marking those points as missing.
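
The steps above can be sketched roughly as follows. This is a minimal illustration of the description, not the actual IoTDB encoder: the class `RegularEncodingSketch` and the methods `minDelta` and `presenceBitmap` are hypothetical names, and throwing `IllegalArgumentException` for values that do not fit the interval is an assumption made for the example.

```java
import java.util.BitSet;

// Hypothetical sketch of the described algorithm; not the IoTDB implementation.
final class RegularEncodingSketch {

  /** Step 1: the smallest adjacent difference is taken as the regular interval. */
  static long minDelta(long[] values) {
    long min = Long.MAX_VALUE;
    for (int i = 1; i < values.length; i++) {
      min = Math.min(min, values[i] - values[i - 1]);
    }
    return min;
  }

  /** Steps 2 and 3: one bit per expected slot; true = present, false = missing. */
  static BitSet presenceBitmap(long[] values, long interval) {
    if (interval <= 0) {
      throw new IllegalArgumentException("interval must be positive: " + interval);
    }
    // Step 2: the range between the first and last value fixes the slot count.
    int slots = (int) ((values[values.length - 1] - values[0]) / interval) + 1;
    BitSet bitmap = new BitSet(slots);
    bitmap.set(0, slots); // every slot is assumed present by default
    int pos = 0;
    for (int i = 1; i < values.length; i++) {
      long delta = values[i] - values[i - 1];
      if (delta <= 0 || delta % interval != 0) {
        // Assumption: a value off the regular grid is rejected with an exception.
        throw new IllegalArgumentException("value does not fit the interval: " + values[i]);
      }
      int steps = (int) (delta / interval);
      // A delta spanning several intervals means the slots in between are missing.
      for (int k = 1; k < steps; k++) {
        bitmap.clear(pos + k);
      }
      pos += steps;
    }
    return bitmap;
  }

  public static void main(String[] args) {
    long[] values = {1000, 1100, 1300, 1400}; // 1200 is missing
    long interval = minDelta(values);
    System.out.println(interval + " " + presenceBitmap(values, interval));
  }
}
```

For the input 1000, 1100, 1300, 1400 the interval is 100 and the bitmap has five slots, with the slot for 1200 cleared.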

This algorithm can only identify missing points. If the data contains an error point, it throws an exception.

This is because a BitSet can only indicate whether a point exists at each regular interval within a segment of data.

However, there is room for optimization.

If a column of values contains an abnormal value, taking the minimum difference directly skews the algorithm.

Example: 1000, 1100, 1800, 1400, 1500, ...

The current algorithm cannot be used on this data.

1800 is an error point. We should identify the error point and revise the data.

The revised data should be: 1000, 1100, 1300, 1400, 1500
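
To make the skew concrete, here is a small sketch that computes the adjacent differences for the sample. The class `OutlierDeltaSketch` and its methods are hypothetical names. `mostFrequentDelta` is shown only for comparison, as one possible alternative estimate of the interval; it is an assumption for illustration and not something this PR implements.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; shows why the minimum difference breaks on an outlier.
final class OutlierDeltaSketch {

  /** Differences between each pair of adjacent values. */
  static long[] deltas(long[] values) {
    long[] d = new long[values.length - 1];
    for (int i = 1; i < values.length; i++) {
      d[i - 1] = values[i] - values[i - 1];
    }
    return d;
  }

  /** The interval as the described algorithm picks it: the smallest difference. */
  static long minDelta(long[] values) {
    long min = Long.MAX_VALUE;
    for (long d : deltas(values)) {
      min = Math.min(min, d);
    }
    return min;
  }

  /** Illustrative alternative (an assumption): the difference that occurs most often. */
  static long mostFrequentDelta(long[] values) {
    Map<Long, Integer> counts = new HashMap<>();
    long best = 0;
    int bestCount = 0;
    for (long d : deltas(values)) {
      int c = counts.merge(d, 1, Integer::sum);
      if (c > bestCount) {
        bestCount = c;
        best = d;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    long[] sample = {1000, 1100, 1800, 1400, 1500};
    // Differences are 100, 700, -400, 100: the outlier 1800 creates a negative one.
    System.out.println("min delta: " + minDelta(sample));
    System.out.println("most frequent delta: " + mostFrequentDelta(sample));
  }
}
```

With the sample above, the minimum difference is -400 rather than the true interval of 100, so the interval chosen by the current algorithm is unusable.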

After discussion, the solution is:

  1. Values cannot use regular encoding.
  2. time_encoder cannot be altered after the service starts.

@wangchao316
Member Author

@jixuan1989 @qiaojialin Hi, could you please review this PR?
Thanks.

@wangchao316 wangchao316 closed this Feb 5, 2021
@wangchao316 wangchao316 reopened this Feb 5, 2021
@sonarcloud
sonarcloud bot commented Feb 5, 2021

Member

@qiaojialin qiaojialin left a comment

Great!

import org.apache.iotdb.tsfile.common.conf.TSFileDescriptor;
import org.junit.*;

import java.io.*;
Contributor

Wildcard (star) imports are not recommended in Java.

@qiaojialin qiaojialin merged commit 064cd96 into apache:master Feb 6, 2021

3 participants