New File format #167

Open
qiaojialin opened this issue Nov 13, 2018 · 1 comment
qiaojialin commented Nov 13, 2018

v0.8.0, add ChunkHeader and ChunkGroupFooter

jixuan1989 commented Nov 15, 2018

TsFile file structure change log
2018/11/13 Huang Xiangdong, Qiao Jialin, Liu Kun, Jiang Tian

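1. Thanos V1
1.1 File structure
In the thanos branch of TsFile, the RowGroup metadata is stored before the RowGroup's data, so the data of the entire RowGroup has to be buffered in memory before anything can be written. The structure is shown in the figure below.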

[Figure: Thanos V1 file structure]

1.2 Interfaces

Write the RowGroup metadata: RowGroupWriter.writeMetadata()
Write a single column's data to memory: ChunkWriter.write(datapoint)
Write a single column's data to disk: ChunkWriter.flush()
1.3 Data recovery
Truncate the incomplete RowGroup, then generate a complete File Metadata. This approach loses one RowGroup.
2. Thanos V2
2.1 File structure
To reduce the memory used by RowGroup data, the RowGroup Metadata is moved to after each RowGroup's data. The Chunk and Page structures are unchanged.

[Figure: Thanos V2 file structure]

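With this structure, only one Chunk's data needs to be buffered in memory, and a Chunk can be written to disk at any time as needed. The byte in front of each header identifies the type of the metadata that follows.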
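
To make the marker byte concrete, here is a minimal sketch of a reader that dispatches on it. The marker values, the 4-byte length prefix, and the helper names `write_record` and `read_records` are all assumptions made for illustration; they are not the actual TsFile constants or layout.

```python
import io
import struct

# Hypothetical marker values; the real TsFile constants may differ.
MARKER_CHUNK_HEADER = 1
MARKER_ROWGROUP_METADATA = 0


def write_record(out, marker, payload):
    """Write one marker byte, a 4-byte big-endian length, then the payload."""
    out.write(struct.pack(">BI", marker, len(payload)))
    out.write(payload)


def read_records(data):
    """Yield (marker, payload) pairs, using the marker byte to tell records apart."""
    pos = 0
    while pos < len(data):
        marker, length = struct.unpack_from(">BI", data, pos)
        pos += 5  # 1 marker byte + 4 length bytes
        yield marker, data[pos:pos + length]
        pos += length


# Two chunks are written first; the RowGroup metadata follows the data.
buf = io.BytesIO()
write_record(buf, MARKER_CHUNK_HEADER, b"s1-chunk")
write_record(buf, MARKER_CHUNK_HEADER, b"s2-chunk")
write_record(buf, MARKER_ROWGROUP_METADATA, b"rowgroup-meta")
records = list(read_records(buf.getvalue()))
```

Because every record announces its own type, a reader can walk the file front to back without having seen the RowGroup Metadata first.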
2.2 Interfaces
Create the RowGroup Metadata in memory: RowGroupWriter.startRowGroup()
Write a single column's data to memory: ChunkWriter.write(datapoint)
Write a single column's data to disk: ChunkWriter.flush()
Write the RowGroup Metadata to disk: RowGroupWriter.writeMetadata()
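
The call order above can be sketched as follows. This is a Python mirror of the four calls, not the real Java classes: the snake_case method names, the constructor arguments, and the text encoding of chunks and metadata are assumptions chosen only to show that one chunk at a time sits in memory and that the metadata lands after the data.

```python
import io


class RowGroupWriter:
    def __init__(self, out):
        self.out = out
        self.chunk_index = None

    def start_row_group(self):
        # Mirrors RowGroupWriter.startRowGroup(): metadata exists in memory only.
        self.chunk_index = []

    def write_metadata(self):
        # Mirrors RowGroupWriter.writeMetadata(): metadata is written after the data.
        self.out.write(repr(self.chunk_index).encode())


class ChunkWriter:
    def __init__(self, row_group, series):
        self.row_group = row_group
        self.series = series
        self.buffer = []

    def write(self, datapoint):
        # Mirrors ChunkWriter.write(datapoint): buffer the point in memory.
        self.buffer.append(datapoint)

    def flush(self):
        # Mirrors ChunkWriter.flush(): write this chunk and record where it went.
        out = self.row_group.out
        offset = out.tell()
        body = ",".join(str(p) for p in self.buffer).encode()
        out.write(body)
        self.row_group.chunk_index.append((self.series, offset, len(body)))
        self.buffer.clear()


out = io.BytesIO()
rg = RowGroupWriter(out)
rg.start_row_group()
for series, points in [("s1", [1, 2, 3]), ("s2", [4, 5])]:
    cw = ChunkWriter(rg, series)
    for p in points:
        cw.write(p)
    cw.flush()  # only this chunk was held in memory
rg.write_metadata()
```

Each `flush()` empties the chunk buffer, so peak memory is bounded by the largest single chunk rather than by the whole RowGroup.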
2.3 Data recovery
Truncate the last incomplete Chunk, then generate the RowGroup Metadata and the File Metadata. This approach loses one Chunk.
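
The recovery step can be sketched as a forward scan that stops at the first record cut short by the crash. The record layout (marker byte plus 4-byte length), the marker values, and the names `encode` and `recover` are hypothetical; the real file format and recovery code may look quite different.

```python
import struct

MARKER_CHUNK = 1          # hypothetical marker values
MARKER_ROWGROUP_META = 0
HEADER = struct.Struct(">BI")  # marker byte + payload length (assumed layout)


def encode(marker, payload):
    return HEADER.pack(marker, len(payload)) + payload


def recover(data):
    """Return (valid_prefix, pending_chunks).

    valid_prefix ends at the last complete record. pending_chunks holds the
    payloads of complete chunks that are not yet covered by RowGroup metadata.
    """
    pos = 0
    pending = []
    while pos + HEADER.size <= len(data):
        marker, length = HEADER.unpack_from(data, pos)
        end = pos + HEADER.size + length
        if end > len(data):
            break  # payload cut short: this is the one chunk that is lost
        if marker == MARKER_CHUNK:
            pending.append(data[pos + HEADER.size:end])
        else:
            pending = []  # chunks before this point already have metadata
        pos = end
    return data[:pos], pending


complete = encode(MARKER_CHUNK, b"s1") + encode(MARKER_CHUNK, b"s2")
crashed = complete + encode(MARKER_CHUNK, b"s3-long-chunk")[:-4]

prefix, pending = recover(crashed)
# Regenerate the RowGroup metadata for the surviving chunks.
repaired = prefix + encode(MARKER_ROWGROUP_META, b"|".join(pending))
```

Running `recover` again on `repaired` finds nothing pending, which is the state in which a File Metadata could be appended.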
