[IOTDB-587] New TsFile version 2 #855

Merged
merged 147 commits into from Mar 31, 2020
Changes from 129 commits
5a4e225
fix a bug when recover the last crashed file
HTHou Jan 6, 2020
ec64d27
add exception for something gets wrong
HTHou Jan 6, 2020
68296fc
fix test errors
HTHou Jan 6, 2020
ba57aab
change the comment
HTHou Jan 7, 2020
e7d4626
add comment
HTHou Jan 7, 2020
835cbfa
Merge branch 'master' of https://github.com/apache/incubator-iotdb
HTHou Jan 7, 2020
8a5793a
new tsfile
HTHou Jan 7, 2020
ee2414a
Merge branch 'master' of https://github.com/apache/incubator-iotdb
HTHou Jan 13, 2020
4dab61b
refactor tsfile
HTHou Jan 14, 2020
dad198a
refactor tsfile
HTHou Jan 22, 2020
d17aa55
resolve conflicts
HTHou Feb 4, 2020
9edff86
Merge branch 'master' of https://github.com/apache/incubator-iotdb in…
HTHou Feb 4, 2020
17ead2e
refactor tsfile
HTHou Feb 10, 2020
b44955a
new interface and new tsfile structure
HTHou Feb 17, 2020
8c04837
Merge branch 'master' of https://github.com/apache/incubator-iotdb in…
HTHou Feb 17, 2020
aacdc9d
fix tsfile problems
HTHou Feb 21, 2020
3105712
fix some problems
HTHou Feb 23, 2020
84272b7
resolve conflicts
HTHou Feb 24, 2020
e0dc626
fix some of bugs
HTHou Feb 24, 2020
affde79
fix deviceMNode bugs
HTHou Feb 25, 2020
92094bd
Merge branch 'master' of https://github.com/apache/incubator-iotdb in…
HTHou Feb 25, 2020
3c7ea7b
fix bugs
HTHou Feb 28, 2020
544453d
new TsFile
HTHou Feb 28, 2020
f24ebf2
fix checkLocateStatus in SequenceReader
HTHou Feb 28, 2020
ef1ea1d
refine Schema
qiaojialin Mar 5, 2020
af3be5f
remove unused comment
qiaojialin Mar 5, 2020
ea64f9d
resolve conflicts
HTHou Mar 5, 2020
afadbe4
refactor TsFileSketchTool
HTHou Mar 5, 2020
c46b0bb
resolce conflict
qiaojialin Mar 6, 2020
c232424
remove duplicated mnode
qiaojialin Mar 6, 2020
4c39b0e
Add EmptyDeviceMNode
qiaojialin Mar 6, 2020
7ffbaf5
simplify TsFileMetadataUtils.getChunkMetadataList
qiaojialin Mar 7, 2020
464619e
simplify TsFileMetadataUtils.getChunkMetadataList
qiaojialin Mar 7, 2020
c2a8bee
fix a deviceMNode bug and restore Unit Tests
HTHou Mar 8, 2020
97f2bda
revolve conflict after merging master
qiaojialin Mar 9, 2020
6597f6d
changes to TsFileSequenceReader
JackieTien97 Mar 9, 2020
66226ee
changes to TsFileSequenceReader
JackieTien97 Mar 9, 2020
46fbb66
rename devicemetadata
qiaojialin Mar 9, 2020
7a5b3b0
fix cache key to string
qiaojialin Mar 9, 2020
2c2c93c
hadoop-connector adapt
JackieTien97 Mar 9, 2020
e0b1369
Merge branch 'new_TsFile' of https://github.com/apache/incubator-iotd…
JackieTien97 Mar 9, 2020
587c2a5
some changes in TsFileSequenceReader
JackieTien97 Mar 10, 2020
03e93e3
remove Schema in server
qiaojialin Mar 10, 2020
791bc26
Merge branch 'new_TsFile' of github.com:apache/incubator-iotdb into n…
qiaojialin Mar 10, 2020
9d77765
remove DeviceMNode and EmptyDeviceMNode
qiaojialin Mar 10, 2020
5ef0683
add version read and write
qiaojialin Mar 10, 2020
d85244f
fix null encoding in test
qiaojialin Mar 10, 2020
90dd71a
remove readAllChunkMetadats() in TsFileSequenceReader
qiaojialin Mar 10, 2020
51c3332
fix Statistics T
qiaojialin Mar 10, 2020
a7e238c
fix TsFileMetadataTest
HTHou Mar 10, 2020
f79da21
optimize readChunkMetadataInDevice
qiaojialin Mar 11, 2020
8da9e82
fix bugs in convertSpace2TimePartition
JackieTien97 Mar 11, 2020
fda05aa
fix bugs in tsfile tests
JackieTien97 Mar 11, 2020
735977e
resolve conflicts
qiaojialin Mar 11, 2020
9638a56
fix SessionExample
qiaojialin Mar 11, 2020
32c0100
fix SessionExample import
qiaojialin Mar 11, 2020
ca0ac00
fix null encoder
qiaojialin Mar 11, 2020
90c7048
fix getDeviceNameInRange in TsFileSequenceReader
qiaojialin Mar 11, 2020
05e10f5
fix inpartitionTimeRange update
qiaojialin Mar 11, 2020
5b65edd
fix spark and allow write without register device
qiaojialin Mar 11, 2020
39fbd52
fix sketchTool and sequenceRead example
HTHou Mar 11, 2020
f9221e2
fix schema and HDFSInputTest
qiaojialin Mar 11, 2020
6d26870
Merge branch 'new_TsFile' of github.com:apache/incubator-iotdb into n…
qiaojialin Mar 11, 2020
934fc9a
fix TimeRange and spark package name
qiaojialin Mar 11, 2020
01b594f
fix MetadataQuerier with Time range
qiaojialin Mar 12, 2020
a71086b
add filter and test for tsfile
qiaojialin Mar 12, 2020
dc3c8d1
fix TimeseriesMetadata Statistics error
qiaojialin Mar 12, 2020
f1f0777
diable timegenerator cache and add test
qiaojialin Mar 12, 2020
5d7bb8c
add same measurememts with different datatypes test
HTHou Mar 12, 2020
4460ee7
Merge branch 'new_TsFile' of https://github.com/apache/incubator-iotd…
HTHou Mar 12, 2020
17d377b
fix spark test
qiaojialin Mar 12, 2020
372503d
Merge branch 'new_TsFile' of github.com:apache/incubator-iotdb into n…
qiaojialin Mar 12, 2020
83701da
fix Timegenerator cache
qiaojialin Mar 12, 2020
e4ec1d2
remove comment
qiaojialin Mar 12, 2020
0b22aff
fix timegenerator cache bug
qiaojialin Mar 12, 2020
b3d7b7b
fix loop not end
qiaojialin Mar 12, 2020
85c7c47
fix javadoc
qiaojialin Mar 12, 2020
cbcb3d4
test mkdir
qiaojialin Mar 12, 2020
83f586b
fix windows test
qiaojialin Mar 12, 2020
e638e18
add same measurememts with different datatypes test in IoTDB
HTHou Mar 12, 2020
d73f16a
fix a bug in TsFileWriter
HTHou Mar 12, 2020
ea12c00
reformat sketchTool print result
HTHou Mar 12, 2020
76b9e2d
change test file name
qiaojialin Mar 13, 2020
616a604
close reader in test
qiaojialin Mar 13, 2020
96b78f4
resolve conflict
qiaojialin Mar 13, 2020
de3a174
update SeriesReader
JackieTien97 Mar 13, 2020
13a207e
Merge remote-tracking branch 'origin/master' into new_TsFile
qiaojialin Mar 13, 2020
60565b3
merge master
qiaojialin Mar 13, 2020
6bcdce6
resolve conflict
qiaojialin Mar 13, 2020
a29548c
fix StorageGroupProcessorTest
qiaojialin Mar 13, 2020
e1d4660
fix UnseqTsFileRecoverTest
qiaojialin Mar 13, 2020
d193a0c
some debugs
JackieTien97 Mar 14, 2020
f8bed88
Fix some bugs
JackieTien97 Mar 14, 2020
e0bb259
fix:pass merge test (#907)
zhanglingzhe0820 Mar 14, 2020
37f7a0e
Merge branch 'master' into new_TsFile
qiaojialin Mar 14, 2020
22dbf13
Bug fix
JackieTien97 Mar 14, 2020
7107358
Merge branch 'new_TsFile' of https://github.com/apache/incubator-iotd…
JackieTien97 Mar 14, 2020
01c9ff6
resolve conflicts
JackieTien97 Mar 15, 2020
0f9b797
remove unnecessary code
HTHou Mar 16, 2020
4d62d50
merge master
qiaojialin Mar 17, 2020
cc1c508
debug
JackieTien97 Mar 18, 2020
9a4a49c
merge master
qiaojialin Mar 18, 2020
0c45aa2
fuck the bug
JackieTien97 Mar 18, 2020
c3e1894
Merge branch 'new_TsFile' of https://github.com/apache/incubator-iotd…
JackieTien97 Mar 18, 2020
6454f4f
fix:static test bug (#922)
zhanglingzhe0820 Mar 19, 2020
7cc49b8
fix UnseqTsFileRecoverTest
qiaojialin Mar 19, 2020
6022284
fix restart hangup
qiaojialin Mar 19, 2020
71f4d50
fix class name
qiaojialin Mar 19, 2020
81768ba
Fix tsfile test bug (#924)
zhanglingzhe0820 Mar 19, 2020
0b67596
use file startTimeMap instead of exact time series start time to redu…
JackieTien97 Mar 19, 2020
228e2b8
add license
JackieTien97 Mar 19, 2020
63f2aae
adapt to hive-connector
JackieTien97 Mar 19, 2020
4c19e84
update merge recover (#925)
zhanglingzhe0820 Mar 20, 2020
1252168
throw exception in MergeFileTask when meet file broken
qiaojialin Mar 20, 2020
e8a4111
fix sonar bug
qiaojialin Mar 20, 2020
f74a588
fix conflict
qiaojialin Mar 20, 2020
ebe70fd
fix conflict
qiaojialin Mar 20, 2020
5ca6be4
enable cacheDeviceMetadata in TsFileSequenceReader
qiaojialin Mar 20, 2020
ea64fd7
fix init file metadata
qiaojialin Mar 23, 2020
a618d70
Add TimeSeriesMetadataCache in server
JackieTien97 Mar 24, 2020
16d5e30
Fix code smell
samperson1997 Mar 24, 2020
a81d8a4
Fix more code smell
samperson1997 Mar 24, 2020
93d72b2
Fix some potential bugs
samperson1997 Mar 24, 2020
6fee3f4
Merge branch 'master' into new_TsFile
qiaojialin Mar 25, 2020
800e20c
fix sonar
qiaojialin Mar 25, 2020
2ae74dd
fix restart
qiaojialin Mar 25, 2020
80f6284
change reader logger to debug
qiaojialin Mar 25, 2020
32d3f90
fix a TsFile path bug on macos
HTHou Mar 26, 2020
995db50
repair the restart bug
JackieTien97 Mar 27, 2020
933537c
improve cache
JackieTien97 Mar 27, 2020
506fa46
move StorageEngine init to the last for test
qiaojialin Mar 27, 2020
1ec96e8
Merge branch 'new_TsFile' of https://github.com/apache/incubator-iotd…
JackieTien97 Mar 27, 2020
cf5f441
resolve conflict
qiaojialin Mar 30, 2020
3286c98
remove unused code in TsFileResource
qiaojialin Mar 30, 2020
b472cf6
Change to use duplicated path and datatypes in LastQueryExecutor
wshao08 Mar 30, 2020
7cd97bc
Merge pull request #957 from wshao08/new_TsFile
JackieTien97 Mar 30, 2020
1572237
change to cache
JackieTien97 Mar 30, 2020
6d06e32
resolve conflicts
JackieTien97 Mar 30, 2020
a54da14
adapt to last
JackieTien97 Mar 30, 2020
70a0ab1
format
qiaojialin Mar 30, 2020
437cfa6
Merge remote-tracking branch 'origin/new_TsFile' into new_TsFile
qiaojialin Mar 30, 2020
1e9245d
enlarge TimeseriesMetadataCache, fix averageSize bug, optimize foreac…
qiaojialin Mar 30, 2020
5731dc2
clear deviceToSensors in AlignByDevice query
qiaojialin Mar 30, 2020
8ec481f
fix sonar
qiaojialin Mar 30, 2020
e653ab3
update tsfile format changelist
qiaojialin Mar 30, 2020
3481c3b
use bloomfilter in TimeseriesMetadataCache
qiaojialin Mar 30, 2020
e8b8ca8
update licenses info
HTHou Mar 30, 2020
48 changes: 28 additions & 20 deletions docs/Documentation-CHN/SystemDesign/7-Connector/3-Spark-TsFile.md
@@ -19,16 +19,16 @@

-->

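-# Spark Tsfile Connector
+# Spark TsFile Connector

## Design goals

-* Use Spark SQL to read the data of a specified Tsfile and return it to the client as a Spark DataFrame
+* Use Spark SQL to read the data of a specified TsFile and return it to the client as a Spark DataFrame

-* Generate a Tsfile from the data in a Spark Dataframe
+* Generate a TsFile from the data in a Spark DataFrame

## Supported formats
-Wide-table structure: Tsfile native format, IOTDB native path format
+Wide-table structure: TsFile native format, IoTDB native path format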

| time | root.ln.wf02.wt02.temperature | root.ln.wf02.wt02.status | root.ln.wf02.wt02.hardware | root.ln.wf01.wt01.temperature | root.ln.wf01.wt01.status | root.ln.wf01.wt01.hardware |
|------|-------------------------------|--------------------------|----------------------------|-------------------------------|--------------------------|----------------------------|
@@ -39,14 +39,14 @@
| 5 | null | null | null | null | false | null |
| 6 | null | null | ccc | null | null | null |

-Narrow-table structure: relational database schema, IOTDB align by device format
+Narrow-table structure: relational database schema, IoTDB align by device format

| time | device_name | status | hardware | temperature |
|------|-------------------------------|--------------------------|----------------------------|-------------------------------|
| 1 | root.ln.wf02.wt01 | true | null | 2.2 |
| 1 | root.ln.wf02.wt02 | true | null | null |
-| 2 | root.ln.wf02.wt01 | null | null | 2.2 |
-| 2 | root.ln.wf02.wt02 | false | aaa | null |
+| 2 | root.ln.wf02.wt01 | null | null | 2.2 |
+| 2 | root.ln.wf02.wt02 | false | aaa | null |
| 3 | root.ln.wf02.wt01 | true | null | 2.1 |
| 4 | root.ln.wf02.wt02 | true | bbb | null |
| 5 | root.ln.wf02.wt01 | false | null | null |
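
The relationship between the two table layouts above can be illustrated with a small, self-contained Java sketch. It is not the connector's code: the class `WideToNarrowSketch`, the `NarrowRow` type, and the `toNarrow` helper are hypothetical names, and the sample values in `main` are made up for illustration. The sketch assumes the device name is everything before the last `.` in a wide-table column name and the measurement is everything after it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WideToNarrowSketch {

  /** One narrow-table row: a timestamp, a device name, and that device's measurements. */
  static final class NarrowRow {
    final long time;
    final String deviceName;
    final Map<String, Object> measurements = new LinkedHashMap<>();

    NarrowRow(long time, String deviceName) {
      this.time = time;
      this.deviceName = deviceName;
    }
  }

  /**
   * Splits one wide row (full path -> value) into one narrow row per device.
   * Null values are skipped, so a device whose columns are all null yields no row.
   */
  static List<NarrowRow> toNarrow(long time, Map<String, Object> wideRow) {
    Map<String, NarrowRow> byDevice = new LinkedHashMap<>();
    for (Map.Entry<String, Object> column : wideRow.entrySet()) {
      if (column.getValue() == null) {
        continue;
      }
      String path = column.getKey();
      int split = path.lastIndexOf('.');
      if (split < 0) {
        throw new IllegalArgumentException("not a device.measurement path: " + path);
      }
      String device = path.substring(0, split);
      String measurement = path.substring(split + 1);
      byDevice.computeIfAbsent(device, d -> new NarrowRow(time, d))
          .measurements.put(measurement, column.getValue());
    }
    return new ArrayList<>(byDevice.values());
  }

  public static void main(String[] args) {
    // Illustrative values only; they are not taken from a real TsFile.
    Map<String, Object> wideRow = new LinkedHashMap<>();
    wideRow.put("root.ln.wf02.wt02.status", true);
    wideRow.put("root.ln.wf02.wt02.hardware", "bbb");
    wideRow.put("root.ln.wf01.wt01.temperature", 2.1);
    for (NarrowRow row : toNarrow(4L, wideRow)) {
      System.out.println(row.time + " " + row.deviceName + " " + row.measurements);
    }
  }
}
```

Grouping by device in a `LinkedHashMap` keeps the output order stable, which makes the result easier to compare against a table like the narrow one above.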
@@ -55,47 +55,55 @@
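## Query process steps

#### 1. Table schema inference and generation
-This step makes the DataFrame's table schema match the schema of the Tsfile being queried
-The main logic is in the inferSchema function in src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala
+
+This step makes the DataFrame's table schema match the schema of the TsFile being queried
+
+The main logic is in the inferSchema function in src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala

#### 2. SQL parsing
-This step converts the user's SQL statement into a Tsfile native query expression

-The main logic is in the buildReader function in src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala
+This step converts the user's SQL statement into a TsFile native query expression
+
+The main logic is in the buildReader function in src/main/scala/org/apache/iotdb/spark/tsfile/DefaultSource.scala

SQL parsing is handled separately for the wide-table and narrow-table structures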

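#### 3. Wide-table structure
-The main SQL parsing logic for the wide-table structure is in src/main/scala/org/apache/iotdb/spark/tsfile/WideConverter.scala

+The main SQL parsing logic for the wide-table structure is in src/main/scala/org/apache/iotdb/spark/tsfile/WideConverter.scala
+
-This structure is essentially the same as the Tsfile native query structure, so no special handling is needed; the SQL statement is converted directly into the corresponding query expression
+This structure is essentially the same as the TsFile native query structure, so no special handling is needed; the SQL statement is converted directly into the corresponding query expression

#### 4. Narrow-table structure
-The main SQL parsing logic for the wide-table structure is in src/main/scala/org/apache/iotdb/spark/tsfile/NarrowConverter.scala

-After the SQL is converted into an expression, because the narrow-table structure differs from the Tsfile native query structure, the expression must first be converted into a device-related disjunctive expression
-before it can be converted into a query on the Tsfile; the conversion code is in src/main/java/org/apache/iotdb/spark/tsfile/qp
+The main SQL parsing logic for the wide-table structure is in src/main/scala/org/apache/iotdb/spark/tsfile/NarrowConverter.scala
+
+After the SQL is converted into an expression, because the narrow-table structure differs from the TsFile native query structure, the expression must first be converted into a device-related disjunctive expression
+before it can be converted into a query on the TsFile; the conversion code is in src/main/java/org/apache/iotdb/spark/tsfile/qp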
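#### 5. Actual query execution
-The actual data query is executed by Tsfile native components; see:

+The actual data query is executed by TsFile native components; see:
+
* [Tsfile native query process](../1-TsFile/4-Read.md)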

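## Write process steps
-Writing mainly converts the data in the Dataframe structure into Tsfile RowRecords and writes them with the Tsfile Writer

+Writing mainly converts the data in the Dataframe structure into TsFile RowRecords and writes them with the TsFile Writer
+
#### Wide-table structure
+
The main conversion code is in the following two files:

* src/main/scala/org/apache/iotdb/spark/tsfile/WideConverter.scala handles the structure conversion

-* src/main/scala/org/apache/iotdb/spark/tsfile/WideTsFileOutputWriter.scala adapts to the spark interface and performs the write, calling the structure conversion in the previous file
+* src/main/scala/org/apache/iotdb/spark/tsfile/WideTsFileOutputWriter.scala adapts to the spark interface and performs the write, calling the structure conversion in the previous file

#### Narrow-table structure

The main conversion code is in the following two files:

* src/main/scala/org/apache/iotdb/spark/tsfile/NarrowConverter.scala handles the structure conversion

-* src/main/scala/org/apache/iotdb/spark/tsfile/NarrowTsFileOutputWriter.scala adapts to the spark interface and performs the write, calling the structure conversion in the previous file
+* src/main/scala/org/apache/iotdb/spark/tsfile/NarrowTsFileOutputWriter.scala adapts to the spark interface and performs the write, calling the structure conversion in the previous file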
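
The narrow-table query step in the document above mentions converting a filter into a device-related disjunctive expression. One possible reading of that sentence is sketched below in Java; it is an assumption, not the logic in the `qp` package. The class `DeviceDisjunctionSketch`, the `Condition` type, and the `expand` helper are hypothetical, and the result is rendered as a plain string rather than a TsFile query expression. The idea shown: a conjunction of conditions on narrow-table columns is repeated once per device, with each measurement rewritten to a full path, and the per-device branches are joined with OR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeviceDisjunctionSketch {

  /** A comparison on a narrow-table column, for example: temperature > 2.0. */
  static final class Condition {
    final String measurement;
    final String operator;
    final String value;

    Condition(String measurement, String operator, String value) {
      this.measurement = measurement;
      this.operator = operator;
      this.value = value;
    }
  }

  /**
   * Expands a conjunction of narrow-table conditions into one branch per device.
   * Each branch prefixes every measurement with the device name; branches are OR-ed.
   */
  static String expand(List<String> devices, List<Condition> conjunction) {
    List<String> branches = new ArrayList<>();
    for (String device : devices) {
      List<String> terms = new ArrayList<>();
      for (Condition c : conjunction) {
        terms.add(device + "." + c.measurement + " " + c.operator + " " + c.value);
      }
      branches.add("(" + String.join(" AND ", terms) + ")");
    }
    return String.join(" OR ", branches);
  }

  public static void main(String[] args) {
    List<String> devices = Arrays.asList("root.ln.wf02.wt01", "root.ln.wf02.wt02");
    List<Condition> filter = Arrays.asList(
        new Condition("temperature", ">", "2.0"),
        new Condition("status", "=", "true"));
    System.out.println(expand(devices, filter));
  }
}
```

Each OR branch refers to a single device only, which is the form a wide, path-based query can evaluate directly.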

@@ -0,0 +1,35 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.apache.iotdb.hadoop.tsfile;

public class Constant {

  private Constant() {

  }

  static final String DEVICE_1 = "device_1";

  static final String SENSOR_PREFIX = "sensor_";
  static final String SENSOR_1 = "sensor_1";
  static final String SENSOR_2 = "sensor_2";
  static final String SENSOR_3 = "sensor_3";

}