Introduction of Repository

Welcome to the repository of PreS Compaction.

This repository contains the server code for the research project "Pre-Select files for compaction" (PreS).

We have integrated PreS into Apache IoTDB, an open-source, high-performance time series database built on an LSM-tree.

PreS Compaction Strategy

Pre-Select files for compaction (PreS) is a dynamic compaction strategy that predicts query patterns and uses them to select files for compaction. It is tailored to time series databases. PreS captures incoming queries, analyzes historical access information, extracts features from the captured queries, and generates samples from the temporal features of the time series. A machine learning model then predicts the query patterns users are expected to issue, which addresses the failure of static methods to track query trends. The predicted patterns guide the LSM-tree compaction process: PreS evaluates the benefit of compaction to find a compromise between the number of files and read amplification, making compaction more adaptive to queries and reducing query cost.
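The sample-generation step can be pictured as a sliding window over a chronological sequence of per-query features. The sketch below is only an illustration under stated assumptions, not code from this repository: the class `WindowSampler`, the `Sample` type, and the idea of using a single numeric feature per query (for example, the queried time span) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the PreS code base.
final class WindowSampler {

    /** One sample: `window` consecutive feature values and the value that followed them. */
    static final class Sample {
        final double[] inputs;
        final double target;

        Sample(double[] inputs, double target) {
            this.inputs = inputs;
            this.target = target;
        }
    }

    /** Slides a window of length `window` over `features`, which must be in chronological order. */
    static List<Sample> generate(double[] features, int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        List<Sample> samples = new ArrayList<>();
        // Each window needs one more value after it to serve as the target.
        for (int start = 0; start + window < features.length; start++) {
            double[] inputs = Arrays.copyOfRange(features, start, start + window);
            samples.add(new Sample(inputs, features[start + window]));
        }
        return samples;
    }

    public static void main(String[] args) {
        // Assumed feature: the time span covered by each of five past queries.
        List<Sample> samples = generate(new double[] {10, 20, 30, 40, 50}, 3);
        for (Sample s : samples) {
            System.out.println(Arrays.toString(s.inputs) + " -> " + s.target);
        }
    }
}
```

With five feature values and a window of three, this yields two samples; a sequence shorter than or equal to the window yields none.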

The architecture of PreS is shown in Figure 3 of the paper. The figure will be added here after the paper is accepted.

PreS consists of four core components: Query Collector, Query Pattern Predictor, Compaction Benefit Analyzer, and File Selector. For more details, please refer to the paper "PreS: A .....".

IoTDB

About Apache IoTDB:

IoTDB (Internet of Things Database) is a data management system for time series data that provides services such as data collection, storage, and analysis. Thanks to its lightweight structure, high performance, and usable features, together with its seamless integration with the Hadoop and Spark ecosystems, IoTDB meets the industrial IoT requirements of massive dataset storage, high-throughput data input, and complex data analysis.

The main features of IoTDB are as follows:

Flexible deployment strategy. IoTDB provides a one-click installation tool for both cloud platforms and terminal devices, and a data synchronization tool that bridges data between the cloud platform and terminals.
Low hardware cost. IoTDB achieves a high compression ratio for disk storage.
Efficient directory structure. IoTDB supports efficient organization of complex time series data structures from intelligent networking devices, organization of time series data from devices of the same type, and fuzzy search over massive and complex directories of time series data.
High-throughput read and write. IoTDB supports strong-connection data access for millions of low-power devices, and high-speed data read and write for the intelligent networking devices and mixed devices mentioned above.
Rich query semantics. IoTDB supports time alignment of time series data across devices and measurements, computation in the time series field (frequency domain transformation), and rich aggregation functions in the time dimension.
Easy to get started. IoTDB supports a SQL-like language, the standard JDBC API, and easy-to-use import/export tools.
Seamless integration with the state-of-the-practice open source ecosystem. IoTDB supports analysis ecosystems such as Hadoop and Spark, and visualization tools such as Grafana.

For the latest information about IoTDB, please visit the official IoTDB website (https://iotdb.apache.org/).

Code description

Please refer to the paper for the design of each component. Further details will be added here after the paper is accepted.

1 Query Collector

The relevant Java class is QueryMonitorYaos, in package org.apache.iotdb.db.engine.compaction.
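To give a feel for what a query collector has to do, here is a minimal sketch of a bounded, thread-safe log of recent query time ranges. It is an assumption for illustration and does not reproduce QueryMonitorYaos: the names `QueryLog` and `QueryRecord`, the fixed capacity, and the choice to record only start and end timestamps are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical collector sketch; the real class in this repository is QueryMonitorYaos.
final class QueryLog {

    /** The time range requested by one query. */
    static final class QueryRecord {
        final long startTime;
        final long endTime;

        QueryRecord(long startTime, long endTime) {
            this.startTime = startTime;
            this.endTime = endTime;
        }

        long span() {
            return endTime - startTime;
        }
    }

    private final int capacity;
    private final Deque<QueryRecord> recent = new ArrayDeque<>();

    QueryLog(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records a query; the oldest record is dropped once the log is full. */
    synchronized void record(long startTime, long endTime) {
        if (endTime < startTime) {
            throw new IllegalArgumentException("endTime must not precede startTime");
        }
        if (recent.size() == capacity) {
            recent.removeFirst();
        }
        recent.addLast(new QueryRecord(startTime, endTime));
    }

    /** Returns a copy of the retained records, oldest first. */
    synchronized List<QueryRecord> snapshot() {
        return new ArrayList<>(recent);
    }
}
```

Returning a copy from `snapshot()` lets a later analysis step iterate without holding the lock while new queries keep arriving.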

2 Query Pattern Predictor

The relevant Java class is MLQueryAnalyzerYaos, in package org.apache.iotdb.db.engine.compaction.
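This README does not describe the machine learning model used by the predictor, so the sketch below swaps in a much simpler stand-in, an exponentially weighted moving average, purely to show the observe/predict shape such a component might have. The class `SpanPredictor`, the smoothing factor `alpha`, and the use of the query time span as the predicted quantity are assumptions, not the method implemented in MLQueryAnalyzerYaos.

```java
// Stand-in predictor for illustration only; PreS itself uses a machine learning model.
final class SpanPredictor {

    private final double alpha; // weight given to the newest observation
    private double estimate;
    private boolean seeded;

    SpanPredictor(double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Feeds the time span of one observed query into the estimate. */
    void observe(double span) {
        if (!seeded) {
            estimate = span;
            seeded = true;
        } else {
            estimate = alpha * span + (1.0 - alpha) * estimate;
        }
    }

    /** Returns the span expected for the next query. */
    double predict() {
        if (!seeded) {
            throw new IllegalStateException("no queries observed yet");
        }
        return estimate;
    }

    public static void main(String[] args) {
        SpanPredictor predictor = new SpanPredictor(0.5);
        predictor.observe(100);
        predictor.observe(200);
        System.out.println(predictor.predict()); // prints 150.0
    }
}
```

A larger `alpha` makes the estimate follow recent queries more closely, which is the trend-tracking behaviour that a static strategy lacks.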

3 Compaction Benefit Analyzer

The relevant Java class is YaosSizeCompactionSelector, in package org.apache.iotdb.db.engine.compaction.inner.sizetiered.

The function "selectLevelTask_byYaos_V1()" in this class describes how the compaction evaluator selects files on disk and submits them for compaction.
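The trade-off between the number of files and read amplification can be made concrete with a toy scoring function. Everything in the sketch below is an assumption for illustration: the names `BenefitSketch` and `FileRange`, the weight `lambda`, and the formula itself (files saved minus a penalty for how much wider the merged file is than the predicted query range) are hypothetical and are not the logic of "selectLevelTask_byYaos_V1()".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy benefit model for illustration; not the analyzer implemented in this repository.
final class BenefitSketch {

    /** The time range covered by one data file. */
    static final class FileRange {
        final long start;
        final long end;

        FileRange(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Files whose time range overlaps the query range [qStart, qEnd]. */
    static List<FileRange> touched(List<FileRange> files, long qStart, long qEnd) {
        List<FileRange> result = new ArrayList<>();
        for (FileRange f : files) {
            if (f.start <= qEnd && qStart <= f.end) {
                result.add(f);
            }
        }
        return result;
    }

    /**
     * Assumed score: merging n files saves (n - 1) file accesses per query, but the merged
     * file may span more time than the query needs, which is penalized with weight lambda.
     */
    static double score(List<FileRange> candidates, long qStart, long qEnd, double lambda) {
        if (candidates.size() < 2 || qEnd <= qStart) {
            return 0.0; // nothing to merge, or a degenerate query range
        }
        long mergedStart = Long.MAX_VALUE;
        long mergedEnd = Long.MIN_VALUE;
        for (FileRange f : candidates) {
            mergedStart = Math.min(mergedStart, f.start);
            mergedEnd = Math.max(mergedEnd, f.end);
        }
        double amplification = (double) (mergedEnd - mergedStart) / (qEnd - qStart);
        double filesSaved = candidates.size() - 1;
        return filesSaved - lambda * Math.max(0.0, amplification - 1.0);
    }

    public static void main(String[] args) {
        List<FileRange> files =
            Arrays.asList(new FileRange(0, 10), new FileRange(10, 20), new FileRange(30, 40));
        List<FileRange> candidates = touched(files, 5, 15);
        System.out.println(candidates.size() + " files, score " + score(candidates, 5, 15, 0.5));
    }
}
```

In this toy model a positive score means the saved file accesses outweigh the extra data a query would read from the merged file; a real analyzer would work from file sizes and index information rather than time spans alone.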

4 File Selector

The relevant Java class is YaosSizeCompactionSelector, in package org.apache.iotdb.db.engine.compaction.

Each time the compaction operation is triggered periodically, the function "selectLevelTask()" runs all the PreS operations and finally submits the selected files for compaction.
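The periodic trigger can be sketched with the JDK's `ScheduledExecutorService`. This is a generic illustration rather than a description of how IoTDB schedules compaction: the class `PeriodicTrigger` and the `selectionPass` callback, which stands in for a call such as "selectLevelTask()", are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical scheduler sketch; IoTDB's own compaction scheduling is not shown here.
final class PeriodicTrigger implements AutoCloseable {

    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    /** Runs `selectionPass` immediately, then again `periodMillis` after each run finishes. */
    PeriodicTrigger(Runnable selectionPass, long periodMillis) {
        scheduler.scheduleWithFixedDelay(
            () -> {
                try {
                    selectionPass.run();
                } catch (RuntimeException e) {
                    // An uncaught exception would cancel all future runs, so log and continue.
                    System.err.println("selection pass failed: " + e);
                }
            },
            0,
            periodMillis,
            TimeUnit.MILLISECONDS);
    }

    /** Stops the schedule and interrupts a pass that is still running. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch threePasses = new CountDownLatch(3);
        try (PeriodicTrigger trigger = new PeriodicTrigger(threePasses::countDown, 10)) {
            threePasses.await(2, TimeUnit.SECONDS);
        }
        System.out.println("passes still pending: " + threePasses.getCount());
    }
}
```

`scheduleWithFixedDelay` is used instead of `scheduleAtFixedRate` so that a slow selection pass delays the next one rather than causing runs to pile up.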

wangyao2/apache-iotdb-0.13.4-LSM-Research1
