
Data Layout On Disk

Hongzheng Shi edited this page Dec 3, 2018 · 2 revisions

This page describes how AresDB persists data on disk.

Schema

Schema data should only be accessed through the metastore's schema API.

Schema File

{root_path}/metastore/{table_name}/schema

Enum Cases

{root_path}/metastore/{table_name}/enums/{column_name}

Archiving Cutoff

{root_path}/metastore/{table_name}/shards/{shard_id}/version

Backfill Progress (Fact table only)

{root_path}/metastore/{table_name}/shards/{shard_id}/redolog-offset

Snapshot Progress (Dimension table only)

{root_path}/metastore/{table_name}/shards/{shard_id}/redolog-offset

Batch Versions (Fact table only)

{root_path}/metastore/{table_name}/shards/{shard_id}/batches/{batch_id}
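
The metastore templates above can be expressed as a small set of path helpers. This is only an illustrative sketch: the function names are hypothetical and are not part of AresDB, and only the path components come from the templates listed on this page.

```python
from pathlib import PurePosixPath


def table_dir(root_path: str, table_name: str) -> PurePosixPath:
    """{root_path}/metastore/{table_name}"""
    return PurePosixPath(root_path) / "metastore" / table_name


def schema_file(root_path: str, table_name: str) -> PurePosixPath:
    """Schema file of a table."""
    return table_dir(root_path, table_name) / "schema"


def enum_file(root_path: str, table_name: str, column_name: str) -> PurePosixPath:
    """Enum cases of one column."""
    return table_dir(root_path, table_name) / "enums" / column_name


def shard_dir(root_path: str, table_name: str, shard_id: int) -> PurePosixPath:
    """Directory holding the per-shard metadata files."""
    return table_dir(root_path, table_name) / "shards" / str(shard_id)


def archiving_cutoff_file(root_path: str, table_name: str, shard_id: int) -> PurePosixPath:
    """Archiving cutoff, stored in the file named 'version'."""
    return shard_dir(root_path, table_name, shard_id) / "version"


def redolog_offset_file(root_path: str, table_name: str, shard_id: int) -> PurePosixPath:
    """Backfill progress (fact tables) or snapshot progress (dimension
    tables); this page lists the same file name for both."""
    return shard_dir(root_path, table_name, shard_id) / "redolog-offset"


def batch_version_file(root_path: str, table_name: str, shard_id: int, batch_id) -> PurePosixPath:
    """Version file of one batch (fact tables only)."""
    return shard_dir(root_path, table_name, shard_id) / "batches" / str(batch_id)


print(schema_file("/var/AresDB", "myTable"))
# /var/AresDB/metastore/myTable/schema
```

`PurePosixPath` is used so the helpers only build strings and never touch the file system.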

[Redologs](Redo-Logs)

Retention

Redolog files are purged based on the progress of the archiving process.

Path on disk

{root_path}/data/{table_name}_{shard_id}/redologs/{creation_time}.redolog

Sample

/var/AresDB/data/myTable_0/redologs/1499971253.redolog
/var/AresDB/data/myTable_1/redologs/1499970221.redolog
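
As a sketch of how the redolog path template maps to its components, the following builds and parses such paths. The names `redolog_path`, `parse_redolog_path`, and `RedologFile` are hypothetical helpers. The regular expression assumes that `shard_id` and `creation_time` are decimal integers, as they are in the samples above.

```python
import re
from typing import NamedTuple

# {root_path}/data/{table_name}_{shard_id}/redologs/{creation_time}.redolog
REDOLOG_RE = re.compile(
    r"^(?P<root>.+)/data/(?P<table>[^/]+)_(?P<shard>\d+)"
    r"/redologs/(?P<created>\d+)\.redolog$"
)


class RedologFile(NamedTuple):
    root_path: str
    table_name: str
    shard_id: int
    creation_time: int


def redolog_path(root_path: str, table_name: str, shard_id: int, creation_time: int) -> str:
    """Build a redolog file path from its components."""
    return f"{root_path}/data/{table_name}_{shard_id}/redologs/{creation_time}.redolog"


def parse_redolog_path(path: str) -> RedologFile:
    """Split a redolog file path back into its components."""
    match = REDOLOG_RE.match(path)
    if match is None:
        raise ValueError(f"not a redolog path: {path!r}")
    return RedologFile(
        root_path=match["root"],
        table_name=match["table"],
        shard_id=int(match["shard"]),
        creation_time=int(match["created"]),
    )


print(parse_redolog_path("/var/AresDB/data/myTable_0/redologs/1499971253.redolog"))
```

Because the table name is matched greedily, a table name that itself contains underscores is still separated from the trailing shard ID at the last underscore.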

Snapshot

Retention

Snapshots are purged on a per-table basis.

Path on disk

{root_path}/data/{table_name}_{shard_id}/snapshots/
 -- {redo_log1}_{offset1}
     -- {batchID1}
         -- {column1}.data
         -- {column2}.data
     -- {batchID2}
         -- {column1}.data
         -- {column2}.data

 -- {redo_log2}_{offset2}
     -- {batchID1}
         -- {column1}.data
         -- {column2}.data
     -- {batchID2}
         -- {column1}.data
         -- {column2}.data

Sample

/var/AresDB/data/myTable_0/snapshots/1499970253.snapshot
/var/AresDB/data/myTable_1/snapshots/1499970221.snapshot
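
To illustrate the directory tree shown under "Path on disk" (one directory per `{redo_log}_{offset}`, containing one directory per batch, containing one `.data` file per column), the sketch below walks a `snapshots/` directory and reports what it finds. The function name `list_snapshot_columns` is hypothetical, and the code assumes the tree layout rather than the flat `.snapshot` file names.

```python
from pathlib import Path
from typing import Dict, List


def list_snapshot_columns(snapshots_dir: Path) -> Dict[str, Dict[str, List[str]]]:
    """Map each '{redo_log}_{offset}' directory to its batches, and each
    batch to the sorted names of its column files (without '.data')."""
    layout: Dict[str, Dict[str, List[str]]] = {}
    for snapshot in sorted(p for p in snapshots_dir.iterdir() if p.is_dir()):
        batches: Dict[str, List[str]] = {}
        for batch in sorted(p for p in snapshot.iterdir() if p.is_dir()):
            batches[batch.name] = sorted(f.stem for f in batch.glob("*.data"))
        layout[snapshot.name] = batches
    return layout
```

A call such as `list_snapshot_columns(Path("/var/AresDB/data/myTable_0/snapshots"))` would return a nested dictionary keyed first by snapshot directory name and then by batch ID.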

Archive Batches

Retention

Retention is determined on a per-table basis.

Path on disk

{root_path}/data/{table_name}_{shard_id}/archive_batches/{batch_id}_{batch_version}-{backfill_seq_num}
{root_path}/data/{table_name}_{shard_id}/archive_batches/{batch_id}_{batch_version}-{backfill_seq_num}/{column}.data

NOTE: batch_id is the UTC date of the batch (for example, 2017-07-19).

NOTE: batch_version is the archiving cutoff, in seconds since the Unix epoch.

Sample

/var/AresDB/data/myTable_0/archive_batches/2017-07-19_1499971253/column1.data
/var/AresDB/data/myTable_0/archive_batches/2017-07-19_1499971253/column2.data
/var/AresDB/data/myTable_0/archive_batches/2017-07-19_1499971877/column1.data
/var/AresDB/data/myTable_0/archive_batches/2017-07-19_1499971877/column2.data
/var/AresDB/data/myTable_1/archive_batches/2017-07-20_1499971877/column1.data
/var/AresDB/data/myTable_1/archive_batches/2017-07-20_1499971877/column2.data
/var/AresDB/data/myTable_1/archive_batches/2017-07-20_1499974877/column1.data
/var/AresDB/data/myTable_1/archive_batches/2017-07-20_1499974877/column2.data
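
The archive batch directory name can be split into its parts with a short parser. This is a sketch under two assumptions: the names `ArchiveBatchDir` and `parse_archive_batch_dir` are hypothetical, and the `-{backfill_seq_num}` suffix is treated as optional because the samples above omit it.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# {batch_id}_{batch_version}-{backfill_seq_num}; suffix assumed optional.
BATCH_DIR_RE = re.compile(
    r"^(?P<batch_id>\d{4}-\d{2}-\d{2})_(?P<version>\d+)(?:-(?P<seq>\d+))?$"
)


class ArchiveBatchDir(NamedTuple):
    batch_id: date                    # UTC date of the batch
    batch_version: int                # cutoff, seconds in Unix time
    backfill_seq_num: Optional[int]   # None when the suffix is absent


def parse_archive_batch_dir(name: str) -> ArchiveBatchDir:
    """Parse a directory name such as '2017-07-19_1499971253'."""
    match = BATCH_DIR_RE.match(name)
    if match is None:
        raise ValueError(f"not an archive batch directory: {name!r}")
    seq = match["seq"]
    return ArchiveBatchDir(
        batch_id=date.fromisoformat(match["batch_id"]),
        batch_version=int(match["version"]),
        backfill_seq_num=int(seq) if seq is not None else None,
    )


print(parse_archive_batch_dir("2017-07-19_1499971253"))
```

Parsing `batch_id` into a `date` also rejects names whose date part is not a valid calendar date.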