This repository has been archived by the owner on Sep 27, 2019. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
36 changed files
with
907 additions
and
205 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -313,3 +313,6 @@ third_party/gflags/ | |
|
||
# Test output file logs to ignore | ||
stats_log | ||
|
||
test.txt | ||
Vagrantfile |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
## Files Added | ||
|
||
* [compressed_tile.cpp](https://github.com/rohit-cmu/compression-peloton/blob/master/src/storage/compressed_tile.cpp) | ||
* [compressed_tile.h](https://github.com/rohit-cmu/compression-peloton/blob/master/src/include/storage/compressed_tile.h) | ||
|
||
### Other Existing Files Modified | ||
|
||
Some additional functions have been added to support database compression. Here are the following files: | ||
|
||
* [tile.h](https://github.com/rohit-cmu/compression-peloton/blob/master/src/include/storage/tile.h) | ||
* [tile.cpp](https://github.com/rohit-cmu/compression-peloton/blob/master/src/storage/tile.cpp) | ||
* [tile_group.h](https://github.com/rohit-cmu/compression-peloton/blob/master/src/include/storage/tile_group.h) | ||
* [tile_group.cpp](https://github.com/rohit-cmu/compression-peloton/blob/master/src/storage/tile_group.cpp) | ||
* [data_table.cpp](https://github.com/rohit-cmu/compression-peloton/blob/master/src/storage/data_table.cpp) | ||
|
||
|
||
## Strategy for Compressing | ||
|
||
* Once a tile gets full (all slots occupied by tuples), the CompressTile is created. | ||
* We scan each column and sort it. | ||
* We compute the median, min and max, this gives us the minoffset(min-median) and maxoffset(max-median). | ||
* If these offsets can be represented in a smaller data type than the original data type, we compress the entire column. | ||
* The median is chosen as the base, and essentially only the offsets are stored in the column. | ||
* Currently we can compress SMALLINT and INTEGER and BIGINT. TINYINT is already 1 byte so we do not compress it. | ||
* We are yet to add support for decimal values. | ||
* This median is also stored as the metadata to later retrieve the original value | ||
|
||
## Testing | ||
|
||
* [Compression Correctness Test](https://github.com/rohit-cmu/compression-peloton/blob/master/test/sql/compression_sql_test.cpp) | ||
* [Compression Size Test](https://github.com/rohit-cmu/compression-peloton/blob/master/test/storage/compression_test.cpp) | ||
|
||
|
||
### Compression Correctness Test: | ||
* This test inserts 25 tuples | ||
* Each tuple is of the form (i, i*100) where i belongs to (0,25) | ||
* Since each tile group contains 1 tile and 10 tuples per tile, there are 3 tile groups formed. | ||
* The first 2 tile groups are full and get compressed. | ||
* The third tile group has 5 slots vacant and is still not full and is uncompressed. | ||
* Thus we now have compressed and uncompressed data. | ||
* We now perform a SELECT * on this and expect to correctly recieve the true value of the compressed data and uncompressed data. | ||
|
||
### Compression Size Test: | ||
* This test inserts 100 tuples | ||
* Each tuple is of the form [Integer, Integer, Decimal, Varchar] | ||
* Thus the tuple length is (4+ 4+ 8+ 8) = 24 bytes. | ||
* We currently support the compression of integers | ||
* The integers get compressed to TINYINT making the tuple sizes now (1+ 1+ 8+ 8) = 18 bytes | ||
* The test checks this decrease in size. | ||
|
||
|
||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.