MySQLOnRocksDB/mysql-5.6
forked from facebook/mysql-5.6

Loading…
Doubling space by primary key #14
|
|
spetrunia |
Remove unused code: support for storing part of index column in the v…
…
…alue part of RocksDB key-value pair. Initially, this space was used to store lengths of varchar columns (mem-comparable images of varchars were endspace-padded). Then, we switched to using mem-comparable encoding of varchars that stored length in the key part of the key-value pair, so support for index "tails" became dead code. I'm removing it now as I'm going to make changes in data storage (Issue #14), so I will likely break the dead code anyway. |
a4e83ad
|
|
|
spetrunia |
Fix issue #14: Doubling space by primary key
…
SQL table records are stored as (pk_keyval -> record_val) key-value pairs. When PK column datatype allows to decode PK column value from its mem-comparable form in pk_keyval, we don't need to store the column in record_val. |
037e3e9
|
I have made a fix for this. I pushed it into this branch:
https://github.com/MySQLOnRocksDB/mysql-5.6/tree/webscalesql-5.6.21.55-issue14
The branch has three revisions:
Fix the testsuite (It was broken by somebody who migrated to MySQLOnRocksDB repo. This is not related to this bug): 703e9a6
A patch that removes a piece of RocksDB-SE code that is no longer used: a4e83ad
The fix itself: 037e3e9 . There are no specific tests in the patch, the test coverage is that tests in t/rocksdb*.test and suite/rocksdb tests pass after the change.
Do I just push this to the main branch, or I need to submit patches to review into reviews.facebook.net ? (I haven't yet figured out how to do that)
Thanks Sergey, please submit patches to review.facebook.net (follow instructions at https://github.com/MySQLOnRocksDB/mysql-5.6/wiki).
|
|
spetrunia |
Fix issue #14: Doubling space by primary key
…
Summary: SQL table records are stored as (pk_keyval -> record_val) key-value pairs. When PK column datatype allows to decode PK column value from its mem-comparable form in pk_keyval, we don't need to store the column in record_val. Test Plan: ./mtr t/rocks*test && ./mtr --suite=rocksdb && ./mtr --suite=rocksdb_rpl Reviewers: hermanlee4, jonahcohen, maykov, yoshinorim Reviewed By: yoshinorim Differential Revision: https://reviews.facebook.net/D33117 |
c6ae5cd
|
|
|
spetrunia |
Issue #14: post-review fixes: spelling, better comments and variable …
…
…names Test Plan: N/A Reviewers: maykov, hermanlee4, yoshinorim, jonahcohen Reviewed By: jonahcohen Differential Revision: https://reviews.facebook.net/D33453 |
43c740d
|
|
|
spetrunia |
Fix issue #14: Doubling space by primary key
…
Summary: SQL table records are stored as (pk_keyval -> record_val) key-value pairs. When PK column datatype allows to decode PK column value from its mem-comparable form in pk_keyval, we don't need to store the column in record_val. Test Plan: ./mtr t/rocks*test && ./mtr --suite=rocksdb && ./mtr --suite=rocksdb_rpl Reviewers: hermanlee4, jonahcohen, maykov, yoshinorim Reviewed By: yoshinorim Differential Revision: https://reviews.facebook.net/D33117 |
7ea412e
|
|
|
spetrunia |
Issue #14: post-review fixes: spelling, better comments and variable …
…
…names Test Plan: N/A Reviewers: maykov, hermanlee4, yoshinorim, jonahcohen Reviewed By: jonahcohen Differential Revision: https://reviews.facebook.net/D33453 |
8d664e0
|
From @yoshinorim:
I just noticed primary keys are stored twice in RocksDB SE – as both key and value, even though
they are not needed to be stored as value.
How to repeat:
10 column primary key
create table r1 (
id1 bigint not null,
id2 bigint not null,
id3 bigint not null,
id4 bigint not null,
id5 bigint not null,
id6 bigint not null,
id7 bigint not null,
id8 bigint not null,
id9 bigint not null,
id10 bigint not null,
id11 bigint not null,
id12 bigint not null,
primary key (id1, id2, id3, id4, id5, id6, id7, id8, id9, id10)
) engine=rocksdb charset=latin1 collate=latin1_bin;
1 column primary key
create table r2 (
id1 bigint not null,
id2 bigint not null,
id3 bigint not null,
id4 bigint not null,
id5 bigint not null,
id6 bigint not null,
id7 bigint not null,
id8 bigint not null,
id9 bigint not null,
id10 bigint not null,
id11 bigint not null,
id12 bigint not null,
primary key (id1)
) engine=rocksdb charset=latin1 collate=latin1_bin;
Inserting 2 million rows on these tables. Values sequentially increasing from 1 to 2 million (all columns have same values per row) -- can be generated like this.
for(my $i=1; $i<= 2000000; $i++) {
for(my $j=0; $j < 12; $j++) {
if($j < 11) {
print "$i,";
}else {
print "$i\n";
}
}
}
then make compaction happen.
Size in KB:
id1-id10 pk:
353764
id1 pk:
212984
innodb
id1-id10 pk:
264241
id1 pk:
260046
So basically long primary keys increase on-disk space.
I also checked via gdb (setting breakpoint at rocksdb::WriteBatch::Put()) and confirmed.
insert into r1 values (1,1,1,1,1,1,1,1,1,1,1,1);
=>
(gdb) p key.size_
$1 = 84
(gdb) p value.size_
$2 = 96
insert into r2 values (1,1,1,1,1,1,1,1,1,1,1,1);
=>
(gdb) p key.size_
$3 = 12
(gdb) p value.size_
$4 = 96
I expected value.size_ were 12 and 84 respectively. Since both key fields and value fields are stored, duplicate fields can be removed from value fields.