New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

24bit integers for ziplists (patch) #469

Closed
grisha opened this Issue Apr 20, 2012 · 9 comments

Comments

Projects
None yet
2 participants
@grisha
Contributor

grisha commented Apr 20, 2012

This is a small patch to store integers that can fit in 24 bits using 3 bytes which makes ziplists even more memory efficient:

https://github.com/grisha/redis/tree/ziplist_24bit

https://github.com/grisha/redis/commit/2af8ebb25bb64bba385c495457a281b3298391a5

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Apr 20, 2012

Owner

I and Pieter like that... there is a code review effort in progress and it will probably enter 2.6. Thanks! More news soon.

Owner

antirez commented Apr 20, 2012

I and Pieter like that... there is a code review effort in progress and it will probably enter 2.6. Thanks! More news soon.

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Apr 23, 2012

Owner

Note to self: if we add this we should make sure to change the RDB version as well for 2.6 / unstable.

Owner

antirez commented Apr 23, 2012

Note to self: if we add this we should make sure to change the RDB version as well for 2.6 / unstable.

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Apr 23, 2012

Owner

Code reviewed, it's broken ;) However it is not ok that the Redis test can't detect such a trivial corruption.
The error is related to how two-complement representation works, when you take the three bytes you are not respecting this representation. I would use shifting of 8 bits to save it at a first glance.

However here the big problem is not that the patch is broken, it's trivial to fix, but that Redis tests were not able to check this trivial corruption. I'm adding a new test that can spot this errors easily.

Cheers,
Salvatore

Owner

antirez commented Apr 23, 2012

Code reviewed, it's broken ;) However it is not ok that the Redis test can't detect such a trivial corruption.
The error is related to how two-complement representation works, when you take the three bytes you are not respecting this representation. I would use shifting of 8 bits to save it at a first glance.

However here the big problem is not that the patch is broken, it's trivial to fix, but that Redis tests were not able to check this trivial corruption. I'm adding a new test that can spot this errors easily.

Cheers,
Salvatore

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Apr 23, 2012

Owner

I just added a test that can spot this issues trivially (unstable branch).

Owner

antirez commented Apr 23, 2012

I just added a test that can spot this issues trivially (unstable branch).

@grisha

This comment has been minimized.

Show comment
Hide comment
@grisha

grisha Apr 23, 2012

Contributor

Nice catch, obviously my specific case only used positive integers ;)

There are two ways this can be solved. The simplest is to replace INT24_MIN with 0 and not bother storing two's complements in 24 bits?

The other is some clever bitshifting to preserve that leftmost bit.
Something like: i32 = i32 | ((value & (1 << 31)) >> 7); // (looks ungly - is there a better way?)

Contributor

grisha commented Apr 23, 2012

Nice catch, obviously my specific case only used positive integers ;)

There are two ways this can be solved. The simplest is to replace INT24_MIN with 0 and not bother storing two's complements in 24 bits?

The other is some clever bitshifting to preserve that leftmost bit.
Something like: i32 = i32 | ((value & (1 << 31)) >> 7); // (looks ungly - is there a better way?)

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Apr 23, 2012

Owner

@grisha what about this instead?

@@ -299,9 +299,9 @@ static void zipSaveInteger(unsigned char *p, int64_t value, 
         memcpy(p,&i16,sizeof(i16));
         memrev16ifbe(p);
     } else if (encoding == ZIP_INT_24B) {
-        i32 = value;
+        i32 = value<<8;
         memrev32ifbe(&i32);
-        memcpy(p,&i32,sizeof(i32)-sizeof(int8_t));
+        memcpy(p,((unsigned char*)&i32)+1,sizeof(i32)-sizeof(int8_t));
     } else if (encoding == ZIP_INT_32B) {
         i32 = value;
         memcpy(p,&i32,sizeof(i32));
@@ -330,9 +330,9 @@ static int64_t zipLoadInteger(unsigned char *p, unsigned cha
         ret = i32;
     } else if (encoding == ZIP_INT_24B) {
         i32 = 0;
-        memcpy(&i32,p,sizeof(i32)-sizeof(int8_t));
+        memcpy(((unsigned char*)&i32)+1,p,sizeof(i32)-sizeof(int8_t));
         memrev32ifbe(&i32);
-        ret = i32;
+        ret = i32>>8;
     } else if (encoding == ZIP_INT_64B) {
Owner

antirez commented Apr 23, 2012

@grisha what about this instead?

@@ -299,9 +299,9 @@ static void zipSaveInteger(unsigned char *p, int64_t value, 
         memcpy(p,&i16,sizeof(i16));
         memrev16ifbe(p);
     } else if (encoding == ZIP_INT_24B) {
-        i32 = value;
+        i32 = value<<8;
         memrev32ifbe(&i32);
-        memcpy(p,&i32,sizeof(i32)-sizeof(int8_t));
+        memcpy(p,((unsigned char*)&i32)+1,sizeof(i32)-sizeof(int8_t));
     } else if (encoding == ZIP_INT_32B) {
         i32 = value;
         memcpy(p,&i32,sizeof(i32));
@@ -330,9 +330,9 @@ static int64_t zipLoadInteger(unsigned char *p, unsigned cha
         ret = i32;
     } else if (encoding == ZIP_INT_24B) {
         i32 = 0;
-        memcpy(&i32,p,sizeof(i32)-sizeof(int8_t));
+        memcpy(((unsigned char*)&i32)+1,p,sizeof(i32)-sizeof(int8_t));
         memrev32ifbe(&i32);
-        ret = i32;
+        ret = i32>>8;
     } else if (encoding == ZIP_INT_64B) {
@grisha

This comment has been minimized.

Show comment
Hide comment
@grisha

grisha Apr 23, 2012

Contributor

Yes, this is much clearer and works in my tests. I reworked my patch, if that's of any help:

https://github.com/grisha/redis/commit/e3352e49ad9e410dd4e3979476df12738211802b

Thanks!

Contributor

grisha commented Apr 23, 2012

Yes, this is much clearer and works in my tests. I reworked my patch, if that's of any help:

https://github.com/grisha/redis/commit/e3352e49ad9e410dd4e3979476df12738211802b

Thanks!

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Apr 24, 2012

Owner

Thanks, I merged it and tried to go forward along the path of exploiting more bits, given that anyway we are going to update the RDB format.

This is the result, including your patch: https://github.com/antirez/redis/compare/zipenc

Cheers,
Salvatore

Owner

antirez commented Apr 24, 2012

Thanks, I merged it and tried to go forward along the path of exploiting more bits, given that anyway we are going to update the RDB format.

This is the result, including your patch: https://github.com/antirez/redis/compare/zipenc

Cheers,
Salvatore

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Apr 27, 2012

Owner

Closing :)

Owner

antirez commented Apr 27, 2012

Closing :)

@antirez antirez closed this Apr 27, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment