andrewleech
Contributor

Exposes basic compression support from uzlib. I originally wrote this nearly a year ago so don't really remember too much about it.

Pushed up to support #5590

It doesn't have any unit tests yet and probably needs some first. Possibly also some #defines to enable/disable the compress functionality?

memset(comp, 0, sizeof(*comp));

comp->dict_size = 32768;
comp->hash_bits = 12;
Contributor

This and the previous line could use a comment stating why these values were chosen, I think (I know nothing about gzip, though).

Contributor

@codefreax codefreax Feb 13, 2020

Would it make sense to expose these settings in the API?
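
For context on the two values questioned above: 32768 is 2^15, the largest back-reference distance that DEFLATE allows (RFC 1951), and hash_bits = 12 presumably sizes the match-finder hash table at 1 << 12 = 4096 entries. The sketch below is not uzlib code. It uses CPython's zlib module as a stand-in (an assumption, since uzlib.gzip only exists on builds with this PR) to show what a smaller window would cost if the setting were exposed; raw_deflate is a hypothetical helper name.

```python
import random
import zlib

def raw_deflate(data, window_bits):
    """Compress to a raw DEFLATE stream using a 2**window_bits byte window."""
    comp = zlib.compressobj(level=9, wbits=-window_bits)
    return comp.compress(data) + comp.flush()

# Deterministic, incompressible 4 KiB block, repeated once.
# The only redundancy is a repeat 4096 bytes back.
rng = random.Random(0)
block = bytes(rng.randrange(256) for _ in range(4096))
data = block * 2

small = raw_deflate(data, 9)    # 512-byte window: cannot see the first copy
large = raw_deflate(data, 15)   # 32768-byte window: matches the first copy

print(len(data), len(small), len(large))
```

On this input the 512-byte window cannot reach back to the first copy of the block, so its output is considerably larger than the 32 KiB window's. A configurable window would trade RAM for compression ratio in the same way.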

uzlib_compress(comp, bufinfo.buf, len);
zlib_finish_block(&comp->out);

printf("compressed from %u to %u raw bytes\n", len, comp->out.outlen);
Contributor

Use DEBUG_printf ?


printf("compressed from %u to %u raw bytes\n", len, comp->out.outlen);

mp_uint_t dest_buf_size = (comp->out.outlen + 6 + (3*sizeof(int)));
Contributor

Same remark as earlier, what are these magic constants?
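
One plausible reading of the constants, assuming sizeof(int) is 4: the gzip framing defined in RFC 1952 is a 10-byte fixed header plus an 8-byte trailer, and 6 + 3 * 4 gives the same 18 bytes (six single-byte fields plus mtime, CRC32, and the input length). The sketch below checks that arithmetic with Python's struct module. The constant and function names are illustrative and do not come from the PR.

```python
import struct

# RFC 1952 fixed header: ID1, ID2, CM, FLG (1 byte each),
# MTIME (4 bytes), XFL, OS (1 byte each). "<" means no padding.
GZIP_HEADER_FMT = "<BBBBIBB"
# RFC 1952 trailer: CRC32 and ISIZE, 4 bytes each, little-endian.
GZIP_TRAILER_FMT = "<II"

HEADER_SIZE = struct.calcsize(GZIP_HEADER_FMT)
TRAILER_SIZE = struct.calcsize(GZIP_TRAILER_FMT)

SIZEOF_INT = 4  # assumption: the C expression relies on a 4-byte int

def gzip_buffer_size(deflate_len):
    """Bytes needed for header + raw DEFLATE body + trailer."""
    return deflate_len + HEADER_SIZE + TRAILER_SIZE

# Same total as the expression under review, for a 14-byte body.
print(gzip_buffer_size(14), 14 + 6 + 3 * SIZEOF_INT)
```

If int were not four bytes on some port, the C expression and the on-wire layout would disagree, which is one argument for spelling the sizes out as named constants.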

@stinos
Contributor

stinos commented Feb 6, 2020

There's MICROPY_PY_UZLIB already, but I guess splitting that into separate compress/decompress options doesn't hurt, and it can be used to keep the original behavior (i.e. not enabling compression by default).

A bunch of tests would be nice indeed, and fairly easy to write (assuming asserting that decompress(compress(x)) == x is sufficient for proving it works)
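
A minimal sketch of such a round-trip test. It runs against CPython's gzip module as a stand-in, because uzlib.gzip is only available on a build that includes this PR; on such a build the compress and decompress arguments would presumably be swapped for the uzlib functions. The sample list and function names are illustrative.

```python
import gzip

# Inputs chosen to cover empty, short, highly repetitive, and full-byte-range data.
SAMPLES = [
    b"",
    b"Hello World!",
    b"a" * 1000,
    bytes(range(256)) * 8,
]

def roundtrip(data, compress=gzip.compress, decompress=gzip.decompress):
    """Return decompress(compress(data)) for the given pair of functions."""
    return decompress(compress(data))

def run_roundtrip_tests(compress=gzip.compress, decompress=gzip.decompress):
    """Check decompress(compress(x)) == x for every sample; return the count."""
    for sample in SAMPLES:
        result = roundtrip(sample, compress, decompress)
        assert bytes(result) == sample, "round trip changed %r" % sample[:16]
    return len(SAMPLES)

print(run_roundtrip_tests(), "samples passed")
```

A round trip only proves the pair is self-consistent. Decompressing the output with an independent gzip implementation as well would additionally catch header or trailer mistakes.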

@dpgeorge dpgeorge added the extmod Relates to extmod/ directory in source label Feb 6, 2020
Comment on lines 239 to 241
int mtime = 0;
memcpy(&dest_buf[i], &mtime, sizeof(mtime));
i += sizeof(mtime);
Contributor

According to Wikipedia, mtime must be 4 bytes. Use uint32_t here?

Comment on lines 216 to 217
struct uzlib_comp *comp = m_new_obj(struct uzlib_comp);
memset(comp, 0, sizeof(*comp));
Contributor

nit: Merge into m_new0(struct uzlib_comp, 1);?

Comment on lines 235 to 238
dest_buf[i++] = 0x1f;
dest_buf[i++] = 0x8b;
dest_buf[i++] = 0x08;
dest_buf[i++] = 0x00; // FLG
Contributor

nit: Can you add a few more comments here and below where there are more magic numbers?

Comment on lines 248 to 250
unsigned int crc = ~uzlib_crc32(bufinfo.buf, len, ~0);
memcpy(&dest_buf[i], &crc, sizeof(crc));
i += sizeof(crc);
Contributor

uint32_t?

mp_obj_t data = args[0];
mp_buffer_info_t bufinfo;
mp_get_buffer_raise(data, &bufinfo, MP_BUFFER_READ);
int len = bufinfo.len;
Contributor

This is written to the output; should this be uint32_t (or whatever the right size is)?
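
The width comments on mtime, crc, and len come down to the same point: RFC 1952 fixes MTIME, CRC32, and ISIZE at four bytes each, little-endian, regardless of the platform's int. Below is a sketch of the framing with explicit widths, using CPython's zlib for the DEFLATE body as a stand-in for uzlib_compress. make_gzip_member is a hypothetical name, and the XFL and OS values are assumptions rather than what the PR writes.

```python
import gzip
import struct
import zlib

def make_gzip_member(data, mtime=0):
    """Wrap a raw DEFLATE stream in a gzip header and trailer (RFC 1952)."""
    comp = zlib.compressobj(wbits=-15)          # raw DEFLATE, 32 KiB window
    body = comp.compress(data) + comp.flush()
    header = struct.pack(
        "<BBBBIBB",
        0x1F, 0x8B,   # ID1, ID2: gzip magic
        0x08,         # CM: DEFLATE
        0x00,         # FLG: no optional fields
        mtime,        # MTIME: always 4 bytes
        0x00,         # XFL: no extra flags (assumed)
        0xFF,         # OS: unknown (assumed)
    )
    trailer = struct.pack(
        "<II",
        zlib.crc32(data) & 0xFFFFFFFF,  # CRC32 of the uncompressed data
        len(data) & 0xFFFFFFFF,         # ISIZE: length modulo 2**32
    )
    return header + body + trailer

member = make_gzip_member(b"Hello World!")
print(gzip.decompress(member))
```

In C the equivalent would be uint32_t for all three fields, which also keeps the buffer-size calculation independent of sizeof(int).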

@QAMU

QAMU commented Apr 27, 2020

Hello,
with this commit, how do I compress data in MicroPython?
Can I use uzlib.compress(data)?

@andrewleech
Contributor Author

Hi @QAMU, I think it worked like uzlib.gzip(data).
I haven't used it for some time, though, and I'm not sure it's ever likely to get completed enough to merge.

@QAMU

QAMU commented Apr 28, 2020

Hi @andrewleech,
thanks for your reply. Can I use uzlib.decompress() to decompress the output of uzlib.gzip()?

@andrewleech
Contributor Author

yep the two should work together well

@harbaum

harbaum commented Feb 22, 2021

It seems compression and decompression don't work with each other:

>>> uzlib.decompress(uzlib.gzip("Hello World!"))
compressed from 12 to 14 raw bytes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: -3

@harbaum

harbaum commented Feb 27, 2021

It seems compression and decompression don't work with each other:

They do if one removes the gzip header, crc and length fields and uses the correct wbits:

>>> uzlib.decompress(uzlib.gzip("Hello World!")[10:-8], -15)
compressed from 12 to 14 raw bytes
bytearray(b'Hello World!')
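
The same behaviour can be reproduced on a desktop with CPython's zlib, used here only as a stand-in for uzlib. A default decompress call expects a zlib wrapper and rejects the gzip framing, while slicing off the 10-byte header and 8-byte trailer leaves a raw DEFLATE stream that wbits = -15 accepts. gzip_bytes is a hypothetical helper.

```python
import zlib

def gzip_bytes(data):
    """Produce a gzip member with a plain 10-byte header (no optional fields)."""
    comp = zlib.compressobj(wbits=31)
    return comp.compress(data) + comp.flush()

gz = gzip_bytes(b"Hello World!")

# Default wbits expects a zlib header, so the gzip magic bytes are rejected.
try:
    zlib.decompress(gz)
    default_ok = True
except zlib.error:
    default_ok = False

# Drop header and trailer, then decode the body as raw DEFLATE.
raw = zlib.decompress(gz[10:-8], -15)

print(default_ok, raw)
```

The fixed slice is only valid while the FLG byte is zero. A header carrying a file name or extra field is longer than 10 bytes.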

@harbaum

harbaum commented Mar 1, 2021

Updated attempt here.

@QAMU

QAMU commented Mar 1, 2021

to compress use:

import uzlib
def compress(buffer):
    encoded = uzlib.gzip(buffer)
    return encoded

to decompress:

import uzlib 
import uio
def decompress(buffer):
    file = uio.BytesIO(buffer)
    decoded = uzlib.DecompIO(file, 31)
    return decoded.read()

@harbaum harbaum left a comment

to decompress:

import uzlib 
import uio
def decompress(buffer):
    file = uio.BytesIO(buffer)
    decoded = uzlib.DecompIO(file, 31)
    return decoded.read()

Why not:

def decompress(buffer):
  return uzlib.decompress(buffer[10:-8], -15)

Seems more lightweight for an embedded solution.
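
One caveat with the slice: it silently discards the CRC32 and length that gzip stores for integrity checking, and it assumes the header has no optional fields. Below is a sketch of a variant that keeps the lightweight slice but checks those assumptions. It is written against CPython's zlib as a stand-in; gunzip_checked is a hypothetical name, and on MicroPython the CRC would have to come from somewhere other than zlib.crc32.

```python
import struct
import zlib

def gunzip_checked(buf):
    """Decode a gzip member with a plain 10-byte header and verify its trailer."""
    if len(buf) < 18 or buf[0:3] != b"\x1f\x8b\x08":
        raise ValueError("not a gzip member using DEFLATE")
    if buf[3] != 0:
        # FLG bits announce optional fields, so the body does not start at byte 10.
        raise ValueError("optional header fields present")
    data = zlib.decompress(buf[10:-8], -15)
    crc, isize = struct.unpack("<II", buf[-8:])
    if crc != (zlib.crc32(data) & 0xFFFFFFFF):
        raise ValueError("CRC32 mismatch")
    if isize != (len(data) & 0xFFFFFFFF):
        raise ValueError("length mismatch")
    return data

comp = zlib.compressobj(wbits=31)
sample = comp.compress(b"Hello World!") + comp.flush()
print(gunzip_checked(sample))
```

Whether the extra checks are worth their code size on a small port is a judgment call. The plain slice is reasonable when the data comes from the matching uzlib.gzip and transport integrity is handled elsewhere.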

memset(comp->hash_table, 0, hash_size);

zlib_start_block(&comp->out);
uzlib_compress(comp, bufinfo.buf, len);

uzlib_compress as well as zlib_start_block internally (re-)allocate a buffer in comp->out which is never freed. As a result, the system runs out of memory after a few compression runs.

Adding free(comp->out.outbuf); after the subsequent memcpy solved this.

I'm trying to encode a string so that it fits into the shortest SMS payload possible.
The question is: how can I add this gzip uzlib extension to my current 1.14 toolchain for ESP32?

@br0kenpixel br0kenpixel mentioned this pull request May 6, 2021
@andrewleech andrewleech force-pushed the gzip branch 3 times, most recently from f69e550 to 106f843 Compare July 13, 2021 08:06
@andrewleech
Contributor Author

@harbaum Thanks for your additions in #6972, there is some great work there. I've rebased your changes on my original branch as a separate commit to keep your attribution.

In addition, I've added support for gzip header in the decompress(bytes, 31) function to allow it to work directly on the gzip compressed data.
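
For reference, 31 follows the zlib convention of adding 16 to the window-size exponent to select the gzip wrapper: 31 = 16 + 15. Below is a small sketch of that convention as CPython's zlib documents it for decompression. describe_wbits is a hypothetical helper, and whether this branch accepts every value in each range is not something the thread confirms.

```python
import zlib

def describe_wbits(wbits):
    """Return (container, window_size_in_bytes) for a zlib-style wbits value."""
    if 8 <= wbits <= 15:
        return ("zlib", 1 << wbits)
    if -15 <= wbits <= -8:
        return ("raw", 1 << -wbits)
    if 24 <= wbits <= 31:
        return ("gzip", 1 << (wbits - 16))
    raise ValueError("unsupported wbits value: %d" % wbits)

comp = zlib.compressobj(wbits=31)
gz_data = comp.compress(b"Hello World!") + comp.flush()

print(describe_wbits(31), zlib.decompress(gz_data, 31))
```

With the wrapper handled inside decompress, callers no longer need the [10:-8] slice and the trailer can be checked by the library.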

The docs have been updated and I've added a basic compress unit test.

@andrewleech
Contributor Author

OK, I've broken the unit tests with the latest additions (gzip decompress and the rebase onto current master), and it's too big for tiny ports, so it probably does need a new feature flag. Either that, or just build and distribute it as the dynamic C module version of uzlib.

tannewt added a commit to tannewt/circuitpython that referenced this pull request Dec 29, 2021
clear out interrupt when freeing the timer
@dpgeorge
Member

Superseded by #11905.

@dpgeorge dpgeorge closed this Jul 21, 2023