Consider providing lzma tarballs? #248

@QuLogic

I ran a simple test: I unzipped each archive and re-compressed the resulting directory, once as a gzipped tarball (tar zc) and once as an LZMA-compressed (xz) tarball (tar Jc):

-rw-r--r--. 1.2M Feb 23 03:18 110m_cultural.tar.gz
-rw-r--r--. 334K Feb 23 03:18 110m_cultural.tar.xz
-rw-r--r--. 1.2M Oct 27 13:58 110m_cultural.zip
-rw-r--r--. 4.0M Feb 23 03:18 110m_physical.tar.gz
-rw-r--r--. 1.6M Feb 23 03:18 110m_physical.tar.xz
-rw-r--r--. 4.0M Feb 21 19:28 110m_physical.zip
-rw-r--r--. 7.4M Feb 23 03:18 50m_cultural.tar.gz
-rw-r--r--. 2.5M Feb 23 03:18 50m_cultural.tar.xz
-rw-r--r--. 7.5M Oct 27 13:58 50m_cultural.zip
-rw-r--r--. 7.3M Feb 23 03:18 50m_physical.tar.gz
-rw-r--r--. 3.7M Feb 23 03:18 50m_physical.tar.xz
-rw-r--r--. 7.3M Oct 27 13:58 50m_physical.zip
-rw-r--r--. 202M Feb 23 03:19 10m_cultural.tar.gz
-rw-r--r--. 127M Feb 23 03:22 10m_cultural.tar.xz
-rw-r--r--. 202M Oct 27 13:58 10m_cultural.zip
-rw-r--r--.  54M Feb 23 03:18 10m_physical.tar.gz
-rw-r--r--.  35M Feb 23 03:19 10m_physical.tar.xz
-rw-r--r--.  54M Oct 27 13:58 10m_physical.zip

The gzip tarball offers no real advantage over the zip, but the xz tarball is significantly smaller: it saves roughly 70% on the smallest files (110m_cultural) and about 35-40% on the largest (the 10m sets).
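For anyone who wants to repeat the comparison on other archives, here is a minimal Python sketch of the same unzip-and-recompress test. It is not the method used above (that was plain `unzip` and `tar`). The helper names `recompress` and `savings` are hypothetical, and Python's `tarfile` defaults may produce slightly different sizes than the command-line tools.

```python
import tarfile
import tempfile
import zipfile
from pathlib import Path


def recompress(zip_path, out_dir):
    """Unpack zip_path and re-pack its contents as .tar.gz and .tar.xz.

    Hypothetical helper: returns a dict mapping each archive suffix
    to its size in bytes.
    """
    zip_path, out_dir = Path(zip_path), Path(out_dir)
    sizes = {".zip": zip_path.stat().st_size}
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(tmp)
        # "w:gz" and "w:xz" play the role of `tar zc` and `tar Jc`.
        for suffix, mode in ((".tar.gz", "w:gz"), (".tar.xz", "w:xz")):
            target = out_dir / (zip_path.stem + suffix)
            with tarfile.open(target, mode) as tf:
                tf.add(tmp, arcname=zip_path.stem)
            sizes[suffix] = target.stat().st_size
    return sizes


def savings(original, compressed):
    """Percentage by which `compressed` is smaller than `original`."""
    return 100.0 * (1 - compressed / original)
```

As a rough check against the listing, `savings(202, 127)` for 10m_cultural comes out at about 37%. The sizes in the listing are rounded, so the percentages are approximate.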

Decompression is slower though, roughly three times in this test (/tmp is tmpfs, so everything is in memory):

$ time unzip -q 10m_cultural.zip -d /tmp/asdfg
real	0m3.075s
user	0m2.931s
sys	0m0.130s

$ time tar xf 10m_cultural.tar.xz -C /tmp/asdfgh
real	0m9.101s
user	0m9.012s
sys	0m0.624s
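The timing comparison can be approximated in Python as well. This is only a sketch under assumptions: `time_extract` is a hypothetical helper, it measures wall-clock time rather than the user/sys split that `time` reports, and Python's `zipfile` and `tarfile` modules will not match the speed of `unzip` and `tar`. The temporary directory is created wherever the system default is, which may or may not be a tmpfs.

```python
import tarfile
import tempfile
import time
import zipfile


def time_extract(archive_path):
    """Extract archive_path into a fresh temp dir; return elapsed seconds."""
    with tempfile.TemporaryDirectory() as dest:
        start = time.perf_counter()
        if zipfile.is_zipfile(archive_path):
            with zipfile.ZipFile(archive_path) as zf:
                zf.extractall(dest)
        else:
            # "r:*" lets tarfile auto-detect gzip or xz compression.
            with tarfile.open(archive_path, "r:*") as tf:
                tf.extractall(dest)
        return time.perf_counter() - start
```

Calling it on both `10m_cultural.zip` and `10m_cultural.tar.xz` and comparing the two results gives a relative slowdown that can be set against the figures above.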
