This repository has been archived by the owner on Nov 3, 2021. It is now read-only.

Bug 1042247 - Consolidate compression logic in get_about_memory.py #377

Merged
merged 1 commit into from Aug 21, 2014

Conversation

EricRahm
Contributor

This makes compression of individual files opt-in and adds an option to compress the whole out_dir when completed. It also properly closes DMD files after use.

@EricRahm
Contributor Author

@amccreight What do you think of switching the compression algo of cc/gc logs from xz to gz or bz2? I'd like to standardize the compression type across all reports retrieved by get_about_memory.py

I did a quick test compressing a cc log; these were my results.

  • xz (lzma)
    • 2.963s, 7.2% original file size
    • Pros: it has the best compression
    • Cons: it's ~3X slower than bz2, ~7X slower than gz, and has to be called externally
  • gzip
    • 0.433s, 11.4% original file size
    • Pros: super fast, built-in, the 'standard'
    • Cons: worst compression, even at -9; Bugzilla does some weird double-compression w/ gz files
  • bzip2
    • 1.043s, 8.8% original file size
    • Pros: almost as good as xz, built-in, ~3X faster than xz
    • Cons: slightly larger files than xz, ~ 2X slower than gzip
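
For anyone who wants to repeat this kind of comparison without shelling out, here is a minimal sketch using Python's standard-library codecs. It assumes Python 3.3 or later, where `lzma` ships with the standard library; the `benchmark` helper and the sample bytes are hypothetical and not part of get_about_memory.py. In-process timings will not match the command-line numbers in this comment, so treat the output as a relative comparison only.

```python
import bz2
import gzip
import lzma
import time


def benchmark(data):
    """Return {codec: (seconds, compressed_size / original_size)}."""
    codecs = {
        # Library defaults mirror the CLI defaults: xz preset 6, bzip2 level 9.
        "xz": lzma.compress,
        "gzip": lambda raw: gzip.compress(raw, compresslevel=9),
        "bzip2": bz2.compress,
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        compressed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (elapsed, len(compressed) / len(data))
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a cc-edges log; substitute real log bytes.
    sample = b"0x7f00 [rc=1] nsGlobalWindow\n> 0x7f10 mDocument\n" * 20000
    for name, (seconds, ratio) in benchmark(sample).items():
        print(f"{name}: {seconds:.3f}s, {ratio:.1%} of original size")
```

Synthetic, highly repetitive input like the sample above compresses far better than a real log, so real cc/gc logs are needed for meaningful ratios.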

Raw numbers:

erahm-25043:about-memory-5 ericrahm$ time xz -k cc-edges.4975.1408056643.log

real    0m2.963s
user    0m2.919s
sys     0m0.043s

erahm-25043:about-memory-5 ericrahm$ time gzip -k -9 cc-edges.4975.1408056643.log

real    0m0.433s
user    0m0.429s
sys     0m0.004s

erahm-25043:about-memory-5 ericrahm$ time bzip2 -k cc-edges.4975.1408056643.log

real    0m1.043s
user    0m1.035s
sys     0m0.007s

-rw-r--r--   1 ericrahm  staff   8709729 Aug 14 15:50 cc-edges.4975.1408056643.log
-rw-r--r--   1 ericrahm  staff    769477 Aug 14 15:50 cc-edges.4975.1408056643.log.bz2
-rw-r--r--   1 ericrahm  staff    996217 Aug 14 15:50 cc-edges.4975.1408056643.log.gz
-rw-r--r--   1 ericrahm  staff    628432 Aug 14 15:50 cc-edges.4975.1408056643.log.xz
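
The percentages and speed ratios in the summary can be re-derived from the raw numbers above. A small sketch of that arithmetic follows; the sizes and times are copied from the listing and timings, while the helper names are hypothetical.

```python
# File sizes in bytes, from the directory listing.
sizes = {"log": 8709729, "bz2": 769477, "gz": 996217, "xz": 628432}
# Wall-clock ("real") seconds, from the `time` output.
times = {"xz": 2.963, "gz": 0.433, "bz2": 1.043}


def percent_of_original(kind):
    """Compressed size as a percentage of the uncompressed log."""
    return 100 * sizes[kind] / sizes["log"]


def slowdown(slow, fast):
    """How many times longer `slow` took than `fast`."""
    return times[slow] / times[fast]


for kind in ("xz", "gz", "bz2"):
    print(f"{kind}: {percent_of_original(kind):.1f}% of original")
# xz: 7.2% of original
# gz: 11.4% of original
# bz2: 8.8% of original

print(f"xz vs bz2: {slowdown('xz', 'bz2'):.1f}x")  # 2.8x
print(f"xz vs gz:  {slowdown('xz', 'gz'):.1f}x")   # 6.8x
print(f"bz2 vs gz: {slowdown('bz2', 'gz'):.1f}x")  # 2.4x
```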

@amccreight

Whatever you think is best is fine with me. bzip2 does sound good, though
I've seen its author discourage its use. ;) I don't know how much size vs.
time matters for this case. I guess time is probably the bigger deal if
we're running that on the phones?

- Adds support for not compressing DMD and GC/CC logs
- GC/CC logs now use gzip compression
- Adds support for creating an archive of the report folder
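
A rough sketch of what opt-in gzip compression and whole-folder archiving could look like is shown here. This is not the code from the commit: `save_log`, `archive_report_dir`, and the `.tar.gz` archive format are assumptions made for illustration. The `with` blocks also show one way to guarantee file handles are closed after use.

```python
import gzip
import os
import shutil
import tarfile


def save_log(src_path, compress=False):
    """Optionally gzip src_path; return the path of the file that remains."""
    if not compress:
        return src_path
    gz_path = src_path + ".gz"
    # Context managers close both handles even if the copy raises.
    with open(src_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(src_path)
    return gz_path


def archive_report_dir(out_dir):
    """Create <out_dir>.tar.gz beside out_dir and return the archive path."""
    out_dir = out_dir.rstrip(os.sep)
    archive_path = out_dir + ".tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the archive rooted at the folder name, not the full path.
        tar.add(out_dir, arcname=os.path.basename(out_dir))
    return archive_path
```

If the whole folder is archived at the end, compressing individual logs first gains little, which is one reason to keep per-file compression opt-in.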
rvandermeulen added a commit that referenced this pull request Aug 21, 2014
Bug 1042247 - Consolidate compression logic in get_about_memory.py
@rvandermeulen rvandermeulen merged commit c733ade into mozilla-b2g:master Aug 21, 2014