Compression Speedup #1565

Closed
Baunsgaard wants to merge 7 commits into apache:main from Baunsgaard:CompressSpeed

@Baunsgaard
Contributor

This PR speeds up compression by:

  • adding a direct-path compression,
  • removing the nnz count bug,
  • optimizing the sparse combine in cocoding,
  • running the first combine of the greedy cocode in parallel.

@Baunsgaard Baunsgaard force-pushed the CompressSpeed branch 2 times, most recently from 7c245c2 to 09ca38e Compare March 22, 2022 17:28
@Baunsgaard Baunsgaard force-pushed the CompressSpeed branch 3 times, most recently from 0686a6f to 9fa6f40 Compare April 6, 2022 11:14
This commit adds optimizations to the encoding combination algorithms
to allow faster sparse-sparse and sparse-dense combine.
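
As a rough illustration of what a sparse-sparse combine can look like, the sketch below merges two sparse column encodings by walking their sorted offset lists once, so rows where both columns hold the default value are never visited. The `SparseEncoding` type, its fields, and the `combine` method are hypothetical names chosen for this example and are not taken from the SystemDS code; the actual implementation in this commit may be organized differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SparseCombineSketch {

    /** Hypothetical sparse encoding: rows not listed in offsets hold the default id 0. */
    static final class SparseEncoding {
        final int[] offsets; // sorted row indexes that hold a non-default value
        final int[] ids;     // value id in 1..nUnique for each listed row
        final int nUnique;   // number of distinct non-default ids

        SparseEncoding(int[] offsets, int[] ids, int nUnique) {
            this.offsets = offsets;
            this.ids = ids;
            this.nUnique = nUnique;
        }
    }

    /** Combine two sparse encodings with a single merge pass over their offsets. */
    static SparseEncoding combine(SparseEncoding a, SparseEncoding b) {
        final int stride = a.nUnique + 1; // makes every (idA, idB) pair a unique key
        final Map<Integer, Integer> remap = new HashMap<>();
        final List<Integer> offs = new ArrayList<>();
        final List<Integer> ids = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.offsets.length || j < b.offsets.length) {
            final int ra = i < a.offsets.length ? a.offsets[i] : Integer.MAX_VALUE;
            final int rb = j < b.offsets.length ? b.offsets[j] : Integer.MAX_VALUE;
            final int row = Math.min(ra, rb);
            // A side that has no entry at this row contributes the default id 0.
            final int idA = ra == row ? a.ids[i++] : 0;
            final int idB = rb == row ? b.ids[j++] : 0;
            final int key = idA + idB * stride;
            // Assign dense combined ids in order of first appearance, starting at 1.
            Integer id = remap.get(key);
            if (id == null) {
                id = remap.size() + 1;
                remap.put(key, id);
            }
            offs.add(row);
            ids.add(id);
        }
        return new SparseEncoding(toArray(offs), toArray(ids), remap.size());
    }

    private static int[] toArray(List<Integer> list) {
        return list.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

The work in this sketch is proportional to the number of non-default entries in the two inputs rather than to the number of rows, which is the property a sparse combine is after.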
This commit adds a few specializations to maps:
MapToZero for all-zero mappings.
MapToCharPByte for 3-byte mappings (in between char and int).
General specialization of lmm and a changed cost model for lmm.
Fix cost estimator on unknown dimensions (set to 16).

fix sparse TSMM in full rows CSR

remove memorizer on Offsets

clear soft reference in case of spark compression

replace shortcut and compressed multiply cost minimum rows processed

more likely to transpose

MM binary no decompression

fix single col table on compressed colgroup

transpose size in memory if compressed is equal to compressed size
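
The idea behind a 3-byte mapping such as MapToCharPByte can be sketched as follows: each entry is split into a 16-bit `char` and an 8-bit `byte`, so values up to 2^24 - 1 fit in 3 bytes per entry instead of the 4 bytes an `int` array would use. The class below is a minimal sketch under that assumption; the class name, field layout, and methods are hypothetical and not copied from the SystemDS implementation.

```java
final class CharPlusByteMapSketch {
    /** Largest value representable in 24 bits. */
    static final int MAX_VALUE = (1 << 24) - 1;

    private final char[] low;  // bits 0..15 of each entry
    private final byte[] high; // bits 16..23 of each entry

    CharPlusByteMapSketch(int size) {
        low = new char[size];
        high = new byte[size];
    }

    void set(int i, int v) {
        if (v < 0 || v > MAX_VALUE)
            throw new IllegalArgumentException("value does not fit in 3 bytes: " + v);
        low[i] = (char) v;           // narrowing cast keeps the low 16 bits
        high[i] = (byte) (v >>> 16); // narrowing cast keeps bits 16..23
    }

    int get(int i) {
        // Mask the byte so it is treated as unsigned before shifting it back.
        return ((high[i] & 0xFF) << 16) | low[i];
    }

    int size() {
        return low.length;
    }
}
```

With this layout, `set(0, 70000)` followed by `get(0)` returns 70000, a value that is too large for a plain `char` map but does not need a full `int` map.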
@Baunsgaard Baunsgaard mentioned this pull request Apr 19, 2022
@Baunsgaard
Contributor Author

Closed because of #1589.

@Baunsgaard Baunsgaard closed this Apr 19, 2022
@Baunsgaard Baunsgaard deleted the CompressSpeed branch August 18, 2022 12:07