
[SPARK-13361][SQL] Add benchmark codes for Encoder#compress() in CompressionSchemeBenchmark #11236

Closed
wants to merge 1 commit

Conversation

@maropu (Member) commented on Feb 17, 2016

This PR adds benchmark code for Encoder#compress().
It also replaces the existing benchmark results with new ones because the output format of Benchmark has changed.
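
For context, here is a minimal sketch of the shape such a compress() benchmark case takes. `SimpleEncoder` and `benchCompress` are hypothetical stand-ins, not the actual Encoder trait or CompressionSchemeBenchmark code:

```scala
import java.nio.ByteBuffer

// Hypothetical, simplified stand-in for the columnar compression Encoder interface.
trait SimpleEncoder {
  def compress(from: ByteBuffer, to: ByteBuffer): ByteBuffer
}

object CompressBenchmarkSketch {
  // Times repeated bulk compression of the same input buffer and reports the best run,
  // mirroring the shape of the cases added to CompressionSchemeBenchmark.
  def benchCompress(name: String, encoder: SimpleEncoder, input: ByteBuffer, iters: Int): Unit = {
    val out = ByteBuffer.allocate(input.capacity() * 2)
    var best = Long.MaxValue
    (1 to iters).foreach { _ =>
      input.rewind(); out.clear()
      val start = System.nanoTime()
      encoder.compress(input, out)
      best = math.min(best, System.nanoTime() - start)
    }
    println(f"$name%-30s best: ${best / 1e6}%.2f ms")
  }
}
```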

@maropu (Member, Author) commented on Feb 17, 2016

@nongli This discussion comes from #10965.
Before I make a PR to improve the compression performance of columnar caches, I have some questions about InMemoryRelation. In the current code of InMemoryRelation, ColumnBuilder buffers input tuples in heap space and compresses them in bulk. Do you have any plan to use off-heap memory for this compression in ColumnBuilder? IMO we could address this by adding some functionality to the ColumnVector you implemented for Parquet vectorized decoding, and then using the extended ColumnVector to buffer input tuples inside ColumnBuilder. What do you think?
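
To make the off-heap question concrete, here is a rough sketch (not Spark code) of buffering values in an on-heap versus a direct (off-heap) ByteBuffer before a bulk compression pass; `IntBufferBuilder` is purely illustrative:

```scala
import java.nio.ByteBuffer

// Rough illustration only: a builder that appends int values into either an
// on-heap or an off-heap (direct) buffer before a single bulk compress() pass.
class IntBufferBuilder(capacityInInts: Int, offHeap: Boolean) {
  private val buf: ByteBuffer =
    if (offHeap) ByteBuffer.allocateDirect(capacityInInts * 4)
    else ByteBuffer.allocate(capacityInInts * 4)

  def append(v: Int): Unit = buf.putInt(v)

  // In InMemoryRelation, compression happens in bulk over the buffered data;
  // an off-heap buffer would keep this staging area out of the Java heap.
  def build(compress: ByteBuffer => ByteBuffer): ByteBuffer = {
    buf.flip()
    compress(buf)
  }
}
```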

@maropu (Member, Author) commented on Feb 17, 2016

Anyway, I'd like to make PRs to improve compression performance in InMemoryRelation.
A goal of this work is to bring the in-memory cache size closer to the size of Parquet-formatted data.
As a first step, I'd like to use DeltaBinaryPackingValuesReader/Writer from parquet-column in the IntDelta and LongDelta encoders, because this efficient integer compression can be applied widely across types such as SHORT, INT, and LONG. However, I have one technical issue: DeltaBinaryPackingValuesReader/Writer keeps an internal buffer for compressing/decompressing data, so we need to copy the whole output into a Spark internal buffer, which is a kind of overhead. We could avoid this overhead by inlining the Parquet code in Spark, but that raises a maintenance issue.
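
The copy overhead can be pictured roughly like this. `ParquetLikeDeltaWriter` is a hypothetical placeholder for DeltaBinaryPackingValuesWriter (which likewise owns its own output buffer), and the encoding inside it is a toy, not the real delta packing:

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer

// Hypothetical placeholder for a writer (like Parquet's DeltaBinaryPackingValuesWriter)
// that accumulates encoded bytes in its own internal buffer.
class ParquetLikeDeltaWriter {
  private val internal = new ByteArrayOutputStream()
  private var prev = 0
  // Toy encoding: writes only the low byte of each delta, just to model the buffering.
  def writeInteger(v: Int): Unit = { internal.write(v - prev); prev = v }
  def getBytes: Array[Byte] = internal.toByteArray
}

object CopyOverheadSketch {
  // Because the writer owns its buffer, handing the result over to Spark's columnar
  // buffer needs a second full pass over the encoded bytes -- the overhead noted above.
  def copyIntoSparkBuffer(writer: ParquetLikeDeltaWriter, to: ByteBuffer): ByteBuffer = {
    val encoded = writer.getBytes
    to.put(encoded) // extra copy of all compressed data
    to
  }
}
```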

As a second step, I plan to add code that applies general-purpose compression algorithms like LZ4 and Snappy in the final step of ColumnBuilder#build. This is because the byte arrays generated by some type-specific encoders like DictionaryEncoding are still compressible with these algorithms. Parquet also applies compression just before writing data to disk.
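
As a rough sketch of that final step (the hook into ColumnBuilder#build is hypothetical; snappy-java is assumed to be on the classpath):

```scala
import org.xerial.snappy.Snappy

object GeneralPurposeCompressionSketch {
  // Sketch: run a general-purpose codec over the bytes produced by a type-specific
  // encoder (e.g. DictionaryEncoding output) as the last step of building a column.
  def compressColumnBytes(encoded: Array[Byte]): Array[Byte] = {
    val compressed = Snappy.compress(encoded)
    // Keep the compressed form only if it actually saves space.
    if (compressed.length < encoded.length) compressed else encoded
  }
}
```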

Could you give me some suggestions on this?

@maropu (Member, Author) commented on Feb 18, 2016

Jenkins, retest this please.

@SparkQA commented on Feb 18, 2016

Test build #51484 has finished for PR 11236 at commit 7021303.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu (Member, Author) commented on Feb 23, 2016

I tried implementing IntDeltaBinaryPacking in compressionSchemes; it is a simplified version of DeltaBinaryPackingValuesReader/Writer in parquet-column, so that the compressed size can be calculated easily in gatherCompressibilityStats.
maropu@71bb944
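
For reference, a heavily simplified sketch of the delta-binary-packing idea (store the first value, zig-zag the deltas, pack with the minimum bit width); this is not the code in the commit above and skips mini-blocks and the actual bit packing:

```scala
object DeltaPackingSketch {
  // Heavily simplified: store the first value, turn the rest into zig-zag-encoded
  // deltas, and report the bit width the block needs. The real reader/writer in
  // parquet-column (and the commit above) also packs deltas into mini-blocks.
  def deltaPackBlock(values: Array[Int]): (Int, Int, Array[Int]) = {
    require(values.nonEmpty)
    val first = values(0)
    val deltas = new Array[Int](values.length - 1)
    var i = 1
    while (i < values.length) {
      val d = values(i) - values(i - 1)
      deltas(i - 1) = (d << 1) ^ (d >> 31) // zig-zag keeps small negative deltas small
      i += 1
    }
    val maxDelta = if (deltas.isEmpty) 0 else deltas.max
    val bitWidth = 32 - Integer.numberOfLeadingZeros(maxDelta | 1)
    (first, bitWidth, deltas)
  }
}
```

A block's compressed size can then be estimated as roughly 4 + ceil(bitWidth * deltas.length / 8) bytes, which is the kind of estimate gatherCompressibilityStats needs.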

The benchmark results are as follows;

Running benchmark: INT Decode(Lower Skew)
  Running case: PassThrough(1.000)
  Running case: RunLengthEncoding(1.002)
  Running case: DictionaryEncoding(0.500)
  Running case: IntDelta(0.250)
  Running case: IntDeltaBinaryPacking(0.068)

Intel(R) Core(TM) i7-4578U CPU @ 3.00GHz
INT Decode(Lower Skew):             Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------
PassThrough(1.000)                        285 /  360        235.7           4.2       1.0X
RunLengthEncoding(1.002)                  700 /  715         95.8          10.4       0.4X
DictionaryEncoding(0.500)                 763 /  782         88.0          11.4       0.4X
IntDelta(0.250)                           684 /  702         98.1          10.2       0.4X
IntDeltaBinaryPacking(0.068)              805 /  811         83.4          12.0       0.4X

Running benchmark: INT Decode(Higher Skew)
  Running case: PassThrough(1.000)
  Running case: RunLengthEncoding(1.337)
  Running case: DictionaryEncoding(0.501)
  Running case: IntDelta(0.250)
  Running case: IntDeltaBinaryPacking(0.182)

Intel(R) Core(TM) i7-4578U CPU @ 3.00GHz
INT Decode(Higher Skew):            Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------
PassThrough(1.000)                        690 /  716         97.3          10.3       1.0X
RunLengthEncoding(1.337)                 1127 / 1148         59.5          16.8       0.6X
DictionaryEncoding(0.501)                 836 /  856         80.2          12.5       0.8X
IntDelta(0.250)                           763 /  778         88.0          11.4       0.9X
IntDeltaBinaryPacking(0.182)              873 /  884         76.9          13.0       0.8X

Encoding/decoding gets a little slower, but the compression ratios get much better.

@maropu (Member, Author) commented on Feb 23, 2016

@nongli ping

@maropu (Member, Author) commented on Feb 25, 2016

@nongli @rxin ping

@nongli (Contributor) commented on Feb 26, 2016

LGTM, thanks for writing these benchmarks.

Moving forward, I agree that ColumnVector is a natural data structure to decode into, but we should probably not add this logic directly into those classes, purely from a code-maintenance point of view. I think exploring the Parquet encodings makes sense, but let's start by benchmarking them and see if they have the right performance characteristics.

@rxin (Contributor) commented on Feb 26, 2016

Merging this into master. Thanks.

@asfgit asfgit closed this in 1b39faf Feb 26, 2016
@maropu maropu deleted the CompressionSpike branch July 5, 2017 11:43