modules/zstd: Add ZstdDecoder top level proc #1315
@lpawelcz can you add some markdown documentation on how to test the implementation against |
is the conflict in |
Would it make sense to add a README.md w/ a known limitation section in the zstd directory to document this? |
xls/modules/zstd/zstd_dec_test.cc
Outdated
this->ParseAndCompareWithZstd(frame.value());
}

//class ZstdDecoderSeededTest : public ZstdDecoderTest,
any reason this is commented out?
Each test case in this suite generates a random ZSTD frame with the `decodecorpus` utility. The frame is decoded with `libzstd`, and then the same encoded frame is processed through the simulation of the ZSTD decoder. The output of the simulation is gathered and compared against the results of the `libzstd` decoding.
The test case is repeated 50 times with generated frames containing only RAW blocks and 50 times with frames containing only RLE blocks.
Currently, some of those test cases fail and we are looking into them. For the time being, we commented these tests out so that it is easy to see if something unexpected causes the CI to fail in this PR.
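For readers unfamiliar with the two block types exercised here, the sketch below shows how RAW and RLE blocks decode according to RFC 8878: a 3-byte little-endian block header carries `Last_Block` (bit 0), `Block_Type` (bits 1-2) and `Block_Size` (bits 3-23). It is only an illustration and not code from this PR; the names `ParseBlockHeader` and `DecodeSimpleBlocks` are hypothetical, and the frame header, magic number and checksum are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Block types as numbered in RFC 8878.
enum class BlockType : uint8_t { kRaw = 0, kRle = 1, kCompressed = 2, kReserved = 3 };

struct BlockHeader {
  bool last_block;
  BlockType type;
  uint32_t size;  // Block_Size, a 21-bit field
};

// Parses the 3-byte little-endian block header (hypothetical helper).
BlockHeader ParseBlockHeader(const uint8_t* p) {
  assert(p != nullptr);
  uint32_t raw = static_cast<uint32_t>(p[0]) |
                 (static_cast<uint32_t>(p[1]) << 8) |
                 (static_cast<uint32_t>(p[2]) << 16);
  return BlockHeader{(raw & 1) != 0, static_cast<BlockType>((raw >> 1) & 3),
                     raw >> 3};
}

// Decodes a run of RAW/RLE blocks up to and including the last block.
// Returns nullopt on truncated input or on any other block type.
std::optional<std::vector<uint8_t>> DecodeSimpleBlocks(
    const std::vector<uint8_t>& in) {
  std::vector<uint8_t> out;
  std::size_t pos = 0;
  while (true) {
    if (in.size() - pos < 3) return std::nullopt;
    BlockHeader h = ParseBlockHeader(&in[pos]);
    pos += 3;
    if (h.type == BlockType::kRaw) {
      // RAW: Block_Size bytes are copied verbatim.
      if (in.size() - pos < h.size) return std::nullopt;
      out.insert(out.end(), in.begin() + pos, in.begin() + pos + h.size);
      pos += h.size;
    } else if (h.type == BlockType::kRle) {
      // RLE: a single byte is repeated Block_Size times.
      if (in.size() - pos < 1) return std::nullopt;
      out.insert(out.end(), h.size, in[pos]);
      pos += 1;
    } else {
      return std::nullopt;
    }
    if (h.last_block) return out;
  }
}
```

For example, the bytes `18 00 00 'a' 'b' 'c'` encode a non-final RAW block of size 3, and `23 00 00 'z'` encodes a final RLE block of size 4, so together they decode to `abczzzz`.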
We added a
Yes, the conflict was caused by changes introduced with regards to #1308, #1202 and #1204. Fixed with rebase EDIT: |
xls/modules/zstd/BUILD
Outdated
data = [
":zstd_dec_verilog.ir",
],
#shard_count = 50,
is that intentionally commented?
Yes, this is useful for running `ZstdDecoderSeededTest`, which is now commented out while we work on fixing all the cases tested there.
)

cc_test(
name = "zstd_dec_cc_test",
this currently passes 👍 but with the following warning:
WARNING: //xls/modules/zstd:zstd_dec_cc_test: Test execution time (19.1s excluding execution overhead) outside of range for MODERATE tests. Consider setting timeout="short" or size="small".
Test execution time will surely change after re-enabling `ZstdDecoderSeededTest`. Thanks for pointing that out. We will make sure to set those attributes correctly.
@@ -0,0 +1,241 @@
// Copyright 2020 The XLS Authors
this seems to fail w/ a timeout when introducing a deliberate failure (i.e. changing `MAGIC_NUMBER` from `u32:0xFD2FB528;` to `u32:0xFD2FB527;`):
Target //xls/modules/zstd:zstd_dec_cc_test up-to-date:
bazel-bin/xls/modules/zstd/zstd_dec_cc_test
INFO: Elapsed time: 300.804s, Critical Path: 300.53s
INFO: 3 processes: 3 linux-sandbox.
INFO: Build completed, 1 test FAILED, 3 total actions
//xls/modules/zstd:zstd_dec_cc_test TIMEOUT in 300.0s
/usr/local/google/home/proppy/.cache/bazel/_bazel_proppy/fb962cb496438c85ace9ac1a1a0be573/execroot/com_google_xls/bazel-out/k8-fastbuild/testlogs/xls/modules/zstd/zstd_dec_cc_test/test.log
Executed 1 out of 1 test: 1 fails locally.
is that expected? or is there a way to make it fail earlier?
(attached the test.log)
We would expect that when an invalid ZSTD frame is passed (e.g. one with a wrong `MAGIC_NUMBER`), the decoder should spin in the `DECODE_MAGIC_NUMBER` state, trying to detect the magic number in the incoming data packets and discarding the oldest byte in the buffer each time detection fails. In your case the decoder got stuck in the `ERROR` state, which requires restarting the decoder, and we will be fixing that.
As for failing early, we'll have to look closely at whether this is possible.
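The resynchronization behavior described above (keep looking for the magic number, dropping the oldest buffered byte after each failed match) can be sketched roughly as follows. This models the intended behavior only, not the DSLX proc in this PR; `ResyncToMagicNumber` is a hypothetical name, and the sketch assumes the magic number arrives in little-endian byte order as RFC 8878 specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>
#include <vector>

// Magic number value quoted in this thread.
constexpr uint32_t kMagicNumber = 0xFD2FB528;

// Scans the stream with a 4-byte window. Returns the number of bytes that
// were discarded before the magic number was found, or nullopt if the
// stream ended without a match.
std::optional<std::size_t> ResyncToMagicNumber(
    const std::vector<uint8_t>& stream) {
  std::deque<uint8_t> window;
  std::size_t discarded = 0;
  for (uint8_t byte : stream) {
    window.push_back(byte);
    if (window.size() < 4) continue;  // not enough data buffered yet
    assert(window.size() == 4);
    // Assemble the candidate in little-endian order.
    uint32_t candidate = 0;
    for (std::size_t i = 0; i < 4; ++i) {
      candidate |= static_cast<uint32_t>(window[i]) << (8 * i);
    }
    if (candidate == kMagicNumber) return discarded;
    // Detection failed: drop the oldest byte and keep scanning.
    window.pop_front();
    ++discarded;
  }
  return std::nullopt;
}
```

With this model, a stream whose magic number was altered to `0xFD2FB527` never matches, so the function runs out of input instead of reporting an error. That is consistent with a test that only ends when it hits its timeout.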
# ZSTD decoder

The ZSTD decoder decompresses the correctly formed ZSTD frames and blocks.
It implements the [RFC 8878](https://www.rfc-editor.org/rfc/rfc8878.html) decompression algorithm.
maybe add paragraph breaks (extra newline) here and in other paragraphs below?
### Top level Proc
This state machine is responsible for receiving encoded ZSTD frames, buffering the input and passing it to decoder's internal components based on the state of the proc.
The states defined for the processing of ZSTD frame are as follows:
state diagram with https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-diagrams#creating-mermaid-diagrams (instead of image files)?
* FEED_BLOCK_DECODER
* DECODE_CHECKSUM

After going through initial stages of decoding magic number and frame header, decoder starts the block division process.
what about inlining the description under each state bullet point?
After transmitting all data required for current block, it loops around to the block header decoding state and when next block header is not found it decodes checksum when it was requested in frame header or finishes ZSTD frame decoding and loops around to magic number decoding.

### ZSTD frame header decoder
This part of the design starts with detecting the ZSTD magic number.
s/This part of the design/This module/ ? to match with other description?
Frame header decoding is actually implemented as a series of functions that are called in the top level proc. I'm not sure if `module` would be a good word here.
* FEED_BLOCK_DECODER
* DECODE_CHECKSUM

After going through initial stages of decoding magic number and frame header, decoder starts the block division process.
it feels like we're also somehow repeating what we say just below?
In the `Top level Proc` paragraph we provide only a short description of each stage so that it is easier to get the gist of the flow. After that, each stage is described in detail.
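As a rough illustration of the flow summarized in that paragraph, the transitions could be modeled as below. This is a sketch under assumptions, not the DSLX implementation: only `DECODE_MAGIC_NUMBER`, `FEED_BLOCK_DECODER` and `DECODE_CHECKSUM` are state names that appear in this thread, while `kDecodeFrameHeader`, `kDecodeBlockHeader`, the `Context` struct and `NextState` are hypothetical. "Next block header is not found" is modeled with a `last_block` flag, and the `ERROR` state and waiting for input data are omitted.

```cpp
#include <cassert>

// kDecodeFrameHeader and kDecodeBlockHeader are assumed names.
enum class State {
  kDecodeMagicNumber,
  kDecodeFrameHeader,
  kDecodeBlockHeader,
  kFeedBlockDecoder,
  kDecodeChecksum,
};

// Hypothetical per-frame information needed to pick the next state.
struct Context {
  bool last_block;          // the block just fed was the last one in the frame
  bool checksum_requested;  // the frame header asked for a content checksum
};

// Returns the state that follows `s`, assuming `s` completed successfully.
State NextState(State s, const Context& ctx) {
  switch (s) {
    case State::kDecodeMagicNumber:
      return State::kDecodeFrameHeader;
    case State::kDecodeFrameHeader:
      return State::kDecodeBlockHeader;
    case State::kDecodeBlockHeader:
      return State::kFeedBlockDecoder;
    case State::kFeedBlockDecoder:
      // More blocks to come: loop back to block header decoding.
      if (!ctx.last_block) return State::kDecodeBlockHeader;
      // Frame finished: decode the checksum if requested, else start over.
      return ctx.checksum_requested ? State::kDecodeChecksum
                                    : State::kDecodeMagicNumber;
    case State::kDecodeChecksum:
      return State::kDecodeMagicNumber;
  }
  assert(false && "unreachable");
  return State::kDecodeMagicNumber;
}
```

A Mermaid state diagram in the README, as suggested earlier in the review, would convey the same transitions without any code.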
xls/modules/zstd/README.md
Outdated
## Known Limitations

* **[WIP]** Uses old version of `SequenceExecutor` (up-to-date version is available in [google/xls/pull/1295](https://github.com/google/xls/pull/1295)) due to [reported issues](https://github.com/google/xls/pull/1295#issuecomment-1943857515) with verilog generation
![](img/ZSTD_decoder.png)

## ZSTD decoder architecture
can we add links/pointers to the corresponding source(s) for each section?
This commit marks SimultaneousReadWriteBehavior enum and num_partitions function as public to allow for creating simpler tests that interact with RAM models. Internal-tag: [#53241] Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Internal-tag: [#54705] Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
This commit adds RAM printer block usefull for debugging HistoryBuffer inside SequenceExecutor. Internal-tag: [#54705] Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Add Proc responsible for handling ZSTD Sequence Execution step, which is described in: https://datatracker.ietf.org/doc/html/rfc8878#name-sequence-execution Internal-tag: [#54705] Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Internal-tag: [#52954] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
This commit adds a ZSTD decoder module that parses ZSTD frames. The provided tests examine the model using C++ API, which is a prerequisite for detailed tests using zstd library. Internal-tag: [#50221] Co-authored-by: Maciej Dudek <mdudek@antmicro.com> Co-authored-by: Pawel Czarnecki <pczarnecki@antmicro.com> Signed-off-by: Maciej Dudek <mdudek@antmicro.com> Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com> Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Required for spawning SequenceExecutor in ZSTD top level proc. Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
…ons public Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
…e==0 Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Internal-tag: [#52186] Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
@hongted and I were discussing if we should split the zstd CI from the main workflow w/ additional pre-conditions:
That way we can check whether a given toolchain CL breaks the zstd modules (i.e. a syntax change), but we can still check in the zstd module w/ failing tests (so that we can more easily evaluate the PR w/ missing features). What do you think?
We use many |
I think those should be part of the zstd specific CI. |
This PR supersedes #1169
It is a part of #1211.
NOTE: this is based on #1314.
Please ignore all commits with `[TEMP]` in the commit message when reviewing. Those are squashed commits of all previous PRs (links are available in the second line of each commit message).
This PR adds a top level proc for the ZSTD Decoder which integrates all its components in order to allow decoding of ZSTD frames. It includes C++ tests which:
* generate valid ZSTD frames with the `decodecorpus` tool
* decode them with both `libzstd` and `ZstdDecoder` and compare the results
NOTE: currently it is possible to decode frames with simple block types such as RAW and RLE. However, there are still some issues with the decoder implementation, as not all tests are passing; hence the commented-out tests that generate random frames.