Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different behavior on amd64 and arm #38

Closed
actcwlf opened this issue Nov 14, 2023 · 8 comments
Closed

Different behavior on amd64 and arm #38

actcwlf opened this issue Nov 14, 2023 · 8 comments

Comments

@actcwlf
Copy link

actcwlf commented Nov 14, 2023

Hi, I'm trying to use this package on arm devices, but I found inconsistent behavior on arm and amd64 platforms, what's the reason for this?

This is the test code

import constriction
import numpy as np

message = np.array([10, 10], dtype=np.int32)

entropy_model = constriction.stream.model.QuantizedLaplace(
            -20, 20 + 1
        )

encoder = constriction.stream.queue.RangeEncoder()
encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0.]))

compressed = encoder.get_compressed()
print(f"compressed representation: {compressed}")
print(f"(in binary: {[bin(word) for word in compressed]})")

decoder = constriction.stream.queue.RangeDecoder(compressed)
decoded = decoder.decode(entropy_model, np.array([10., 10.]), np.array([10.,  0.])) 
assert np.all(decoded == message) # (verifies correctness)

On amd64

compressed representation: [2042752375]
(in binary: ['0b1111001110000011110110101110111'])

On arm

thread '<unnamed>' panicked at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/probability-0.20.3/src/distribution/laplace.rs:22:9:
assertion failed: b > 0.0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
    import apport.fileutils
  File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
    from apport.packaging_impl import impl as packaging
  File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 24, in <module>
    import apt
  File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'

Original exception was:
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0

The full traceback is

thread '<unnamed>' panicked at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/probability-0.20.3/src/distribution/laplace.rs:22:9:
assertion failed: b > 0.0
stack backtrace:
   0:       0x7f9d20a7ac - std::backtrace_rs::backtrace::libunwind::trace::heaab0e590535aeb3
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:       0x7f9d20a7ac - std::backtrace_rs::backtrace::trace_unsynchronized::h89cc7ae9ebb707d7
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:       0x7f9d20a7ac - std::sys_common::backtrace::_print_fmt::h08c31be18fedf422
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:67:5
   3:       0x7f9d20a7ac - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hc38bcf44d9e857e3
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:44:22
   4:       0x7f9d226b04 - core::fmt::rt::Argument::fmt::ha5b752f9cd7ef4a3
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/fmt/rt.rs:138:9
   5:       0x7f9d226b04 - core::fmt::write::h9fac187ae7486f3c
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/fmt/mod.rs:1094:21
   6:       0x7f9d208358 - std::io::Write::write_fmt::h239e9fb6296b3a7f
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/io/mod.rs:1714:15
   7:       0x7f9d20a5e0 - std::sys_common::backtrace::_print::h52f67cfa8753b0ab
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:47:5
   8:       0x7f9d20a5e0 - std::sys_common::backtrace::print::hdea7481e2c957a93
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:34:9
   9:       0x7f9d20b958 - std::panicking::default_hook::{{closure}}::h7c36fa733369c49e
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:270:22
  10:       0x7f9d20b680 - std::panicking::default_hook::h303eee75f9a8f6a8
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:290:9
  11:       0x7f9d20bf1c - std::panicking::rust_panic_with_hook::h270c94381ec34744
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:707:13
  12:       0x7f9d20bda4 - std::panicking::begin_panic_handler::{{closure}}::h3653e3502bcc1625
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:597:13
  13:       0x7f9d20ac90 - std::sys_common::backtrace::__rust_end_short_backtrace::h6b8510f2f024eeeb
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:170:18
  14:       0x7f9d20bb34 - rust_begin_unwind
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
  15:       0x7f9d0fe4a4 - core::panicking::panic_fmt::ha96945d7a1b20293
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
  16:       0x7f9d0fe514 - core::panicking::panic::h8f06a2df29fa4962
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:117:5
  17:       0x7f9d11f308 - probability::distribution::laplace::Laplace::new::haefeb509845cc871
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/probability-0.20.3/src/distribution/laplace.rs:22:9
  18:       0x7f9d119a44 - constriction::pybindings::stream::model::QuantizedLaplace::new::{{closure}}::h5e7236efc8092f4a
                               at /home/user/repos/constriction/src/pybindings/stream/model.rs:544:44
  19:       0x7f9d161b7c - <constriction::pybindings::stream::model::internals::ParameterizableModel<(P0,P1),M,F> as constriction::pybindings::stream::model::internals::Model>::parameterize::h135fd1ad8427458f
                               at /home/user/repos/constriction/src/pybindings/stream/model/internals.rs:249:35
  20:       0x7f9d17476c - constriction::pybindings::stream::queue::RangeEncoder::encode::h7a9429b10f1006e7
                               at /home/user/repos/constriction/src/pybindings/stream/queue.rs:314:13
  21:       0x7f9d175934 - constriction::pybindings::stream::queue::_::<impl constriction::pybindings::stream::queue::RangeEncoder>::__pymethod_encode__::hb8e05b89a123d73c
                               at /home/user/repos/constriction/src/pybindings/stream/queue.rs:43:1
  22:       0x7f9d0ff204 - pyo3::impl_::trampoline::fastcall_with_keywords::{{closure}}::he0112aa3b1e49083
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:41:29
  23:       0x7f9d0ff10c - pyo3::impl_::trampoline::trampoline::{{closure}}::h9f46c71dbb4a44d4
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:181:54
  24:       0x7f9d15b494 - std::panicking::try::do_call::hd5234df960a8f95d
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:502:40
  25:       0x7f9d15b720 - __rust_try
  26:       0x7f9d15b268 - std::panicking::try::haeac5d692bf46066
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:466:19
  27:       0x7f9d115570 - std::panic::catch_unwind::h2b95655acc141454
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panic.rs:142:14
  28:       0x7f9d0fed68 - pyo3::impl_::trampoline::trampoline::h82d7e8a0b17fbbdf
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:181:9
  29:       0x7f9d140a94 - pyo3::impl_::trampoline::fastcall_with_keywords::h45c8d979f3ca1f26
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:41:13
  30:       0x7f9d174a48 - constriction::pybindings::stream::queue::_::<impl pyo3::impl_::pyclass::PyMethods<constriction::pybindings::stream::queue::RangeEncoder> for pyo3::impl_::pyclass::PyClassImplCollector<constriction::pybindings::stream::queue::RangeEncoder>>::py_methods::ITEMS::trampoline::h905700428ec8b158
                               at /home/user/repos/constriction/src/pybindings/stream/queue.rs:43:1
  31:           0x488db8 - _PyMethodDescr_FastCallKeywords
  32:           0x512714 - _PyEval_EvalFrameDefault
  33:           0x50b47c - _PyEval_EvalCodeWithName
  34:           0x50dd70 - PyEval_EvalCode
  35:           0x616f84 - <unknown>
  36:           0x617070 - PyRun_FileExFlags
  37:           0x61844c - PyRun_SimpleFileExFlags
  38:           0x6494c8 - <unknown>
  39:           0x649b00 - _Py_UnixMain
  40:       0x7f9db946e0 - __libc_start_main
                               at /build/glibc-4fr630/glibc-2.27/csu/../csu/libc-start.c:310
  41:           0x5aef1c - <unknown>
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
    import apport.fileutils
  File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
    from apport.packaging_impl import impl as packaging
  File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 24, in <module>
    import apt
  File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'

Original exception was:
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0

Thanks!

@actcwlf actcwlf changed the title Different behavior on x86 and arm Different behavior on amd64 and arm Nov 14, 2023
@robamler
Copy link
Collaborator

robamler commented Nov 18, 2023

Thank you for reporting! On arm, are you using the precompiled constriction package from pypi (for new Macs with arm64 chips) or did you compile constriction from source? If you compiled from source, did you compile in --release mode? The error message that you see should appear in a debug build but it should not appear in a release build (and the official packages on pypi should all be release builds unless I messed something up in the CI).

In case you indeed compiled from source: I'd be happy to provide precompiled python packages for your platform going forward, but I'd need someone to test them, at least initially. Would you mind telling me what platform you're developing for and would you consider testing a precompiled python package on it?

Background

The second symbol in your example is encoded with a Laplace distribution with scale parameter b = 0.0, which is not a well-defined distribution (the pdf of a Laplace distribution is $p(x\mid\mu,b)=\frac {1}{2b}e^{-|x-\mu |/b}$, so $b=0$ would lead to a division by zero). The Laplace distribution is implemented in the probability crate, which checks for b > 0.0 using a should! macro, which is defined to be equivalent to debug_assert!. Unless special compiler flags are used, debug_assert! only checks in debug builds and is ignored in release builds.

Notes to Self

  • We should at least document that the Laplace distribution requires b > 0.0, and maybe the python API should check for it even in release builds. In fact, the provided example works on amd64 only by coincidence. It would fail to reconstruct the original message if we changed the second symbol in message to anything other than 10 because probability—justifiably—does not provide a valid quantile function for Laplace distributions with b = 0.0. [update on 2023-11-25: I can't reproduce this anymore. Maybe I tried it out in debug mode myself when I tested this originally]
  • I guess the readme should document that, when compiling python packages for production use from source, one should compile in --release mode. This may be obvious to rust developers but these instructions are mainly for developers who are at home in python land and not necessarily very familiar with rust.

robamler added a commit that referenced this issue Nov 19, 2023
Assert that the scale (or equivalent) parameters of a Gaussian, Laplace,
or Cauchy distribution is positive, and state this requirement in the
python API reference. Previously, encoding and decoding with a Laplace
distribution with zero scale did not raise any error but failed to
reproduce the original message in general, see first comment on #38.

Implements the first to-do item in #38.
robamler added a commit that referenced this issue Nov 19, 2023
This is probably what most people who build the python module want (and
those who *don't* want to build in `--release` mode are probably rust
developers, who know what the `--release` flag does).

Implements the second to-do item in #38.
@actcwlf
Copy link
Author

actcwlf commented Nov 24, 2023

Thanks for your reply!

I compiled from source following

compile in release mode (i.e., with optimizations) and run the benchmarks: cargo bench

and

build the Python module: poetry run maturin develop --features pybindings

Because I am pretty new to rust, I am not very sure whether I compiled the package in release mode at that moment.

Unfortunately I dont have the original device to test the solution now. If the behavior is reproduciable on other platform in debug mode, I think this may be the main reason.

By the way, when $b=0$, can I suppose the decode result is $\mu$ ? Is this a correct workaround?

@actcwlf
Copy link
Author

actcwlf commented Nov 24, 2023

I make another test on amd64

import constriction
import numpy as np

message = np.array([10, 5, 6], dtype=np.int32)

entropy_model = constriction.stream.model.QuantizedLaplace(
            -20, 20 + 1
        )

encoder = constriction.stream.queue.RangeEncoder()
encoder.encode(message, entropy_model, np.array([10., 10., 10.]), np.array([10., 0., 0.]))

compressed = encoder.get_compressed()
print(f"compressed representation: {compressed}")
print(f"(in binary: {[bin(word) for word in compressed]})")

decoder = constriction.stream.queue.RangeDecoder(compressed)
decoded = decoder.decode(entropy_model, np.array([10., 10., 10.]), np.array([10.,  0., 0.]))

print(decoded)
assert np.all(decoded == message) # (verifies correctness)

the result is

compressed representation: [2042752312  556892819]
(in binary: ['0b1111001110000011110110100111000', '0b100001001100011000001010010011'])
[10  5  6]

May be there is some fallback when $b=0$?

@robamler
Copy link
Collaborator

robamler commented Nov 25, 2023

You are right, leakily quantized Laplace distributions with $b=0$ do actually work in release builds. I can't reproduce my tests where these failed, maybe I accidentally tested in debug mode myself.

I compiled from source following
[...]

build the Python module: poetry run maturin develop --features pybindings

Because I am pretty new to rust, I am not very sure whether I compiled the package in release mode at that moment.

This compiles in debug mode (i.e., it will result in a python package that contains more checks and fewer run-time optimizations). This explains the difference in behavior between your two builds.

(For completeness, if you want to compile in release mode, use: poetry run maturin develop --features pybindings --release; I updated the readme accordingly.)

I'm a bit unsure what the conclusion should be. On the one hand, one could interpret a Laplace distribution with $b=0$ as a delta distribution, and one could argue that a delta distribution should work with our leaky quantizer in most cases (see below) because the quantizer integrates the delta distribution over a finite interval, and the leakiness adds some small nonzero probability mass even if the integral does not contain the delta peak. And it indeed seems to work in experiments.

On the other hand, there are two reasons against allowing $b=0$ for Laplace distributions:

  • parity with the rust API: while all provided precompiled python modules are compiled in release mode, users of the rust API will probably develop in debug mode, and there the probability crate (which is not under my control) forbids $b=0$ (and I think that's the right thing to do for a general purpose probability crate).
  • There's an edge case where quantizing a delta distribution is not well-defined: when the delta peak is right at the boundary between two bins. In this case, when the leaky quantizer integrates over one of the two adjacent bins, it's not clear whether the peak should be counted towards the probability mass. As far as I can tell, the way the Laplace distribution is implemented in the probability crate, it should evaluate 0.0 / 0.0 for this case on this line, which should result in NaN for both debug and release builds. Weirdly, I can't manage to trigger this issue in practice, so it seems like it's not an issue, but I want to understand why it's not an issue before relying on it.
    [update (2023-11-26): in this edge case, the delta peak is counted entirely towards the right bin. The NaN result from Laplace::distribution is cast into an integer here and here, which results in 0 as per rust reference so that the cdf still evaluates to zero at the delta peak. It still feels dangerous to rely on this behavior, especially if the easier solution is to just add a tiny amount of regularization to $b$.]

Request for feedback

To help me decide, could you give me some more background on why you tested with a Laplace distribution with $b=0$? Did this come up in a real use case or were you deliberately coming up with edge cases for stress testing? In other words, would it suffice for your use case if constriction enforced $b&gt;0$ and if this requirement was documented in the python API reference, or would it make your life easier if $b=0$ was accepted and interpreted as a delta distribution?

@actcwlf
Copy link
Author

actcwlf commented Nov 26, 2023

Thanks for your explanations! I am doing some research on Cool-Chic and encountered the case $b=0$. For my use case, interpreting a Laplace distribution as a Delta distribution is useful when $b=0$. But I think these two reasons against allowing $b=0$ are reasonable. Perhaps adding some options is a good alternative? For example, add a strict=true option in constriction.stream.model.QuantizedLaplace to enforce the check that $b&gt;0$ and strict=false for more flexible interpretation.

robamler added a commit that referenced this issue Nov 26, 2023
This came up in the context of #38.
robamler added a commit that referenced this issue Nov 26, 2023
@robamler
Copy link
Collaborator

Thanks for the background information. I like the idea about a strict=true argument to QuantizedLaplace, but I decided against it for the following reasons (this list is not meant as criticism, it's just to document and communicate my reasoning):

  • I think it would add unnecessary complexity to the python API where a simpler solution exists (see below);
  • it would still break the promise that the python API is a subset of the rust API, and would thus prevent users from porting python prototypes to exactly equivalent rust, at least for rust debug builds; and
  • it relies on somewhat brittle behavior that might break, e.g., with future versions of the probability crate (which are out of my control).

Proposed Solution

The problem of $b=0$ can be easily avoided by adding some small regularization (e.g., 1e-16) to the scale parameter. I added this recommendation to the python API documentation (e.g., here), and I added unit tests to assert that quantizing continuous distributions with even much smaller scales behaves as expected without any numerical issues.

Regarding Precompiled Packages for arm

Unrelated to the specific issue reported here, please let me know if you'd be willing to test a precompiled python package for your computer architecture so I can upload official pip packages to pypi going forward. This would make things easier also for other people with your compute architecture. I'd need to know your host triple to do this (i.e., what rustc --version --verbose says after "host").

@actcwlf
Copy link
Author

actcwlf commented Nov 27, 2023

Thanks for your reply! Adding a small regularization is a good idea. As in previous comment, I don't have access to that device for now. If I get more detailed information, I will update the information here.

@actcwlf
Copy link
Author

actcwlf commented Apr 9, 2024

Hi, I have got the original device, and I would like to know if you can provide a precompiled version now. I would be happy to do some testing. The system info is

rustc 1.77.1 (7cf61ebde 2024-03-27)
binary: rustc
commit-hash: 7cf61ebde7b22796c69757901dd346d0fe70bd97
commit-date: 2024-03-27
host: aarch64-unknown-linux-gnu
release: 1.77.1
LLVM version: 17.0.6

The device is a Jetson nano and the recommanded version of python is 3.6. I found release v0.2.4 support 3.6 but failed to install it. Is it feasible to compile a newer version to the platform?
Thanks in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants