Different behavior on amd64 and arm #38

actcwlf · 2023-11-14T02:14:00Z

Hi, I'm trying to use this package on arm devices, but I found inconsistent behavior on arm and amd64 platforms, what's the reason for this?

This is the test code

import constriction
import numpy as np

message = np.array([10, 10], dtype=np.int32)

entropy_model = constriction.stream.model.QuantizedLaplace(
            -20, 20 + 1
        )

encoder = constriction.stream.queue.RangeEncoder()
encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0.]))

compressed = encoder.get_compressed()
print(f"compressed representation: {compressed}")
print(f"(in binary: {[bin(word) for word in compressed]})")

decoder = constriction.stream.queue.RangeDecoder(compressed)
decoded = decoder.decode(entropy_model, np.array([10., 10.]), np.array([10.,  0.])) 
assert np.all(decoded == message) # (verifies correctness)

On amd64

compressed representation: [2042752375]
(in binary: ['0b1111001110000011110110101110111'])

On arm

thread '<unnamed>' panicked at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/probability-0.20.3/src/distribution/laplace.rs:22:9:
assertion failed: b > 0.0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
    import apport.fileutils
  File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
    from apport.packaging_impl import impl as packaging
  File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 24, in <module>
    import apt
  File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'

Original exception was:
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0

The full traceback is

thread '<unnamed>' panicked at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/probability-0.20.3/src/distribution/laplace.rs:22:9:
assertion failed: b > 0.0
stack backtrace:
   0:       0x7f9d20a7ac - std::backtrace_rs::backtrace::libunwind::trace::heaab0e590535aeb3
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:       0x7f9d20a7ac - std::backtrace_rs::backtrace::trace_unsynchronized::h89cc7ae9ebb707d7
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:       0x7f9d20a7ac - std::sys_common::backtrace::_print_fmt::h08c31be18fedf422
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:67:5
   3:       0x7f9d20a7ac - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hc38bcf44d9e857e3
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:44:22
   4:       0x7f9d226b04 - core::fmt::rt::Argument::fmt::ha5b752f9cd7ef4a3
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/fmt/rt.rs:138:9
   5:       0x7f9d226b04 - core::fmt::write::h9fac187ae7486f3c
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/fmt/mod.rs:1094:21
   6:       0x7f9d208358 - std::io::Write::write_fmt::h239e9fb6296b3a7f
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/io/mod.rs:1714:15
   7:       0x7f9d20a5e0 - std::sys_common::backtrace::_print::h52f67cfa8753b0ab
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:47:5
   8:       0x7f9d20a5e0 - std::sys_common::backtrace::print::hdea7481e2c957a93
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:34:9
   9:       0x7f9d20b958 - std::panicking::default_hook::{{closure}}::h7c36fa733369c49e
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:270:22
  10:       0x7f9d20b680 - std::panicking::default_hook::h303eee75f9a8f6a8
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:290:9
  11:       0x7f9d20bf1c - std::panicking::rust_panic_with_hook::h270c94381ec34744
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:707:13
  12:       0x7f9d20bda4 - std::panicking::begin_panic_handler::{{closure}}::h3653e3502bcc1625
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:597:13
  13:       0x7f9d20ac90 - std::sys_common::backtrace::__rust_end_short_backtrace::h6b8510f2f024eeeb
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/sys_common/backtrace.rs:170:18
  14:       0x7f9d20bb34 - rust_begin_unwind
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
  15:       0x7f9d0fe4a4 - core::panicking::panic_fmt::ha96945d7a1b20293
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
  16:       0x7f9d0fe514 - core::panicking::panic::h8f06a2df29fa4962
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:117:5
  17:       0x7f9d11f308 - probability::distribution::laplace::Laplace::new::haefeb509845cc871
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/probability-0.20.3/src/distribution/laplace.rs:22:9
  18:       0x7f9d119a44 - constriction::pybindings::stream::model::QuantizedLaplace::new::{{closure}}::h5e7236efc8092f4a
                               at /home/user/repos/constriction/src/pybindings/stream/model.rs:544:44
  19:       0x7f9d161b7c - <constriction::pybindings::stream::model::internals::ParameterizableModel<(P0,P1),M,F> as constriction::pybindings::stream::model::internals::Model>::parameterize::h135fd1ad8427458f
                               at /home/user/repos/constriction/src/pybindings/stream/model/internals.rs:249:35
  20:       0x7f9d17476c - constriction::pybindings::stream::queue::RangeEncoder::encode::h7a9429b10f1006e7
                               at /home/user/repos/constriction/src/pybindings/stream/queue.rs:314:13
  21:       0x7f9d175934 - constriction::pybindings::stream::queue::_::<impl constriction::pybindings::stream::queue::RangeEncoder>::__pymethod_encode__::hb8e05b89a123d73c
                               at /home/user/repos/constriction/src/pybindings/stream/queue.rs:43:1
  22:       0x7f9d0ff204 - pyo3::impl_::trampoline::fastcall_with_keywords::{{closure}}::he0112aa3b1e49083
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:41:29
  23:       0x7f9d0ff10c - pyo3::impl_::trampoline::trampoline::{{closure}}::h9f46c71dbb4a44d4
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:181:54
  24:       0x7f9d15b494 - std::panicking::try::do_call::hd5234df960a8f95d
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:502:40
  25:       0x7f9d15b720 - __rust_try
  26:       0x7f9d15b268 - std::panicking::try::haeac5d692bf46066
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:466:19
  27:       0x7f9d115570 - std::panic::catch_unwind::h2b95655acc141454
                               at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panic.rs:142:14
  28:       0x7f9d0fed68 - pyo3::impl_::trampoline::trampoline::h82d7e8a0b17fbbdf
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:181:9
  29:       0x7f9d140a94 - pyo3::impl_::trampoline::fastcall_with_keywords::h45c8d979f3ca1f26
                               at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-0.19.2/src/impl_/trampoline.rs:41:13
  30:       0x7f9d174a48 - constriction::pybindings::stream::queue::_::<impl pyo3::impl_::pyclass::PyMethods<constriction::pybindings::stream::queue::RangeEncoder> for pyo3::impl_::pyclass::PyClassImplCollector<constriction::pybindings::stream::queue::RangeEncoder>>::py_methods::ITEMS::trampoline::h905700428ec8b158
                               at /home/user/repos/constriction/src/pybindings/stream/queue.rs:43:1
  31:           0x488db8 - _PyMethodDescr_FastCallKeywords
  32:           0x512714 - _PyEval_EvalFrameDefault
  33:           0x50b47c - _PyEval_EvalCodeWithName
  34:           0x50dd70 - PyEval_EvalCode
  35:           0x616f84 - <unknown>
  36:           0x617070 - PyRun_FileExFlags
  37:           0x61844c - PyRun_SimpleFileExFlags
  38:           0x6494c8 - <unknown>
  39:           0x649b00 - _Py_UnixMain
  40:       0x7f9db946e0 - __libc_start_main
                               at /build/glibc-4fr630/glibc-2.27/csu/../csu/libc-start.c:310
  41:           0x5aef1c - <unknown>
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
    import apport.fileutils
  File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
    from apport.packaging_impl import impl as packaging
  File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 24, in <module>
    import apt
  File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'

Original exception was:
Traceback (most recent call last):
  File "test_script.py", line 13, in <module>
    encoder.encode(message, entropy_model, np.array([10., 10.]), np.array([10., 0]))
pyo3_runtime.PanicException: assertion failed: b > 0.0

Thanks!

The text was updated successfully, but these errors were encountered:

robamler · 2023-11-18T22:07:42Z

Thank you for reporting! On arm, are you using the precompiled constriction package from pypi (for new Macs with arm64 chips) or did you compile constriction from source? If you compiled from source, did you compile in --release mode? The error message that you see should appear in a debug build but it should not appear in a release build (and the official packages on pypi should all be release builds unless I messed something up in the CI).

In case you indeed compiled from source: I'd be happy to provide precompiled python packages for your platform going forward, but I'd need someone to test them, at least initially. Would you mind telling me what platform you're developing for and would you consider testing a precompiled python package on it?

Background

The second symbol in your example is encoded with a Laplace distribution with scale parameter b = 0.0, which is not a well-defined distribution (the pdf of a Laplace distribution is $p(x\mid\mu,b)=\frac {1}{2b}e^{-|x-\mu |/b}$, so $b=0$ would lead to a division by zero). The Laplace distribution is implemented in the probability crate, which checks for b > 0.0 using a should! macro, which is defined to be equivalent to debug_assert!. Unless special compiler flags are used, debug_assert! only checks in debug builds and is ignored in release builds.

Notes to Self

We should at least document that the Laplace distribution requires b > 0.0, and maybe the python API should check for it even in release builds. In fact, the provided example works on amd64 only by coincidence. It would fail to reconstruct the original message if we changed the second symbol in message to anything other than 10 because probability—justifiably—does not provide a valid quantile function for Laplace distributions with b = 0.0. [update on 2023-11-25: I can't reproduce this anymore. Maybe I tried it out in debug mode myself when I tested this originally]
I guess the readme should document that, when compiling python packages for production use from source, one should compile in --release mode. This may be obvious to rust developers but these instructions are mainly for developers who are at home in python land and not necessarily very familiar with rust.

Assert that the scale (or equivalent) parameters of a Gaussian, Laplace, or Cauchy distribution is positive, and state this requirement in the python API reference. Previously, encoding and decoding with a Laplace distribution with zero scale did not raise any error but failed to reproduce the original message in general, see first comment on #38. Implements the first to-do item in #38.

This is probably what most people who build the python module want (and those who *don't* want to build in `--release` mode are probably rust developers, who know what the `--release` flag does). Implements the second to-do item in #38.

actcwlf · 2023-11-24T07:49:05Z

Thanks for your reply!

I compiled from source following

compile in release mode (i.e., with optimizations) and run the benchmarks: cargo bench

and

build the Python module: poetry run maturin develop --features pybindings

Because I am pretty new to rust, I am not very sure whether I compiled the package in release mode at that moment.

Unfortunately I dont have the original device to test the solution now. If the behavior is reproduciable on other platform in debug mode, I think this may be the main reason.

By the way, when $b=0$, can I suppose the decode result is $\mu$ ? Is this a correct workaround?

actcwlf · 2023-11-24T07:59:07Z

I make another test on amd64

import constriction
import numpy as np

message = np.array([10, 5, 6], dtype=np.int32)

entropy_model = constriction.stream.model.QuantizedLaplace(
            -20, 20 + 1
        )

encoder = constriction.stream.queue.RangeEncoder()
encoder.encode(message, entropy_model, np.array([10., 10., 10.]), np.array([10., 0., 0.]))

compressed = encoder.get_compressed()
print(f"compressed representation: {compressed}")
print(f"(in binary: {[bin(word) for word in compressed]})")

decoder = constriction.stream.queue.RangeDecoder(compressed)
decoded = decoder.decode(entropy_model, np.array([10., 10., 10.]), np.array([10.,  0., 0.]))

print(decoded)
assert np.all(decoded == message) # (verifies correctness)

the result is

compressed representation: [2042752312  556892819]
(in binary: ['0b1111001110000011110110100111000', '0b100001001100011000001010010011'])
[10  5  6]

May be there is some fallback when $b=0$?

robamler · 2023-11-25T19:47:18Z

You are right, leakily quantized Laplace distributions with $b=0$ do actually work in release builds. I can't reproduce my tests where these failed, maybe I accidentally tested in debug mode myself.

I compiled from source following
[...]

build the Python module: poetry run maturin develop --features pybindings

Because I am pretty new to rust, I am not very sure whether I compiled the package in release mode at that moment.

This compiles in debug mode (i.e., it will result in a python package that contains more checks and fewer run-time optimizations). This explains the difference in behavior between your two builds.

(For completeness, if you want to compile in release mode, use: poetry run maturin develop --features pybindings --release; I updated the readme accordingly.)

I'm a bit unsure what the conclusion should be. On the one hand, one could interpret a Laplace distribution with $b=0$ as a delta distribution, and one could argue that a delta distribution should work with our leaky quantizer in most cases (see below) because the quantizer integrates the delta distribution over a finite interval, and the leakiness adds some small nonzero probability mass even if the integral does not contain the delta peak. And it indeed seems to work in experiments.

On the other hand, there are two reasons against allowing $b=0$ for Laplace distributions:

parity with the rust API: while all provided precompiled python modules are compiled in release mode, users of the rust API will probably develop in debug mode, and there the probability crate (which is not under my control) forbids $b=0$ (and I think that's the right thing to do for a general purpose probability crate).
There's an edge case where quantizing a delta distribution is not well-defined: when the delta peak is right at the boundary between two bins. In this case, when the leaky quantizer integrates over one of the two adjacent bins, it's not clear whether the peak should be counted towards the probability mass. As far as I can tell, the way the Laplace distribution is implemented in the probability crate, it should evaluate 0.0 / 0.0 for this case on this line, which should result in NaN for both debug and release builds. Weirdly, I can't manage to trigger this issue in practice, so it seems like it's not an issue, but I want to understand why it's not an issue before relying on it.
[update (2023-11-26): in this edge case, the delta peak is counted entirely towards the right bin. The NaN result from Laplace::distribution is cast into an integer here and here, which results in 0 as per rust reference so that the cdf still evaluates to zero at the delta peak. It still feels dangerous to rely on this behavior, especially if the easier solution is to just add a tiny amount of regularization to $b$.]

Request for feedback

To help me decide, could you give me some more background on why you tested with a Laplace distribution with $b=0$? Did this come up in a real use case or were you deliberately coming up with edge cases for stress testing? In other words, would it suffice for your use case if constriction enforced $b>0$ and if this requirement was documented in the python API reference, or would it make your life easier if $b=0$ was accepted and interpreted as a delta distribution?

actcwlf · 2023-11-26T03:39:34Z

Thanks for your explanations! I am doing some research on Cool-Chic and encountered the case $b=0$. For my use case, interpreting a Laplace distribution as a Delta distribution is useful when $b=0$. But I think these two reasons against allowing $b=0$ are reasonable. Perhaps adding some options is a good alternative? For example, add a strict=true option in constriction.stream.model.QuantizedLaplace to enforce the check that $b>0$ and strict=false for more flexible interpretation.

This came up in the context of #38.

See discussion in #38.

robamler · 2023-11-26T16:48:00Z

Thanks for the background information. I like the idea about a strict=true argument to QuantizedLaplace, but I decided against it for the following reasons (this list is not meant as criticism, it's just to document and communicate my reasoning):

I think it would add unnecessary complexity to the python API where a simpler solution exists (see below);
it would still break the promise that the python API is a subset of the rust API, and would thus prevent users from porting python prototypes to exactly equivalent rust, at least for rust debug builds; and
it relies on somewhat brittle behavior that might break, e.g., with future versions of the probability crate (which are out of my control).

Proposed Solution

The problem of $b=0$ can be easily avoided by adding some small regularization (e.g., 1e-16) to the scale parameter. I added this recommendation to the python API documentation (e.g., here), and I added unit tests to assert that quantizing continuous distributions with even much smaller scales behaves as expected without any numerical issues.

Regarding Precompiled Packages for arm

Unrelated to the specific issue reported here, please let me know if you'd be willing to test a precompiled python package for your computer architecture so I can upload official pip packages to pypi going forward. This would make things easier also for other people with your compute architecture. I'd need to know your host triple to do this (i.e., what rustc --version --verbose says after "host").

actcwlf · 2023-11-27T03:07:41Z

Thanks for your reply! Adding a small regularization is a good idea. As in previous comment, I don't have access to that device for now. If I get more detailed information, I will update the information here.

actcwlf · 2024-04-09T14:11:53Z

Hi, I have got the original device, and I would like to know if you can provide a precompiled version now. I would be happy to do some testing. The system info is

rustc 1.77.1 (7cf61ebde 2024-03-27)
binary: rustc
commit-hash: 7cf61ebde7b22796c69757901dd346d0fe70bd97
commit-date: 2024-03-27
host: aarch64-unknown-linux-gnu
release: 1.77.1
LLVM version: 17.0.6

The device is a Jetson nano and the recommanded version of python is 3.6. I found release v0.2.4 support 3.6 but failed to install it. Is it feasible to compile a newer version to the platform?
Thanks in advance!

actcwlf changed the title ~~Different behavior on x86 and arm~~ Different behavior on amd64 and arm Nov 14, 2023

robamler added a commit that referenced this issue Nov 26, 2023

Add unit tests for almost delta distributions

4568009

This came up in the context of #38.

robamler added a commit that referenced this issue Nov 26, 2023

Recommend regularization in python API doc

bc54f1c

See discussion in #38.

robamler closed this as completed Nov 26, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Different behavior on amd64 and arm #38

Different behavior on amd64 and arm #38

actcwlf commented Nov 14, 2023 •

edited

Loading

robamler commented Nov 18, 2023 •

edited

Loading

actcwlf commented Nov 24, 2023 •

edited

Loading

actcwlf commented Nov 24, 2023

robamler commented Nov 25, 2023 •

edited

Loading

actcwlf commented Nov 26, 2023

robamler commented Nov 26, 2023

actcwlf commented Nov 27, 2023

actcwlf commented Apr 9, 2024

Different behavior on amd64 and arm #38

Different behavior on amd64 and arm #38

Comments

actcwlf commented Nov 14, 2023 • edited Loading

robamler commented Nov 18, 2023 • edited Loading

Background

Notes to Self

actcwlf commented Nov 24, 2023 • edited Loading

actcwlf commented Nov 24, 2023

robamler commented Nov 25, 2023 • edited Loading

Request for feedback

actcwlf commented Nov 26, 2023

robamler commented Nov 26, 2023

Proposed Solution

Regarding Precompiled Packages for arm

actcwlf commented Nov 27, 2023

actcwlf commented Apr 9, 2024

actcwlf commented Nov 14, 2023 •

edited

Loading

robamler commented Nov 18, 2023 •

edited

Loading

actcwlf commented Nov 24, 2023 •

edited

Loading

robamler commented Nov 25, 2023 •

edited

Loading