types: Fixes #146, VisibleString character set #147

Nicceboy · 2023-08-21T15:12:51Z

Resolves #146

There is standard compliant character set.

However, there is weird side-effect on UPER tests, and some will fail.
Is the codec somehow using the allowed character list as part of codec?

I could not figure out the reason yet.

For reference, array was generated with following to be correct:

python -c 'print("[{}]".format(", ".join("0x{:02X}".format(i) for i in range(0x20, 0x7F))))'

Nicceboy · 2023-08-21T16:00:25Z

Seems like CHARACTER_WIDTH of the list is used on encoding.
Would it be okay to just replace the expected values with new encoded values?

Edit: Also the whole set in to_index_string or index_map in String constraints. I assume that it is related to packing and might be just safe to replace with new values.

Nicceboy · 2023-08-21T16:40:47Z

I modified the VisibleString output values but there might be other data race condition similar to this issue: ferrilab/bitvec#229

When running cargo test, one test fails

failures:

---- uper::tests::extension_additions stdout ----
thread 'uper::tests::extension_additions' panicked at 'assertion failed: `(left == right)`

Diff < left / right > :
 [
     199,
<    93,
<    57,
<    17,
<    105,
<    82,
<    178,
>    91,
>    53,
>    9,
>    89,
>    50,
>    114,
     7,
     1,
     128,
     5,
     150,
<    154,
<    19,
<    233,
<    84,
>    150,
>    11,
>    217,
>    52,
 ]

', src/uper.rs:785:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    uper::tests::extension_additions

test result: FAILED. 76 passed; 1 failed; 2 ignored; 0 measured; 0 filtered out; finished in 2.62s

But alone it succeeds cargo test --lib uper::tests::extension_additions:

    Finished test [unoptimized + debuginfo] target(s) in 0.05s
     Running unittests src/lib.rs (target/debug/deps/rasn-8b41135ae5a97fd6)

running 1 test
test uper::tests::extension_additions ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 78 filtered out; finished in 0.13s

XAMPPRocky · 2023-08-21T16:42:03Z

Seems like CHARACTER_WIDTH of the list is used on encoding.

Both the width and the order are used in UPER.

Would it be okay to just replace the expected values with new encoded values?

Hmm, are you sure these are correct? These are adapted from tests in other ASN.1 projects, so I would expect that those values are the correct ones, and not the new values.

Nicceboy · 2023-08-21T16:44:06Z

Hmm, are you sure these are correct? These are adapted from tests in other ASN.1 projects, so I would expect that those values are the correct ones, and not the new values.

I do not know about UPER codec so it might be very time consuming to verify the code. But I can check if they use similar character sets for VisibleString encoding.

XAMPPRocky · 2023-08-21T16:59:08Z

@Nicceboy Yeah I would look at packages like asn1-tools from the python world, they're the most extensive.

Nicceboy · 2023-08-21T17:12:51Z

Seems like asn1rs uses similar set:

pub const VISIBLE_STRING_CHARACTERS: &'static str =
        " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";

Also asn1tools:

VISIBLE_STRING = ''.join([chr(v) for v in range(32, 127)]) # 0x20, 0x7f, end value not inclusive

XAMPPRocky · 2023-08-21T17:45:23Z

@Nicceboy There must be something different though because their test is the same as the src/uper.rs one and has the original output not the new output.

https://github.com/eerimoq/asn1tools/blob/de25657f7c79100d1ba5312dd7474ff3e0d0ad2e/tests/test_uper.py#L1175-L1177

Nicceboy · 2023-08-21T17:51:19Z

Yeah, I tried it manually and new values are different indeed when compared to asn1tools output.

But since the character set changed, these values are also expected to change, since codec depends on it, so how can they be same with different character sets?

I am trying to see if asn1tools respects those character sets and maybe also test some other tool.

Nicceboy · 2023-08-21T19:54:17Z

I would assume that the new character list is correct, but codecs use them differently, and the asn1tools output is correct. Also asn1.io seems to produce same output.

If we look at the same asn1tools test file, there is also test for ia5_string with identical constraints for the linked VisibleString test ( "D ::= IA5String (SIZE (19..133)) "):

It uses the same character list as the "old" one here, #L1123-L1126.

('D',
 'HejHoppHappHippAbcde',
 b'\x03\x23\x2e\xa9\x1b\xf8\x70\x91\x87\x87\x09\x1a\x78\x70\x83\x8b'
 b'\x1e\x4c\xa0'),

The output is same for smaller VisibleString character list, #L1175-L1177 :

('A',
 'HejHoppHappHippAbcde',
 b'\x03\x23\x2e\xa9\x1b\xf8\x70\x91\x87\x87\x09\x1a\x78\x70\x83\x8b'
 b'\x1e\x4c\xa0'),

So this change should not actually change the output, if we trust these outputs.
Here the output changes when the character list is different.

But finding the difference in UPER implementation might be a bit too time consuming for me right now.

Nicceboy · 2023-08-22T18:21:33Z

I compared asn1tools and rans side by side with debugger, and I think there might a logic problem in rasn when encoding known-multiplier-strings. I also took a look for standard, but this might require confirmation @XAMPPRocky .

The main issue might be on using indexes when they should not be. rasn seems to always use just indexes, but on certain scenarios values should be used directly from the allowed character list.

This has gone unnoticed, because ia5 and VisibleString character list had full ASCII starting from zero, and the encoded index value is the same as the value itself.

Tests are also missing for PrintableString type, which would have caught this.

If you add similar test from asn1tools for PrintableString, it fails:

    #[test]
    fn printable_string() {
        round_trip_with_constraints!(
            uper,
            PrintableString,
            Constraints::new(&[Constraint::Size(Size::new(Bounded::Single(16)).into())]),
            PrintableString::from_bytes("0123456789abcdef".as_bytes()).unwrap(),
            &[0x60, 0xc5, 0x93, 0x36, 0x8d, 0x5b, 0x37, 0x70, 0xe7, 0x0e, 0x2c, 0x79, 0x32, 0xe6]
        );
    }

NumericString type is encoded correctly, because the allowed character list is so short, that is is expected to be encoded by using indexes.

You can see the difference also in asn1tools encode/decode mapping generation:

class NumericString(KnownMultiplierStringType):

    ALPHABET = bytearray(NUMERIC_STRING.encode('ascii'))
    ENCODE_MAP = {v: i for i, v in enumerate(ALPHABET)}
    DECODE_MAP = {i: v for i, v in enumerate(ALPHABET)}
    PERMITTED_ALPHABET = PermittedAlphabet(ENCODE_MAP,
                                           DECODE_MAP)


class PrintableString(KnownMultiplierStringType):

    ALPHABET = bytearray(PRINTABLE_STRING.encode('ascii'))
    ENCODE_MAP = {v: v for v in ALPHABET}
    DECODE_MAP = {v: v for v in ALPHABET}
    PERMITTED_ALPHABET = PermittedAlphabet(ENCODE_MAP,
                                           DECODE_MAP)


class IA5String(KnownMultiplierStringType):

    ALPHABET = bytearray(IA5_STRING.encode('ascii'))
    ENCODE_DECODE_MAP = {v: v for v in ALPHABET}
    PERMITTED_ALPHABET = PermittedAlphabet(ENCODE_DECODE_MAP,
                                           ENCODE_DECODE_MAP)

XAMPPRocky · 2023-10-10T08:21:41Z

@Nicceboy Can we make an issue for fixing that? Just so I don't forget about it.

Nicceboy · 2023-10-10T09:06:17Z

I can try to debug it again since I know a bit more PER now. If I can't figure it out, I will create issue.

XAMPPRocky · 2023-10-10T09:13:14Z

@Nicceboy Nice, thank you again for all your contributions to rasn! ❤️

…string tests

Nicceboy · 2023-10-11T08:56:23Z

@XAMPPRocky I think this can be reviewed now. The issue was on not checking whether alphabet list should be indexed or not. By default, both types and constraints were always indexed, but based on the standard, it depends on the highest value in that alphabet group. There are also tests for PrintableString.

XAMPPRocky · 2023-10-11T12:06:52Z

Thank you for your PR!

Nicceboy force-pushed the fix-visiblestring branch from 2bfe5fa to ceba49d Compare August 21, 2023 19:54

Nicceboy changed the title ~~draft: types: Fixes #146~~ draft: types: Fixes #146, VisibleString character set Aug 27, 2023

Nicceboy added 2 commits October 10, 2023 16:16

types: Fixes librasn#146

b202d84

visiblestring: remove dbg!

5c23752

Nicceboy force-pushed the fix-visiblestring branch from ceba49d to 5c23752 Compare October 10, 2023 13:17

per: add checks when restricted strings should be indexed, printable …

8826d8c

…string tests

Nicceboy changed the title ~~draft: types: Fixes #146, VisibleString character set~~ types: Fixes #146, VisibleString character set Oct 11, 2023

XAMPPRocky merged commit 15f8774 into librasn:main Oct 11, 2023
65 checks passed

Nicceboy deleted the fix-visiblestring branch October 20, 2023 18:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

types: Fixes #146, VisibleString character set #147

types: Fixes #146, VisibleString character set #147

Nicceboy commented Aug 21, 2023 •

edited

Loading

Nicceboy commented Aug 21, 2023 •

edited

Loading

Nicceboy commented Aug 21, 2023

XAMPPRocky commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

XAMPPRocky commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

XAMPPRocky commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

Nicceboy commented Aug 22, 2023

XAMPPRocky commented Oct 10, 2023

Nicceboy commented Oct 10, 2023

XAMPPRocky commented Oct 10, 2023

Nicceboy commented Oct 11, 2023

XAMPPRocky commented Oct 11, 2023

types: Fixes #146, VisibleString character set #147

types: Fixes #146, VisibleString character set #147

Conversation

Nicceboy commented Aug 21, 2023 • edited Loading

Nicceboy commented Aug 21, 2023 • edited Loading

Nicceboy commented Aug 21, 2023

XAMPPRocky commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

XAMPPRocky commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

XAMPPRocky commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

Nicceboy commented Aug 21, 2023

Nicceboy commented Aug 22, 2023

XAMPPRocky commented Oct 10, 2023

Nicceboy commented Oct 10, 2023

XAMPPRocky commented Oct 10, 2023

Nicceboy commented Oct 11, 2023

XAMPPRocky commented Oct 11, 2023

Nicceboy commented Aug 21, 2023 •

edited

Loading

Nicceboy commented Aug 21, 2023 •

edited

Loading