Binary data not recognized by YAML parser #39

Closed
tmoschou opened this issue Sep 21, 2017 · 9 comments
Labels
yaml Issue related to YAML format backend

@tmoschou

Fails to recognise explicit binary type tag !!binary (tag:yaml.org,2002:binary)

Also fails to decode to spec as defined in http://yaml.org/type/binary.html.

Binary data is serialized using the base64 format as defined by RFC2045 (MIME), with the following notes:

  • The content is not restricted to lines of 76 characters or less.

  • Characters other than the base64 alphabet, line breaks and white space are considered an error.
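A minimal sketch of what those two notes imply for a decoder, using only the JDK's java.util.Base64 rather than Jackson or SnakeYAML internals. The class and method names (YamlBinaryDecode, decodeYamlBinary) are hypothetical and chosen for illustration. The assumption here is that stripping white space first and then decoding strictly is an acceptable reading of the notes: line length is unrestricted, and any character outside the base64 alphabet is rejected.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class YamlBinaryDecode {

    /**
     * Decodes the text of a !!binary scalar (hypothetical helper).
     * Line breaks and white space are removed first; anything else that is
     * not part of the base64 alphabet makes the strict decoder throw.
     */
    public static byte[] decodeYamlBinary(String scalar) {
        // \s matches space, tab, CR and LF, which the binary type tolerates.
        String compact = scalar.replaceAll("\\s+", "");
        // The basic (RFC 4648) decoder throws IllegalArgumentException on
        // any character outside the alphabet, so stray characters are errors.
        return Base64.getDecoder().decode(compact);
    }

    public static void main(String[] args) {
        // The "picture" value from Example 2.23, with its line breaks kept.
        String scalar =
              "R0lGODlhDAAMAIQAAP//9/X\n"
            + "17unp5WZmZgAAAOfn515eXv\n"
            + "Pz7Y6OjuDg4J+fn5OTk6enp\n"
            + "56enmleECcgggoBADs=\n";
        byte[] gif = decodeYamlBinary(scalar);
        System.out.println(gif.length);
        System.out.println(new String(gif, 0, 6, StandardCharsets.US_ASCII));
    }
}
```

Note that Base64.getMimeDecoder() would also accept the line breaks, but it silently skips every character outside the alphabet, which is more lenient than the second note allows.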

Sample test case that demonstrates the issues:

From "Example 2.23. Various Explicit Tags" from spec (1.1 and 1.2) - http://www.yaml.org/spec/1.2/spec.html

---
picture: !!binary |
 R0lGODlhDAAMAIQAAP//9/X
 17unp5WZmZgAAAOfn515eXv
 Pz7Y6OjuDg4J+fn5OTk6enp
 56enmleECcgggoBADs=

@Test
public void testBinary() throws IOException {
    final ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
    try (final InputStream inputStream = getClass().getResourceAsStream("yaml1.3-example2.23.yaml")) {
        final JsonNode bean = mapper.readTree(inputStream);
        final JsonNode picture = bean.get("picture");
        assertEquals(JsonNodeType.BINARY, picture.getNodeType()); // fails
        final byte[] gif = picture.binaryValue(); // also fails 
        assertEquals(65, gif.length);
        final byte[] actualFileHeader = Arrays.copyOfRange(gif, 0, 6);
        final byte[] expectedFileHeader = new byte[]{'G', 'I', 'F', '8', '9', 'a'};
        assertArrayEquals(expectedFileHeader, actualFileHeader);
    }
}
@cowtowncoder
Member

Thank you for reporting this and providing examples.
I'll have to see whether SnakeYAML (the actual low-level decoder/encoder) provides enough information to expose it.

Note that Jackson databind does actually handle base64 encoding/decoding when using typed binding (to POJOs with byte[] properties); but JsonNode cannot use such metadata.
But the default Base64Variant is one that does not force linefeeds (there are 3 or 4 different commonly used variants, not just one). Perhaps it would make sense to figure out a way to use different defaults for the YAML codec; that should be doable.
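To illustrate how the commonly used variants differ, here is a small sketch using the JDK's java.util.Base64 as a stand-in. It does not use Jackson's own Base64Variant classes, and the class and method names (Base64VariantsDemo, basic, mime, urlSafe) are hypothetical. It shows that the MIME variant breaks lines at 76 characters while the basic variant does not, and that a strict basic decoder rejects MIME output because of the line breaks.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64VariantsDemo {

    // Basic variant (RFC 4648): one unbroken line, '+' and '/' in the alphabet.
    public static String basic(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // MIME variant (RFC 2045): CRLF inserted after every 76 output characters.
    public static String mime(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    // URL-safe variant: '-' and '_' replace '+' and '/'.
    public static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];
        Arrays.fill(data, (byte) 0xFF);

        System.out.println("basic has line break: " + basic(data).contains("\n"));
        System.out.println("mime has line break:  " + mime(data).contains("\n"));
        System.out.println("url-safe uses '_':    " + urlSafe(data).contains("_"));

        // A strict basic decoder does not accept the MIME line breaks.
        try {
            Base64.getDecoder().decode(mime(data));
        } catch (IllegalArgumentException e) {
            System.out.println("basic decoder rejected MIME output");
        }
    }
}
```

This is why the choice of default variant matters for YAML: block scalars like the one in the report contain line breaks, so a decoder that assumes a single unbroken line needs either a more lenient variant or a white-space-stripping step first.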

@cowtowncoder added the yaml label (Issue related to YAML format backend) on Oct 24, 2017
@arulrajnet

As per this test class, https://bitbucket.org/asomov/snakeyaml/src/tip/src/test/java/org/yaml/snakeyaml/types/BinaryTagTest.java?fileviewer=file-view-default, SnakeYAML already supports this. The Jackson YAMLParser lacks this support.

@cowtowncoder
Member

@arulrajnet Thank you for the link. This will be useful if and when someone has time to work on this.

@cowtowncoder changed the title from "(yaml) does not handle binary type correctly" to "Binary data not recognized by YAML parser" on Nov 30, 2017
@cowtowncoder
Member

Unfortunately the test case uses a higher-level method of SnakeYAML (load), which is not what the Jackson parser uses (since load builds an in-memory structure instead of parsing incrementally). The tag is visible and easy to recognize, and Jackson has its own base64 codec which should work fine, so that's not a big deal... but I'll see if SnakeYAML might have lower-level access to their codec.

@cowtowncoder
Member

Turns out there is a simple class, Base64Coder, that is used internally, so I think I can just use that.
While it may not be as streamlined as the Jackson codec, it would most likely provide the closest compatibility with respect to YAML, so it seems like the best way to go for now.

@cowtowncoder modified the milestones: 2.9.1, 2.9.3 on Nov 30, 2017
@earonesty

wow, yaml typing is evil and idiosyncratic.

@cowtowncoder
Member

@earonesty Somehow it feels on-brand for YAML tho :-)

@earonesty

There is no specification for the binary tag, no definition of encoding. Base64 is a convention tho, so using it is a safe choice.

@cowtowncoder
Member

As per tests for #90, Jackson YAML format does actually support decoding from !!binary tagged values, assuming Base64 encoding, without requiring target type of byte[].
So deserializing into JsonNode (via ObjectMapper.readTree()) does produce BinaryNodes; and deserializing as java.lang.Object should produce byte[] values, I think.

So even if YAML spec did not specify encoding, it should work in practice.
