Base64BinaryType validation is (maybe) not compliant. #4

Open
iglosiggio opened this issue Jun 29, 2023 · 0 comments · May be fixed by #5

Hi! We recently found ourselves in a weird situation where strings like !Y@Q#=$=^ validated as base64 without issues. We understand that these strings are valid per RFC2045, but we ended up puzzled by the XML Schema description of base64Binary.

Let me quote both specifications:

XML Schema Part 2

The lexical forms of base64Binary values are limited to the 65 characters of the Base64 Alphabet defined in [RFC 2045], i.e., a-z, A-Z, 0-9, the plus sign (+), the forward slash (/) and the equal sign (=), together with the characters defined in [XML 1.0 (Second Edition)] as white space. No other characters are allowed.

RFC2045

               Table 1: The Base64 Alphabet

Value Encoding  Value Encoding  Value Encoding  Value Encoding
    0 A            17 R            34 i            51 z
    1 B            18 S            35 j            52 0
    2 C            19 T            36 k            53 1
    3 D            20 U            37 l            54 2
    4 E            21 V            38 m            55 3
    5 F            22 W            39 n            56 4
    6 G            23 X            40 o            57 5
    7 H            24 Y            41 p            58 6
    8 I            25 Z            42 q            59 7
    9 J            26 a            43 r            60 8
   10 K            27 b            44 s            61 9
   11 L            28 c            45 t            62 +
   12 M            29 d            46 u            63 /
   13 N            30 e            47 v
   14 O            31 f            48 w         (pad) =
   15 P            32 g            49 x
   16 Q            33 h            50 y

The encoded output stream must be represented in lines of no more than 76 characters each. All line breaks or other characters not found in Table 1 must be ignored by decoding software. In base64 data, characters other than those in Table 1, line breaks, and other white space probably indicate a transmission error, about which a warning message or even a message rejection might be appropriate under some circumstances.

Relevant code

Base64BinaryType.java:119

/**
 * computes the length of binary data.
 * 
 * This function also performs format check.
 * @return    -1        if format is illegal.
 * 
 */
private static int calcLength( final char[] buf ) {
    final int len = buf.length;
    int base64count=0, paddingCount=0;
    int i;

    for( i=0; i<len; i++ ) {
        if( buf[i]=='=' )    // decodeMap['=']!=-1, so we have to check this first.
            break;
        if( buf[i]>=256 )
            return -1;      // incorrect character
        if( decodeMap[buf[i]]!=-1 )
            base64count++;
    }

    // once we saw '=', nothing but '=' can be appeared.
    for( ; i<len; i++ ) {
        if( buf[i]=='=' ) {
            paddingCount++;
            continue;
        }
        if( buf[i]>=256 )
            return -1;      // incorrect character
        if( decodeMap[buf[i]]!=-1 )
            return -1;
    }

    // no more than two paddings are allowed.
    if( paddingCount > 2 )        return -1;
    // characters must be a multiple of 4.
    if( (base64count+paddingCount)%4 != 0 )    return -1;

    return ((base64count+paddingCount)/4)*3-paddingCount;
}
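
To make the behavior concrete, here is a minimal sketch that reproduces the control flow of calcLength in a stand-alone class. The class name CalcLengthTrace and the helper buildDecodeMap are hypothetical, and the decodeMap contents are an assumption (-1 for every character outside the Base64 alphabet, something other than -1 for the 64 alphabet characters and for '='), inferred from the comments in the quoted code rather than copied from the library.

```java
import java.util.Arrays;

// Hypothetical stand-alone harness, not part of the library.
final class CalcLengthTrace {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Assumption: -1 marks "not a Base64 character"; '=' maps to a value
    // other than -1, as the comment in the quoted code suggests.
    private static final byte[] decodeMap = buildDecodeMap();

    private static byte[] buildDecodeMap() {
        byte[] map = new byte[256];
        Arrays.fill(map, (byte) -1);
        for (int i = 0; i < ALPHABET.length(); i++) {
            map[ALPHABET.charAt(i)] = (byte) i;
        }
        map['='] = 127; // padding marker; any value other than -1 works here
        return map;
    }

    // Same control flow as the calcLength quoted above.
    static int calcLength(final char[] buf) {
        final int len = buf.length;
        int base64count = 0, paddingCount = 0;
        int i;

        for (i = 0; i < len; i++) {
            if (buf[i] == '=') break;
            if (buf[i] >= 256) return -1;
            // Characters outside the alphabet fall through silently.
            if (decodeMap[buf[i]] != -1) base64count++;
        }

        for (; i < len; i++) {
            if (buf[i] == '=') {
                paddingCount++;
                continue;
            }
            if (buf[i] >= 256) return -1;
            // Only alphabet characters after '=' are rejected.
            if (decodeMap[buf[i]] != -1) return -1;
        }

        if (paddingCount > 2) return -1;
        if ((base64count + paddingCount) % 4 != 0) return -1;
        return ((base64count + paddingCount) / 4) * 3 - paddingCount;
    }

    public static void main(String[] args) {
        System.out.println(calcLength("YQ==".toCharArray()));      // prints 1
        System.out.println(calcLength("!Y@Q#=$=^".toCharArray())); // prints 1
    }
}
```

Under that assumption, only Y and Q are counted in the first loop, the second loop counts the two = signs and skips $ and ^, and the total of four passes the multiple-of-4 check. The string is therefore treated exactly like YQ==.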

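I do think that some facility for requesting stricter validation is needed. On the spec-compliance side of things, I don't know where to stand given these conflicting accounts: XML Schema says no characters outside the Base64 alphabet and white space are allowed, yet it depends on the RFC, which tells decoders to ignore any other characters.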
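
To illustrate what a stricter mode could look like, here is a minimal sketch that follows the XML Schema wording: any character that is not in the Base64 alphabet, not '=', and not XML white space makes the input invalid. The class and method names (StrictBase64Check, strictCalcLength) are hypothetical and not part of the library. The sketch also does not enforce every constraint of the XML Schema lexical grammar, such as which characters may precede the padding.

```java
// Hypothetical sketch of a stricter length/format check.
final class StrictBase64Check {

    static boolean isBase64Char(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9') || c == '+' || c == '/';
    }

    // White space as defined by XML 1.0: space, tab, LF, CR.
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Returns the decoded length, or -1 if the format is illegal.
    static int strictCalcLength(final char[] buf) {
        int base64count = 0, paddingCount = 0;

        for (char c : buf) {
            if (isXmlWhitespace(c)) continue;
            if (c == '=') {
                paddingCount++;
                continue;
            }
            // Reject foreign characters, and any data after padding began.
            if (!isBase64Char(c) || paddingCount > 0) return -1;
            base64count++;
        }

        if (paddingCount > 2) return -1;
        if ((base64count + paddingCount) % 4 != 0) return -1;
        return ((base64count + paddingCount) / 4) * 3 - paddingCount;
    }

    public static void main(String[] args) {
        System.out.println(strictCalcLength("YQ==".toCharArray()));      // prints 1
        System.out.println(strictCalcLength("!Y@Q#=$=^".toCharArray())); // prints -1
    }
}
```

A check like this could sit behind an opt-in flag so that existing callers relying on the lenient RFC-style behavior are not affected.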
