Intolerant of truncated data, inelegant failure mode #3

@Synchro

I've noticed that this library has the same problem as the Tuupola Base62 encoder: it is intolerant of corruption, and even a single-bit error destroys the entire output. This is not true of PHP's built-in base64 functions.

To be fair, this is an edge case, but URLs cannot always be relied on to remain intact. For example, it's common for email clients to truncate them, and when that happens, a URL that uses this encoder always breaks completely.

I suspect it fails for exactly the same reason: the encoding treats the entire string as a single arbitrary-precision number, so any damage propagates through the whole decoded result. If the encoder used a chunk-based approach (which is especially appropriate for base62, since it does not fit evenly into 8-bit chars), it would not suffer from this fragility.
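To make the chunk-based idea concrete, here is a minimal sketch of what I have in mind. It is not this library's code or Tuupola's: the function names, the 4-byte block size, and the alphabet order are my own assumptions for illustration, and it assumes 64-bit PHP 7+. Each 4-byte block is encoded independently as 6 Base62 characters, so damage stays inside one block.

```php
<?php
// Hypothetical chunked Base62 sketch (illustration only, not the xobotyi/basen API).
// Assumes 64-bit PHP 7+. Alphabet order is an arbitrary choice for this example.
const B62_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
// Block size in bytes => Base62 characters needed (62^2 > 2^8, 62^3 > 2^16, 62^5 > 2^24, 62^6 > 2^32).
const B62_CHARS_FOR_BYTES = [1 => 2, 2 => 3, 3 => 5, 4 => 6];

function chunkedBase62Encode(string $data): string
{
    if ($data === '') {
        return '';
    }
    $alphabet = B62_ALPHABET;
    $charsFor = B62_CHARS_FOR_BYTES;
    $out = '';
    foreach (str_split($data, 4) as $block) {
        $n = strlen($block);
        // Read the block as a big-endian unsigned integer.
        $value = 0;
        for ($i = 0; $i < $n; $i++) {
            $value = ($value << 8) | ord($block[$i]);
        }
        // Emit a fixed number of digits for this block size, most significant first.
        $digits = '';
        for ($i = 0; $i < $charsFor[$n]; $i++) {
            $digits = $alphabet[$value % 62] . $digits;
            $value = intdiv($value, 62);
        }
        $out .= $digits;
    }
    return $out;
}

function chunkedBase62Decode(string $text): string
{
    if ($text === '') {
        return '';
    }
    $bytesFor = array_flip(B62_CHARS_FOR_BYTES); // [2 => 1, 3 => 2, 5 => 3, 6 => 4]
    $out = '';
    foreach (str_split($text, 6) as $chunk) {
        $len = strlen($chunk);
        if (!isset($bytesFor[$len])) {
            continue; // Impossible chunk length (1 or 4 chars): drop only this chunk.
        }
        $value = 0;
        $valid = true;
        for ($i = 0; $i < $len; $i++) {
            $pos = strpos(B62_ALPHABET, $chunk[$i]);
            if ($pos === false) {
                $valid = false; // Character outside the alphabet.
                break;
            }
            $value = $value * 62 + $pos;
        }
        $n = $bytesFor[$len];
        if (!$valid || $value >= (1 << (8 * $n))) {
            continue; // Cannot be a real block: skip it, keep the rest.
        }
        // Unpack the integer back into $n big-endian bytes.
        $block = '';
        for ($i = 0; $i < $n; $i++) {
            $block = chr($value & 0xFF) . $block;
            $value >>= 8;
        }
        $out .= $block;
    }
    return $out;
}
```

With this layout, removing the last character of the encoded test string can only affect the final 4-byte block (it is either dropped or decoded wrongly); every earlier block decodes exactly as before. The trade-off is size: 6 characters per 4 bytes is larger than what a whole-input big-number encoding produces.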

Steps to reproduce, backtrace or example script

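$str = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567';
// Round-trip through this library's Base62, then decode again with the last character removed.
$e = \xobotyi\basen\Base62::encode($str);
$d = \xobotyi\basen\Base62::decode($e);
$t = \xobotyi\basen\Base62::decode(substr($e, 0, -1));
// Same experiment with PHP's built-in base64 functions.
$b6 = base64_encode($str);
$b6d = base64_decode($b6);
$b6td = base64_decode(substr($b6, 0, -1));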
echo 'Original string: ', $str, "\n";
echo 'Base62 Encoded string:  ', $e, "\n";
echo 'Decoded Base62 string:  ', $d, "\n";
echo 'Decoded Base62 with 1 char truncated: ', $t, "\n";
echo 'Base64 Encoded string:  ', $b6, "\n";
echo 'Decoded Base64 string:  ', $b6d, "\n";
echo 'Decoded Base64 with 1 char truncated: ', $b6td, "\n";

Example output:

Original string: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567
Base62 Encoded string:  4P1nFLX8O0GCD8E3Xwm0iDXG3fUXK4r0ErC9FylJ2b83x5frxW82yb3kAOYylUnbN1ljw2kWvQmiaHo5p
Decoded Base62 string:  abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567
Decoded Base62 with 1 char truncated: ���`�G�`�܎���"�d�3i��3y�`��S�d6ϙ�}��6��\&}7	�l�}O���!ۮG�
Base64 Encoded string:  YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWjAxMjM0NTY3
Decoded Base64 string:  abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567
Decoded Base64 with 1 char truncated: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456
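
The garbage in the truncated Base62 line is what I would expect if the whole input is one number: losing the last base-62 digit amounts to dividing that number by 62, which disturbs every byte rather than just the last one. A small illustration with plain integers (the helper names are made up for this example, this is not the library's internals, and it assumes 64-bit PHP 7+):

```php
<?php
// Illustration only: treat a short string as one big-endian integer, the way a
// whole-input big-number encoder would, then drop its last base-62 digit.
function bytesToInt(string $s): int
{
    $v = 0;
    foreach (str_split($s) as $c) {
        $v = ($v << 8) | ord($c);
    }
    return $v;
}

function intToBytes(int $v, int $n): string
{
    $s = '';
    for ($i = 0; $i < $n; $i++) {
        $s = chr($v & 0xFF) . $s;
        $v >>= 8;
    }
    return $s;
}

$original = 'abcdefg';                     // 7 bytes = 56 bits, fits in a 64-bit int
$number   = bytesToInt($original);
$shorter  = intdiv($number, 62);           // what remains after the last base-62 digit is lost
$damaged  = intToBytes($shorter, strlen($original));

echo bin2hex($original), "\n";
echo bin2hex($damaged), "\n";              // compare byte by byte with the line above
```

Base64 does not behave this way because each character carries exactly 6 bits at a fixed position, so a lost character only affects the bits it carried.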
