…en in a legal dump) It turned out the munged corpus had a lot of duplicate keys, which I had to repair. That in turn revealed that we do not handle very large numbers properly. I am leaving the failing tests in place for now, as the proper solution is to fix them.
As far as I can tell, any escaped string will be at least as long as its unescaped version, so we do not need to check the length repeatedly. We still check it after we do an sv_upgrade(), but I think that is the only check we actually need.
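A minimal sketch of the length argument, not the code from this patch: unescape_simple() is a hypothetical helper that handles only a few two-byte backslash escapes. Each escape consumes two input bytes and emits one, so the output can never exceed the input length and one up-front allocation is enough.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the patch: unescape \n, \t, \\ and \"
 * from src into a freshly allocated, NUL-terminated buffer. */
static char *unescape_simple(const char *src, size_t src_len, size_t *out_len)
{
    /* Every escape shrinks the text and every other byte copies one to
     * one, so src_len bytes (plus the NUL) always suffice. No length
     * check is needed inside the loop. */
    char *dst = malloc(src_len + 1);
    size_t r = 0, w = 0;

    if (dst == NULL)
        return NULL;

    while (r < src_len) {
        char c = src[r++];
        if (c == '\\' && r < src_len) {
            char e = src[r++];
            switch (e) {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            default:  c = e;    break;  /* covers \\ and \" */
            }
        }
        dst[w++] = c;
    }

    assert(w <= src_len);  /* the invariant the message relies on */
    dst[w] = '\0';
    *out_len = w;
    return dst;
}
```

The caller owns the returned buffer and must free() it. The invariant only holds while no step can expand the data, which is why a check after an upgrade is still needed.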
Previously we tried to parse the string in place as we converted it; however, if the string is upgraded this can result in strangeness. This patch fixes things so that we always read from the source string and write to a separate target string. If the target is upgraded, the read position is unaffected.
This is probably a teeny bit slower than parsing in place, but it makes it a lot easier to add introspective type debug output, which will come in a later patch.
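The separate read and write buffers described above can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: target_t, target_push() and latin1_to_utf8() are hypothetical names, and a Latin-1 to UTF-8 conversion stands in for the upgrade step because it also makes the data longer and may move the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer standing in for the target string. */
typedef struct {
    unsigned char *buf;
    size_t len;
    size_t cap;
} target_t;

/* Append one byte, growing (and possibly moving) the buffer as needed. */
static int target_push(target_t *t, unsigned char byte)
{
    assert(t != NULL);
    if (t->len == t->cap) {
        size_t new_cap = t->cap ? t->cap * 2 : 8;
        unsigned char *p = realloc(t->buf, new_cap);
        if (p == NULL)
            return -1;
        t->buf = p;
        t->cap = new_cap;
    }
    t->buf[t->len++] = byte;
    return 0;
}

/* Copy Latin-1 bytes from src into t as UTF-8. Bytes >= 0x80 expand to
 * two bytes. The read pointer only ever walks src, so a reallocation of
 * t->buf cannot invalidate it. */
static int latin1_to_utf8(const unsigned char *src, size_t src_len, target_t *t)
{
    const unsigned char *rd = src;
    const unsigned char *end = src + src_len;

    while (rd < end) {
        unsigned char c = *rd++;
        if (c < 0x80) {
            if (target_push(t, c) != 0)
                return -1;
        } else {
            if (target_push(t, (unsigned char)(0xC0 | (c >> 6))) != 0)
                return -1;
            if (target_push(t, (unsigned char)(0x80 | (c & 0x3F))) != 0)
                return -1;
        }
    }
    return 0;
}
```

Had the conversion run in place, the write position could overtake the read position as soon as a byte expanded, and a reallocation would leave the read pointer dangling. Keeping the two buffers apart costs an extra copy but removes both hazards.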