Using fastbase64 to encode a file and put it in JSON #10

Closed
jcidaragones opened this issue Mar 20, 2019 · 4 comments

@jcidaragones

Hello!

I'm coding a program that should transfer a 160Mb file through JSON for an API and I wanted to test this AVX2 approach to encode the content of the file, paste the encoded string into the JSON and send it to the API.

I first modified unit.c to open a file, read its content in binary into a variable, encode it into another variable, and then write the result to a file again. After some tests I got the encoded file (except for images), but I always had a garbage tail at the end of it, so I simplified the problem and just modified the main of unit.c like this:

` size_t len = 0;
size_t codedlen = 0;
printf("Encoding...\n");
char * dest1 = (char*)malloc(chromium_base64_encode_len(strlen(wikipediasource)));
codedlen = fast_avx2_base64_encode(dest1, wikipediasource, strlen(wikipediasource));

assert(strncmp(dest1, wikipediacoded, codedlen) == 0);
printf("Assert ok\n\nBASE64:\n%s\n", dest1);

char * dest2 = (char*)malloc(chromium_base64_decode_len(codedlen));
len = fast_avx2_base64_decode(dest2, dest1, codedlen);

assert(len == strlen(wikipediasource));
assert(strncmp(dest2, wikipediasource, strlen(wikipediasource)) == 0);
printf("Asserts ok\n\nDECODED TEXT:\n%s\n", dest2);
printf("\n\nSOURCE:\n%s\n",wikipediasource);`

No exception is thrown, but the printf of SOURCE (wikipediasource) differs from the printf of DECODED TEXT (dest2) (see image)
[image: git]

The garbage tail is always changing:
²²²²¦¦¦¦¦¦¦Ñr�¥&
²²²²
²²²²¦¦¦¦¦¦¦┐90¨+ì

I assumed that this was something related to printf or to a variable type, so I stored the decoded string into a file, but unfortunately it saved the whole thing, including the garbage tail.

So I'm not sure if this is how it's supposed to work and/or if I'm doing something wrong?? Could you give me some advice?

Many thanks in advance!

Best,

@aqrit

aqrit commented Mar 20, 2019

The count returned from strlen() doesn't include the terminating null character ('\0').
Therefore you didn't include the null character in the encoded stream.

After decoding, printf() keeps reading until it encounters a null character, so it runs past the end of the decoded data...
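
The point above can be reproduced without the library. The following is a minimal sketch, assuming a plain byte copy as a stand-in for the encode/decode round trip; `round_trip` and `has_terminator` are hypothetical helper names, not part of fastbase64.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an encode/decode round trip: returns a heap copy
 * of exactly n bytes of src, the way a decoder hands back exactly the bytes
 * that were encoded. It never appends a terminator on its own. */
char *round_trip(const char *src, size_t n) {
    assert(src != NULL);
    char *out = malloc(n > 0 ? n : 1);
    if (out != NULL) {
        memcpy(out, src, n);
    }
    return out;
}

/* Returns 1 if the first n bytes of buf contain a '\0', 0 otherwise. */
int has_terminator(const char *buf, size_t n) {
    return memchr(buf, '\0', n) != NULL;
}
```

With `s = "abc"`, `round_trip(s, strlen(s))` gives back 3 bytes and no terminator, so passing the result to `printf("%s", ...)` reads past the buffer. `round_trip(s, strlen(s) + 1)` carries the terminator along and the result is a valid C string.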

@lemire
Owner

lemire commented Mar 20, 2019

Thank you @aqrit.

@jcidaragones: base64 is meant to encode arbitrary binary data. In C, a string must be null-terminated. So if you see "abc", it actually spans 4 bytes, even though strlen tells you that there are 3 bytes.

You have two ways around your issue:

  1. Instead of printf("%s", string), do printf("%.*s", (int)numberOfBytes, string). The precision argument must be an int.

  2. Or else, replace strlen(string) by strlen(string) + 1 to account for the fact that you want to preserve the null character.
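
Both options can be sketched in a few lines of standard C. This is only an illustration under stated assumptions: `format_bounded` and `length_with_terminator` are hypothetical helper names, and `snprintf` is used in place of `printf` so the result can be inspected.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Option 1: format at most n bytes of buf into out. buf does not need to be
 * null-terminated because the precision bounds the read. Returns the
 * snprintf result (number of characters that the full output requires). */
int format_bounded(char *out, size_t out_size, const char *buf, size_t n) {
    assert(out != NULL && buf != NULL);
    /* The precision consumed by ".*" must be an int, so clamp and cast. */
    int precision = n > (size_t)INT_MAX ? INT_MAX : (int)n;
    return snprintf(out, out_size, "%.*s", precision, buf);
}

/* Option 2: number of bytes to encode so that the terminator is preserved
 * and the decoded buffer is again a valid C string. */
size_t length_with_terminator(const char *s) {
    assert(s != NULL);
    return strlen(s) + 1;
}
```

Option 1 is probably the better fit for file contents, which are arbitrary bytes and may contain null characters of their own. Option 2 only makes sense when the input really is a C string.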

lemire closed this as completed Mar 20, 2019
@lemire
Owner

lemire commented Mar 20, 2019

> I'm coding a program that should transfer a 160Mb file through JSON for an API and I wanted to test this AVX2 approach to encode the content of the file, paste the encoded string into the JSON and send it to the API.

You know about simdjson?

@jcidaragones
Author

> > I'm coding a program that should transfer a 160Mb file through JSON for an API and I wanted to test this AVX2 approach to encode the content of the file, paste the encoded string into the JSON and send it to the API.
>
> You know about simdjson?

Nope, but I'll check it out!

@aqrit @lemire Thanks for your answers, now everything makes sense!
