Add Base45 (RFC9285) to Multibase table #123

Merged (3 commits) on Jun 5, 2024

Conversation

Contributor

@msporny msporny commented Jun 3, 2024

This PR adds support for Base45 encoding (RFC 9285), which is useful when encoding data that will be placed into a QR Code. QR Codes have an optimized ALPHANUMERIC mode whose 45-character set is the alphabet that Base45 uses.
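For readers unfamiliar with the encoding, here is a minimal sketch of Base45 encoding as I read RFC 9285: each pair of bytes becomes three characters, and a trailing single byte becomes two. The function name `base45_encode` is my own and is not part of this PR or of any multiformats library.

```python
# RFC 9285 Table 1 alphabet: digits, uppercase letters, then " $%*+-./:".
BASE45_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def base45_encode(data: bytes) -> str:
    """Encode bytes as Base45 (hypothetical helper, not a library API)."""
    out = []
    # Walk over complete byte pairs; a trailing odd byte is handled below.
    for i in range(0, len(data) - 1, 2):
        # Two bytes form a big-endian 16-bit value ...
        n = (data[i] << 8) | data[i + 1]
        # ... written as three base-45 digits, least significant first.
        n, c = divmod(n, 45)
        e, d = divmod(n, 45)
        out.append(BASE45_ALPHABET[c] + BASE45_ALPHABET[d] + BASE45_ALPHABET[e])
    if len(data) % 2:
        # A single leftover byte is written as two base-45 digits.
        d, c = divmod(data[-1], 45)
        out.append(BASE45_ALPHABET[c] + BASE45_ALPHABET[d])
    return "".join(out)
```

With this sketch, `base45_encode(b"AB")` gives `"BB8"`, which matches the first encoding example in the RFC.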

@davidlehn
Contributor

See also: #64

Co-authored-by: David I. Lehn <dil@lehn.org>
Member

@vmx vmx left a comment

This sounds good to me, though I'd also like to get approval from other folks before merging.

@ben221199
Contributor

Seems fine to me. However, what about upper/lowercase? If R is uppercase, isn't it an idea to register r too for lowercase?

@davidlehn
Contributor

davidlehn commented Jun 4, 2024

> Seems fine to me. However, what about upper/lowercase? If R is uppercase, isn't it an idea to register r too for lowercase?

There is a specific alphabet in the RFC. There is no lowercase encoding. I think R was also chosen because it is in that alphabet.

@ben221199
Contributor

I don't see terms like "uppercase", "lowercase", "case-sensitive" or "case-insensitive" in the RFC. The only thing I see is a table with uppercase characters. This doesn't give me a clear indication that base 45 cannot be used with lowercase letters. It only signals some preference.

@davidlehn
Contributor

> I don't see terms like "uppercase", "lowercase", "case-sensitive" or "case-insensitive" in the RFC. The only thing I see is a table with uppercase characters. This doesn't give me a clear indication that base 45 cannot be used with lowercase letters. It only signals some preference.

Section 4.2 says:

> The Alphanumeric mode is defined to use 45 characters as specified in this alphabet.

It then has a table with the characters. Section 6 says:

> Implementations MUST reject any input that is not a valid encoding. For example, it MUST reject the input (encoded data) if it contains characters outside the base alphabet (in Table 1) when interpreting base-encoded data.

I think that makes clear which alphabet this RFC uses. There are certainly other possible base45 alphabets and algorithms, but this RFC is specific about what to use. What's the use case for doing otherwise?
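To make the Section 6 requirement concrete, here is a rough sketch of a strict decoder that rejects anything outside the Table 1 alphabet, so lowercase input fails rather than being folded to uppercase. The name `base45_decode` is hypothetical, and the length and overflow checks reflect my reading of what the RFC counts as an invalid encoding.

```python
# RFC 9285 Table 1 alphabet: digits, uppercase letters, then " $%*+-./:".
BASE45_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"
_VALUES = {ch: i for i, ch in enumerate(BASE45_ALPHABET)}


def base45_decode(text: str) -> bytes:
    """Strictly decode Base45 (hypothetical helper); ValueError on bad input."""
    # Valid encodings are made of triplets plus an optional final pair.
    if len(text) % 3 == 1:
        raise ValueError("invalid Base45 length")
    try:
        digits = [_VALUES[ch] for ch in text]
    except KeyError as exc:
        # Lowercase letters end up here: they are not in the alphabet.
        raise ValueError(
            f"character {exc.args[0]!r} is outside the Base45 alphabet"
        ) from None
    out = bytearray()
    for i in range(0, len(digits), 3):
        chunk = digits[i:i + 3]
        if len(chunk) == 3:
            n = chunk[0] + chunk[1] * 45 + chunk[2] * 2025
            if n > 0xFFFF:
                raise ValueError("Base45 triplet exceeds 16 bits")
            out += n.to_bytes(2, "big")
        else:
            n = chunk[0] + chunk[1] * 45
            if n > 0xFF:
                raise ValueError("Base45 pair exceeds 8 bits")
            out.append(n)
    return bytes(out)
```

Under this sketch `base45_decode("BB8")` returns `b"AB"`, while `base45_decode("bb8")` raises `ValueError`.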

@msporny
Contributor Author

msporny commented Jun 4, 2024

> I think R was also chosen because it is in that alphabet.

"R" was chosen specifically because lowercase "r" isn't a valid character in Base45. Base45 is largely optimized for use in QR Codes, so the entire value we put in the QR Code, multibase prefix included, needs to stay within the Base45 alphabet, and lowercase "r" does not.

We really wanted to use "Q" (for QR Code), but couldn't do so because "Q" was reserved due to early IPFS deployments and backwards-compat reasons.

Hope that helps explain some of the thinking behind the selection.
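As an illustration of the prefix reasoning, here is a small sketch that prepends the `R` prefix to a Base45 payload and checks that the whole string stays inside the 45-character set. Both helper names are made up for this example, and the check assumes the QR alphanumeric character set is the same as the RFC 9285 alphabet, as Section 4.2 of the RFC states.

```python
# RFC 9285 Table 1 alphabet: digits, uppercase letters, then " $%*+-./:".
BASE45_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"
MULTIBASE_BASE45_PREFIX = "R"  # the prefix registered by this PR


def fits_qr_alphanumeric(value: str) -> bool:
    """True if every character is in the 45-character set (hypothetical helper)."""
    return all(ch in BASE45_ALPHABET for ch in value)


def to_multibase_base45(base45_payload: str) -> str:
    """Prepend the multibase prefix to an already Base45-encoded payload."""
    return MULTIBASE_BASE45_PREFIX + base45_payload


value = to_multibase_base45("BB8")        # "BB8" is the RFC's encoding of "AB"
print(value)                              # RBB8
print(fits_qr_alphanumeric(value))        # True
print(fits_qr_alphanumeric("r" + "BB8"))  # False: "r" is outside the set
```

A lowercase prefix would push the whole value out of the alphanumeric character set, which is the situation the uppercase `R` avoids.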

@rvagg rvagg merged commit 7257375 into multiformats:master Jun 5, 2024
@ben221199
Contributor

Fair enough. I will implement it in my library now that it is merged.

@rvagg Can we have test vectors for it?

@rvagg
Member

rvagg commented Jun 5, 2024

@ben221199 #124

If you get there first, open a PR and ping me and I'll see if I can do a JS implementation. Or if you were to do the JS implementation (js-multiformats), I could do the Go one.
