Base36 byte-encoding specification #65

Merged
merged 3 commits into from
May 22, 2020

ribasushi
Contributor

As per our earlier decision, we will start supporting base36 in all implementations in order to resolve ipfs/kubo#7318.

This adds the encoding/specification to the table of codecs. Another PR fixing up the test vectors and adding an actual program to generate them will follow.

The encoding uses the alphabet 0-9a-z case-insensitively. The prefix K is chosen to limit future clashes with English words, based on https://en.wikipedia.org/wiki/Letter_frequency#Relative_frequencies_of_the_first_letters_of_a_word_in_the_English_language (also, the prefix K2 evokes lofty goals 😛).

Typical encodings look like (generated by this program):

Nul Cid	lc: kdaznk
Nul Cid	UC: KDAZNK
Peer ID	lc: k51qzi5uqu5dj16qyiq0tajolkojyl9qdkr254920wxv7ghtuwcz593tp69z9m
Peer ID	UC: K51QZI5UQU5DJ16QYIQ0TAJOLKOJYL9QDKR254920WXV7GHTUWCZ593TP69Z9M
Raw UFS	lc: k2cwuec83ymcdya5erizp4urvb7ls2mytq6m2us1404b85f8qwk5tqtk
Raw UFS	UC: K2CWUEC83YMCDYA5ERIZP4URVB7LS2MYTQ6M2US1404B85F8QWK5TQTK
PB UFS	lc: k2jmtxs4gmv7e6433f6bula82cy8eg800nuu2yy7vrk4rd5u011bzxpf
PB UFS	UC: K2JMTXS4GMV7E6433F6BULA82CY8EG800NUU2YY7VRK4RD5U011BZXPF
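
For readers who want to reproduce the first vector, here is a minimal Python sketch of the encoding described above: treat the bytes as a big-endian integer, write it in base 36 using 0-9a-z, and prepend the prefix (`k` for lowercase output, `K` for uppercase, as in the examples). Two things here are assumptions rather than part of this PR's text: the function names are hypothetical, and leading zero bytes are mapped one-to-one to leading `0` digits. The input `01550000` is my reading of the "Nul Cid" vector (CIDv1, raw codec, empty identity multihash).

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36_encode(data: bytes, upper: bool = False) -> str:
    """Encode bytes as prefixed base36 (hypothetical helper, not the go-multibase API)."""
    # Assumption: each leading zero byte becomes one leading '0' digit.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    text = "k" + "0" * zeros + "".join(reversed(digits))
    return text.upper() if upper else text


def base36_decode(text: str) -> bytes:
    """Decode prefixed base36, accepting either letter case."""
    if not text or text[0] not in "kK":
        raise ValueError("missing base36 prefix")
    body = text[1:].lower()
    # Mirror the encoder: leading '0' digits become leading zero bytes.
    zeros = len(body) - len(body.lstrip("0"))
    n = 0
    for ch in body:
        idx = ALPHABET.find(ch)
        if idx < 0:
            raise ValueError(f"invalid base36 character: {ch!r}")
        n = n * 36 + idx
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * zeros + payload


nul_cid = bytes.fromhex("01550000")
print(base36_encode(nul_cid))              # → kdaznk
print(base36_encode(nul_cid, upper=True))  # → KDAZNK
```

Decoding is case-insensitive, so `base36_decode("kdaznk")` and `base36_decode("KDAZNK")` return the same four bytes.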

/cc @aschmahmann who I can't add to reviewers

@Stebalien
Member

LGTM. Can we add a quick spec? We also need to specify that zero padding is done with zeros.

@ribasushi
Contributor Author

Nod, will write one once I finish go-multibase tests in a bit

Uses the alphabet 0-9a-z case-insensitively. The prefix K is chosen
to limit future clashes with English words, based on
https://en.wikipedia.org/wiki/Letter_frequency
@ribasushi ribasushi marked this pull request as ready for review May 22, 2020 00:50
Member

@Stebalien Stebalien left a comment

Nits but otherwise lgtm

multibase.csv (outdated, resolved)
rfcs/Base36.md (outdated, resolved)
@hugomrdias
Member

I will implement this in JS: multiformats/js-multibase#53

@vmx vmx deleted the feat/base36_spec branch May 22, 2020 12:13
Development

Successfully merging this pull request may close these issues.

Subdomain support for CIDs longer than 63
3 participants