This repository has been archived by the owner. It is now read-only.

Why do IPFS hashes start with "Qm"? #22

Closed
moreati opened this Issue Jul 26, 2015 · 27 comments

moreati commented Jul 26, 2015

Answer:

IPFS represents the hashes of files and objects using the Multihash format and Base58 encoding. The letters Qm happen to correspond with the algorithm (SHA-256) and length (32 bytes) used by IPFS.

TODO:

  • Is the prefix always 'Qm'? Yes, in current builds
Member

jbenet commented Jul 27, 2015

Is the prefix always 'Qm'?

No, if objects were hashed with other functions, the prefix would be different. Try out this binary: https://github.com/jbenet/go-multihash/tree/master/multihash
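
To see how the prefix follows from the hash function, here is a minimal Python sketch (not the go-multihash binary linked above). It assumes the function codes 0x11 (sha1), 0x12 (sha2-256) and 0x13 (sha2-512) from the multihash table; the helper names b58encode and multihash_b58 are made up for illustration.

```python
import hashlib

# Bitcoin-style Base58 alphabet (no 0, O, I or l)
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Multihash function codes and the matching hashlib algorithm names
CODES = {"sha1": 0x11, "sha2-256": 0x12, "sha2-512": 0x13}
HASHLIB_NAMES = {"sha1": "sha1", "sha2-256": "sha256", "sha2-512": "sha512"}


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 (hypothetical helper, not a library call)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is written as a leading '1'
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def multihash_b58(data: bytes, name: str) -> str:
    """Build <code><digest length><digest> and Base58-encode it."""
    digest = hashlib.new(HASHLIB_NAMES[name], data).digest()
    return b58encode(bytes([CODES[name], len(digest)]) + digest)


for name in CODES:
    print(name, multihash_b58(b"hello", name))
```

Only the sha2-256 line should start with Qm; the other two functions change both the code byte and the digest length, so the encoded string starts differently.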

Author

moreati commented Jul 27, 2015

Thanks, when does IPFS use another function? So far I've only seen Qm.

Member

jbenet commented Jul 27, 2015

@moreati we use sha2-256, but it's not hard for someone to re-compile ipfs to use another function as a default, or change the importer code to add a way to specify the multihash choice.

@moreati moreati closed this Aug 2, 2015

@RichardLitt RichardLitt reopened this May 2, 2016

@RichardLitt RichardLitt added the answered label May 2, 2016

Member

RichardLitt commented May 2, 2016

I see this question a lot; reopening it and labelling it as 'answered' to increase visibility.

Member

RichardLitt commented Oct 6, 2016

How is Qm the string? I'm a bit confused on that, because I don't see it in any of the tables on the Multihash repo.

Member

Kubuxu commented Oct 6, 2016

It is the base58btc encoding of the two-byte prefix of that multihash.

JustinDrake commented Nov 22, 2016

I'm a bit confused by "The letters Qm happen to correspond with the algorithm (SHA-256) and length (32 bytes)"

Is it that Q corresponds to SHA-256, and m corresponds to 32?

hsanjuan commented Nov 22, 2016

@JustinDrake If I got it right, multihashes start with a byte (0x12) which indicates the hashing algorithm, followed by another byte for the length (0x20). The letters "Qm" are the result of those bytes encoded in base58.

Source: https://github.com/multiformats/go-multihash/blob/master/multihash.go#L146
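
A round trip makes the byte layout easier to see. The following is a minimal Python sketch under the assumptions stated in this thread (function code 0x12, length 0x20, Bitcoin-style Base58); b58encode, b58decode and sha256_multihash are illustrative helper names, not functions from go-multihash.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 (hypothetical helper)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    """Decode a Base58 string back to bytes (hypothetical helper)."""
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def sha256_multihash(data: bytes) -> bytes:
    """Return 0x12 (sha2-256), 0x20 (32 bytes), then the digest."""
    return bytes([0x12, 0x20]) + hashlib.sha256(data).digest()


mh = sha256_multihash(b"any content at all")
text = b58encode(mh)
raw = b58decode(text)

# First two characters, then the two header bytes and the digest length
print(text[:2], hex(raw[0]), hex(raw[1]), len(raw) - 2)
# The header bytes encoded on their own
print(b58encode(b"\x12\x20"))
```

Note that Q does not stand for SHA-256 and m for 32 individually. Base58 treats the whole 34-byte value as one big number, so the two prefix bytes on their own encode to 2P1; Qm only appears once the 32 digest bytes follow them.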

alexanderattar commented Apr 18, 2017

Maybe I just haven't put all the pieces together, but can anyone here explain offhand the steps for precomputing the IPFS hash (under the current build) of some JSON? Thanks in advance!

Member

RichardLitt commented Apr 18, 2017

@alexanderattar You can do that using ipfs add -n, iirc. It doesn't add it to IPFS, it merely spits out the hash for you.

alexanderattar commented Apr 18, 2017

Ah, thanks @RichardLitt! My question is actually about how to approach this programmatically, by running the JSON through the hashing and encoding algorithms without using the IPFS CLI. I want to take some JSON I have in JavaScript and precompute the hash before sending it to IPFS, if that helps explain the use case.

alexanderattar commented Apr 18, 2017

I just want to add that I have tried hashing the JSON with SHA-256 to get a digest such as 4f72333148622e4ae56e9c65d57aee47186cd6910ca080757ab72cc0c650f6bb, prefixed it with 1220, and then taken the entire string with the prefix:

122000c75938d356b000b34e7f7885f8982f29d89af76c234a8d439486b40fdc5469

and after running that through a base58 encoding, I get a hash that resembles an IPFS hash with the Qm prefix, but it is not consistent with what is returned from adding the same JSON to IPFS via the command line. I am wondering if I am doing something wrong, or missing a step. Thanks again!

Member

Kubuxu commented Apr 18, 2017

Try doing the add with the --raw-leaves option instead, but now you have to add a CID at the front. https://github.com/ipld/cid

A raw block has a multicodec of 0x55.

alexanderattar commented Apr 18, 2017

Thanks @Kubuxu, but I just want to reiterate that I am not looking to use the ipfs CLI, but rather to go through the necessary hashing and encoding steps to get from JSON to an IPFS hash. So far my method has been:

Take the JSON and hash it with SHA-256 to get the digest, then prefix the digest with 1220 as described here, so that the entire hex string is the prefix plus the digest, and then base58-encode it. Does any of this approach sound incorrect?
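
The steps just described can be sketched in Python as follows. This is only an illustration of the method in this comment: the document doc is a hypothetical example, and json_multihash and b58encode are made-up helper names. It produces a valid multihash of the serialized bytes, which is not necessarily the hash that ipfs add reports for the same data.

```python
import hashlib
import json

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 (hypothetical helper)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def json_multihash(obj, **dumps_kwargs) -> str:
    """Serialize, SHA-256, prefix with 0x12 0x20, then Base58-encode."""
    payload = json.dumps(obj, **dumps_kwargs).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # The prefix is two raw bytes, not the four ASCII characters "1220"
    return b58encode(b"\x12\x20" + digest)


doc = {"name": "example", "size": 1}  # hypothetical document

compact = json_multihash(doc, separators=(",", ":"))
spaced = json_multihash(doc)  # default separators add spaces

print(compact)
print(spaced)
```

The two results differ even though they describe the same object, because the digest covers the exact bytes of the serialization. Whitespace, key order and a trailing newline in a file all change the hash, so any comparison has to start from byte-identical input.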

Member

Kubuxu commented Apr 18, 2017

IPFS by default also wraps the file you give it in some metadata used by ipfs itself. That is why the hash is different. --raw-leaves uses a CID to communicate to others that the data under the hash has no wrapping.
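
As a rough illustration of that wrapping, here is a Python sketch of what the default importer is assumed to do for a small file: put the bytes in a UnixFS "file" message, put that message in the Data field of a dag-pb node with no links, and hash the result. The field numbers come from the UnixFS and dag-pb protobuf schemas; the 262144-byte single-chunk limit, the helper names, and the restriction to non-empty input are assumptions of this sketch, and other ipfs add options would give other results.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
CHUNK_SIZE = 262144  # assumed default chunk size; larger files get links


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 (hypothetical helper)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def varint(n: int) -> bytes:
    """Protobuf varint: 7 bits per byte, high bit set on all but the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def unixfs_file_block(content: bytes) -> bytes:
    """Wrap content as a single-chunk UnixFS file inside a dag-pb node."""
    if not 0 < len(content) <= CHUNK_SIZE:
        raise ValueError("sketch only covers non-empty single-chunk files")
    unixfs = (
        b"\x08\x02"                                 # field 1: Type = File (2)
        + b"\x12" + varint(len(content)) + content  # field 2: Data
        + b"\x18" + varint(len(content))            # field 3: filesize
    )
    # dag-pb node without links: field 1 (Data) carries the UnixFS message
    return b"\x0a" + varint(len(unixfs)) + unixfs


def guess_ipfs_hash(content: bytes) -> str:
    """sha2-256 multihash of the wrapped block, Base58-encoded."""
    block = unixfs_file_block(content)
    return b58encode(b"\x12\x20" + hashlib.sha256(block).digest())


print(unixfs_file_block(b"hello world\n").hex())
print(guess_ipfs_hash(b"hello world\n"))
```

The hash covers the wrapped block rather than the file bytes alone, which is why hashing the file directly gives a different Qm value. Comparing the output with ipfs add -n on the same small file is the way to check whether these assumptions hold for a given build.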

alexanderattar commented Apr 19, 2017

Wow, I was not aware of that, but that does explain why the hash is different. Is there documentation on the metadata IPFS wraps the file in before generating the SHA256 digest? I tried using the --raw-leaves flag which indeed gives me a different hash:

{"Name":"myfile.txt","Hash":"zb2rhnyuQdBJVhb3j7FAL1NRUrQu4TMkb7zED9S5sh2YCKd62"}

but I have not found any documentation on what algorithms and processing the data goes through to produce this hash.
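
For what it is worth, a hash of this shape can be reproduced with a short Python sketch, assuming the layout hinted at earlier in the thread: a version 1 CID with the raw multicodec 0x55, followed by the sha2-256 multihash of the unwrapped bytes, written in base58btc with the multibase prefix z. The helper names are made up, and the sketch assumes the file fits in a single chunk so that the root is the raw leaf itself.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 (hypothetical helper)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def raw_leaf_cid(content: bytes) -> str:
    """CIDv1 for an unwrapped block: version, codec, multihash."""
    multihash = b"\x12\x20" + hashlib.sha256(content).digest()
    cid = b"\x01" + b"\x55" + multihash  # version 1, raw codec 0x55
    return "z" + b58encode(cid)          # 'z' marks base58btc in multibase


print(raw_leaf_cid(b"contents of myfile.txt\n"))  # hypothetical contents
```

The result should have the same length and the same zb2r opening as the hash shown above, because the four header bytes (0x01, 0x55, 0x12, 0x20) are the same for every such CID.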

madavieb commented May 23, 2017

lautarodragan commented May 29, 2018

@alexanderattar have you had any luck generating the hash simulating the metadata?

alexanderattar commented May 29, 2018

Hi @lautarodragan, I recently revisited this and got some help from someone who was working on something similar using the js-ipfs implementation. Check out this thread: ipfs/js-ipfs#1205. Hope it helps!

lautarodragan commented Jul 4, 2018

Thanks @alexanderattar! Don't know why I didn't see your response earlier, but it's a lot of help. That test you wrote sheds some light on the inner workings of the IPFS hashes. I'll give it a try!

alexanderattar commented Jul 5, 2018

NiKiZe commented Jul 17, 2018

It would be good to have an example implementation of an IPFS multihash tool (let's call it ipfsmh). The goal would be for it to work on a file just as sha256sum or md5 does, so that its output is usable as a digest.

Using ipfs add -n file requires ipfs init, so it is not an option if one only wants to get the hash pre-add.

Background:
What I would actually want to use this for is to make packages (source code for programs) downloadable via ipfs, with legacy http as a fallback. This way it could be populated on the fly without the people creating the packages having to use ipfs in any way.

Stebalien commented Jul 17, 2018

@NiKiZe that sounds like a great idea! Currently, that tool would have to depend on the go-ipfs repo itself but we should be able to extract the requisite components into separate repos eventually.

NiKiZe commented Jul 17, 2018

@Stebalien thanks for your reply.
There are a few parts to this. The ipfsmh could of course be based on the official client as a start; however, as I wrote (but then removed before posting, doh!), having a ipfs client or even go installed where this "needs" to run might not be an option, it needs to be self contained C or Python, which seems to be doable from what I understand?

The code would preferably be small enough that it can be copied over to a different machine by hand without too much effort.
Another reason would be to document how the hashing actually works.
Understandably, this would not implement the latest and greatest logic, but then again it can't (or at least shouldn't) change either if it is used as a hash in the way sha256 is. (Thinking about that part more, there are other things that I might be confused about, but they are not relevant for this issue.)

Stebalien commented Jul 18, 2018

having a ipfs client or even go installed where this "needs" to run might not be an option, it needs to be self contained C or Python, which seems to be doable from what I understand?

I agree that a full ipfs client is not the solution. However, go builds really portable binaries so I don't really see any reason to go with C (let alone Python).

The code would preferably be small enough that it can be copied over to a different machine by hand without too much effort.

There are a lot of options that can affect the resulting hash so the tool would need to replicate all the options available on ipfs add.


Note: if you want something reasonably portable, you can probably build it with js-ipfs and run it with node (or a browser). However, that's going to be a bit clunky.

NiKiZe commented Jul 18, 2018

To run a go binary, go needs to be installed
To run js we need an engine to run that
I agree that it is portable in the sense that one binary runs on anything. However, it is not portable in the sense that it has dependencies that in many cases are not available and can't be made available.
From an educational point of view, however, many people understand Python.
I will tell you more, but again that part is not relevant for this issue. (I will try to ping you on IRC)

Stebalien commented Jul 18, 2018

To run a go binary, go needs to be installed

Nope, go is a compiled language. It just compiles really fast, so you can run go programs with go run myprogram.go. However, you can also compile them into a standalone binary with go build myprogram.go.

To run js we need an engine to run that

Same with Python. Personally, I wouldn't go with either.

@OR13 OR13 referenced this issue Aug 8, 2018

Open

Highlevel Workflow #1
