[politeia_verify] no longer verifies file contents #718
In the process of creating a page in dcrdocs on how to verify censorship, I attempted to verify a sample proposal. I was able to successfully verify using the -json method:
However, when I use the other method (directly passing in filenames along with censorship variables), it fails to verify:
I have tried this using sample proposals uploaded using politeiagui (running locally) as well as politeiawwwcli, to the same result.
This is problematic, as passing in files appears to be the only feasible way to verify the content of a censored proposal. The owner of a censored proposal can't show the community a string of random characters in a JSON payload as proof of censorship.
It appears that for politeiagui (the main way users will upload proposals), this is because the way we encode a proposal has diverged from the way politeia_verify encodes it. The UI takes in data through fields in a web form (title, body in Markdown, attachments), and creates a proposal file that is then used to calculate the Merkle root. The user can then download the 'Proposal Bundle', but this is just a json with a payload of random characters.
Here's an example JSON:
So it has the censorship record and the digest and payload (which are derived from the raw proposal data/files I believe, but afaik, not reversible back to the raw data).
Digest is clearly not easily reversible, but the payload probably is. Looks like the payload is derived from the single index.md file and each file is encoded separately. Perhaps it's some baseXX encoding? What was in the
This field doesn't explicitly specify the type of hash used (e.g. `"sha256": "blabla"`). Weird.
So it appears the payload is just a base64 encoded markdown file. If I take the payload from the above JSON and copy/paste it into this online base64 decoder, I get this:
Which is the title I gave my proposal (
I haven't tested it with attachments (e.g. an image), but it looks like the function that creates the proposal would just append attachments after the markdown...
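Scripting that decode step is trivial. A minimal sketch, with a made-up payload standing in for the `payload` field from the JSON bundle (not the actual one above):

```python
import base64

# Hypothetical payload, standing in for the "payload" field of a proposal
# file in the downloaded JSON bundle.
payload = base64.b64encode(b"# My Test Proposal\n\nProposal body.").decode()

# Decoding it recovers the original index.md contents.
index_md = base64.b64decode(payload).decode("utf-8")
print(index_md)
```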
So...I suppose someone looking to prove censorship could base64 decode their payload from the JSON. Then run politeia_verify on that same json using the
It looks like we could get politeia_verify working again if we base64 encode the files before hashing them, as we do in politeiagui. The only problem there is that the payload doesn't just consist of uploaded files anymore. The user now inputs the title and markdown into the UI, and politeiagui creates the payload on the backend. Additionally, there's an open PR to add a summary field when you're creating a proposal, adding more data through the UI. Presumably it's a better user experience creating the proposal through the UI. But I think we now need to figure out some way for the user to input all that data back into politeia_verify.
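To make the divergence concrete: hashing a file's raw bytes and hashing its base64 encoding produce different digests, so both sides have to agree on exactly which bytes get hashed before the merkle roots will match. A quick illustration (not politeia's actual code):

```python
import base64
import hashlib

raw = b"# My Test Proposal\n"

# Digest of the raw file bytes vs. digest of the base64-encoded bytes:
# any mismatch in this choice between politeiagui and politeia_verify
# produces different merkle roots and breaks verification.
digest_raw = hashlib.sha256(raw).hexdigest()
digest_b64 = hashlib.sha256(base64.b64encode(raw)).hexdigest()

assert digest_raw != digest_b64
```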
Wouldn't swapping the JSON payload change the censorship token? Can the user generate the token himself from the payload?
Side question: did the now-broken verification flow ever work?
The token is just a random number generated on the backend. The merkle field is the "ordered merkle root of all files in the record" (hashes of the raw files concatenated together). The signature field is just 'merkle+token' (the concatenated string of merkle and token), cryptographically signed by the pi server key I believe. So this signature is really what proves that this unique (due to the randomness of the censorship token) proposal was uploaded at a specific time (time proven by the fact that this censorship record was anchored at a specific block on the Decred blockchain using dcrtime). Phew!
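The pieces above can be sketched for the single-file case. This assumes sha256 digests and a standard merkle tree, where a one-leaf tree's root is just the leaf; the token below is made up:

```python
import hashlib

index_md = b"# My Test Proposal\n\nProposal body."

# Digest of the raw file; for a record containing only index.md the
# merkle root reduces to this single digest (one-leaf tree assumption).
digest = hashlib.sha256(index_md).hexdigest()
merkle = digest

# Hypothetical random censorship token generated by the backend.
token = "6a9f" * 16

# The message the server signs: merkle root concatenated with the token.
signed_message = merkle + token
print(signed_message)
```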
So the payload cannot be used to generate the token. As @lukebp has pointed out, 'censorship token' is kind of a misnomer, as it's the entire 'censorship record' that is needed to prove censorship. I actually updated the Politeia docs recently to replace 'censorship token' with 'censorship record', btw.
The way the user really proves censorship is by regenerating the merkle root from their files. With the merkle root and the token (which they have from the JSON), they can then verify the server's signature over 'merkle'+'token' (which is what is stored on the blockchain?). Anyway, this is actually what politeia_verify does under the hood.
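That under-the-hood check can be sketched like this. Python's stdlib has no ed25519, so a keyed hash stands in for the real public-key signature purely to show the shape of the flow; in the real system only the server holds the signing key and anyone can verify with its public key:

```python
import hashlib
import hmac

SERVER_KEY = b"hypothetical-server-key"  # stand-in for the server's signing key


def sign(merkle: str, token: str) -> str:
    """Server side: sign merkle+token (stand-in for real signing)."""
    return hmac.new(SERVER_KEY, (merkle + token).encode(), hashlib.sha256).hexdigest()


def verify(merkle: str, token: str, signature: str) -> bool:
    """Client side: recompute the merkle root from the files, then check
    the signature over merkle+token from the censorship record."""
    expected = hmac.new(SERVER_KEY, (merkle + token).encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


merkle = hashlib.sha256(b"# My Test Proposal\n").hexdigest()
token = "6a9f" * 16
sig = sign(merkle, token)
assert verify(merkle, token, sig)        # record checks out
assert not verify("0" * 64, token, sig)  # tampered merkle root fails
```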
I'm sure it did at one point. But the last code commit was 9 months ago. What appears to have happened in that time is that politeia and politeiagui continued evolving and diverged from politeia_verify (so far the focus has been on launching and iterating features, not verification). The problem is that politeiagui now encodes proposals and calculates the merkle root in a different way. This relates back to the convo you and @lukebp are having in this Issue about metadata. If the user inputs metadata through the UI, and it isn't put into a file, we no longer have a file that contains all the proposal data, and our current mechanism of verifying censorship breaks down.
Going back to "first principles", what we need to accomplish is: a user needs a way to cryptographically verify censorship in a way that doesn't present an unreasonable technical burden.
The problem with our current strategy is that users are not uploading raw source files anymore. They're doing it through the GUI. The way I see it, there are two basic solutions: