Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update proof parameters file to most recent version. #1595

Merged
merged 2 commits into from Jun 29, 2022

Conversation

lemmih
Copy link
Contributor

@lemmih lemmih commented Jun 28, 2022

Summary of changes
Changes introduced in this pull request:

  • Some parameter files were apparently implicit in the past but have to be explicit now. Not exactly sure but this update should fix the sync issues.

Reference issue to close (if applicable)

Other information and links

@codecov
Copy link

codecov bot commented Jun 28, 2022

Codecov Report

Merging #1595 (c592cee) into main (a61b376) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1595   +/-   ##
=======================================
  Coverage   34.49%   34.49%           
=======================================
  Files         208      208           
  Lines       23384    23384           
=======================================
  Hits         8066     8066           
  Misses      15318    15318           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a61b376...c592cee. Read the comment docs.

@elmattic
Copy link
Contributor

Good! What about removing "v28-fil-inner-product-v1.srs" entry as well?

@lemmih
Copy link
Contributor Author

lemmih commented Jun 28, 2022

Good! What about removing "v28-fil-inner-product-v1.srs" entry as well?

It failed to sync for me without it.

@elmattic
Copy link
Contributor

Good! What about removing "v28-fil-inner-product-v1.srs" entry as well?

It failed to sync for me without it.

I managed to sync without but maybe we can keep it and be conservative.

@lemmih
Copy link
Contributor Author

lemmih commented Jun 28, 2022

Good! What about removing "v28-fil-inner-product-v1.srs" entry as well?

It failed to sync for me without it.

I managed to sync without but maybe we can keep it and be conservative.

It works for you if you remove v28-fil-inner-product-v1.srs from the parameters json file and delete /var/tmp/filecoin-proof-parameters/?

@elmattic
Copy link
Contributor

Good! What about removing "v28-fil-inner-product-v1.srs" entry as well?

It failed to sync for me without it.

I managed to sync without but maybe we can keep it and be conservative.

It works for you if you remove v28-fil-inner-product-v1.srs from the parameters json file and delete /var/tmp/filecoin-proof-parameters/?

My bad, I forget to recompile the client after my change (didn't saw the include_str).

@elmattic
Copy link
Contributor

Good! What about removing "v28-fil-inner-product-v1.srs" entry as well?

It failed to sync for me without it.

I managed to sync without but maybe we can keep it and be conservative.

It works for you if you remove v28-fil-inner-product-v1.srs from the parameters json file and delete /var/tmp/filecoin-proof-parameters/?

Yes, it works for me and the file v28-fil-inner-product-v1.srs is still downloaded even though
I have removed the entry in parameters.json .

@LesnyRumcajs
Copy link
Member

@lemmih @elmattic any wrap-up on why it caused an error, where in the system the artifacts were persisted (and grabbed by Forest)? We should understand this in full lest we risk it happening again. And perhaps add some assertions if it's feasible.

On a testing side note, once the nightly check machine is back I'll change it to not build Forest but to take the image from the registry, ensuring clean state. We would be able to catch this class of bugs sooner.

@elmattic
Copy link
Contributor

Can confirm it's not working without v28-fil-inner-product-v1.srs (My bad, I was modifying the wrong repo previously). Accepting your commit.

@lemmih
Copy link
Contributor Author

lemmih commented Jun 29, 2022

Brief post-mortem:

Guillaume and I were able to sync Forest to mainnet but Hubert saw state root mismatches with the exact same code. No matter how many times we cleared the database and built Forest from scratch, the two different outcomes wouldn't go a way.

We faultily assumed all our data was stored in $DATA_DIR and clearing this folder would give us identical environments. It turns out that important parameter files are stored in /var/tmp/filecoin-proof-parameters. These files are downloaded from IPFS by forest according to a builtin parameters.json file. This parameters file is part of rust-fil-proofs1 and was changed five months ago in their repository. We aren't aware of this change (or even of the existence of this file) and didn't update our copy. However, anyone who had tried to sync before February 2nd would have already downloaded the needed files and wouldn't see any failures. The docker image downloads the parameters on demand which explains why it couldn't sync to mainnet while both Guillaume and I still could.

It's a significant problem that a parameter mishap leads to such hard-to-debug state root mismatches rather than fail with informative error messages. Skimming our code as well as that of rust-fil-proofs, I don't see any trivial ways of checking whether we have the most up-to-date parameters.

Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as a workaround, but this JSON is not really maintained by us, so in my opinion, it should be fetched during e.g. build phase or the relevant repo should be added as a submodule.

@lemmih
Copy link
Contributor Author

lemmih commented Jun 29, 2022

LGTM as a workaround, but this JSON is not really maintained by us, so in my opinion, it should be fetched during e.g. build phase or the relevant repo should be added as a submodule.

Yeah. We can't use the un-edited json file, though. We need to add the entry for "v28-fil-inner-product-v1.srs" (reasons still unknown).

@lemmih
Copy link
Contributor Author

lemmih commented Jun 29, 2022

I'll merge this PR but this is not done. We 100% need something better.

@lemmih lemmih merged commit 26c85e8 into main Jun 29, 2022
@lemmih lemmih deleted the lemmih/update-proof-parameters branch June 29, 2022 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants