-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade GlacierScript to python3 #61
Conversation
This reverts commit 971523d. It's not clear why the original developers switched to this older version, but it's incompatible with python3. I'm switching to this modern version of base58.py and plan to update it further next. It appears this base58.py comes from: https://github.com/keis/base58/releases/tag/v0.2.3
Otherwise any change to the Python interpreter can change the order of the outputs in the transaction, causing the developer tests to fail. This changes some of the generated transactions, reversing the order of the two outputs.
Otherwise create-withdrawal-data.fails.test would randomly change the order of the arguments passed to `createrawtransaction` (which are part of the error message in the *.golden file).
(I simply added parentheses around the arguments.)
I started by running `2to3 -w glacierscript.py` which took care of many of the details. Then I had to fix by hand several places where string vs Unicode were mixed.
pipes.quote() has been deprecated since python 2.7
PEP 8 is the Python style guide. Note: base58 is in a separate paragraph since it's not a standard library module like the others are.
For some reason, probably having to do with all the rebases I did on this branch before creating this PR, GitHub is listing the commits out of order. The correct order is: 1a806ba Revert "Swapping base 58 implementation for Gavin Andresen’s" |
|
base58.py
Outdated
alphabet = b'123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz' | ||
|
||
|
||
if bytes == str: # python2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use the sys package instead to check the python version? https://stackoverflow.com/questions/9079036/how-do-i-detect-the-python-version-at-runtime
edit: if this is copied from elsewhere we should have a clear link to it, preferably in the file itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
# Copyright (c) 2015 David Keijser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this now a combination of the old script and https://github.com/trezor/trezor-core/blob/master/src/trezor/crypto/base58.py ?
0 BTC going back to cold storage address 2NGPJX8kzdRpAQJuZWMpCBth1umjQFHeFcz | ||
0.19996310 BTC going to destination address mxBQD1QAYpwiudaCJdRhE9QSW9cokafJ99 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this should change with a python version bump?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same issue from 60e189b, the ordering of the outputs changed for some of the withdrawals. It was effectively random before; now it is consistent and predictable.
"txid": "cc0871827f83c927d79ca8d50c52a72fcaf4223f1de7f92931bfd70f7d44e3b3", | ||
"hash": "cc0871827f83c927d79ca8d50c52a72fcaf4223f1de7f92931bfd70f7d44e3b3", | ||
"txid": "31906179a4f3d983f62cecd9e8fcd4bc3df4969e3d5ec85547b3fecbdb53c51c", | ||
"hash": "31906179a4f3d983f62cecd9e8fcd4bc3df4969e3d5ec85547b3fecbdb53c51c", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did all of these fixtures change? Should they not be compatible with the script after the version bump?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is explained in the commit message for 60e189b. The order of the transaction outputs was dependent on the ordering of a python dict, which is not specified by the language, and effectively random. It may change when the python interpreter version changes. To alleviate this, I switched to using an OrderedDict, and now the ordering of the outputs is consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it. How did we not have tests fail from the unordered python dict before? My understanding is that python 2.7 has the same behavior, please let me know if I'm missing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More than you ever wanted to know about Python dict ordering
In 2.7 it was stable but unspecified. Every run would be the same. The Python3 version we have (3.5.2) uses the random hash seed explained in the SO answer linked above. Every time we run the script, the ordering is randomized, which makes it impossible for us. We need a stable ordering for both the developer tests and so the two quarantined laptops will match each other. Hence OrderedDict.
In Python 3.7 the dict ordering is specified, and we wouldn't need OrderedDict anymore.
Thank you for the review @gracenoah. While I believe all your questions were already answered in the individual commit messages, I've also addressed your comments above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the explanation
Following are some comments about the pull request commit: 62def07
Generally speaking I am a bit skeptical of changing the base58 encoding file to this new one. Are we sure the sourced file is coming from a reliable place? Note also how the source file has a cluttered main function that accepts command line arguments etc... Looks like we need to discuss this more. commit dff8ba6
commit bb0a435 Do we have a convention for when to use "from X import Y" or "import X" and then referencing with either Y or X.Y? Seems like we are confusing two styles with no consistent basis |
You can make comments directly on the source code lines during a review, like @gracenoah did above. That would make it easier for me to reply to each individually. Re 62def07, the old base58.py did not support python3, so I updated it to a later version from a "reliable" open-source repo. I didn't want to make any edits to that file, only copy it verbatim. It has some unneeded stuff, yes, but I figured it was better to use it as-is than to make edits which would require further scrutiny. Re dff8ba6, the python 2to3 conversion program made this change, and it's important since the code modifies the dict inside the loop. If I remove it, python complains:
Re bb0a435: I'm not aware of any style guidelines for imports like this (though I haven't gone looking, but pylint et al have no checks for this). Nor do I think any consistency is needed, since it's clear what's being imported in each case. (But Thank you for the detailed review. |
I see. Regarding the list remark a more direct way would perhaps be to change the for loop and if statement to
As far as the reliability of the base58 code, how was that determined? I am a bit skeptical to just include code from another repo unless we have an objective measure of its reliability. Otherwise, I'd much rather see that we take the code, refactor it, include it as our internal library, remove whatever is not needed and actually properly review what was created. |
This way, we have less code to review.
I tried the following:
The problem is, this creates a new dict, then passes that dict into OrderedDict. In python 3.5.2, dict ordering is random, which creates problems. (See discussion above.) That's why I switched to OrderedDict in the first place. For But I take your point, so I pushed some changes to a new branch in my fork. Do you think I should include those changes here? The file is greatly simplified now. |
Gotcha, missed the ordering part. In that case I think you can do
to achieve the required ordering. If by including them here you mean in this PR, then I'd say yes. Personally I like this structuring much better this way. We can then take a more in depth look into the code and see if all fits. |
Can you paste the last line of
(Note: any rpcport should match the regexp in the last line of |
Also I noticed that in case of a failure like this one, the reported line number was incorrect (off by one), so I just pushed a change to fix that. |
|
Somewhere between python 3.5.3 and python 3.7.4, they added a period to the end of this error message. Using a regexp like I did here will allow this test to pass on either version.
Looks like the difference is in the very last byte, a period at the end of the error message from python. The quarantined laptops with Ubuntu 16.04.1 have python 3.5.2 so I've been testing with 3.5.3 which I already had installed locally. I tried 3.7.4 and I saw the same failure. I pushed a change to optionally allow the period at the end of the line and now the tests pass for me with both 3.5.3 and 3.7.4. |
Also, I have a change in another branch that improves this cryptic error message, but it's probably not appropriate to include in this PR which is only about the python version upgrade. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tACK
@bitcoinhodler please resolve the conflict by removing the import of random library I'll also note that I tried running tests with that conflict resolved and got a failure:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs conflicts fixed and retested
What's the policy on rebasing? I could rebase my entire change onto master, and keep a clean history, which seems to be generally preferred among Git intelligentsia. But that will make review more difficult since (to be safe) you'll have to go over my entire change again. Alternatively I could merge and patch up the two minor issues as part of the merge commit. |
I went with the latter policy and merged in the latest master into this branch. Alternatively, I created a python3-rebased branch that I could force-push over this one if we wanted to be cleaner about the history. |
I realized we were always calling encode('ascii') on every parameter passed into these functions. So why not do that inside the function?
What is needed to move this PR along? |
One more review would be nice if @jacoblyles @jazarija @gracenoah are available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to use encode here, given that hash_sha256 already does that?
I'm not sure how you left these comments on github, but it's not showing me what lines/changes you're referring to. My latest change on this branch (b75c400) took out a couple of unnecessary encodes. Does that address your concern? |
And while you're all here, how about a review on #76 and #73 that build atop this one. I opened those before this one was merged so that people could review all of them at once, in an attempt to accelerate the pace of reviews. I've got 6 more PRs ready for review after that, and then we get to PSBT, which I have working already. |
@bitcoinhodler - if you are looking for testers for the new combined PRs? Do you have them all merged in a single branch? Is there an update to the PDF that goes with those? How can I help? :D |
I'm looking for reviewers, mainly. Read through my commits and make sure I haven't broken anything, or introduced any new weaknesses or backdoors (intentional or not). Each of my three open PRs builds upon the previous (though they are out of order). Review this one first, then #76, then #73. The branch for #73 includes all the commits from all three PRs. For these three PRs, and the next six I have ready to submit, no PDF updates are needed. It's not until we get to PSBT that the process changes at all, and that will no doubt be documented separately (as an optional appendix, perhaps) until it's gone through a public review process. |
I ran the tests and reviewed the changed and comments. Thanks @bitcoinhodler for doing this and responding to all the inquiries on the thread. LGTM |
GlacierScript was unnecessarily using an old version of Python (2.7.12), while Python 3.5.2 is available as
python3
.I successfully ran all the developer tests on my own live boot of Ubuntu 16.04.1 (same as the protocol specifies).
Let's bring GlacierScript into modern times with Python 3.
As usual, I recommend that reviewers go one commit at a time, as the commit messages explain both what & why for each change.