Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add full UTF-8 support to RPC #7892

Merged
merged 5 commits into from Jun 16, 2016

Conversation

@laanwj
Copy link
Member

laanwj commented Apr 16, 2016

(the univalue patch should be taken upstream - the RPC test and doc change should end up here. Posting it here for visibility and because this fixes an ancient issue: #2127. I think this is important to many non-English users)

This adds full UTF-8 support for both on input and output through JSON.

bitcoin-cli usage

Before:

$ src/bitcoin-cli -datadir=/store/tmp/testbtc getnewaddress "рыба"
1HQCE3H87fZ1er1ExLiF5V4a1Kxf46r3J2
$ src/bitcoin-cli -datadir=/store/tmp/testbtc listaccounts
{
...  "\u00c3\u0083\u00c2\u0091\u00c3\u0082\u00c2\u0080\u00c3\u0083\u00c2\u0091\u00c3\u0082\u00c2\u008b\u00c3\u0083\u00c2\u0090\u00c3\u0082\u00c2\u00b1\u00c3\u0083\u00c2\u0090\u00c3\u0082\u00c2\u00b0": 0.00000000
}
$ src/bitcoin-cli -datadir=/store/tmp/testbtc getaccount 1HQCE3H87fZ1er1ExLiF5V4a1Kxf46r3J2
�����±�°

After:

$ src/bitcoin-cli -regtest getaccountaddress "рыба"
mrVZ8GaURJmL3LEUu6ygU2CrUrZ8979iK2
$ src/bitcoin-cli -regtest listaccounts
{
...
  "рыба": 0.00000000
}
$ src/bitcoin-cli -regtest getaccount mrVZ8GaURJmL3LEUu6ygU2CrUrZ8979iK2                                       
рыба

GUI

This also affects the GUI debug console.

Before:

i18n_before

When strings are passed directly (such as for getaccount's return argument) it works fine, but when they go through JSON formatting/parsing, it fails.

After:

i18n_after

RPC test

A test for this new functionality has been added to the wallet.py test.

See the commit message of the first commit for details on what exactly had to be changed in univalue.

@laanwj laanwj added the RPC/REST/ZMQ label Apr 16, 2016
@jonasschnelli

This comment has been minimized.

Copy link
Member

jonasschnelli commented Apr 16, 2016

Nice!
Concept ACK.

Not sure how we should treat the UniValue changes. Should we try to keep the "upstream" repository bitcoin/univalue in sync?

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Apr 16, 2016

Concept ACK. Tests look good.

"upstream" repository bitcoin/univalue in sync

The patch goes to jgarzik/univalue as bitcoin/univalue is only for bitcoin related patches. (Still in the end we need to merge jgarzik/univalue into bitcoin/univalue because we use this as subtree.)

@laanwj laanwj referenced this pull request Apr 18, 2016
@laanwj laanwj force-pushed the laanwj:2016_04_i18n_unicode_rpc branch Apr 18, 2016
@laanwj laanwj force-pushed the laanwj:2016_04_i18n_unicode_rpc branch Apr 28, 2016
@gmaxwell

This comment has been minimized.

Copy link
Contributor

gmaxwell commented May 17, 2016

Neat. What happens with characters like unicode direction modifiers? Can you end up with your terminal in a weird state?

@MarcoFalke
MarcoFalke reviewed May 17, 2016
View changes
qa/rpc-tests/wallet.py Outdated
@@ -1,4 +1,6 @@
#!/usr/bin/env python2
# coding=utf-8
# ^^^^^^^^^^^^ TODO remove when supporting only Python3

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke May 17, 2016

Member

Needs rebase.

@laanwj

This comment has been minimized.

Copy link
Member Author

laanwj commented May 18, 2016

Neat. What happens with characters like unicode direction modifiers? Can you end up with your terminal in a weird state?

I don't think that's possible with just unicode?

ANSI escape sequences are a whole different story. In JSON objects they'll be escaped, but when printing strings (say, for help, when there is a single string result) everything is let through. Not sure whether this should be changed, but this would be a change local to bitcoin-cli, not the unicode framework. I'd suggest doing so in a separate pull.

Nothing new here though: It's always been possible to pass any kind of crap characters from the server (just look at my OP) to bitcoin-cli, this just makes the round-trip sane and conform to UTF-8.

@dcousens

This comment has been minimized.

Copy link
Contributor

dcousens commented May 19, 2016

concept ACK, once-over utACK (did not cross-verify UTF-8 impl.).

Could probably use more tests for that JSONUTF8Writer...

@laanwj

This comment has been minimized.

Copy link
Member Author

laanwj commented May 20, 2016

Could probably use more tests for that JSONUTF8Writer...
ing

There are few direct tests for it, but all the other tests that parses JSON with strings in it is testing it.
What would be nice is doing some fuzzing using https://github.com/laanwj/univalue/tree/2015_11_unifuzz and see that everything is accepted is valid utf-8.

@gmaxwell

This comment has been minimized.

Copy link
Contributor

gmaxwell commented May 25, 2016

@pstratem Any interest in fuzzing this?

@sipa

This comment has been minimized.

Copy link
Member

sipa commented May 25, 2016

Needs rebase.

@laanwj laanwj force-pushed the laanwj:2016_04_i18n_unicode_rpc branch Jun 9, 2016
@laanwj

This comment has been minimized.

Copy link
Member Author

laanwj commented Jun 9, 2016

Rebased, done and removed the Python3 TODOs

laanwj added 5 commits Jun 10, 2016
f32df99 Merge branch '2016_04_unicode' into bitcoin
280b191 Merge remote-tracking branch 'jgarzik/master' into bitcoin
c9a716c Handle UTF-8
bed8dd9 Version 1.0.2.
5e7985a Merge pull request #14 from laanwj/2015_11_escape_plan

git-subtree-dir: src/univalue
git-subtree-split: f32df99e96d99ab49e5eeda16cac93747d388245
Add a setting ensure_ascii to AuthServiceProxy. This setting,
defaulting to True (backwards compatible),
is passed through to json.dumps. If set to False, non-ASCII characters
>0x80 are not escaped. This is useful for testing server
input processing, as well as slightly more bandwidth friendly in case of
heavy unicode usage.
@laanwj laanwj force-pushed the laanwj:2016_04_i18n_unicode_rpc branch to 7982fce Jun 10, 2016
@laanwj

This comment has been minimized.

Copy link
Member Author

laanwj commented Jun 10, 2016

Ok:

This should be ready now.

@gmaxwell

This comment has been minimized.

Copy link
Contributor

gmaxwell commented Jun 15, 2016

ACK

@laanwj laanwj merged commit 7982fce into bitcoin:master Jun 16, 2016
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
laanwj added a commit that referenced this pull request Jun 16, 2016
7982fce doc: Mention full UTF-8 support in release notes (Wladimir J. van der Laan)
6bbb4ef test: test utf-8 for labels in wallet (Wladimir J. van der Laan)
a406fcb test: add ensure_ascii setting to AuthServiceProxy (Wladimir J. van der Laan)
60ab9b2 Squashed 'src/univalue/' changes from 2740c4f..f32df99 (Wladimir J. van der Laan)
@laanwj

This comment has been minimized.

Copy link
Member Author

laanwj commented Jun 16, 2016

It looks like the build system doesn't detect changes to univalue source files and will not automatically rebuild the library: if you get RPC test failures concerning unicode, please build from a clean tree

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Jun 16, 2016

make clean
make distclean

should take care of this, I assume.

@laanwj

This comment has been minimized.

Copy link
Member Author

laanwj commented Jun 16, 2016

Yes, make clean should be enough.

kyuupichan referenced this pull request in kyuupichan/BitcoinUnlimited Apr 30, 2017
7982fce doc: Mention full UTF-8 support in release notes (Wladimir J. van der Laan)
6bbb4ef test: test utf-8 for labels in wallet (Wladimir J. van der Laan)
a406fcb test: add ensure_ascii setting to AuthServiceProxy (Wladimir J. van der Laan)
60ab9b2 Squashed 'src/univalue/' changes from 2740c4f..f32df99 (Wladimir J. van der Laan)
codablock added a commit to codablock/dash that referenced this pull request Dec 28, 2017
This was probably missed while backporting Bitcoin PR bitcoin#7892
andvgal added a commit to energicryptocurrency/energi that referenced this pull request Jan 6, 2019
This was probably missed while backporting Bitcoin PR bitcoin#7892
@LitecoinZ LitecoinZ referenced this pull request Apr 30, 2019
53 of 77 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.