Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Name/value encodings for Namecoin RPCs #246

Closed
domob1812 opened this issue Jul 29, 2018 · 5 comments
Closed

Name/value encodings for Namecoin RPCs #246

domob1812 opened this issue Jul 29, 2018 · 5 comments

Comments

@domob1812
Copy link

domob1812 commented Jul 29, 2018

We should implement options to specify the encodings of names and values for the RPC (and perhaps REST) interfaces, both for "read" (e.g. name_show but also decodescript) and "write" (e.g. name_update) RPCs. These encodings should be possible:

  • ascii, which only allows printable ASCII characters, i.e. with a code between 0x20 and 0x7f (both inclusive).
  • utf8, which allows all valid UTF-8 strings. This encoding still ensures that returned JSON is valid. This corresponds to the current behaviour.
  • hex, which allows arbitrary binary data but encodes it in the RPC as hex digits.

If invalid values are inputs to an RPC (i.e. argument values), then the RPC will fail. If invalid values are outputs (e.g. name and value in the result of name_show), then the offending field(s) will not be returned and instead a name_error and/or value_error field will be set.

If not specified, then both the name and the value encoding should default to ascii, to prevent "silent" errors in software that is not properly handling non-ASCII data. It should be possible to change the default through namecoin.conf. The encoding to use for name and/or value in a single RPC can also be specified explicitly with the JSON options argument that is already present for write-RPCs and will be added for read-RPCs.

This is a part of the extensions proposed in #194 (but a sufficiently complex change to have its own issue).

@domob1812 domob1812 self-assigned this Jul 29, 2018
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Jul 30, 2018
This is a partial cherry-pick of
xaya/xaya@f436174,
and exposes the UTF-8 validation logic in univalue publicly.  Based on
this, we can implement validation of names/values in the "utf8" mode
as proposed in namecoin#246
in a follow-up change.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Jul 30, 2018
The new files names/encoding.h/.cpp contain functions to handle name
and value encodings in the RPC (and REST) interfaces.  Roughly speaking,
they implement the "main logic" for
namecoin#246.

Actual usage of these to support encodings in the external interfaces
will be added in follow-up changes.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Jul 30, 2018
The new files names/encoding.h/.cpp contain functions to handle name
and value encodings in the RPC (and REST) interfaces.  Roughly speaking,
they implement the "main logic" for
namecoin#246.

Actual usage of these to support encodings in the external interfaces
will be added in follow-up changes.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 1, 2018
This is a partial cherry-pick of
xaya/xaya@f436174,
and exposes the UTF-8 validation logic in univalue publicly.  Based on
this, we can implement validation of names/values in the "utf8" mode
as proposed in namecoin#246
in a follow-up change.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 1, 2018
The new files names/encoding.h/.cpp contain functions to handle name
and value encodings in the RPC (and REST) interfaces.  Roughly speaking,
they implement the "main logic" for
namecoin#246.

Actual usage of these to support encodings in the external interfaces
will be added in follow-up changes.
@domob1812
Copy link
Author

domob1812 commented Aug 2, 2018

Actually, I think utf8 should perhaps allow non-printable characters as well. That matches the current behaviour, and might be useful to allow for text with line breaks and stuff like that. (Whether or not that is useful byitself in Namecoin is a different question, but it seems at least useful to support that easily through the RPC interface.)

I've updated the original report accordingly.

domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 2, 2018
Instead of using ValtypeFromString to convert names/values given in the
RPC interface to valtype, call DecodeName and take the configured
encoding into account.

(The input-side for the REST interface is using a custom function to
handle URL-escaped names and thus not related to this change.)

This implements all changes required for
namecoin#246 from the "input"
side.  The "output side" (e.g. "name" and "value" fields in the result
of name_show) is still missing.

With this change, all uses of ValtypeFromString have been removed except
for the one in DecodeName itself and the unit test.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 3, 2018
The new files names/encoding.h/.cpp contain functions to handle name
and value encodings in the RPC (and REST) interfaces.  Roughly speaking,
they implement the "main logic" for
namecoin#246.

Actual usage of these to support encodings in the external interfaces
will be added in follow-up changes.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 3, 2018
Instead of using ValtypeFromString to convert names/values given in the
RPC interface to valtype, call DecodeName and take the configured
encoding into account.

(The input-side for the REST interface is using a custom function to
handle URL-escaped names and thus not related to this change.)

This implements all changes required for
namecoin#246 from the "input"
side.  The "output side" (e.g. "name" and "value" fields in the result
of name_show) is still missing.

With this change, all uses of ValtypeFromString have been removed except
for the one in DecodeName itself and the unit test.
@domob1812
Copy link
Author

domob1812 commented Aug 4, 2018

I thought about the "output" side of this some more. The original proposal was/is to fail the RPC when the requested name (or its value) is not valid with respect to one of the chosen encodings. But I think this is not a good approach. One particular area where this is bad is for name_scan:

Of course, if the requested start name ("input" to the RPC) is invalid, the RPC will fail. But I think it would be highly confusing and error-prone for client software if the RPC would then either fail or silently skip names in the result if their values were not matching the encoding. (Especially if the client tried to enumerate all names, which I think is one of the main uses of name_scan.)

Thus, I propose the following (and will update the original post accordingly): RPCs only fail if the input data (i.e. arguments to the RPC) are encoded in an invalid way. For invalid names or values in the result of RPCs (i.e. the name and value fields), the RPC instead returns a name_error or value_error field.

This approach still has the property that naively implemented client software cannot choke on or be attacked with unexpected data, as the invalid name/value won't be present. By adding a name_error/value_error field, it is also clear what happened from a manual look (unlike if we just left out the offending fields). This was recently implemented for Huntercoin in a stripped-down form.

@JeremyRand and others: What do you think about this proposal?

domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 4, 2018
The new files names/encoding.h/.cpp contain functions to handle name
and value encodings in the RPC (and REST) interfaces.  Roughly speaking,
they implement the "main logic" for
namecoin#246.

Actual usage of these to support encodings in the external interfaces
will be added in follow-up changes.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 4, 2018
Instead of using ValtypeFromString to convert names/values given in the
RPC interface to valtype, call DecodeName and take the configured
encoding into account.

(The input-side for the REST interface is using a custom function to
handle URL-escaped names and thus not related to this change.)

This implements all changes required for
namecoin#246 from the "input"
side.  The "output side" (e.g. "name" and "value" fields in the result
of name_show) is still missing.

With this change, all uses of ValtypeFromString have been removed except
for the one in DecodeName itself and the unit test.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 5, 2018
Use the configured name/value encodings also for return values of RPCs
(and the /name/ REST handler in JSON format).

This mostly finishes the implementation of the proposal in
namecoin#246, except for making
the encoding configurable per-RPC through an options argument.

In particular, we now use the encoding options when populating the
'nameOp' JSON object for decoded raw transcations and scripts, and for
returning data from name_show, name_scan and the likes.

Those JSON objects got new fields 'name_encoding' and 'value_encoding',
so that the 'name' and 'value' fields can be interpreted correctly.
When a name/value cannot be encoded for the chosen encoding, then instead
of setting 'name'/'value', we set 'name_error' or 'value_error' to
indicate this.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 5, 2018
Add a new entry for the Namecoin release notes mentioning the new
encoding options (and the change that occured from "basically utf8"
to "ascii by default").  For details, the entry links to the Github
issue at namecoin#246.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 6, 2018
This is a partial cherry-pick of
xaya/xaya@f436174,
and exposes the UTF-8 validation logic in univalue publicly.  Based on
this, we can implement validation of names/values in the "utf8" mode
as proposed in namecoin#246
in a follow-up change.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 6, 2018
The new files names/encoding.h/.cpp contain functions to handle name
and value encodings in the RPC (and REST) interfaces.  Roughly speaking,
they implement the "main logic" for
namecoin#246.

Actual usage of these to support encodings in the external interfaces
will be added in follow-up changes.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 6, 2018
Instead of using ValtypeFromString to convert names/values given in the
RPC interface to valtype, call DecodeName and take the configured
encoding into account.

(The input-side for the REST interface is using a custom function to
handle URL-escaped names and thus not related to this change.)

This implements all changes required for
namecoin#246 from the "input"
side.  The "output side" (e.g. "name" and "value" fields in the result
of name_show) is still missing.

With this change, all uses of ValtypeFromString have been removed except
for the one in DecodeName itself and the unit test.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 6, 2018
Use the configured name/value encodings also for return values of RPCs
(and the /name/ REST handler in JSON format).

This mostly finishes the implementation of the proposal in
namecoin#246, except for making
the encoding configurable per-RPC through an options argument.

In particular, we now use the encoding options when populating the
'nameOp' JSON object for decoded raw transcations and scripts, and for
returning data from name_show, name_scan and the likes.

Those JSON objects got new fields 'name_encoding' and 'value_encoding',
so that the 'name' and 'value' fields can be interpreted correctly.
When a name/value cannot be encoded for the chosen encoding, then instead
of setting 'name'/'value', we set 'name_error' or 'value_error' to
indicate this.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Aug 6, 2018
Add a new entry for the Namecoin release notes mentioning the new
encoding options (and the change that occured from "basically utf8"
to "ascii by default").  For details, the entry links to the Github
issue at namecoin#246.
domob1812 added a commit that referenced this issue Aug 11, 2018
4cf815e Add release notes entry for encodings. (Daniel Kraft)
810bc13 Support -name/valueencoding for namecoin-tx. (Daniel Kraft)
20ab39a Handle encodings on the output side. (Daniel Kraft)
482c30e Explicitly use UTF-8 for name_filter regexp check. (Daniel Kraft)
ee7fba9 Introduce EncodeNameForMessage for logs / errors. (Daniel Kraft)
80f8ba0 Move names/encoding library to common lib. (Daniel Kraft)
0688f01 Remove ValtypeFromString. (Daniel Kraft)
9367b48 Use configured encoding for "input" names/values. (Daniel Kraft)
7e364eb Add -nameencoding / -valueencoding options. (Daniel Kraft)
b6b8483 Add new names/encoding source files. (Daniel Kraft)
0a1c72e Expose univalue's UTF-8 validation logic. (Daniel Kraft)

Pull request description:

  This implements the main part of the proposal in #246:  New `-nameencoding` and `-valueencoding` options are added, with the possible values of `ascii`, `utf8` and `hex`.  They specify in which encoding name and value data is expected in RPC arguments and returned in JSON fields.  This fixes a long-standing issue Namecoin had with binary data that was invalid UTF-8 (as well as adding proper support for real binary data in the RPC interface).  As a consequence, this likely resolves (or is at least a big part of the solution of) #152 and #170.

  Still missing for a full solution of #246 is the possibility to specify per-RPC encodings in an `options` RPC argument.  For now, the encodings can only be set for the full process at startup time through the config options.

Tree-SHA512: b221c3ccf421517be3419aec568e02475c2562fa1e99a70b7c53332ce343dc78a537285183262b270fb676f7c39cf16a9e58cb8160e2e4653211d0daa77e0437
@domob1812
Copy link
Author

With #249 merged, a big part of this is implemented now. Still missing is to support per-RPC override of the encodings through an options argument (as per #194).

@JeremyRand
Copy link
Member

Thus, I propose the following (and will update the original post accordingly): RPCs only fail if the input data (i.e. arguments to the RPC) are encoded in an invalid way. For invalid names or values in the result of RPCs (i.e. the name and value fields), the RPC instead returns a name_error or value_error field.

This approach still has the property that naively implemented client software cannot choke on or be attacked with unexpected data, as the invalid name/value won't be present. By adding a name_error/value_error field, it is also clear what happened from a manual look (unlike if we just left out the offending fields). This was recently implemented for Huntercoin in a stripped-down form.

@JeremyRand and others: What do you think about this proposal?

@domob1812 As long as no name or value field is returned in cases where that field is invalid ASCII (meaning that any naive downstream software looking for such a field will error rather than silently processing something incorrectly), I think this is fine.

@domob1812
Copy link
Author

@JeremyRand: No such fields are returned (instead a name_error or value_error will be set, but not name or value itself).

domob1812 added a commit to domob1812/namecoin-core that referenced this issue Sep 6, 2018
Extend the "options" RPC argument that is already present on name_new,
name_firstupdate and name_update to include "nameEncoding" and
"valueEncoding" as new fields.  They can be used to override the name
and value encodings for RPC arguments on a per-RPC basis.

This implements a further missing piece (but not yet all) of
namecoin#246.
@domob1812 domob1812 added this to the nc0.18 milestone Sep 6, 2018
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Sep 27, 2018
Extend the "options" RPC argument that is already present on name_new,
name_firstupdate and name_update to include "nameEncoding" and
"valueEncoding" as new fields.  They can be used to override the name
and value encodings for RPC arguments on a per-RPC basis.

This implements a further missing piece (but not yet all) of
namecoin#246.
domob1812 added a commit that referenced this issue Sep 27, 2018
dd92fe3 Allow encoding override in write RPCs. (Daniel Kraft)
3658cca Separate method for decoding value from RPC. (Daniel Kraft)

Pull request description:

  This extends the already existing `options` argument to the "write" RPCs (`name_new`, `name_firstupdate` and `name_update`) to include `nameEncoding` and `valueEncoding`, as proposed in #194.  Using these options, the encodings can be changed from the configured defaults on a per-RPC basis as needed.

  This is one of the still missing parts for #246, although not the only one (because the RPCs that do not yet have an `options` argument still do not support this).

Tree-SHA512: 44f9ba398f7491debcc0f9c33930e6ceca37a793976df7dd6314c626cc45f44da00a73e1c3ddfd26aef2de9db04594ce80b9862562b6b4f3920b6e1f08722f59
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Oct 22, 2018
Refactor the encoding handling in the name RPCs a bit; in particular,
add an options argument to all getNameInfo variants, that allows to
override the configured default encodings for names/values that is
applied for the output.

For now, NO_OPTIONS is passed at all call sites.  This will be changed
in a follow-up, though, to ultimately implement options overrides for
read-type RPCs.  This is a still missing piece for resolving
namecoin#246.
domob1812 added a commit to domob1812/namecoin-core that referenced this issue Oct 23, 2018
This adds an "options" JSON object argument for name_scan, currently
supporting explicitly overriding the name and value encoding to be used
for the RPC.

This implements namecoin#246 for
name_scan, and will be the basis for implementing more options for
namecoin#237 in follow ups.
domob1812 added a commit that referenced this issue Oct 27, 2018
…solete

4a0a288 Remove name_filter. (Daniel Kraft)
695204e Add filtering options for name_scan. (Daniel Kraft)
e980bb9 Allow empty string with hex encoding. (Daniel Kraft)
8d3e8cb Support encoding options for name_scan. (Daniel Kraft)
e6097b8 Add _encoding / _error fields to help text. (Daniel Kraft)
fbbe194 Move help builder for "options" to rpc/names. (Daniel Kraft)
f156d78 Add options argument to getNameInfo. (Daniel Kraft)

Pull request description:

  In this change, we add a new optional `options` JSON object argument for `name_scan`.  It can be used to override the name and value encodings, implementing #246 for `name_scan`.  Furthermore, options specific to `name_scan` can be set to filter names based on the number of confirmations, their prefix or a regexp as discussed in #237.  (`showExpired` is not yet supported, as that will be added in a later step for other RPCs as well, as described in #194.)

  With this extension, `name_scan` now supports everything that `name_filter` does except for the stats mode.  Thus, `name_filter` is removed to simplify the RPC interface and reduce the future maintenance burden.  Users of `name_filter` should migrate to the new options of `name_scan`.

  This also finally implements the already closed feature request in #16.

Tree-SHA512: 1120f0f3ed24dc0c23346c8328cf2f865058118373674a8ee5d8f7233f16c23a3ba4feab5947666acc2ac394b5058b7bae16affb00d1272693ae4cb9abd61ddb
domob1812 added a commit that referenced this issue Nov 26, 2018
32d8d8a Add options argument for name_pending. (Daniel Kraft)
9cf463b Add options argument to name_list. (Daniel Kraft)
5bb8a2e Add options argument for name_show/history. (Daniel Kraft)

Pull request description:

  This adds an optional `options` argument for the remaining name RPCs: `name_show`, `name_history`, `name_list` and `name_pending`, as suggested in #194.  For now, the only implemented options are `nameEncoding` and `valueEncoding`, allowing to override the default-configured encodings on a per-RPC basis (implementing another missing part of #246).

  Two name-related RPC methods are remaining without an `options` argument, where I suggest to not add it at all:

  - `namerawtransaction`:  This already has many arguments and adding an `options` argument seems not too useful.  `namecoin-tx` can already be used instead, which allows to have arbitrary names and values in hex-encoding.
  - `sendtoname`:  This also already has many arguments so adding `options` would make the signature even more complex.  Furthermore, this call is "informally deprecated" anyway, see #12, and probably not often used.

Tree-SHA512: fe9db50986e05c6adc96ccf4dd4baf904a7d88cc19ff043f01613da7da3da818accb7c73083489ff87e6263dee7e615628b3d27081a433b526ac317b4d1f4f72
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants