Skip to content

Escape rather than remove any printable characters in UAs#10731

Closed
luke-jr wants to merge 4 commits into
bitcoin:masterfrom
luke-jr:log_more_uacomment
Closed

Escape rather than remove any printable characters in UAs#10731
luke-jr wants to merge 4 commits into
bitcoin:masterfrom
luke-jr:log_more_uacomment

Conversation

@luke-jr

@luke-jr luke-jr commented Jul 3, 2017

Copy link
Copy Markdown
Member

Current Core strips out the !, + and = characters used by the UASF client and Knots to indicate whether BIP148 enforcement is enabled. This expands the allowed characters to all printable ASCII characters for the GUI, and escapes the disallowed-from-log ones in %XX format when printing to the log.

@jgarzik

jgarzik commented Jul 3, 2017

Copy link
Copy Markdown
Contributor

It seems risky to put quote chars in there.

@luke-jr

luke-jr commented Jul 3, 2017

Copy link
Copy Markdown
Member Author

Hmm, you mean in case someone is reading the log and inserting it into a SQL database or something?

@jgarzik

jgarzik commented Jul 3, 2017

Copy link
Copy Markdown
Contributor

@luke-jr Yep. Or it makes its way to the command line, and someone is lazy and fails to quote. We've seen this movie before :)

@luke-jr luke-jr force-pushed the log_more_uacomment branch from 99c566d to 71b576f Compare July 3, 2017 21:51
@luke-jr luke-jr changed the title net_processing: Avoid filtering any printable characters from UAs in the log SanitizeString: Expand upon allowed characters in logging to include "!#%&*+=^{}~" Jul 3, 2017
@luke-jr

luke-jr commented Jul 3, 2017

Copy link
Copy Markdown
Member Author

Reduced the subset to avoid quotes and other possibly dangerous characters, and just made it the default (which is only used for logging).

@gmaxwell gmaxwell left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NAK.

! can divert shell processing (past event references), % and & can divert URIs and HTML contexts (by constructing other prohibited characters).

Do we really need to do things like this? Reviewing these sorts of things take time an effort that could better be spent elsewhere.

@luke-jr

luke-jr commented Jul 4, 2017

Copy link
Copy Markdown
Member Author

I can remove % and & (although IMO this is a bit too much nannying), but ! is needed to properly log existing UAs...

@maflcko

maflcko commented Jul 4, 2017

Copy link
Copy Markdown
Member

Concept NACK per @gmaxwell. The existing safe chars should be enough to generate any ua comment that might render useful.

@luke-jr

luke-jr commented Jul 4, 2017

Copy link
Copy Markdown
Member Author

@MarcoFalke I'm not talking about a speculative scenario. UAs using !, +, and = are already live on the network. This only fixes the display and logging of these.

And the existing safe chars uses the same limits for both -uacomment options as well as the log filtering. That particular combination makes it impossible to have code-only UA comments, hence why the usage of !, +, and = were necessary.

@luke-jr

luke-jr commented Jul 4, 2017

Copy link
Copy Markdown
Member Author

(Also, the super-nanny approach of not allowing any possible "needs escaping" characters already sailed a long time ago: ; is one of the most dangerous such characters, and has been allowed from the start.)

@promag

promag commented Jul 4, 2017

Copy link
Copy Markdown
Contributor

; is one of the most dangerous such characters, and has been allowed from the start.)

@luke-jr that doesn't sound a good argument 😄.

@laanwj

laanwj commented Jul 6, 2017

Copy link
Copy Markdown
Member

Do we really need to do things like this? Reviewing these sorts of things take time an effort that could better be spent elsewhere.

I agree, I'd rather not make this change.

@luke-jr

luke-jr commented Jul 6, 2017

Copy link
Copy Markdown
Member Author

@laanwj You yourself added shell characters in #4983 ...

How do we get this fixed? Would it help to reduce it to just !, +, and = so it only addresses the real-world issue?

@gmaxwell

gmaxwell commented Jul 6, 2017

Copy link
Copy Markdown
Contributor

! is shell problematic

@jonasschnelli

Copy link
Copy Markdown
Contributor

Closing because seems not something we should do.

@luke-jr

luke-jr commented Jul 13, 2017

Copy link
Copy Markdown
Member Author

Please reopen. This is a bug fix for a present issue, for which no alternate solutions have been proposed.

We already use/allow "shell-problematic" characters (such as parenthesis and semicolon), and worrying about them in log files is pretty absurd anyway. My bitcoin logs already have ! from other sources anyway.

@jonasschnelli

Copy link
Copy Markdown
Contributor

But @luke-jr: your changing the default charset for the general function SanitizeString() where you break the assumption that this function produces a sanitized/safe string?
If you wan't to fix a bug, I think you should find a ways that doesn't break that – reasonable – assumption.
Example: there are open pull requests who want to use SanitizeString for user feedback (via CLI, RPC). Not sure if this is a good idea or not but it clearly shows that SanitizeString() should include shell/pipe safety.

@jonasschnelli

Copy link
Copy Markdown
Contributor

reopening

@jonasschnelli jonasschnelli reopened this Jul 13, 2017
@luke-jr luke-jr changed the title SanitizeString: Expand upon allowed characters in logging to include "!#%&*+=^{}~" Escape rather than remove any printable characters in UAs Jul 13, 2017
@luke-jr luke-jr force-pushed the log_more_uacomment branch from 71b576f to 9439269 Compare July 13, 2017 08:28
@luke-jr

luke-jr commented Jul 13, 2017

Copy link
Copy Markdown
Member Author

Rewrote this to only allow the full set of printable characters in the GUI where it should be harmless, and to escape them in %XX format when printing to the log.

Comment thread src/utilstrencodings.cpp Outdated

@ryanofsky ryanofsky Jul 13, 2017

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really escaping, because the output is ambiguous (there no way to detect escape sequences and recover the original string). Probably would be better to change to something like strprintf("%%%02X", str[i]) or strprintf("\\x%02X", str[i]) to use a more standard url-style or c-style escape format.

Also, if you do url-style or c-style escaping you should make sure the % or \ escape character is itself escaped by tweaking the SAFE_CHARS.find condition or just not including the character in SAFE_CHAR lists.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it was supposed to be %%%02x

Fixed

Comment thread src/utilstrencodings.cpp Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will also want to remove the % character from SAFE_CHARS_PRINTABLE so it will be escaped itself, or alternately add a && (!escape || str[i] != '%') clause to the find check above.

@ryanofsky ryanofsky Jul 13, 2017

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use %02X instead of %02x since it helps escape sequences stand out a little more, and is the more common way you see percent encoding done.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@TheBlueMatt

Copy link
Copy Markdown
Contributor

Can we instead remove exposing of subver entirely? I'm really tired of people "voting" by spinning up sybil attacks, its just not in any way useful.

@luke-jr

luke-jr commented Jul 17, 2017

Copy link
Copy Markdown
Member Author

@TheBlueMatt Whether we do or don't, it's outside the scope of this PR.

@luke-jr luke-jr force-pushed the log_more_uacomment branch from 183eb69 to b016818 Compare July 26, 2017 09:22
@luke-jr luke-jr force-pushed the log_more_uacomment branch from b016818 to c1ff31f Compare July 27, 2017 04:45
@luke-jr

luke-jr commented Jul 27, 2017

Copy link
Copy Markdown
Member Author

@ryanofsky (addressed your nits btw)

@laanwj laanwj added the Docs label Sep 6, 2017
@maflcko

maflcko commented Sep 7, 2017

Copy link
Copy Markdown
Member

There seems to be no conceptual acknowledgment to do this. Closing for now.

@maflcko maflcko closed this Sep 7, 2017
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants