Normalize EOL to CRLF for text-mode signatures. #1263

ni4 · 2020-08-17T14:31:55Z

Previously we did not distinguish text-mode document signatures from binary ones, so validation failed for documents with non-canonical (i.e. non-CRLF) line endings.
This PR fixes that.

Fixes #1226
Closes #1228

ronaldtse

LGTM @ni4 !

rrrooommmaaa

In my case, test_stream_signatures fails because src/tests/data/test_stream_signatures/source.txt received an extra CR when cloned from GitHub without git config --global core.autocrlf false.
Shouldn't we update test files in this PR so they no longer depend on github settings?

ni4 · 2020-08-19T08:30:15Z

Shouldn't we update test files in this PR so they no longer depend on github settings?

Actually, some of the signatures there are made in binary mode, so the tests will fail if the EOL sequence in source.txt is changed.
Text-mode signatures will (as of this PR) still verify when LF is changed to CRLF.

rrrooommmaaa · 2020-08-19T09:33:10Z

@ni4 The question is: will the test remain dependent on git settings?

ni4 · 2020-08-19T09:58:20Z

@rrrooommmaaa Please see the issue #1268

And, getting back to review, do you have any further comments/suggestions on this PR?

rrrooommmaaa · 2020-08-19T13:28:24Z

@ni4 I expect that lastcr should also be handled in signed_src_finish in case it was the last character of the buffer? And then it should be replaced with CRLF? Is that what you meant by /* we support CR, LF and CRLF line endings */?

ni4 · 2020-08-19T13:52:23Z

I expect that lastcr should also be handled in signed_src_finish in case it was the last character of the buffer

@rrrooommmaaa Once a CR character is reached, CRLF is written. So the only purpose of this flag is to detect the case where CR is the last character in the buffer and the possible following LF arrives in the next signed_src_update() call.
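
For readers following along, here is a minimal sketch of the behavior described above. It is not the code from this PR: EolNormalizer, update() and out are hypothetical names used for illustration, and only the lastcr flag mirrors the field under discussion. CRLF is emitted as soon as a CR or a lone LF is seen, and lastcr only remembers that a chunk ended with CR, so a leading LF in the next chunk is treated as the tail of the same line ending.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical illustration only; not the rnp implementation.
struct EolNormalizer {
    bool        lastcr = false; // previous chunk ended with CR (CRLF already emitted)
    std::string out;            // normalized data collected so far

    void update(const void *buf, size_t len)
    {
        const uint8_t *data = static_cast<const uint8_t *>(buf);
        size_t         i = 0;
        // An LF right after a chunk-final CR belongs to that CRLF: skip it.
        if (lastcr && len > 0) {
            if (data[0] == '\n') {
                i = 1;
            }
            lastcr = false;
        }
        for (; i < len; i++) {
            uint8_t ch = data[i];
            if (ch == '\r') {
                out += "\r\n";
                if (i + 1 == len) {
                    lastcr = true; // the matching LF may come in the next call
                } else if (data[i + 1] == '\n') {
                    i++; // CRLF inside the chunk: consume the LF as well
                }
            } else if (ch == '\n') {
                out += "\r\n"; // lone LF
            } else {
                out.push_back(static_cast<char>(ch));
            }
        }
    }
};
```

With this sketch, feeding "a\r" and then "\nb" yields "a\r\nb" rather than "a\r\n\r\nb", which is the case the flag exists for. Nothing is needed at finish time, because the CRLF for a trailing CR has already been written.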

rrrooommmaaa · 2020-08-20T06:14:26Z

Once CR character is reached, CRLF is written

got it now. Can you add a remark in the code describing the purpose of lastcr field, please?
The problem is that the name describes how it is set, and I have to understand how it is used from the code.
Also, why isn't variable en called end? Is there a standard for this?

ni4 · 2020-08-20T10:09:13Z

Can you add a remark in the code describing the purpose of lastcr field, please?
The problem is that the name describes how it is set, and I have to understand how it is used from the code.

Added comment and force-pushed.

Also, why isn't variable en called end? Is there a standard for this?

I just got used to this style: 2-character names for temporary variables where i, j, k are not applicable.
Maybe it comes from past Pascal experience, where end is a reserved word.

dewyatt · 2020-08-20T10:52:08Z

Also, why isn't variable en called end? Is there a standard for this?

I just got used to this style: 2-character names for temporary variables where i, j, k are not applicable.
Maybe it comes from past Pascal experience, where end is a reserved word.

I'd vote against this style personally. I like names to be as descriptive as possible, just not overly long.

ni4 · 2020-08-20T11:00:07Z

I'd vote against this style personally. I like names to be as descriptive as possible, just not overly long.

Ok, np. While it is clear how to rename en, what would be better to call ch, linebg? current_char, line_begin seems to be too long. This could also lead to clang-formatting single code line to 2-3 lines, making code less readable.

dewyatt · 2020-08-20T11:23:37Z

I'd vote against this style personally. I like names to be as descriptive as possible, just not overly long.

Ok, np. While it is clear how to rename en, what would be better to call ch, linebg? current_char, line_begin seems to be too long. This could also lead to clang-formatting single code line to 2-3 lines, making code less readable.

I would read ch as character and I believe that it's a well-known abbreviation so I don't see a problem there.
I would read bg as background. Typically beg would be used to abbreviate begin. So personally I might use linebeg, begline, line_beg, etc., those all seem pretty readable as a native.

I may be biased specifically on en because I don't see that used as an abbreviation for enable very often (usually only in firmware/embedded/kernel code).

EDIT: ena is a slightly clearer abbreviation for enable

ni4 · 2020-08-20T11:38:28Z

@dewyatt Thanks for the detailed explanation. Renamed/force-pushed.

rrrooommmaaa · 2020-08-20T17:39:03Z

@ni4 As long as lastcr is used to keep state (the last character) between update calls, I think it's worthwhile creating edge-case tests using callback input, so that we can emulate chunks ending with CR followed by LF or by a non-LF character.
Do you agree?

ni4 · 2020-08-22T08:51:16Z

@rrrooommmaaa Please see the updated PR. Actually it is simpler than input_from_callback, since we know that the data for signed_src_update() is read in 32k chunks.
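
As a rough illustration of that kind of test input (not the actual test data added in this PR), the sketch below builds a buffer whose CR lands on the last byte of the first chunk, optionally followed by LF as the first byte of the second chunk. make_boundary_input is a hypothetical helper, and the chunk size is passed in explicitly; taking "32k" to mean 32768 bytes is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds input in which CR is the final byte of the
// first `chunk`-sized read. Assumes chunk >= 1.
std::string make_boundary_input(size_t chunk, bool with_lf)
{
    std::string data(chunk - 1, 'a'); // filler up to the last byte of the chunk
    data.push_back('\r');             // CR at index chunk - 1
    if (with_lf) {
        data.push_back('\n');         // LF at index chunk, i.e. start of the next read
    }
    data += "tail";                   // some content after the line ending
    return data;
}
```

Calling make_boundary_input(32768, true) covers the "CR then LF across the boundary" case, and passing false covers "CR followed by non-LF".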

rrrooommmaaa · 2020-08-23T09:49:16Z

src/librepgp/stream-parse.cpp

+        ch++;
+    }
+    uint8_t *linebeg = ch;
+    uint8_t *end = (uint8_t *) buf + len;


This piece of code will work in 99.9999% of cases, I understand. Still, it doesn't look clean: in theory the buffer might sit at the very end of virtual memory, so buf + len could wrap around to 0x00000 or an overflow exception could be thrown,
and then ch<end and ch<end+1 may be false right away.

ni4 · 2020-08-24

@rrrooommmaaa Thanks for pointing this out. While it should not happen on real-life systems, it might be hit on low-memory embedded systems and the like. Added a commit.

rrrooommmaaa · 2020-08-24

I suggest simply replacing < with != in ch < end and ch + 1 < end.
I know that < is generally safer than !=, but in this case it would be neater.

ni4 · 2020-08-24

@rrrooommmaaa That was my first idea, but then we would need to check whether ch++ overflows, adding one more check for each byte processed.

rrrooommmaaa · 2020-08-24

Ok, merging then.
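
For context on the overflow concern in this thread, one way to express such a guard is sketched below. This is not necessarily what commit 3fb08a9 does: range_wraps is a hypothetical helper, and it assumes the platform provides uintptr_t and a flat address space. It compares integers instead of forming buf + len, so the check itself cannot wrap.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the address of buf plus len would exceed the
// largest representable address, i.e. buf + len would wrap around.
bool range_wraps(const void *buf, size_t len)
{
    uintptr_t start = reinterpret_cast<uintptr_t>(buf);
    return len > UINTPTR_MAX - start;
}
```

A caller could reject the input up front when range_wraps(buf, len) is true and keep the cheaper ch < end comparisons in the per-byte loop, which avoids the extra per-byte check mentioned above.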

antonsviridenko · 2020-08-24T10:57:39Z

EDIT: ena is a slightly clearer abbreviation for enable

I vote for "enbl" :)

ni4 force-pushed the ni4-1228-normalize-eol-for-text-sigs branch from a04edbd to 74e393c Compare August 18, 2020 11:05

ni4 marked this pull request as ready for review August 18, 2020 13:38

ni4 requested review from dewyatt, antonsviridenko, ronaldtse and rrrooommmaaa August 18, 2020 13:39

ronaldtse approved these changes Aug 19, 2020

rrrooommmaaa requested changes Aug 19, 2020

ni4 force-pushed the ni4-1228-normalize-eol-for-text-sigs branch from 74e393c to 78923d1 Compare August 20, 2020 10:06

ni4 added 2 commits August 20, 2020 14:36

Normalize EOLs during text-mode signature verification.

714e2fd

Update tests with text-mode signatures verification.

4342adf

ni4 force-pushed the ni4-1228-normalize-eol-for-text-sigs branch from 78923d1 to 4342adf Compare August 20, 2020 11:36

Update tests with 32k-boundary CRLF edge case.

4887e2f

rrrooommmaaa reviewed Aug 23, 2020

Check for possible pointer airthmetic overflow in signed_src_update().

3fb08a9

ni4 force-pushed the ni4-1228-normalize-eol-for-text-sigs branch from b45c9df to 3fb08a9 Compare August 24, 2020 10:03

antonsviridenko approved these changes Aug 24, 2020

rrrooommmaaa requested changes Aug 24, 2020

rrrooommmaaa merged commit 5101979 into master Aug 24, 2020

rrrooommmaaa deleted the ni4-1228-normalize-eol-for-text-sigs branch August 24, 2020 12:42

antonsviridenko mentioned this pull request Aug 25, 2020

Improve handling of CR characters within cleartext signatures. #1265

Closed

ni4 added this to the v0.14.0 milestone Jan 4, 2021
