
Enable pagination for sign/verify message #1337

Closed · wants to merge 8 commits

Conversation

@mmilata (Member) commented Nov 4, 2020

Fixes #1271. Required rather intrusive changes to render_text().

Some notes:

  • Some of the recorded screens changed because we were previously printing a normal-font space (5 px) even when showing monospace text (12 px space).
  • Changed the Cardano confirm-transaction screen to use the same pagination logic. As a result there are 5 lines on the initial page instead of 4; I don't know whether the empty last line was intentional.
  • Noticed that we're using the normal font for signing a message but monospace for verifying; is that intentional?
  • Some paginated messages don't look very good: (screenshot)

@matejcik (Contributor) commented Nov 4, 2020

  • Some paginated messages don't look very good:

we should make the right margin wider in that case

Review comment on:

```python
) -> None:
    if not pre_linebreaked:
```

Contributor:
I don't like having both pre_linebreaked and new_lines. What is currently the difference? Could we maybe get rid of new_lines now?

Member Author:

  • both=False: words are treated as words; if a word fits on a single line it is never broken, and if it doesn't fit it's broken so that as many characters as possible stay on the first line. Spaces are treated like any other character when breaking a word that contains them.
  • new_lines=True: words are treated more like lines, with a BR inserted after each one. Otherwise behaves as above. This is the most common use in our codebase.
  • pre_linebreaked=True: words are expected to already contain line breaks in the right places and are not processed before rendering.

Thinking about this, the result will often be the same for the latter two options.

Then there's another behaviour we want that is not part of this PR: accept strings containing spaces and break on spaces wherever possible. Perhaps it would make sense to merge all these flags into something like mode, which would take three different values?
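For illustration, the three behaviours could collapse into a single mode parameter along these lines. This is a sketch only: the constant names, the plain-int encoding, and the normalize_items helper are invented here, not part of the PR.

```python
# Illustrative sketch: the flag combinations folded into one "mode" value.
MODE_BREAK_WORDS = 0    # current new_lines=False: break long words mid-word
MODE_LINE_PER_ITEM = 1  # current new_lines=True: a BR after every item
MODE_PRE_BROKEN = 2     # current pre_linebreaked=True: items come pre-broken

BR = "\n"  # stand-in for the firmware's BR symbol

def normalize_items(items, mode):
    """Flatten the items according to the mode; rendering happens later."""
    if mode == MODE_PRE_BROKEN:
        return list(items)  # breaks are already in place, do nothing
    out = []
    for item in items:
        out.append(item)
        if mode == MODE_LINE_PER_ITEM:
            out.append(BR)  # treat each item as its own line
    return out
```

A single mode value also makes the invalid flag combinations unrepresentable, which the two independent booleans do not.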

Contributor:

I'm still a little confused, so I'm going to make a full table:

| new_lines | pre_linebreaked | Behavior |
| --- | --- | --- |
| False | False | Text is split at (a) explicit BRs, (b) end of screen |
| True | False | Text is split at (a) explicit BRs, (b) end of screen, (c) end of each individual item. This is the most common. |
| Any | True | Text is split at (a) explicit BRs only, i.e. no magic happens if the text overflows. |

Is this correct?

I believe that going forward, we will be able to completely get rid of the (c) variant, "end of individual item". But for this PR it stays.

I'm not sure, however, how valuable keeping the (b) variant is. If we removed it, there would be no difference in behavior between pre_linebreaked=True and `False`.

Also, implementation-wise it's just a matter of "reflowing" the words parameter via a separate call.
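A toy sketch of such a separate reflow pass. It uses a character budget as a stand-in for pixel measurement (the real render_text measures glyph widths), so the function name and signature are illustrative only:

```python
def reflow(words, max_chars):
    """Pre-break every word so that no fragment exceeds max_chars
    characters; the result can then be rendered as pre-broken lines
    without any further line-breaking logic."""
    out = []
    for word in words:
        # emit the word in max_chars-sized pieces
        for i in range(0, len(word), max_chars):
            out.append(word[i : i + max_chars])
    return out
```

With a pass like this in front, the renderer only ever sees fragments that are guaranteed to fit, which is what makes the pre_linebreaked path trivial.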

(BTW, I believe the reason new_lines is a parameter of render_text() is to save the memory and/or processing time otherwise needed to insert the BR symbols into the list. Going forward we'll be splitting strings anyway, so I don't think this is relevant, but let's keep it in mind.)

Anyway, I think pre_linebreaked should be an argument of the Text class only, not of the render_text call, and Text should be responsible for calling break_lines.

Also, I'd call it something else, along the lines of auto_linebreaks=True instead of pre_linebreaked=False.

Member Author:

Yeah, I'd say the table is correct. Agreed that it probably doesn't make sense to keep (b) long-term.

Regarding memory & time, this PR probably makes things worse but not sure how to get around that.

Will look into your suggestions, thanks!

Member Author:

Regarding memory & time, this PR probably makes things worse but not sure how to get around that.

Total allocations for device tests in text.py went from 180000 to 390000, that's pretty bad ...

Review comment on:

```python
offset_x: int = TEXT_MARGIN_LEFT,
offset_y: int = TEXT_HEADER_HEIGHT + TEXT_LINE_HEIGHT,
offset_x_max: int = ui.WIDTH,
) -> List[List[TextContent]]:
```

Contributor:

Why return List[List] if you convert it to a flat list anyway?

Member Author:

In apps/common/signverify.py (and then apps/common/layout) we first call chunks on the result before passing it to render_text, which flattens it.
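For context, a minimal version of such a chunks helper (the real helper in the firmware, which I believe lives in trezor.utils, behaves similarly):

```python
def chunks(items, size):
    """Yield consecutive size-length slices of items, so a flat list of
    lines becomes a list of per-page line groups."""
    for i in range(0, len(items), size):
        yield items[i : i + size]

# group five lines into pages of two lines each
pages = list(chunks(["line1", "line2", "line3", "line4", "line5"], 2))
```

Each yielded slice is one page, which explains why the intermediate List[List] shape appears before flattening.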

@mmilata commented Nov 5, 2020

Changed:

  • move break_lines out of render_text
  • rename pre_linebreaked to auto_linebreaks
  • get rid of nested lists
  • wider margin when scroll-dots are displayed

@matejcik commented Nov 5, 2020

Yeaahh... I guess I take everything back 👀 this here is the reason rewriting the UI in Rust would be so nice.

Total allocations for device tests in text.py went from 180000 to 390000, that's pretty bad ...
(...)
we first call chunks on the result before passing it to render_text

Although the refactor looks good on paper, I'm afraid this shows that it's the wrong approach. The advantage of the original render_text() method is that it does almost no allocations. The new approach does a lot; even the original, unmodified split_message() is doing work that is technically unnecessary and implies copying (!) the whole message to be signed.

(my code from #1312 also does text.split("\n") which might actually be better handled inside render_text)

A rough outline of a solution that looks like it would work better in this regard:

  • a function measure_line(string, string_offset=0, break_words=False) -> int returns a number of characters such that string[string_offset : string_offset + result] fits on one line. If break_words is False, then either (a) string_offset + result == len(string), (b) string[string_offset + result] is whitespace, or (c) the respective part of the string is an unbroken word that does not fit on the line
  • the above is used to calculate the number of lines in the text, and the number of characters that fit on a single page
  • it is also used in render_text, where we need additional checks that tell us whether we should stick a "-" or "..." at the end

Then the message pagination is done either by pre-breaking the message at the given indexes, or we add a string_offset argument to Text so that you can point to the same string on multiple pages. I'm trying to find out whether string slices make copies; if not, slicing would be the more readable solution.

And we use this to get rid of split_message() completely, because paginate_lines() will do all the work.
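The proposed measure_line could look roughly like this sketch. The max_width and char_width parameters are illustrative additions: a constant per-character width stands in for the real per-glyph font metrics, and rfind with bounds is used instead of slicing to avoid an allocation.

```python
def measure_line(string: str, max_width: int, string_offset: int = 0,
                 break_words: bool = False, char_width: int = 6) -> int:
    """Number of characters starting at string_offset that fit in
    max_width pixels, using char_width as a stand-in font metric."""
    max_chars = max_width // char_width
    remaining = len(string) - string_offset
    if remaining <= max_chars:
        return remaining  # (a) the rest of the string fits on this line
    fit = max_chars
    if not break_words:
        # search for whitespace within bounds; no slice is allocated
        space = string.rfind(" ", string_offset, string_offset + fit + 1)
        if space > string_offset:
            fit = space - string_offset  # (b) break at the whitespace
        # otherwise (c): an unbroken word longer than the line; keep fit
    return fit
```

Calling it in a loop, advancing string_offset by the returned count (plus any skipped whitespace), yields line counts and page boundaries without copying the message.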

@matejcik commented Nov 5, 2020

I'm trying to find out whether string slices make copies

not sure why, but it seems that they do
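Under CPython this is easy to observe: a full-range slice of a str returns the same object, while any proper slice allocates a new one (and the finding above is that MicroPython slices copy as well). A quick check:

```python
s = "message to be signed " * 10

# full-range slice of an immutable str: CPython returns the original object
assert s[:] is s

# proper slice: a new string object is allocated
t = s[1:]
assert t is not s
assert t == s[1:]
```

The `s[:] is s` behavior is a CPython implementation detail for immutable types, not a language guarantee, which is why it cannot be relied on across interpreters.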

@matejcik commented Nov 5, 2020

of course, if your "get rid of nested lists" modification is OK memory-wise, we don't actually need to do the above :)

@mmilata commented Nov 5, 2020

Removing the nesting decreased the allocations from 390k to 330k which is far from OK. Back to the drawing board then ...

Commit messages from the force-push:

  • Changed: use a MONO-sized space instead of a NORMAL-sized space when rendering MONO text; fix a corner case where a word is broken before the 0th character; add a flag for breaking text on spaces.
  • Avoid enumerate(). Decreases the number of allocations by ~90k.
  • Turn the line-breaking generator into a function that returns a list. Decreases the number of allocations by ~30k.
  • Get rid of the next_line() closure in break_lines(). Decreases the number of allocations by ~90k.
@mmilata mmilata force-pushed the mmilata/sign-verify-pagination branch from 1598add to f5da228 Compare November 15, 2020 19:55
@mmilata commented Nov 15, 2020

I've force-pushed a version where break_lines returns a list of indexes (3-tuples: list index, char index, break marker). It's less performant than I thought it would be, and I'd be grateful for some high-level feedback. Here's a summary of how it looks in terms of the number of allocations over all device tests:

| what | text.py #allocs | Δ |
| --- | --- | --- |
| master | 182000 | |
| factor out break_lines, return indexes | 513000 | +331k |
| extend modtrezorui to avoid string slices | 446000 | -67k |
| do not use enumerate | 360000 | -86k |
| turn break_lines from a generator into a function that returns a list | 327000 | -33k |
| manually inline the next_line closure | 230000 | -97k |

@tsusanka tsusanka modified the milestones: 2020-12, 2021-01 Nov 18, 2020
@tsusanka tsusanka removed their request for review November 25, 2020 17:56
@matejcik commented Dec 1, 2020

Thanks for the measurements; these show the hotspots nicely.

So, a problem I see is returning and manipulating tuples repeatedly: each "return tuple()" is an allocation. I'm not sure about this, but maybe tuple unpacking is also an allocation? Furthermore, you're reusing the results of break_lines, which necessitates storing them, so all the tuples are kept around?

My suggestion would be to rework this so that:
(1) what is now next_sentinel fills out a passed-in structure instead of returning one,
(2) then you don't need the helper and can do this directly in break_lines,
(3) making break_lines a generator again instead of having it return a list.

I'm not sure I completely understand the reason for the next_sentinel helper, so I'm not sure whether it makes sense to get rid of it (it doesn't seem to make things worse, but it complicates reading the code somewhat).
Does it exist to avoid wrapping everything in try/except? If yes, maybe next_sentinel should return True/False so that the detection can be postponed?

I believe that by modifying break_lines/next_sentinel to fill out a structure, we completely avoid the issue you worked around in e06cdb2, and can thus call break_lines multiple times again, sacrificing runtime for memory efficiency. Ideally, running break_lines should cause exactly one allocation: instantiating the generator.

```python
from typing import Iterator, Optional

class BreakEntry:
    def __init__(self) -> None:
        self.set(0, 0, None)

    def set(self, word_index: int, char_index: int,
            break_split: Optional[str]) -> None:
        self.word_index = word_index
        self.char_index = char_index
        self.break_split = break_split

def break_lines(...arguments, result: BreakEntry) -> Iterator[None]:
    ...
    # instead of yielding a tuple, write the break into the shared entry
    result.set(word_index, char_index, break_split)
    yield
    ...
    return

def next_helper(break_iterator: Iterator[None]) -> bool:
    try:
        next(break_iterator)
        return True
    except StopIteration:
        return False

def usage(...arguments):
    break_entry = BreakEntry()
    break_lines_iter = break_lines(...arguments, break_entry)
    should_continue = True
    while should_continue:
        should_continue = next_helper(break_lines_iter)
        if break_entry.char_index > blabla:
            ...
```
For now I wouldn't worry about calling break_lines often, if the above approach does lead to a decrease in allocations. (As the code stands, you'd have to call it once to count lines in paginate_content, and a second time to assign a fresh generator to text.breaks. I believe we can get rid of that with some careful factoring, but let's solve the allocation thing first.)
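To make the pattern concrete, here is a minimal runnable version of the sketch above, with a toy break_lines that breaks every word at a fixed character budget. The budget, the "-" marker, and the break logic are illustrative only; the point is that the generator writes into one shared entry instead of allocating a tuple per break.

```python
from typing import Iterator, List, Optional

class BreakEntry:
    def __init__(self) -> None:
        self.set(0, 0, None)

    def set(self, word_index: int, char_index: int,
            break_split: Optional[str]) -> None:
        self.word_index = word_index
        self.char_index = char_index
        self.break_split = break_split

def break_lines(words: List[str], max_chars: int,
                result: BreakEntry) -> Iterator[None]:
    """Toy line breaker: yield once per break, recording its position
    in the shared entry (a single allocation for the generator itself)."""
    word_index = 0
    for word in words:  # no enumerate(), echoing the allocation findings
        char_index = max_chars
        while char_index < len(word):
            result.set(word_index, char_index, "-")
            yield
            char_index += max_chars
        word_index += 1

def next_helper(it: Iterator[None]) -> bool:
    try:
        next(it)
        return True
    except StopIteration:
        return False

entry = BreakEntry()
breaks = []  # collected only to demonstrate the loop; a renderer would draw
it = break_lines(["pagination", "ok"], 4, entry)
while next_helper(it):
    breaks.append((entry.word_index, entry.char_index, entry.break_split))
# breaks is [(0, 4, "-"), (0, 8, "-")]
```

Here the driver loop copies the entry into tuples purely for demonstration; the real consumer would act on the entry in place.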

@matejcik commented Dec 1, 2020

What worries me is that the next_line closure is so costly.

Perhaps we could extend the BreakEntry trick above and add the other variables (line_no etc.) to that class? Then next_line could happen as part of entry.set(), or something like that.
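That could look something like this sketch; the field names and the default line height are assumptions, not values from the firmware:

```python
from typing import Optional

class BreakEntry:
    """Shared mutable state: the break position plus the per-line
    counters that the next_line closure used to carry."""

    def __init__(self) -> None:
        self.word_index = 0
        self.char_index = 0
        self.break_split: Optional[str] = None
        self.line_no = 0
        self.offset_y = 0

    def next_line(self, line_height: int = 9) -> None:
        # advance to the next line as a method on the shared state,
        # so no closure has to capture and mutate nonlocal variables
        self.line_no += 1
        self.offset_y += line_height
```

Since the entry already travels through break_lines, folding the line counters into it adds no allocations per line.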

@matejcik commented Dec 1, 2020

I'm also kind of surprised that measure_line is relatively cheap. Still, perhaps it would be worth using the same trick: make it return None and fill out a passed-in List[int].
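A sketch of that trick, with a hypothetical two-slot out list holding the (start, length) of the last measured chunk; character counting again stands in for pixel measurement, and the function name is invented:

```python
from typing import List

def measure_line_into(string: str, max_chars: int, out: List[int]) -> None:
    """Write (start, length) of the next chunk into the caller-provided
    list instead of returning a fresh tuple on every call."""
    start = out[0] + out[1]  # continue right after the previous chunk
    out[0] = start
    out[1] = min(max_chars, len(string) - start)

out = [0, 0]
measure_line_into("hello world", 6, out)  # first chunk starts at 0
first = list(out)
measure_line_into("hello world", 6, out)  # second chunk continues at 6
```

The caller allocates the list once and reuses it across the whole pagination pass.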

@matejcik commented Dec 1, 2020

What worries me is that the next_line closure is so costly.

Would you mind retrying with next_line as a closure, if it turns out that the BreakEntry thing helps? I suspect its bad behavior could be some sort of artifact.

@tsusanka tsusanka modified the milestones: 2021-01, 2021-02 Dec 29, 2020
@mmilata mmilata closed this Jan 4, 2021
@mmilata mmilata deleted the mmilata/sign-verify-pagination branch September 29, 2021 20:13