Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docs: Storing of styled text in a struct for later recall w/o borrowing #649

Closed
J-Bockhofer opened this issue Nov 26, 2023 · 12 comments
Closed
Labels
Type: Enhancement New feature or request

Comments

@J-Bockhofer
Copy link

J-Bockhofer commented Nov 26, 2023

Problem

A) Experiencing high CPU load when doing regex-based styling of lines that are received / generated at runtime.
B) Storing styled text in a struct

Detailed description

In my case I had a filewatcher in a different thread sending new lines added to the file into the main process.
These lines where then pushed into a Vec<String> of a struct.

I first naively tried parsing this vector in a render/ui function that gets called each frame/tick, ie reading, splitting and regex matching the string to yield conditionally styled spans that get reassembled into a partially colored line (some words might be colored based on matching).

This approach has led to the increase in CPU load, since, as the vector increases, the amount of work to be done each frame increases -> Problem A.

To solve this I wanted to have the partial stylings for each line precomputed when it is received, stored in my main struct and read from in the render function to avoid having to re-compute styles / complex highlighting every frame.

This led to Problem B -> Storing a Vec<ListItem<'_>>, or simpler Vec<Line<'_>> or Span.

As implied by the notation, setting up the lifetimes correctly proved impossible for me (already had a lifetime in place for a StatefulList and was working with the async-template ) so I needed a simpler solution.

Solution

After getting some advice from @kdheepak and seeing a solution used in one of his projects (taskwarrior-tui) it inspired me to come up with the following solution for my own problem of storing styled spans generated at runtime:

Instead of trying to store a Vec<Span> or any text widget for that matter, I set up a simple struct like this to represent a line with styled spans.

#[derive(Default, Clone)]
struct StyledLine {
 words: Vec<(String, Style)>,
}

This StyledLine struct can then be stored anywhere because both String and Style are owned types.
I then stored a Vec<StyledLine> in my struct to represent a body of styled text. Let's call this myStruct.textbody for now.

Parsing received strings into a Vec<StyledLine> in the "on_receive() method" was almost the same as setting up the Vec<ListItem> I had in my render function before, just changing some types and property names.
The parsing of this Vec<StyledLine> in the render function into a Vec<ListItem> for example was also very straight-forward then.

     // inside the render function executed each frame/draw call
     // has a reference to myStruct passed in as self or other argument.

      let list_lines: Vec<ListItem> = myStruct
        .textbody // Vec<StyledLine>
        .iter()
        .map(|i| {
          let mut line: Line = Line::default();
          for word in i.words.clone() {
            let cspan = Span::styled(word.0, word.1); 
            line.spans.push(cspan);
          }
          ListItem::new(line)
        })
        .collect();

     // then do the usual to parse a Vec<ListItem> into a List
     let drawable_list_widget =  List::new( list_lines) // .block or whatever further styling you want to do

I would suggest somehow highlighting the owned nature of Style in the docs.
Maybe by setting up a suitable example to illustrate this capability as I'm sure many will come across a similar issue when going with a naive approach to text styling first. I also think this solution in particular is very uninvasive and easier on the soul than dealing with explicit lifetimes.

@J-Bockhofer J-Bockhofer added the Type: Enhancement New feature or request label Nov 26, 2023
@joshka
Copy link
Member

joshka commented Nov 27, 2023

I'm not quite sure I understand the problem. There's a lot of complexity in this, and some ambiguity in which parts you're stuck on / which are causing the perf issue. Can you talk more about problem A and B, perhaps provide some code and then simplify that to just the pertinent part. Can you give more information about the size of things and the specific performance measurements that are a problem?

I suspect that there's probably an approach to fixing the perf issue where we change List to accept an Iterator<Into<ListItem>> instead of Into<Vec<ListItem>> that could mean that only the currently displayed items actually need to be computed each frame.

I'm not sure what the lifetime problem your having is, some application context would be useful to help explain the problem.

@kdheepak
Copy link
Collaborator

I think @J-Bockhofer was suggesting documenting the approach I suggested (storing styles in their struct instead of storing ListItem, which comes with lifetime + ownership related issues).

Maybe this is a ratatui.rs related issue?

@J-Bockhofer
Copy link
Author

I will put together a minimal example to clarify the various points tonight or tomorrow and update this post.

Real quick: The perf issue was behaving inconsistently and the number of lines being displayed/styled was just below 10. Still the main process was nearly pinning my CPU,

I hope to provide better insight with the example tomorrow.

@J-Bockhofer
Copy link
Author

J-Bockhofer commented Nov 27, 2023

So as promised here is the example:
https://github.com/J-Bockhofer/rs-ratatui-i649

I set it up in three branches:

  1. Main (which contains Problem A)
  2. ProblemB
  3. SolutionB

The commit diffs should make the changes between branches simple to follow and the Readmes contain things relevant for each branch.
The Readme for the Solution contains some additional extensions to the introduced approach to further showcase the options with it.

Problem A

High CPU usage when displaying relatively small lists with styled lines where words per line can have differing colors.

Was tied to rather inefficient regex redefinitions inside in the draw function, which led to me looking everywhere but there for an explanation. Increase directly tied to where the regex definitions were in the render function.

To solve Problem A, I tried setting up a function that computes the styles per word/substring and stores them for use in the render function, which led to Problem B.

The example contains a program that displays list of string on the right after they are received.
To try it out, just change what is mentioned in the associated readme and copy+paste lines in the included text.txt file.

Problem B

Setting up lifetimes for fields of Components / Widgets in the async-template.

The example does not compile since it calls for defining explicit lifetimes in multiple locations. (EDIT: Probably shouldn't even go down that route because Widgets need to be redrawn each frame?!)

Solution B

Implementation of lifetime-free struct with owned values String and Style to store a precalculated list of word-style pairs.

Extension of the approach for simple theme'ing implementation and better highlighting function.

Example

@joshka
Copy link
Member

joshka commented Nov 28, 2023

Problem A:

cargo flamegraph --root --release

Pasted a few lines:
flamegraph

(note: run cargo flamegraph locally and open the file in firefox to get an interactive view of the SVG)

The loop is dominated by creating the Regex, which gets linearly worse each time you add an extra line - You've called this out in a comment though, and it's noted in the Regex::new() docs:

Note that regex compilation tends to be a somewhat expensive process, and unlike higher level environments, compilation is not automatically cached for you. One should endeavor to compile a regex once and then reuse it. For example, it's a bad idea to compile the same regex repeatedly in a loop.

Moving this to a OnceLock initialized a single time is about a ~10x improvement in time spent in that method compared to the rest of the code (76% of the app time vs 8%)

flamegraph

static IP_REGEX: OnceLock<Regex> = OnceLock::new();
static BAN_REGEX: OnceLock<Regex> = OnceLock::new();
static FOUND_REGEX: OnceLock<Regex> = OnceLock::new();

...

    let ip_re = IP_REGEX.get_or_init(|| {
      Regex::new(r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})").unwrap() // IPv4 regex
    });
    let ban_re = BAN_REGEX.get_or_init(|| Regex::new(r"Ban").unwrap());
    let found_re = FOUND_REGEX.get_or_init(|| Regex::new(r"Found").unwrap());

Regarding the problem, I think I'd suggest the following approach for this:

  1. move the regex parsing logic into update instead of draw and store some real struct that represents your data. I.e. perhaps something like:
struct Fail2BanItem {
  start: String,
  found_or_ban: String,
  ip: String,
  rest: String,
}
  1. implement From<Fail2BanItem> for ListItem (or for Text)
  2. In your draw, call let items = self.items.iter().map(ListItem::from); (or .map(ListItem::new))

Additionally, for the multiple lines added at once case, using an enum for this might be handy instead of having to handle that +++ tag

enum AddedData {
  SingleLine(...)
  MultipleLines(...)
}

Or put more simply, the advice might be that apps should not process data in the drawing function - do it before the app gets there.


From a what to do for documenting this question, this seems more like a good blog post idea (you're most of the way there already) than a document in Ratatui to me right now. There seems to be a lot of complexity that seems to come from your specific problem space that muddies any generic solution to your problems A and B. To get it to a point where it's helpful more generally, I'd encourage you to simplify your example as much as possible. If you can, show the point where the problem occurs by extending from a small hello world code example rather than highlighting the problem in your application.

@J-Bockhofer
Copy link
Author

J-Bockhofer commented Nov 28, 2023

Thank you for looking into it.

I'm sorry for the confusion Problem A was merely a stepping stone on the way to the real problem of storing precomputed styles per word/string and wasnt seen as a ratatui problem for me, as stated. Should have worded that differently. So more of an application that led to the "need" of storing per word stylings at runtime.

The Fail2BanItem you've implied would still necessitate checking against something else, either directly in the render function or elsewhere to create a styled Span. That's why I thought having an example for a simple struct like the Vec<StyledLine> and an approach to creating a theme could be useful (all the in the solution readme). I agree though that it might have a better place elsewhere.

So this was just a suggestion to highlight the possibility of associating Styles with Strings in some way. As this approach has solved my problems.

@joshka
Copy link
Member

joshka commented Nov 28, 2023

I guess the part I'm misunderstanding here is what the difference is between a Span and a StyledString? A Span is a type that associates a Style and a String (well Cow<&str> strictly, which can be either a borrowed slice or an owned String).

@J-Bockhofer
Copy link
Author

J-Bockhofer commented Nov 28, 2023

Maybe I'm just completely off-base, very possible judging by my initial regex placement.

Doesn't a Span always need to have a lifetime attached? As in Span<'a>?

It would make sense to store the Span (completely parsed and styled line) in the main struct -> Home in this case.

For illustration sake the Home struct in my example already had an explicit lifetime for two &'a str specified. Since the draw and update functions are derived from the Component trait, the way to propagate lifetime specifiers then does not seem obvious to me. The repo in Problem B presents a compiler error that was quite the puzzle to get through for me, which i did not manage to do.

So if I indeed need to specify the lifetime to store a Span, I think it might sometimes be easier to store a (String, Style)-tuple or struct.
Changing all of the Spans in the Home.highlightio() function to directly consume a String did at least little to remedy the necessity to specify lifetimes for the Component trait in Problem B.

What might I be missing here? Or did this clear the misunderstanding?

@joshka
Copy link
Member

joshka commented Nov 28, 2023

Playing around with this simplified example playground link

The following works:

#![allow(dead_code)]

use std::borrow::Cow;

enum Action {
    A(String),
    
}
trait Component {
    fn update(&mut self, action: Action);
    fn render(&self);
}

#[derive(Debug)]
struct Span<'a> {
    content: Cow<'a, str>,
}

impl<'a> Span<'a> {
    fn raw<T: Into<Cow<'a, str>>>(content: T) -> Span<'a> {
        Span { content: content.into() }
    }
}

struct Home<'a> {
    spans: Vec<Span<'a>>
}

impl<'a> Component for Home<'a> {
    fn update(&mut self, action: Action) {
        match action {
            Action::A(s) => {
                let span = Span::raw(s);
                self.spans.push(span);
            }
            
        }
    }
    fn render(&self) {
        println!("Home {:?}", self.spans);
    }
}

fn main() {
    let mut home = Home {
        spans: vec![],
    };
    home.update(Action::A("aaa".into()));
    home.render();
}

Output:

Home [Span { content: "aaa" }]

So there's some inherent complexity that makes the app fail to compile that comes from the combination of the application types, traits, lifetimes etc. These are too complex to reasonable dig into without spending a bunch of time, so I'm not sure where the problem is. When I tried some small changes, changing one thing broke other code around it which requires knowledge about your entire app to understand the fixed.

Perhaps pull out everything that doesn't matter like this to find a minimum reproducible problem.
I didn't model the Box<> stuff around the components, but you might build up a model like this of the interactions of lifetime that would help find the exact point where things don't work.

@J-Bockhofer
Copy link
Author

Thanks a lot for all the insights and this simplified example!

I guess it comes down to the async-template that I based everything on.
Apart from all the complexity provided there, I didn't add anything other than the filewatcher, another Action::IONotify to set up message passing and the parsing of that message into a List with conditional styling.

This should've probably been an issue with the template rather than Ratatui.

Yes, I agree that the combination of all the complexities with the "async-component-actor" model makes it hard to reason about.
That's why I found it much simpler to resort to the solution mentioned above, which I wanted to share to save some potential hassle.

Thank you again for trying your luck with the puzzle of <'a>lifetime (sorry couldnt resist) and your time in general!
I think this can be marked as closed.

@kdheepak
Copy link
Collaborator

Maybe let's move this to a discussion? btw, I haven't had a chance to look into this yet.

My general comments are that

  1. I tend to avoid lifetimes as much as possible when prototyping and developing, and only opt to use them when I need performance.
  2. I tend to always create ratatui objects on the fly and never store any widgets, span, text etc in a struct that maintains the state of the program, even if it is less efficient to do that way initially.

@joshka
Copy link
Member

joshka commented Nov 28, 2023

Thank you again for trying your luck with the puzzle of <'a>lifetime (sorry couldnt resist) and your time in general!
I think this can be marked as closed.

LOL :D

Maybe let's move this to a discussion? btw, I haven't had a chance to look into this yet.

Sounds good

@ratatui-org ratatui-org locked and limited conversation to collaborators Nov 28, 2023
@joshka joshka converted this issue into discussion #652 Nov 28, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
Type: Enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants