Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: New Text API #2355

Open
naveen521kk opened this issue Dec 3, 2021 · 12 comments
Open

RFC: New Text API #2355

naveen521kk opened this issue Dec 3, 2021 · 12 comments
Assignees
Labels
enhancement Additions and improvements in general needs discussion Things which needs to be discussed before implemented.
Milestone

Comments

@naveen521kk
Copy link
Member

Currently, the text API which uses Pango isn't great and I would like to refactor it. Here is my proposal:

  1. New Class: FormattedString - It's similar to a list of strings but would give more control over the formatting of each string. Initialising it would require the following optional parameters:

    • text: The text to show. This would be added as the first in the list.
    • font: The fonts of the strings. Default when not overridden individually. (much like how text in MarkupText current is)
    • font_size: The size of the font. Default when not overridden individually.
    • fallback_font: Font to fall back to when the main one isn't available or doesn't contain a specific character. This can also be overridden individually.
    • color: The color of the text. Default to white. Probably should be overridden by themes.
    • strikethrough: whether to strikethrough
    • strikethrough_color: the color of strikethrough
    • insert_hypens: use hyphens when breaking a work (default to True)
    • underline: Underline the text.
    • overline: Draw an overline on text.
    • underline_color: Color of underline
    • overline_color: Color of overline
    • baseline_shift: Set's baseline shift for a given text
    • open_type_features: which features to enable
    • language: The language of the text
    • indent: the indent of the para
    • style: styles of text
    • weight: Font weight
    • letter_spacing: Space between letters

    and more see https://gnome.pages.gitlab.gnome.org/gtk/Pango/enum.AttrType.html

  2. The FormattedString will have the functions:

    • clear: clear all the texts stored
    • append: Add a new text. Accepts all arguments from __init__().
  3. Addition should be possible with FormattedString. If the addition is FormattedString and FormattedString it should add items from second to the first one. If the addition is FormattedString and str a new text should be added to the original one.

  4. The Text class which exists should accept an str and a FormattedString. The string passed here will be considered as Pango Markup.

  5. The MarkupText class will be deprecated.

  6. All the t2* parameters are deprecated and removed from the Text class.

  7. Paragraph will be aliased to Text

  8. Text will have new parameters width and height which will be considered when creating text and paragraphs would be broken based on this. By default width and height should be full screen.

  9. Text will be treated as a paragraph, which means it has parameters like
    - justify: Justify paragraph
    - align: Align the text should be left, right, center
    - line_height: Height of each line.

  10. Text should also accept parameters from FormattedString for backwards compatibility.

Example Code

class TextScene(Scene):
    def construct(self):
        a = FormattedString(font="Roboto")
        a += "Hello"
        a += "World"
        a.append("bold", weigth=BOLD)
        self.add(Text(a))

CC: @PhilippImhof

@naveen521kk naveen521kk added enhancement Additions and improvements in general needs discussion Things which needs to be discussed before implemented. labels Dec 3, 2021
@behackl
Copy link
Member

behackl commented Dec 3, 2021

This seems fairly reasonable, and like a good development to me. Some comments and questions:

  • The underline_color and underline keyword arguments (etc) could potentially be combined to just underline for convenience.
  • It would be very nice if Text (and probably also FormattedString) had text as a property. We often get requests for supporting syntax like my_text.animate.set(text=...)�, and that might be a clean way to do that.
  • About setting width and height of texts automatically: Would this mean that the width and height of the mobject Text("hello!") would also be the full screen size? Or would these "text width" and "text height" only be used for text formatting? Would it make sense to separate the behavior for Text (not respecting any width/height by default) and Paragraph (for which width/height is respected automatically, and if not specified they are set to the scene dimensions)?
  • What would be the result of FormattedString("hello", color=YELLOW) + FormattedString("world", underline=True, underline_color=RED)?

@naveen521kk
Copy link
Member Author

naveen521kk commented Dec 3, 2021

The underline_color and underline keyword arguments (etc) could potentially be combined to just underline for convenience.

👍. Looks good, probably a dataclass representing them would be better.

It would be very nice if Text (and probably also FormattedString) had text as a property. We often get requests for supporting syntax like my_text.animate.set(text=...)�, and that might be a clean way to do that.

Should be possible but I guess only for Text. I don't think FormattedString needs something like that.

About setting width and height of texts automatically: Would this mean that the width and height of the mobject Text("hello!") would also be the full screen size? Or would these "text width" and "text height" only be used for text formatting?

The size of "hello" would depend on the font_size parameter and not width and height. Actually, width and height would only make a difference when there is a multiline text or a long line. If the line exceeds the screen(or the specified width) it will be wrapped to the next line. Probably calling it bbox is better I think. Thoughts?

Would it make sense to separate the behavior for Text (not respecting any width/height by default) and Paragraph (for which width/height is respected automatically, and if not specified they are set to the scene dimensions)?

As explained above width/height would make difference only for long sentences or paragraphs I think it's better to have Text the same as Paragraph. That would mean, Paragraph is just an alias for Text.

What would be the result of FormattedString("hello", color=YELLOW) + FormattedString("world", underline=True, underline_color=RED)``?

hello would be in YELLOW color and world would be the default color (ie white) and underlined in RED color.

@PhilippImhof
Copy link
Member

I like the approach very much, it has several very good points:

  • In order to bring non-LaTeX (I will not continue to explicitly say it for the remainder of my comment) text to the next level, it seems important for me to get rid of the old (archaic) t2* formatting. While I consider PangoMarkup like in MarkupText as the easiest solution, using PangoAttrType will eventually offer more possibilities AFAICS. Your approach would both and thus bring the best of both worlds.
  • I like the idea of the + operator. However, as @behackl already wrote, I see that there is a huge risk of "unexpected" behaviour. If I have the yellow text HELLO and I add just WORLD to it, I would expect it to be yellow as well, unless I explicitly ordered WORLD to have another colour. Others might expect something else.
  • Having Paragraph adapted to the new format (and thus working with PangoMarkup) would allow for easy typesetting of code, because we are already using Pygments and Pygments has a PangoMarkup formatter.

@naveen521kk
Copy link
Member Author

I see that there is a huge risk of "unexpected" behaviour.

What would be the better approach here to avoid confusion?

@PhilippImhof
Copy link
Member

I do not have one. I even doubt there is one, because one might expect several things and all of them could be reasonable. One thing I can see is that we might distinguish between attributes that have been explicitly set (those should persist) and attributes that have their default value (they should be superseded by the existing value):

  • FormattedString("hello", color=YELLOW) + FormattedString("world") should be all in yellow, because no color is set for the second part
  • FormattedString("hello", color=YELLOW) + FormattedString("world", color=WHITE) should have hello in yellow and world in white.

However, one might argue that this is not to be expected either, because FormattedString("world") is white by default and it should maybe remain white even if added to some other text.

citrusmunch added a commit to citrusmunch/manim that referenced this issue Dec 20, 2021
Specifically fixes ManimCommunity#2366 by correcting the shape of the `VGroup` of
`VGroup`s.
Attempted to balance keeping all names and methods intact, while still
cleaning things up since we're expecting this class to eventually be
aliased in the rework for `Text` (ManimCommunity#2355).
@YishiMichael
Copy link

Some personal opinions on the new implementation of Text:

  • Due to the existance of ligatures, the result of slicing mobjects may be not expected (e.g. Text("flame")[:2] may contain 3 characters). So let's just give up slicing through text string itself, let alone markup tags that may be contained.
  • I'd suggest to introduce in a third-party package, svgelements, for svg parsing --- in ManimGL I've committed a PR to do so, and the code becomes much cleaner. Although this parser still has something not supported, it provides more than current parsing. However, I'm not sure if CE would like to include another third-party package.
  • The user should be able to control where a long string should wrap with \n character. Currently in both ManimGL and ManimCE a long text will be automatically wrapped, due to the width parameter passed to cairo surface (See the following picture, using ManimGL. In ManimCE, the font is scaled so that a line can hold more words, but to a certain extent it also wraps). The side effect brought by this parameter should be eliminated.
  • I'm not in favor of the syntax FormattedString("hello") + " world", which appears confusing to me. The + operator in FormattedString("hello") + FormattedString("world") can just inherit from VMobject + VMobject.
intro_words = Text("""
    The original motivation for manim was to
    better illustrate mathematical functions
    as transformations.
""")
self.add(intro_words)

IntroWords

@behackl
Copy link
Member

behackl commented Feb 23, 2022

I'd suggest to introduce in a third-party package, svgelements, for svg parsing --- in ManimGL I've committed a 3b1b/manim#1731 to do so, and the code becomes much cleaner. Although this parser still has something not supported, it provides more than current parsing. However, I'm not sure if CE would like to include another third-party package.

Yes. We should not reinvent the wheel with SVG parsing given how complex it is. Replacing the current parser with a third-party library would without a doubt be healthy for the library! svgelements in particular has also been mentioned in the related issue, #1040.

@naveen521kk
Copy link
Member Author

So let's just give up slicing through text string itself, let alone markup tags that may be contained.

Yeah, I plan to drop support for slicing, instead, users can use the Markup or use FormattedString as proposed here.

The user should be able to control where a long string should wrap with \n character. Currently in both ManimGL and ManimCE a long text will be automatically wrapped, due to the width parameter passed to cairo surface (See the following picture, using ManimGL. In ManimCE, the font is scaled so that a line can hold more words, but to a certain extent it also wraps). The side effect brought by this parameter should be eliminated.

I think it would be better to wrap the lines if it exceeds the screen size, rather than a fixed width as done currently. Also, there would be a parameter to not automatically wrap lines in FormattedString so that users can control it by adding new lines manually.

I'm not in favor of the syntax FormattedString("hello") + " world", which appears confusing to me. The + operator in FormattedString("hello") + FormattedString("world") can just inherit from VMobject + VMobject.

I don't think FormattedString will be a Mobject since it doesn't render anything, but instead, it can be combined in Text to render. Also, are VMobject are addable (ie is VMobject + VMobject even possible?)?
I wanted FormattedString to inherit from str since it's literally storing a string but with extra data and that would mean it should support adding str as well. I think if it creates a lot of confusion we can avoid doing it? I would like to hear more thoughts about that.

@YishiMichael
Copy link

YishiMichael commented Feb 23, 2022

Oh, I regarded FormattedString as a child class of Mobject. Sorry for misleading. In that case it becomes reasonable for me.

Also, are VMobject are addable (ie is VMobject + VMobject even possible?)?

VMobject.__add__(self, vmobject) is implemented which returns a new VMobject. See code here. But never mind then...

@icedcoffeeee
Copy link
Collaborator

Another idea for this Text refactor: in regards to fixing AddTextWordByWord, maybe indexing a Text mobject should return each word separately?

@naveen521kk
Copy link
Member Author

maybe indexing a Text mobject should return each word separately?

I think it would break the Write animation then.

@naveen521kk
Copy link
Member Author

Just to post some updates here, I have done some work on ManimPango; see https://github.com/ManimCommunity/ManimPango/milestone/1. It is a huge refactor, would take a lot of time and is still a WIP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Additions and improvements in general needs discussion Things which needs to be discussed before implemented.
Projects
Status: 🆕 New
Development

No branches or pull requests

5 participants