Construct MTex #1678

Merged
merged 32 commits into from
Dec 14, 2021

@YishiMichael YishiMichael commented Nov 12, 2021

Updated on 11/28 (GMT+8): Some information in this comment may be outdated. Please see the latest comment.

Motivation

As I mentioned in #1677, I've recently been working on refactoring the Tex mobject. Manim generates all substrings of a Tex, namely the SingleStringTex objects, and calculates their lengths in order to figure out the indices of certain submobjects. This can make the program slow when a Tex is split into many components, which is common for complicated math formulas. Even worse, a symbol may be rendered with a different number of glyphs depending on context, which prevents us from indexing the mobject correctly.

My trick is to insert some \color commands into the tex code. For example, Tex("abc", isolate=["a"]) will send "{\color[RGB]{0, 0, 1} a}bc" to be compiled. In most cases, adding \color won't change the shapes of glyphs, but the svg output will carry color information that marks out the key glyphs. This lets us finish the indexing work while compiling just one tex file.
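The labelling idea can be sketched in a few lines of Python. This is only an illustration and not the code in this pull request: label_to_rgb, rgb_to_label and insert_color_labels are hypothetical names, and the mapping of label n to a 24-bit RGB triple is an assumption, chosen so that the first isolated substring gets (0, 0, 1) as in the example above. For simplicity, every occurrence of the same substring gets the same label here.

```python
import re


def label_to_rgb(label):
    """Encode a positive integer label as an (r, g, b) triple (assumed scheme)."""
    rg, b = divmod(label, 256)
    r, g = divmod(rg, 256)
    return r, g, b


def rgb_to_label(r, g, b):
    """Inverse of label_to_rgb."""
    return (r * 256 + g) * 256 + b


def insert_color_labels(tex_string, isolate):
    """Wrap each occurrence of an isolated substring in a \\color group."""
    if not isolate:
        return tex_string
    labels = {sub: i for i, sub in enumerate(isolate, start=1)}
    # Longest alternatives first, so a longer substring wins over its prefix.
    pattern = "|".join(
        re.escape(sub) for sub in sorted(labels, key=len, reverse=True)
    )

    def wrap(match):
        r, g, b = label_to_rgb(labels[match.group(0)])
        return "{\\color[RGB]{%d, %d, %d} %s}" % (r, g, b, match.group(0))

    # Single pass over the original string: inserted commands are not rescanned.
    return re.sub(pattern, wrap, tex_string)


print(insert_color_labels("abc", ["a"]))  # {\color[RGB]{0, 0, 1} a}bc
```

Reading the fill colors back from the svg and passing them through rgb_to_label would then recover which glyphs belong to which isolated substring.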

However, this still doesn't fix all the problems. I'll briefly list the pros and cons of this pull request.

Pros

  • Every Tex corresponds to exactly 1 compilation, which is the greatest benefit and can speed up rendering.
  • Indexing problems no longer occur, and the coloring issue is fixed. For example, the \frac command works when the denominator is expected to be colored.
  • The preprocessing step, including balancing brackets and filling in missing parameters, is no longer needed and is discarded.
  • The hash key is generated from the combination of the full tex_string and substrings_to_isolate, and determines whether a Tex needs to be compiled again. This fixes the issue where two Tex with the same tex_string but different substrings were compiled only once, and indexing of the later-constructed one crashed.
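The hash-key point can be illustrated as follows. This is a minimal sketch, not the function used in manimlib: tex_hash_key is a hypothetical name, and the use of SHA-256 with a length prefix per part is an assumption made for the example.

```python
import hashlib


def tex_hash_key(tex_string, substrings_to_isolate):
    """Hash the full tex string together with the substrings to isolate."""
    hasher = hashlib.sha256()
    for part in [tex_string, *substrings_to_isolate]:
        encoded = part.encode("utf-8")
        # The length prefix keeps ("ab", ["c"]) distinct from ("a", ["bc"]).
        hasher.update(len(encoded).to_bytes(4, "big"))
        hasher.update(encoded)
    return hasher.hexdigest()


same_tex_a = tex_hash_key("abc", ["a"])
same_tex_b = tex_hash_key("abc", ["b"])
print(same_tex_a != same_tex_b)  # True: different substrings, different key
```

With a key like this, two Tex objects that share a tex_string but isolate different substrings map to different cached files.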

Cons

  • In some cases the shapes of glyphs still change when \color is inserted. For instance, when x_{0}^{2} becomes {\color[RGB]{0, 0, 1} x_{0}}^{2}, the exponent 2 is shifted rightward.
  • It's uncertain whether the tex_strings of all components actually match them. A mismatch may happen if a tex_substring actually corresponds to no mobjects, but the program matches it to the next group of mobjects.
  • The class SingleStringTex is removed, which may break some old scenes. This can be fixed simply by replacing it with Tex.
  • Without the bracket-balancing process, the tex_string and substrings_to_isolate (including isolate and tex_to_color_map) provided must have balanced brackets, or a compilation error will occur.
  • Every Tex with at least one substrings_to_isolate provided will be recompiled, since the files now contain \color commands, which previous versions never inserted.
  • mob.get_tex() is preserved only as a temporary measure.
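Since balancing is now the caller's responsibility, a quick check before constructing a Tex may be useful. The sketch below is an assumption about what such a check could look like, not part of this pull request; is_brace_balanced is a hypothetical helper. It only looks at curly braces and treats a brace as escaped when it follows an odd number of backslashes; it does not check \left / \right pairs.

```python
def is_brace_balanced(tex):
    """Return True if the unescaped { and } in `tex` pair up correctly."""
    depth = 0
    backslashes = 0  # length of the backslash run just before this character
    for char in tex:
        if char == "\\":
            backslashes += 1
            continue
        # A character is escaped only after an odd number of backslashes.
        escaped = backslashes % 2 == 1
        backslashes = 0
        if escaped:
            continue
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


print(is_brace_balanced("x_{0}^{2}"))  # True
print(is_brace_balanced("{a"))         # False
```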

Last but not least, this pull request is newly written and is far from sufficiently tested. Any discussion or better strategy for fixing this kind of problem is welcome!

Test

Code:

from manimlib import *


TEST_STR = """\\lim_{{n} \\to \\infty} \\left\\lfloor
\\sqrt{\\frac{1}{{n} !} \\mathrm{e}^{n} {a}_{n} + b_{n}^{p}} \\otimes
\\sqrt[{n}]{\\sum_{m = 0}^{{n}^{2}} \\tilde{c}_{m \\cdot {n}}^{b_{n}^{p}
\\cos \\left( \\theta \\right)}} \\right\\rfloor""".replace("\n", " ")


class TestScene(Scene):
    def construct(self):
        tex1 = Tex(
            TEST_STR,
            fill_color=TEAL
        )
        tex1.shift(2 * UP)
        tex2 = Tex(
            TEST_STR,
            tex_to_color_map={
                #"\\sum": BLUE_E,
                "b_{n}": LIGHT_PINK,
                "{n}": YELLOW
            },
            fill_color=TEAL
        )
        tex2.shift(2 * DOWN)
        self.add(tex1, tex2)

        # Labels of indices for debugging
        self.add(
            #index_labels(tex1[0]),
            #index_labels(tex2),
            #*[index_labels(submob) for submob in tex2]
        )

Result: [image: TestScene]

@YishiMichael YishiMichael changed the title Refactor Tex Construct MTex Nov 27, 2021

YishiMichael commented Nov 27, 2021

I daren't modify the Tex class as it's deeply rooted in manim, so I've written a new class called MTex as a substitute. This class shares some similarities with Tex but also has some differences. Why choose "M" as a prefix? Because it stands for "magic".

  • MTex takes exactly one string and has no arg_separator argument. One may simply use the join function to work around this.
  • Every MTex is compiled only once, no matter how many substrings are specified in isolate and tex_to_color_map. However, each substring must be a valid tex string on its own: strings like {a or \right) are no longer allowed, since MTex no longer provides automatic brace balancing. Also, passing strings that overlap (like ab and bc, which can simply be replaced by a, b and c) may cause problems and is strongly discouraged.
  • One of the main goals of this PR is to avoid wasting time finding the indices of the tex mobjects to be colored. Note that get_parts_by_tex is always case sensitive, and strings cannot be matched as substrings in MTex.
# Substrings enclosed by a pair of braces (not escaped characters) are automatically isolated.
# tex1 = Tex("x = \\frac{-b \\pm \\sqrt{b^2-4ac}}{2a}", isolate=["{b^2-4ac}", "{2a}"])
tex1 = MTex("x = \\frac{-b \\pm \\sqrt{b^2-4ac}}{2a}")
# Needn't use `isolate` or `tex_to_color_map`
tex1.set_color_by_tex("{b^2-4ac}", YELLOW)
tex1.set_color_by_tex("{2a}", BLUE)

# Furthermore, subscripts and superscripts are also automatically isolated.
tex2 = MTex("a^2 + b^2 = c^2")
# Either would be OK:
# tex2.set_color_by_tex("2", LIGHT_PINK)
tex2.set_color_by_tex("^2", LIGHT_PINK)

# A variable with a subscript and a superscript won't cause problems.
tex3 = MTex("N = p_1^{r_1} p_2^{r_2} \\cdots p_s^{r_s}", tex_to_color_map={
    "p_1": GREEN,
    "p_2": RED,
    "p_s": BLUE,
})

# You can even color a string as long as their substrings are all isolated.
# You may use `get_all_isolated_substrings()` method to check all the substrings isolated.
# Isolating " \\cdot " is not recommended since `{ \cdot }` eliminates the space on both sides of the operator.
tex4 = MTex("Q = {c_m} \\cdot {m} \\cdot {\\Delta T}", isolate=[" \\cdot "])
tex4.set_color_by_tex("{c_m} \\cdot {m} \\cdot {\\Delta T}", RED)

self.add(VGroup(tex1, tex2, tex3, tex4).arrange(DOWN, buff=1))

The result of the code above is shown here: [image: TestScene]

However, the MTex class isn't able to assign a corresponding tex string to each component, since the tex string doesn't always align well with the svg generated, so the get_tex() method is removed. As a result, MTex cannot work with TransformMatchingTex. This may be fixed in the future.

I use full_tex to generate hash keys, so two tex files that should contain different contents must be generated separately.

You are welcome to discuss any bugs or issues encountered with this new class!

@YishiMichael YishiMichael marked this pull request as ready for review November 27, 2021 09:41
@TonyCrane TonyCrane requested a review from 3b1b November 27, 2021 10:07

3b1b commented Nov 28, 2021

Thanks for doing this! It's a clever solution to the problem. I'll take a closer look and see if I can find any edge cases that outweigh the pros, but overall I think replacing Tex with this implementation will be the right way to go.


YishiMichael commented Nov 29, 2021

Let me clarify some main differences between methods of Tex and MTex.

SingleStringTex.get_tex(self)

Tex is made up of a series of SingleStringTex, each with this method, which returns a partial tex string. However, submobjects of MTex don't have such a method. The MTex version may be implemented in the future. (Updated on 12/7 (GMT+8): Already implemented)

Tex.get_parts_by_tex(self, tex, substring=True, case_sensitive=True)
Tex.get_part_by_tex(self, tex, **kwargs)

The former returns a VGroup containing several submobjects, which are of type SingleStringTex. The latter returns the first submobject matched, or None if nothing can be matched. The tex string can optionally be matched in substring or case_sensitive mode.

MTex.get_parts_by_tex(self, tex)
MTex.get_part_by_tex(self, tex, index=0)

The former returns a VGroup with several parts. Note that in MTex, a part is itself a VGroup of submobjects, so a part and a submobject sit at different levels of the hierarchy. The latter returns the index-th element of the former result, so it may raise IndexError. The tex string must be decomposable into substrings that are already isolated.
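The decomposition requirement can be checked at the string level with a small dynamic program. This is a sketch under assumptions: can_decompose is a hypothetical helper, and the real MTex matching may work on spans rather than plain strings. The isolated substrings below are taken from the tex4 example earlier in this thread.

```python
def can_decompose(tex, isolated_substrings):
    """Return True if `tex` is a concatenation of isolated substrings."""
    pieces = [s for s in set(isolated_substrings) if s]
    # reachable[i] is True if tex[:i] can be written as such a concatenation.
    reachable = [True] + [False] * len(tex)
    for end in range(1, len(tex) + 1):
        reachable[end] = any(
            len(piece) <= end
            and reachable[end - len(piece)]
            and tex.startswith(piece, end - len(piece))
            for piece in pieces
        )
    return reachable[len(tex)]


isolated = ["{c_m}", "{m}", "{\\Delta T}", " \\cdot "]
print(can_decompose("{c_m} \\cdot {m} \\cdot {\\Delta T}", isolated))  # True
print(can_decompose("c_m", isolated))  # False
```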

Tex.index_of_part(self, part, start=0)
Tex.index_of_part_by_tex(self, tex, start=0, **kwargs)

Return the index of the submobject, namely the part. An optional argument start may be passed.

MTex.indices_of_part(self, part)
MTex.indices_of_part_by_tex(self, tex, index=0)
MTex.slice_of_part(self, part)
MTex.slice_of_part_by_tex(self, tex, index=0)
MTex.index_of_part(self, part)
MTex.index_of_part_by_tex(self, tex, index=0)

The indices_of_part() method returns the index of each submobject of part within the submobjects of MTex. slice_of_part() takes only the first and the last indices returned by indices_of_part(), and returns the slice spanned by them. It's assumed that the order of submobjects preserves that of the tex string (at least in this scope). index_of_part() returns the first index returned by indices_of_part().
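The relationship between the three methods can be shown with plain lists. This is an illustrative sketch only: the functions below are free functions operating on lists of strings that stand in for glyph submobjects, not the MTex methods themselves.

```python
def indices_of_part(submobjects, part):
    """Indices in `submobjects` of every element of `part`."""
    return [submobjects.index(item) for item in part]


def slice_of_part(submobjects, part):
    """Slice running from the first to the last index of `part`."""
    indices = indices_of_part(submobjects, part)
    return slice(indices[0], indices[-1] + 1)


def index_of_part(submobjects, part):
    """First index of `part`."""
    return indices_of_part(submobjects, part)[0]


# Unique names stand in for the glyphs of "x = a + b".
glyphs = ["x", "=", "a", "+", "b"]
part = ["a", "+", "b"]
print(indices_of_part(glyphs, part))  # [2, 3, 4]
print(slice_of_part(glyphs, part))    # slice(2, 5, None)
print(index_of_part(glyphs, part))    # 2
```

The slice is only meaningful when the glyphs of the part are contiguous and in tex order, which is the assumption stated above.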

MTex.get_all_isolated_substrings(self)

Returns all substrings that are already isolated. It's implemented here to make the class more user-friendly.


YishiMichael commented Dec 6, 2021

I've just finished implementing the get_tex() method for submobjects of MTex. I don't know whether it works reliably in the many cases I haven't tested so far, and it may sometimes assign meaningless tex_strings to submobjects because of the ordering issue. As an example, a fraction line will be assigned an empty string if the tex given is \frac{1}{2}. The print_tex_strings_of_submobjects() method is provided for debugging, and works well with index_labels().
Anyway, MTex is now almost fully compatible with Tex; at least all methods are implemented. One may use TransformMatchingTex with MTex objects.

@YishiMichael

@3b1b I think the code is ready for review now. Sorry for forgetting to mark this as a draft earlier. So far this implementation works well in my LaTeX environment. I wonder whether some bugs still remain in edge cases...


behackl commented Dec 10, 2021

Hey @YishiMichael! This is a very clever approach which we'd also like to include over at https://github.com/ManimCommunity/manim -- are you either interested in submitting a similar PR there, or are you otherwise okay with someone of the community devs taking care of that for you?

(Inclusion of your new classes as-is should be a more or less straightforward process, replacing the current default Tex classes could be a completely separate, second step.)

@YishiMichael

I noticed that automatically breaking up all paired braces causes unexpected breaks in cases like \begin{matrix} and \hspace{3em}. So I added an unbreakable_commands parameter, where users can list commands whose following braces won't be broken up. It defaults to ["\\begin", "\\end"].
@behackl Sorry for replying so late; I've been busy with school work these days. Thanks for your approval! I'd prefer that someone from the community help maintain the code, and add more features or make modifications if necessary. To be honest, some methods in this class may still have room for a better implementation, or even have bugs I haven't discovered so far, so I'd be glad if someone could come up with ideas and further improve the code.
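The effect of unbreakable_commands can be sketched as follows. This is a simplified illustration, not the parser in this pull request: breakable_brace_groups is a hypothetical function, it treats a brace as escaped when it follows an odd number of backslashes, and it only recognizes a command when it sits immediately before the brace, with no whitespace in between.

```python
def breakable_brace_groups(tex, unbreakable_commands=("\\begin", "\\end")):
    """Return brace-enclosed substrings, skipping those right after an unbreakable command."""
    groups = []
    stack = []  # (start index, breakable flag) for each currently open brace
    backslashes = 0
    for i, char in enumerate(tex):
        if char == "\\":
            backslashes += 1
            continue
        escaped = backslashes % 2 == 1
        backslashes = 0
        if escaped:
            continue
        if char == "{":
            breakable = not tex[:i].endswith(tuple(unbreakable_commands))
            stack.append((i, breakable))
        elif char == "}" and stack:
            start, breakable = stack.pop()
            if breakable:
                groups.append(tex[start:i + 1])
    return groups


print(breakable_brace_groups("\\begin{matrix} {a} & {b} \\end{matrix}"))
# ['{a}', '{b}']
```

Passing a longer tuple, for example one that also contains "\\hspace", would keep {3em} in \hspace{3em} from being treated as an isolated group.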

@TonyCrane

@3b1b I think it's time to merge this. What about you?

- Split out `_TexParser` class
- Replace `math_mode` parameter with `tex_environment`
- Fix the bug that braces following even number of backslashes aren't matched

3b1b commented Dec 14, 2021

I agree, it looks good to me. I'll merge it, then make a separate PR to replace Tex and TexText with this.

@3b1b 3b1b merged commit 3adaf8e into 3b1b:master Dec 14, 2021
TonyCrane added a commit that referenced this pull request Dec 14, 2021

mark-zz commented Dec 20, 2021

Why do I get the same error below whether I use the examples above or pass anything else to MTex?
My environment: Win10, Python 3.9, Conda. Using Tex works fine.

----> 1 b = MTex("hello")

d:\appsgrouped\conda\envs\py39\lib\site-packages\manimlib\mobject\svg\mtex_mobject.py in __init__(self, tex_string, **kwargs)
    283 for submob in tex_hash_to_mob_map[hash_val]
    284 ])
--> 285 self.build_submobjects()
    286
    287 self.init_colors()

d:\appsgrouped\conda\envs\py39\lib\site-packages\manimlib\mobject\svg\mtex_mobject.py in build_submobjects(self)
    322 self.group_submobjects()
    323 self.sort_scripts_in_tex_order()
--> 324 self.assign_submob_tex_strings()
    325
    326 def group_submobjects(self):

d:\appsgrouped\conda\envs\py39\lib\site-packages\manimlib\mobject\svg\mtex_mobject.py in assign_submob_tex_strings(self)
    400 curr_labels, prev_labels, next_labels
    401 ):
--> 402 curr_span_tuple = label_dict[curr_label]
    403 prev_span_tuple = label_dict[prev_label]
    404 next_span_tuple = label_dict[next_label]

KeyError: -1


YishiMichael commented Dec 20, 2021

@mark-zz That's weird, as it works fine on my computer (Windows 10). I guess LaTeX compilers may be behaving differently. Could you share the svg file that was generated?
Mine is shown below (I use MiKTeX):

MTex("hello")
<?xml version='1.0' encoding='UTF-8'?>
<!-- This file was generated by dvisvgm 2.11.1 -->
<svg version='1.1' xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink' width='22.707525pt' height='7.291664pt' viewBox='88.501797 -56.729015 22.707525 7.291664'>
<defs>
<path id='g0-101' d='M1.963499-2.425498C2.267998-2.425498 3.044998-2.446498 3.569998-2.666998C4.304997-2.981998 4.357497-3.601497 4.357497-3.748497C4.357497-4.210497 3.958497-4.640997 3.233998-4.640997C2.068499-4.640997 .483-3.622497 .483-1.784999C.483-.714 1.102499 .1155 2.131499 .1155C3.632997 .1155 4.514997-.997499 4.514997-1.123499C4.514997-1.186499 4.451997-1.259999 4.388997-1.259999C4.336497-1.259999 4.315497-1.238999 4.252497-1.154999C3.422998-.1155 2.278498-.1155 2.152499-.1155C1.333499-.1155 1.238999-.997499 1.238999-1.333499C1.238999-1.459499 1.249499-1.784999 1.406999-2.425498H1.963499ZM1.469999-2.656498C1.879499-4.252497 2.960998-4.409997 3.233998-4.409997C3.727497-4.409997 4.010997-4.105497 4.010997-3.748497C4.010997-2.656498 2.330998-2.656498 1.900499-2.656498H1.469999Z'/>
<path id='g0-104' d='M3.013498-7.171495C3.013498-7.181995 3.013498-7.286995 2.876998-7.286995C2.635498-7.286995 1.868999-7.202995 1.595999-7.181995C1.511999-7.171495 1.396499-7.160995 1.396499-6.971995C1.396499-6.845995 1.490999-6.845995 1.648499-6.845995C2.152499-6.845995 2.173498-6.772495 2.173498-6.667495L2.141999-6.457496L.6195-.4095C.5775-.2625 .5775-.2415 .5775-.1785C.5775 .063 .787499 .1155 .881999 .1155C1.049999 .1155 1.217999-.0105 1.270499-.1575L1.469999-.955499L1.700999-1.900499C1.763999-2.131499 1.826999-2.362498 1.879499-2.603998C1.900499-2.666998 1.984499-3.013498 1.994999-3.076498C2.026499-3.170998 2.351998-3.758997 2.708998-4.042497C2.939998-4.210497 3.265498-4.409997 3.716997-4.409997S4.283997-4.052997 4.283997-3.674997C4.283997-3.107998 3.884997-1.963499 3.632997-1.322999C3.548998-1.081499 3.496498-.955499 3.496498-.745499C3.496498-.252 3.863997 .1155 4.357497 .1155C5.344496 .1155 5.732996-1.417499 5.732996-1.501499C5.732996-1.606499 5.638496-1.606499 5.606996-1.606499C5.501996-1.606499 5.501996-1.574999 5.449496-1.417499C5.291996-.860999 4.955997-.1155 4.378497-.1155C4.199997-.1155 4.126497-.2205 4.126497-.462C4.126497-.724499 4.220997-.976499 4.315497-1.207499C4.483497-1.658999 4.955997-2.908498 4.955997-3.517498C4.955997-4.199997 4.535997-4.640997 3.748497-4.640997C3.086998-4.640997 2.582998-4.315497 2.194498-3.832497L3.013498-7.171495Z'/>
<path id='g0-108' d='M2.708998-7.171495C2.708998-7.181995 2.708998-7.286995 2.572498-7.286995C2.330998-7.286995 1.564499-7.202995 1.291499-7.181995C1.207499-7.171495 1.091999-7.160995 1.091999-6.961495C1.091999-6.845995 1.196999-6.845995 1.354499-6.845995C1.858499-6.845995 1.868999-6.751495 1.868999-6.667495L1.837499-6.457496L.5145-1.207499C.483-1.091999 .462-1.018499 .462-.850499C.462-.252 .923999 .1155 1.417499 .1155C1.763999 .1155 2.026499-.0945 2.204998-.4725C2.393998-.871499 2.519998-1.480499 2.519998-1.501499C2.519998-1.606499 2.425498-1.606499 2.393998-1.606499C2.288998-1.606499 2.278498-1.564499 2.246998-1.417499C2.068499-.734999 1.868999-.1155 1.448999-.1155C1.133999-.1155 1.133999-.4515 1.133999-.5985C1.133999-.850499 1.144499-.902999 1.196999-1.102499L2.708998-7.171495Z'/>
<path id='g0-111' d='M4.924497-2.866498C4.924497-3.958497 4.189497-4.640997 3.244498-4.640997C1.837499-4.640997 .4305-3.149998 .4305-1.658999C.4305-.6195 1.133999 .1155 2.110499 .1155C3.506998 .1155 4.924497-1.333499 4.924497-2.866498ZM2.120999-.1155C1.669499-.1155 1.207499-.441 1.207499-1.259999C1.207499-1.774499 1.480499-2.908498 1.816499-3.443998C2.341498-4.252497 2.939998-4.409997 3.233998-4.409997C3.842997-4.409997 4.157997-3.905997 4.157997-3.275998C4.157997-2.866498 3.947997-1.763999 3.548998-1.081499C3.181498-.4725 2.603998-.1155 2.120999-.1155Z'/>
</defs>
<g id='page1'>
<g fill='#000001'>
<use x='88.501797' y='-49.437352' xlink:href='#g0-104'/>
<use x='94.551468' y='-49.437352' xlink:href='#g0-101'/>
<use x='99.440545' y='-49.437352' xlink:href='#g0-108'/>
<use x='102.780139' y='-49.437352' xlink:href='#g0-108'/>
<use x='106.119733' y='-49.437352' xlink:href='#g0-111'/>
</g>
</g>
</svg>


mark-zz commented Dec 20, 2021

@YishiMichael Here is the svg generated by Tex('hello'). I can't generate using MTex because it keeps popping up the key errors.
I use MikTex too.

@TonyCrane

> @YishiMichael Here is the svg generated by Tex('hello'). I can't generate using MTex because it keeps popping up the key errors. I use MikTex too.

Please open a new issue and attach your .svg file.

@YishiMichael

> @YishiMichael Here is the svg generated by Tex('hello'). I can't generate using MTex because it keeps popping up the key errors. I use MikTex too.

It appears to me that the .svg file is generated successfully, and the error occurs when parsing it. You can look in the directory where these intermediate files are stored and find the specific .svg file. I need to check it to figure out whether it was generated in an unexpected format.
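For anyone inspecting such a file by hand, the fill values can be listed with the standard library. This is a diagnostic sketch only, not the parser MTex uses: fill_labels is a hypothetical helper, the sample string is a cut-down stand-in for a real dvisvgm file, and reading a #rrggbb fill as an integer is an assumption about how the colors are meant to be interpreted.

```python
import xml.etree.ElementTree as ET


def fill_labels(svg_text):
    """Collect the integer value of every #rrggbb fill attribute, in document order."""
    labels = []
    for element in ET.fromstring(svg_text).iter():
        fill = element.get("fill")
        if fill and fill.startswith("#") and len(fill) == 7:
            labels.append(int(fill[1:], 16))
    return labels


# Cut-down stand-in for an svg like the one posted above.
sample = (
    "<svg xmlns='http://www.w3.org/2000/svg'>"
    "<g id='page1'><g fill='#000001'><path d='M0 0'/></g></g>"
    "</svg>"
)
print(fill_labels(sample))  # [1]
```

Comparing this list between a working and a failing setup may show whether the two svg files carry fill attributes in the same form.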
