# toHTML() method to generate HTML output #690

Closed
opened this Issue Jul 19, 2016 · 49 comments

Projects
None yet
3 participants
Contributor

### Nekomajin42 commented Jul 19, 2016 • edited

Hi Jos!

After the discussion in a previous issue (#684), I decided to give the toHTML() function a try. The regexp and the tree.traverse() solutions were both messy, and I really needed a customisable HTML output.

At the end, I implemented a toHTML() function to every node, and it works like a charm. I'd like to contribute it, if it's OK to you. It would be a good addition to the library for those who want to build web-based interfaces on top of math.js.

The idea was to enclose every element of the expression between <span> tags and give them semantic class names. In this way, I get a nice structure, which can be styled easily.

# Examples

For the expression 2+2=5 the output will be:
<span class="number">2</span>
<span class="operator binary explicit">+</span>
<span class="number">2</span>
<span class="operator assignment">=</span>
<span class="number">5</span>

For the expression (a<42)?true:false the output will be:
<span class="parenthesis round">(</span>
<span class="symbol">a</span>
<span class="operator binary explicit"><</span>
<span class="number">42</span>
<span class="parenthesis round">)</span>
<span class="operator conditional">?</span>
<span class="boolean">true</span>
<span class="operator conditional">:</span>
<span class="boolean">false</span>

# Styling

Using the class names, it is really easy to assign different colors or other styles to the elements. The user even has the ability to decide about spaces without tempering the toString() function.

An example for the first expression:
.number {color: blue;}
.operator::before, .operator::after {content: " ";}

If you give it a go, I will send a pull request and I can write a documentation too. (However, I have 18 pending items in the test, and I don't understand how I caused them, because I did nothing else besides this implementation in the *Node.js files.)

Owner

### josdejong commented Jul 21, 2016

 That's nice! Thanks for your offer to contribute! Your proposal looks good. There is only one thing: it may be wise to give the classes a more specific class name like math-number, math-operator, etc. to prevent possible CSS collisions with frameworks like bootstrap. What do you think? As for the problems with the tests: can you tell me what errors you get?

Contributor

### Nekomajin42 commented Jul 21, 2016 • edited

 It's a good idea! I will change the class names. After the build, I will run a test too and show you the results. However, a few additions to the nodes would be helpful. First of all, there was a discusson before about the introduction of UnitNodes. What is the status of that? My greates problem is still that a SymbolNode can contain a lot of different things. It might be indifferent for the parser, but it's really hard to differentiate between them for me. It is easy to filter a few special symbold like i, NaN, etc., but to do it with units... Not to mention the user defined units. Second, the function and operator nodes could have a category property. So we can append them as class names too, like .math-arithmetic-function. In this way, we can have different styles for constructors, expression functions, utils and such. The same for operators. We might want to display a+b without spaces, but a and b is bad without them. Third, the ConstantNode should specify the type of numbers. Besides "number", there should be a "BigNumber", "Complex" and "Fraction" value. Please, tell me what do you think!
Contributor

### Nekomajin42 commented Jul 21, 2016

 Curretly the following classes exist in the toHTML() function: .math-number - Numbers and "Infinity" .math-string - Strings and "NaN" .math-boolean - "true" and "false" .math-variable - Variables names (currently working only with object properties) .math-property - Property names of objects (currently working only with object properties) .math-keyword - "function" .math-symbol - Some kind of symbols, with the following special cases: .math-imaginary-symbol - "i" .math-novalue-symbol - "null" and "uninitialized" .math-operator - Any kind of operator, with the following special cases: .math-unary-operator - Unary operators .math-lefthand-unary-operator - Lefthand unary operators .math-righthand-unary-operator - Righthand unary operators .math-binary-operator - Binary operators .math-implicit-binary-operator - Implicit binary operators .math-explicit-binary-operator - Explicit binary operators .math-separator-operator - ",", ":", ";" and "\n" .math-assignment-operator - "=" .math-accessor-operator - "." .math-range-operator - ":" .math-conditional-operator - "?" and ":" .math-function - Any kind of function names (symbols), with the following special cases: .math-custom-function - User defined function names .math-parenthesis - Any kind of parenthesis, with the following special cases: .math-round-parenthesis - "(" and ")" .math-square-parenthesis - "[" and "]" .math-curly-parenthesis - "{" and "}" Any remarks?
Contributor

### Nekomajin42 commented Jul 21, 2016 • edited

 And here are the pending test results: - should import a factory with name - should import a factory with path - should import a factory without name - should pass the namespace to a factory function - should import an Array - should calculate the modulus of bignumbers for fractions - should calculate the modulus of bignumbers for negative values - should multiply a bignumber and a unit correctly - should multiply a bignumber and a unit without value correctly - should return bignumber unary minus of a boolean - should return bignumber unary plus of a boolean - should return bignumber unary plus on a string - should calculate xgcd for edge cases with negative values - should not cast a single number or boolean to string - should throw an error if called with invalid arguments - should return the quantileSeq of an array of bignumbers with number probability - should throw an error when input array does not have two dimensions - should throw an error when the dimensions of the input array are invalid 
Collaborator

### FSMaxB commented Jul 23, 2016

 Looks good to me so far. The only thing that looks a bit odd to me is that NaN is treated like a string. I think it should be either a number or have its own class. @Nekomajin42 If you have questions or ideas about using or improving the common infrastructure used by conversion functions (toString and toTex) feel free to reach out to me.
Contributor

### Nekomajin42 commented Jul 23, 2016

 @FSMaxB Thanks! Since NaN is not a number, I don't want to treat it as a number. But an own class is a good idea. What do you think about the suggestions regarding the Nodes?
Collaborator

### FSMaxB commented Jul 23, 2016

 Looks good to me. The only thing I'm not sure about is if the separators should really be treated as operators, but I don't have any suggestions either.
Owner

### josdejong commented Jul 25, 2016

 Sounds good @Nekomajin42, your list with classes looks complete. How about splitting operators and separtors into two separate categories? You refer to the sort mentioning of a UnitNode here: #279 (comment). It's not trivial to implement such a UnitNode, it would require some serious changes. Right now units are evaluated as regular variables at evaluation time. At parse time the parser doesn't know whether a symbol is a unit or not. Maybe it makes sense to create a helper function which recognizes whether a Node tree holds a unit (like m/s^2), by traversing the node and its childs and checks whether: it only contains OperatorNodes multiplication/division/power, and each of the operands is a SymbolNode, which name is a unit (Unit.isValuelessUnit(...) == true) the second argument of power OperatorNodes is an integer number (ConstantNode) In that case you could use toString() inside toHTML() and return markup for a unit. Does that make sense?
Contributor

### Nekomajin42 commented Jul 25, 2016

 The operator/separator idea is fine. I will make the changes. And I understand your workaround. I will give it a try. However, the same problem exists for built in constants. Would it be possible to create a list/dictionary before the first parse event from the hard coded constants, units and prefixes? And finally, can you take a look at the pending test results? I use the latest stable branch for developement, so I guess there should be no issues, but I don't understand what is the problem.
Owner

### josdejong commented Jul 27, 2016

 Would it be possible to create a list/dictionary before the first parse event from the hard coded constants, units and prefixes? Yes, I think you can simply check their existence in the math namespace And finally, can you take a look at the pending test results? Sure! Where can I see your tests?
Contributor

### Nekomajin42 commented Jul 27, 2016

 Sure! Where can I see your tests? A few comments above. :) Well, I am not sure what am I doing. But after I made the changes in the Node* files, I ran the npm test command, and the results were a lot of passed tests and 18 pending. You can see those above. This is my first time using Node.js and npm, so I don't know whether I caused them.
Owner

### josdejong commented Jul 27, 2016

 To see what's going wrong I have to get your code on my computer and run the tests here. Are you familiar with git? Normal way is to clone the mathjs project, make your changes in a new branch, and push that to your clone on github - at that point I can simply grab your version of the code from github and run it on my computer.
Contributor

### Nekomajin42 commented Jul 27, 2016

 I am doing it right now.
Contributor

### Nekomajin42 commented Jul 27, 2016

 I finally managed to make it work. You can find it here: https://github.com/Nekomajin42/mathjs
Owner

### josdejong commented Jul 27, 2016

 I've run your clone on my computer, and it runs fine here. Are you maybe using an old version of node.js or something like that? Can you post the exact output (the errors at least) you get when running the tests?
Contributor

### Nekomajin42 commented Jul 27, 2016

 There are no errors, only pending items. You can see them here: #690 (comment) I am using v4.4.7, which is the latest LTS version, according to the official page. Should I update it? Btw, I have problem with comments and function assignments. Both of them give an error: input: "#comment" output: "TypeError: Cannot read property 'toString' of undefined" input: "f(x)=x+x" output: "SyntaxError: Unexpected operator { (char 18)" Is it possible that they are related to the Node.js version?
Owner

### josdejong commented Jul 28, 2016

 Ah, now I get it. These 18 tests are just skipped tests (I thought you had 18 tests failing or something). Most are just unit tests that need to be implemented for better test coverage. You can ignore this. node.js v4 works just fine, that can't be the problem here. How can I reproduce your issue with comments and function assignments? Do you run these expressions in the command line application or something?
Contributor

### Nekomajin42 commented Jul 28, 2016

 OK, I see now. I'm building a web based interface. When the user submits an expression, I call the parser.eval() method to evaluate it. It works fine with simple expressions, like 5+5, or built-in function calls, I can even handle the help() function, but those two always give the same error.
Owner

### josdejong commented Aug 1, 2016

 How can I reproduce this? Can you demonstrate it in a clone of this jsbin for example?
Contributor

### Nekomajin42 commented Aug 1, 2016

 Here is what I do in my script: http://jsbin.com/fucamuquho/1/edit?html,output I get the same errors. The only difference, is that I get the expressions from a form element, but that's also a string, so it's the same.
Owner

### josdejong commented Aug 1, 2016

 argh! jsbin is down
Contributor

### Nekomajin42 commented Aug 1, 2016 • edited by FSMaxB

 Here is my code: var parser = math.parser(); // comment print("#comment"); // function assignment print("f(x)=x^2"); // helper function to output formatted results. function print(expression) { try { var result = parser.eval(expression); document.write(math.parse(result.toString()) + '
'); } catch (e) { document.write(e + '
'); } } The outputs are: TypeError: Cannot read property 'toString' of undefined SyntaxError: Unexpected operator { (char 18) The first one for the comment, the second one for the function assignment.
Owner

### josdejong commented Aug 1, 2016

 The issue is because you call toString() on the result and then try to parse it again. I suppose you want to do: document.write(math.format(result) + '
'); instead of document.write(math.parse(result.toString()) + '
');
Contributor

### Nekomajin42 commented Aug 1, 2016

 No, I want to parse it again, because I have to use the toHTML() on the result too. Parsing the evaluated result works fine with other expressions. Only these to give me trouble.
Owner

### josdejong commented Aug 1, 2016

 In that case I think you don't want to use the parser, but simply do something like this: var expression = 'f(x) = x^2' var node = math.parse(expression) var html = node.toHTML() var scope = {} var result = node.compile().eval(scope) var formattedResult = math.format(result)
Contributor

### Nekomajin42 commented Aug 7, 2016

 I managed to solve the function assignment issue. However, the comment issue still exists. math.parse("#comment") returns ConstantNode {value: "undefined", valueType: "undefined"}, while math.eval("#comment") returns undefined. You can try this in the JSbin you inserted a few comments above. It is strange, because math.parse("comment") returns SymbolNode {name: "comment"}, and math.eval("comment") returns Error: Undefined symbol comment, which is the correct behavior. So I assume, the parser looks for the # character to find comments. But then why does it return a ConstantNode?
Owner

### josdejong commented Aug 8, 2016

 Comments are simply ignored so math.parse('#comment') results in the same as math.parse(''): it is an empty expression which is parsed as a constant undefined. Things that never change, like booleans, numbers, strings, null, and undefined are parsed into a ConstantNode. Variables which value has to be evaluated at evaluation time, like pi, myVariable, and comment, are parsed into a SymbolNode.
Contributor

### Nekomajin42 commented Aug 8, 2016

 OK, I get it now. From the docs: Comments can be added to explain or describe calculations in the text. A comment starts with a sharp sign character #, and ends at the end of the line. A line can contain a comment only, or can contain an expression followed by a comment. I think it is misleading, because it suggests that the parser handles the comments, like a string constant or something. No problem though, I can handle it at my end. Thanks! And I am about to finish the toHTML implementation. I will send a pull request soon.
Owner

### josdejong commented Aug 9, 2016

 OK, looking forward to your PR :)
Contributor

### Nekomajin42 commented Aug 10, 2016

 OK, it is finished, except the comment thing. I've made it work in my calculator interface, but it is messy. After you previous comment, I finally understand how the parser handles comments, and from the perspective of evaluation, it's fine to skip any characters from # to \n. However, the point of the HTML output is to format not just the result, but the input expression too, to create rich interfaces without regexp. And in the current implementation, the text of the comment gets lost, so I can not print it back with the rest of the (formatted) expression. So, can we somehow store the text of the comment in the expression tree?
Collaborator

### FSMaxB commented Aug 11, 2016

 One Suggestion: Add a comment attribute to Node that represents a comment right to the Node. Problem with this approach: It doesn't support comments at the beginning of an expression.
Contributor

### Nekomajin42 commented Aug 11, 2016

 If an expression is empty, it is evaluated as a ConstantNode with undefined value, isn't it? So a comment, which starts at the beginning of the line, is actually a ConstantNode with an undefined value and a comment attribute. It can work.
Collaborator

### FSMaxB commented Aug 11, 2016 • edited

 You're right, this should actually work. I think this would need a global option though in order to turn it off by default for HTML, String and LaTeX.
Owner

### josdejong commented Aug 14, 2016

 Hm, that's nasty. It's not just comments, this will also be an issue with white space, which isn't stored by the expression parser. Like when someone enters 2 or more spaces between two values, or uses a tab. Maybe we can extend the parser such that it can store white space left of a node, and white space and an optional comment right from the node? So we can always reconstruct the exact input from a node tree using toString and toHTML?
Contributor

### Nekomajin42 commented Aug 14, 2016

 I don't think we should store the exact input with all the different white spaces. If you build an interface, you can always leave the input on the screen, or print back the original expression with any kind of modifications. Both toString and toHTML exist to format the input/output, not to store the original expression, as it was typed be the user, because that's easy to achieve without the API. (The HTML output even gives you the opportunity to decide about white spaces via CSS.) My problem with the comments is their content. I can filter them, store them, even format them on my end, but it's very messy to store their position. (Imagine a multi-line expression with comments at the end of a few line, but not all of them!) Inside the comment, I think we should keep everything as is, and don't mind multiple white spaces. But, as @FSMaxB suggested, we can extend the Nodes with a comment property, and store the comment of the line in the comment property of the root Node. I think a few Nodes can not be roots by themselves (index, range, parenthesis?), but for the others, there are the following scenarios: The line is a single expression with a comment at the end. The comment is appended to the root of the expression. The input is a multi-line expression. The comments are appended to the roots inside the BlockNode. The line is a single comment only. The comment is appended to the root, which is a ConstantNode with valueType undefined. The input is a multi-line expression with single comment lines at the beginning, or anywhere in the middle. The solution is the mix of the previous cases. What do you think?
Collaborator

### FSMaxB commented Aug 15, 2016

 I agree that storing all whitespaces might be a bit overkill.
Owner

### josdejong commented Aug 15, 2016

 I'm not really in favor of storing white spaces either, it feels out of scope of the intention of the parser. But if it solves a serious need we can consider it. Keeping a comment property sounds similarly out of scope to me. So if we go that way anyway we can probably solve it in a more general way storing all white spaces information ignored right now by the parser and be able to reproduce the original expression 100% precise instead of 95%. Just thinking aloud. My problem with the comments is their content. I can filter them, store them, even format them on my end, but it's very messy to store their position. Why is this not the case with white spaces in your editor, why don't you have to (re)store them? Isn't that the same problem as keeping a comment intact?
Contributor

### Nekomajin42 commented Aug 15, 2016

 Well, my editor does not print back the original expression. It prints the HTML output of the parsed expression, and white spaces between operands/operators are handled by CSS. I don't want to keep multiple white-spaces at all, but I need to keep the comment strings. However, I see your point, but still don't think white-spaces are as important as comments. The latter adds extra information to the expression, while the other is only there for readability.
Owner

### josdejong commented Aug 17, 2016

 Ah then the white space issue is indeed no problem in your case. A comment indeed holds information so we could store that somehow. That's a good argument. Ok then let's go for that: store comments, ignore whitespaces. How can we concretely store this comment? I can think of two ways: Wrap the expression in a new CommentNode which holds the comment Allow setting properties in a meta data object on every Node, like myNode.setMeta({comment: 'The comment'}) var comment = myNode.getMeta().comment What do you think?
Contributor

### Nekomajin42 commented Aug 17, 2016

 Well, for the sake of the toHTML and toString functions, the CommentNode would be easier, I guess. But what about a third way? The one from @FSMaxB #690 (comment) ? The parser reads the comment, and stores it until the end of the parse operation. At that time, it already knows which Node is the root, so it can append the comment to the root, and only to the root. The toHTML and toString can check whether the .comment property is undefined, and print it or skip it. What do you think?
Collaborator

### FSMaxB commented Aug 18, 2016

 Do we have inline comments, like an equivalent to /* */ in C or are there any plans to do this in the future? If yes it might not be a good idea to just add it to the root node. My idea was more like allowing it to be stored in every kind of node. For example suppose we have inline comments: With 2 * /* comment */ 3 it would be stored in the ConstantNode's leftComment attribute.
Owner

### josdejong commented Aug 18, 2016

 @FSMaxB block comments /* */ are not supported right now though we could add them in the future. What is supported though is multiple lines in one expression like a=2 #foo \n b=3 #bar, so we can actually have comments in nested nodes. @Nekomajin42 yes the setMeta({comment: 'The comment'}) solution is more or less the same as storing a plain property .comment on a node (the idea of Max). I thought it may be a good idea to clearly separate the stuff that's needed for calculations from "meta" stuff (i.e. a comment) by getting/setting it via explicit methods getMeta and setMeta. Just an idea, it may be overkill. .comment will work just as good in practice, but adding this property to the Nodes sometimes (and leaving it undefined in most cases?) feels a bit hacky. Besides the way how we store the comment, we have to decide whether the default behavior for toString, toTex and toHTML will be: output comments too or not? The typical use cases for these methods that I can think of are: display an unambiguous or colored text output including parenthesis to the user like https://qalculate.github.io/ does display the entered expression in a mathematical way with toTex (see this example). Use toString or toHTML in an editor with two way binding like @Nekomajin42 is working on to normalize, sanitize, or colorize expressions. In the first two cases you don't want the comment to be added to the output. In the third case you do want to keep the comments there, but typically you want to maintain white spaces too (though not in particular case of @Nekomajin42 ). It's most flexible if we don't output comments by default and leave it up to the developer to concat the output with a comment if any - the other way around is less convenient for the developer. Or we could create an option for this like we can customize parenthesis in the output. But if we do that I expect that people will also want to be able to (optionally) restore the exact white spacing too, what I pointed out before, so then we still should go all in. Storing all whitespace information I think is about the same amount of code compared to storing a rightComment and/or leftComment in nodes, it doesn't need to be that complicated. Let me give it a bit more thought.

### josdejong added a commit that referenced this issue Aug 23, 2016

 Comments of expressions are are now stored in the parsed nodes, see #690 
 23d1786 
Owner

### josdejong commented Aug 23, 2016

 I've had a look in the code to see what it would take to implement this. Storing comments is very easy, but storing all whitespace characters would be very intrusive (unlike I expected at first). So lets KISS and just add a comment property to all nodes (Max's idea). I've implemented storing comments now see 23d1786 (it's still in a separate branch feature/comments). If you agree this solves our needs I will merge it into develop. As for the behavior of toString, toTex, and toHTML: I think these methods should not output comments by default. We can leave it up to the user to append a comment if any, or introduce a new option for this.
Contributor

### Nekomajin42 commented Aug 24, 2016

 I think the HTML output could contain comments, because it's easy to hide them with CSS. But for the sake of coherence, a global option would be fine.
Owner

### josdejong commented Aug 24, 2016

 Yes indeed for HTML it would be easy to hide if unwanted but I would indeed like to keep the behavior of methods (toString, toTex, toHTML) consistent with each other. If we make such an option it would not be global, but passed as argument to the method, similar to the parenthesis option (see docs).
Owner

### josdejong commented Sep 7, 2016

 FYI: the new comment property is now available in mathjs v3.5.0.
Contributor

### Nekomajin42 commented Sep 8, 2016

 Thanks! I will look into the toHTML() fix and the unit tests for the union operations as soon as I have time. I'm a bit busy at the beginning of the school year.
Owner