Improved parsing of Trans component #85

coyotte508 · 2018-04-02T11:29:24Z

Based on https://react.i18next.com/components/trans-component.html

Each HTML tag (with a closing tag or self-closing) and each JS expression is converted to <x>content</x>, where x is the index of the tag/expression in the array of children of the current tag (Trans being the root tag).

The example:

Hello <strong title={t('fourth')}>{{name}}</strong>, you have {{count}} unread message. <Link to="/msgs">Go to messages</Link>.

Becomes:

Hello <1><0>{{name}}</0></1>, you have <3>{{count}}</3> unread message. <5>Go to messages</5>.
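To make the indexing rule concrete, here is a minimal sketch that rebuilds the string above from a hand-written child tree. The node shapes (`type`, `content`, `children`) and the name `elemsToString` are assumptions for illustration, not necessarily what the lexer in this PR uses; the actual parsing step is skipped entirely.

```javascript
// Hypothetical node shapes, for illustration only:
//   { type: 'text', content }   plain text, emitted unchanged
//   { type: 'js', content }     a JS expression such as {{name}}
//   { type: 'tag', children }   an HTML/JSX tag with nested children
const elemsToString = children => children.map((child, index) => {
  switch (child.type) {
    case 'text':
      return child.content;
    case 'js':
      // Expressions are wrapped in their index among their siblings.
      return `<${index}>${child.content}</${index}>`;
    case 'tag':
      // Tags are wrapped in their index; children are numbered from 0 again.
      return `<${index}>${elemsToString(child.children)}</${index}>`;
    default:
      throw new Error(`Unknown node type: ${child.type}`);
  }
}).join('');

// Hand-built tree for the example sentence (attributes are dropped).
const transChildren = [
  { type: 'text', content: 'Hello ' },
  { type: 'tag', children: [{ type: 'js', content: '{{name}}' }] },
  { type: 'text', content: ', you have ' },
  { type: 'js', content: '{{count}}' },
  { type: 'text', content: ' unread message. ' },
  { type: 'tag', children: [{ type: 'text', content: 'Go to messages' }] },
  { type: 'text', content: '.' },
];

const converted = elemsToString(transChildren);
console.log(converted);
```

Text nodes still occupy an index even though they are not wrapped, which is why the output jumps from <1> to <3> to <5>.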

You may notice that I deviated from the usual way of parsing. I use regexps only for parsing quotes; everything else is parsed without them. This avoids some regexp pitfalls. For example, with <a stuff="a>" ....>, a regexp would treat the first > as the end of the tag, even though it shouldn't because that > is inside quotes.
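The pitfall can be reproduced in a few lines. This is only a sketch: `findTagEnd` is a hypothetical helper written for this illustration, not the code from the PR, and it ignores JSX expressions such as {…} inside attributes.

```javascript
const input = '<a stuff="a>" href="#">link</a>';

// Naive regexp: the tag ends at the first '>', even inside quotes.
const naiveTag = /<[^>]*>/.exec(input)[0];

// Quote-aware scan (hypothetical helper): skip any '>' that appears
// between matching single or double quotes.
function findTagEnd(string) {
  let quote = null;
  for (let i = 0; i < string.length; i++) {
    const ch = string[i];
    if (quote) {
      if (ch === quote) quote = null; // closing quote reached
    } else if (ch === '"' || ch === "'") {
      quote = ch; // opening quote
    } else if (ch === '>') {
      return i; // first '>' outside quotes
    }
  }
  return -1; // no unquoted '>' found
}

const scannedTag = input.slice(0, findTagEnd(input) + 1);

console.log(naiveTag);   // <a stuff="a>
console.log(scannedTag); // <a stuff="a>" href="#">
```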

The JS parser could be improved to detect object keys: for example, {{name: 'Albert'}} is currently kept as is, but it should render as {{name}}. That said, it's good practice to set the variables before rendering the JSX, and as long as users of the parser are aware of this, there should be no problems.

karellm · 2018-04-02T14:42:17Z

Thanks for the PR, I will review it later today or tomorrow

coyotte508 · 2018-04-04T10:43:43Z

Alright, let me know if there's anything I can do to make it easier

karellm

I realize that some of my comments are based on an undocumented style guide. At some point I will add a linter to avoid that.

My main concern is really the lack of tests for the lexer's methods. And since it adds a lot of complexity, I would vote for using an existing library.

karellm · 2018-04-06T01:08:41Z

src/lexers/jsx-lexer.js

+   * @param {*} string
+   */
+  parseJsx(string) {
+    if (string.length === 0) {


The code usually uses !string for this kind of check

The problem is that it would not parse '0' correctly in that case, as '0' == false.

Well, !'0' is false and '0'.length === 0 is also false. Only the empty string is falsy (i.e. length === 0).
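A quick check of the two guards on a few sample strings (the sample values are chosen here for illustration):

```javascript
// Compare the `!string` guard with the `string.length === 0` guard.
const samples = ['', '0', 'abc'];

const emptyByNot = samples.map(s => !s);
const emptyByLength = samples.map(s => s.length === 0);

console.log(emptyByNot);    // [ true, false, false ]
console.log(emptyByLength); // [ true, false, false ]
```

For string inputs the two guards agree, so '0' is not treated as empty by either of them.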

karellm · 2018-04-06T01:09:46Z

src/lexers/jsx-lexer.js

+   *
+   * @returns string
+   */
+  eraseTags(string) {


Can you please test these methods in the lexer's tests? The parser test is a high-level check, but individual methods that do as much as these should have tests of their own so we can debug them later on.

I added a test for eraseTags.

karellm · 2018-04-06T01:11:42Z

src/lexers/jsx-lexer.js

+
+    const tag = /[A-Z0-9-]+/i.exec(string)[0].toLowerCase()
+
+    let currentIndex = tag.length+1


Please add spaces around operators: tag.length + 1

karellm · 2018-04-06T01:16:49Z

src/lexers/jsx-lexer.js

+
+    let currentIndex = tag.length+1
+
+    while (currentIndex < string.length) {


Have you considered an HTML parser rather than implementing this yourself? There are a bunch listed here.

If we want code so complex in this package, I would require a lot more tests for edge cases. I really think relying on a library is the right way to go here.

After a quick search, there seems to be a library just for react here based on htmlparser2.

I don't have any issue with using an existing html parser. I'll look into it.

coyotte508 · 2018-04-06T11:12:33Z

@karellm It's done, using acorn-jsx: https://github.com/RReverser/acorn-jsx.

None of the other parsers I found (react-html-parser, html-react-parser, htmlparser2, parse5) could handle JS expressions mixed into HTML.

karellm

Thanks for the quick update. I'm glad that there was a library that worked in the end. I added a couple more comments, but I'm also happy to take over for minor style changes.

karellm · 2018-04-06T14:49:59Z

src/lexers/jsx-lexer.js

+   * @param {string} originalString The original string being parsed
+   */
+  simplify(children, originalString) {
+    for (let i = 0; i < children.length; i ++) {


Is there a reason you don't use forEach here? If not, it would be more consistent with the rest of the code

karellm · 2018-04-06T14:51:06Z

src/lexers/jsx-lexer.js

+   * @returns string
+   */
+  eraseTags(string) {
+    const children = this.simplify(acorn.parse(string, {plugins: {jsx: true}}).body[0].expression.children, string);


Please avoid lines longer than 80 characters. I would suggest doing this in two steps.

karellm · 2018-04-06T14:52:12Z

src/lexers/jsx-lexer.js

+  eraseTags(string) {
+    const children = this.simplify(acorn.parse(string, {plugins: {jsx: true}}).body[0].expression.children, string);
+
+    const elemsToString = children => children.map((child, index) => {


Is there a reason you create a function rather than directly returning the mapped children?

Yes, it calls itself recursively.

karellm · 2018-04-06T14:52:40Z

src/lexers/jsx-lexer.js

+   * @param {*} children An array of elements contained inside an html tag
+   * @param {string} originalString The original string being parsed
+   */
+  simplify(children, originalString) {


Can you name this something a little more idiomatic like parseAcornPayload?

karellm · 2018-04-06T14:54:17Z

src/lexers/jsx-lexer.js

+    // Filter empty text elements. Using string.length instead of !string because
+    // '0' is a valid text element, and '' is not, and !string doesn't make a difference
+    // between the two.
+    children = children.filter(child => !(child.type === 'text' && child.content.length === 0));


This could be written child.type !== 'text' || child.content which is easier to read imo

I'm not sure about your comment either. As I commented above: '0' is truthy, '' is falsy. They are not the same.

Right, '0' == false, yet !'0' == false too... A quirk of js
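The sketch below checks that the two filter predicates keep the same elements, and shows where the confusion comes from. The child objects are hypothetical stand-ins with only the `type` and `content` fields used in the snippet under review.

```javascript
// Hypothetical children: an empty text node, a '0' text node,
// a tag without content, and a regular text node.
const children = [
  { type: 'text', content: '' },
  { type: 'text', content: '0' },
  { type: 'tag' },
  { type: 'text', content: 'Hello' },
];

// Predicate from the PR.
const keptByLength = children.filter(
  child => !(child.type === 'text' && child.content.length === 0)
);

// Predicate suggested in the review.
const keptByTruthiness = children.filter(
  child => child.type !== 'text' || child.content
);

console.log(keptByLength.length, keptByTruthiness.length); // 3 3

// Loose equality coerces both operands to numbers, so '0' == false holds,
// but truthiness does not go through that coercion: '0' is truthy.
console.log('0' == false);  // true
console.log(!'0');          // false
console.log(Boolean('0'));  // true
```

Both predicates drop only the empty text node, assuming text nodes always carry a string `content`.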

coyotte508 · 2018-04-06T16:06:58Z

Thank you for your fast review. I just pushed some changes, don't hesitate to take over if needed.

karellm · 2018-04-07T20:36:28Z

Thanks for the contribution, 1.0.0-beta9 was just released

coyotte508 added 4 commits April 2, 2018 11:20

Improved parsing of Trans component

4302b1a

Properly trim newlines surrounded by spaces in html

4bffe6a

Fix a trimming issue and add test cases

00efb6d

Support for numbers in html tags (h1, h2...)

23cd30d

Fix regexp gulping spaces

a85d51c

coyotte508 mentioned this pull request Apr 4, 2018

Add multi-language capability to blip with react-i18next tidepool-org/blip#468

Closed

romainseb mentioned this pull request Apr 5, 2018

feat(DeleteResource): Add wording & i18n Talend/ui#1243

Merged

4 tasks

karellm requested changes Apr 6, 2018

Switch to using a library for parsing jsx

c69892b

karellm requested changes Apr 6, 2018

Code style fixes

05f7f53

Improve style - replace forEach by map

b18e60a

coyotte508 force-pushed the master branch from a241073 to b18e60a Compare April 6, 2018 21:07

Add jsx-lexer test

4038f6a

karellm merged commit 9ef46f5 into i18next:master Apr 7, 2018
