Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve data type analysis for arbitrary values #9320

Merged
merged 7 commits into from
Sep 14, 2022

Conversation

RobinMalfait
Copy link
Member

This PR fixes an issue with incorrect splitting of arbitrary values and therefore improving the data type detection so that we require fewer type hints.

The original RegEx did mostly what we want, the idea is that we wanted to split by a , but one that was not within (). This is useful when you define multiple background colors for example:

<div class="bg-[rgb(0,0,0),rgb(255,255,255)]"></div>

In this case splitting by the regex would result in the proper result:

let result = [
  'rgb(0,0,0)',
  'rgb(255,255,255)'
]

Visually, you can think of it like:

    ┌─[./example.html]
    │
∙ 1 │   <div class="bg-[rgb(0,0,0),rgb(255,255,255)]"></div>
    ·                       ──┬── ┬    ─────┬─────
    ·                         │   │         ╰─────── Guarded by parens
    ·                         │   ╰───────────────── We will split here
    ·                         ╰───────────────────── Guarded by parens
    │
    └─

We properly split by , not inside a (). However, this RegEx fails the moment you have deeply nested , values or an ( is seen at the right of the current position. If you have the following example:

<div class="bg-[rgba(0,0,0,var(--alpha))]"></div>

Visually, this is what's happening:

    ┌─[./example.html]
    │
∙ 1 │   <div class="bg-[rgba(0,0,0,var(--alpha))]"></div>
    ·                         ┬ ┬ ┬
    ·                         ╰─┴─┴── We accidentally split here
    │
    └─

This is because on the right of the ,, the first paren is an opening paren ( instead of a closing one ).

I'm not 100% sure how we can improve the RegEx to handle that case as well, instead I wrote a small splitBy function that allows you to split the string by a character (just like you could do before) but ignores the ones inside the given exceptions. This keeps track of a stack to know whether we are within parens or not.

Visually, the fix looks like this:

    ┌─[./example.html]
    │
∙ 1 │   <div class="bg-[rgba(0,0,0,var(--alpha)),rgb(255,255,255,var(--alpha))]"></div>
    ·                         ┬ ┬ ┬             ┬       ┬   ┬   ┬
    ·                         │ │ │             │       ╰───┴───┴── Guarded by parens
    ·                         │ │ │             ╰────────────────── We will split here
    ·                         ╰─┴─┴──────────────────────────────── Guarded by parens
    │
    └─

RobinMalfait and others added 7 commits September 13, 2022 21:29
The original RegEx did mostly what we want, the idea is that we wanted
to split by a `,` but one that was not within `()`. This is useful when
you define multiple background colors for example:
```html
<div class="bg-[rgb(0,0,0),rgb(255,255,255)]"></div>
```

In this case splitting by the regex would result in the proper result:
```js
let result = [
  'rgb(0,0,0)',
  'rgb(255,255,255)'
]
```

Visually, you can think of it like:
```
    ┌─[./example.html]
    │
∙ 1 │   <div class="bg-[rgb(0,0,0),rgb(255,255,255)]"></div>
    ·                       ──┬── ┬    ─────┬─────
    ·                         │   │         ╰─────── Guarded by parens
    ·                         │   ╰───────────────── We will split here
    ·                         ╰───────────────────── Guarded by parens
    │
    └─
```

We properly split by `,` not inside a `()`. However, this RegEx fails
the moment you have deeply nested RegEx values.

Visually, this is what's happening:
```
    ┌─[./example.html]
    │
∙ 1 │   <div class="bg-[rgba(0,0,0,var(--alpha))]"></div>
    ·                         ┬ ┬ ┬
    ·                         ╰─┴─┴── We accidentally split here
    │
    └─
```
This is because on the right of the `,`, the first paren is an opening
paren `(` instead of a closing one `)`.

I'm not 100% sure how we can improve the RegEx to handle that case as
well, instead I wrote a small `splitBy` function that allows you to
split the string by a character (just like you could do before) but
ignores the ones inside the given exceptions. This keeps track of a
stack to know whether we are within parens or not.

Visually, the fix looks like this:
```
    ┌─[./example.html]
    │
∙ 1 │   <div class="bg-[rgba(0,0,0,var(--alpha)),rgb(255,255,255,var(--alpha))]"></div>
    ·                         ┬ ┬ ┬             ┬       ┬   ┬   ┬
    ·                         │ │ │             │       ╰───┴───┴── Guarded by parens
    ·                         │ │ │             ╰────────────────── We will split here
    ·                         ╰─┴─┴──────────────────────────────── Guarded by parens
    │
    └─
```
However, the faster version can't handle separators with multiple
characters right now. So instead of using buggy code or only using the
"slower" code, we've added a fast path where we use the faster code
wherever we can.
@RobinMalfait RobinMalfait merged commit 527031d into master Sep 14, 2022
@RobinMalfait RobinMalfait deleted the fix/data-type-improvements branch September 14, 2022 12:08
@RobinMalfait RobinMalfait changed the title Improve data type analyses for arbitrary values Improve data type analysis for arbitrary values Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants