Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: reduce some false positive matches #23

Merged

Conversation

antongolub
Copy link
Contributor

@antongolub antongolub commented May 25, 2022

  • fix: handle some false positive matches
    * feat: introduce SEMVER_REGEX const
  • test: check what exactly is captured by regex
  • test: verify test cases via semver
  • chore: up deps, apply xo

closes #21

@antongolub
Copy link
Contributor Author

@sindresorhus, rfr.

@sindresorhus
Copy link
Owner

I think it's important to ensure we don't introduce another ReDoS vulnerability with these changes. I have already had a few: https://github.com/sindresorhus/semver-regex/commits/main

https://www.google.com/search?client=safari&rls=en&q=redos+check&ie=UTF-8&oe=UTF-8

@antongolub
Copy link
Contributor Author

antongolub commented May 25, 2022

Reasonable. https://devina.io/redos-checker, the results are almost identical:

Before
image

After
image

@antongolub
Copy link
Contributor Author

@sindresorhus, can you trigger tests again?

index.js Outdated Show resolved Hide resolved
@antongolub
Copy link
Contributor Author

antongolub commented May 26, 2022

Mb add named semverRegex export instead? We'd like to avoid default exports in our codebase.

export function semverRegex() {
	return /(?<=^v?|\sv?)(?:(?:0|[1-9]\d{0,9}?)\.){2}(?:0|[1-9]\d{0,9})(?:-(?:0|[1-9]\d*|[\da-z-]*?[a-z-][\da-z-]*?)){0,100}?(?=$| |\+|\.)(?:(?<=-\S+)(?:\.(?:[\da-z-]*?[a-z-][\da-z-]*|0|[1-9]\d*)){1,100}?)?(?!\.)(?:\+(?:[\da-z-]+(?:\.[\da-z-]+)*){1,100}?(?!\w))?(?!\+)/gi;
}

export default semverRegex;

@antongolub
Copy link
Contributor Author

antongolub commented May 27, 2022

I have to admit, that I have no idea how to implement trully redos-safe checker: we have a fairly wide range of chunk combinations and the semver spec does not impose their length restrictions.

@antongolub
Copy link
Contributor Author

@sindresorhus,

Anyway, is any chance to land this?

@sindresorhus
Copy link
Owner

we have a fairly wide range of chunk combinations and the semver spec does not impose their length restrictions.

But we can restrict them to some realistic values.

@sindresorhus
Copy link
Owner

Anyway, is any chance to land this?

To be honest, I don't know where to start reviewing. The regex is just way too complicated now. Maybe if you could split up the regex into manageable parts. Like the main version component, prerelease, and build.

If you need it right away, I would suggest just doing:

npm install 'antongolub/semver-regex#fix-false-positive-matches'

@antongolub
Copy link
Contributor Author

antongolub commented May 29, 2022

We have dozens packages depending on semver-regex, so I'd like to provide the fix here. The value of OSS is not in forks, but in improving existing tools (IMO)

The regex is just way too complicated now.

Like email regex, url regex, strict iso8601, etc. They are all complicated because the formats they describe are complicated too. I believe someday robots will write regexes (https://github.com/MaLeLabTs/RegexGenerator)

In general, the patch looks optimistic: the old tests work and the new ones also pass. Maybe we just need to extnd redos tests.

If the new regex completely covers the spec, perhaps this will be its last change)

@antongolub
Copy link
Contributor Author

Leaving things as they are seems wrong:
image
This misbehavior has been here for so long time and probably has a bunch of side effects, that the fix will have to be published as 5.0.0.

@antongolub
Copy link
Contributor Author

antongolub commented May 29, 2022

@sindresorhus,

https://devina.io/redos-checker

/(?<=^v?|\sv?)(?:(?:0|[1-9]\d{0,9}?)\.){2}(?:0|[1-9]\d{0,9})(?:-(?:--?|0|[1-9]\d*|\d*[a-z]+\d*)){0,100}(?=$| |\+|\.)(?:(?<=-\S+)(?:\.(?:--?|[\da-z-]*[a-z-]\d*|0|[1-9]\d*)){1,100}?)?(?!\.)(?:\+(?:[\da-z]\.?-?){1,100}?(?!\w))?(?!\+)/gi

image

@antongolub
Copy link
Contributor Author

antongolub commented May 30, 2022

npm install 'antongolub/semver-regex#fix-false-positive-matches'

@sindresorhus, let's publish this as beta, so we could override the current version via yarn resolutions.

@antongolub
Copy link
Contributor Author

The execution time is strongly related to the hardware. I could just up the threshold, but I have no idea where the optimum lies.

@sindresorhus
Copy link
Owner

CI is failing

@antongolub
Copy link
Contributor Author

So close)

Execution time: 20
t.true(difference < 20, `Execution time: ${difference}`);

@antongolub
Copy link
Contributor Author

✖ invalid version does not cause catatrophic backtracking Execution time: 26
t.true(difference < 25, Execution time: ${difference});

@antongolub
Copy link
Contributor Author

antongolub commented Jun 2, 2022

I'll set the timeout to 50.

@antongolub
Copy link
Contributor Author

@sindresorhus, your turn

@sindresorhus
Copy link
Owner

Mb add named semverRegex export instead? We'd like to avoid default exports in our codebase.

I prefer using default export when it makes sense and it makes no sense to export both a default and named export with the same thing.

@sindresorhus sindresorhus merged commit e93d9c8 into sindresorhus:main Jun 5, 2022
@sindresorhus
Copy link
Owner

Thanks :)

@sindresorhus
Copy link
Owner

https://github.com/sindresorhus/semver-regex/releases/tag/v4.0.4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semantic version tests are not being passed
2 participants