Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove single ; from exclusions list #428

Merged
merged 1 commit into from
Nov 17, 2020
Merged

Remove single ; from exclusions list #428

merged 1 commit into from
Nov 17, 2020

Conversation

MaxGiting
Copy link
Collaborator

Because of issue #427 I think we shouldn't remove single characters in the exclusions list.

The exclusions were partly to improve performance. We see much greater returns (by using these exclusions) when running over multiple user agents, so on a single request this should make very little to no difference.

@JayBizzle
Copy link
Owner

I guess it throws up another question of whether the regex should be case sensitive? 🤔

@MaxGiting
Copy link
Collaborator Author

Yes it is worth testing case sensitive for sure.

I think it is also worth removing this from exclusions. Removing single characters has a higher likelihood of creating false positives like in the linked issue compared to removing multiple characters. Especially when as seen that single character is used as a delimiter character.

@gplumb
Copy link
Contributor

gplumb commented Nov 16, 2020 via email

@JayBizzle
Copy link
Owner

Care to expand on this thought?

As well as case insensitivity, how about a simple function that tokenizes the input upfront, builds a string in the format that we're expecting and then tests against that after? This would mean that any unusual or unnecessary semicolons, colons or whitespace placements can be normalised without breaking the regex G

On Mon, 16 Nov 2020, 22:03 MaxGiting, @.***> wrote: Yes it is worth testing case sensitive for sure. I think it is also worth removing this from exclusions. Removing single characters has a higher likelihood of creating false positives like in the linked issue compared to removing multiple characters. Especially when as seen that single character is used as a delimiter character. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#428 (comment)>, or unsubscribe <github.com/notifications/unsubscribe-auth/AAFU7ZSL33PU4OHAK2AOZNDSQGOSDANCNFSM4TXX26MQ> .

@JayBizzle
Copy link
Owner

Performance seems unaffected anyway, the test suite still take the same amount of time to run 👍

Because of issue #427 I think we shouldn't remove single characters in the exclusions list.

The exclusions were partly to improve performance. We see much greater returns (by using these exclusions) when running over multiple user agents, so on a single request this should make very little to no difference.

@gplumb
Copy link
Contributor

gplumb commented Nov 23, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants