Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix up vendor regex for bootstrap #6410

Merged

Conversation

tclem
Copy link
Contributor

@tclem tclem commented May 12, 2023

The current regex for detecting if bootstrap is vendored is too broad and overly complicated (uses lookahead). I believe the intent is only to match bootstrap files (not entire directories), but I've written a few more test to demonstrate what should and should not be detected as vendored here.

The current regex looks like this:
(^|/)bootstrap([^/.]*)(?=\.).*\.(js|css|less|scss|styl)$

As written, this would be considered vendored:

  • bootstrap-1.4/misc/other/reset.css

But this would not:

  • bootstrap/misc/other/reset.css

I don't believe that's correct or intended. This vendor rule should target bootstrap artifacts specifically (not entire directories). Some history in #5745 and #5957.

The proposal is to only match on the filename (and allow filenames that include .).

Checklist:

  • I am adding new or changing current functionality
    • I have added or updated the tests for the new or changed functionality.

Copy link
Contributor

@jorendorff jorendorff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regexp looks accurate.

I don't know if matching across slashes is doing useful work for us though. bootstrap*/*.css is a thing that exists out there, I found some in Blackbird that I guess slipped past the current regex. css under a "bootstrap" dir; some other files

@lildude lildude requested a review from Alhadis May 12, 2023 19:58
@lildude
Copy link
Member

lildude commented May 12, 2023

Pinging you for review @Alhadis as our resident regex guru 🙇‍♂️ (no rush)

Copy link
Collaborator

@Alhadis Alhadis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe that's correct or intended. This vendor rule should target bootstrap artefacts specifically (not entire directories).

IIRC, popular Bootstrap components like styled form components are often bundled alongside the core Bootstrap files themselves (so, stuff like bootstrap/components/datepicker/**/*.{css,js}).

I could easily be wrong, however, as I've never actually used Bootstrap before (or anything resembling it). Ergo, I trust your familiarity with Bootstrap over mine.

@tclem
Copy link
Contributor Author

tclem commented May 12, 2023

That's true! Looking at the links that @jorendorff posted though, there is a real mix of original js/css content nested inside directories named bootstrap (example) and twbs source that's just copied/vendored into another repo (example).

I ended up proposing this change in part because the lookahead in this regex isn't very portable to linguist clones that are using different regex engines (and it's the only vendor regex that uses that lookahead). In investigating and writing some tests it then also seemed like the current regex was making some arbitrary choices as documented in the main PR description. I'm open to some further discussion here on how to best detect what is actually vendored twbs vs. what is code in a subdirectory of bootstrap. Any other ideas?

@Alhadis
Copy link
Collaborator

Alhadis commented May 12, 2023

the lookahead in this regex isn't very portable to Linguist clones that are using different regex engines

In that case, I'd replace (?=\.).* with (?:(?:\.).*)?. It effectively achieves the same effect.

I'm open to some further discussion here on how to best detect what is actually vendored twbs vs. what is code in a subdirectory of bootstrap. Any other ideas?

Unfortunately, no. It's impossible to distinguish vendored components from original (or heavily-modified) ones. There's really not much we can do, other than encourage authors to manually unvendor files on a case-by-case basis.

@tclem
Copy link
Contributor Author

tclem commented May 12, 2023

In that case, I'd replace (?=\.).* with (?:(?:\.).*)?. It effectively achieves the same effect.

We came up with replacing (?=\.).* with (\..*)?\. which I believe is the same as well. How does that look?

@Alhadis
Copy link
Collaborator

Alhadis commented May 12, 2023

Ugh, did I really write (?:\.)? 😞 (I need sleep…) Obviously that should've been (?:\..*)?, which is functionally equivalent to your solution, save for the use of a non-capturing group to reduce memory footprint (?:…) instead of (…).

How does that look?

I think it's fine. 👍 I was about to suggest you use (?:…) instead of (…), but since most of the patterns defined in vendor.yml use ordinary capturing groups, it's not worth the inconsistency.

@tclem
Copy link
Contributor Author

tclem commented May 18, 2023

@lildude and @Alhadis, I don't have write permissions so I'm going to let you all press the merge button here. This is good to go now, right?

Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tclem yup, this looks good to me. I tend to merge PRs only close to when I make a release to keep changes to a minimum in the event there is a need for a quick patch fix release. I’m aiming for a new release early June once everything has settled down on GitHub.

@lildude lildude requested a review from a team as a code owner May 30, 2023 09:12
@lildude lildude added this pull request to the merge queue May 30, 2023
Merged via the queue into github-linguist:master with commit dc6fd3c May 30, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants