New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add replaceSeparatorsForGlob utility #142
base: master
Are you sure you want to change the base?
Conversation
throw new TypeError('expected a string: "' + util.inspect(path) + '"'); | ||
} | ||
|
||
return path.replace(/\\(?![{}()+?.^$])/g, '/'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copied from the linked Jest PR, all credit goes to @aldarund
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to use this regex in the toPosixSlashes utility method in picomatch. However, I think we'll need to remove the .
, since that would cause dotfiles and directories like foo\\.git
to incorrect. I think we also need *
to be added to the regex.
@SimenB I don't know how I missed this. I must have marked it as "read" at some point! My apologies. I think this is resolved by the If |
No worries, v4 is way more important! While I agree |
Agreed, there is an That said, I hate sounding like a broken record, but globs are regular expressions, not paths. There are too many edge cases where removing backslashes like that will cause the regex to be wrong, and/or could make the regular expression unsafe. It's possible to intentionally create malicious regular expressions resulting in DDoS attacks by defining a glob that seems fine until backslashes are removed. Not to mention, filenames may start with several of those characters, and removing The good news is that there is an easier way to solve it. The path that you're prefixing onto the glob needs to be converted to posix slashes before joining it to the glob. You can use one of these methods to join the glob (I'm exposing them on micromatch too), or just always use Picomatch and micromatch convert paths to Think about it this way: if we do what you're suggesting, we're literally potentially changing the intent of the glob defined by the user. Which means that - in some downstream lib like rimraf - when the user accidentally deletes a bunch of files that shouldn't have been matched, it will be 100% our fault. If we do it the way I'm suggesting, then yes, we need to remind people that globs are not file paths and that they need to correctly use forward slashes as path separators, and I know it's a pain (I see all of the issues for it too), but it's safer, and it's correct. |
@SimenB I have some ideas for how to make this easier, but as you mentioned it's pretty difficult to get right. Can you give me some scenarios that are used in jest, so that I can reason through different options for how we can handle this in picomatch or micromatch? Meaning, can you provide some examples of how jest handles paths and globs, and scenarios where problems have occurred? |
Sure! E.g. from jestjs/jest#7814 const rootDir = 'C:\\Program files (x86)\\dir'; // maybe a result of doing `process.cwd()` on win32
const filePath = '<rootDir>/test/{TestCasesNormal,StatsTestCases,ConfigTestCases}.test.js';
const result = path.resolve(
rootDir,
path.normalize('./' + filePath.substr('<rootDir>'.length)),
);
// result should be 'C:/Program files \(x86\)/dir/test/{TestCasesNormal,StatsTestCases,ConfigTestCases}.test.js' I think? escaping the parens |
@SimenB I accidentally closed this. Thanks! That helps, I'll review and come back with thoughts after I play around with some code. I appreciate it! |
@SimenB a couple of questions:
What if we added support for path variables to picomatch? We could allow path variables to be passed on the options. Since the picomatch parser already iterates over the entire string, it would take a trivial amount of processing to do this, and IMHO it might be useful for other projects. Thoughts? edit: on a related note, I coincidentally worked on a library for handling tabstops and variable expansion in snippets (like the ones in TextMate, Sublime Text, VS Code etc), and I had to add logic for recursively resolving path variables. Might be useful for this. |
Yup!
Not of the top of my head. The main difficulties have been how to escape things that look like globs but are not in paths (like parens) and successfully injecting the path replacement without messing up slashes or relative paths Our unit tests are mostly in https://github.com/facebook/jest/blob/master/packages/jest-config/src/__tests__/normalize.test.js. We also have some e2e tests, but I don't think those test any more edge cases
That'd be great! It would really help. We also need to do resolution of relative paths (e.g. resolving |
Got it, thanks for the detail. I might have a solution, I'll come back with some examples! Thanks! |
Description
We've had some issues in Jest with paths using mixed path separators due to both running on windows and some incomplete normalization. So we've added a utility which we use everywhere we use
micromatch
See jestjs/jest#6650 where this regex comes from. (if you have any feedback on that PR, it'd be most welcome btw 馃榾)
Note that we should probably be better at normalization in Jest as well as use
path.posix.join
etc to avoid the backslashes. But it doesn't always make sense (and doesn't necessarily always work) to use forward slashes, especially when doing FS operation, so being able to just pass things through a helper formultimatch
is simpler.Checklist