How to find folders that are not inside another folder #335

Closed
wclr opened this issue May 1, 2017 · 12 comments

@wclr

wclr commented May 1, 2017

The task is to find all node_modules directories in the project tree that are not inside other node_modules directories.

I tried the following patterns, without success:

  • ('**/node_modules/', {ignore: '**/node_modules/**/node_modules/'})

  • ('**/node_modules/', {ignore: '**/node_modules/**'})

Any advice on this?

@vjpr

vjpr commented Oct 31, 2017

('**/node_modules', {ignore: '**/node_modules/**/*'})

The ignore doesn't prevent it from searching inside node_modules dirs though. You can check this by passing in cache and reading it afterwards.

@vjpr

vjpr commented Oct 31, 2017

This is also discussed here: #213 (comment)

@vjpr

vjpr commented Oct 31, 2017

@isaacs After a lot of testing, it seems there is no way to achieve this.

('**/node_modules', {ignore: '**/node_modules/**/*'}) - returns node_modules folders (what we want), but ends up searching every file in the tree because the ignore pattern doesn't end in **.

('**/node_modules', {ignore: '**/node_modules/**'}) - doesn't return any node_modules dirs, but also doesn't search inside them (what we want).

I feel like the addition of an option to not match the parent dir of a globstar expression would help solve this.

Use case: Excluding node_modules folders from indexing in a text editor when using a mono-repo or a repo with many projects inside it.

@vjpr

vjpr commented Oct 31, 2017

('**/node_modules', {ignore: '**/node_modules/*/**'}) comes a bit closer.

Will search all node_modules and their direct children, but not further nested node_modules.

But if you have hundreds of packages in your node_modules across multiple projects, you are going to see a huge slowdown.

@wclr
Author

wclr commented Nov 1, 2017

Yes, I see that glob is not actually the right solution for this case. I also had some other problems with glob returning strange results (at least on an alpine-node container), so I use it with caution =)

@vjpr

vjpr commented Nov 1, 2017

@whitecolor It's a shame, though, because you can get very, very close to a simple, elegant glob solution, but then this limitation makes it infeasible and you have to write your own.

I ended up writing my own recursive directory walker. But I am trying to integrate with chokidar for file watching, which also uses globbing, so it's a pain that globs don't work.

In both glob and chokidar, there should be two ignore options, one for searching, and one for returning results.
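
For anyone landing here, a minimal sketch of what such a walker with two separate ignore hooks could look like, using only Node's built-in fs and path modules. The names walkDirs, skipResult, skipChildren and findTopLevelNodeModules are made up for illustration and are not part of glob or chokidar. Symlinked directories are not followed, because a Dirent for a symlink does not report itself as a directory.

```javascript
const fs = require('node:fs');
const path = require('node:path');

/**
 * Walk `root` and collect directory paths (relative, '/'-separated).
 * Two independent hooks (hypothetical names, for illustration only):
 *  - skipResult(name, relPath):   leave the directory out of the results
 *  - skipChildren(name, relPath): do not read the directory's contents
 */
function walkDirs(root, { skipResult = () => false, skipChildren = () => false } = {}) {
  const results = [];
  const visit = (relDir) => {
    const entries = fs.readdirSync(path.join(root, relDir), { withFileTypes: true });
    for (const entry of entries) {
      if (!entry.isDirectory()) continue; // files and symlinks are skipped
      const rel = path.posix.join(relDir, entry.name);
      if (!skipResult(entry.name, rel)) results.push(rel);
      if (!skipChildren(entry.name, rel)) visit(rel);
    }
  };
  visit('');
  return results;
}

// Report only node_modules directories, and never read inside one.
function findTopLevelNodeModules(root) {
  return walkDirs(root, {
    skipResult: (name) => name !== 'node_modules',
    skipChildren: (name) => name === 'node_modules',
  });
}
```

Calling findTopLevelNodeModules(process.cwd()) returns paths relative to the starting directory. Because the traversal hook prunes each node_modules before it is read, the cost does not grow with the number of installed packages.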

@jasonkuhrt

@NMinhNguyen

This did the job for us mrmlnc/fast-glob#how-to-exclude-directory-from-reading

Could you specify exactly which pattern you used? I tried globby(['**/node_modules', '!**/node_modules']) and globby(['**/node_modules', '!**/node_modules/**']), but to no avail 😕

@jasonkuhrt

We used the ignore option: https://github.com/mrmlnc/fast-glob#ignore.

@NMinhNguyen

We used ignore option mrmlnc/fast-glob#ignore.

What syntax did you use? '**/node_modules', { ignore: '**/node_modules', onlyFiles: false } and '**/node_modules', { ignore: '**/node_modules/**', onlyFiles: false } both return an empty array for me.

@isaacs
Owner

isaacs commented Feb 28, 2023

This works in v9, but yes, it does read the node_modules directory unnecessarily.

> require('./').globSync('**/node_modules', {ignore:'**/node_modules/*/**'})
[
  'node_modules',
  'old/node_modules',
  'old/8/node_modules',
  'old/7/node_modules',
  'bench-working-dir/node_modules'
]

I think it would be a good feature addition in v9 to allow you to pass in ignore and ignoreChildren functions explicitly as the ignore option. It already supports receiving an Ignore object; I could just loosen the type to {ignore:(Path)=>boolean,ignoreChildren:(Path)=>boolean} instead, and it'd Just Work.

Then you could do something like:

globSync('**/node_modules', {
  ignore: {
    ignore: (p) => false,
    ignoreChildren: (p) => p.name === 'node_modules',
  },
})

The challenge here is that the pattern **/node_modules/** matches .../node_modules because the /** can match zero or more path portions. So you need to ignore **/node_modules/*/** (with the extra star), but then it doesn't realize that all children of node_modules will be ignored, and does a readdir unnecessarily.

Alternatively (or in addition to this), the Ignore logic should be smart enough to realize that .../*/** on an ignore pattern means "ignore the children, but not the thing itself", and save the extra useless readdir.
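
To make the "zero or more path portions" point concrete, here is a toy segment matcher. It is not glob's or minimatch's actual implementation. It only understands literal segments, * (exactly one segment) and ** (zero or more segments), and the name matchSegments is made up for this illustration.

```javascript
// Toy matcher: literal segments, `*` = exactly one segment,
// `**` = zero or more segments. Illustration only, not glob's real logic.
function matchSegments(pattern, target) {
  const pat = pattern.split('/');
  const segs = target.split('/');
  const match = (pi, si) => {
    // Pattern exhausted: succeed only if the path is exhausted too.
    if (pi === pat.length) return si === segs.length;
    if (pat[pi] === '**') {
      // Let the globstar consume 0, 1, 2, ... remaining segments.
      for (let k = si; k <= segs.length; k++) {
        if (match(pi + 1, k)) return true;
      }
      return false;
    }
    // Any other pattern segment needs one path segment to consume.
    if (si === segs.length) return false;
    return (pat[pi] === '*' || pat[pi] === segs[si]) && match(pi + 1, si + 1);
  };
  return match(0, 0);
}

// The trailing /** can consume zero segments, so the directory itself matches:
console.log(matchSegments('**/node_modules/**', 'old/node_modules'));     // true
// The extra * demands at least one child segment, so the directory is spared:
console.log(matchSegments('**/node_modules/*/**', 'old/node_modules'));   // false
console.log(matchSegments('**/node_modules/*/**', 'old/node_modules/x')); // true
```

This is why the ignore pattern with the extra star keeps the node_modules entries in the results while still excluding everything beneath them.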

@isaacs isaacs closed this as completed in cdfde4b Mar 2, 2023
@isaacs
Owner

isaacs commented Mar 2, 2023

Custom ignores will be in the 9.2 release.
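
A sketch of how the original question might be solved with custom ignores, assuming glob 9.2 or later is installed. As far as I know, the released methods are named ignored and childrenIgnored rather than the ignore/ignoreChildren names floated earlier in this thread, so check the README of your installed version. The names topLevelOnly and globTopLevelNodeModules are made up for this example.

```javascript
// Assumption: glob >= 9.2, where the ignore option may be an object with
// `ignored` and `childrenIgnored` methods. Verify against your version.
const topLevelOnly = {
  // Never drop a match from the results.
  ignored: () => false,
  // Do not read the contents of any directory named node_modules.
  childrenIgnored: (p) => p.name === 'node_modules',
};

function globTopLevelNodeModules(cwd) {
  // Required lazily so the dependency is only needed when this is called.
  const { globSync } = require('glob');
  return globSync('**/node_modules', { cwd, ignore: topLevelOnly });
}
```

With this, the nested node_modules directories should never be read at all, which avoids the unnecessary readdir mentioned above.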
