Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distinguish between "unignored" and "not ignored" (so results can be combined) #31

Closed
billiegoose opened this issue Oct 23, 2017 · 10 comments

Comments

@billiegoose
Copy link

I need to implement hierarchical .gitignore files for my application, and I've run into an issue trying to combine the results of multiple rulesets. My approach has been to walk down the pathname and apply .gitignore files successively.¹ However, with the current API I cannot tell whether a file is merely "not ignored" or is actually "unignored".

Here's a specific scenario:

// In real life this would be wrapped in a loop or recursion,
// I'm just hard-coding variables for demonstration purposes

rootRules = ignore().add(readFileSync('.gitignore', 'utf8')) // contains 'foo.txt'
rootIgnoredA = rules.ignores("/A/foo.txt") // true
rootIgnoredB = rules.ignores("/B/foo.txt") // true

ARules = ignore().add(readFileSync('A/.gitignore', 'utf8')) // is empty
BRules = ignore().add(readFileSync('B/.gitignore', 'utf8')) // contains '!foo.txt'
ignoredA = ARules.ignores("/A/foo.txt") // false
ignoredB = BRules.ignores("/B/foo.txt") // also false

Given the result values of both .gitignore files, there's not enough information to determine the combined result. "/A/foo.txt" should be ignored, but "/B/foo.txt" should NOT be ignored, because it was unignored.

I think the cleanest way to solve this would be to allow specifying an starting value to .ignores. Like:

ignoredA = ARules.ignores("/A/foo.txt", rootIgnoredA) // true
ignoredB = BRules.ignores("/B/foo.txt", rootIgnoredB) // false

But I would also be OK with a new function that returned a ternary value, like true/null/false or "ignored"/"nochange"/"unignored".

¹ and yes I take into account the "It is not possible to re-include a file if a parent directory of that file is excluded." by testing the rules on the parent directory first and breaking out of the loop if they match.
@kaelzhang
Copy link
Owner

kaelzhang commented Oct 23, 2017

No, node-ignore does not support hierarchical .gitignore rules as mentioned in the README.md. If we need to deal with ignore rules from different root directories, multiple node-ignore instances are required.

node-ignore is a gitignore rule parser, but will do nothing about paths calculating, traversing of directories. If you need that, you could do something based upon node-ignore.

@kaelzhang
Copy link
Owner

kaelzhang commented Oct 24, 2017

Besides, the pathname which passed to ig.ignores(pathname) should be path.relative()d to the directory where the corresponding gitignore rule locates.

Suppose that:

/path/to/A/
          |-- .gitignore   # which contains foo.txt
          |-- foo.txt

Then:

ig.ignores('/A/foo.txt')   // WRONG !
ig.ignores('/path/to/a/foo.txt')  // WRONG!
ig.ignores('./foo.txt')  // WRONG !

ig.ignores('foo.txt')  // Right

@billiegoose
Copy link
Author

No, node-ignore does not support hierarchical .gitignore rules as mentioned in the README.md.

I'm sorry I didn't communicate this clearly. I know that node-ignore doesn't support hierarchies. That's why I was writing something that does, using node-ignore as a building block. I hadn't written it at the time (I got stumped you see) so I just used pseudocode in the example.

If we need to deal with ignore rules from different root directories, multiple node-ignore instances are required.

Using multiple instances isn't sufficient. You also need to keep track at each level in the hierarchy whether a file has been ignored by a previous instance. That is why I proposed the feature I did, because otherwise how would you know whether a file that is currently ignored should be un-ignored?

Anyway I did find a workaround. To check if a file that is currently ignored should be un-ignored, you can amend the ruleset by adding an ignore ** rule to the beginning of the file. That essentially sets the default state to true, so the only way it will return false is if it got un-ignored.

For the record here's my code.

async function isIgnored (workdir, pathname) {
  let pairs = [
    {
      gitignore: path.join(workdir, '.gitignore'),
      pathname
    }
  ]
  let pieces = pathname.split('/')
  for (let i = 1; i < pieces.length; i++) {
    let dir = pieces.slice(0, i).join('/')
    let file = pieces.slice(i).join('/')
    pairs.push({
      gitignore: path.join(workdir, dir, '.gitignore'),
      pathname: file
    })
  }

  let ignoredStatus = false
  for (let p of pairs) {
    let file
    try {
      file = await read(p.gitignore, 'utf8')
    } catch (err) {
      if (err.code === 'NOENT') continue
    }
    let ign = ignore().add(file)
    let unign = ignore().add(`**\n${file}`)
    // If the parent directory is excluded, we are done.
    // "It is not possible to re-include a file if a parent directory of that file is excluded. Git doesn’t list excluded directories for performance reasons, so any patterns on contained files have no effect, no matter where they are defined."
    // source: https://git-scm.com/docs/gitignore
    let parentdir = path.dirname(p.pathname)
    if (ign.ignores(parentdir)) return true
    // If the file is currently ignored, test for UNignoring.
    if (ignoredStatus) {
      ignoredStatus = unign.ignores(p.pathname)
    } else {
      ignoredStatus = ign.ignores(p.pathname)
    }
  }
  return ignoredStatus
}

The hack/workaround I mentioned are the lines:

    let ign = ignore().add(file)
    let unign = ignore().add(`**\n${file}`)
...
    if (ignoredStatus) {
      ignoredStatus = unign.ignores(p.pathname)
    } else {
      ignoredStatus = ign.ignores(p.pathname)
    }

@jirutka
Copy link

jirutka commented Aug 8, 2018

There's one bug:

-if (ign.ignores(parentdir)) return true
+if (parentdir !== '.' && ign.ignores(parentdir)) return true

@jirutka
Copy link

jirutka commented Aug 8, 2018

Another implementation of hierarchical paths matcher:

// @flow
import ignore from 'ignore'
import * as path from 'path'
import * as R from 'ramda'

type Options = {
  // Be case-insensitive? Default is true.
  ignorecase?: boolean,
}

const joinPaths = R.apply(path.join)

function splitPath (filename: string) {
  return filename.split(path.sep).filter(s => s !== '')
}

function parsePatterns (patterns: string, opts: Options = {}) {
  const ign = ignore(opts).add(patterns)
  const unign = ignore(opts).add(`**\n${patterns}`)

  return (filename: string, defaultState: boolean = false) =>
    defaultState ? unign.ignores(filename) : ign.ignores(filename)
}

export default (patternsReader: ((dirname: string) => ?string), opts: Options = {}) => {

  const patternsMatcherFromDir = R.memoize((dirname: string) => {
    const patts = patternsReader(dirname)
    if (patts) { return parsePatterns(patts, opts) }
  })

  // (filename: string) => boolean
  return R.pipe(
    splitPath,
    // ex.: ["a", "b", "c"] -> [[".", "a"/b/c"], ["a", "b/c"], ["a/b", "c"]]
    parts => parts.map((_, idx) => {
      return R.splitAt(idx, parts).map(joinPaths)
    }),
    R.reduce((state, [dirname, filename]) => {
      const matches = patternsMatcherFromDir(dirname)
      // If there's no ignore file in dirname, go up.
      if (!matches) {
        return state
      }
      // If the parent directory is excluded, we are done.
      // It is not possible to re-include a file if a parent directory of that
      // file is excluded. - https://git-scm.com/docs/gitignore
      const parentDir = path.dirname(filename)
      if (parentDir !== '.' && matches(parentDir, false)) {
        return R.reduced(true)
      }
      return matches(filename, state)
    }, false),
  )
}

@kaelzhang
Copy link
Owner

kaelzhang commented Aug 9, 2018

@jirutka

There's one bug:

-if (ign.ignores(parentdir)) return true
+if (parentdir !== '.' && ign.ignores(parentdir)) return true

Do NOT pass paths like ./some-dir to ig.ignore() or any other node-ignore methods, see related doc here.

See also previous comments of this issue.

I've updated the readme about this.

@kaelzhang
Copy link
Owner

I'm working on a project ignore-nested which is based on node-ignore.

I'll close this issue after ignore-nested published.

@jirutka
Copy link

jirutka commented Aug 9, 2018

-if (ign.ignores(parentdir)) return true
+if (parentdir !== '.' && ign.ignores(parentdir)) return true

Do NOT pass paths like ./some-dir to ig.ignore() or any other node-ignore methods, see related doc here.

I don't, please read the code properly. path.dirname('some-dir') => '.'.

I'm working on a project ignore-nested which is based on node-ignore.

Great!

kaelzhang added a commit that referenced this issue Aug 13, 2018
kaelzhang added a commit that referenced this issue Aug 13, 2018
kaelzhang added a commit that referenced this issue Aug 13, 2018
@billiegoose
Copy link
Author

Awesome! I'm looking forward to using nested-ignore in isomorphic-git.

@kaelzhang
Copy link
Owner

@wmhilton coming soon 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants