x2cpg: extracted file exclusion by regex and file path #2741

Merged 2 commits on May 19, 2023

@max-leuthaeuser max-leuthaeuser commented May 19, 2023

This now lives in `x2cpg` and may be used by all frontends. `jssrc2cpg` and `c2cpg` already make use of it.

To all frontend maintainers:
If your frontend should support file exclusion, please use `def determine(inputPath: String, sourceFileExtensions: Set[String], config: X2CpgConfig[_]): List[String]` from `x2cpg.SourceFiles` to retrieve the list of files with your extension(s) under the input path. `X2CpgConfig` provides three fields for the filtering process:

- `ignoredFilesRegex`: a regex that the user may supply via `--exclude-regex`. Filters out files or folders (the path relative to the input directory is matched against the regex).
- `ignoredFiles`: a list of files that the user may supply via `--exclude`. Filters out files or folders (given as paths relative to the input directory or as absolute paths).
- `defaultIgnoredFilesRegex`: a list of regexes that the frontend maintainer may provide. Filters out files or folders by default, e.g. unwanted test folders (the path relative to the input directory is matched against each regex).

`--exclude-regex` and `--exclude` are now available for all frontends that use `X2Cpg.parseCommandLine`.
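
To make the three filters concrete, here is a minimal, self-contained sketch of how they could combine. It is not the actual `SourceFiles` implementation: `FilterConfig`, `FileFilterSketch`, `isIgnored` and `filter` are hypothetical names, and it assumes that a regex must match the whole relative path and that an `ignoredFiles` entry naming a folder excludes everything below it.

```scala
import java.nio.file.Paths
import scala.util.matching.Regex

// Hypothetical stand-in for the three X2CpgConfig fields described above;
// this is NOT the real X2CpgConfig class.
final case class FilterConfig(
  inputPath: String,
  ignoredFilesRegex: Option[Regex] = None,
  ignoredFiles: Seq[String] = Seq.empty,
  defaultIgnoredFilesRegex: Seq[Regex] = Seq.empty
)

object FileFilterSketch {

  // True if any of the three filters excludes the given file.
  def isIgnored(file: String, config: FilterConfig): Boolean = {
    val root     = Paths.get(config.inputPath).toAbsolutePath.normalize
    val path     = Paths.get(file).toAbsolutePath.normalize
    val relative = root.relativize(path).toString

    // Assumption: both regex filters must match the full relative path.
    val byDefault = config.defaultIgnoredFilesRegex.exists(_.matches(relative))
    val byRegex   = config.ignoredFilesRegex.exists(_.matches(relative))

    // Entries may be relative to the input directory or absolute;
    // resolve() returns an absolute entry unchanged.
    val byPath = config.ignoredFiles.exists { entry =>
      path.startsWith(root.resolve(entry).normalize)
    }

    byDefault || byRegex || byPath
  }

  // Keeps only the files that no filter excludes.
  def filter(files: List[String], config: FilterConfig): List[String] =
    files.filterNot(isIgnored(_, config))
}
```

With an input directory of `/repo`, a user regex `.*\.min\.js`, an exclude list of `vendor` and `/repo/gen/a.js`, and a default regex `test/.*`, only `/repo/src/a.js` would survive out of `src/a.js`, `src/a.min.js`, `vendor/lib/b.js`, `gen/a.js` and `test/t.js`.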

johannescoetzee commented May 19, 2023

> the absolute file path is matched against the regex

Why is the absolute path used for the regex methods? This could lead to surprising (at least to me) results. For example, with a directory structure like /tmp/test/<test_repo>/.., scanning <test_repo> and adding an exclude for the test directory would lead to an empty CPG.

Using the relative (to the project root) path when filtering would avoid this issue, while still giving users control over excluding files outside of the project dir by not providing those as input files.
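
The scenario can be reproduced with plain path and regex handling. The sketch below is illustrative only: the repository name `my_repo`, the file names, and the regex `.*test.*` are assumptions, and it does not call any x2cpg code.

```scala
import java.nio.file.Paths

object AbsoluteVsRelativeSketch {
  // Hypothetical layout: the scanned repository is checked out under /tmp/test.
  val inputDir = Paths.get("/tmp/test/my_repo")
  val files = List(
    "/tmp/test/my_repo/src/Main.java",
    "/tmp/test/my_repo/test/MainTest.java"
  ).map(Paths.get(_))

  // A user-supplied regex intended to drop the project's own test directory.
  val excludeTests = ".*test.*".r

  // Matching the absolute path: the "/tmp/test" prefix makes every file match,
  // so nothing is left to analyse.
  val keptWithAbsolute = files.filterNot(f => excludeTests.matches(f.toString))

  // Matching the path relative to the input directory: only test/MainTest.java matches.
  val keptWithRelative =
    files.filterNot(f => excludeTests.matches(inputDir.relativize(f).toString))
}
```

Here `keptWithAbsolute` is empty, while `keptWithRelative` still contains `src/Main.java`.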

@max-leuthaeuser

Yes, you are right. I will change this.

@johannescoetzee johannescoetzee left a comment

Looks good overall! This is very useful for source-based frontends in general :-)

@max-leuthaeuser max-leuthaeuser merged commit 152842f into master May 19, 2023
5 checks passed
@max-leuthaeuser max-leuthaeuser deleted the max/fileFilteringInx2Cpg branch May 19, 2023 14:36
ursachec added a commit that referenced this pull request May 22, 2023
ursachec added a commit that referenced this pull request May 22, 2023
johannescoetzee pushed a commit that referenced this pull request Jun 2, 2023
* x2cpg: extracted file exclusion by regex and file path

This lives in x2cpg now and may be used by all frontends.
jssrc2cpg and c2cpg already made use of it.

To all frontend maintainers:
If your frontend should support file exclusion please use
`def determine(inputPath: String, sourceFileExtensions: Set[String], config: X2CpgConfig[_]): List[String]` from `x2cpg.SourceFiles`
to retrieve a file list for your extension(s) in the input path.
`X2CpgConfig` provides three fields for the filter process:
- `ignoredFilesRegex`: a regex that the user may supply via `--exclude-regex`. Filters out files or folders (the absolute file path is matched against the regex)
- `ignoredFiles`: a list of files that the user may supply via `--exclude`. Filters out files or folders (paths relative to <input-dir> as well as absolute paths)
- `defaultIgnoredFilesRegex`: a list of regex that the frontend maintainer may provide. Filters out files or folders (the absolute file path is matched against the regex) by default (e.g., unwanted test folders).

`--exclude-regex` and `--exclude` are now available for all frontends making use of `X2Cpg.parseCommandLine`.

* Fix for review comment.