semgrep scan --validate fails because semgrep-core reports "No such file or directory" #9845

Open
1 of 3 tasks
sflanker opened this issue Feb 26, 2024 · 1 comment
Labels
bug Something isn't working core priority:medium

Describe the bug

It would appear that semgrep scan --validate is not invoking semgrep-core correctly.

semgrep scan --validate --config="p/owasp-top-ten" --debug

[00.00][DEBUG]: setup_logging: highlight_setting=Std_msg.Auto, highlight=true
Downloading config from https://semgrep.dev/p/owasp-top-ten
Failed to decode JSON: KeyError('rule_config')
Downloaded config from https://semgrep.dev/p/owasp-top-ten
loaded 1 configs in 5.68034553527832
Downloading config from https://semgrep.dev/p/semgrep-rule-lints
Failed to decode JSON: KeyError('rule_config')
Downloaded config from https://semgrep.dev/p/semgrep-rule-lints
loaded 1 configs in 0.3521561622619629
[00.00][INFO](cli, Core_CLI): Executed as: /usr/local/lib/python3.12/site-packages/semgrep/bin/semgrep-core -json -check_rules /tmp/tmpi87nyugx.yaml p/owasp-top-ten
[00.00][INFO](cli, Core_CLI): Version: semgrep-core version: 1.62.0
Exception: Sys_error("p/owasp-top-ten: No such file or directory")
Raised by primitive operation at UFile.Legacy.files_of_dirs_or_files_no_vcs_nofilter.(fun) in file "libs/commons/UFile.ml", line 177, characters 14-33
Called from List_.fast_map in file "libs/commons/List_.ml", line 80, characters 17-20
Called from UFile.Legacy.files_of_dirs_or_files_no_vcs_nofilter in file "libs/commons/UFile.ml", line 175, characters 4-204
Called from UFile.files_of_dirs_or_files_no_vcs_nofilter in file "libs/commons/UFile.ml", line 191, characters 2-74
Called from File_type.files_of_dirs_or_files in file "libs/commons/File_type.ml", line 423, characters 2-52
Called from Check_rule.run_checks in file "src/metachecking/Check_rule.ml", line 239, characters 4-145
Called from Check_rule.check_files in file "src/metachecking/Check_rule.ml", line 286, characters 26-65
Called from Core_CLI.with_exception_trace in file "src/core_cli/Core_CLI.ml", line 743, characters 6-10


Configuration is invalid - found 1 configuration error(s), and 523 rule(s).
[ERROR] Error while running rules:
                    You are seeing this because the engine was killed.

                    The most common reason this happens is because it used too much memory.
                    If your repo is large (~10k files or more), you have three options:
                    1. Increase the amount of memory available to semgrep
                    2. Reduce the number of jobs semgrep runs with via `-j <jobs>`. We
                        recommend using 1 job if you are running out of memory.
                    3. Scan the repo in parts (contact us for help)

                    Otherwise, it is likely that semgrep is hitting the limit on only some
                    files. In this case, you can try to set the limit on the amount of memory
                    semgrep can use on each file with `--max-memory <memory>`. We recommend
                    lowering this to a limit 70% of the available memory. For CI runs with
                    interfile analysis, the default max-memory is 5000MB. Without, the default
                    is unlimited.

                    The last thing you can try if none of these work is to raise the stack
                    limit with `ulimit -s <limit>`.

                    If you have tried all these steps and still are seeing this error, please
                    contact us.

                       Error: semgrep-core exited with unexpected output

Sending pseudonymous metrics since metrics are configured to AUTO and registry usage is True

Maybe I'm misunderstanding the usage of --validate. However, I would like to differentiate between semgrep bugs like #9617, invalid configuration (for example, a bad ruleset name), and actual issues with the files being scanned, so I think it would be reasonable to be able to invoke scan --validate in this way.

To Reproduce

1. Create an empty folder.
2. Initialize a git repo.
3. Add a remote (doesn't have to exist).
4. Run semgrep scan --validate --config="p/owasp-top-ten" --debug
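
The steps above can be scripted for convenience. This is only a minimal sketch, assuming `git` and `semgrep` are both on `PATH`. The helper names `reproduction_commands` and `reproduce` and the placeholder remote URL are hypothetical and not part of semgrep.

```python
import subprocess
import tempfile


def reproduction_commands(remote_url="https://example.invalid/repo.git"):
    """Return the commands from the reproduction steps, in order.

    Hypothetical helper; the remote URL is a placeholder, since the
    remote does not have to exist.
    """
    return [
        ["git", "init"],
        ["git", "remote", "add", "origin", remote_url],
        ["semgrep", "scan", "--validate", "--config=p/owasp-top-ten", "--debug"],
    ]


def reproduce():
    """Run the commands in a fresh empty folder; return the last exit code."""
    with tempfile.TemporaryDirectory() as workdir:
        code = 0
        for argv in reproduction_commands():
            code = subprocess.run(argv, cwd=workdir).returncode
        return code
```

Calling `reproduce()` inside the semgrep/semgrep:1.62.0 image should run the same sequence as the manual steps and return the exit code of the final semgrep invocation.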

Expected behavior

Semgrep downloads and validates the ruleset specified in the --config arg. I would also expect the SEMGREP_RULES environment variable to work.

What is the priority of the bug to you?

  • P0: blocking your adoption of Semgrep or workflow
  • P1: important to fix or quite annoying
  • P2: regular bug that should get fixed

Environment

Docker: semgrep/semgrep:1.62.0

Use case

In a CI context, I want to differentiate between issues running semgrep and issues detected by semgrep.
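
As a rough sketch of that use case, a CI wrapper could validate the configuration first and only then scan. The names `classify` and `run_ci` are hypothetical, and the exit-code interpretation is an assumption: that `--validate` exits 0 only for a usable configuration, and that `scan --error` exits 0 with no findings, 1 with findings, and with some other code when semgrep itself fails.

```python
import subprocess


def classify(validate_code, scan_code=None):
    """Map exit codes to a coarse outcome (hypothetical helper).

    Assumption: a non-zero --validate exit means a configuration or
    tooling problem; for the scan, 0 means clean, 1 means findings
    (with --error), and anything else means semgrep itself failed.
    """
    if validate_code != 0:
        return "config-or-tool-error"
    if scan_code == 0:
        return "clean"
    if scan_code == 1:
        return "findings"
    return "tool-error"


def run_ci(config="p/owasp-top-ten", run=subprocess.run):
    """Validate the config first, then scan only if validation passed."""
    validate = run(["semgrep", "scan", "--validate", f"--config={config}"])
    if validate.returncode != 0:
        return classify(validate.returncode)
    scan = run(["semgrep", "scan", "--error", f"--config={config}"])
    return classify(validate.returncode, scan.returncode)
```

With the behavior reported here, the validation step fails even for a valid registry ruleset, so this wrapper would report "config-or-tool-error" and could not tell a bad ruleset name apart from the semgrep-core failure.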

@ievans ievans added the bug Something isn't working label Feb 27, 2024
ievans commented Feb 27, 2024

cc @aryx osemgrep related?
