Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize go call analysis and make it default behavior #513

Closed
another-rex opened this issue Aug 30, 2023 · 7 comments
Closed

Stabilize go call analysis and make it default behavior #513

another-rex opened this issue Aug 30, 2023 · 7 comments
Assignees

Comments

@another-rex
Copy link
Collaborator

Go call analysis should be the default behavior if the go toolchain is detected.

TODO:
Figure out the best way to organize call graph analysis flags to allow disabling/enabling, and experimental-ness on a per language level.

@oliverchang
Copy link
Collaborator

@hogo6002 this is another potential starter project for you. @another-rex WDYT?

@G-Rath
Copy link
Collaborator

G-Rath commented Nov 18, 2023

Could we have the flags take a comma list of langs, with the presence of a flag meaning "all"? e.g.

# this has all stable call analysis being run (also same if `--call-analysis` is passed without a value)
osv-scanner .

# this enables stable call analysis for just rust and java
osv-scanner --call-analysis=rust,java

# this disables all stable call analysis
osv-scanner --no-call-analysis

# this disables stable call analysis for rust, but all others are still run
osv-scanner --no-call-analysis=rust

And then the same again for experimental.

I know it seems a bit verbose but I think logically that should be very straightforward to determine - I think something like:

func updateEnabledCallAnalysis(supportedCallAnalysis *map[string]bool, enabledFlag any, disabledFlag any) {
	langsToEnable := strings.Split(enabledFlag)
	langsToDisable := strings.Split(disabledFlag)

	if enabledFlag == "*" {
		langsToEnable = getMapKeys(supportedCallAnalysis)
	}

	if disabledFlag == "*" {
		langsToDisable = getMapKeys(supportedCallAnalysis)
	}

	for _, s := range langsToEnable {
		if _, ok := supportedCallAnalysis[s]; ok {
			supportedCallAnalysis[s] = true
		}
	}

	for _, s := range langsToDisable {
		if _, ok := supportedCallAnalysis[s]; ok {
			supportedCallAnalysis[s] = false
		}
	}
}

func main() {
	stableCallAnalysisStates := map[string]bool{
		"go":   true,
		"rust": true,
		"java": true,
	}

	experimentalCallAnalysisStates := map[string]bool{
		"ruby": true,
	}


	updateEnabledCallAnalysis(&stableCallAnalysisStates, enabledStableFlag, disabledStableFlag)
	updateEnabledCallAnalysis(&experimentalCallAnalysisStates, enabledExperimentalFlag, disabledExperimentalFlag)
}

That's rough pseudo code - I think really the more fiddly bit is probably handling the actual flag parsing to handle what's typically mapped to a bool; so I've used * here to represent "enable/disable all" given Go doesn't support unions (otherwise it'd just be a simple string | boolean), and I think empty string needs to be reserved for "nothing" otherwise we can't tell between --no-call-analysis being passed vs just not present, but the actual value should be based on how the flag parsing implementation works out.

@another-rex
Copy link
Collaborator Author

That makes sense, and it provides more control to the user than just separating experimental and non experimental languages for call analysis.

I think we can retire the experimental-call-analysis flag and just have the --call-analysis flag, with supported languages being enabled by default, and documenting which languages are still experimental. I'm still debating whether just passing --call-analysis should enable the experimental languages, or force experimental languages to be manually specified.

@hogo6002
Copy link
Contributor

hogo6002 commented Nov 20, 2023

Could we have the flags take a comma list of langs, with the presence of a flag meaning "all"? e.g.

# this has all stable call analysis being run (also same if `--call-analysis` is passed without a value)
osv-scanner .

# this enables stable call analysis for just rust and java
osv-scanner --call-analysis=rust,java

# this disables all stable call analysis
osv-scanner --no-call-analysis

# this disables stable call analysis for rust, but all others are still run
osv-scanner --no-call-analysis=rust

And then the same again for experimental.

I know it seems a bit verbose but I think logically that should be very straightforward to determine - I think something like:

func updateEnabledCallAnalysis(supportedCallAnalysis *map[string]bool, enabledFlag any, disabledFlag any) {
	langsToEnable := strings.Split(enabledFlag)
	langsToDisable := strings.Split(disabledFlag)

	if enabledFlag == "*" {
		langsToEnable = getMapKeys(supportedCallAnalysis)
	}

	if disabledFlag == "*" {
		langsToDisable = getMapKeys(supportedCallAnalysis)
	}

	for _, s := range langsToEnable {
		if _, ok := supportedCallAnalysis[s]; ok {
			supportedCallAnalysis[s] = true
		}
	}

	for _, s := range langsToDisable {
		if _, ok := supportedCallAnalysis[s]; ok {
			supportedCallAnalysis[s] = false
		}
	}
}

func main() {
	stableCallAnalysisStates := map[string]bool{
		"go":   true,
		"rust": true,
		"java": true,
	}

	experimentalCallAnalysisStates := map[string]bool{
		"ruby": true,
	}


	updateEnabledCallAnalysis(&stableCallAnalysisStates, enabledStableFlag, disabledStableFlag)
	updateEnabledCallAnalysis(&experimentalCallAnalysisStates, enabledExperimentalFlag, disabledExperimentalFlag)
}

That's rough pseudo code - I think really the more fiddly bit is probably handling the actual flag parsing to handle what's typically mapped to a bool; so I've used * here to represent "enable/disable all" given Go doesn't support unions (otherwise it'd just be a simple string | boolean), and I think empty string needs to be reserved for "nothing" otherwise we can't tell between --no-call-analysis being passed vs just not present, but the actual value should be based on how the flag parsing implementation works out.

Hi Gareth, I agree with your design that we should set call analysis per language. I made some changes based on my understanding. Feel free to leave any comments.

So far, we only have call analysis for rust and go. Call analysis for go is stable/non-experimental, so we want to make it runs as default.
Can we only create one new map here to record all languages we support and lists their states, like:

var stableCallAnalysisStates = map[string]bool{
	"go":   true,   //non-experimental 
	"rust": false,  // experimental 
}

Then, as you mentioned, we have two flags to represent if call analysis is enabled. But I think non-experimental call analysis should always be running unless we disable it. Also, I don't think we can pass --call-analysis without a value as it is a cli.StringFlag (please let me know if we can handle this), so I made some changes:

# this has all non-experimental call analysis being run, same with `--call-analysis=default` is passed (only go now)
osv-scanner .

# this enables all call analysis (both go and rust)
osv-scanner --call-analysis=all

# this enables all default call analysis and rust (call analysis for go is also enabled as it is by default.)
osv-scanner --call-analysis=rust

# this disables all call analysis
osv-scanner --disable-call-analysis=all

# this disables call analysis for go, but rust still runs
osv-scanner --call-analysis=rust --disable-call-analysis=go


@G-Rath
Copy link
Collaborator

G-Rath commented Nov 20, 2023

Also, I don't think we can pass --call-analysis without a value as it is a cli.StringFlag (please let me know if we can handle this

you should be able to do it using the `GenericFlag` - the trick is you need to implement `IsBoolFlag` which is effectively "does this flag require a value to be provided?"
Index: cmd/osv-scanner/main.go
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/cmd/osv-scanner/main.go b/cmd/osv-scanner/main.go
--- a/cmd/osv-scanner/main.go	(revision 3069934324473370c83c8da418df32881dff9d85)
+++ b/cmd/osv-scanner/main.go	(date 1700505913172)
@@ -22,6 +22,29 @@
 	date   = "n/a"
 )
 
+type callAnalysisGenericFlag struct {
+	defaultValue string
+	s            string
+}
+
+func (g *callAnalysisGenericFlag) Set(value string) error {
+	if value == "false" {
+		g.s = ""
+	} else if value == "" || value == "true" {
+		g.s = g.defaultValue
+	} else {
+		g.s = value
+	}
+
+	return nil
+}
+
+func (g *callAnalysisGenericFlag) IsBoolFlag() bool { return true }
+
+func (g *callAnalysisGenericFlag) String() string {
+	return g.s
+}
+
 func run(args []string, stdout, stderr io.Writer) int {
 	var r reporter.Reporter
 
@@ -41,6 +64,7 @@
 		Writer:    stdout,
 		ErrWriter: stderr,
 		Flags: []cli.Flag{
+			&cli.GenericFlag{Name: "call-analysis", Value: &callAnalysisGenericFlag{"go,rust", "go,rust"}},
 			&cli.StringSliceFlag{
 				Name:      "docker",
 				Aliases:   []string{"D"},
@@ -131,6 +155,9 @@
 		},
 		ArgsUsage: "[directory1 directory2...]",
 		Action: func(context *cli.Context) error {
+			fmt.Printf("call-analysis: %v\n", context.IsSet("call-analysis"))
+			fmt.Println("call-analysis:", context.Generic("call-analysis").(*callAnalysisGenericFlag).String())
+			return nil
 			format := context.String("format")
 
 			if context.Bool("json") {

Result:

osv-scanner on  support-cran [!?] via 🐹 v1.20.7 via  v20.6.1
❯ scripts/build.sh; ./osv-scanner
call-analysis: false
call-analysis: go,rust

osv-scanner on  support-cran [!?] via 🐹 v1.20.7 via  v20.6.1
❯ scripts/build.sh; ./osv-scanner --call-analysis
call-analysis: true
call-analysis: go,rust

osv-scanner on  support-cran [!?] via 🐹 v1.20.7 via  v20.6.1
❯ scripts/build.sh; ./osv-scanner --call-analysis=rust
call-analysis: true
call-analysis: rust

osv-scanner on  support-cran [!?] via 🐹 v1.20.7 via  v20.6.1
❯ scripts/build.sh; ./osv-scanner --call-analysis=false
call-analysis: true
call-analysis:

However thinking about it since the whole point is to have it enabled by default I actually don't think we need this because we should expect people will only be providing --call-analysis when they want to change the value.

I think it's probably worth talking about what the short term plans are for call analysis: do you expect there will be a large number of languages landing in the next 6 months? if not then frankly this is probably not worth much effort right now so I'd just go with having go and rust because it won't be a breaking change to later add groups like stable, all, experimental, etc.

Part of why I say this is that I realised two additional things:

  1. I'd forgotten about array-based flags, which @hogo6002 you correctly pointed out in your last codeblock; however I'm actually wondering if it might still be worth supporting the comma based list because --call-analysis is quite lengthy already so if you combine that with many languages you get a very tiring command to type very quickly
  2. you could use a + and - syntax like dig: -ca=+all,-rust, -ca=+all -ca=-rust, -ca=+stable,+experimental,-rust, etc
    * (I've used -ca here just to keep char length down, but tbh that might be a useful alias anyway)

Both of these really only matter if the plan is to have a number of languages supported in different states, and they should not be breaking to add in future.

I guess in saying that we could also just go with --call-analysis-rust and --call-analysis-go too 🤷

@another-rex
Copy link
Collaborator Author

another-rex commented Nov 21, 2023

However thinking about it since the whole point is to have it enabled by default I actually don't think we need this because we should expect people will only be providing --call-analysis when they want to change the value.

+1

I'm actually wondering if it might still be worth supporting the comma based list because --call-analysis is quite lengthy already so if you combine that with many languages you get a very tiring command to type very quickly

The vast majority of use cases will only be scanning one or 2 languages at a time, so I don't think this is something to be concerned about. So happy to go with the array-based flags.

you could use a + and - syntax like dig: -ca=+all,-rust, -ca=+all -ca=-rust, -ca=+stable,+experimental,-rust, etc

I do quite like the +/- syntax over having an additional flag, but since there's no plans to add a lot of supported languages at the moment, I don't want to over complicate this.

Let's go with this:

--call-analysis cli.StringSlice[]
--no-call-analysis cli.StringSlice[]

each of which can specify the language to enable/disable, and an additional entry all for disabling/enabling everything. And we'll require these to have a value to keep the implementation simple, we can always add GenericFlag in in the future if there's a need for it.

This should result in the behavior you already have @hogo6002, just using StringSlice instead of string so when additional languages are supported you can specify multiple flags like --call-analysis=go --call-analysis=rust, instead of comma separated to keep it simple.

EDIT:

Changed --disable-call-analysis to --no-call-analysis to keep it shorter and more consistent with existing flags (e.g. --no-ignore)

hogo6002 added a commit that referenced this issue Nov 28, 2023
Fix issue #513
- Replace `experimental-call-analysis` with `call-analysis`.
(`--call-analysis=all`, `--call-analysis=rust`)
- Adding a `--no-call-analysis` to disable call analysis.
(`--no-call-analysis=all`, `--no-call-analysis=go`). This overrides
`call-analysis`.
- Delete `call-analysis` from `experimental-config` in result report.
- Call analysis for non-experimental languages (e.g. go) is auto
enabled.
@another-rex
Copy link
Collaborator Author

Fixed in #665 and #682

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants