Skip to content

Conversation

egregius313
Copy link
Contributor

@egregius313 egregius313 commented Aug 20, 2024

This introduces documentation for the Models-as-Data library for Go.

@egregius313 egregius313 requested review from owen-mc and a team August 20, 2024 04:36
Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. We should also have a section on version-matching. (Note that if the package does not contain a major version suffix (like "/v2") then we will match all major versions. This can be disabled by putting fixed-version: at the start of the package path.)

pack: codeql/go-all
extensible: sourceModel
data:
- ["net", "", False, "Listen", "(string,string)", "", "ReturnValue", "remote", "manual"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused because this isn't a model that we currently have. (In particular it has "(string, string)" as the signature.) Is this in a PR you have open? Or are you in the middle of adapting an example from another language?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in the middle of adapting the C# docs. There are a few things which still need re-writing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example that had been used in the C# example was the API for opening a socket. As I understand it, Listen is roughly the equivalent in Go, and would make sense as a remote flow source.

The wording from the C# documentation mentioned it being from a existing model, because sockets were already modeled in C#.

I am fine opening a small PR to add Listen as a remote flow source if we think that makes sense. Alternatively, I can reword this documentation to (1) remove the wording about being an existing model or (2) change it to a different API if you have a suggestion for a better function to highlight.

Copy link
Contributor

@owen-mc owen-mc Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. I recommend you talk about this model instead, as Go is used a lot for dealing with http requests.

- ["net/http", "Request", True, "FormValue", "", "", "ReturnValue", "remote", "manual"]

Unless you think that is too close to the later example for using the field Request.Body as a source in the same package. In that case I guess you could use one of the file sources that you merged yesterday, like os.ReadFile.

Package grouping
~~~~~~~~~~~~~~~~

Since Go uses URLs for package identifiers, it is possible for packages to be imported with different paths. For example, the ``glog`` package can be imported using both the ``github.com/golang/glog`` and ``gopkg.in/glog`` paths.
Copy link
Contributor

@owen-mc owen-mc Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before go modules, this was much more true. Now the go.mod file specifies the URL that this module expects to be found at, which creates a canonical import path for a module (and hence the packages it contains). The reasons why we want to group models for several different import paths are:

  • forks (e.g. klog was forked from glog by kubernetes);
  • when a package moves, we still want to analyse code written for the old import path.

We don't have to explain this all, but I mention it so you have it in mind when writing the sentence above.

egregius313 and others added 3 commits August 20, 2024 17:00
Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
egregius313 and others added 2 commits August 20, 2024 17:04
Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
Go currently does not use `neutralModel`s and they are less relevant for Go than for Java/C#.
Go does not currently have any equivalent with regards to lambda flow
@egregius313 egregius313 marked this pull request as ready for review August 20, 2024 21:17
@egregius313 egregius313 requested a review from owen-mc August 20, 2024 21:17
@owen-mc
Copy link
Contributor

owen-mc commented Aug 20, 2024

I'll give this a proper review tomorrow.

Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
@egregius313
Copy link
Contributor Author

@owen-mc I have changed the one example to use Request::FormValue instead of Listen.

Please let me know if you think this needs any further changes or if you think we're ready for docs review.

@egregius313 egregius313 requested a review from owen-mc August 21, 2024 22:39
owen-mc
owen-mc previously approved these changes Aug 22, 2024
Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote up the last section I think is missing.

Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
@egregius313 egregius313 added the ready-for-doc-review This PR requires and is ready for review from the GitHub docs team. label Aug 22, 2024
@egregius313 egregius313 requested a review from a team August 22, 2024 14:01
subatoi
subatoi previously approved these changes Aug 23, 2024
Copy link
Contributor

@subatoi subatoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Words look good—I don't think there'll be any issues with rendering but feel free to open a follow-up and tag us again if you need any help. Thank you!

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be merged until we fix some problems that were recently found (not in this PR). Giving it a Request Changes review to avoid us accidentally merging it.

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pre-requisites have now been merged.

The sixth value should be left empty and is out of scope for this documentation.
The remaining values are used to define the ``access path``, the ``kind``, and the ``provenance`` (origin) of the sink.

- The seventh value ``Argument[0]`` is the ``access path`` to the first argument passed to the method, which means that this is the location of the sink.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to give some example with a field or element access, to make clear that it can be more than just an arg spec or similar.


- The seventh value is the access path to the input (where data flows from). ``Argument[0]`` is the access path to the first argument (``elems`` in the example) and ``Argument[1]`` is the access path to the second argument (``sep`` in the example).
- The eighth value ``ReturnValue`` is the access path to the output (where data flows to), in this case ``ReturnValue``, which means that the input flows to the return value.
- The ninth value ``taint`` is the kind of the flow. ``taint`` means that taint is propagated through the call.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that this could be value and the difference between them?

Copy link
Contributor

@owen-mc owen-mc Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The equivalent page for java has a value flow example (through a lambda). Maybe we should come up with one that includes value flow and a different access path. Maybe something that stores a value in a field, or reads one out? Or something that stores a value in an array, or reads one out?

Copy link
Contributor Author

@egregius313 egregius313 Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would examples relating to the slices package make sense? Something like slices.MaxFunc is comparable to the Java example (Stream#map) in terms of:

  1. Value flow element -> func
  2. Value flow element -> return value

Though that brings up some other questions:

  • How does MaD-for-Go handle generics?
  • Are elements of slices handled by .ArrayElement? (Based on how append is modeled I think so, but want to check)

Another option is slices.Concat, but that has some annoyances because of varargs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking something like:

for func MaxFunc[S ~[]E, E any](x S, cmp func (a, b E) int) E

- ["slices", "", True, "MaxFunc", "", "", "Argument[0].ArrayElement", "Argument[1].Parameter[0]", "value", "manual"] 
- ["slices", "", True, "MaxFunc", "", "", "Argument[0].ArrayElement", "Argument[1].Parameter[1]", "value", "manual"] 
- ["slices", "", True, "MaxFunc", "", "", "Argument[0].ArrayElement", "ReturnValue", "value", "manual"] 

Copy link
Contributor

@owen-mc owen-mc Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Go currently ignores specializations and just treats them as if they're the generic type. I've never come across a situation where this has been an issue.
  • Yes, we treat slices just like arrays. They're quite interchangeable in Go. We don't currently use .Element anywhere in Go models (except some tests to make sure it would work).
  • The go dataflow library doesn't currently deal with lambdas, so I don't think the first two of those models would work. slices.Concat might be good - yes, the variadic argument makes it more complicated, but it also gives you an opportunity to explain how to model variadic arguments, which is currently missing. (Not that these docs have to be exhaustive.)

Copy link
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked through it more carefully now. I think the description of the subtypes column needs to be updated. And the value of the subtypes column in one case. And I think we should mention the ReturnValue[i] syntax.

Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
egregius313 and others added 5 commits November 24, 2024 21:24
Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
@owen-mc
Copy link
Contributor

owen-mc commented Nov 26, 2024

I found it a bit odd that the docs would mention models that we don't have yet, so I made a PR to model slices: #18108.

Co-authored-by: Owen Mansel-Chan <62447351+owen-mc@users.noreply.github.com>
@egregius313 egregius313 merged commit 1b224c1 into github:main Nov 28, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation ready-for-doc-review This PR requires and is ready for review from the GitHub docs team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants