-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[GitHub] Add a GraphQL client to the connector #3837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
b3fcb92
to
c2bc286
Compare
37ec6bc
to
c067504
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty cool! It looks like these clients are all unauthenticated - out of curiosity, do you foresee them ever being authenticated?
pkg/sources/github/connector.go
Outdated
default: | ||
return nil, fmt.Errorf("unknown connection type") | ||
} | ||
} | ||
|
||
func createAPIClient(ctx context.Context, httpClient *http.Client, apiEndpoint string) (*github.Client, error) { | ||
ctx.Logger().WithName("github").V(2). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this logger name be added way higher than the call stack, at entry to the overall github
code? I presume you didn't do that because github.com/trufflesecurity/trufflehog/v3/pkg/context
doesn't support WithName
, but I don't think kludging around that by adding contextual information in the "wrong" spot like this is a good solution overall because we've seen it contribute to bifurcation of our loggers (where a chunk of code is using two loggers at once, each with different contextual information), which in turn has caused concrete debugging problems.
To be clear, I'm not worried that this specific code change will cause an issue - I'm worried that in nine months somebody else is going to copy the pattern somewhere else without thinking too hard about it and create a maintainability problem. If you want to add this piece of context, I think it's safer in the long term to add it at the entry to the github
code either using WithValue
or with a new WithName
that you add to our logging package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the current implementation doesn't make it easy to have a named logger for a package. You either have to call .WithName()
repeatedly or pass a logger
separately -- neither are ideal.
I think it's safer in the long term to add ... a new WithName that you add to our logging package.
I can try to tackle that in a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WithName
functionality can't be added to the logger Context
. Calls to WithName
are appended to the logger, meaning that a context passed between packages could become trufflehog.github.git.detector.etc
.
The best approach is probably a simple helper function in the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is what you're going for, but a logger in a context can be modified via:
ctx = context.WithLogger(ctx, ctx.Logger().WithName("github"))
Not the most ergonomic, but imo not completely foreign (might be just me since I wrote trufflehog/pkg/context
and trufflehog/pkg/log
though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is what you're going for, but a logger in a context can be modified via:
ctx = context.WithLogger(ctx, ctx.Logger().WithName("github"))Not the most ergonomic, but imo not completely foreign (might be just me since I wrote
trufflehog/pkg/context
andtrufflehog/pkg/log
though).
I think this would still have the issue of 'unbounded' logger names:
Calls to
WithName
are appended to the logger, meaning that a context passed between packages could becometrufflehog.github.git.detector.etc
.
There's no way to unset or overwrite the logger name that I'm aware of.
It should inherit authentication from the HTTP client's transport: trufflehog/pkg/sources/github/connector_token.go Lines 34 to 39 in c067504
|
🤦 no idea how i missed that |
d0bd942
to
6c7f34e
Compare
b3ede0c
to
bec5710
Compare
4fd5aa6
to
9c1c695
Compare
9c1c695
to
fa55481
Compare
64467bb
to
0d733d9
Compare
6627b2c
to
7e7fcdb
Compare
7e7fcdb
to
7208c39
Compare
7208c39
to
97aa98c
Compare
Hey @rosecodym, do you have any other concerns with this PR? If not, can it be merged? |
b6d92ed
to
21fbed3
Compare
b8ececf
to
72a02fb
Compare
Hi @rgmz, could you please resolve the conflicts when you have a moment? We're planning an enhancement to the GitHub source, and this looks like a great starting point for that work. |
@rgmz Sorry, I totally missed your re-review request! My sincere apologies. I'll do it later today. |
if err != nil { | ||
return nil, fmt.Errorf("could not create API client: %w", err) | ||
} | ||
|
||
graphqlClient, err := createGraphqlClient(ctx, httpClient, apiEndpoint) | ||
if err != nil { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We wrap the API client creation error; we should either wrap this one as well, or wrap neither.
if err != nil { | ||
return nil, fmt.Errorf("could not create API client: %w", err) | ||
} | ||
|
||
graphqlClient, err := createGraphqlClient(ctx, httpClient, apiEndpoint) | ||
if err != nil { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same error-wrapping comment as above
|
||
graphqlClient, err := createGraphqlClient(ctx, httpClient, apiEndpoint) | ||
if err != nil { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same error-wrapping comment as above
parsedURL.Path = before + "/api/graphql" | ||
graphqlEndpoint = parsedURL.String() | ||
} | ||
getLogger(ctx).V(2).Info("Creating GraphQL client", "url", graphqlEndpoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember the original reason you used a named logger, but I don't see anything in the current iteration of the PR that requires it. Am I missing something or can you replace this with normal ctx.Logger()
stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember the original reason you used a named logger, but I don't see anything in the current iteration of the PR that requires it. Am I missing something or can you replace this with normal ctx.Logger() stuff?
Seems like good practice to use a named logger, no? Especially with #4172 coming soon, otherwise logs won't have source attribution (unless I'm missing something obvious here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only other places we use logger names in the codebase is in a handful of detectors and at the very top of the program where we set the logger name to trufflehog
. I don't think it's necessarily a bad pattern, but if we're going to introduce it I'd like to do so with a plan - having our sources log things using an inconsistent set of techniques has historically caused problems when people expect things to work one way but they work a different way. (The new usage here also suggests a caller should persist the generated logger - even though this particular PR immediately throws it away - which has also historically caused logging information to be dropped.)
The fact that this source's logging is coming from a GitHub source should already be indicated by keys that are attached to the injected context higher up the call stack.
Description:
This introduces a latent GraphQL client without making any functional changes. The GraphQL API is required for certain features (e.g., #1906), and it is already being manually called in at least once place:
trufflehog/pkg/sources/github_experimental/object_discovery.go
Lines 429 to 432 in 6d1c59f
shurcooL/githubv4
is the library recommended by the authors ofgoogle/go-github
.Checklist:
make test-community
)?make lint
this requires golangci-lint)?