-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(index/fetch) Add index URL schemes support to fetch #301
feat(index/fetch) Add index URL schemes support to fetch #301
Conversation
f75c993
to
37f179e
Compare
pkg/index/fetch.go
Outdated
req, err := http.NewRequestWithContext(ctx, "GET", url, http.NoBody) | ||
if err != nil { | ||
return nil, fmt.Errorf("cannot fetch index: %w", err) | ||
type FetchFunc func(context.Context, string) ([]byte, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I set the return signature to []byte
to keep the unmarshall logic in the main fetch function. Incase it is desired to support backends that store indices not as single YAML file we could also move the unmarshal logic into a util function and have the Fetch
functions return an index entry list instead.
b856549
to
d8f36f5
Compare
Thinking out loud here some potential future backends might be using HTTP/S as their protocol language but could require more complex requests (i.e., not only a simple GET request). |
d8f36f5
to
76ad94a
Compare
changed the approach to this |
76ad94a
to
f7b0489
Compare
Context("failure", func() { | ||
When("without URL", func() { | ||
BeforeEach(func() { | ||
args = []string{indexCmd, addCmd, "--config", configFile, indexName} | ||
}) | ||
addAssertFailedBehavior(indexAddUsage, "ERRO: accepts between 2 and 3 arg(s), received 1") | ||
}) | ||
|
||
When("with invalid URL", func() { | ||
BeforeEach(func() { | ||
args = []string{indexCmd, addCmd, "--config", configFile, indexName, "NOTAPROTOCAL://something"} | ||
}) | ||
addAssertFailedBehavior(indexAddUsage, "ERRO: unable to add index: unable to fetch index \"testName\""+ | ||
" with URL \"NOTAPROTOCAL://something\": unable to fetch index: cannot fetch index: Get "+ | ||
"\"notaprotocal://something\": unsupported protocol scheme \"notaprotocal\"") | ||
}) | ||
|
||
When("with invalid backend", func() { | ||
BeforeEach(func() { | ||
args = []string{indexCmd, addCmd, "--config", configFile, indexName, "http://noindex", "notabackend"} | ||
}) | ||
addAssertFailedBehavior(indexAddUsage, "ERRO: unable to add index: unable to fetch index \"testName\" "+ | ||
"with URL \"http://noindex\": unsupported index backend type: notabackend") | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only added failure tests for the new behavior atm, because i noticed that the current implementation of
falcoctl index add
behaves strangely or could be considered buggy i.e.,
If you call index add
for and index name
for which there already exists in the cache
the CMD will always return Index %q successfully added
and do no modification. For example consider the following sequence
$ cat example.yaml
cat: example.yaml: No such file or directory
$ ./falcoctl index add --config example.yaml example https://falcosecurity.github.io/falcoctl/index.yaml
INFO Adding index
INFO Index "example" successfully added
$ cat example.yaml
indexes:
- name: falcosecurity
url: https://falcosecurity.github.io/falcoctl/index.yaml
backend: ""
- name: example
url: https://falcosecurity.github.io/falcoctl/index.yaml
backend: ""
$ ./falcoctl index add --config example.yaml example NOPE://NOPE
INFO Adding index
INFO Index "example" successfully added
$ cat example.yaml
indexes:
- name: falcosecurity
url: https://falcosecurity.github.io/falcoctl/index.yaml
backend: ""
- name: example
url: https://falcosecurity.github.io/falcoctl/index.yaml
backend: ""
One would expect the add command to error that the index already exists or that the index is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @max-frank, the initial idea was to have the add command
return an error only in cases when there is not possible to add an index. I agree with you that the message should be different if the index already exists, but it should not error.
Anyway, feel free to modify the current implementation and extend the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood if its modified behavior I can modify the output in a follow up PR.
I'll try to add more tests relevant to this PR I initial did not because of confusion regarding the above interaction and that I a wall in making the cache and config clean up work correctly in between test runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alacuku For the tests i have run into a slight problem here. For the add tests to be stable and isolated I need to make sure to clean all index config and cache files after each run, but the location of those files is right now not configurable (see code here).
Now since I cant use an isolated tmp config dir it would mean that index add
tests would have side effects on developer machines, i.e., Indices from tests are added to their real local configuration and if we were to do a clean up after a test case we would delete all their index config.
Is it ok if i skip the positive tests here for now until this issue is addressed?
f7b0489
to
a6f6e7a
Compare
222926e
to
d1cca11
Compare
I was playing around with the follow up PR that would add GCS support on top of the changes made here and noticed that we kinda need to pass the index entry config to the fetch function. Since if we dont getting the backend config for an entry becomes tedious. So I updated the PR, but since the fetch logic now references the index config entry struct i had to move it to a separate package to avoid import loops. With that also came some linting required changes e.g., |
b49c468
to
747ca23
Compare
Hi @max-frank ,
Could you elaborate more on the |
@alacuku I was mostly thinking of storage backends that use the same protocol (thus same URI scheme), but work differently e.g., in case of HTTP we can have a simple server/API that expects a simple GET request with maybe basic auth, oauth, or some authorization token. That minus the Auth would be the currently supported logic. Now we can take Azure Blob Storage this can be accessed via HTTP as well, but is a lot more complicated. Looking at the documentation of the REST API it requires certain headers to be set and authorized access uses special Azure specific authorization types some simple like shared keys some less so with dynamic signatures that have to be put in the header. Stacking support for these kinds of things into a single scheme would get complicated quickly I believe. Though I am aware that I am thinking a bit ahead here in terms of considering support for these things. |
It would be great to have them both. If the user does not explicitly set the backend Do you think we can do it in this PR? |
I can add some simple scheme to backend mapping with one default for each scheme. |
856b156
to
3491946
Compare
Added the logic for now its just http and https mapping to the same string, but once there are more backends it will not look as pointless e.g., for Google Cloud Storage Also I kept the behavior to try to use HTTP if we cant resolve to a backend type. |
3491946
to
aefc12d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left two minor comments. It seems ok to me. We are a rebase away from merging this one.
Nice job @max-frank! ❤️
Signed-off-by: Maximilian Frank <mfrank@mercari.com>
This makes it possible to pass complex backend configs to the fetch logic Signed-off-by: Maximilian Frank <mfrank@mercari.com>
Signed-off-by: Maximilian Frank <mfrank@mercari.com>
Signed-off-by: Maximilian Frank <mfrank@mercari.com>
aefc12d
to
b5c5b7e
Compare
Comment updated, dead code removed and rebased to latest main 👍 |
Also I've been working offline on the followup PR to add GCS as backend. That one will be ready for review soon as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/LGTM
LGTM label has been added. Git tree hash: d5cb49af4bb0fbc0df84677eced457e9f54c85b1
|
I would be happy to review it. And again, thank you for your efforts in the falcoctl project. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Thanks also @alacuku for the review!
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alacuku, FedeDP, max-frank The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Adding different fetchers per URL scheme allows falcoctl indices to be hosted on many different platforms.
What type of PR is this?
/kind feature
Any specific area of the project related to this PR?
/area library
What this PR does / why we need it:
To pave way to support index hosting backends other than plain HTTP/S.
Which issue(s) this PR fixes:
First step to implementing #300
Special notes for your reviewer: