New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
White/Black List Filtering and Multiple Registries Refactor #271
Conversation
|
This is obviously pretty big, I'm going to try to follow this up with a test plan and try to point out the stuff that needs review attention. It's a little misleading in +/- because reorg'd a bunch of files into different packages. |
etc/example-config.yaml
Outdated
| @@ -23,3 +29,5 @@ broker: | |||
| launch_apb_on_bind: false | |||
| recovery: true | |||
| output_request: true | |||
| whitelist: "/etc/ansible-service-broker/whitelist.yaml" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eriknelson I think this needs to be moved to the registry here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, missed that one.
pkg/apb/types_test.go
Outdated
| @@ -20,6 +20,7 @@ const SpecImage = "fusor/etherpad-apb" | |||
| const SpecBindable = false | |||
| const SpecAsync = "optional" | |||
| const SpecDescription = "A note taking webapp" | |||
| const SpecRegistryName = "test" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed missed it in the cleanup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
pkg/registries/registry.go
Outdated
| } | ||
|
|
||
| if len(filteredNames) != 0 { | ||
| r.log.Info("Filtered APBs:") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the images might not be APB's?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels very wrong to me for the adapter to return a list of image names for filtering that contain images that are not APBs. I suppose it could happen, and the adapter's LoadSpecs will validate the labels and metadata further, but if GetImageNames is returning non-APBs, something is going awry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it feels that wrong then we need to discuss how to change the dockerhub adapter. Currently, the dockerhub adapter will get all the images in an org. This does not know anything about APB's because we can not check the label at this time. We could do something like if the name does not contain "apb" then do not return it. I am having trouble coming up with another solution that does not make another network request. I also think that this might want to be an issue/card that we prioritize for next sprint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm realizing my initial assumption was wrong, but I still think the set of image names that hits the filter should ideally be only APBs. I originally thought the adapter was pushing the query serverside (GET /?name=*-apb) or something to that effect, but that's actually not how its ever worked. It has always fetched every name under an org, and we filtered on our side based on *-apb. It was always a limitation of their API, from the work we did back in CAP days.
My thoughts on an ideal scenario:
- Adapters can query a service and get back only APBs
- Registry is able to validate this set and confirm that the names are indeed apbs
- Vetted set of names hits filter
- By this point, we're confident we have a set of images we know we care about, and we load the specs. (Possibly validate additional labels like apb.spec version).
Obviously we need to make some concessions because:
- The service a particular adapter is talking to for fetching APB names may or may not support server-side filtering. In Dockerhub's case, it definitely does not.
- Since we're strictly dealing with names, we don't have a lot of options for determining whether or not an image is actually an APB. Until now, it has always been
-apbsuffix.
I'd propose Registry.LoadSpecs to filter any image names that do not match -apb$ before they hit Filter.Run. It's a filter that applies to all registries, so it belongs at the Registry level and not in the adapters. Potential drawback here is possible apbs will be filtered if they don't follow this naming convention, but that has always been the case until now, and to your point, we should put a card on the backlog to think about how we can accomplish this better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will work on this now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 when I ask the adapter for the list of images, I expect the list of APBs. I understand if the adapter needs to get ALL of them because of an api limitation but it should filter it down by name first.
pkg/registries/registry.go
Outdated
|
|
||
| if len(validNames) != 0 { | ||
| r.log.Debugf("Passing APBs:", r.config.Name) | ||
| for _, name := range validNames { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about putting in a goroutine. We could do the whole loop off by itself with little effort and would not block if slice gets big
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want this loop's logs to get interlaced with other logs running concurrently. What do you think about building up a single string with line breaks inside a goroutine and outputting once, so you get the full list displayed atomically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to go through the filter piece. Looking good 👍
| Both are optional file paths that are yaml files containing arrays | ||
| of regexes. Broker will construct the set of eligible APBs from this list. | ||
|
|
||
| If only whitelist is present, strictest mode of operation. Any APB that does *not* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rephrase this.
| If only whitelist is present, strictest mode of operation. Any APB that does *not* | ||
| match on a regex in the whitelist will be filtered. | ||
|
|
||
| If only blacklist is present, all discovered APBs will be eligible *unless* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also rephrase this so it's obvious what blacklist is. Something like If only blacklist.yml exists, ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we make a table?
The lists will have the following behavior:
| present | allowed | blocked |
|---|---|---|
| only whitelist | matches a regex in list | ANY APB that does not match |
| only blacklist | ALL APBs that do not match | APBs that match a regex in list |
| both present | matches regex in whitelist but NOT in blacklist | APBs that match a regex in blacklist |
| If only blacklist is present, all discovered APBs will be eligible *unless* | ||
| they match on a regex found in the blacklist. | ||
|
|
||
| If both are present, same behavior as whitelist, but if a value matches on white |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be specific about what's present and add more detail to this sentence.
| If both are present, same behavior as whitelist, but if a value matches on white | ||
| and black, blacklist will take precedence and block the whitelist match. | ||
|
|
||
| These can be easily configured by cluster operators via config maps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can there be a section on configuration? This gets mentioned out of no where.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not intended to be considered operator documentation, I need to follow this up with that. It was mostly scratch design thoughts prior to writing any code.
| @@ -0,0 +1,80 @@ | |||
| # APB Filter Design Doc | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, this feels inbetween a spec and a guide. Should this doc be purely focused on how to use the filter instead of how it works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is distinctly a "Spec" or a "Design" doc that we started doing this sprint for the major features. The intent is to have a plan that has been vetted and reviewed before implementation. It was a bit awkward because we were still figuring out how we wanted to accomplish this, but its the same type of document as the one @shawn-hurley sent out for multi-reg and @jmrodri sent out for auth. I think we're trying to settle on posting them as PRs against his repo moving forwards, the auth one being a good reference.
I do still need to document white/blacklist usage in the relevant config doc though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm conflicted because I think the spec should ultimately reflect the implementation as was committed. If we go with the PR and specifications directory approach, we can have a history of the conversation. Seeing how it began and how it ultimately ended.
I was planning on updating my auth spec to reflect the implementation. Then leave it be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmrodri @rthallisey so what action should I take based on this? Does this document need to get updated, or is it okay to leave it reflective of design thoughts before the implementation and update separate official usage documentation for operators?
pkg/broker/broker.go
Outdated
| return nil, err | ||
| } | ||
| if err != nil { | ||
| registryErrors = append(registryErrors, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you also want to output the error here? Whatever err is never gets pushed to stdout when r.Fail is false.
pkg/registries/registry.go
Outdated
| return specs, len(imageNames), nil | ||
| } | ||
|
|
||
| // Fail - should this registry and error cause a failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rephrase this comment
pkg/broker/broker.go
Outdated
|
|
||
| return &BootstrapResponse{SpecCount: len(specs), ImageCount: imageCount}, nil | ||
| } | ||
|
|
||
| func addNameAndIDForSpec(specs []*apb.Spec, registryName string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think lint may yell at you for not adding a comment before this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an unexported method, so it wont complain. Do you think there needs to be a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment would be good. Looking at the title, it wasn't clear to me at first what this function did.
4cf6978
to
3c49e24
Compare
| return specs, len(imageNames), nil | ||
| } | ||
|
|
||
| func registryFilterImagesForAPBs(imageNames []string) []string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eriknelson can you make sure this is what you were thinking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shawn-hurley perfect, thanks. We should discuss your thoughts more too, think you're right; there's a better way to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that was a big one :) All in all great job. Some minor nits with typos or document changes.
|
|
||
| `whitelist: "/etc/ansible-service-broker/whitelist.yaml"` | ||
| `blacklist: "/etc/ansible-service-broker/blacklist.yaml"` | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to be more readable:
whitelist: "/etc/ansible-service-broker/whitelist.yaml"
blacklist: "/etc/ansible-service-broker/blacklist.yaml"
| `blacklist: "/etc/ansible-service-broker/blacklist.yaml"` | ||
|
|
||
| Both are optional file paths that are yaml files containing arrays | ||
| of regexes. Broker will construct the set of eligible APBs from this list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both are yaml files containing arrays of regular expressions. Both file paths are optional. Broker will construct the set of eligible APBs from this list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of the problem I mentioned to you. These aren't files anymore, they're arrays in the main configuration file on registries. What should I do now that implementation has changed since spec design?
| If only whitelist is present, strictest mode of operation. Any APB that does *not* | ||
| match on a regex in the whitelist will be filtered. | ||
|
|
||
| If only blacklist is present, all discovered APBs will be eligible *unless* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we make a table?
The lists will have the following behavior:
| present | allowed | blocked |
|---|---|---|
| only whitelist | matches a regex in list | ANY APB that does not match |
| only blacklist | ALL APBs that do not match | APBs that match a regex in list |
| both present | matches regex in whitelist but NOT in blacklist | APBs that match a regex in blacklist |
| on Registries. Can extend LoadSpecs() method to accept a "filter" object, | ||
| so they can easily filter discovered apbs. Registries should filter from | ||
| total list of discovered apb names before loading specs from metadata. Cuts | ||
| down on unecessary work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some typos and formatting :)
NOTE: Broker filtering of APBs should be performed behind LoadSpecs() method on Registries. We can extend LoadSpecs() method to accept a "filter" object, so they can easily filter discovered APBs. Registries should filter from total list of discovered APB names before loading specs from metadata. This will cut down on unnecessary work.
| total list of discovered apb names before loading specs from metadata. Cuts | ||
| down on unecessary work. | ||
|
|
||
| Loudly log strange corner cases like no specs available due to empty but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| "sync" | ||
| ) | ||
|
|
||
| type filterMode uint8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this C? 😆
| return reduceChunks(matchChunksChan) | ||
| } | ||
|
|
||
| func reduceChunks(chunks <-chan []string) <-chan matchSetT { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, feels very CompSci-y
| out := make(chan matchSetT) | ||
| go func() { | ||
| matchSet := make(matchSetT) | ||
| // Read and process chunks as they complete until input chan is cloesd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TYPO: cloesd -> closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| const testBlacklistFile = "blacklist.yaml" | ||
| const testBlacklistOverrideFile = "blacklist_override.yaml" | ||
|
|
||
| func testGetRegexFromFile(file string) []string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually run? I always thought they had to be TestXYZ. I learned something new today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Believe it's a utility function, not a test. Think it deserves a name change?
| Pass string | ||
| Org string | ||
| Type string | ||
| Name string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do the optional fields need to be marked omitempty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are never marshaling registry to JSON so I don't see a need. does anyone else have thoughts?
|
Spoke with @rthallisey and @jmrodri re: |
Adding ability to catch ID collison and to warn user which spec is overwritting. fix build issues when refactoring registry to its own package. Adding registry failure bahavior * if all fail, fail the bootstrap * if 1 fails and is marked on fail_on_error then fail the bootstrap\ this does not continue the bootstrap * Adding method that will take an error and determine if the registry should fail refactor so that the dockerhub and rhcc are using the same image data -> spec. * stop pulling down entire image for dockerhub * using regisitry.dockerhub.com to get the manifest which is the same manifest as RHCC * able to combine image data -> spec. Multiple registries that are using the same manifest EP. * Allows us to generate the spec ID. * attempts to make the service name unique by keeping the registry with it. fixing bug with name not valid.
* Tests are failing and need to write more. * need to make dockerhub FetchSpec faster with goroutines
6539b41
to
dd868e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM. Minor nits
pkg/broker/broker.go
Outdated
|
|
||
| return &BootstrapResponse{SpecCount: len(specs), ImageCount: imageCount}, nil | ||
| } | ||
|
|
||
| func addNameAndIDForSpec(specs []*apb.Spec, registryName string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment would be good. Looking at the title, it wasn't clear to me at first what this function did.
| config := Configuration{URL: u} | ||
| adapter := RHCCAdapter{Config: config, Log: log} | ||
| imageNames, err := adapter.GetImageNames() | ||
| //specs, num, err := reg.LoadSpecs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a left over comment
| func registryFilterImagesForAPBs(imageNames []string) []string { | ||
| newNames := []string{} | ||
| for _, imagesName := range imageNames { | ||
| if strings.HasSuffix(strings.ToLower(imagesName), "-apb") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This filters for the *-apb tag in the name. In the past we're we reading labels for an image? Do we document that '-apb' is required per image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Dockerhub adapter has always pulled the entire population of image names from the API and processed them broker-side in this way (due to a limitation of the API), then loaded the metadata from a separate endpoint and confirmed the label. We had a discussion about this elsewhere on the PR I'll try to link, but this was the conclusion until we can address this directly.
+1 re: documentation, that's a must.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be followed up and documented here https://github.com/fusor/ansible-playbook-bundle imo. @eriknelson @jmrodri @rthallisey thoughts?
| filterModeNone | ||
| ) | ||
|
|
||
| type failedRegexp struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: Make a type.go for all the structs and consts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually am not a huge fan of this, but I could be swayed. @shawn-hurley what's most golang-like, in your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubernetes seems to use the type file. the standard lib does not. a good example is the net/http package. https://golang.org/src/net/http/header.go. Usually, when talking about good style/idioms I side with the standard library.
|
@eriknelson ack from me once my comments are addressed |
1 similar comment
…t#271) * initial attempt at multiple registries. Adding ability to catch ID collison and to warn user which spec is overwritting. fix build issues when refactoring registry to its own package. Adding registry failure bahavior * if all fail, fail the bootstrap * if 1 fails and is marked on fail_on_error then fail the bootstrap\ this does not continue the bootstrap * Adding method that will take an error and determine if the registry should fail refactor so that the dockerhub and rhcc are using the same image data -> spec. * stop pulling down entire image for dockerhub * using regisitry.dockerhub.com to get the manifest which is the same manifest as RHCC * able to combine image data -> spec. Multiple registries that are using the same manifest EP. * Allows us to generate the spec ID. * attempts to make the service name unique by keeping the registry with it. fixing bug with name not valid. * Initial filter commit + tests (openshift#261) * fixing util * first pass at refactor * Tests are failing and need to write more. * need to make dockerhub FetchSpec faster with goroutines * fixing tests and fixing vet, lint issues * Adding test cases for registry updating name for get images * fixing name generation for uniqueness and dns compliant. * fixing test cases * Precompiled filter regex * documentation changes for registry config * Addressing PR comments regarding config and registry * rebase error * Adding registry ability to filter out images that do not end in -apb * addressing comments on typos, and documentation * fixing the build upon rebasing master * Addressing pull request comments.
No description provided.