-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/ip block reservation #12
Feature/ip block reservation #12
Conversation
/ok-to-test |
pkg/util/ip_reservation.go
Outdated
} | ||
|
||
// Increment the given IP value by the provided step | ||
func incrementIP(ip net.IP, step byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of modifying the ip in-place, it would be better to create and return a new one
pkg/csi_driver/controller.go
Outdated
@@ -107,6 +120,7 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
return nil, status.Error(codes.Internal, err.Error()) | |||
} | |||
} | |||
s.config.mutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to use defer s.config.mutex.Unlock()
after Lock()
so you don't miss unlocking when you return early from an error
pkg/csi_driver/controller.go
Outdated
@@ -81,7 +85,16 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
if err := s.config.driver.validateVolumeCapabilities(req.GetVolumeCapabilities()); err != nil { | |||
return nil, status.Error(codes.InvalidArgument, err.Error()) | |||
} | |||
s.config.mutex.Lock() | |||
instances, err := s.config.fileService.ListInstances(ctx, "-") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid holding locks across calls that could potentially block for a long time. I'm not sure if we need to lock in this case since each CreateVolume is doing its own ListInstances call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also move this call to just before CreateInstance
to avoid unnecessary cloud provider calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To read the updated list of reserved IPs, we have to wait until this call succeeds to avoid read after write hazard, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are thinking about the scenario on how to coordinate multiple CreateVolume requests and make them select different IPs.
One way to do it is to lock the entire CreateVolume call so only one volume create can occur at a time. This will work, but I think it will be undesirable because every cloud call could take on the order of minutes. We could try this way just for now to get it working, but we'll definitely need to improve on it.
In the end, we want to allow parallel CreateVolume operations to occur, but need to serialize IP allocation. I think this will require some sort of caching/shared allocator.
pkg/csi_driver/controller.go
Outdated
paramTier = "tier" | ||
paramLocation = "location" | ||
paramNetwork = "network" | ||
paramReservedIPV4CIDR = "cidr" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change the actual user-visible parameter to "reserved-ipv4-cidr"
pkg/util/ip_reservation.go
Outdated
buffer := bytes.NewBufferString("") | ||
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, 8) { | ||
overLap := false | ||
buffer.WriteString(cidrIPBlock.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can wait to convert to a string until the very end when a non-conflicting IP has been found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on that note, we don't need to use a buffer at all. After finding a nonoverlapping cidr we can just construct the necessary string in a one line concatenation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overlap validation function needs two cidrs as arguments. This concatenation was made to convert the IP in CIDR to /29 cidr. Maybe I am missing something here but not sure how we can convert IP to CIDR without concatenation
pkg/util/ip_reservation.go
Outdated
} | ||
|
||
if ipnet1.Contains(ipnet2.IP) || ipnet2.Contains(ipnet1.IP) { | ||
return fmt.Errorf("The cidr ranges %s and %s overlap", cidr1, cidr2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not really an error. This function is more of a isCIDROverlap
and should return a bool along with an error. Errors should be reserved for "problems" not a false value
pkg/util/ip_reservation.go
Outdated
buffer := bytes.NewBufferString("") | ||
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, 8) { | ||
overLap := false | ||
buffer.WriteString(cidrIPBlock.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on that note, we don't need to use a buffer at all. After finding a nonoverlapping cidr we can just construct the necessary string in a one line concatenation.
pkg/util/ip_reservation.go
Outdated
return "", fmt.Errorf("Unable to parse CIDR %s", cidr) | ||
} | ||
buffer := bytes.NewBufferString("") | ||
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, 8) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the number 8
in this case? Maybe we should make a const for it or a comment here to specify what it is. I noticed in the function definition its of type byte
, this is somewhat unusual and is worth clarifying so there are no gotchas in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8 is used to get to the next /29 address. I will add a constant for this.
pkg/util/ip_reservation.go
Outdated
for reservedIPRange := range reservedIPRanges { | ||
err = validateCIDROverlap(buffer.String(), reservedIPRange) | ||
if err != nil { | ||
overLap = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because validateCIDROverlap
returns an error for both the overlap case and a malformed cidr, this statement is not always correct. It could just be returning an error because the CIDR is malformed.
pkg/util/ip_reservation.go
Outdated
|
||
// Increment the given IP value by the provided step | ||
func incrementIP(ip net.IP, step byte) error { | ||
if uint8(255)-uint8(ip[len(ip)-1]) < uint8(step) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a valid increment?
for example 0.0.0.255
incremented by 2
isn't an overflow but is actually 0.0.1.2
? I may be misunderstanding whats going on here. Would be helpful to put some comments explaining what these pieces are, like what is the last array entry in ip
, why 255
(use a const) etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will come up with a more concrete check for this.
pkg/util/ip_reservation.go
Outdated
return fmt.Errorf("IP overflow occured when incrementing ip %s with step %d", ip.String(), step) | ||
} | ||
for j := len(ip) - 1; j >= 0; j-- { | ||
ip[j] += step |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we incrementing each part of the IP by step then breaking
if its over 0? This will always break
after the first iteration unless the step
is 0 and the original ip
end was also 0. So in practice this will only increment the last part of the ip because if step is non-zero the break
will trigger.
Could you please clarify with comments or fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to handle carry when rollover happens. I'll update this to catch rollover with a custom step size.
pkg/util/ip_reservation_test.go
Outdated
) | ||
|
||
func TestAllIPBlocksAvailable(t *testing.T) { | ||
ips := [8]string{"192.168.92.32/29", "192.168.92.40/29", "192.168.92.48/29", "192.168.92.56/29"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these reserved ip's or unreserved ips?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All /29 IPs in a /27 CIDR
pkg/util/ip_reservation_test.go
Outdated
} | ||
} | ||
|
||
func TestSomeIPBlocksAvailable(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these test functions are prime for Table Driven Tests. They are testing same functions with different data:
https://golang.org/wiki/TableDrivenTests
pkg/util/ip_reservation_test.go
Outdated
|
||
} | ||
|
||
func TestIncrementIP(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Table driven tests. This one is not comprehensive. Try adding some edge cases and funky input. Error cases are also good
pkg/util/ip_reservation_test.go
Outdated
|
||
func TestCloneIP(t *testing.T) { | ||
originalIP := net.ParseIP("192.168.92.32") | ||
cloneIP := cloneIP(originalIP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should also test that these are not actually the same object
pkg/csi_driver/controller.go
Outdated
@@ -90,6 +95,18 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
return nil, status.Error(codes.InvalidArgument, err.Error()) | |||
} | |||
|
|||
reservedIPV4CIDR := req.GetParameters()[paramReservedIPV4CIDR] // Get the reserved-IPV4-cidr from the request. It was not added in the service instance as it is only consumed by the controller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: can you place the all the comments as a line above the code instead of at the end of the line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if cidr range is not set? Maybe generateNewFileInstance should return the cidr range from parameters as another argument
pkg/csi_driver/controller.go
Outdated
return nil, err | ||
} | ||
|
||
reservedIPRange, err := s.getUnreservedIPBlock(ctx, newFiler, reservedIPV4CIDR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be done after the GetInstance() call. And we should not unreserve the IP after every failure. The reason is a bit tricky.
In Kubernetes, the timeout value that gets passed into CreateVolume() is very short. This causes the create instance operation on the first time to always return with a timeout error. However, the create operation is still running, therefore the IP is still taken. So we should only allocate a new IP if we're making a new CreateInstance call.
For the failure case, we may have to add some extra checking, and only release the IP if we cannot find an instance.
pkg/csi_driver/controller.go
Outdated
return nil, status.Error(codes.Internal, err.Error()) | ||
} | ||
} | ||
return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil | ||
} | ||
|
||
// getUnreservedIPBlock returns the available IP in the cidr. It is synchronized in order to avoid data race while getting unreserved IP range in a CIDR | ||
func (s *controllerServer) getUnreservedIPBlock(ctx context.Context, filer *file.ServiceInstance, cidr string) (string, error) { | ||
s.config.mutex.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid holding global locks over network calls as much as possible because they can block for a very long time.
pkg/csi_driver/controller.go
Outdated
@@ -222,7 +262,8 @@ func (s *controllerServer) generateNewFileInstance(name string, capBytes int64, | |||
location = v | |||
case paramNetwork: | |||
network = v | |||
// Unused | |||
case paramReservedIPV4CIDR: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment that reserved cidrs will be processed later
pkg/csi_driver/controller_test.go
Outdated
@@ -74,6 +75,9 @@ func TestCreateVolume(t *testing.T) { | |||
}, | |||
}, | |||
}, | |||
Parameters: map[string]string{ | |||
"reserved-ipv4-cidr": "192.168.92.32/26", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reserved cidr should not be a required argument, so this should be a new test case
pkg/util/ip_reservation.go
Outdated
// ValidateCIDR function validates whether a particular cidr is a valid IP range | ||
func (ipAllocator *IPAllocator) ValidateCIDR(cidr string) error { | ||
if len(cidr) == 0 { | ||
return fmt.Errorf("Passed Reserved IPV4 CIDR parameter evaluated to be empty") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the ParseCIDR function already take care of this case?
pkg/util/ip_reservation.go
Outdated
|
||
// IPAllocator struct maintains the current list of reserved /29 IPs in the system. Read access to this struct is synchronized | ||
type IPAllocator struct { | ||
reserved map[string]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment describing the format of this map, along with an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think storing the ipnet type instead of the string representation will be more efficient. Especially later on in GetUnreservedIPBlock
, you can avoid converting between ipnet and string.
pkg/util/ip_reservation.go
Outdated
} | ||
|
||
// NewIPAllocator is the constructor to initialize the IPAllocator object | ||
func NewIPAllocator(reserved map[string]bool) *IPAllocator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment describing what this argument means?
pkg/util/ip_reservation_test.go
Outdated
cidr1 := "192.168.92.32/27" | ||
cidr2 := "192.168.92.48/26" | ||
|
||
overlap := isOverlap(cidr1, cidr2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests can be table driven too
pkg/util/ip_reservation_test.go
Outdated
currentIP := "192.168.92.32" | ||
nextIP := "192.168.92.40" | ||
ip := net.ParseIP(currentIP) | ||
err := incrementIP(ip, step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests can be table driven too
pkg/csi_driver/controller.go
Outdated
reservedIPV4CIDR, reservedIPV4CIDRPresent := req.GetParameters()[paramReservedIPV4CIDR] | ||
|
||
if !reservedIPV4CIDRPresent { | ||
return nil, fmt.Errorf("The required parameter reserved-ipv4-cidr was not provided") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a required parameter. If it is not specified, then we leave the ReservedIPRange field blank and let Filestore choose an available IP
pkg/csi_driver/controller.go
Outdated
// Get the reserved-IPV4-cidr from the request. It was not added in the service instance as it is only consumed by the controller | ||
reservedIPV4CIDR, reservedIPV4CIDRPresent := req.GetParameters()[paramReservedIPV4CIDR] | ||
if !reservedIPV4CIDRPresent { | ||
return nil, fmt.Errorf("The required parameter reserved-ipv4-cidr was not provided") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a required parameter. If it was not specified, then we leave ReservedIpRange empty
pkg/csi_driver/controller.go
Outdated
if err != nil { | ||
return "", err | ||
} | ||
s.config.ipAllocator.Reserve(unreservedIPBlock) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Reserve
call should happen inside GetUnreservedIPBlock
pkg/csi_driver/controller.go
Outdated
return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil | ||
} | ||
|
||
// getUnreservedIPBlock returns the available IP in the cidr. It is synchronized in order to avoid data race while getting unreserved IP range in a CIDR | ||
func (s *controllerServer) getUnreservedIPBlock(ctx context.Context, filer *file.ServiceInstance, cidr string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better name could be 'reserveIPRangeto match the eventual argument, and to avoid confusion with the
GetUnreservedIPBlock` call in the IPAllocator
pkg/csi_driver/controller.go
Outdated
@@ -222,7 +272,10 @@ func (s *controllerServer) generateNewFileInstance(name string, capBytes int64, | |||
location = v | |||
case paramNetwork: | |||
network = v | |||
// Unused | |||
|
|||
//Ignore the cidr flag as it is not passed to the cloud provider. It is only consumed by the controller to get an unreserved IP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mention in the comment that this argument will be processed later in reserveIPRange
pkg/util/ip_reservation.go
Outdated
// IPAllocator struct maintains the current list of reserved /29 IPs pending a write operation in the cloud service. An IP Range is freed up from this list when it is available at the cloud provider end | ||
// As the create instance operation is in parallel, it is possible that an IP range was reserved by the service instance but is not updated in the cloud instance. The reserved list stores such IP ranges | ||
type IPAllocator struct { | ||
reserved map[string]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's take a step back and whiteboard this design. I don't think just a single map is sufficient for synchronizing creation and deletion.
pkg/csi_driver/controller.go
Outdated
@@ -101,15 +106,61 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
return nil, status.Error(codes.AlreadyExists, err.Error()) | |||
} | |||
} else { | |||
|
|||
// Get the reserved-IPV4-cidr from the request. It was not added in the service instance as it is only consumed by the controller | |||
reservedIPV4CIDR, reservedIPV4CIDRPresent := req.GetParameters()[paramReservedIPV4CIDR] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just use ok
here instead of reservedIPV4CIDRPresent
.
https://github.com/golang/go/wiki/CodeReviewComments#variable-names
pkg/csi_driver/controller.go
Outdated
} | ||
|
||
// Prevents lock acquisition in case the CIDR is invalid | ||
err = s.config.ipAllocator.ValidateCIDR(reservedIPV4CIDR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont use error
in place of true/false values. A validation should return true when validated and false when the cidr is invalid. It should only return error when there is a failure in the validation process itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Will change that.
pkg/csi_driver/controller.go
Outdated
// Create the instance | ||
filer, err = s.config.fileService.CreateInstance(ctx, newFiler) | ||
if err != nil { | ||
return nil, status.Error(codes.Internal, err.Error()) | ||
} | ||
} | ||
|
||
//Free up the reserved IPV4 Range from the ipAllocator as we would be recieving this value in the listInstances call as the createInstance operation was completed | ||
s.config.ipAllocator.Unreserve(filer.Network.ReservedIpRange) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we reserving the IP somewhere? Why reserve and unreserve in the same call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are reserving in the getUnreservedIPBlock() function which is the inside a lock. It is unreserved when your CreateInstance call resulted in a success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little unintuitive why you would unreserve an IP once creation is successful.
Maybe better names could be Hold/Unhold.
pkg/csi_driver/controller.go
Outdated
} | ||
s.config.mutex.Lock() | ||
defer s.config.mutex.Unlock() | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this error should be pushed up to right after the ListInstances
call. You're trying to use the instances before checking the error here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Will change that.
pkg/csi_driver/controller.go
Outdated
} | ||
|
||
// Pass the reservedIPRanges obtained from ListInstances call. We have the IPRanges reserved at the service end in the IPAllocator.reserved list. The GetUnreservedIPBlock would merge these two lists to form a reserved list | ||
unreservedIPBlock, err := s.config.ipAllocator.GetUnreservedIPBlock(cidr, reservedIPRanges) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is getUnreservedIPBlock
calling GetUnreservedIPBlock
. Do they have significantly different behavior? can we just use one of them here? Is one of them misnamed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I will fix the naming for this.
pkg/util/ip_reservation_test.go
Outdated
{ | ||
name: "Invalid increment", | ||
currentIP: "255.255.255.253", | ||
expected: "0.0.0.5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Don't specify an expected answer if there is none (we expect an error here)
pkg/util/ip_reservation_test.go
Outdated
name: "Valid increment with carry forward to significant bytes", | ||
currentIP: "192.168.255.253", | ||
expected: "192.169.0.5", | ||
errorExpected: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: don't need to specify false values
pkg/util/ip_reservation.go
Outdated
if ipByte == 3 { | ||
ip[ipByte] += step | ||
// Addition did not result in a carry. We have the incremented value | ||
if ip[ipByte] >= step { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does ip[ipByte] roll over if overflow, does it reset to 0? Are we relying on undefined/unspecified behavior here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For overflow, after increment, the least significant byte is less than the step and for other bytes the value is 0 as the maximum value of carry is 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
245 + 20 == 10
or 245 + 20 == 0
was the question I was asking. Either way, a better way to check for overflow is before doing the operation check if BYTE_MAX - step < ip[ipByte]
. This will find an overflow cleanly without triggering actual overflow behavior.
There are even proposals to make overflow panic in golang, so I would not rely on the current behavior:
golang/go#19624
pkg/util/ip_reservation.go
Outdated
ip[ipByte]++ | ||
|
||
// Addition did not result in a carry. We have the incremented value | ||
if ip[ipByte] > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in case of most significant byte, the rollover will occur when your value changes from 255 to 0. Hence after adding the carry we check if the value is greater than 0 for MSB
pkg/util/ip_reservation_test.go
Outdated
for index, test := range cases { | ||
ipAllocator := initTestIPAllocator() | ||
for i := 0; i < index; i++ { | ||
// Distributing IPs between the reserved lists at the cloud provider end and the reserved list in the ipAllocator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this setup should probably be a part of each test case definition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other functions don't accept the reserved list so created only for this.
pkg/csi_driver/controller.go
Outdated
@@ -62,9 +64,12 @@ type controllerServerConfig struct { | |||
driver *GCFSDriver | |||
fileService file.Service | |||
metaService metadata.Service | |||
ipAllocator *util.IPAllocator | |||
mutex sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this mutex into the IPAllocator instead of the controller. The library should provide atomicity without relying on the caller.
pkg/csi_driver/controller.go
Outdated
} | ||
|
||
// Reserve the IPRange obtained from the GetUnreservedIPBlock call | ||
s.config.ipAllocator.Reserve(unreservedIPBlock) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reserve functionality can be done inside GetUnreservedIPBlock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to decouple the Get and Reserve Operations. Also, adding a mutating operation in Get conflicts with the naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the user's perspective, is there a purpose for getting an IP block without reserving it? If not, then I think it can be condensed into two interfaces. Internally, it could remain as two different functions.
pkg/csi_driver/controller.go
Outdated
// Create the instance | ||
filer, err = s.config.fileService.CreateInstance(ctx, newFiler) | ||
if err != nil { | ||
return nil, status.Error(codes.Internal, err.Error()) | ||
} | ||
} | ||
|
||
//Free up the reserved IPV4 Range from the ipAllocator as we would be recieving this value in the listInstances call as the createInstance operation was completed | ||
s.config.ipAllocator.Unreserve(filer.Network.ReservedIpRange) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little unintuitive why you would unreserve an IP once creation is successful.
Maybe better names could be Hold/Unhold.
pkg/csi_driver/controller.go
Outdated
// Adding the reserved IP range to the instance object | ||
newFiler.Network.ReservedIpRange = reservedIPRange | ||
} | ||
|
||
// Create the instance | ||
filer, err = s.config.fileService.CreateInstance(ctx, newFiler) | ||
if err != nil { | ||
return nil, status.Error(codes.Internal, err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should unhold the IP in both success and error cases. A defer can be used after the Hold()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of error, instance has not been created yet. So other threads making the list call would not get the the IP as reserved and can end up holding the reserved IP as it is no longer blocked. Returning a response from CreateVolume ensures that the instance has been created and would be available in subsequent list calls. Let me know if I am missing something in my analysis here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you reserve the IP, and then CreateInstance fails, right now, it seems like the behavior is to keep it reserved so no other requests can use it. We want the opposite behavior. This may also be a good scenario to unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case we would need to identify whether the CreateInstance operation is running in the background and the error is due to some other source(ex timeout) or the operation itself aborted. In the previous case, we should not release the IP but in the later case we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both cases you can release the IP. IIUC,
- If operation is running in the background, then when you
ListInstances
, you will get the IP - If operation aborted, the IP didn't get consumed by the Filestore service, so it is safe to release
pkg/csi_driver/controller.go
Outdated
} | ||
|
||
// Initialize an empty reserved list. It will be populated with all the reservedIPRanges obtained from the active instances | ||
reservedIPRanges := make(map[string]bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be a map, or can it be a slice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its useful to be a map for lookup reasons. its basically being used as a "set"
pkg/util/ip_reservation.go
Outdated
func (ipAllocator *IPAllocator) GetUnreservedIPBlock(cidr string, reservedIPRanges map[string]bool) (string, error) { | ||
// Merging the reserved lists at the cloud provider end and the service end to form a final reserved list | ||
for reservedIPRange := range ipAllocator.reserved { | ||
reservedIPRanges[reservedIPRange] = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should create and modify a new map instead of modifying the user's map. Also does this need to be a map? We seem to just be iterating over it, so it could be a slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its implementation of a Set using a Map
pkg/util/ip_reservation.go
Outdated
overLap, err = isOverlap(ipRange, reservedIPRange) | ||
// Error while processing cidr is also considered as an overlap to ignore the specific IP block | ||
if overLap || err != nil { | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is being dropped. At this point, is it expected to receive an error? If not, then we should probably return immediately with an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error is for this cidrIP and reservedIP overlap combination. We can try out other combinations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I mean that we should probably return if there's an error instead of continuing
pkg/util/ip_reservation.go
Outdated
if err != nil { | ||
return "", fmt.Errorf("Unable to parse CIDR %s", cidr) | ||
} | ||
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, incrementStep29IPBlock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of modifying structures for temporary state. I think it would be simpler for incrementIP to return a new IP everytime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking pretty good. Mostly just some naming issues and nits
pkg/csi_driver/controller.go
Outdated
|
||
// Get the reserved-IPV4-cidr from the request. If value is not provided, reservedIPV4Range is defaulted to "" | ||
reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR] | ||
if ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: idiomatic golang style is usually to have a smaller block to handle the error for if !ok
, then the success case should be not inside a conditional block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, you implement some logic in case if the param is present and do nothing in case it is not. So there is just one block depending on the condition.
pkg/csi_driver/controller.go
Outdated
return nil, err | ||
} | ||
|
||
reservedIPRange, err := s.reserveIPV4Range(ctx, newFiler, reservedIPV4CIDR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of reserve
lets call it:
preallocate
(?)
lock
(?)
earmark
(?)
designate
(?)
allot
(?????)
@msau42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of those, i like preallocate the best, but what is the reverse? unallocate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depreunallocate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does hold and release sound appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
pkg/csi_driver/controller.go
Outdated
} | ||
|
||
// Getting unreserved IP in a CIDR needs to be synchronized | ||
s.config.mutex.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: name this mutex after the thing that it is mutexing. This way we wont accidentally reuse this mutex for different purposes and potentially deadlock
pkg/util/ip_reservation.go
Outdated
// Finally a final reserved list is created by merging these two lists | ||
// Potential error cases: 1) No /29 Block in the CIDR is unreserved | ||
// 2) Parsing the CIDR resulted in an error | ||
func (ipAllocator *IPAllocator) GetUnreservedIPBlock(cidr string, reservedIPRanges map[string]bool) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not super clear here that the reservedIPRanges you're passing in are being merged with those in the "ipAllocator" to me. I think this boils down to a naming issue. The code looks mostly ok but it just doesn't feel intuitive because everything is named similar things its hard to seperate concepts
pkg/util/ip_reservation.go
Outdated
if ipByte == 3 { | ||
ip[ipByte] += step | ||
// Addition did not result in a carry. We have the incremented value | ||
if ip[ipByte] >= step { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
245 + 20 == 10
or 245 + 20 == 0
was the question I was asking. Either way, a better way to check for overflow is before doing the operation check if BYTE_MAX - step < ip[ipByte]
. This will find an overflow cleanly without triggering actual overflow behavior.
There are even proposals to make overflow panic in golang, so I would not rely on the current behavior:
golang/go#19624
pkg/util/ip_reservation.go
Outdated
break | ||
} | ||
} else { | ||
// Add the carry bit from the previous addition. Value of carry can only be 1 as we are performing byte addition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will we never need step
sizes over 255
? Genuine question, I don't know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an assumption here. Would be required if you want to return smaller than /24 block instead of the existing /29
pkg/util/ip_reservation.go
Outdated
// Add the carry bit from the previous addition. Value of carry can only be 1 as we are performing byte addition | ||
ip[ipByte]++ | ||
|
||
// As the rollover case is 255->0, we conclude that addition did not result in a carry. We have the incremented value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont depend on overflow behavior, see above
pkg/util/ip_reservation.go
Outdated
if ip[ipByte] > 0 { | ||
break | ||
} | ||
//We are at the most significant byte and adding 1 to it resulted in overflowing its value. IP Address rolled over. Returning error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this an error? can't we then increment the next value by 1?
0.255.255.255
step by 1
should be 1.0.0.0
not an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this check is after incrementing ip[ipByte]. So, after increment, if this value is still 0, that implies that it transitioned from 255 to 0 and there is an overflow. If you are already at the Most Significant byte(ipByte== 0) and the value is 0 after increment, it means that there is an overflow in the IP range and you return with an error.
pkg/util/ip_reservation_test.go
Outdated
currentIP: "255.255.255.253", | ||
step: 167, | ||
errorExpected: true, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add test
currentIP: "0.255.255.255",
step: 3,
expected: "1.0.0.2",
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think currentIP: "192.255.255.253", step: 255, expected: "193.0.0.252",
is a similar case. But can add this if you want
53dc988
to
ce69577
Compare
pkg/csi_driver/controller.go
Outdated
@@ -101,6 +104,30 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
return nil, status.Error(codes.AlreadyExists, err.Error()) | |||
} | |||
} else { | |||
|
|||
// Get the reserved-IPV4-cidr from the request. If value is not provided, reservedIPV4Range is defaulted to "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update all the comments? In this particular comment, it's referring to variables that don't exist. many of the other comments are out of date too
pkg/csi_driver/controller.go
Outdated
|
||
// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background | ||
// In either cases, the listInstances responses would consists of the updated result | ||
defer s.config.ipAllocator.Release(reservedIPRange) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The defer should go right after you reserve the IP. That way it gets invoked if the function exits for any reason afterwards.
pkg/csi_driver/controller.go
Outdated
reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR] | ||
if ok { | ||
|
||
// Prevents lock acquisition in case the CIDR is invalid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that the lock is hidden inside the util, this comment is not necessary
pkg/csi_driver/controller.go
Outdated
// Prevents lock acquisition in case the CIDR is invalid | ||
validCIDR := s.config.ipAllocator.ValidateCIDR(reservedIPV4CIDR) | ||
if !validCIDR { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err is not set to anything here?
pkg/csi_driver/controller.go
Outdated
} | ||
|
||
// Pass the reservedIPRanges obtained from ListInstances call. We have the IPRanges pending reservation in the IPAllocator.hold list | ||
// The GetUnreservedIPBlock would merge these two lists to form a reserved list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is an implementation detail and would be better suited for inside the util library
pkg/util/ip_reservation.go
Outdated
// IPAllocator struct maintains the current list of reserved /29 IPs pending a write operation in the cloud service. An IP Range is freed up from this list when it is available at the cloud provider end | ||
// As the create instance operation is in parallel, it is possible that an IP range was reserved by the service instance but is not updated in the cloud instance. The reserved list stores such IP ranges | ||
type IPAllocator struct { | ||
hold map[string]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move the comment describing this field right above the field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And can you give an example of the actual string syntax in this map, ie The key is the held ipnet in string format, ie "1.2.3.0/29"
pkg/util/ip_reservation.go
Outdated
reserved[reservedIPRange] = true | ||
} | ||
|
||
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; cidrIPBlock, err = incrementIP(cidrIPBlock, incrementStep29IPBlock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if incrementIP errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We stop iterating over the cidr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you want to log the error? right now it is lost
pkg/util/ip_reservation.go
Outdated
|
||
// Reserve function adds a particular IP block in the reserve pool | ||
// Argument ipBlock string is an IPV4 range which needs to be reserved | ||
func (ipAllocator *IPAllocator) Hold(ipV4Range string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a private function
pkg/util/ip_reservation.go
Outdated
// Unreserve function frees up a reserved IP | ||
// Argument ipBlock string is an IPV4 range which needs to be unreserved | ||
func (ipAllocator *IPAllocator) Release(ipV4Range string) { | ||
delete(ipAllocator.hold, ipV4Range) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a lock
Also please run hack/update-gofmt.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good! I just have nit picky naming comments.
pkg/csi_driver/controller.go
Outdated
@@ -104,28 +104,26 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
return nil, status.Error(codes.AlreadyExists, err.Error()) | |||
} | |||
} else { | |||
|
|||
// Get the reserved-IPV4-cidr from the request. If value is not provided, reservedIPV4Range is defaulted to "" | |||
// If we are creating a new instance, we need to evaluate the reservedIPRange range if reserved-ipv4-cidr is provided |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe something like "we need to pick an unused /29 range from reserved-ipv4-cidr" would be more clear
pkg/csi_driver/controller.go
Outdated
validCIDR := s.config.ipAllocator.ValidateCIDR(reservedIPV4CIDR) | ||
if !validCIDR { | ||
return nil, err | ||
return nil, fmt.Errorf("The provided reserved-ipv4-cidr %s evaluated to be invalid after parsing", reservedIPV4CIDR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error messages should start with lowercase. a simpler wording could be "invalid reserved-ipv4-cidr %s"
pkg/csi_driver/controller.go
Outdated
} | ||
|
||
reservedIPRange, err := s.reserveIPV4Range(ctx, newFiler, reservedIPV4CIDR) | ||
|
||
// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background | ||
// In either cases, the ListInstances responses would consist of the updated result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the ListInstances response will contain the reservedIPRange if the operation was started"
pkg/csi_driver/controller.go
Outdated
|
||
// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background | ||
// In either cases, the ListInstances responses would consist of the updated result | ||
defer s.config.ipAllocator.ReleaseIPV4Range(reservedIPRange) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming of this function is symmetrical with the controller call, but not with the ipAllocator. It may be cleaner to make it symmetrical with ipAlloactor, and break up reserveIPRange into "getUsedIPRanges" and then call the ipAllocator.GetUnreservedIPBlock.
Let's also try to standardize on the terminology. We're using ipblock, iprange, and ipv4-cidr. Maybe we can reduce it down to two.
- ipv4-cidr to refer to the bigger ipnet that we're extracting from. This aligns with the rest of GKE naming
- ipRange to refer to the /29 ipnet that we pass to Filestore. This aligns with Filestore naming.
pkg/csi_driver/controller.go
Outdated
@@ -137,7 +135,7 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil | |||
} | |||
|
|||
// getUnreservedIPBlock returns the available IP in the cidr. It is synchronized in order to avoid data race while getting unreserved IP range in a CIDR | |||
// getUnreservedIPBlock returns the available IP in the cidr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the function name is not consistent in the comment
pkg/util/ip_reservation.go
Outdated
holdMutex sync.Mutex | ||
} | ||
|
||
// NewIPAllocator is the constructor to initialize the IPAllocator object | ||
// Argument reserved map[string]bool is a set of currently reserved IPs for which the CreateInstance operation is not completed | ||
// Argument hold map[string]bool is a set of IP ranges currently reserved by service instances but pending reservation in the cloud instances | ||
func NewIPAllocator(hold map[string]bool) *IPAllocator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does hold need to be initialized by the caller, or can it be initialized inside the constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to be initialized by the caller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? right now I see the caller is just initializing it to empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just using it as a parameterized constructor here where the caller has the flexibility to initialize it to any value rather than enforcing a default
pkg/util/ip_reservation.go
Outdated
hold map[string]bool | ||
// hold set maintains the set of IP ranges that have been reserved by the service instances but pending reservation in the cloud instances | ||
// The key is a IP range currently reserved by a service instance e.g(192.168.92.0/29). Value is a bool to implement map as a set | ||
hold map[string]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better name could be "pendingIPRanges"
pkg/util/ip_reservation.go
Outdated
@@ -63,23 +68,21 @@ func (ipAllocator *IPAllocator) ValidateCIDR(cidr string) bool { | |||
|
|||
// GetUnreservedIPBlock returns an unreserved /29 IP block. | |||
// cidr: Provided cidr address in which we need to look for an unreserved /29 IP Block | |||
// reservedIPRanges: All the Reserved IP ranges present in the cloud provider | |||
// All the Reserved IP ranges not present in the cloud provider are extracted from the reserved list in the IPAllocator | |||
// activeInstancesReservedIPRanges: All the reserved IP ranges present in the cloud instances |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "used" makes more sense than "reserved"
eb25d90
to
95a2f1b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 small nits. then most likely LGTM
pkg/csi_driver/controller.go
Outdated
// If we are creating a new instance, we need to evaluate the reservedIPRange range if reserved-ipv4-cidr is provided | ||
// If the param was not provided, we default reservedIPRange to "" and cloud provider takes care of the allocation | ||
reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR] | ||
if ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if only needed inside if
scope you can do:
if reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR]; ok {
pkg/util/ip_reservation.go
Outdated
incrementedIP = incrementedIP.To4() | ||
|
||
// Step can be added directly to the Least Significant Byte and we can return the result | ||
if incrementedIP[3] < 255-step { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: 255 should be a const
b726b99
to
8035839
Compare
pkg/csi_driver/controller.go
Outdated
reservedIPRange, err := s.reserveIPV4Range(ctx, newFiler, reservedIPV4CIDR) | ||
|
||
// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background | ||
// In either cases, the ListInstances responses would consist of the updated result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the ListInstances response will include the reservedIPRange if the operation started"
pkg/csi_driver/controller.go
Outdated
@@ -110,6 +134,26 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu | |||
return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil | |||
} | |||
|
|||
// getUnreservedIPBlock returns the available IP in the cidr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the function name is wrong in this comment
"returns an available /29 IP range from the cidr"
pkg/util/ip_reservation.go
Outdated
type IPAllocator struct { | ||
// hold set maintains the set of IP ranges that have been reserved by the service instances but pending reservation in the cloud instances | ||
// The key is a IP range currently reserved by a service instance e.g(192.168.92.0/29). Value is a bool to implement map as a set | ||
hold map[string]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a more descriptive name may be "pendingIPRanges"
pkg/util/ip_reservation.go
Outdated
// GetUnreservedIPBlock returns an unreserved /29 IP block. | ||
// cidr: Provided cidr address in which we need to look for an unreserved /29 IP Block | ||
// activeInstancesReservedIPRanges: All the reserved IP ranges present in the cloud instances | ||
// All the reserved IP ranges in the service instances not updated in cloud instances extracted from the hold list in the IPAllocator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this comment is referring to
pkg/util/ip_reservation.go
Outdated
// activeInstancesReservedIPRanges: All the reserved IP ranges present in the cloud instances | ||
// All the reserved IP ranges in the service instances not updated in cloud instances extracted from the hold list in the IPAllocator | ||
// Finally a final reserved list is created by merging these two lists | ||
// Potential error cases: 1) No /29 Block in the CIDR is unreserved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: when enumerating a list, I would put 1) on a new line
pkg/util/ip_reservation.go
Outdated
reserved[reservedIPRange] = true | ||
} | ||
|
||
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; cidrIPBlock, err = incrementIP(cidrIPBlock, incrementStep29IPBlock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you want to log the error? right now it is lost
pkg/util/ip_reservation.go
Outdated
reserved[reservedIPRange] = true | ||
} | ||
|
||
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; cidrIPBlock, err = incrementIP(cidrIPBlock, incrementStep29IPBlock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cidrIPBlock is a little misleading because it's actually an ip, not an ipnet.
pkg/util/ip_reservation_test.go
Outdated
ipAllocator := initTestIPAllocator() | ||
for i := 0; i < index; i++ { | ||
// Distributing IPs between the reserved lists at the cloud provider end and the hold list in the ipAllocator | ||
if i%2 == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a good idea to generate test input based on the ordering of the test case. If a test case gets added in the middle, then all the subsequent tests will fail. It would be better make the testcase inputs an explicit parameter in the test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The distribution of IPs in the two lists is completely random and the test does not depend on the ordering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you moved the test case "0/4 /29 IP blocks used", to the end of the cases slice, would the tests fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. got it. thanks
errorExpected bool | ||
}{ | ||
{ | ||
name: "Overlapping CIDRs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also have test cases that check overlapping and non-overlapping CIDRs where the mask (/x) is the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
pkg/util/ip_reservation_test.go
Outdated
errorExpected bool | ||
}{ | ||
{ | ||
name: "0/4 /29 IP blocks used", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a test case where the cidr is misaligned and could cause an ip range to be returned that is smaller than /29, ie 192.168.0.253/29
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For our existing implementation, we are using masking to set up the address. Hence for the CIDR you provided, the starting /29 IP would still be 192.168.0.248(the starting IP address of that CIDR block) and you would always get a block with 8 IPs. Can add the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe instead of masking it, we should return an error if the provided CIDR is misaligned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
192.168.0.253/29 is one of the IP values in the CIDR. IP value can be anything from 192.168.0.248-255/29 in this network. When we get a reservedIPRange, its not necessarily the first IP in that CIDR but part of a network with 8 IPs. The /29 CIDR is uniquely identified by its network ID which is the first 29 bits. The last 3 bits can be anything and hence we use the mask to reset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we ensure that all 8 ips are unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, when we get a reservedIPRange and pass it to Filestore, that IP range needs to contain 8 IPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup that part is taken care of by the cidr overlap function. it ensures the trailing 29 bits of your network are completely unique. so you are always going to end up with 8 unused IPs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you pass in a misaligned IP range to Filestore, will it succeed?
I think for simplicity, we should validate that the user's CIDR is aligned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would never end up passing a misaligned range from the implementation. It always passes in the first IP of the available /29 block. If the user passes the reserved-ipv4-cidr of 192.168.0.253/29, we assume he has all the IPs from 192.168.0.248-255/29
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine. We should at least validate the user specified CIDR params so there's no confusion if we end up creating a filestore instance with a "lower" IP
c084fd9
to
526ba19
Compare
pkg/util/ip_reservation_test.go
Outdated
cidr: "192.168.92.192", | ||
expected: false, | ||
}, | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the tests for the required alignment check
d992a8b
to
14d6cd4
Compare
KNOWN_ISSUES.md
Outdated
@@ -1,14 +1,17 @@ | |||
# v0.1.0 | |||
* IP conflicts: By default, Cloud Filestore creation will pick an unused IP subnet to allocate its | |||
* IP conflicts: By default, Cloud Filestore creation will pick an unused IP range to allocate its |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this diff is showing up because it's already like this in master. Maybe you need to rebase?
pkg/util/ip_reservation.go
Outdated
// The reserved-ipv4-cidr network size must be at least /29 | ||
cidrSize, _ := ipnet.Mask.Size() | ||
if cidrSize > ipRangeSize { | ||
return nil, nil, fmt.Errorf("the maximum size of network address in the cidr can be /%d", ipRangeSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"maximum" is sort of misleading because the bigger the number, the smaller the network is. Maybe instead the message should say "the reserved-ipv4-cidr network size must be at least /29"
pkg/util/ip_reservation.go
Outdated
|
||
// The IP specified in the reserved-ipv4-cidr must be must be aligned on the /29 network boundary | ||
if ip.String() != ip.Mask(ipRangeMask).String() { | ||
return nil, nil, fmt.Errorf("the IP specified in the reserved-ipv4-cidr must be aligned to cover /29 IP Range") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"... must be aligned on the /29 network boundary"
errorExpected: false, | ||
}, | ||
{ | ||
name: "Valid increment with step size 8 and carry forward uptil 1st byte", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add in test cases that could potentially catch off-by-one errors?
For example, test cases where the expected should be ".255", ".0" or ".1"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have added the tests
0411460
to
53a9be1
Compare
53a9be1
to
eee9991
Compare
errorExpected: false, | ||
}, | ||
{ | ||
name: "Valid increment with step size 8 and last byte expected 0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected byte .0 test
errorExpected: false, | ||
}, | ||
{ | ||
name: "Valid increment with step size 8 and last byte expected 255", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected byte .255 test
errorExpected: false, | ||
}, | ||
{ | ||
name: "Valid increment with step size 8 and last byte expected 1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected byte .1 test
|
||
// The IP specified in the reserved-ipv4-cidr must be aligned on the /29 network boundary | ||
if ip.String() != ip.Mask(ipRangeMask).String() { | ||
return nil, nil, fmt.Errorf("the IP specified in the reserved-ipv4-cidr must be aligned on the /29 network boundary") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated error message
// The reserved-ipv4-cidr network size must be at least /29 | ||
cidrSize, _ := ipnet.Mask.Size() | ||
if cidrSize > ipRangeSize { | ||
return nil, nil, fmt.Errorf("the reserved-ipv4-cidr network size must be at least /%d", ipRangeSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated error message
/lgtm |
@msau42 I think you have to explicitly approve I plan on doing one last pass-through though so please hold off until then |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: krunaljain, msau42 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold |
/lgtm |
…ncy-openshift-4.14-ose-gcp-filestore-csi-driver Updating ose-gcp-filestore-csi-driver images to be consistent with ART
This PR provides the GCFS user to reserve a cidr block by providing a configurable parameter named reserved-ipv4-cidr. The /29 IP Range required for the filestore instance would be exclusively allocated from this CIDR range to avoid IP conflicts with kubernetes service IPs.