Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/ip block reservation #12

Merged

Conversation

krunaljain
Copy link
Contributor

@krunaljain krunaljain commented Jul 23, 2018

This PR provides the GCFS user to reserve a cidr block by providing a configurable parameter named reserved-ipv4-cidr. The /29 IP Range required for the filestore instance would be exclusively allocated from this CIDR range to avoid IP conflicts with kubernetes service IPs.

Users can now reserve a CIDR range and the /29 IP Range required for filestore instances would be exclusively provided from this range. This functionality is intended to avoid IP conflicts between filestore instances and service instances. If the CIDR range is not provided, the current implementation would allocate the reserved IP Range. 

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 23, 2018
@msau42
Copy link
Contributor

msau42 commented Jul 23, 2018

/ok-to-test
/hold
/assign @davidz627

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 23, 2018
}

// Increment the given IP value by the provided step
func incrementIP(ip net.IP, step byte) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of modifying the ip in-place, it would be better to create and return a new one

@@ -107,6 +120,7 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
return nil, status.Error(codes.Internal, err.Error())
}
}
s.config.mutex.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to use defer s.config.mutex.Unlock() after Lock() so you don't miss unlocking when you return early from an error

@@ -81,7 +85,16 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
if err := s.config.driver.validateVolumeCapabilities(req.GetVolumeCapabilities()); err != nil {
return nil, status.Error(codes.InvalidArgument, err.Error())
}
s.config.mutex.Lock()
instances, err := s.config.fileService.ListInstances(ctx, "-")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid holding locks across calls that could potentially block for a long time. I'm not sure if we need to lock in this case since each CreateVolume is doing its own ListInstances call

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also move this call to just before CreateInstance to avoid unnecessary cloud provider calls

Copy link
Contributor Author

@krunaljain krunaljain Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To read the updated list of reserved IPs, we have to wait until this call succeeds to avoid read after write hazard, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are thinking about the scenario on how to coordinate multiple CreateVolume requests and make them select different IPs.

One way to do it is to lock the entire CreateVolume call so only one volume create can occur at a time. This will work, but I think it will be undesirable because every cloud call could take on the order of minutes. We could try this way just for now to get it working, but we'll definitely need to improve on it.

In the end, we want to allow parallel CreateVolume operations to occur, but need to serialize IP allocation. I think this will require some sort of caching/shared allocator.

paramTier = "tier"
paramLocation = "location"
paramNetwork = "network"
paramReservedIPV4CIDR = "cidr"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change the actual user-visible parameter to "reserved-ipv4-cidr"

buffer := bytes.NewBufferString("")
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, 8) {
overLap := false
buffer.WriteString(cidrIPBlock.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can wait to convert to a string until the very end when a non-conflicting IP has been found

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on that note, we don't need to use a buffer at all. After finding a nonoverlapping cidr we can just construct the necessary string in a one line concatenation.

Copy link
Contributor Author

@krunaljain krunaljain Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overlap validation function needs two cidrs as arguments. This concatenation was made to convert the IP in CIDR to /29 cidr. Maybe I am missing something here but not sure how we can convert IP to CIDR without concatenation

}

if ipnet1.Contains(ipnet2.IP) || ipnet2.Contains(ipnet1.IP) {
return fmt.Errorf("The cidr ranges %s and %s overlap", cidr1, cidr2)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really an error. This function is more of a isCIDROverlap and should return a bool along with an error. Errors should be reserved for "problems" not a false value

buffer := bytes.NewBufferString("")
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, 8) {
overLap := false
buffer.WriteString(cidrIPBlock.String())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on that note, we don't need to use a buffer at all. After finding a nonoverlapping cidr we can just construct the necessary string in a one line concatenation.

return "", fmt.Errorf("Unable to parse CIDR %s", cidr)
}
buffer := bytes.NewBufferString("")
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, 8) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the number 8 in this case? Maybe we should make a const for it or a comment here to specify what it is. I noticed in the function definition its of type byte, this is somewhat unusual and is worth clarifying so there are no gotchas in the future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 is used to get to the next /29 address. I will add a constant for this.

for reservedIPRange := range reservedIPRanges {
err = validateCIDROverlap(buffer.String(), reservedIPRange)
if err != nil {
overLap = true

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because validateCIDROverlap returns an error for both the overlap case and a malformed cidr, this statement is not always correct. It could just be returning an error because the CIDR is malformed.


// Increment the given IP value by the provided step
func incrementIP(ip net.IP, step byte) error {
if uint8(255)-uint8(ip[len(ip)-1]) < uint8(step) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a valid increment?

for example 0.0.0.255 incremented by 2 isn't an overflow but is actually 0.0.1.2? I may be misunderstanding whats going on here. Would be helpful to put some comments explaining what these pieces are, like what is the last array entry in ip, why 255 (use a const) etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will come up with a more concrete check for this.

return fmt.Errorf("IP overflow occured when incrementing ip %s with step %d", ip.String(), step)
}
for j := len(ip) - 1; j >= 0; j-- {
ip[j] += step

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we incrementing each part of the IP by step then breaking if its over 0? This will always break after the first iteration unless the step is 0 and the original ip end was also 0. So in practice this will only increment the last part of the ip because if step is non-zero the break will trigger.

Could you please clarify with comments or fix?

Copy link
Contributor Author

@krunaljain krunaljain Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to handle carry when rollover happens. I'll update this to catch rollover with a custom step size.

)

func TestAllIPBlocksAvailable(t *testing.T) {
ips := [8]string{"192.168.92.32/29", "192.168.92.40/29", "192.168.92.48/29", "192.168.92.56/29"}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these reserved ip's or unreserved ips?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All /29 IPs in a /27 CIDR

}
}

func TestSomeIPBlocksAvailable(t *testing.T) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these test functions are prime for Table Driven Tests. They are testing same functions with different data:
https://golang.org/wiki/TableDrivenTests


}

func TestIncrementIP(t *testing.T) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Table driven tests. This one is not comprehensive. Try adding some edge cases and funky input. Error cases are also good


func TestCloneIP(t *testing.T) {
originalIP := net.ParseIP("192.168.92.32")
cloneIP := cloneIP(originalIP)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should also test that these are not actually the same object

@@ -90,6 +95,18 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
return nil, status.Error(codes.InvalidArgument, err.Error())
}

reservedIPV4CIDR := req.GetParameters()[paramReservedIPV4CIDR] // Get the reserved-IPV4-cidr from the request. It was not added in the service instance as it is only consumed by the controller
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: can you place the all the comments as a line above the code instead of at the end of the line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if cidr range is not set? Maybe generateNewFileInstance should return the cidr range from parameters as another argument

return nil, err
}

reservedIPRange, err := s.getUnreservedIPBlock(ctx, newFiler, reservedIPV4CIDR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be done after the GetInstance() call. And we should not unreserve the IP after every failure. The reason is a bit tricky.

In Kubernetes, the timeout value that gets passed into CreateVolume() is very short. This causes the create instance operation on the first time to always return with a timeout error. However, the create operation is still running, therefore the IP is still taken. So we should only allocate a new IP if we're making a new CreateInstance call.

For the failure case, we may have to add some extra checking, and only release the IP if we cannot find an instance.

return nil, status.Error(codes.Internal, err.Error())
}
}
return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil
}

// getUnreservedIPBlock returns the available IP in the cidr. It is synchronized in order to avoid data race while getting unreserved IP range in a CIDR
func (s *controllerServer) getUnreservedIPBlock(ctx context.Context, filer *file.ServiceInstance, cidr string) (string, error) {
s.config.mutex.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid holding global locks over network calls as much as possible because they can block for a very long time.

@@ -222,7 +262,8 @@ func (s *controllerServer) generateNewFileInstance(name string, capBytes int64,
location = v
case paramNetwork:
network = v
// Unused
case paramReservedIPV4CIDR:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment that reserved cidrs will be processed later

@@ -74,6 +75,9 @@ func TestCreateVolume(t *testing.T) {
},
},
},
Parameters: map[string]string{
"reserved-ipv4-cidr": "192.168.92.32/26",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reserved cidr should not be a required argument, so this should be a new test case

// ValidateCIDR function validates whether a particular cidr is a valid IP range
func (ipAllocator *IPAllocator) ValidateCIDR(cidr string) error {
if len(cidr) == 0 {
return fmt.Errorf("Passed Reserved IPV4 CIDR parameter evaluated to be empty")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the ParseCIDR function already take care of this case?


// IPAllocator struct maintains the current list of reserved /29 IPs in the system. Read access to this struct is synchronized
type IPAllocator struct {
reserved map[string]bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment describing the format of this map, along with an example?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think storing the ipnet type instead of the string representation will be more efficient. Especially later on in GetUnreservedIPBlock, you can avoid converting between ipnet and string.

}

// NewIPAllocator is the constructor to initialize the IPAllocator object
func NewIPAllocator(reserved map[string]bool) *IPAllocator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment describing what this argument means?

cidr1 := "192.168.92.32/27"
cidr2 := "192.168.92.48/26"

overlap := isOverlap(cidr1, cidr2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests can be table driven too

currentIP := "192.168.92.32"
nextIP := "192.168.92.40"
ip := net.ParseIP(currentIP)
err := incrementIP(ip, step)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests can be table driven too

reservedIPV4CIDR, reservedIPV4CIDRPresent := req.GetParameters()[paramReservedIPV4CIDR]

if !reservedIPV4CIDRPresent {
return nil, fmt.Errorf("The required parameter reserved-ipv4-cidr was not provided")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a required parameter. If it is not specified, then we leave the ReservedIPRange field blank and let Filestore choose an available IP

// Get the reserved-IPV4-cidr from the request. It was not added in the service instance as it is only consumed by the controller
reservedIPV4CIDR, reservedIPV4CIDRPresent := req.GetParameters()[paramReservedIPV4CIDR]
if !reservedIPV4CIDRPresent {
return nil, fmt.Errorf("The required parameter reserved-ipv4-cidr was not provided")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a required parameter. If it was not specified, then we leave ReservedIpRange empty

if err != nil {
return "", err
}
s.config.ipAllocator.Reserve(unreservedIPBlock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Reserve call should happen inside GetUnreservedIPBlock

return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil
}

// getUnreservedIPBlock returns the available IP in the cidr. It is synchronized in order to avoid data race while getting unreserved IP range in a CIDR
func (s *controllerServer) getUnreservedIPBlock(ctx context.Context, filer *file.ServiceInstance, cidr string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better name could be 'reserveIPRangeto match the eventual argument, and to avoid confusion with theGetUnreservedIPBlock` call in the IPAllocator

@@ -222,7 +272,10 @@ func (s *controllerServer) generateNewFileInstance(name string, capBytes int64,
location = v
case paramNetwork:
network = v
// Unused

//Ignore the cidr flag as it is not passed to the cloud provider. It is only consumed by the controller to get an unreserved IP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention in the comment that this argument will be processed later in reserveIPRange

// IPAllocator struct maintains the current list of reserved /29 IPs pending a write operation in the cloud service. An IP Range is freed up from this list when it is available at the cloud provider end
// As the create instance operation is in parallel, it is possible that an IP range was reserved by the service instance but is not updated in the cloud instance. The reserved list stores such IP ranges
type IPAllocator struct {
reserved map[string]bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take a step back and whiteboard this design. I don't think just a single map is sufficient for synchronizing creation and deletion.

@@ -101,15 +106,61 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
return nil, status.Error(codes.AlreadyExists, err.Error())
}
} else {

// Get the reserved-IPV4-cidr from the request. It was not added in the service instance as it is only consumed by the controller
reservedIPV4CIDR, reservedIPV4CIDRPresent := req.GetParameters()[paramReservedIPV4CIDR]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just use ok here instead of reservedIPV4CIDRPresent.

https://github.com/golang/go/wiki/CodeReviewComments#variable-names

}

// Prevents lock acquisition in case the CIDR is invalid
err = s.config.ipAllocator.ValidateCIDR(reservedIPV4CIDR)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont use error in place of true/false values. A validation should return true when validated and false when the cidr is invalid. It should only return error when there is a failure in the validation process itself

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Will change that.

// Create the instance
filer, err = s.config.fileService.CreateInstance(ctx, newFiler)
if err != nil {
return nil, status.Error(codes.Internal, err.Error())
}
}

//Free up the reserved IPV4 Range from the ipAllocator as we would be recieving this value in the listInstances call as the createInstance operation was completed
s.config.ipAllocator.Unreserve(filer.Network.ReservedIpRange)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we reserving the IP somewhere? Why reserve and unreserve in the same call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are reserving in the getUnreservedIPBlock() function which is the inside a lock. It is unreserved when your CreateInstance call resulted in a success.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little unintuitive why you would unreserve an IP once creation is successful.

Maybe better names could be Hold/Unhold.

}
s.config.mutex.Lock()
defer s.config.mutex.Unlock()
if err != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this error should be pushed up to right after the ListInstances call. You're trying to use the instances before checking the error here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Will change that.

}

// Pass the reservedIPRanges obtained from ListInstances call. We have the IPRanges reserved at the service end in the IPAllocator.reserved list. The GetUnreservedIPBlock would merge these two lists to form a reserved list
unreservedIPBlock, err := s.config.ipAllocator.GetUnreservedIPBlock(cidr, reservedIPRanges)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is getUnreservedIPBlock calling GetUnreservedIPBlock. Do they have significantly different behavior? can we just use one of them here? Is one of them misnamed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I will fix the naming for this.

{
name: "Invalid increment",
currentIP: "255.255.255.253",
expected: "0.0.0.5",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Don't specify an expected answer if there is none (we expect an error here)

name: "Valid increment with carry forward to significant bytes",
currentIP: "192.168.255.253",
expected: "192.169.0.5",
errorExpected: false,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't need to specify false values

if ipByte == 3 {
ip[ipByte] += step
// Addition did not result in a carry. We have the incremented value
if ip[ipByte] >= step {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does ip[ipByte] roll over if overflow, does it reset to 0? Are we relying on undefined/unspecified behavior here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For overflow, after increment, the least significant byte is less than the step and for other bytes the value is 0 as the maximum value of carry is 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

245 + 20 == 10 or 245 + 20 == 0 was the question I was asking. Either way, a better way to check for overflow is before doing the operation check if BYTE_MAX - step < ip[ipByte]. This will find an overflow cleanly without triggering actual overflow behavior.

There are even proposals to make overflow panic in golang, so I would not rely on the current behavior:
golang/go#19624

ip[ipByte]++

// Addition did not result in a carry. We have the incremented value
if ip[ipByte] > 0 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in case of most significant byte, the rollover will occur when your value changes from 255 to 0. Hence after adding the carry we check if the value is greater than 0 for MSB

for index, test := range cases {
ipAllocator := initTestIPAllocator()
for i := 0; i < index; i++ {
// Distributing IPs between the reserved lists at the cloud provider end and the reserved list in the ipAllocator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this setup should probably be a part of each test case definition

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other functions don't accept the reserved list so created only for this.

@@ -62,9 +64,12 @@ type controllerServerConfig struct {
driver *GCFSDriver
fileService file.Service
metaService metadata.Service
ipAllocator *util.IPAllocator
mutex sync.Mutex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this mutex into the IPAllocator instead of the controller. The library should provide atomicity without relying on the caller.

}

// Reserve the IPRange obtained from the GetUnreservedIPBlock call
s.config.ipAllocator.Reserve(unreservedIPBlock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reserve functionality can be done inside GetUnreservedIPBlock

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to decouple the Get and Reserve Operations. Also, adding a mutating operation in Get conflicts with the naming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the user's perspective, is there a purpose for getting an IP block without reserving it? If not, then I think it can be condensed into two interfaces. Internally, it could remain as two different functions.

// Create the instance
filer, err = s.config.fileService.CreateInstance(ctx, newFiler)
if err != nil {
return nil, status.Error(codes.Internal, err.Error())
}
}

//Free up the reserved IPV4 Range from the ipAllocator as we would be recieving this value in the listInstances call as the createInstance operation was completed
s.config.ipAllocator.Unreserve(filer.Network.ReservedIpRange)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little unintuitive why you would unreserve an IP once creation is successful.

Maybe better names could be Hold/Unhold.

// Adding the reserved IP range to the instance object
newFiler.Network.ReservedIpRange = reservedIPRange
}

// Create the instance
filer, err = s.config.fileService.CreateInstance(ctx, newFiler)
if err != nil {
return nil, status.Error(codes.Internal, err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should unhold the IP in both success and error cases. A defer can be used after the Hold()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of error, instance has not been created yet. So other threads making the list call would not get the the IP as reserved and can end up holding the reserved IP as it is no longer blocked. Returning a response from CreateVolume ensures that the instance has been created and would be available in subsequent list calls. Let me know if I am missing something in my analysis here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you reserve the IP, and then CreateInstance fails, right now, it seems like the behavior is to keep it reserved so no other requests can use it. We want the opposite behavior. This may also be a good scenario to unit test.

Copy link
Contributor Author

@krunaljain krunaljain Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we would need to identify whether the CreateInstance operation is running in the background and the error is due to some other source(ex timeout) or the operation itself aborted. In the previous case, we should not release the IP but in the later case we should.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both cases you can release the IP. IIUC,

  1. If operation is running in the background, then when you ListInstances, you will get the IP
  2. If operation aborted, the IP didn't get consumed by the Filestore service, so it is safe to release

}

// Initialize an empty reserved list. It will be populated with all the reservedIPRanges obtained from the active instances
reservedIPRanges := make(map[string]bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be a map, or can it be a slice?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its useful to be a map for lookup reasons. its basically being used as a "set"

func (ipAllocator *IPAllocator) GetUnreservedIPBlock(cidr string, reservedIPRanges map[string]bool) (string, error) {
// Merging the reserved lists at the cloud provider end and the service end to form a final reserved list
for reservedIPRange := range ipAllocator.reserved {
reservedIPRanges[reservedIPRange] = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should create and modify a new map instead of modifying the user's map. Also does this need to be a map? We seem to just be iterating over it, so it could be a slice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its implementation of a Set using a Map

overLap, err = isOverlap(ipRange, reservedIPRange)
// Error while processing cidr is also considered as an overlap to ignore the specific IP block
if overLap || err != nil {
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is being dropped. At this point, is it expected to receive an error? If not, then we should probably return immediately with an error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error is for this cidrIP and reservedIP overlap combination. We can try out other combinations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean that we should probably return if there's an error instead of continuing

if err != nil {
return "", fmt.Errorf("Unable to parse CIDR %s", cidr)
}
for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; err = incrementIP(cidrIPBlock, incrementStep29IPBlock) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of modifying structures for temporary state. I think it would be simpler for incrementIP to return a new IP everytime.

Copy link

@davidz627 davidz627 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking pretty good. Mostly just some naming issues and nits


// Get the reserved-IPV4-cidr from the request. If value is not provided, reservedIPV4Range is defaulted to ""
reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR]
if ok {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: idiomatic golang style is usually to have a smaller block to handle the error for if !ok, then the success case should be not inside a conditional block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, you implement some logic in case if the param is present and do nothing in case it is not. So there is just one block depending on the condition.

return nil, err
}

reservedIPRange, err := s.reserveIPV4Range(ctx, newFiler, reservedIPV4CIDR)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of reserve lets call it:
preallocate (?)
lock (?)
earmark (?)
designate (?)
allot (?????)
@msau42

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of those, i like preallocate the best, but what is the reverse? unallocate?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depreunallocate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does hold and release sound appropriate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

}

// Getting unreserved IP in a CIDR needs to be synchronized
s.config.mutex.Lock()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: name this mutex after the thing that it is mutexing. This way we wont accidentally reuse this mutex for different purposes and potentially deadlock

// Finally a final reserved list is created by merging these two lists
// Potential error cases: 1) No /29 Block in the CIDR is unreserved
// 2) Parsing the CIDR resulted in an error
func (ipAllocator *IPAllocator) GetUnreservedIPBlock(cidr string, reservedIPRanges map[string]bool) (string, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not super clear here that the reservedIPRanges you're passing in are being merged with those in the "ipAllocator" to me. I think this boils down to a naming issue. The code looks mostly ok but it just doesn't feel intuitive because everything is named similar things its hard to seperate concepts

if ipByte == 3 {
ip[ipByte] += step
// Addition did not result in a carry. We have the incremented value
if ip[ipByte] >= step {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

245 + 20 == 10 or 245 + 20 == 0 was the question I was asking. Either way, a better way to check for overflow is before doing the operation check if BYTE_MAX - step < ip[ipByte]. This will find an overflow cleanly without triggering actual overflow behavior.

There are even proposals to make overflow panic in golang, so I would not rely on the current behavior:
golang/go#19624

break
}
} else {
// Add the carry bit from the previous addition. Value of carry can only be 1 as we are performing byte addition

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will we never need step sizes over 255? Genuine question, I don't know.

Copy link
Contributor Author

@krunaljain krunaljain Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an assumption here. Would be required if you want to return smaller than /24 block instead of the existing /29

// Add the carry bit from the previous addition. Value of carry can only be 1 as we are performing byte addition
ip[ipByte]++

// As the rollover case is 255->0, we conclude that addition did not result in a carry. We have the incremented value

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont depend on overflow behavior, see above

if ip[ipByte] > 0 {
break
}
//We are at the most significant byte and adding 1 to it resulted in overflowing its value. IP Address rolled over. Returning error

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this an error? can't we then increment the next value by 1?

0.255.255.255 step by 1 should be 1.0.0.0 not an error

Copy link
Contributor Author

@krunaljain krunaljain Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check is after incrementing ip[ipByte]. So, after increment, if this value is still 0, that implies that it transitioned from 255 to 0 and there is an overflow. If you are already at the Most Significant byte(ipByte== 0) and the value is 0 after increment, it means that there is an overflow in the IP range and you return with an error.

currentIP: "255.255.255.253",
step: 167,
errorExpected: true,
},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add test

currentIP:     "0.255.255.255",
step:          3,
expected: "1.0.0.2",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think currentIP: "192.255.255.253", step: 255, expected: "193.0.0.252", is a similar case. But can add this if you want

@krunaljain krunaljain force-pushed the feature/ip_block_reservation branch from 53dc988 to ce69577 Compare July 31, 2018 23:58
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 31, 2018
@@ -101,6 +104,30 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
return nil, status.Error(codes.AlreadyExists, err.Error())
}
} else {

// Get the reserved-IPV4-cidr from the request. If value is not provided, reservedIPV4Range is defaulted to ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update all the comments? In this particular comment, it's referring to variables that don't exist. many of the other comments are out of date too


// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background
// In either cases, the listInstances responses would consists of the updated result
defer s.config.ipAllocator.Release(reservedIPRange)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The defer should go right after you reserve the IP. That way it gets invoked if the function exits for any reason afterwards.

reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR]
if ok {

// Prevents lock acquisition in case the CIDR is invalid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the lock is hidden inside the util, this comment is not necessary

// Prevents lock acquisition in case the CIDR is invalid
validCIDR := s.config.ipAllocator.ValidateCIDR(reservedIPV4CIDR)
if !validCIDR {
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err is not set to anything here?

}

// Pass the reservedIPRanges obtained from ListInstances call. We have the IPRanges pending reservation in the IPAllocator.hold list
// The GetUnreservedIPBlock would merge these two lists to form a reserved list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is an implementation detail and would be better suited for inside the util library

// IPAllocator struct maintains the current list of reserved /29 IPs pending a write operation in the cloud service. An IP Range is freed up from this list when it is available at the cloud provider end
// As the create instance operation is in parallel, it is possible that an IP range was reserved by the service instance but is not updated in the cloud instance. The reserved list stores such IP ranges
type IPAllocator struct {
hold map[string]bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the comment describing this field right above the field?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And can you give an example of the actual string syntax in this map, ie The key is the held ipnet in string format, ie "1.2.3.0/29"

reserved[reservedIPRange] = true
}

for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; cidrIPBlock, err = incrementIP(cidrIPBlock, incrementStep29IPBlock) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if incrementIP errors?

Copy link
Contributor Author

@krunaljain krunaljain Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We stop iterating over the cidr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to log the error? right now it is lost


// Reserve function adds a particular IP block in the reserve pool
// Argument ipBlock string is an IPV4 range which needs to be reserved
func (ipAllocator *IPAllocator) Hold(ipV4Range string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a private function

// Unreserve function frees up a reserved IP
// Argument ipBlock string is an IPV4 range which needs to be unreserved
func (ipAllocator *IPAllocator) Release(ipV4Range string) {
delete(ipAllocator.hold, ipV4Range)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a lock

@msau42
Copy link
Contributor

msau42 commented Aug 1, 2018

Also please run hack/update-gofmt.sh

Copy link
Contributor

@msau42 msau42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! I just have nit picky naming comments.

@@ -104,28 +104,26 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
return nil, status.Error(codes.AlreadyExists, err.Error())
}
} else {

// Get the reserved-IPV4-cidr from the request. If value is not provided, reservedIPV4Range is defaulted to ""
// If we are creating a new instance, we need to evaluate the reservedIPRange range if reserved-ipv4-cidr is provided
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe something like "we need to pick an unused /29 range from reserved-ipv4-cidr" would be more clear

validCIDR := s.config.ipAllocator.ValidateCIDR(reservedIPV4CIDR)
if !validCIDR {
return nil, err
return nil, fmt.Errorf("The provided reserved-ipv4-cidr %s evaluated to be invalid after parsing", reservedIPV4CIDR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error messages should start with lowercase. a simpler wording could be "invalid reserved-ipv4-cidr %s"

}

reservedIPRange, err := s.reserveIPV4Range(ctx, newFiler, reservedIPV4CIDR)

// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background
// In either cases, the ListInstances responses would consist of the updated result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the ListInstances response will contain the reservedIPRange if the operation was started"


// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background
// In either cases, the ListInstances responses would consist of the updated result
defer s.config.ipAllocator.ReleaseIPV4Range(reservedIPRange)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming of this function is symmetrical with the controller call, but not with the ipAllocator. It may be cleaner to make it symmetrical with ipAlloactor, and break up reserveIPRange into "getUsedIPRanges" and then call the ipAllocator.GetUnreservedIPBlock.

Let's also try to standardize on the terminology. We're using ipblock, iprange, and ipv4-cidr. Maybe we can reduce it down to two.

  1. ipv4-cidr to refer to the bigger ipnet that we're extracting from. This aligns with the rest of GKE naming
  2. ipRange to refer to the /29 ipnet that we pass to Filestore. This aligns with Filestore naming.

@@ -137,7 +135,7 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil
}

// getUnreservedIPBlock returns the available IP in the cidr. It is synchronized in order to avoid data race while getting unreserved IP range in a CIDR
// getUnreservedIPBlock returns the available IP in the cidr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the function name is not consistent in the comment

holdMutex sync.Mutex
}

// NewIPAllocator is the constructor to initialize the IPAllocator object
// Argument reserved map[string]bool is a set of currently reserved IPs for which the CreateInstance operation is not completed
// Argument hold map[string]bool is a set of IP ranges currently reserved by service instances but pending reservation in the cloud instances
func NewIPAllocator(hold map[string]bool) *IPAllocator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does hold need to be initialized by the caller, or can it be initialized inside the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be initialized by the caller

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? right now I see the caller is just initializing it to empty

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just using it as a parameterized constructor here where the caller has the flexibility to initialize it to any value rather than enforcing a default

hold map[string]bool
// hold set maintains the set of IP ranges that have been reserved by the service instances but pending reservation in the cloud instances
// The key is a IP range currently reserved by a service instance e.g(192.168.92.0/29). Value is a bool to implement map as a set
hold map[string]bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better name could be "pendingIPRanges"

@@ -63,23 +68,21 @@ func (ipAllocator *IPAllocator) ValidateCIDR(cidr string) bool {

// GetUnreservedIPBlock returns an unreserved /29 IP block.
// cidr: Provided cidr address in which we need to look for an unreserved /29 IP Block
// reservedIPRanges: All the Reserved IP ranges present in the cloud provider
// All the Reserved IP ranges not present in the cloud provider are extracted from the reserved list in the IPAllocator
// activeInstancesReservedIPRanges: All the reserved IP ranges present in the cloud instances
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "used" makes more sense than "reserved"

Copy link

@davidz627 davidz627 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 small nits. then most likely LGTM

// If we are creating a new instance, we need to evaluate the reservedIPRange range if reserved-ipv4-cidr is provided
// If the param was not provided, we default reservedIPRange to "" and cloud provider takes care of the allocation
reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR]
if ok {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if only needed inside if scope you can do:
if reservedIPV4CIDR, ok := req.GetParameters()[paramReservedIPV4CIDR]; ok {

incrementedIP = incrementedIP.To4()

// Step can be added directly to the Least Significant Byte and we can return the result
if incrementedIP[3] < 255-step {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 255 should be a const

reservedIPRange, err := s.reserveIPV4Range(ctx, newFiler, reservedIPV4CIDR)

// Possible cases are 1) CreateInstanceAborted, 2)CreateInstance running in background
// In either cases, the ListInstances responses would consist of the updated result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the ListInstances response will include the reservedIPRange if the operation started"

@@ -110,6 +134,26 @@ func (s *controllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolu
return &csi.CreateVolumeResponse{Volume: fileInstanceToCSIVolume(filer, modeInstance)}, nil
}

// getUnreservedIPBlock returns the available IP in the cidr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the function name is wrong in this comment

"returns an available /29 IP range from the cidr"

type IPAllocator struct {
// hold set maintains the set of IP ranges that have been reserved by the service instances but pending reservation in the cloud instances
// The key is a IP range currently reserved by a service instance e.g(192.168.92.0/29). Value is a bool to implement map as a set
hold map[string]bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a more descriptive name may be "pendingIPRanges"

// GetUnreservedIPBlock returns an unreserved /29 IP block.
// cidr: Provided cidr address in which we need to look for an unreserved /29 IP Block
// activeInstancesReservedIPRanges: All the reserved IP ranges present in the cloud instances
// All the reserved IP ranges in the service instances not updated in cloud instances extracted from the hold list in the IPAllocator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this comment is referring to

// activeInstancesReservedIPRanges: All the reserved IP ranges present in the cloud instances
// All the reserved IP ranges in the service instances not updated in cloud instances extracted from the hold list in the IPAllocator
// Finally a final reserved list is created by merging these two lists
// Potential error cases: 1) No /29 Block in the CIDR is unreserved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: when enumerating a list, I would put 1) on a new line

reserved[reservedIPRange] = true
}

for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; cidrIPBlock, err = incrementIP(cidrIPBlock, incrementStep29IPBlock) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to log the error? right now it is lost

reserved[reservedIPRange] = true
}

for cidrIPBlock := cloneIP(ip.Mask(ipnet.Mask)); ipnet.Contains(cidrIPBlock) && err == nil; cidrIPBlock, err = incrementIP(cidrIPBlock, incrementStep29IPBlock) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cidrIPBlock is a little misleading because it's actually an ip, not an ipnet.

ipAllocator := initTestIPAllocator()
for i := 0; i < index; i++ {
// Distributing IPs between the reserved lists at the cloud provider end and the hold list in the ipAllocator
if i%2 == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a good idea to generate test input based on the ordering of the test case. If a test case gets added in the middle, then all the subsequent tests will fail. It would be better make the testcase inputs an explicit parameter in the test case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distribution of IPs in the two lists is completely random and the test does not depend on the ordering

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you moved the test case "0/4 /29 IP blocks used", to the end of the cases slice, would the tests fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. got it. thanks

errorExpected bool
}{
{
name: "Overlapping CIDRs",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also have test cases that check overlapping and non-overlapping CIDRs where the mask (/x) is the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

errorExpected bool
}{
{
name: "0/4 /29 IP blocks used",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a test case where the cidr is misaligned and could cause an ip range to be returned that is smaller than /29, ie 192.168.0.253/29

Copy link
Contributor Author

@krunaljain krunaljain Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our existing implementation, we are using masking to set up the address. Hence for the CIDR you provided, the starting /29 IP would still be 192.168.0.248(the starting IP address of that CIDR block) and you would always get a block with 8 IPs. Can add the test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of masking it, we should return an error if the provided CIDR is misaligned.

Copy link
Contributor Author

@krunaljain krunaljain Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

192.168.0.253/29 is one of the IP values in the CIDR. IP value can be anything from 192.168.0.248-255/29 in this network. When we get a reservedIPRange, its not necessarily the first IP in that CIDR but part of a network with 8 IPs. The /29 CIDR is uniquely identified by its network ID which is the first 29 bits. The last 3 bits can be anything and hence we use the mask to reset.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we ensure that all 8 ips are unused?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, when we get a reservedIPRange and pass it to Filestore, that IP range needs to contain 8 IPs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup that part is taken care of by the cidr overlap function. it ensures the trailing 29 bits of your network are completely unique. so you are always going to end up with 8 unused IPs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you pass in a misaligned IP range to Filestore, will it succeed?

I think for simplicity, we should validate that the user's CIDR is aligned.

Copy link
Contributor Author

@krunaljain krunaljain Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would never end up passing a misaligned range from the implementation. It always passes in the first IP of the available /29 block. If the user passes the reserved-ipv4-cidr of 192.168.0.253/29, we assume he has all the IPs from 192.168.0.248-255/29

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. We should at least validate the user specified CIDR params so there's no confusion if we end up creating a filestore instance with a "lower" IP

@krunaljain krunaljain force-pushed the feature/ip_block_reservation branch 3 times, most recently from c084fd9 to 526ba19 Compare August 2, 2018 21:21
cidr: "192.168.92.192",
expected: false,
},
{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the tests for the required alignment check

@krunaljain krunaljain force-pushed the feature/ip_block_reservation branch 5 times, most recently from d992a8b to 14d6cd4 Compare August 7, 2018 18:23
KNOWN_ISSUES.md Outdated
@@ -1,14 +1,17 @@
# v0.1.0
* IP conflicts: By default, Cloud Filestore creation will pick an unused IP subnet to allocate its
* IP conflicts: By default, Cloud Filestore creation will pick an unused IP range to allocate its
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this diff is showing up because it's already like this in master. Maybe you need to rebase?

// The reserved-ipv4-cidr network size must be at least /29
cidrSize, _ := ipnet.Mask.Size()
if cidrSize > ipRangeSize {
return nil, nil, fmt.Errorf("the maximum size of network address in the cidr can be /%d", ipRangeSize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"maximum" is sort of misleading because the bigger the number, the smaller the network is. Maybe instead the message should say "the reserved-ipv4-cidr network size must be at least /29"


// The IP specified in the reserved-ipv4-cidr must be must be aligned on the /29 network boundary
if ip.String() != ip.Mask(ipRangeMask).String() {
return nil, nil, fmt.Errorf("the IP specified in the reserved-ipv4-cidr must be aligned to cover /29 IP Range")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... must be aligned on the /29 network boundary"

errorExpected: false,
},
{
name: "Valid increment with step size 8 and carry forward uptil 1st byte",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add in test cases that could potentially catch off-by-one errors?

For example, test cases where the expected should be ".255", ".0" or ".1"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have added the tests

@krunaljain krunaljain force-pushed the feature/ip_block_reservation branch 3 times, most recently from 0411460 to 53a9be1 Compare August 8, 2018 01:47
errorExpected: false,
},
{
name: "Valid increment with step size 8 and last byte expected 0",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected byte .0 test

errorExpected: false,
},
{
name: "Valid increment with step size 8 and last byte expected 255",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected byte .255 test

errorExpected: false,
},
{
name: "Valid increment with step size 8 and last byte expected 1",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected byte .1 test


// The IP specified in the reserved-ipv4-cidr must be aligned on the /29 network boundary
if ip.String() != ip.Mask(ipRangeMask).String() {
return nil, nil, fmt.Errorf("the IP specified in the reserved-ipv4-cidr must be aligned on the /29 network boundary")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated error message

// The reserved-ipv4-cidr network size must be at least /29
cidrSize, _ := ipnet.Mask.Size()
if cidrSize > ipRangeSize {
return nil, nil, fmt.Errorf("the reserved-ipv4-cidr network size must be at least /%d", ipRangeSize)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated error message

@msau42
Copy link
Contributor

msau42 commented Aug 8, 2018

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Aug 8, 2018
@davidz627
Copy link

davidz627 commented Aug 8, 2018

@msau42 I think you have to explicitly approve

I plan on doing one last pass-through though so please hold off until then

@msau42
Copy link
Contributor

msau42 commented Aug 8, 2018

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: krunaljain, msau42

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 8, 2018
@msau42
Copy link
Contributor

msau42 commented Aug 8, 2018

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 8, 2018
@davidz627
Copy link

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 8, 2018
@k8s-ci-robot k8s-ci-robot merged commit ee1c110 into kubernetes-sigs:master Aug 8, 2018
bertinatto pushed a commit to bertinatto/gcp-filestore-csi-driver that referenced this pull request Oct 13, 2023
…ncy-openshift-4.14-ose-gcp-filestore-csi-driver

Updating ose-gcp-filestore-csi-driver images to be consistent with ART
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants