Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Init global settings from object store #1043

Merged
merged 1 commit into from
Nov 9, 2017
Merged

Init global settings from object store #1043

merged 1 commit into from
Nov 9, 2017

Conversation

tiewei
Copy link
Contributor

@tiewei tiewei commented Nov 1, 2017

Currently netplugin is trying to init from default settings if global
setting is not ready or wrong, and when the settings don't match it
will re-init netplugin. This is very easy to go wrong when there are
resources created before global settings are ready.
This commit makes netplugin wait until the global settings ready, and
initiate itself according to it. If the global settings are mis-match,
it will return error.

Signed-off-by: Wei Tie wtie@cisco.com

@kahou82
Copy link
Member

kahou82 commented Nov 1, 2017

Can you add unit test?

InitGlobalSettings(p.StateDriver, &pluginConfig.Instance)
err = InitGlobalSettings(p.StateDriver, &pluginConfig.Instance)
if err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if global setting != instant setting, are we going to do anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be handled here

	// Init the driver plugins..
	err = netPlugin.Init(*pluginConfig)
	if err != nil {
		log.Fatalf("Failed to initialize the plugin. Error: %s", err)
	}

which will exit(1) after log fatal

} else {
if inst.FwdMode != gCfg.FwdMode {
logrus.Errorf("Global forwarding mode setting doesn't match local settings, global=%v , local=%v", gCfg.FwdMode, inst.FwdMode)
return core.Errorf("Global forwarding mode setting doesn't match local settings, global=%v , local=%v", gCfg.FwdMode, inst.FwdMode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use fmt.Errorf and assign it to a variable instead of having two duplicated string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's actually every odd we had core.Errorf and its logic .... yeah I will change to use fmt.Errorf , but codes with core.Errorf has been used in many places through the project

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dseevr ^ we need to pick one way or the other and keep it consistent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or use errors pkg

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should get rid of core.Error entirely IMO. We don't need the stack trace stuff in it.

Based on the people who wrote it (erik/madhav) and the dates, it looks like it was a precursor to errored.

I'll add an issue for this tomorrow

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say let's use fmt.Errorf() or errors.New() for now. As Kahou said, let's also avoid duplicating strings. There's a lot of those that need to be cleaned up in the codebase(s)...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

} else {
logrus.Errorf("Error reading private subnet cluster store")
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(blank line)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

logrus.Errorf("Error reading forwarding mode from cluster store")
} else {
logrus.Errorf("Error reading private subnet cluster store")
}
Copy link
Member

@kahou82 kahou82 Nov 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be:
if gCfg.FwdMode == "" { logrus.Warnf }
if gCfg.PvtSubnet == "" { logrus.Warnf}

inst.HostPvtNW = net
logrus.Infof("HostPvtNW: %v", net)
logrus.Errorf("Error convert private subnet %v from CIDR to mask, error %v", gCfg.PvtSubnet, err)
return core.Errorf("Error convert private subnet %v from CIDR to mask, error %v", gCfg.PvtSubnet, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use fmt.Errorf and assign it to a variable instead of having two duplicated string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@tiewei
Copy link
Contributor Author

tiewei commented Nov 1, 2017

UT will be followed


/*
Query global settings from state store
1. if forward mode or private net is empty, retry 1 sec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"retry after 1 second"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


// wait until able to get fwd mode and private subnet
for {
err := gCfg.Read("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the error is only checked immediately following the function call, it's Go convention to just check it inline in the if statement:

if err := gCfg.Read(""); err != nil {
        logrus.Errorf("Error reading global settings from cluster store")
}

This style of assignment is also okay to use even if there is an existing error object in the current scope. You can always use := and the err var does not leak out into the function scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

} else {
logrus.Errorf("Error reading private subnet cluster store")
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(blank line)

}

} else {
logrus.Infof("Get global forwarding mode: %v", gCfg.FwdMode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Got" ?

@@ -50,8 +51,6 @@ type NetPlugin struct {
PluginConfig Config
}

const defaultPvtSubnet = 0xac130000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you are removing the default svc net,
can you add it in netmaster/netctl so that it uses this svc subnet by default ?

Copy link
Contributor Author

@tiewei tiewei Nov 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it does, i'll add more stuff on the netmaster side after this PR, it currently still having default value to init global settings as bridge instead of wait until it set

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

for {
err := gCfg.Read("")
if err != nil {
logrus.Errorf("Error reading global settings from cluster store")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sleep & continue ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

improved in other way

if inst.FwdMode == "" {
inst.FwdMode = gCfg.FwdMode
} else {
if inst.FwdMode != gCfg.FwdMode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check should be in netmaster/netctl.
netplugin being a worker node, it should accept the value from the netmaster

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically local fwd mode will always be empty or the one get from kv store with this code, it doesn't hurt if anything wired happens here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently inst.FwdMode is "" because during init there's no default value set

@tiewei
Copy link
Contributor Author

tiewei commented Nov 3, 2017

test case added

@unclejack
Copy link
Contributor

build PR

1 similar comment
@tiewei
Copy link
Contributor Author

tiewei commented Nov 3, 2017

build PR

} else {
inst.HostPvtNW = net
logrus.Infof("HostPvtNW: %v", net)
err := fmt.Errorf("Error convert private subnet %v from CIDR to mask, error %v", gCfg.PvtSubnet, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: convert->converting, net->netmask, colon at the end is nice to delineate the error message

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
inst.HostPvtNW = net
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we lost the message of what we initialized the Private network to

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@tiewei
Copy link
Contributor Author

tiewei commented Nov 3, 2017

build PR

1 similar comment
@tiewei
Copy link
Contributor Author

tiewei commented Nov 3, 2017

build PR

if inst.FwdMode == "" {
inst.FwdMode = gCfg.FwdMode
} else {
if inst.FwdMode != gCfg.FwdMode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this happen ever here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it depends, if we moves the fwd mode into the CLI input for netplugin, it will, but not today

@@ -116,7 +116,7 @@ func NewAPIController(router *mux.Router, objdbClient objdb.API, storeURL string
NetworkInfraType: "default",
Vlans: "1-4094",
Vxlans: "1-10000",
FwdMode: "bridge",
FwdMode: "", // set empty fwd mode by default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be removed, variables are initialized by default

Copy link
Contributor Author

@tiewei tiewei Nov 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would leave as empty string here and in next commit to make it accept from CLI, and I think defined explicitly is always better

@tiewei
Copy link
Contributor Author

tiewei commented Nov 6, 2017

build PR

3 similar comments
@tiewei
Copy link
Contributor Author

tiewei commented Nov 7, 2017

build PR

@tiewei
Copy link
Contributor Author

tiewei commented Nov 7, 2017

build PR

@tiewei
Copy link
Contributor Author

tiewei commented Nov 7, 2017

build PR

@@ -921,23 +921,16 @@ func HostIPToGateway(hostIP string) (string, error) {
return ip[0] + "." + ip[1] + ".255.254", nil
}

// CIDRToMask converts a mask to corresponding netmask
// CIDRToMask converts a CIRD to corresponding network number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CIDR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

ArpMode: "proxy",
PvtSubnet: "172.19.0.0/16",
}), IsNil)
time.Sleep(40 * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since everything but the FwdMode string is the same, just make that a variable set via an if else and pass in the variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

ArpMode: "proxy",
PvtSubnet: "172.19.0.0/16",
}), IsNil)
time.Sleep(120 * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, and why 2 minute sleep vs 40s as above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

@tiewei tiewei Nov 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess it was because it's faster on metal server, the sleep time was in old code as well

@@ -86,20 +86,18 @@ func NewNPCluster(its *integTestSuite) (*NPCluster, error) {
time.Sleep(10 * time.Second)

// set forwarding mode if required
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's always required, no longer if :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@tiewei
Copy link
Contributor Author

tiewei commented Nov 7, 2017

build PR

"state": "fakedriver",
"container": "docker",
},
"plugin-instance": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra tab?

func TestNetPluginInit(t *testing.T) {
initFakeStateDriver(t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document what you are testing?

}

func TestNetPluginInitInvalidConfigInvalidPrivateSubnet(t *testing.T) {
initFakeStateDriver(t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document what you are testing?

})

if err != nil {
log.Fatalf("Error creating global state. Err: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exit if it fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was doing so, you want me to change the behavior ?

}

s.setGlobalSettings(c, s.fwdMode)
time.Sleep(40 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 40 seconds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was set to 40 sec on baremetal testing and 120 sec on vagrant case, this magic number comes from previous testing, i think

@@ -1438,21 +1442,12 @@ func (s *systemtestSuite) SetUpTestVagrant(c *C) {
}
}

s.setGlobalSettings(c, s.fwdMode)
time.Sleep(120 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, why 120 seconds?

return err
}
}
inst.FwdMode = gCfg.FwdMode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it looks like we are going all in with config from etcd, should we warn if inst.FwdMode is configured whever that is created?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no chance it would be configured though ... this function only got called when the netplugin starts, and we are not providing a way to set the fwdmode today

@tiewei
Copy link
Contributor Author

tiewei commented Nov 7, 2017

build PR

"db-url": "etcd://127.0.0.1:4001"
}
}`
"drivers" : {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to tabs before the chars, if we think spaces are better, i can switch it

@tiewei
Copy link
Contributor Author

tiewei commented Nov 8, 2017

@dseevr @kahou82 @unclejack @chrisplo please let me know if it meets the comments

} else {
if gCfg.FwdMode == "" || gCfg.PvtSubnet == "" {
if gCfg.FwdMode == "" {
logrus.Warnf("Error reading forwarding mode from cluster store")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cosmetic: this is not read error but waiting for config.
to dump all elements in struct xxxf("%+v", gCfg)

// wait until able to get fwd mode and private subnet
for {
if err := gCfg.Read(""); err != nil {
logrus.Errorf("Error reading global settings from cluster store")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include err in log message

@chrisplo
Copy link
Contributor

chrisplo commented Nov 8, 2017

All my comments have been addressed, will defer to the reset of the comments

@tiewei
Copy link
Contributor Author

tiewei commented Nov 8, 2017

build PR

Copy link
Contributor

@rchirakk rchirakk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Currently netplugin is trying to init from default settings if global
setting is not ready or wrong, and when the settings don't match it
will re-init netplugin. This is very easy to go wrong when there are
resources created before global settings are ready.
This commit makes netplugin wait until the global settings ready, and
initiate itself according to it. If the global settings are mis-match,
it will return error.
Also set default fwd mode to empty on netmaster and Refactor CIDRToMask

Signed-off-by: Wei Tie <wtie@cisco.com>
@tiewei
Copy link
Contributor Author

tiewei commented Nov 8, 2017

build PR

3 similar comments
@tiewei
Copy link
Contributor Author

tiewei commented Nov 8, 2017

build PR

@tiewei
Copy link
Contributor Author

tiewei commented Nov 9, 2017

build PR

@tiewei
Copy link
Contributor Author

tiewei commented Nov 9, 2017

build PR

@tiewei
Copy link
Contributor Author

tiewei commented Nov 9, 2017

build PR

Copy link
Contributor

@unclejack unclejack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@unclejack unclejack merged commit d05de09 into contiv:master Nov 9, 2017
chrisplo added a commit to chrisplo/netplugin that referenced this pull request Nov 11, 2017
Docker expects the netplugin socket to be available within 10 seconds
before it fails enabling (or installing) the v2plugin.

Due to contiv#1043, netplugin is blocking waiting for the forward mode to be
set, which is done by netctl calling netmaster, but netmaster is not
started until the plugin is activating.

Instead of backgrounding the plugin install/enabling then letting
ansible set the forward mode, do it in the plugin script to avoid
ansible's unpredictable round trip delays.

Signed-off-by: Chris Plock <chrisplo@cisco.com>
chrisplo added a commit to chrisplo/netplugin that referenced this pull request Nov 11, 2017
Docker expects the netplugin socket to be available within 10 seconds
before it fails enabling (or installing) the v2plugin.

Due to contiv#1043, netplugin is blocking waiting for the forward mode to be
set, which is done by netctl calling netmaster, but netmaster is not
started until the plugin is activating.

Instead of backgrounding the plugin install/enabling then letting
ansible set the forward mode, do it in the plugin script to avoid
ansible's unpredictable round trip delays.

Signed-off-by: Chris Plock <chrisplo@cisco.com>
chrisplo added a commit to chrisplo/netplugin that referenced this pull request Nov 11, 2017
Docker expects the netplugin socket to be available within 10 seconds
before it fails enabling (or installing) the v2plugin.

Due to contiv#1043, netplugin is blocking waiting for the forward mode to be
set, which is done by netctl calling netmaster, but netmaster is not
started until the plugin is activating.

Instead of backgrounding the plugin install/enabling then letting
ansible set the forward mode, do it in the plugin script to avoid
ansible's unpredictable round trip delays.

Signed-off-by: Chris Plock <chrisplo@cisco.com>
chrisplo added a commit to chrisplo/netplugin that referenced this pull request Nov 11, 2017
Docker expects the netplugin socket to be available within 10 seconds
before it fails enabling (or installing) the v2plugin.

Due to contiv#1043, netplugin is blocking waiting for the forward mode to be
set, which is done by netctl calling netmaster, but netmaster is not
started until the plugin is activating.

Instead of backgrounding the plugin install/enabling then letting
ansible set the forward mode, do it in the plugin script to avoid
ansible's unpredictable round trip delays.

Signed-off-by: Chris Plock <chrisplo@cisco.com>
tiewei pushed a commit that referenced this pull request Nov 14, 2017
* v2plugin set forward mode when netmaster up

Docker expects the netplugin socket to be available within 10 seconds
before it fails enabling (or installing) the v2plugin.

Due to #1043, netplugin is blocking waiting for the forward mode to be
set, which is done by netctl calling netmaster, but netmaster is not
started until the plugin is activating.

Instead of backgrounding the plugin install/enabling then letting
ansible set the forward mode, do it in the plugin script to avoid
ansible's unpredictable round trip delays.

v2plugin’s startcontiv.sh errors when fwd_mode not set

Signed-off-by: Chris Plock <chrisplo@cisco.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants