Skip to content

Conversation

@vakalapa
Copy link
Contributor

@vakalapa vakalapa commented Jan 24, 2022

Reason for Change:

Adding DPShim component,

DPShim works as a translator, cache and transporter of Resource events into sets/policies and relays them to gRPC server. The gRPC server will distribute these goal states to all the registered daemon clients.

DPShim:

  1. Cache locally all the sets and policies
  2. On Apply Dataplane, convert dirty cache to relevant GoalState events and send to gRPC server
  3. periodically delete unused sets
  4. Provide a function for clients on first time registration to get the state of whole cache

Issue Fixed:

Requirements:

Notes:

@vakalapa vakalapa added enhancement npm Related to NPM. labels Jan 24, 2022
}

func (dp *DPShim) getIPSet(setName string) *controlplane.ControllerIPSets {
return dp.setCache[setName]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be getCache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getIPSet is in line with our interface, so prefer to keep it this. Unless you have strong reservations ?

Comment on lines 123 to 124
dp.Lock()
defer dp.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might make more sense to move the lock/unlock to the helper functions deleteIPSet

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am following the practice of Exposed function deal with locks and internal functions to not worry about locks. Internal functions by default we can assume that they are to be handled with care. Also, createIPSet is called from some external functions and we cannot deal with locking and unlocking always. Wdyt ?

Comment on lines 145 to 146
dp.Lock()
defer dp.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have this in createIPSet we wouldn't have to lock on a large code block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except for the exists check everything needs to be locked anyway, so i believe trying to lock for each loop might add more overhead. wdyt ?

}

policy := dp.policyCache[policyKey]
toApplyPolicies[idx] = policy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont need to use ID - we can just do this,

toApplyPolicies = append(toApplyPolicies, policy)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you use append, you should set the initial capacity of the slice to avoid resizing: make([]thing, 0, len(cache))

return nil, npmerrors.Errorf(npmerrors.AddPolicy, false, fmt.Sprintf("policy %s not found", policyKey))
}

policy := dp.policyCache[policyKey]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the policy doesnt exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats the check we are doing one line above ?

idx := 0

for policyKey := range dp.dirtyCache.toDeletePolicies {
toDeletePolicies[idx] = policyKey
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same - make use of append

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding resizing since we know the size of data.

dp.checkSetReferences()
err := dp.ApplyDataPlane()
if err != nil {
klog.Errorf("deleteUnusedSets: failed to apply dataplane %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been curious about error propagation in go routines. Should we create an error channel to feedback errors and handle them in the caller code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is across our codebase I mean

Copy link
Contributor

@nitishm nitishm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality looks good - just some nit picking

Copy link
Contributor

@huntergregory huntergregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dpshim looks pretty good to me. Made several suggestions. This PR seems dependent on #1202 for validation logic

func DecodeStrings(payload *bytes.Buffer) ([]string, error) {
if payload == nil {
return "", npmerrors.SimpleError("failed to decode, payload is nil")
return nil, npmerrors.SimpleError("failed to decode, payload is nil")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we check if len(payload) == 0 instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly for other decode funcs

IPSetMetadata: metadata,
IPPodMetadata: make(map[string]*dp.PodMetadata),
MemberIPSets: make(map[string]*ipsets.IPSetMetadata),
ipsetReference: make(map[string]struct{}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this to be a map? What about having referCount like in ipsets.IPSet?

}

set := dp.setCache[prefixedSetName]
if set.IPSetMetadata.GetSetKind() != ipsets.HashSet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should generalize addToSet, deleteFromSet, addToList, ... validations into a dp func for both dpshim and Dataplane

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed lets plan on how we can achieve this properly.

}

if len(goalStates) == 0 {
klog.Info("ApplyDataPlane: No changes to apply")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this ever happen?

}

policy := dp.policyCache[policyKey]
toApplyPolicies[idx] = policy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you use append, you should set the initial capacity of the slice to avoid resizing: make([]thing, 0, len(cache))

// prometheus handlers
if config.Toggles.EnablePrometheusMetrics {
rs.router.Handle(api.NodeMetricsPath, metrics.GetHandler(metrics.NodeMetrics))
rs.router.Handle(api.ClusterMetricsPath, metrics.GetHandler(metrics.ClusterMetrics))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this nil check for the encoder important for v1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not relevant in V1 case, this is specifically for daemon where we will need the PProf endpoint but epmencoder is not yet supported. That is a todo item for later.

}()
dp.Lock()
defer dp.Unlock()
dp.lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlock in the defer func? with current code, dp will be unlocked before dp.ApplyDataplane() because defers are processed LIFO

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that ApplyDataplane locks too. Maybe we make a little applyDataplane that doesn't lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional and keeping the same design from our existing DP, we do not anticipate another process coming in between and making changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if it does i do not think there is a major issue, except for one, i.e. if a Set is added/created and immediately deleted then kernel might complain about it does not exists. But i think you had suggested a fix for it to use --exists, is that in master ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will it add system overhead to lock/unlock twice for each Policy call though? We haven't solved the created then deleted edge case yet (in the non-fan-out model, we thought that situation shouldn't happen)

Copy link
Contributor

@nitishm nitishm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic seems wrong in b7ffd35#r806133602

Copy link
Contributor

@nitishm nitishm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@huntergregory huntergregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@vakalapa vakalapa merged commit ad516c0 into master Feb 15, 2022
@vakalapa vakalapa deleted the vakr/dpshim branch February 15, 2022 00:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement npm Related to NPM.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants