Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: pre filter groups before enforcing #4296 #6651

Merged
merged 3 commits into from
Jul 26, 2021

Conversation

farodin91
Copy link
Contributor

@farodin91 farodin91 commented Jul 7, 2021

Part of: #4296

Impact of my change for 10000 applications and a user with 50 groups. I reduced the hits of EnforceRuntimePolicy from 500000 below 30000.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issues number (used for the release note).
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • Optional. My organization is added to USERS.md.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).

@jannfis
Copy link
Member

jannfis commented Jul 7, 2021

Thanks @farodin91, looks promising. Can you please explain a little more of what this change exactly does? It touches a pretty critical path of the API server, so should not go uncommented.

@farodin91
Copy link
Contributor Author

My idea is to prefilter groups in the claims to only the once that are used defined policy model.

So if a user is in 50 oidc groups but only 3 groups are used, i have to check only these three groups.

I currently check the tests are failing.

Part of: argoproj#4296

Signed-off-by: Jan Jansen <jan.jansen@gdata.de>
@codecov
Copy link

codecov bot commented Jul 7, 2021

Codecov Report

Merging #6651 (5c4a05d) into master (e0db23b) will increase coverage by 0.01%.
The diff coverage is 76.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6651      +/-   ##
==========================================
+ Coverage   41.30%   41.31%   +0.01%     
==========================================
  Files         156      156              
  Lines       20702    20711       +9     
==========================================
+ Hits         8550     8556       +6     
- Misses      10944    10946       +2     
- Partials     1208     1209       +1     
Impacted Files Coverage Δ
util/jwt/jwt.go 60.00% <0.00%> (-1.91%) ⬇️
server/rbacpolicy/rbacpolicy.go 78.78% <87.50%> (-0.25%) ⬇️
util/rbac/rbac.go 70.86% <100.00%> (+0.59%) ⬆️
util/settings/settings.go 47.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e0db23b...5c4a05d. Read the comment docs.

Signed-off-by: Jan Jansen <jan.jansen@gdata.de>
@farodin91
Copy link
Contributor Author

@jannfis The second commit is another speed improvement.

Should i add extra comments in the code?

I'm also looked into the casbin which also supports caching what could improve massivly.

@farodin91
Copy link
Contributor Author

I used go benchmark to compare with and without caching.
Setup 12 groups

run cached time
1. 2800ms
2. 2200ms
1. ✔️ 500ms
2. ✔️ 90ms

It looks like their is potential improvement also to prevent duplicated execute of casbin Enforce.

@farodin91
Copy link
Contributor Author

Here is my code to test cached variant.

diff --git a/util/rbac/rbac.go b/util/rbac/rbac.go
index c43deb92c..490af1fb6 100644
--- a/util/rbac/rbac.go
+++ b/util/rbac/rbac.go
@@ -35,6 +35,14 @@ const (
 	defaultRBACSyncPeriod = 10 * time.Minute
 )
 
+type WrapperEnforcer interface {
+	EnableLog(bool)
+	Enforce(rvals ...interface{}) bool
+	LoadPolicy() error
+	GetGroupingPolicy() [][]string
+	EnableEnforce(bool)
+}
+
 // Enforcer is a wrapper around an Casbin enforcer that:
 // * is backed by a kubernetes config map
 // * has a predefined RBAC model
@@ -42,7 +50,7 @@ const (
 // * supports a user-defined policy
 // * supports a custom JWT claims enforce function
 type Enforcer struct {
-	*casbin.Enforcer
+	WrapperEnforcer
 	adapter            *argocdAdapter
 	clientset          kubernetes.Interface
 	namespace          string
@@ -55,11 +63,14 @@ type Enforcer struct {
 // ClaimsEnforcerFunc is func template to enforce a JWT claims. The subject is replaced
 type ClaimsEnforcerFunc func(claims jwt.Claims, rvals ...interface{}) bool
 
-func newEnforcerSafe(params ...interface{}) (e *casbin.Enforcer, err error) {
-	enfs, err := casbin.NewEnforcerSafe(params...)
-	if err != nil {
-		return nil, err
-	}
+func newEnforcerSafe(params ...interface{}) (e WrapperEnforcer, err error) {
+	defer func() {
+		if r := recover(); r != nil {
+			err = fmt.Errorf("%v", r)
+			e = nil
+		}
+	}()
+	enfs := casbin.NewCachedEnforcer(params...)
 	enfs.AddFunction("globMatch", func(args ...interface{}) (interface{}, error) {
 		if len(args) < 2 {
 			return false, nil
@@ -88,7 +99,7 @@ func NewEnforcer(clientset kubernetes.Interface, namespace, configmap string, cl
 	}
 	enf.EnableLog(false)
 	return &Enforcer{
-		Enforcer:           enf,
+		WrapperEnforcer:     enf,
 		adapter:            adapter,
 		clientset:          clientset,
 		namespace:          namespace,
@@ -98,6 +109,21 @@ func NewEnforcer(clientset kubernetes.Interface, namespace, configmap string, cl
 	}
 }
 
+func (e *Enforcer) EnableEnforce(s bool){
+	e.WrapperEnforcer.EnableEnforce(s)
+	if invalidator, ok := e.WrapperEnforcer.(interface{ InvalidateCache() }); ok {
+		invalidator.InvalidateCache()
+	}
+}
+
+func (e *Enforcer) LoadPolicy() error {
+	err := e.WrapperEnforcer.LoadPolicy()
+	if invalidator, ok := e.WrapperEnforcer.(interface{ InvalidateCache() }); ok {
+		invalidator.InvalidateCache()
+	}
+	return err
+}
+
 // SetDefaultRole sets a default role to use during enforcement. Will fall back to this role if
 // normal enforcement fails
 func (e *Enforcer) SetDefaultRole(roleName string) {
@@ -114,7 +140,7 @@ func (e *Enforcer) SetClaimsEnforcerFunc(claimsEnforcer ClaimsEnforcerFunc) {
 // Enforce is a wrapper around casbin.Enforce to additionally enforce a default role and a custom
 // claims function
 func (e *Enforcer) Enforce(rvals ...interface{}) bool {
-	return enforce(e.Enforcer, e.defaultRole, e.claimsEnforcerFunc, rvals...)
+	return enforce(e.WrapperEnforcer, e.defaultRole, e.claimsEnforcerFunc, rvals...)
 }
 
 // EnforceErr is a convenience helper to wrap a failed enforcement with a detailed error about the request
@@ -154,27 +180,27 @@ func (e *Enforcer) EnforceRuntimePolicy(policy string, rvals ...interface{}) boo
 	return e.EnforceWithCustomEnforcer(enf, rvals...)
 }
 
-func (e *Enforcer) CreateEnforcerWithRuntimePolicy(policy string) *casbin.Enforcer {
-	var enf *casbin.Enforcer
+func (e *Enforcer) CreateEnforcerWithRuntimePolicy(policy string) WrapperEnforcer {
+	var enf WrapperEnforcer
 	var err error
 	if policy == "" {
-		enf = e.Enforcer
+		enf = e.WrapperEnforcer
 	} else {
 		enf, err = newEnforcerSafe(newBuiltInModel(), newAdapter(e.adapter.builtinPolicy, e.adapter.userDefinedPolicy, policy))
 		if err != nil {
 			log.Warnf("invalid runtime policy: %s", policy)
-			enf = e.Enforcer
+			enf = e.WrapperEnforcer
 		}
 	}
 	return enf
 }
 
-func (e *Enforcer) EnforceWithCustomEnforcer(enf *casbin.Enforcer, rvals ...interface{}) bool {
+func (e *Enforcer) EnforceWithCustomEnforcer(enf WrapperEnforcer, rvals ...interface{}) bool {
 	return enforce(enf, e.defaultRole, e.claimsEnforcerFunc, rvals...)
 }
 
 // enforce is a helper to additionally check a default role and invoke a custom claims enforcement function
-func enforce(enf *casbin.Enforcer, defaultRole string, claimsEnforcerFunc ClaimsEnforcerFunc, rvals ...interface{}) bool {
+func enforce(enf WrapperEnforcer, defaultRole string, claimsEnforcerFunc ClaimsEnforcerFunc, rvals ...interface{}) bool {
 	// check the default role
 	if defaultRole != "" && len(rvals) >= 2 {
 		if enf.Enforce(append([]interface{}{defaultRole}, rvals[1:]...)...) {

Copy link
Member

@jannfis jannfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few questions, please see below. Caching also looks promising, but we need to be very careful here.

@@ -139,12 +148,11 @@ func newTestAppServer(objects ...runtime.Object) *Server {
objects = append(objects, defaultProj, myProj, projWithSyncWindows)

fakeAppsClientset := apps.NewSimpleClientset(objects...)
factory := appinformer.NewFilteredSharedInformerFactory(fakeAppsClientset, 0, "", func(options *metav1.ListOptions) {})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other function was deprecated. (Just a small cleanup)

Comment on lines -146 to -147
_ = enforcer.SetBuiltinPolicy(assets.BuiltinPolicyCSV)
enforcer.SetDefaultRole("role:admin")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can find the replacement here: https://github.com/argoproj/argo-cd/pull/6651/files#diff-d1d93bfccd6732925bb2dac566cc4d0038a7484a83efa097cc1f5d233f18bea8R78-R79 .

It role:admin is much faster, than special groups.

Comment on lines +15 to +17
if mapClaims, ok := claims.(*jwtgo.MapClaims); ok {
return *mapClaims, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this refactoring need to be part of this PR? Also, I was wondering, if this type conversion fails, would the following marshal/unmarshal operations fail too and are redundant then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It improve the performance by ~10% as follow up of the first commit.

I think their are 4 other implementation of jwtgo.Claims. For those, it should work fine. If convert a struct into Json than convert map[string]interface{} it should nearly all the time.

I looked also in the code for odic and it seems that the default used type is jwtgo.MapClaims.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks pretty safe to me. Good we can avoid marshaling/unmarshaling if claims already have required value.

@jannfis
Copy link
Member

jannfis commented Jul 8, 2021

Should i add extra comments in the code?

Yes please. The more understandable it is, the better imho.

@farodin91
Copy link
Contributor Author

Caching also looks promising, but we need to be very careful here.

I would create a follow up PR for it.

@farodin91
Copy link
Contributor Author

@jannfis I added some comments

Signed-off-by: Jan Jansen <jan.jansen@gdata.de>
@farodin91
Copy link
Contributor Author

@jannfis Would you like to review it again?

@alexmt
Copy link
Collaborator

alexmt commented Jul 16, 2021

Looking as well

@farodin91
Copy link
Contributor Author

Looking as well

Thank you

@alexmt alexmt added this to the v2.1 milestone Jul 22, 2021
return true

// Get groups to reduce the amount to checking groups
groupingPolicies := enforcer.GetGroupingPolicy()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. Glad you did not have to implement your own policy parsing to get the list of groups.

@alexmt alexmt self-assigned this Jul 26, 2021
Copy link
Collaborator

@alexmt alexmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, @farodin91 !

I could not find any caveats in the logic. It should be safe to skip checking the group if that group is never mentioned in the policy. So LGTM.

@alexmt alexmt merged commit af516e9 into argoproj:master Jul 26, 2021
@alexmt alexmt added the needs-verification PR requires pre-release verification label Jul 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-verification PR requires pre-release verification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants