Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Duplicate Cluster parsers in Fluent-bit config. #853

Merged

Conversation

karan56625
Copy link
Collaborator

What this PR does / why we need it:

If we have multiple namespace FB config, don't Create cluster parser for each namespace config. There should be one single cluster parser for all namespace.

Which issue(s) this PR fixes:

Fixes #852

Does this PR introduced a user-facing change?


Additional documentation, usage docs, etc.:


Copy link
Member

@adiforluls adiforluls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change removes the functionality of being able to use a clusterParser via a FluentBitConfig CR.

I think somehow the clusterParsers were being wrongfully selected, only those ClusterParsers are to be selected that have a labelSelector defined in FluentBitConfig CR. Somehow when no labelSelector is defined all the ClusterParsers are being selected, which shouldn't happen. That needs to be fixed.

@adiforluls
Copy link
Member

Just so that nothing gets lost in translation, the intended behaviour was to create a copy of a clusterParser in parsers.conf for only those FluentBitConfig/Namespaces that actually are selecting a particular ClusterParser to parse logs for that namespace. Right now every ClusterParser is getting selected if no labelSelector is defined for clusterParser in FluentBitConfig CR.

We'd want to keep the functionality of being able to consume clusterParsers via namespaced Filters, a better logic than forming dups can also be proposed.

@karan56625
Copy link
Collaborator Author

This change removes the functionality of being able to use a clusterParser via a FluentBitConfig CR.

I think somehow the clusterParsers were being wrongfully selected, only those ClusterParsers are to be selected that have a labelSelector defined in FluentBitConfig CR. Somehow when no labelSelector is defined all the ClusterParsers are being selected, which shouldn't happen. That needs to be fixed.

With this change Also, ClusterParser can be used via a FluentBitConfig CR. When we are processing namespaces FB Configmap, we are just not processing the parsers now. as we have are doing for filters. May there is some specific use-case you are suggesting that could fail with this change. let me know, I can test that.

@adiforluls
Copy link
Member

This change removes the functionality of being able to use a clusterParser via a FluentBitConfig CR.
I think somehow the clusterParsers were being wrongfully selected, only those ClusterParsers are to be selected that have a labelSelector defined in FluentBitConfig CR. Somehow when no labelSelector is defined all the ClusterParsers are being selected, which shouldn't happen. That needs to be fixed.

With this change Also, ClusterParser can be used via a FluentBitConfig CR. When we are processing namespaces FB Configmap, we are just not processing the parsers now. as we have are doing for filters. May there is some specific use-case you are suggesting that could fail with this change. let me know, I can test that.

Create a ClusterParser, now try to use this parser using a namespaced filter. If this works then everything is fine, otherwise this is broken. It's been a while since I wrote this feature so we can think of a better approach to maintain the functionality if this doesn't work with your change.

@karan56625
Copy link
Collaborator Author

This change removes the functionality of being able to use a clusterParser via a FluentBitConfig CR.
I think somehow the clusterParsers were being wrongfully selected, only those ClusterParsers are to be selected that have a labelSelector defined in FluentBitConfig CR. Somehow when no labelSelector is defined all the ClusterParsers are being selected, which shouldn't happen. That needs to be fixed.

With this change Also, ClusterParser can be used via a FluentBitConfig CR. When we are processing namespaces FB Configmap, we are just not processing the parsers now. as we have are doing for filters. May there is some specific use-case you are suggesting that could fail with this change. let me know, I can test that.

Create a ClusterParser, now try to use this parser using a namespaced filter. If this works then everything is fine, otherwise this is broken. It's been a while since I wrote this feature so we can think of a better approach to maintain the functionality if this doesn't work with your change.

Just thinking, does namespaced parser have any significance? can't we have just parsers(no namespace level parsers). parsers alone can't be used for log processing. It would be always part of a filter & Filter can be namespace level and cluster level. Thoughts?

@karan56625 karan56625 changed the title Don't create Cluster Parser corresponding to Each namespace fluentbit config. Remove Duplicate Cluster parsers in Fluent-bit config. Jul 31, 2023
Signed-off-by: karan k <karan.k@oracle.com>
@karan56625 karan56625 force-pushed the karak/fix_duplicate_cluster_fb_parser branch from 343e50c to 0e867b1 Compare July 31, 2023 12:11
@karan56625
Copy link
Collaborator Author

This change removes the functionality of being able to use a clusterParser via a FluentBitConfig CR.
I think somehow the clusterParsers were being wrongfully selected, only those ClusterParsers are to be selected that have a labelSelector defined in FluentBitConfig CR. Somehow when no labelSelector is defined all the ClusterParsers are being selected, which shouldn't happen. That needs to be fixed.

With this change Also, ClusterParser can be used via a FluentBitConfig CR. When we are processing namespaces FB Configmap, we are just not processing the parsers now. as we have are doing for filters. May there is some specific use-case you are suggesting that could fail with this change. let me know, I can test that.

Create a ClusterParser, now try to use this parser using a namespaced filter. If this works then everything is fine, otherwise this is broken. It's been a while since I wrote this feature so we can think of a better approach to maintain the functionality if this doesn't work with your change.

Just thinking, does namespaced parser have any significance? can't we have just parsers(no namespace level parsers). parsers alone can't be used for log processing. It would be always part of a filter & Filter can be namespace level and cluster level. Thoughts?

@adiforluls As of now, I have just added simple fix to remove the duplicate cluster parsers. It will help to get rid of parser not registered error. But we should re-think about this parser namespace architecture. Honestly, It's not making sense to me to introduce namespace parsers in Fluent-operator. Although Namespace filters are making sense as we are introducing the namespace logging, But namespace parsers are something, we don't need to categorised at cluster and namespace level.

@adiforluls
Copy link
Member

adiforluls commented Jul 31, 2023

@karan56625 The change looks okay but I'll review more carefully tomorrow hopefully and approve then.

For your other questions, I'll look to answer them tomorrow but I think there's some context missing, for that I suggest you to go through the original problem statement. #580

Overall, the gist is to let namespaced users create k8s resources that wraps FluentBit configs, previously only a ClusterAdmin would be able to create cluster level Filter/Parser/Output. Now imagine that I chose to exclude namespaced parser from the design, as a namespaced user/tenant/admin, for your namespaced Filter, you'd then need a ClusterAdmin to create ClusterParser resource for you for your filter to work. For just one namespace this doesn't look like a problem, but imagine this scenario for 100s of namespaces (namespace as a service).

Copy link
Member

@adiforluls adiforluls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes duplicate parser warnings so it's a fix. There's still a low prio bug where ClusterParsers are being selected inspite of not having a labelSelector in FluentbitConfig CR right? Could you open a backlog issue for that, I'd like to get to it when I have time.

@adiforluls
Copy link
Member

Also cc @benjaminhuo this can be merged unless you have some changes you'd like to request

@karan56625
Copy link
Collaborator Author

yeah, If we create one clusterParser, it is getting replicate for each

This fixes duplicate parser warnings so it's a fix. There's still a low prio bug where ClusterParsers are being selected inspite of not having a labelSelector in FluentbitConfig CR right? Could you open a backlog issue for that, I'd like to get to it when I have time.

Filed issue to track that #857.

@karan56625
Copy link
Collaborator Author

@karan56625 The change looks okay but I'll review more carefully tomorrow hopefully and approve then.

For your other questions, I'll look to answer them tomorrow but I think there's some context missing, for that I suggest you to go through the original problem statement. #580

Overall, the gist is to let namespaced users create k8s resources that wraps FluentBit configs, previously only a ClusterAdmin would be able to create cluster level Filter/Parser/Output. Now imagine that I chose to exclude namespaced parser from the design, as a namespaced user/tenant/admin, for your namespaced Filter, you'd then need a ClusterAdmin to create ClusterParser resource for you for your filter to work. For just one namespace this doesn't look like a problem, but imagine this scenario for 100s of namespaces (namespace as a service).

To cover that use-case, we are creating duplicate parser which are doing the same job. Suppose if we have 100 namespace config and one cluster parser. So there will be 101 entry for the same cluster in parsers.conf that seems redundant. Even if we fix it, and 94 namespace configMap are referring to one cluster parser, there will be 95 entries in parsers.conf. This is for just one cluster parser. suppose if namespaced config is referring to many cluster parsers.
If we are concern about the namespaced user/tenant, some RBAC can be provided to those to create Cluster parses if they don't have any cluster parser that can be utilised. OR there could be several other way. But with current scenario, multiple duplicates entry in parsers.conf will confuse them. Something we need to discuss or brought up some optimisation in current approach.

@karan56625
Copy link
Collaborator Author

@benjaminhuo if you are fine with this simple fix to get rid of error message in fluentbit logs, can I seek your approval and request you to merge this if there is no issue.

@adiforluls
Copy link
Member

@karan56625 The change looks okay but I'll review more carefully tomorrow hopefully and approve then.
For your other questions, I'll look to answer them tomorrow but I think there's some context missing, for that I suggest you to go through the original problem statement. #580
Overall, the gist is to let namespaced users create k8s resources that wraps FluentBit configs, previously only a ClusterAdmin would be able to create cluster level Filter/Parser/Output. Now imagine that I chose to exclude namespaced parser from the design, as a namespaced user/tenant/admin, for your namespaced Filter, you'd then need a ClusterAdmin to create ClusterParser resource for you for your filter to work. For just one namespace this doesn't look like a problem, but imagine this scenario for 100s of namespaces (namespace as a service).

To cover that use-case, we are creating duplicate parser which are doing the same job. Suppose if we have 100 namespace config and one cluster parser. So there will be 101 entry for the same cluster in parsers.conf that seems redundant. Even if we fix it, and 94 namespace configMap are referring to one cluster parser, there will be 95 entries in parsers.conf. This is for just one cluster parser. suppose if namespaced config is referring to many cluster parsers. If we are concern about the namespaced user/tenant, some RBAC can be provided to those to create Cluster parses if they don't have any cluster parser that can be utilised. OR there could be several other way. But with current scenario, multiple duplicates entry in parsers.conf will confuse them. Something we need to discuss or brought up some optimisation in current approach.

Okay I'll answer questions one by one.

  • I agree on not forming namespaced dups of ClusterParser in the final configs, they are all identical at the end. Functionally this wouldn't change the feature so I'll call it low-mid priority, since parsers are something that will differ based on log structures so it's kinda rare that many applications will have the same/similar log structures.
  • I think you're missing the point here. You most definitely do not need any namespaced CRs if you can grant permissions to users to create cluster level resources. But that's always a privilege escalation, and no matter how small it is, it's not the right approach in any case. In case of ClusterParsers, if you grant permission to a malicious namespace user who can try to mutate the resource to break functionality, cause chaos etc, you'd be in a loss.
  • If I were to come up with this operator from scratch, I'd think about characterising parsers definitely, since these configs aren't part of the main pipeline. But since the project already had a ClusterParser, you need to have namespace Parser to let users create resources without a privilege escalation.

@benjaminhuo
Copy link
Member

@benjaminhuo if you are fine with this simple fix to get rid of error message in fluentbit logs, can I seek your approval and request you to merge this if there is no issue.

I'll take a look at this PR.
Thanks for your contribution @karan56625 and @adiforluls‘s review

@benjaminhuo benjaminhuo merged commit 03e3df1 into fluent:master Aug 1, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: Duplicate Fluentbit cluster parser are getting created when we create multiple namespace fb config.
5 participants