Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsure why topic and ACL resources have a required credentials attribute #37

Closed
krishnan-mani opened this issue May 17, 2022 · 19 comments
Closed
Labels
question Further information is requested

Comments

@krishnan-mani
Copy link

When declaring a topic, one is forced to specify the credentials attribute

This is also the case for ACL resources.

Why would a user that is configuring these resources need to provide this information? There is nothing in the "domain abstraction" for a topic or ACL (either in Apache Kafka or Confluent Cloud Kafka) that requires a reference to some credentials, when working with topics (and ACLs)

No other experience for creating or configuring topics and ACLs involves such credentials. i.e., the behaviour of topics and ACLs is independent of some form of authentication/use of credentials by whatever mechanism was used to manage these. Furthermore, the ability to work with topics (and the ACLs) is possible using many different forms of authentication and access control.

If a credentials attribute is introduced for these resources, what kinds of changes will be reported by terraform as configuration drift?

@linouk23 linouk23 added the question Further information is requested label May 17, 2022
@bluedog13
Copy link

bluedog13 commented May 17, 2022

To create topics - what account would one use to create and would we want every account to create topics without any checks in place?

Not everyone can create topics in Confluent Cloud is my understanding, unless they have the right level of access associated with their user account to do so.

While provisioning, we don't want to use user accounts - but service accounts that have the respective privileges (CloudClusterAdmin) to create the underlying resources. Hence we mention "credentials" block to ensure we are provisioning topics using the service account which has the right privileges to create the topic.

Same applies to ACL as well.

@krishnan-mani
Copy link
Author

krishnan-mani commented May 17, 2022

To create topics - what account would one use to create and would we want every account to create topics without any checks in place?

(As is indicated in the documentation for the package, ...) just one of the recommended ways appears to be to generate an API key and secret associated with a service account.

When authenticating using these credentials, the caller is able to perform the actions the service account is privileged for (per the role assignment, say CloudClusterAdmin). Any particular action to create or configure the topic will either succeed or fail, based on the validity of the credentials (i.e., authn) as well as the associated privilege (i.e., authz), at that point in time. This form of action can be considered a control plane action

It is my assertion that any particular control plane action to create or change the topic is entirely independent of how the topic is configured, how it behaves, and how it is used by producers and consumers. Each control plane action is also independent of subsequent control plane actions that make more changes to the topic (by authenticating suitably)

This is also supported by how one can create any number of users as well as service accounts with different roles with varying privileges, however, several of them at any given time (with the right privileges) can administer a particular topic. For example, a user may have created the topic earlier this year, and now that the provider is available, I hope to terraform import the same into configuration, and at some later date, we may choose to administer it using a different tool or practice (and move it out of terraform state)

The actions to produce or consume messages with the topic are governed by a different mechanism, and for any given topic, there may be a large number of producers and consumers that have each been assigned different sets of credentials and are able to perform different actions independent of each other. These can be considered as data plane actions

In other words, for both control plane actions and data plane actions over time, any credentials used for a particular action are not a "structural" aspect of the topic (as declared using resource attributes in terraform)

Not everyone can create topics in Confluent Cloud is my understanding, unless they have the right level of access associated with their user account to do so.

Let us use an analogy (hopefully one that is well-understood) of another resource that is commonly managed using infrastructure-as-code with terraform: an EC2 AutoScaling Group in AWS.

Any number of human users and/or automation (with suitable IAM credentials and roles) can act to create or configure it at different points in time. Whether a particular change to the ASG was performed by some terraform apply or a user messing around in the console has no bearing on the behaviour of the ASG itself and is not a resource attribute. Who or what can make such changes over time, is entirely independent of the configuration and behaviour of the resource itself.

In particular, if a human user messed with it independently of terraform, then the configuration drift reported at the next terraform run is only for the behavioural/structural attributes of the resource (such as max_size), and not for who or what changed it.

While provisioning, we don't want to use user accounts - but service accounts that have the respective privileges (CloudClusterAdmin) to create the underlying resources. Hence we mention "credentials" block to ensure we are provisioning topics using the service account which has the right privileges to create the topic.

Same applies to ACL as well.

Hope the above clarifies. Also, please consider the commonly-accepted usage model for infrastructure-as-code.
My colleagues and I may be responsible for creating and maintaining common configuration for a cluster and resources checked into version control. Each of us may check out this code and attempt, say, a terraform plan by setting TF_VAR_confluent_cloud_api_key, etc. to the independent credentials that each of us has been assigned. Whether someone has the role and privileges changes over time as people join and leave, and with a change in their responsibilities, and is managed independently of the topics.

Moreover, when the change is applied with automation using terraform, say in a pipeline, it will use an independent set of credentials (for service account, etc.) specifically assigned for the purpose.

The similar arguments apply to ACLs as well.

@krishnan-mani
Copy link
Author

As customers and users of Kafka on Confluent Cloud, we have been eagerly anticipating various features, releases, and tools; including this terraform provider to support "automated administration of all resources in Confluent Cloud, at scale". Hopefully, the above comments clarify why we consider the required credentials attribute on topics and ACLs, etc. as a significant limitation, (if not flaw) that impacts the value of the provider. I am happy to provide further technical arguments to advance this view.

We humbly request a change, and are happy to participate and contribute. (we have also done so, in the recent past, to other tools used by customers of Kafka in Confluent Cloud, such as julie). Hope, this issue can be labeled with more than a question.

@bluedog13
Copy link

Thanks for the detailed explanation.

If I have to understand the ask (I am not part of Confluent team and I am just trying to understand for my sake since we have been using terraform to provision our resources in non-prod and have not found this to be a significant limitation and find it helpful to limit the folks/accounts that can create topics at will)

  • the "credentials" block must not be in the topic or acl resource block declaration
  • By means of an environment variable or a way to inject credentials globally, the provider must be able to use them and provision the resources?

@krishnan-mani
Copy link
Author

krishnan-mani commented May 17, 2022

@bluedog13 , thank you for a speedy response! Both the points you mention above are generally accurate.

It is my understanding, that many terraform providers support a mechanism as part of the provider configuration, that let the user, (in a flexible and context-sensitive way) supply the necessary identity and authentication information that is required at each run. As the provider ultimately (under the covers) defers to other libraries and APIs for the underlying operations (in this case, the APIs hosted by the Confluent Cloud), they tend to support, in a transparent and pass-through manner the same forms of authentication, etc. supported by the underlying API.

An example of this approach is the AWS terraform provider. In this case, the provider ultimately supports all of the multiple mechanisms to authenticate that are supported for the use of the AWS SDKs with IAM.

In particular, the provider also supports one form of configuration which requires no references to any authentication information at all, since all of the AWS SDKs also support a form of authentication, for e.g., by simply setting environment variables (such as AWS_PROFILE)

@bluedog13
Copy link

I agree with you that having the ability of not supplying "credentials" in every topic/acl resource block would be a welcome option to have. It will save time and from writing repetitive code.

That saying, I would not want the current feature of "credentials" to be dropped fully, but made optional. I say this because when a super user or a service account with CloudClusterAdmin is removed from the system, at times we would not want a scenario where the key/secret has to be generated outside and the key/vault be updated with the new keys for our use case. At times, it may perhaps be good to have the entire process in Terraform itself.

@krishnan-mani
Copy link
Author

@bluedog13 My understanding is that you would like to also administer users, service accounts, and credentials (i.e., the "entire process ...") in terraform. However, to be clear, this is feasible without having any references to credentials from topic resources.

imho, it will be an error to introduce attribute for credentials into topic, as this will have the opposite effect: to hinder the ease with which credentials can be administered. In general, we should be able to generate, use, change, and delete credentials with a minimum of fuss as and when needed (whether in terraform or elsewhere), and any references to these from other resources will complicate matters, due to dependencies.

@linouk23
Copy link
Collaborator

linouk23 commented May 17, 2022

@krishnan-mani thanks for creating an issue!

I definitely agree that it's a limitation. In short, Confluent Cloud requires both Cloud and Kafka API Keys (ACLs / Topic) to manage all the resources which is why we introduce credentials block (we made a design decision to have all resources in a single confluent TF provider rather then introducing a separate one for ACLs / Topics).

On a brighter side, we're looking to add OAuth to TF Provider (i.e., single pair of keys to manage all the resources that would be passed / accepted similarly to Cloud API Keys via provider's arguments) and making credentials block optional (for ACL / Topic resources) -- kind of similar to Azure TF Provider's credentials model. Let me know if you think it's a reasonable idea that long term will fix the mentioned issue.

@linouk23
Copy link
Collaborator

@krishnan-mani by the way, do you have any ideas about #38 (comment)?

@mirkg
Copy link

mirkg commented May 23, 2022

What about using ENV vars to provide KAFKA_API_KEY and KAFKA_API_SECRET?
Why this is allowed for confluent_kafka_cluster resource but not for topics or acls?
This is most secure solution to get directly from client env and do not store in any variables or state file at all.
You may still keep credentials as optional for people who do not care about state file but please allow to use ENV as well.

@mirkg
Copy link

mirkg commented May 30, 2022

Only drawback of env variables for kafka resources is that you need API keys per cluster. so for applying it for more than one cluster in same run it would not work.
But how often do you manage multilple clusters resources in single terraform state?
Before you apply changes for next cluster you always set corresponding key in env and all works.

Only other secure option is to store credentials references with resource and get secrets from some secrets manager at runtime.

@linouk23
Copy link
Collaborator

@mirkg @bluedog13 @krishnan-mani check out our latest 0.13.0 release where we

Added support for kafka_api_key, kafka_api_secret, kafka_rest_endpoint attributes in a provider block to make rest_endpoint attribute and credentials block optional for confluent_kafka_acl and confluent_kafka_topic resources (#37, #54).

... added a new managing-single-cluster example.

@bluedog13
Copy link

Thanks for the update.

Why do we still need this in the resource block when all the required data for the cluster has been moved to the provider "confluent" block. An incorrect combination of kafka_api_key/kafka_api_secret and kafka_cluster won't work anyway.

kafka_cluster {
    id = var.kafka_cluster_id
  }

@linouk23
Copy link
Collaborator

@bluedog13 unfortunately our APIs require kafka_cluster_id ({cluster_id}) on top of the kafka_rest_endpoint:

curl --request GET \
  --url 'https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/{cluster_id}/topics' \
  --header 'Authorization: Basic REPLACE_BASIC_AUTH'

@bluedog13
Copy link

Is it possible to move the kafka_cluster_id block to the provider "confluent" block to avoid the repetition as well?

@linouk23
Copy link
Collaborator

@bluedog13 it might work but we didn't want to deviate too much from our original design.

Arguably, having credentials was a really bad design but having cluster_id is OK since it is a part of topic definition.

@linouk23
Copy link
Collaborator

linouk23 commented Dec 28, 2022

update: we've been thinking about adding

# Option #3
provider "confluent" {
  cloud_api_key       = var.confluent_cloud_api_key
  cloud_api_secret    = var.confluent_cloud_api_secret

  kafka_rest_endpoint = var.kafka_rest_endpoint
  kafka_api_key       = var.kafka_api_key
  kafka_api_secret    = var.kafka_api_secret
  kafka_cluster_id    = var.kafka_cluster_id
}

resource "confluent_kafka_topic" "orders" {
  topic_name    = "orders"
}

in addition to

# Option #2
provider "confluent" {
  cloud_api_key       = var.confluent_cloud_api_key
  cloud_api_secret    = var.confluent_cloud_api_secret

  kafka_rest_endpoint = var.kafka_rest_endpoint
  kafka_api_key       = var.kafka_api_key
  kafka_api_secret    = var.kafka_api_secret
}

resource "confluent_kafka_topic" "orders" {
  kafka_cluster {
    id = var.kafka_cluster_id
  }
  topic_name    = "orders"
}

Or in other words, it's more like Option #3 may become a preferred one over Option #2.

What do you think cc @mirkg @krishnan-mani

The reason why we came back to this was this thread.

@linouk23
Copy link
Collaborator

@linouk23
Copy link
Collaborator

linouk23 commented Jan 5, 2023

👋 @mirkg @krishnan-mani @nseddik @kislerdm @pyvandenbussche @grampurohitp, we've just

in 1.24.0 version of TF Provider.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants