Use airbyte service account when creating pods #11697
Conversation
The orchestrator piece is currently only used in our cloud platform. For injecting arbitrary service accounts, it seems like we should be reading an environment variable with the name of the service account in EnvConfigs and then using that as an input to the KubeProcessFactory.
@mohamagdy does that sound reasonable to you?
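The suggestion above could look roughly like this. This is a minimal sketch, not the actual Airbyte EnvConfigs API: the environment variable name (`JOB_KUBE_SERVICE_ACCOUNT`) and the class/method names are assumptions for illustration.

```java
// Sketch of reading the service account name from the environment.
// JOB_KUBE_SERVICE_ACCOUNT and the class/method names are hypothetical,
// not the real EnvConfigs API.
public class EnvConfigsSketch {

    // Fall back to the "default" Kubernetes service account when the
    // variable is unset, preserving today's behaviour.
    public String getJobKubeServiceAccount() {
        final String value = System.getenv("JOB_KUBE_SERVICE_ACCOUNT");
        return (value == null || value.isBlank()) ? "default" : value;
    }

    public static void main(String[] args) {
        // The resolved name would then be passed into the KubeProcessFactory
        // when building the pod spec.
        System.out.println(new EnvConfigsSketch().getJobKubeServiceAccount());
    }
}
```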
Hey @jrhizor thanks for taking a look. Yeah, that makes total sense. I made the changes and would love to test. Is there an easy way to test it? I was thinking of building the …
airbyte-workers/src/main/java/io/airbyte/workers/process/KubePodProcess.java
@mohamagdy I left one comment. I think this is on the right track except for that one suggestion. For testing Kubernetes-related changes, we generally run this locally using Docker for Desktop. https://docs.airbyte.com/contributing-to-airbyte/developing-on-kubernetes has some more info.
@mohamagdy I'm trying to understand what you are trying to do. The job pods use the default Kubernetes service account for added security, since the default Kube service account has permissions similar to an unauthenticated user. We don't want the job pods to assume anything about the runtime environment, since they should be runtime agnostic. The orchestrator pod has higher permissions since it's explicitly creating other job pods. However, this is a Cloud-only feature and not exposed to OSS users (for now). With that context, what issue are you trying to solve here?
Hey @davinchia, what I want to introduce here is using service accounts and OIDC roles: https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html. A use case is the Redshift destination that writes to S3. In the Redshift destination, there is an option to add the …
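For context, IRSA ties an IAM role to a Kubernetes service account via an annotation on the account. A minimal sketch, where the namespace, account name, AWS account ID, and role ARN are all placeholders:

```yaml
# Sketch of an IRSA-enabled service account on EKS.
# The role ARN and names are placeholders; the annotation key is the
# one the EKS OIDC integration reads.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: airbyte
  namespace: airbyte
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/airbyte-redshift-s3
```

Pods running under such an account receive temporary credentials for the annotated role through the cluster's OIDC provider, so no static access key/secret has to live in the connector config.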
@mohamagdy thanks for explaining. I'm not sure this is the right approach. This will also need destination-redshift connector changes to make the env variables optional. I'm guessing we want to do something like this. However, in that case, we want to be able to specify service accounts per connector and/or connection, and not mount the same service account on all pods. This will require some thinking on our part, as we have to make sure the approach supports all our various deployment options, e.g. Docker, GKE, EKS, etc. Is this blocking? I think the best way forward is to either comment on #5942 or create a separate issue focusing on the Redshift destination. Either we discuss there and figure out a good solution before implementing, or someone on our side can help implement this. FYI @grishick, any thoughts here?
@davinchia thank you too for reviewing. Regarding the Redshift destination configuration, as far as I can see, the AWS credentials (key and secret) are already optional. This will work on non-K8s environments where the right AWS role was assumed, but on K8s, as the comment you posted says, it requires OIDC. Luckily, from what I can see, this change applies to K8s deployments only and doesn't touch other deployments. Is this blocking? Well, we are trying to remove the AWS access key ID and secret from the Redshift connector (and everywhere it is used), and this is one step towards that cleanup, since using an AWS access key ID and secret might not be the best approach security-wise.
@mohamagdy you are right, these are already optional fields. Thanks for pointing that out. The downside of this change is that the service account is applied to all pods, which is probably less secure than specifying the AWS key and secret in the config JSON. The right way to do this is to apply service account changes at the connector or connection stage, but that's trickier since it requires our connector configs to interact with our process layer abstractions. This is why I think we should rethink this.
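The per-connector idea mentioned above could be sketched something like this. Everything here is hypothetical (class name, mapping source, connector image keys); it only illustrates resolving a service account per connector instead of mounting one account on every pod:

```java
import java.util.Map;

// Hypothetical sketch: map connector images to dedicated service
// accounts, falling back to the permissionless "default" account for
// everything else. The mapping source and names are illustrative only.
public class ServiceAccountResolver {

    private final Map<String, String> accountsByConnector;

    public ServiceAccountResolver(final Map<String, String> accountsByConnector) {
        this.accountsByConnector = accountsByConnector;
    }

    // Job pods for unknown connectors keep the default account.
    public String resolve(final String connectorImage) {
        return accountsByConnector.getOrDefault(connectorImage, "default");
    }

    public static void main(String[] args) {
        final ServiceAccountResolver resolver = new ServiceAccountResolver(
            Map.of("airbyte/destination-redshift", "redshift-writer"));
        System.out.println(resolver.resolve("airbyte/destination-redshift")); // redshift-writer
        System.out.println(resolver.resolve("airbyte/source-postgres"));      // default
    }
}
```

This keeps the blast radius of a privileged account limited to the connectors that actually need it, which is the security concern raised above.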
@davinchia I see your point. You suggest having an option per connector to choose the service account to assume when the pod launches. I think this is a good idea but needs more work, as you mentioned. At the same time, I am not sure why using service accounts could be less secure than AWS credentials? The AWS credentials are both exposed and might be as powerful as the role that the service account will assume (that is what I am thinking of). Also, creating a service account per connector requires some changes in the Helm charts to create service accounts on demand (currently I see a single service account for the whole chart: https://github.com/airbytehq/airbyte/blob/master/charts/airbyte/values.yaml#L25-L28).
@mohamagdy it's less secure because, with our current setup, every pod will have the service account mounted, and thus have access to Redshift/whatever roles the service account has. With the secret keys, only the pod running the job has access. The creds are mounted per job pod and stored in the DB. What do you mean when you say the AWS credentials are exposed?
Yeap. I would actually leave this out of the chart and let OSS users manage this separately. Otherwise the chart will get complicated. We also have to figure out how to associate spec configuration with pod creation. Because there are multiple pieces here, I suggest closing this PR and moving this to an issue.
I got your point. It will be the same "security" concern for the default … What I meant by "the AWS credentials are exposed" is that if anyone has access to the connector pods, they can easily read the AWS credentials. And I agree regarding the complexity of adding "dynamic" service accounts in the Helm chart. This PR, in my point of view, adds the flexibility to choose between using AWS credentials or using OIDC on K8s by flexibly setting the …
Yeap. We don't have this concern today because we don't assign any service account to the job pods besides the default account, which has no permissions. The orchestrator pod is special because it's in charge of creating other pods. There is less risk in assigning it a service account because its code is tightly controlled. Connector code typically has less strict requirements compared to core code, since anyone can (and is encouraged to) contribute a connector, and the Airbyte protocol also makes it easy to use custom connectors as long as they implement the protocol.
Agree. However, the changes presented here only help in the read-only Kube access case. If a user has write/edit access, these changes make it easier for someone to abuse credentials, since the service account is generally available in Kube, e.g. creating a pod vs. finding a completed pod and reading its creds. In addition, since we know this is not an approach we want to take, we would be adding tech debt that will make it more confusing for users in the future when we roll out a better solution.
I am closing this and moving the discussion to #11866. |
What

When a source/destination pod is created, there is no service account assigned (the default one is used).

How

This PR changes that behaviour to use the airbyte service account, which helps avoid adding the AWS Access Key ID and Secret to manage AWS resources, using OIDC roles (wired to service accounts) instead.

Pre-merge Checklist
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
docs/SUMMARY.md