Access Denied when using apoc.import #482

Closed
colintkn opened this issue Aug 29, 2023 · 13 comments

Comments

@colintkn

colintkn commented Aug 29, 2023

I've deployed neo4j on kubernetes.

I've created a serviceAccount annotated with a role that has permissions to run S3 commands. The necessary jar files have been downloaded as specified here: https://neo4j.com/docs/apoc/5/import/web-apis/#_using_s3_protocol

The jar files are in the /plugins directory.

neo4j@neo4j-0-0:~/plugins$ ls
README.txt            aws-java-sdk-core-1.12.136.jar  httpclient-4.5.13.jar
apoc-5.10.0-core.jar  aws-java-sdk-s3-1.12.136.jar    httpcore-4.4.15.jar
aws                   awscliv2.zip                    joda-time-2.10.13.jar

But an APOC call will not work in the cypher-shell of the pod.

CALL apoc.import.csv([{fileName: "s3://s3.ap-southeast-1.amazonaws.com/xxxx/neo4j_import/xxxx.csv", labels: ["entities"]}], [], {ignoreDuplicateNodes: true, ignoreBlankString: true});

It yields this error:

Failed to invoke procedure `apoc.import.csv`: Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;

Is there a reason why the APOC plugin does not get credentials from the serviceAccount? When the node that the pod is running on has the S3 policies attached to it, the apoc command works. It seems like APOC is using the permissions from the node rather than the service account.

Here are the configurations

apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxx:role/neo4j-eks
  creationTimestamp: "2023-08-23T02:11:45Z"
  name: neo4j
  namespace: neo4j

A check on the pod shows the correct role being used

    - name: AWS_ROLE_ARN
      value: arn:aws:iam::xxxxxx:role/neo4j-eks
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
      value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token

Here are the environment variables from the pod

neo4j@neo4j-0-0:~$ env | grep AWS
AWS_DEFAULT_REGION=ap-southeast-1
AWS_REGION=ap-southeast-1
AWS_ROLE_ARN=arn:aws:iam::xxxxxxx:role/neo4j-eks
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_STS_REGIONAL_ENDPOINTS=regional

apoc_config:

apoc_config:
  apoc.import.file.enabled: "true"
  apoc.import.file.use_neo4j_config: "true"
  apoc.export.file.enabled: "true"

configuration:

config:
  server.metrics.prometheus.enabled: "true"
  server.metrics.prometheus.endpoint: "0.0.0.0:12004"
  dbms.logs.http.enabled: "true"
  server.logs.debug.enabled: "true"
  # Configure and install APOC core https://neo4j.com/docs/operations-manual/current/kubernetes/plugins/
  server.config.strict_validation.enabled: "false"
  dbms.security.procedures.allowlist: "apoc.*"
  dbms.security.procedures.unrestricted: "apoc.*,bloom.*"
@gem-neo4j
Contributor

Looking at the APOC docs, the options for using S3 are:

The S3 URL must be in the following format:

  • s3://accessKey:secretKey[:sessionToken]@endpoint:port/bucket/key (where the sessionToken is optional)
  • s3://endpoint:port/bucket/key?accessKey=accessKey&secretKey=secretKey[&sessionToken=sessionToken] (where the sessionToken is optional)
  • s3://endpoint:port/bucket/key if the accessKey, secretKey, and the optional sessionToken are provided in the environment variables
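To make the three URL forms concrete, here is a minimal Python sketch that reports which form a given URL uses. It is not APOC's parser: the function name `s3_credential_source` and the labels it returns are made up for illustration, and it assumes the keys contain no characters that would need URL-escaping.

```python
from urllib.parse import parse_qs, urlsplit


def s3_credential_source(url: str) -> str:
    """Classify an s3:// URL by where its credentials would come from.

    Returns "userinfo", "query" or "environment" (hypothetical labels).
    """
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError(f"not an s3:// URL: {url!r}")

    # Form 1: s3://accessKey:secretKey[:sessionToken]@endpoint:port/bucket/key
    if parts.username and parts.password:
        return "userinfo"

    # Form 2: s3://endpoint:port/bucket/key?accessKey=...&secretKey=...
    query = parse_qs(parts.query)
    if "accessKey" in query and "secretKey" in query:
        return "query"

    # Form 3: no credentials in the URL at all
    return "environment"
```

The URL used in this issue carries no keys, so it falls into the third form and the credentials have to come from somewhere outside the URL.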

I am not sure I entirely understood what you meant by "Only when permissions are added to the node role, then it works."

@colintkn
Author

Yes, even with the command as specified in the 3rd example, it fails.

CALL apoc.import.csv([{fileName: "s3://s3.ap-southeast-1.amazonaws.com:443/xxx/xxxx.csv", labels: ["entities"]}], [], {ignoreDuplicateNodes: true, ignoreBlankString: true});

Here are the environment variables

neo4j@neo4j-0-0:~$ env | grep AWS
AWS_DEFAULT_REGION=ap-southeast-1
AWS_REGION=ap-southeast-1
AWS_ROLE_ARN=arn:aws:iam::xxxxxxx:role/neo4j-eks
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_STS_REGIONAL_ENDPOINTS=regional

"Only when permissions are added to the node role, then it works." - When the node that the pod is running on has the S3 policies attached to it, the apoc command works. It seems like it's using the permissions from the node rather than the service account.

@gem-neo4j
Contributor

Looking at the code, if the access key and secret key are not passed in the URL (the first 2 options), APOC falls back to DefaultAWSCredentialsProviderChain, which is provided by the AWS Java SDK. This is what that chain looks for:

AWS credentials provider chain that looks for credentials in this order:

  • Environment Variables - AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (RECOMMENDED since they are recognized by all the AWS SDKs and CLI except for .NET), or AWS_ACCESS_KEY and AWS_SECRET_KEY (only recognized by Java SDK)
  • Java System Properties - aws.accessKeyId and aws.secretKey
  • Web Identity Token credentials from the environment or container
  • Credential profiles file at the default location (~/.aws/credentials) shared by all AWS SDKs and the AWS CLI
  • Credentials delivered through the Amazon EC2 container service if the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable is set and the security manager has permission to access the variable
  • Instance profile credentials delivered through the Amazon EC2 metadata service
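The lookup order above can be modelled as a first-match-wins scan. The Python sketch below is a simplified model and not the SDK's code: the function name and the checks are hypothetical, Java system properties and the profile file are treated as absent, the web identity step is assumed to need both AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE, and the instance profile is assumed to be always reachable, as it would be on an EC2 node.

```python
from typing import Mapping


def first_matching_provider(env: Mapping[str, str]) -> str:
    """Name the first step of the (modelled) chain that would supply credentials."""

    def has(*names: str) -> bool:
        # True only if every named variable is set to a non-empty value.
        return all(env.get(name) for name in names)

    # Same order as the list above; each flag is a simplified stand-in
    # for the real provider's check (assumption, not SDK behaviour).
    steps = [
        ("environment variables",
         has("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
         or has("AWS_ACCESS_KEY", "AWS_SECRET_KEY")),
        ("java system properties", False),  # not visible from env; assumed unset
        ("web identity token", has("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")),
        ("profile file", False),            # assumed absent in the pod
        ("container credentials", has("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")),
        ("instance profile", True),         # assumed reachable on an EC2 node
    ]
    return next(name for name, available in steps if available)


# Environment as posted in this thread (account id elided in the original).
pod_env = {
    "AWS_DEFAULT_REGION": "ap-southeast-1",
    "AWS_REGION": "ap-southeast-1",
    "AWS_ROLE_ARN": "arn:aws:iam::xxxxxxx:role/neo4j-eks",
    "AWS_WEB_IDENTITY_TOKEN_FILE": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token",
    "AWS_STS_REGIONAL_ENDPOINTS": "regional",
}
```

With the environment posted in this thread, the model stops at the web identity step, so on paper the service account role should be picked up before the node's instance profile.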

Do any of these options work for you? I see your env variables aren't including it either.

@colintkn
Author

hey gem!

these are the env variables:

neo4j@neo4j-0-0:~$ env | grep AWS
AWS_DEFAULT_REGION=ap-southeast-1
AWS_REGION=ap-southeast-1
AWS_ROLE_ARN=arn:aws:iam::xxxxxxx:role/neo4j-eks
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_STS_REGIONAL_ENDPOINTS=regional

I assumed AWS_WEB_IDENTITY_TOKEN_FILE would correspond to the web identity token credentials.

Is there something I am missing?
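One way to rule out the environment side is to check, from inside the pod, that both variables are set and that the token file is actually there. The sketch below is only a sanity check under that assumption; the function name `web_identity_problems` is hypothetical, and it does not call any AWS API.

```python
import os
from typing import List, Mapping


def web_identity_problems(env: Mapping[str, str]) -> List[str]:
    """List environment-side problems that would block web identity credentials."""
    problems = []

    if not env.get("AWS_ROLE_ARN"):
        problems.append("AWS_ROLE_ARN is not set")

    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if not token_path:
        problems.append("AWS_WEB_IDENTITY_TOKEN_FILE is not set")
    elif not os.path.isfile(token_path):
        problems.append(f"token file not found: {token_path}")
    elif os.path.getsize(token_path) == 0:
        problems.append(f"token file is empty: {token_path}")

    return problems
```

Calling `web_identity_problems(os.environ)` in the pod returns an empty list when nothing is wrong on this side. An empty list only rules out the environment; it says nothing about whether the SDK on the classpath can make use of the token.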

@colintkn
Author

colintkn commented Aug 30, 2023

would it have to do with the version of the java sdk?

https://neo4j.com/labs/apoc/5/import/web-apis/#_using_s3_protocol

@gem-neo4j
Contributor

Right, that env variable should correspond to the token file. Hmm, the documentation is a little outdated; I have aws-java-sdk-core-1.12.425 locally at least. Could you try with that jar? (If that works I'll fix the docs asap :) )

@colintkn
Author

colintkn commented Sep 1, 2023

Hmm. Your version still fails with access denied.

@KulykDmytro

KulykDmytro commented Sep 1, 2023

same issue with aws-java-sdk-s3/core: 1.12.540

@colintkn
Author

colintkn commented Sep 4, 2023

Any updates regarding this issue?

@gem-neo4j
Contributor

Hi, I can't see any immediate issue with APOC as it seems to just be calling a default method in aws sdk, so I'll make a bug card for my team to investigate later :) sorry for the inconvenience!

@ncordon
Collaborator

ncordon commented Sep 13, 2023

"Any updates regarding this issue?"

Hello @colintkn @KulykDmytro. Could you please try to add another jar to the plugins folder? I think you need:

aws-java-sdk-s3-1.12.425 
aws-java-sdk-sts-1.12.425 

For service accounts to work, sts needs to be on the classpath, as described in aws/aws-sdk-java#2136.

If that fixes the issue, we can add the sdk-sts to the aws dependencies in apoc.
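A quick way to spot this from a directory listing is to compare the jar names against the SDK modules that are needed. This is a minimal sketch with assumptions: the function name is hypothetical, the required set combines core and s3 from the original plugins listing with sts from this comment, and the regex assumes jars are named `<module>-<version>.jar`.

```python
import re
from typing import Iterable, List

# core and s3 come from the plugins listing in this issue, sts from the
# comment above; treat the set as an assumption, not an official list.
REQUIRED_MODULES = ("aws-java-sdk-core", "aws-java-sdk-s3", "aws-java-sdk-sts")

# Matches e.g. "aws-java-sdk-s3-1.12.425.jar" and captures "aws-java-sdk-s3".
_JAR_PATTERN = re.compile(r"(aws-java-sdk-[a-z0-9]+)-\d[\w.]*\.jar")


def missing_sdk_modules(jar_names: Iterable[str]) -> List[str]:
    """Return the required AWS SDK modules that have no jar in the listing."""
    present = set()
    for name in jar_names:
        match = _JAR_PATTERN.fullmatch(name)
        if match:
            present.add(match.group(1))
    return [module for module in REQUIRED_MODULES if module not in present]


# File names from the `ls` output at the top of this issue.
original_plugins = [
    "README.txt", "aws-java-sdk-core-1.12.136.jar", "httpclient-4.5.13.jar",
    "apoc-5.10.0-core.jar", "aws-java-sdk-s3-1.12.136.jar", "httpcore-4.4.15.jar",
    "aws", "awscliv2.zip", "joda-time-2.10.13.jar",
]
```

Run against the original listing, the check reports only the sts module as missing. In a pod it could be fed with `os.listdir()` on the plugins directory.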

@colintkn
Author

Hi,
Thanks, it works now. It would be helpful to add it to the list of dependencies here:

https://neo4j.com/labs/apoc/5/import/web-apis/#_using_s3_protocol

@ncordon
Collaborator

ncordon commented Sep 20, 2023

We have packaged those dependencies in a single jar that can be downloaded from the extended releases page (not updated yet; you should see the new bundled jar with sts once 5.12.0 is released).

We have work in progress to update that page you mentioned, so hopefully in a few days we should have better docs on how to add those extra dependencies.

@ncordon ncordon closed this as completed Sep 20, 2023