
AWS session token support #1008

Merged
merged 1 commit on Apr 17, 2019

Conversation

michallula

Relevant Issue

Where to put aws_session_token information #651

Add support for AWS session tokens

@gaul
Member

gaul commented Apr 9, 2019

Please add documentation to --help and man pages and squash these commits. Also can you share how to test this?

@michallula michallula force-pushed the master branch 4 times, most recently from 560a524 to a8e88df on April 14, 2019
@ggtakec
Member

ggtakec commented Apr 16, 2019

@michallula Thank you for the PR.
I was unsure whether to support session tokens, because s3fs is expected to output an EIO error on every operation once the token expires.
However, I think it is good to support them, just as your PR does.

Also, please fix the conflicting lines in src/s3fs_util.cpp, because another PR was merged.

If @gaul doesn't mind, I would like to merge this PR.

@gaul I think we should discuss this in a new issue if necessary.
s3fs should be able to clarify errors caused by token expiration (e.g. via log output).
We should also consider how users can respond to token updates, for example by outputting a message prompting them to restart s3fs.
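A minimal sketch of what such a user-facing warning could look like outside s3fs (hypothetical helper, not part of this PR; `EXPIRATION` is assumed to come from the STS response's `Credentials.Expiration` field, and GNU `date` is assumed):

```shell
#!/bin/bash
# Hypothetical expiry check: warn when the session token has lapsed.
# EXPIRATION is assumed to hold the ISO-8601 timestamp returned by STS.
EXPIRATION="2019-04-16T18:37:00Z"
now=$(date -u +%s)
exp=$(date -u -d "$EXPIRATION" +%s)
if [ "$now" -ge "$exp" ]; then
  echo "session token expired at $EXPIRATION; refresh credentials and restart s3fs"
fi
```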

@michallula
Author

michallula commented Apr 16, 2019

Conflicts resolved. I'm testing it using the airlock project (https://github.com/ing-bank/airlock). Airlock provides a docker-compose file which lets you start a sophisticated local environment with Ceph storage and an STS proxy.

To set up such an environment, the following steps are required:

  1. Clone airlock.
  2. Invoke docker-compose up -d. It takes a while to start all components (around 5-10 minutes on my laptop).
  3. Invoke the setupS3Env.sh script from the airlock directory.
  4. Run sbt run to start the airlock proxy.
  5. Edit the ranger policy following the instructions in the airlock README file.

Now it is time to configure the S3 client. I use the following script to set the AWS_* environment variables (please remember to invoke it with source):

#!/bin/bash -x

SESSION_DURATION=${1:-3600}

# Fetch an OIDC access token from the local Keycloak instance.
export KEYCLOAK_TOKEN=$(curl -s -d 'client_id=sts-airlock' \
 -d 'username=testuser' \
 -d 'password=password' \
 -d 'grant_type=password' 'http://localhost:8080/auth/realms/auth-airlock/protocol/openid-connect/token' | jq -r '.access_token')

# Exchange the token for temporary credentials via the airlock STS proxy.
AWS_CREDS=$(aws sts get-session-token --duration-seconds "$SESSION_DURATION" --token-code "$KEYCLOAK_TOKEN" --endpoint-url http://localhost:12345)

export AWS_ACCESS_KEY_ID=$(echo "$AWS_CREDS" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$AWS_CREDS" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$AWS_CREDS" | jq -r '.Credentials.SessionToken')
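A note on the source requirement: exports made in a child shell never reach the parent, which is why running the script normally would leave the AWS_* variables unset in your shell. A quick illustration (the demo variable name and temp file are arbitrary):

```shell
# exports made in a subshell do not propagate to the parent shell
bash -c 'export DEMO_TOKEN=abc'
echo "after subshell: ${DEMO_TOKEN:-unset}"

# sourcing runs the commands in the current shell, so the export sticks
echo 'export DEMO_TOKEN=abc' > /tmp/demo-creds.sh
source /tmp/demo-creds.sh
echo "after source: ${DEMO_TOKEN:-unset}"
```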

The final part is a Dockerfile with s3fs-fuse installed. I use the following Dockerfile for tests:

FROM centos:7

USER root

ENV HOME=/opt/app-root \
    MOUNT_POINT=/data \
    SRC_DIR=/usr/src/s3fs-fuse

RUN yum -y install \
    fuse \
    fuse-devel \
    libcurl-devel \
    libxml2-devel \
    openssl-devel \
    gcc-c++ \
    make \
    automake \
    autoconf \
    wget \
    git \
    && rm -rf /var/cache/yum/* \
    && yum clean all

RUN useradd -u 1001 -r -g 0 -d ${HOME} -s /sbin/nologin \
        -c "Default Application User" default

RUN mkdir -p ${HOME}

RUN chown -R 1001:0 ${HOME}

ARG REPO_URL="https://github.com/michallula/s3fs-fuse.git"

RUN mkdir -p $SRC_DIR \
   && git clone $REPO_URL $SRC_DIR

RUN cd $SRC_DIR \
   && ./autogen.sh \
   && ./configure --prefix=/usr \
   && make \
   && make install

WORKDIR /

ADD fuse.conf /etc/fuse.conf
ADD mount-s3.sh /mount-s3.sh
ADD entrypoint.sh /entrypoint.sh

RUN mkdir -p $MOUNT_POINT \
    && chown -R 1001:0 $MOUNT_POINT

USER 1001

ENTRYPOINT ["/entrypoint.sh"]

My fuse.conf:

user_allow_other

mount-s3.sh

#!/bin/bash

AWS_ACCESS_KEY_ID=${1}
AWS_SECRET_ACCESS_KEY=${2}
AWS_SESSION_TOKEN=${3}
S3_ENDPOINT=${4}
S3_BUCKET=${5}
S3_OBJECT_KEY=${6:-}
MOUNT_POINT=${7:-/data}

: "${AWS_ACCESS_KEY_ID:?Need to set AWS_ACCESS_KEY_ID non-empty}"
: "${AWS_SECRET_ACCESS_KEY:?Need to set AWS_SECRET_ACCESS_KEY non-empty}"
: "${AWS_SESSION_TOKEN:?Need to set AWS_SESSION_TOKEN non-empty}"
: "${S3_ENDPOINT:?Need to set S3_ENDPOINT non-empty}"
: "${S3_BUCKET:?Need to set S3_BUCKET non-empty}"
: "${MOUNT_POINT:?Need to set MOUNT_POINT non-empty}"

export AWSACCESSKEYID="$AWS_ACCESS_KEY_ID"
export AWSSECRETACCESSKEY="$AWS_SECRET_ACCESS_KEY"
export AWSSESSIONTOKEN="$AWS_SESSION_TOKEN"

if [[ ! -z "$S3_OBJECT_KEY" ]]
then
  S3FS_PATH=":/$S3_OBJECT_KEY"
else
  S3FS_PATH=""
fi

# Mount s3 bucket from environment variable
s3fs $S3_BUCKET$S3FS_PATH $MOUNT_POINT \
  -o url=$S3_ENDPOINT \
  -o use_path_request_style \
  -o notsup_compat_dir \
  -o allow_other,uid=`id -u`,umask=0077,mp_umask=0077

echo "bucket: $S3_ENDPOINT/$S3_BUCKET/$S3_OBJECT_KEY was mounted at $MOUNT_POINT"
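For clarity, the first argument to s3fs uses its bucket:/path syntax to mount only a sub-path of the bucket; the concatenation logic in the script above reduces to this (sample values assumed):

```shell
# Sample values standing in for the script's positional parameters.
S3_BUCKET=demobucket
S3_OBJECT_KEY=reports
if [[ -n "$S3_OBJECT_KEY" ]]; then
  S3FS_PATH=":/$S3_OBJECT_KEY"
else
  S3FS_PATH=""
fi
echo "$S3_BUCKET$S3FS_PATH"
```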

entrypoint.sh

#!/bin/bash

set -e

./mount-s3.sh "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" "$AWS_SESSION_TOKEN" "$S3_ENDPOINT" "$S3_BUCKET" "$S3_OBJECT_KEY" "$MOUNT_POINT" &

exec "$@"

I use the following command to run a container:

docker build -t local-s3fs-fuse .
docker run --cap-add SYS_ADMIN --device /dev/fuse --network host -it -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY -e AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN  -e S3_BUCKET=demobucket -e S3_ENDPOINT="http://host.docker.internal:8010" local-s3fs /bin/bash

Data from s3 demobucket is available under /data directory.

We are using s3fs-fuse with the session token support patch extensively in the ING data science platform to mount S3 buckets inside Jupyter notebooks. Everything works smoothly so far.
Session token support is a key feature for us. Our notebooks are ephemeral and die when the session token expires. That is the desired behaviour for us, because users have temporary access to data. Without session token support we would not be allowed to use s3fs-fuse to mount S3 buckets in a banking environment.

I hope this PR will be useful for others as well.

@ggtakec
Member

ggtakec commented Apr 16, 2019

Thank you for the detailed explanation.
I understand what you are testing.

@gaul please check and review.

@michallula
Author

michallula commented Apr 17, 2019

To test reading credentials from the ~/.aws/credentials file instead of environment variables, you can use the following Docker image:

FROM centos:7

USER root

ENV HOME=/opt/app-root \
    MOUNT_POINT=/data \
    SRC_DIR=/usr/src/s3fs-fuse \
    AWS_DEFAULT_REGION=us-east-1

RUN yum -y install \
    fuse \
    fuse-devel \
    libcurl-devel \
    libxml2-devel \
    openssl-devel \
    gcc-c++ \
    make \
    automake \
    autoconf \
    wget \
    git \
    && rm -rf /var/cache/yum/* \
    && yum clean all

RUN useradd -u 1001 -r -g 0 -d ${HOME} -s /sbin/nologin \
        -c "Default Application User" default

RUN mkdir -p ${HOME}

RUN chown -R 1001:0 ${HOME}

ARG REPO_URL="https://github.com/michallula/s3fs-fuse.git"

RUN mkdir -p $SRC_DIR \
   && git clone $REPO_URL $SRC_DIR

RUN cd $SRC_DIR \
   && ./autogen.sh \
   && ./configure --prefix=/usr \
   && make \
   && make install

WORKDIR /

ADD fuse.conf /etc/fuse.conf
ADD mount-s3.sh /mount-s3.sh
ADD entrypoint.sh /entrypoint.sh

RUN mkdir -p $MOUNT_POINT \
    && chown -R 1001:0 $MOUNT_POINT

USER 1001

ENTRYPOINT ["/entrypoint.sh"]

entrypoint.sh:

#!/bin/bash

set -e

# setup aws defaults
mkdir -p ~/.aws/

cat > ~/.aws/config << EOF
[default]
output = json
region = ${AWS_DEFAULT_REGION}
s3 =
    endpoint_url = ${S3_ENDPOINT}
s3api =
    endpoint_url = ${S3_ENDPOINT}
EOF

cat > ~/.aws/credentials <<-EOF
[default]
aws_access_key_id = ${AWS_ACCESS_KEY_ID}
aws_secret_access_key = ${AWS_SECRET_ACCESS_KEY}
aws_session_token = ${AWS_SESSION_TOKEN}
EOF

./mount-s3.sh "$S3_ENDPOINT" "$S3_BUCKET" "$S3_OBJECT_KEY" "$MOUNT_POINT" &

exec "$@"
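One detail worth noting in the entrypoint: the heredoc delimiters are unquoted, so ${AWS_SESSION_TOKEN} and friends are expanded when the container starts. With a quoted delimiter the literal text would be written instead:

```shell
# Sample value standing in for the real session token.
AWS_SESSION_TOKEN=token123

# unquoted delimiter: variables are expanded
cat <<EOF
aws_session_token = ${AWS_SESSION_TOKEN}
EOF

# quoted delimiter: written verbatim, no expansion
cat <<'EOF'
aws_session_token = ${AWS_SESSION_TOKEN}
EOF
```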

mount-s3.sh

#!/bin/bash

S3_ENDPOINT=${1}
S3_BUCKET=${2}
S3_OBJECT_KEY=${3:-}
MOUNT_POINT=${4:-/data}

: "${S3_ENDPOINT:?Need to set S3_ENDPOINT non-empty}"
: "${S3_BUCKET:?Need to set S3_BUCKET non-empty}"
: "${MOUNT_POINT:?Need to set MOUNT_POINT non-empty}"

if [[ ! -z "$S3_OBJECT_KEY" ]]
then
  S3FS_PATH=":/$S3_OBJECT_KEY"
else
  S3FS_PATH=""
fi

# Mount s3 bucket from environment variable
s3fs $S3_BUCKET$S3FS_PATH $MOUNT_POINT \
  -o url=$S3_ENDPOINT \
  -o use_path_request_style \
  -o notsup_compat_dir \
  -o allow_other,uid=`id -u`,umask=0077,mp_umask=0077

echo "bucket: $S3_ENDPOINT/$S3_BUCKET/$S3_OBJECT_KEY was mounted at $MOUNT_POINT"

@gaul gaul merged commit 381835e into s3fs-fuse:master Apr 17, 2019
@gaul
Member

gaul commented Apr 17, 2019

Thank you for your contribution @michallula!

@nandipati

nandipati commented Nov 22, 2019

@gaul Is there a plan for a release? We are looking forward to this feature and waiting on it, as it is still not in the 1.85 release.
