This repository demonstrates the solution presented in the following blog: Secure Amazon SageMaker Studio presigned URLs Part 3: Multi-account private API access to Studio.
It shows how to provide private and secure access to SageMaker Studio domains in a multi-account environment by using presigned domain URLs.
- AWS CLI to run the commands
- SAM CLI installed
- 3 accounts, each with an AWS CLI profile for deployments:
  - Networking and Access resources -> Shared Services Account
  - SageMaker Account A
  - SageMaker Account B
Setting up named profiles for AWS CLI
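Before deploying, it can help to confirm that the command-line tools are available. The following is a minimal sketch and not part of the repository: `missing_tools` is a hypothetical helper name, and jq is included because the commands later in this README rely on it.

```shell
# Hypothetical helper (not part of the repository): print each tool
# from the argument list that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# aws and sam are listed in the prerequisites; jq is used by later commands.
missing=$(missing_tools aws sam jq)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
else
  echo "All required tools found."
fi
```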
- First, clone this repository by running:
git clone git@github.com:aws-samples/multiaccount-sagemaker-studio-private-access.git
- Once the repository is cloned, move into its root directory and open it in your favorite code editor:
cd multiaccount-sagemaker-studio-private-access
- The following parameters can be configured in the general parameters file (scripts/setup/parameters/general-parameters.json):
Parameter Name | Default Value | Description |
---|---|---|
pSharedServicesProfile | infra-shared-services-profile | Profile used to deploy resources in the Shared Services Account |
pSagemakerLobAProfile | infra-lob-a-profile | Profile used to deploy resources in the Sagemaker Lob A Account |
pSagemakerLobBProfile | infra-lob-b-profile | Profile used to deploy resources in the Sagemaker Lob B Account |
region | us-east-1 | Region where resources are deployed |
pNetworkingStackName | networking | Name for the networking resources cfn stack |
pAccessAppStackName | access | Name for the access app resources cfn stack |
pSagemakerLobAStackName | sagemaker-lob-a | Name for the LOB A cfn stack |
pSagemakerLobBStackName | sagemaker-lob-b | Name for the LOB B resources cfn stack |
pOnPremiseStackName | on-premise | Name for the on-premise resources cfn stack |
pKeyPairName | sagemaker-demo-kp | Name of the key pair for the on-premise deployment |
pLocalIpAddress | 1.1.1.1/32 | IP address (in CIDR notation) allowed to access the on-premise resources |
Run the following commands in your command line:
DIRNAME=$(pwd)
GENERAL_PARAMS_FILE="${DIRNAME}/scripts/setup/parameters/general-parameters.json"
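The later steps repeat the same jq filter to read values from the general parameters file. As a convenience, that filter could be wrapped in a small function. This is only a sketch: `get_param` is a hypothetical helper that does not exist in the repository, and it assumes the file is a JSON array of objects with `ParameterKey` and `ParameterValue` fields, which is the shape the jq filters in this README expect.

```shell
# Hypothetical helper: look up one ParameterValue by its ParameterKey.
# Assumes the file is a JSON array of
#   {"ParameterKey": "...", "ParameterValue": "..."}
# objects. The second argument (file path) is optional and defaults
# to GENERAL_PARAMS_FILE.
get_param() {
  jq -r --arg key "$1" \
    '.[] | select(.ParameterKey == $key) | .ParameterValue' \
    "${2:-$GENERAL_PARAMS_FILE}"
}
```

With this in place, `REGION=$(get_param region)` would be equivalent to the longer jq commands used in the following steps.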
From the root of the repository, deploy the Shared Services Account resources by running the following command:
./scripts/setup/deploy_infra.sh -c all
This deploys two CloudFormation stacks in your Shared Services Account:
- Networking CloudFormation stack
- Access CloudFormation stack
The networking stack deploys the following resources, which are later reused by other templates:
- AWS Transit Gateway (TGW)
- AWS VPC Endpoints
  - API Gateway VPC Endpoint
  - SageMaker API VPC Endpoint
  - SageMaker Studio VPC Endpoint
  - STS VPC Endpoint
- AWS Private Hosted Zones (PHZ)
  - Amazon API Gateway PHZ
  - Amazon SageMaker API PHZ
  - Amazon SageMaker Studio PHZ
  - STS PHZ
- A Transit Gateway Resource Share with the SageMaker accounts
Notes:
- The Transit Gateway is automatically shared with the SageMaker accounts.
- If the accounts are in the same Organizational Unit and auto-accepting resource shares is enabled, there is no need to accept the resource share. Otherwise, it must be accepted in the receiving accounts.
More information about this approach:
- Automating AWS Transit Gateway attachments to a transit gateway in a central account
- AWS Organizations terminology and concepts
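When the resource share has to be accepted manually, this can be done from the receiving account with the AWS RAM commands of the AWS CLI. The following is a minimal sketch, not part of the repository: `accept_pending_shares` is a hypothetical helper, it assumes the profile is allowed to call AWS RAM, and it accepts every pending invitation visible to that profile, so review the invitations first if the account receives shares from other sources.

```shell
# Hypothetical helper: accept all pending AWS RAM invitations.
# Arguments: profile, region.
accept_pending_shares() {
  aws ram get-resource-share-invitations \
    --profile "$1" --region "$2" \
    --query "resourceShareInvitations[?status=='PENDING'].resourceShareInvitationArn" \
    --output text |
  tr '\t' '\n' |
  while read -r arn; do
    # Skip blank lines (no pending invitations).
    [ -n "$arn" ] || continue
    aws ram accept-resource-share-invitation \
      --profile "$1" --region "$2" \
      --resource-share-invitation-arn "$arn" </dev/null
  done
}
```

It would be run once per SageMaker account, for example `accept_pending_shares infra-lob-a-profile us-east-1` with the default profile name and region from the parameters table.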
The access stack template deploys:
- Two AWS Lambda functions for the solution:
  - Presigned URL generator Lambda function
  - Custom Authorizer Lambda function
- Two Amazon DynamoDB tables to store the information:
  - Users table
  - LOBs table
- An Amazon Cognito User Pool to simulate the corporate IdP
NOTE
The following outputs from the deployment are used in later steps:
- The Amazon Cognito User Pool ID
- The Amazon Cognito App Client ID
- The ARN of the role for the presigned URL generator Lambda function
- The names of the DynamoDB tables used for the users and LOBs data
The script also associates the Access Stack VPC with the STS and SageMaker API hosted zones.
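Stack outputs such as these can be read with `aws cloudformation describe-stacks`. The sketch below wraps that call in a hypothetical helper named `stack_output`, which is not part of the repository. Of the output keys, only `CognitoAppClientId` and `ApiBasePath` appear in this README; the other key names would have to be checked in the stack itself.

```shell
# Hypothetical helper: print one output value of a CloudFormation stack.
# Arguments: profile, region, stack name, output key.
stack_output() {
  aws cloudformation describe-stacks \
    --profile "$1" --region "$2" --stack-name "$3" \
    --query "Stacks[0].Outputs[?OutputKey=='$4'].OutputValue" \
    --output text
}
```

For example, `stack_output infra-shared-services-profile us-east-1 access CognitoAppClientId` would print the app client ID, assuming the default profile, region and stack name from the parameters table.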
Set up the following variables to deploy the Sagemaker LOB A deployment:
SAGEMAKER_LOB_A_STACK_NAME=$(jq -r '.[] | select(.ParameterKey == "pSagemakerLobAStackName") | .ParameterValue' ${GENERAL_PARAMS_FILE})
REGION=$(jq -r '.[] | select(.ParameterKey == "region") | .ParameterValue' ${GENERAL_PARAMS_FILE})
SAGEMAKER_LOB_A_PROFILE=$(jq -r '.[] | select(.ParameterKey == "pSagemakerLobAProfile") | .ParameterValue' ${GENERAL_PARAMS_FILE})
SAGEMAKER_LOB_A_PARAMS_FILE=sagemaker-account/blog-launch-parameters/parameters-sagemaker-account-lob-a.json
SAGEMAKER_LOB_A_TEMPLATE_FILE=file://sagemaker-account/template.yml
And run the following script to deploy in the Sagemaker Lob A Account:
scripts/setup/deploy-sagemaker.sh \
-f $SAGEMAKER_LOB_A_PARAMS_FILE \
-s $SAGEMAKER_LOB_A_STACK_NAME \
-p $SAGEMAKER_LOB_A_PROFILE \
-t $SAGEMAKER_LOB_A_TEMPLATE_FILE \
-r $REGION
Set up the following variables to deploy the SageMaker LOB B deployment:
SAGEMAKER_LOB_B_STACK_NAME=$(jq -r '.[] | select(.ParameterKey == "pSagemakerLobBStackName") | .ParameterValue' ${GENERAL_PARAMS_FILE})
REGION=$(jq -r '.[] | select(.ParameterKey == "region") | .ParameterValue' ${GENERAL_PARAMS_FILE})
SAGEMAKER_LOB_B_PROFILE=$(jq -r '.[] | select(.ParameterKey == "pSagemakerLobBProfile") | .ParameterValue' ${GENERAL_PARAMS_FILE})
SAGEMAKER_LOB_B_PARAMS_FILE=sagemaker-account/blog-launch-parameters/parameters-sagemaker-account-lob-b.json
SAGEMAKER_LOB_B_TEMPLATE_FILE=file://sagemaker-account/template.yml
And run the following script to deploy in the Sagemaker Lob B Account:
scripts/setup/deploy-sagemaker.sh \
-f $SAGEMAKER_LOB_B_PARAMS_FILE \
-s $SAGEMAKER_LOB_B_STACK_NAME \
-p $SAGEMAKER_LOB_B_PROFILE \
-t $SAGEMAKER_LOB_B_TEMPLATE_FILE \
-r $REGION
These CloudFormation templates create:
- An Amazon Virtual Private Cloud (VPC) for the Amazon Sagemaker Domain
- The Attachment of the VPC to the Shared Services Account TGW
- A private Amazon Sagemaker Domain with a user
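To confirm that a domain was created, the SageMaker domains in each account can be listed with the AWS CLI. This is a small sketch rather than a step from the repository; `list_domains` is a hypothetical helper name.

```shell
# Hypothetical helper: list the SageMaker domains visible to a profile,
# showing name, ID and status for each one.
# Arguments: profile, region.
list_domains() {
  aws sagemaker list-domains \
    --profile "$1" --region "$2" \
    --query "Domains[].[DomainName,DomainId,Status]" \
    --output table
}
```

Running `list_domains "$SAGEMAKER_LOB_A_PROFILE" "$REGION"` after the deployment should show one domain, and its status should read InService once creation has finished.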
Now we will fill in the user data in both the DynamoDB tables and the Cognito User Pool.
Run the following script:
./scripts/setup/fill-data.sh
This script populates the following elements:
- DynamoDB Users Table
PK | LOB |
---|---|
user-lob-a | lob-a |
user-lob-b | lob-b |
- DynamoDB LOBs Table
LOB | ACCOUNT_ID |
---|---|
lob-a | $SAGEMAKER_LOB_A_ACCOUNT_ID |
lob-b | $SAGEMAKER_LOB_B_ACCOUNT_ID |
- Cognito User Pool
User | Password |
---|---|
user-lob-a | UserLobA1! |
user-lob-b | UserLobB1! |
As you may notice, the user names must be consistent across the three resources:
- The DynamoDB Users Table
- The Cognito User Pool
- The SageMaker Domain
To test only the presigned URL generation, follow these steps:
- Go to the API Gateway console
- Under APIs, find and click on the access API
- Under Resources, go to the GET method of the access API {user_id+} resource
- Then click on Test
- Enter one of user-lob-a or user-lob-b as the user_id path
- Click on the Test button
- You will get a presigned URL; however, it won't work if pasted into your local computer's browser. For URL consumption and end-to-end testing, follow the steps in the Extra section
This presigned URL must be consumed through the central Studio VPC Endpoint and expires after 20 seconds, as defined in the Access Lambda function.
If we try to open it in a local browser, the message "Auth token containing insufficient permissions" is shown.
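The short lifetime corresponds to the expiry setting of the SageMaker CreatePresignedDomainUrl API. The Lambda function's code is not shown in this README, so the following is only a CLI sketch of an equivalent request, not the function's implementation. `presigned_url` is a hypothetical helper, and the domain ID and user profile name passed to it are placeholders.

```shell
# Hypothetical helper: request a presigned Studio URL that expires
# 20 seconds after it is issued.
# Arguments: profile, region, domain ID, user profile name.
presigned_url() {
  aws sagemaker create-presigned-domain-url \
    --profile "$1" --region "$2" \
    --domain-id "$3" \
    --user-profile-name "$4" \
    --expires-in-seconds 20 \
    --query AuthorizedUrl --output text
}
```

A URL obtained this way is subject to the same restriction described above: it only works when opened through the Studio VPC Endpoint.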
For simplicity, we deploy the on-premise simulator in the Central Account. Follow these steps:
- First, create a key pair in the central account. You can use the instructions in Create key pairs
- Fill in the following values in the general-parameters.json file (scripts/setup/parameters/general-parameters.json)
  - YOUR_KEY_PAIR_NAME -> name of the key pair you created before
  - YOUR_IP_TO_CONNECT_TO_BASTION -> the IP address of your local machine, ending in /32
- Then run the following script to set up the on-premise stack:
./scripts/setup/deploy-on-premise.sh
This script:
- Creates the on-premise CloudFormation Stack
- Associates the SageMaker Studio and API Gateway PHZs with the on-premise VPC
To simulate the connectivity between the on-premise environment and the cloud, we use VPC peering between the on-premise VPC and the central networking VPC. Follow the instructions to create VPC Peering
- Don't forget to accept the peering connection
- Remember to update the private subnet route tables of the on-premise and central networking VPCs to point the respective CIDRs to the peering connection.
- Once both VPCs have been peered, we could use the solution for DNS proposed in Part 1 of this series. However, in this case we have taken advantage of the previously created PHZs and associated the SageMaker Studio and API Gateway PHZs with our on-premise VPC, as we did for the Access VPC.
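The acceptance and route table updates above can also be done from the AWS CLI. The following is a minimal sketch and not part of the repository: `accept_peering` and `add_peering_route` are hypothetical helpers, and every ID and CIDR passed to them is a placeholder to be replaced with the values from your own VPCs.

```shell
# Hypothetical helpers for the peering steps; all IDs and CIDRs
# passed in are placeholders.

# Accept a pending peering connection.
# Arguments: profile, region, peering connection ID.
accept_peering() {
  aws ec2 accept-vpc-peering-connection \
    --profile "$1" --region "$2" \
    --vpc-peering-connection-id "$3"
}

# Route a destination CIDR through the peering connection.
# Arguments: profile, region, route table ID, destination CIDR,
# peering connection ID.
add_peering_route() {
  aws ec2 create-route \
    --profile "$1" --region "$2" \
    --route-table-id "$3" \
    --destination-cidr-block "$4" \
    --vpc-peering-connection-id "$5"
}
```

`add_peering_route` would be called once per private route table in each direction: the on-premise route tables point at the central networking CIDR, and the central networking route tables point at the on-premise CIDR.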
Once everything is deployed and set up, we use the bastion host to RDP into the instance in the private subnet.
- Use the following commands to retrieve the command that must be run:
ON_PREMISE_STACK_NAME=$(jq -r '.[] | select(.ParameterKey == "pOnPremiseStackName") | .ParameterValue' scripts/setup/parameters/general-parameters.json)
REGION=$(jq -r '.[] | select(.ParameterKey == "region") | .ParameterValue' ${GENERAL_PARAMS_FILE})
SHARED_SERVICES_PROFILE=$(jq -r '.[] | select(.ParameterKey == "pSharedServicesProfile") | .ParameterValue' ${GENERAL_PARAMS_FILE})
aws cloudformation describe-stacks \
--profile ${SHARED_SERVICES_PROFILE} \
--region ${REGION} \
--query "Stacks[?StackName=='${ON_PREMISE_STACK_NAME}'][].Outputs[?OutputKey=='TunnelCommand'].OutputValue" --output text
Then follow these steps to connect:
- In a terminal, from the location where the previously created EC2 key pair is stored, run the command.
  - This will create an RDP connection between our localhost and the private Windows instance.
- Then use an RDP client such as Windows Remote Desktop to connect to the instance.
  - Username: Administrator
  - Password: can be retrieved with the key pair from the running Windows instance in the EC2 console
More information about connecting to your Windows instance is available in the AWS documentation: Connect To Your Windows Instance
Once in the instance, install Firefox if it is not already installed -> Link to install firefox in the instance
To test the solution end to end, we need tokens for the users so that we can consume the access API.
To get the access tokens, run the following commands, substituting the Cognito client ID, which can be retrieved from the access stack:
- First, set up the required environment variables
REGION=$(jq -r '.[] | select(.ParameterKey == "region") | .ParameterValue' ${GENERAL_PARAMS_FILE})
SHARED_SERVICES_PROFILE=$(jq -r '.[] | select(.ParameterKey == "pSharedServicesProfile") | .ParameterValue' ${GENERAL_PARAMS_FILE})
ACCESS_STACK_NAME=$(jq -r '.[] | select(.ParameterKey == "pAccessAppStackName") | .ParameterValue' scripts/setup/parameters/general-parameters.json)
COGNITO_APP_CLIENT_ID=$(aws --profile ${SHARED_SERVICES_PROFILE} --region ${REGION} cloudformation describe-stacks --query "Stacks[?StackName=='${ACCESS_STACK_NAME}'][].Outputs[?OutputKey=='CognitoAppClientId'].OutputValue" --output text)
- These commands can be used to retrieve the access token for user-lob-a and user-lob-b respectively
aws cognito-idp initiate-auth \
--profile $SHARED_SERVICES_PROFILE \
--region $REGION \
--auth-flow USER_PASSWORD_AUTH \
--client-id $COGNITO_APP_CLIENT_ID \
--auth-parameters USERNAME=user-lob-a,PASSWORD=UserLobA1!
aws cognito-idp initiate-auth \
--profile $SHARED_SERVICES_PROFILE \
--region $REGION \
--auth-flow USER_PASSWORD_AUTH \
--client-id $COGNITO_APP_CLIENT_ID \
--auth-parameters USERNAME=user-lob-b,PASSWORD=UserLobB1!
- To test the solution inside the Windows client we need the API's URL. It can be retrieved from the outputs of the Access Solution CloudFormation stack by running this command:
echo $(aws --profile ${SHARED_SERVICES_PROFILE} --region ${REGION} cloudformation describe-stacks --query "Stacks[?StackName=='${ACCESS_STACK_NAME}'][].Outputs[?OutputKey=='ApiBasePath'].OutputValue" --output text)
To call the API for user-lob-a, the URL looks as follows:
https://{API_ID}.execute-api.{REGION}.amazonaws.com/dev/user-lob-a
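The initiate-auth commands above print the full JSON response. To capture only the access token in a variable, a `--query` expression can be added. The sketch below is not part of the repository: `get_access_token` is a hypothetical helper, and it assumes the `SHARED_SERVICES_PROFILE`, `REGION` and `COGNITO_APP_CLIENT_ID` variables are already set as shown earlier.

```shell
# Hypothetical helper: print only the access token for one user.
# Arguments: username, password.
# Relies on SHARED_SERVICES_PROFILE, REGION and COGNITO_APP_CLIENT_ID
# being set in the current shell.
get_access_token() {
  aws cognito-idp initiate-auth \
    --profile "$SHARED_SERVICES_PROFILE" \
    --region "$REGION" \
    --auth-flow USER_PASSWORD_AUTH \
    --client-id "$COGNITO_APP_CLIENT_ID" \
    --auth-parameters "USERNAME=$1,PASSWORD=$2" \
    --query AuthenticationResult.AccessToken \
    --output text
}
```

For example, `ACCESS_TOKEN=$(get_access_token user-lob-a 'UserLobA1!')`. The single quotes keep an interactive shell from interpreting the exclamation mark in the password.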
Once we have all this information, we can try to call the API Gateway API from within the Windows instance. However, we should get the following error: {message: Unauthorized}
To overcome this, we add the token to the request header.
- Go to the Firefox Developer Tools and open the Network tab
- Right-click on the failed API call (the one listed with File as user-lob-a) and click Edit and Resend
- Scroll down on the headers side and add a new header
  - Header Key: Authorization
  - Header Value: Bearer followed by the access token retrieved earlier
- Click Send
- In the response you will get the Location header; clicking it opens your JupyterLab
- Click it quickly, as you only have 20 seconds to consume it
- The first access takes some time, as SageMaker is creating the application for the user, but subsequent attempts will be faster.
In a real-world scenario, this action would be performed by an access application that automatically follows the 302 redirect and sends the user to the SageMaker app.
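As an illustration of what such an access application does with the response, the redirect target can be pulled out of the raw response headers. This is a sketch under assumptions, not the repository's client: `location_header` is a hypothetical helper, and the commented curl call uses placeholder variables and only works from a host that can resolve and reach the private API endpoint.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print
# the value of the first Location header (header names are matched
# case-insensitively).
location_header() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Example call, left commented out because it requires network access to
# the private API. API_URL and ACCESS_TOKEN are placeholders:
# curl -s -o /dev/null -D - \
#   -H "Authorization: Bearer ${ACCESS_TOKEN}" "${API_URL}" | location_header
```

Because curl is not told to follow redirects here, the 302 response is returned as is and the presigned URL can be read from its Location header.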
If we try to edit the request to send all the same information but for the user-lob-b URL we will get the following error in the response:
x-amzn-ErrorType: AccessDeniedException
The same process can be repeated, replacing every occurrence of user-lob-a with user-lob-b, and access is then granted to the LOB B domain.
- Delete the VPC Peering Connection
- Remove the associated VPCs from the PHZs. You can use the following script:
scripts/cleanup/remove-associated-vpcs.sh
- Delete the EFS Volumes for the Sagemaker Domains. See Deleting an Amazon EFS file system
- Delete any opened applications from the Sagemaker Domains as explained in Delete an Amazon Sagemaker Domain
- Run the cleanup script:
scripts/cleanup/delete-infra.sh
The script searches for the CloudFormation stacks in their respective accounts and deletes them. The delete order is as follows:
- On-Premise Stack
- Sagemaker LOB A Stack
- Sagemaker LOB B Stack
- Access App Stack
- Networking Stack
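To confirm that a stack is gone after the cleanup, the CloudFormation waiter of the AWS CLI can be used. This is a small sketch, not part of the cleanup script; `wait_stack_deleted` is a hypothetical helper name.

```shell
# Hypothetical helper: block until CloudFormation reports the stack as
# deleted, or exit non-zero if the deletion fails or the wait times out.
# Arguments: profile, region, stack name.
wait_stack_deleted() {
  aws cloudformation wait stack-delete-complete \
    --profile "$1" --region "$2" --stack-name "$3"
}
```

For example, `wait_stack_deleted infra-shared-services-profile us-east-1 networking` checks the networking stack, assuming the default profile, region and stack name from the parameters table.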
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.