# Deployment

This notebook deploys a Foundation AI model to Amazon SageMaker AI.
You can then use the deployed model for inference. Refer to [the inference notebook](https://github.com/RobustIntelligence/foundation-ai-cookbook/blob/main/3_adoptions/deployment/sagemaker/foundation_sec_8b/inference.ipynb) for sample code.

As a prerequisite, launch JupyterLab on SageMaker in your AWS environment. For more details, see
https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-jl.html

In [1]:
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


In [2]:
MODEL_NAME = 'fdtn-ai/Foundation-Sec-8B'
HUGGING_FACE_HUB_TOKEN = '' # Your Hugging Face Token
INSTANCE_TYPE = 'ml.g5.2xlarge'
TIMEOUT = 900

# Environment variables passed to the Hugging Face LLM container; adjust as needed
hub = {
    'HF_MODEL_ID': MODEL_NAME,
    'HF_TASK': 'text-generation',
    'SM_NUM_GPUS': '1',
    'MAX_INPUT_LENGTH': '2048',
    'MAX_TOTAL_TOKENS': '4096',
    'HUGGING_FACE_HUB_TOKEN': HUGGING_FACE_HUB_TOKEN
}
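
The two token limits above interact: `MAX_TOTAL_TOKENS` covers the prompt plus the generated text, so it has to be larger than `MAX_INPUT_LENGTH`, and the difference is the room left for generation at the maximum prompt length. The sketch below is a small sanity check you could run before deploying. `generation_budget` and `example_env` are hypothetical names introduced here for illustration; they are not part of the SageMaker SDK or of this notebook.

```python
def generation_budget(env: dict) -> int:
    """Validate the container settings and return the tokens left for generation."""
    # Container environment variables are passed as strings.
    non_strings = [key for key, value in env.items() if not isinstance(value, str)]
    if non_strings:
        raise TypeError(f"environment values must be strings: {non_strings}")

    max_input = int(env['MAX_INPUT_LENGTH'])
    max_total = int(env['MAX_TOTAL_TOKENS'])
    # The total budget covers prompt plus generated tokens,
    # so it must exceed the input limit.
    if max_input >= max_total:
        raise ValueError('MAX_INPUT_LENGTH must be smaller than MAX_TOTAL_TOKENS')
    return max_total - max_input


# Same limits as the configuration above (model ID and token omitted).
example_env = {'SM_NUM_GPUS': '1', 'MAX_INPUT_LENGTH': '2048', 'MAX_TOTAL_TOKENS': '4096'}
print(generation_budget(example_env))  # 2048
```

With the values used in this notebook, a prompt that fills the full 2048-token input limit still leaves 2048 tokens for the response.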

In [3]:
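# Use the notebook's execution role when running inside SageMaker
try:
    role = sagemaker.get_execution_role()
except ValueError:
    # Otherwise, look up the role ARN by name (replace with your own role name)
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']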

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.2.0"),
    env=hub,
    role=role,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type=INSTANCE_TYPE,
    container_startup_health_check_timeout=TIMEOUT,
)

# Get the endpoint name to use for inference (it is also shown in the AWS console)
# print(predictor.endpoint_name)

-------------!

Use the predictor's endpoint name for [inference](https://github.com/RobustIntelligence/foundation-ai-cookbook/blob/main/3_adoptions/deployment/sagemaker/foundation_sec_8b/inference.ipynb). You can also find the endpoint name in SageMaker Studio's console.
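
For orientation, here is a minimal sketch of calling the endpoint by name with `boto3`, independent of the linked inference notebook (which remains the reference for sample code). It assumes the container accepts the usual text-generation JSON body (`inputs` plus `parameters`) and returns a list containing a `generated_text` field. `build_request`, `generate`, and the endpoint name in the usage comment are hypothetical names introduced here.

```python
import json


def build_request(prompt: str, max_new_tokens: int = 128) -> str:
    """Serialize a prompt into a JSON body for the text-generation container."""
    return json.dumps({
        'inputs': prompt,
        'parameters': {'max_new_tokens': max_new_tokens},
    })


def generate(endpoint_name: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Send a prompt to the endpoint and return the generated text."""
    import boto3  # imported here so build_request works without AWS access

    runtime = boto3.client('sagemaker-runtime')
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        Body=build_request(prompt, max_new_tokens).encode('utf-8'),
    )
    # Assumption: the response is a JSON list like [{"generated_text": "..."}].
    return json.loads(response['Body'].read())[0]['generated_text']


# Hypothetical usage; replace the endpoint name with the one from your deployment.
# print(generate('my-foundation-sec-8b-endpoint', 'CVE-2021-44228 is a'))
```

Keep prompts within the `MAX_INPUT_LENGTH` configured at deployment, and size `max_new_tokens` so that prompt plus output stays under `MAX_TOTAL_TOKENS`.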