# Deploy the AMPLIFY Hugging Face Model on SageMaker Inference

### Important: Setting the Jupyter Working Directory

This notebook assumes that your working directory is set to the following path: `<repo-root>/framework-integrations/sagemaker/inference`

#### How to Check and Set the Working Directory
Before running the notebook, you can verify that your working directory is correct by running the following command:

In [None]:
import os
print(os.getcwd())

If the output is not `<repo-root>/framework-integrations/sagemaker/inference`, you can set the working directory manually by running:

In [None]:
os.chdir('/path/to/your/repository/framework-integrations/sagemaker/inference')
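
The check and the manual fix above can be folded into a single guard cell. The sketch below is one way to do that; the helper name `in_expected_dir` is hypothetical, and it compares only the trailing path components, so it assumes the repository layout described above.

```python
from pathlib import Path

# Trailing path components this notebook expects (taken from the text above).
EXPECTED_SUFFIX = ("framework-integrations", "sagemaker", "inference")


def in_expected_dir(path):
    """Return True if `path` ends with the expected directory components."""
    parts = Path(path).parts
    return tuple(parts[-len(EXPECTED_SUFFIX):]) == EXPECTED_SUFFIX


# Warn early instead of failing later when relative paths such as code/hf/ are resolved.
if not in_expected_dir(Path.cwd()):
    print(f"Unexpected working directory: {Path.cwd()}")
```

Running this as the first cell surfaces a wrong working directory before any relative path is used.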

### Setup Environment

First, ensure that the necessary packages are installed:

In [1]:
!pip install boto3 sagemaker -U -q

### Initialize your SageMaker session and IAM role

Before deploying the model, set up the SageMaker session and specify your IAM role:

In [5]:
import boto3
from sagemaker import get_execution_role, Session, image_uris
from sagemaker.pytorch import PyTorchModel

sagemaker_session = Session()
region = boto3.Session().region_name
execution_role = get_execution_role()
default_bucket = sagemaker_session.default_bucket()

### Specify the model ID and the instance details

In [6]:
instance_type = 'ml.g5.2xlarge'

model_version = "AMPLIFY_120M"
model_id = f"chandar-lab/{model_version}"

### Specify the container image for inference

In [7]:
image = image_uris.retrieve(
    region=region,
    framework='pytorch',
    image_scope='inference',
    version='2.3',
    base_framework_version='pytorch2.0.0',
    instance_type=instance_type
)
image

'763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference:2.3-gpu-py311'

### Create a dummy `.tar.gz` model artifact to provide an S3 reference

In [8]:
model_data_location = f"s3://{default_bucket}/dummy-model.tar.gz"
!mkdir -p dummy-model
!touch dummy-model/dummy.txt
!tar -czvf dummy-model.tar.gz dummy-model/
!aws s3 cp dummy-model.tar.gz {model_data_location}

dummy-model/
dummy-model/dummy.txt
upload: ./dummy-model.tar.gz to s3://sagemaker-us-west-2-111918798052/dummy-model.tar.gz
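
The shell commands above can also be written with the Python standard library, which avoids the error when the directory already exists. This is a minimal sketch of the archive step only; `build_dummy_archive` is a hypothetical helper, and the upload to S3 is left to the `aws s3 cp` command shown above.

```python
import tarfile
import tempfile
from pathlib import Path


def build_dummy_archive(work_dir):
    """Create dummy-model.tar.gz inside `work_dir` and return its path."""
    work_dir = Path(work_dir)
    model_dir = work_dir / "dummy-model"
    model_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    (model_dir / "dummy.txt").touch()

    archive_path = work_dir / "dummy-model.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the same top-level folder name as the shell version
        tar.add(model_dir, arcname="dummy-model")
    return archive_path


# Build the archive in a throwaway directory and list its members.
with tempfile.TemporaryDirectory() as tmp:
    archive = build_dummy_archive(tmp)
    with tarfile.open(archive) as tar:
        print(tar.getnames())
```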


### Deploy the Model with the Custom Inference Script

In [14]:
sm_model = PyTorchModel(
    model_data=model_data_location,
    role=execution_role,
    image_uri=image,
    source_dir="code/hf/",
    entry_point="inference_hf.py",
    env={
        'AMPLIFY_MODEL_ID': model_id
    }
)
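
The contents of `code/hf/inference_hf.py` are not shown in this notebook. As a rough illustration only, an entry point for the SageMaker PyTorch serving container usually defines the handler functions `model_fn`, `input_fn`, `predict_fn`, and `output_fn`. The sketch below assumes that the model is pulled from the Hugging Face Hub using the `AMPLIFY_MODEL_ID` environment variable set above, that it loads through `transformers` with `trust_remote_code=True`, and that the forward pass returns an object with a `logits` attribute. The actual script in the repository may differ.

```python
import json
import os


def model_fn(model_dir):
    """Load model and tokenizer. `model_dir` holds the unpacked dummy artifact and is ignored."""
    # Imported lazily so the JSON helpers below work without these packages installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    model_id = os.environ["AMPLIFY_MODEL_ID"]  # set through `env` on the PyTorchModel
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True).to(device).eval()
    return {"model": model, "tokenizer": tokenizer, "device": device}


def input_fn(request_body, request_content_type):
    """Parse a JSON request of the form {"sequence": "..."}."""
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    return json.loads(request_body)["sequence"]


def predict_fn(sequence, bundle):
    """Tokenize one sequence and return logits as nested lists (assumed output layout)."""
    import torch

    tokens = bundle["tokenizer"].encode(sequence, return_tensors="pt").to(bundle["device"])
    with torch.no_grad():
        output = bundle["model"](tokens)
    return output.logits.cpu().tolist()  # assumption: the model output exposes `.logits`


def output_fn(prediction, accept):
    """Serialize the prediction under an "output" key, as in the response printed later."""
    return json.dumps({"output": prediction})
```

If the container image does not already include `transformers`, a `requirements.txt` placed next to the script in `source_dir` is one common way to have it installed at startup.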

In [15]:
base_name = model_id.split("/")[-1].lower().replace("_", "-").replace(".", "-")

endpoint_name = f"{base_name}-hf-endpoint"

predictor = sm_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    endpoint_name=endpoint_name
)

----------------!
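
The `base_name` transformation above is needed because SageMaker endpoint names accept only alphanumeric characters and hyphens, up to 63 characters. A small helper can make that rule explicit and fail before `deploy` is called. `to_endpoint_name` is a hypothetical name, and the regular expression is written from the `EndpointName` pattern in the SageMaker API reference, so treat it as an assumption to verify against the current documentation.

```python
import re

# Alphanumerics and hyphens, starting and ending with an alphanumeric, at most 63 characters.
ENDPOINT_NAME_RE = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")


def to_endpoint_name(model_id, suffix="hf-endpoint"):
    """Derive an endpoint name from a Hugging Face model ID, as in the cell above."""
    base_name = model_id.split("/")[-1].lower().replace("_", "-").replace(".", "-")
    name = f"{base_name}-{suffix}"
    if not ENDPOINT_NAME_RE.fullmatch(name):
        raise ValueError(f"Invalid SageMaker endpoint name: {name!r}")
    return name


print(to_endpoint_name("chandar-lab/AMPLIFY_120M"))  # amplify-120m-hf-endpoint
```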

### Make a single prediction 

In [16]:
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor.serializer = JSONSerializer()  
predictor.deserializer = JSONDeserializer()  

sequence = "MSVVGIDLGFQSCYVAVARAGGIETIANEYSDRCTPACISFGPKNR"

result = predictor.predict({"sequence": sequence})

print(f"Logits: {result}")

Logits: {'output': [[[3.6029396057128906, 3.604288101196289, 3.603386878967285, 3.604475498199463, 3.6039066314697266, 9.026823043823242, 21.575443267822266, 23.350740432739258, 22.799545288085938, 22.21979331970215, 22.191137313842773, 22.64295196533203, 21.981008529663086, 22.412845611572266, 21.513126373291016, 22.321914672851562, 21.817113876342773, 21.944229125976562, 21.61518096923828, 21.54083824157715, 21.174114227294922, 22.455541610717773, 21.910043716430664, 21.293716430664062, 19.359317779541016, 20.684045791625977, 3.6044530868530273], [6.290011405944824, 6.290861129760742, 6.29034423828125, 6.290811061859131, 6.29060173034668, 10.12348747253418, 20.438899993896484, 20.94892692565918, 20.452167510986328, 21.4377384185791, 20.80020523071289, 20.440168380737305, 20.0839786529541, 20.676856994628906, 20.08152961730957, 20.45177459716797, 20.069684982299805, 19.504039764404297, 20.25322151184082, 19.706071853637695, 19.98957633972168, 19.15755271911621, 29.23740005493164, 18.6
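
The response appears to nest the logits by batch, then sequence position, then vocabulary entry. To reduce them to one predicted token index per position, take the argmax over the innermost list. The sketch below runs on a toy response rather than real model output, and `predicted_token_ids` is a hypothetical helper. Mapping the indices back to amino-acid symbols would require the model's tokenizer, which is not covered here.

```python
def predicted_token_ids(result):
    """Return, for each sequence in the batch, the argmax index at every position."""
    return [
        [max(range(len(position)), key=position.__getitem__) for position in sequence]
        for sequence in result["output"]
    ]


# Toy response: one sequence, two positions, three vocabulary entries.
toy_result = {"output": [[[0.1, 2.5, 0.3], [4.0, 1.0, 0.2]]]}
print(predicted_token_ids(toy_result))  # [[1, 0]]
```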

### Delete Endpoint

In [20]:
# Uncomment to delete the endpoint when you are finished, so it stops incurring charges.
# predictor.delete_endpoint()