# Shinrai Text Generation LLM Script

Derived from the AWS example notebook `text2text-generation-flan-t5.ipynb`.


## 1. Set Up

In [4]:
!pip install ipywidgets==7.0.0 --quiet
!pip install --upgrade sagemaker --quiet

import sagemaker, boto3, json
from sagemaker.session import Session

sagemaker_session = Session()
aws_role = sagemaker_session.get_caller_identity_arn()  # IAM role ARN used for permissions
aws_region = boto3.Session().region_name
sess = sagemaker_session  # alias for the same session, rather than creating a second one

## 2. Select Model

In [1]:
# List of available pretrained models at: https://sagemaker.readthedocs.io/en/stable/doc_utils/pretrainedmodels.html

In [2]:
model_id = "huggingface-text2text-flan-t5-xl"
model_version = "*"
inference_instance_type = "ml.g5.2xlarge"
endpoint_name = f"KC-ShinrAI-{model_id}"
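Because the endpoint name is built from the model ID, it can help to check it against SageMaker's naming rules before deploying. The sketch below assumes the constraint documented for the `CreateEndpoint` API (at most 63 characters, alphanumerics separated by hyphens, no leading or trailing hyphen). `is_valid_endpoint_name` is a hypothetical helper and not part of the SageMaker SDK.

```python
import re

# Assumption: this pattern mirrors the documented CreateEndpoint naming rule.
_ENDPOINT_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
_MAX_ENDPOINT_NAME_LENGTH = 63


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if `name` fits the assumed SageMaker endpoint naming rule."""
    if len(name) > _MAX_ENDPOINT_NAME_LENGTH:
        return False
    return _ENDPOINT_NAME_PATTERN.fullmatch(name) is not None


# Same values as in the cell above, repeated so the sketch is self-contained.
model_id = "huggingface-text2text-flan-t5-xl"
endpoint_name = f"KC-ShinrAI-{model_id}"
print(is_valid_endpoint_name(endpoint_name))
```

Note that a fixed name like this one can only be used by one live endpoint at a time, so a second deployment under the same name will fail until the first endpoint is deleted.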

## 3. Retrieve Artifacts & Deploy an Endpoint

***

Using SageMaker, we can perform inference on the pre-trained model, even without fine-tuning it first on a new dataset. We start by retrieving the `deploy_image_uri`, `deploy_source_uri`, and `model_uri` for the pre-trained model. To host the pre-trained model, we create an instance of [`sagemaker.model.Model`](https://sagemaker.readthedocs.io/en/stable/api/inference/model.html) and deploy it. This may take a few minutes.

***

In [7]:
from sagemaker import image_uris, model_uris, script_uris, hyperparameters
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.utils import name_from_base


# Retrieve the inference Docker container URI. This is the base Hugging Face container image for the model selected above.
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,  # automatically inferred from model_id
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=inference_instance_type,
)

# Retrieve the inference script URI. This includes all dependencies and scripts for model loading, inference handling, etc.
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)

# Retrieve the URI of the pre-trained model artifacts.
model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="inference"
)

# Create the SageMaker model instance.
# For these large models, the inference script and model artifacts are
# already repacked together, so the `source_dir` argument to Model is not required.
model = Model(
    image_uri=deploy_image_uri,
    model_data=model_uri,
    role=aws_role,
    predictor_cls=Predictor,
    name=endpoint_name
)
    

# Deploy the model. When deploying through the Model class, pass the Predictor class
# so that inference can be run through the SageMaker API.
model_predictor = model.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    predictor_cls=Predictor,
    endpoint_name=endpoint_name,
)
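Once the endpoint is in service, it can be queried through the SageMaker runtime API. The following is a minimal sketch rather than the definitive interface: the payload key `text_inputs`, the response key `generated_texts`, and the `application/json` content type are assumptions based on the JumpStart Flan-T5 example this script derives from, and the helper names `build_payload`, `parse_response`, and `query_endpoint` are hypothetical.

```python
import json


def build_payload(prompt: str, max_length: int = 50, **generation_params) -> bytes:
    """Encode a prompt and generation parameters as a JSON request body."""
    # Assumption: the container expects the prompt under "text_inputs".
    payload = {"text_inputs": prompt, "max_length": max_length}
    payload.update(generation_params)
    return json.dumps(payload).encode("utf-8")


def parse_response(body: bytes) -> list:
    """Extract the generated texts from a JSON response body."""
    # Assumption: the container returns a "generated_texts" list.
    predictions = json.loads(body)
    return predictions["generated_texts"]


def query_endpoint(endpoint_name: str, prompt: str, **generation_params) -> list:
    """Send a prompt to a deployed endpoint and return the generated texts."""
    import boto3  # imported here so the pure helpers above work without AWS access

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt, **generation_params),
    )
    return parse_response(response["Body"].read())
```

With the endpoint deployed, a call might look like `query_endpoint(endpoint_name, "Translate to German: My name is Arthur", max_length=30)`. Requests are billed against the running instance, so this belongs before the clean-up step.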


## 4. Clean up the endpoint

In [6]:
# Delete the SageMaker model and endpoint so the instance stops incurring charges
model_predictor.delete_model()
model_predictor.delete_endpoint()
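If the first delete call raises (for example, because the model was already removed), the second call never runs and the endpoint keeps running. One way to guard against that is sketched below. `cleanup` is a hypothetical helper, not part of the SageMaker SDK; it assumes only that the predictor exposes the `delete_model()` and `delete_endpoint()` methods used above.

```python
def cleanup(predictor) -> list:
    """Attempt both delete steps and return a list of (step_name, exception) pairs.

    The order matches the cell above: the model first, then the endpoint.
    A failure in one step does not prevent the other from running.
    """
    errors = []
    for step in ("delete_model", "delete_endpoint"):
        try:
            getattr(predictor, step)()
        except Exception as exc:  # broad on purpose: always try the remaining step
            errors.append((step, exc))
    return errors
```

Calling `cleanup(model_predictor)` returns an empty list when both steps succeed; any entries in the list name the step that failed, which can then be retried or checked in the SageMaker console.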