# Testing the solution
Now it's time to test our new container. We reference the image in ECR by the name we gave it (`algorithm_name`) and its tag, and build the full image URI from our account ID and region.

In [None]:
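import boto3

# Look up the AWS account ID behind the current credentials
client = boto3.client('sts')
account = client.get_caller_identity()['Account']

# Use the region configured for the current session
my_session = boto3.session.Session()
region = my_session.region_name

# Repository name and tag of our extended container in ECR
algorithm_name = "huggingface-pytorch-inference-extended"
tag = "1.10.2-transformers4.24.0-gpu-py38-cu113-ubuntu20.04"
ecr_image = '{}.dkr.ecr.{}.amazonaws.com/{}:{}'.format(account, region, algorithm_name, tag)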

ecr_image
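
Before deploying, it can be worth confirming that the tag really exists in the repository, since a typo in `algorithm_name` or `tag` would otherwise tend to surface only once deployment is underway. The following is a minimal sketch, not part of the original walkthrough: the helper name `image_tag_exists` is hypothetical, and it assumes the current credentials are allowed to call `ecr:DescribeImages`.

```python
def image_tag_exists(ecr_client, repository_name, image_tag):
    """Return True if `image_tag` is present in the given ECR repository.

    `ecr_client` is expected to be a boto3 ECR client, e.g. boto3.client("ecr").
    (Hypothetical helper, shown for illustration only.)
    """
    try:
        ecr_client.describe_images(
            repositoryName=repository_name,
            imageIds=[{"imageTag": image_tag}],
        )
    except (
        ecr_client.exceptions.ImageNotFoundException,
        ecr_client.exceptions.RepositoryNotFoundException,
    ):
        # Either the repository or the tag is missing
        return False
    return True


# Usage inside the notebook (requires AWS credentials):
# import boto3
# ecr = boto3.client("ecr")
# image_tag_exists(ecr, algorithm_name, tag)
```

Any other error (for example, missing permissions) is deliberately left to propagate, so it is not mistaken for a missing image.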

In [2]:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

In [3]:
hub = {
    'HF_MODEL_ID':'bigscience/bloom-560m',
    'HF_TASK':'text-generation'
}

All we need to do when creating the `Model` object is tell it to use our own container via `image_uri`. In this case we no longer need to provide the version numbers (which are only used to identify the appropriate container), since we specify the container directly.

In [4]:
huggingface_model = HuggingFaceModel(
    image_uri=ecr_image,
    env=hub,
    role=role,
    # Not needed when image_uri is set; these only select a prebuilt container:
    # transformers_version="4.17",
    # pytorch_version="1.10",
    # py_version="py38",
)

In [5]:
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge"
)

-----------!

The model is now deployed to an endpoint, so we can send an inference request and get the expected response.

In [6]:
data = {"inputs": "And this is is the solution"}

predictor.predict(data)

[{'generated_text': 'And this is is the solution to the problem.\nI have a problem with the following code.\n#include'}]
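
The request above sends only `inputs`, and the response is a list of dictionaries with a `generated_text` key. As a rough sketch, the two helpers below build a request payload and pull the generated strings out of a response of that shape. The helper names are hypothetical, and the optional `parameters` key assumes the Hugging Face inference toolkit in the container forwards it to the text-generation pipeline, which is worth verifying against the container version in use.

```python
def build_request(prompt, **generation_kwargs):
    """Build a payload for predictor.predict().

    Extra keyword arguments (e.g. max_new_tokens) are placed under
    "parameters"; this key is an assumption about the serving container.
    """
    payload = {"inputs": prompt}
    if generation_kwargs:
        payload["parameters"] = dict(generation_kwargs)
    return payload


def extract_texts(response):
    """Return the generated strings from a text-generation response."""
    return [item["generated_text"] for item in response]


# Usage inside the notebook (requires the deployed endpoint):
# response = predictor.predict(build_request("And this is is the solution", max_new_tokens=20))
# extract_texts(response)
```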

In [None]:
# Clean up: delete the endpoint so it stops incurring charges
predictor.delete_endpoint()