## Running a model with optimized inference by providing only the model ID
To get started, read the [LMI starting guide](https://docs.djl.ai/docs/serving/serving/docs/lmi/user_guides/starting-guide.html).

In [8]:
# Assumes SageMaker Python SDK is installed. For example: "pip install sagemaker"
import sagemaker
from sagemaker import image_uris, Model, Predictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

# Set up the IAM role and SageMaker session
iam_role = sagemaker.get_execution_role()
sagemaker_session = sagemaker.session.Session()
# Use the public attribute rather than the private _region_name
region = sagemaker_session.boto_region_name

In [9]:
# Fetch the URI of the LMI container that supports the vLLM, LMI-Dist, and HuggingFace Accelerate backends
lmi_image_uri = image_uris.retrieve(framework="djl-lmi", version="0.28.0", region=region)

# Create the SageMaker Model object. In this example, LMI configures the deployment settings based on the model architecture.
model = Model(
  image_uri=lmi_image_uri,
  role=iam_role,
  env={
    "HF_MODEL_ID": "TheBloke/Llama-2-7B-fp16",
  }
)

# Deploy your model to a SageMaker Endpoint and create a Predictor to make inference requests
endpoint_name = sagemaker.utils.name_from_base("llama-7b-endpoint")
model.deploy(instance_type="ml.g5.2xlarge", initial_instance_count=1, endpoint_name=endpoint_name)

----------------!
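
The deployment above lets LMI choose its own settings. To pin some of them yourself, add them to the same `env` dictionary. The sketch below uses a hypothetical helper, `build_lmi_env`, which is not part of the SageMaker SDK. It assumes the option names `TENSOR_PARALLEL_DEGREE`, `OPTION_ROLLING_BATCH`, and `OPTION_MAX_ROLLING_BATCH_SIZE` from the LMI documentation, so check them against the guide for your container version. The helper converts every value to a string, because SageMaker passes container environment variables as a string-to-string map.

```python
def build_lmi_env(model_id, **options):
    """Return an env dict for sagemaker.Model with every value as a string.

    `options` holds optional LMI settings. None values are skipped so that
    LMI falls back to its own defaults for them.
    """
    if not model_id:
        raise ValueError("model_id must be a non-empty string")
    env = {"HF_MODEL_ID": model_id}
    for name, value in options.items():
        if value is not None:
            env[name.upper()] = str(value)
    return env


env = build_lmi_env(
    "TheBloke/Llama-2-7B-fp16",
    TENSOR_PARALLEL_DEGREE=1,          # assumed option name; see the LMI guide
    OPTION_ROLLING_BATCH="vllm",       # assumed option name and value
    OPTION_MAX_ROLLING_BATCH_SIZE=32,  # assumed option name
)
print(env)
```

The resulting dictionary can be passed as `env=env` when constructing `Model`.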

In [12]:
# Create a predictor from the endpoint name. If multiple components are deployed, also pass the component name.
#component_name="your_name"
#endpoint_name="endpoint_name"
predictor = Predictor(
  #component_name=component_name,
  endpoint_name=endpoint_name,
  sagemaker_session=sagemaker_session,
  serializer=JSONSerializer(),
  deserializer=JSONDeserializer(),
)

In [13]:
# Make an inference request against the Llama-2-7B endpoint
outputs = predictor.predict({
  "inputs": "The diamondback terrapin was the first reptile to be",
  "parameters": {
    "do_sample": True,
    "max_new_tokens": 256,
  }
})

outputs

{'generated_text': ' made, and no, this isn\'t a joke. Some years ago the consulting firm the Boston Consulting Group thought up a new strategy for the giant German windows and doors specialist INTERIOR Concepts in Munich. Among themselves, the consultants called the strategy "The Diamondback Terrapin." That was the name of a rare turtle that\'s a native of the Atlantic Coast and whose markings make it all the more desirable.\nIn the native countries of the German subsidiaries in Belgium and the United States (formerly Germany), the terrapin is considered a landmark. INTERIOR Concepts wants better sales figures from its foreign subsidiaries in Caernarfon / Wales, Freiburg, Dubai and Texas. Former top managers were replaced one by one until Dr. Aldo Kepp before long but once again in charge of marketing and sales, at the beginning of this year.\nProvided INTERIOR Concepts is succeeded in optimizing sales, the diamondback terrapin could "win this year\'s Olympic gold-selling regions." Th
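
The response shown above is a dictionary with a single `generated_text` key, and it contains only the continuation, not the prompt. Below is a minimal sketch of two hypothetical helpers, assuming that request and response shape: `build_payload` assembles the request body, and `full_text` joins the prompt with the generated continuation. The `sample_response` value is a stand-in for illustration, not real endpoint output.

```python
def build_payload(prompt, max_new_tokens=256, do_sample=True, **extra_parameters):
    """Build a request body in the shape used by the predict call above."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    parameters = {"do_sample": do_sample, "max_new_tokens": max_new_tokens}
    parameters.update(extra_parameters)
    return {"inputs": prompt, "parameters": parameters}


def full_text(prompt, response):
    """Return the prompt followed by the generated continuation."""
    try:
        generated = response["generated_text"]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"unexpected response shape: {response!r}") from exc
    return prompt + generated


prompt = "The diamondback terrapin was the first reptile to be"
payload = build_payload(prompt, max_new_tokens=64)

# In the notebook this would be: response = predictor.predict(payload)
sample_response = {"generated_text": " made"}  # stand-in value for illustration
print(full_text(prompt, sample_response))
```

Raising on an unexpected response shape surfaces a container or configuration mismatch right away instead of as a `KeyError` further downstream.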