#### LLM Model Evaluation

In this notebook, I deploy the Meta Llama 2 7B model and evaluate its text generation capabilities and domain knowledge. I use the SageMaker Python SDK for foundation models to deploy the model for inference.

The Llama 2 7B foundation model performs text generation: it takes a text string as input and predicts the next words in the sequence.

#### Set Up

In [1]:
!pip install ipywidgets==7.0.0 --quiet
!pip install --upgrade sagemaker datasets --quiet

#### Set up and authenticate access to AWS services

In [2]:
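import sagemaker, boto3, json
from sagemaker.session import Session

# Create a SageMaker session and look up the IAM role and region in use
sagemaker_session = Session()
aws_role = sagemaker_session.get_caller_identity_arn()
aws_region = boto3.Session().region_name
# Reuse the same session instead of creating a second one
sess = sagemaker_session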
print(aws_role)
print(aws_region)
print(sess)

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml
arn:aws:iam::417758802180:role/service-role/SageMaker-udacitySagemakerRole
us-west-2
<sagemaker.session.Session object at 0x7fc6ea99b790>


#### Select the text generation model: Meta Llama 2 7B


In [3]:
model_id, model_version = "meta-textgeneration-llama-2-7b", "2.*"

Running the next cell deploys the model to an inference endpoint using the `JumpStartModel` class from the SageMaker Python SDK.


**The next cell will take some time to run.**
You can ignore the warnings it prints.


In [4]:
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id=model_id, model_version=model_version, instance_type="ml.g5.2xlarge")
predictor = model.deploy()


For forward compatibility, pin to model_version='2.*' in your JumpStartModel or JumpStartEstimator definitions. Note that major version upgrades may have different EULA acceptance terms and input/output signatures.
Using model 'meta-textgeneration-llama-2-7b' with wildcard version identifier '2.*'. You can pin to version '2.1.8' for more stable results. Note that models may have different input/output signatures after a major version upgrade.


-------------!

#### Invoke the endpoint, query and parse response
The next step is to invoke the model endpoint: send a query to the endpoint and receive a response from the model.

Running the next cell defines a function that will be used to parse and print the response from the model. 

In [5]:
def print_response(payload, response):
    print(payload["inputs"])
    print(f"> {response[0]['generation']}")
    print("\n==================================\n")
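The `generation` key read by `print_response` matches the response shape returned in this notebook. As a hedged sketch, the helper below (a hypothetical name, `extract_generation`) also accepts a `generated_text` key. That second shape is an assumption about other model versions or serving containers, not something this notebook confirms.

```python
def extract_generation(response):
    """Return the generated text from a model response.

    Shapes handled (only the first is confirmed by this notebook;
    the others are assumptions):
      [{"generation": "..."}]
      [{"generated_text": "..."}] or {"generated_text": "..."}
    """
    # Unwrap the first element when the response is a list
    item = response[0] if isinstance(response, list) else response
    for key in ("generation", "generated_text"):
        if key in item:
            return item[key]
    raise KeyError(f"No known text key in response: {list(item)}")
```

Failing loudly on an unknown shape is a deliberate choice here: a silent empty string would be easy to mistake for an empty completion.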

The model takes a text string as input and predicts the next words in the sequence. The input we send is the prompt.

The prompt should relate to the domain we'd like to fine-tune the model on. This way we identify the model's domain knowledge before it is fine-tuned, and we can then run the same prompts on the fine-tuned model.


**For the financial domain:**

Replace the value of `"inputs"` in the payload with one of the sentences below:
- "The investment tests performed indicate"
- "the relative volume for the long out of the money options, indicates"
- "The results for the short in the money options"
- "The results are encouraging for aggressive investors"
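To run all four prompts with identical settings, the payloads can be built in one place. This is a minimal sketch: `FINANCE_PROMPTS` and `build_payload` are hypothetical names introduced for illustration, and the default parameter values are the ones used in this notebook's payload.

```python
FINANCE_PROMPTS = [
    "The investment tests performed indicate",
    "the relative volume for the long out of the money options, indicates",
    "The results for the short in the money options",
    "The results are encouraging for aggressive investors",
]


def build_payload(prompt, max_new_tokens=64, top_p=0.9, temperature=0.6):
    """Build a request payload with the same parameters for every prompt."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "top_p": top_p,
            "temperature": temperature,
            "return_full_text": False,
        },
    }


# One payload per prompt; nothing is sent to the endpoint here
payloads = [build_payload(prompt) for prompt in FINANCE_PROMPTS]
```

Each payload can then be passed to `predictor.predict(payload, custom_attributes="accept_eula=true")`. Keeping the parameters fixed means differences between the base and fine-tuned outputs come from the model rather than from the sampling settings, although sampling with a non-zero temperature still varies from run to run.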


In [6]:
payload = {
    "inputs": "The investment tests performed indicate",
    "parameters": {
        "max_new_tokens": 64,
        "top_p": 0.9,
        "temperature": 0.6,
        "return_full_text": False,
    },
}
try:
    response = predictor.predict(payload, custom_attributes="accept_eula=true")
    print_response(payload, response)
except Exception as e:
    print(e)

The investment tests performed indicate
>  that the investment is acceptable.
Investment tests performed:
Average Payback Period (APBP): 5.35 years
Net Present Value (NPV): 3.77%
Internal Rate of Return (IRR): 15.23%





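The output from the model without fine-tuning offers little insightful or domain-relevant content. The completion is generic, and the figures it lists are generated text rather than the result of any actual analysis.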
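Because the same prompts are meant to be rerun on the fine-tuned model, it can help to keep the baseline output in a file. The sketch below is one way to do that; `save_baseline`, `load_baseline`, and the record layout are hypothetical and not part of the original notebook. The sample record uses the prompt and the first line of the completion shown above.

```python
import json
from pathlib import Path


def save_baseline(records, path):
    """Write a list of {"prompt", "generation"} records to a JSON file."""
    Path(path).write_text(json.dumps(records, indent=2), encoding="utf-8")


def load_baseline(path):
    """Read the records back for comparison with the fine-tuned model."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Example record built from the prompt and output shown in this notebook
records = [
    {
        "prompt": "The investment tests performed indicate",
        "generation": " that the investment is acceptable.",
    }
]
```

Calling `save_baseline(records, "baseline_responses.json")` (an illustrative file name) before deleting the endpoint keeps the pre-fine-tuning output available for a side-by-side comparison later.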




**Run the cell below to delete the model deployment.** Deleting the endpoint stops it from accruing further charges.



In [7]:
# Delete the SageMaker endpoint and the attached resources
predictor.delete_model()
predictor.delete_endpoint()
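If `delete_model()` raises, the cell above never reaches `delete_endpoint()`, and the endpoint keeps running. A hedged sketch of a more defensive cleanup follows; `cleanup` is a hypothetical helper, and it assumes that continuing after a failed step is acceptable in your account.

```python
def cleanup(predictor):
    """Delete the model, then the endpoint, attempting both even if one fails.

    Returns a list of error messages; an empty list means both calls succeeded.
    """
    errors = []
    for step in (predictor.delete_model, predictor.delete_endpoint):
        try:
            step()
        except Exception as exc:
            # Record the failure and continue so the next deletion still runs
            errors.append(f"{step.__name__}: {exc}")
    return errors
```

Running `cleanup(predictor)` and checking that the returned list is empty confirms that both deletions were attempted without an exception.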