# Advanced Code Generation with Mixtral 8x7B


## Setup
***

In [2]:
model_id = "huggingface-llm-mixtral-8x7b-instruct"

In [3]:
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id=model_id)
predictor = model.deploy()

Using model 'huggingface-llm-mixtral-8x7b-instruct' with wildcard version identifier '*'. You can pin to version '1.3.1' for more stable results. Note that models may have different input/output signatures after a major version upgrade.



### Supported parameters

***
This model supports many parameters while performing inference. They include:

* **max_length:** Model generates text until the output length (which includes the input context length) reaches `max_length`. If specified, it must be a positive integer.
* **max_new_tokens:** Model generates text until the output length (excluding the input context length) reaches `max_new_tokens`. If specified, it must be a positive integer.
* **num_beams:** Number of beams used in beam search. If specified, it must be an integer greater than or equal to `num_return_sequences`.
* **no_repeat_ngram_size:** Model ensures that a sequence of words of `no_repeat_ngram_size` is not repeated in the output sequence. If specified, it must be a positive integer greater than 1.
* **temperature:** Controls the randomness of the output. A higher temperature makes low-probability words more likely to appear in the output sequence; a lower temperature favors high-probability words. As `temperature` -> 0, generation approaches greedy decoding. If specified, it must be a positive float.
* **early_stopping:** If True, text generation is finished when all beam hypotheses reach the end of sentence token. If specified, it must be boolean.
* **do_sample:** If True, sample the next word as per the likelihood. If specified, it must be boolean.
* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.
* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.
* **return_full_text:** If True, input text will be part of the output generated text. If specified, it must be boolean. The default value for it is False.
* **stop:** If specified, it must be a list of strings. Text generation stops if any one of the specified strings is generated.

We may specify any subset of the parameters mentioned above when invoking an endpoint. Next, we show examples of how to invoke the endpoint with these arguments.
***
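As a minimal sketch of how a subset of these parameters might be assembled into a request, the hypothetical helper `build_payload` below (not part of the SageMaker SDK) checks a few of the constraints listed above and returns a dictionary in the `{"inputs": ..., "parameters": ...}` shape used by the examples in this notebook. The checks mirror the list above, and the strict bounds on `top_p` are an assumption; the endpoint itself may enforce different or additional rules.

```python
from typing import Any, Dict


def build_payload(prompt: str, **parameters: Any) -> Dict[str, Any]:
    """Hypothetical helper: check a few documented constraints, then build a payload."""
    max_new_tokens = parameters.get("max_new_tokens")
    if max_new_tokens is not None and not (isinstance(max_new_tokens, int) and max_new_tokens > 0):
        raise ValueError("max_new_tokens must be a positive integer")

    temperature = parameters.get("temperature")
    if temperature is not None and not temperature > 0:
        raise ValueError("temperature must be a positive float")

    # Assumption: "between 0 and 1" is treated as an open interval here.
    top_p = parameters.get("top_p")
    if top_p is not None and not 0 < top_p < 1:
        raise ValueError("top_p must be a float between 0 and 1")

    stop = parameters.get("stop")
    if stop is not None and not (isinstance(stop, list) and all(isinstance(s, str) for s in stop)):
        raise ValueError("stop must be a list of strings")

    return {"inputs": prompt, "parameters": dict(parameters)}


payload = build_payload(
    "<s>[INST] Write a Python function that reverses a string. [/INST] ",
    max_new_tokens=256,
    temperature=0.2,
    top_p=0.9,
    stop=["</s>"],
)
print(payload["parameters"])
```

A dictionary built this way could then be passed to `predictor.predict(payload)`.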

## Instruction prompts
***
The examples in this notebook query the Mixtral 8x7B Instruct model. The model expects special token formatting within the prompt input: each user turn is wrapped in `[INST]` and `[/INST]`, and each turn begins with `<s>`.
***

In [4]:
from typing import Any, Dict, List


def format_instructions(instructions: List[Dict[str, str]]) -> str:
    """Format instructions where conversation roles must alternate user/assistant/user/assistant/..."""
    prompt: List[str] = []
    # Pair each earlier user turn with the assistant answer that followed it.
    for user, answer in zip(instructions[::2], instructions[1::2]):
        prompt.extend(["<s>", "[INST] ", (user["content"]).strip(), " [/INST] ", (answer["content"]).strip(), "</s>"])
    # Leave the final user turn open so the model generates the next answer.
    prompt.extend(["<s>", "[INST] ", (instructions[-1]["content"]).strip(), " [/INST] "])
    return "".join(prompt)


def print_prompt_and_response(prompt: str, response: List[Dict[str, Any]]) -> None:
    """Print the prompt and the first generated text under bold labels."""
    bold, unbold = '\033[1m', '\033[0m'
    print(f"{bold}> Input{unbold}\n{prompt}\n\n{bold}> Output{unbold}\n{response[0]['generated_text']}\n")
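To make the alternating-role format concrete, here is a small sketch that runs `format_instructions` on a multi-turn conversation. The conversation content is illustrative and not taken from this notebook, and the function is repeated only so the snippet runs on its own.

```python
from typing import Dict, List


def format_instructions(instructions: List[Dict[str, str]]) -> str:
    """Same logic as the cell above, repeated so this sketch is self-contained."""
    prompt: List[str] = []
    for user, answer in zip(instructions[::2], instructions[1::2]):
        prompt.extend(["<s>", "[INST] ", user["content"].strip(), " [/INST] ", answer["content"].strip(), "</s>"])
    prompt.extend(["<s>", "[INST] ", instructions[-1]["content"].strip(), " [/INST] "])
    return "".join(prompt)


# Illustrative conversation: two user turns around one assistant turn.
conversation = [
    {"role": "user", "content": "Write a Python function that adds two numbers."},
    {"role": "assistant", "content": "def add(a, b): return a + b"},
    {"role": "user", "content": "Now add type hints."},
]

multi_turn_prompt = format_instructions(conversation)
print(multi_turn_prompt)
# <s>[INST] Write a Python function that adds two numbers. [/INST] def add(a, b): return a + b</s><s>[INST] Now add type hints. [/INST] 
```

The completed exchange is closed with `</s>`, while the last user turn ends at `[/INST] ` so that the model continues from there.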

### Generating a Hyperparameter Tuning Script for SageMaker

In [15]:
instructions = [
    {
        "role": "user",
        "content": """\
<<SYS>>You are an AI assistant specializing in generating Python code for hyperparameter tuning of machine learning models on Amazon SageMaker. Your code should be efficient, well-documented, and include examples of usage.<</SYS>>
Generate a Python script that performs hyperparameter tuning of an XGBoost model for regression on the Boston Housing dataset. The script should use Amazon SageMaker's XGBoost estimator and be ready for deployment on SageMaker. 
It should optimize for validation:accuracy, and optimize the hyperparameters alpha and eta.""",
    }
]
prompt = format_instructions(instructions)
payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 5000, "temperature": 0.2}
}
response = predictor.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
<s>[INST] <<SYS>>You are an AI assistant specializing in generating Python code for hyperparameter tuning of machine learning models on Amazon SageMaker. Your code should be efficient, well-documented, and include examples of usage.<</SYS>>
Generate a Python script that performs hyperparameter tuning of an XGBoost model for regression on the Boston Housing dataset. The script should use Amazon SageMaker's XGBoost estimator and be ready for deployment on SageMaker. 
It should optimize for validation:accuracy, and optimize the hyperparameters alpha and eta. [/INST] 

[1m> Output[0m
```python
import boto3
import sagemaker
from sagemaker.xgboost import XGBoost
from sagemaker.tuner import IntegerParameter, ContinuousParameter, HyperparameterTuner

# Initialize SageMaker session and boto3 client
sagemaker_session = sagemaker.Session()
boto3_client = boto3.client('sagemaker')

# Specify the hyperparameter ranges for tuning
hyperparameter_ranges = {
    'alpha': ContinuousPar

### CDK code for a 3 Tier Application

In [44]:
instructions = [
    {
        "role": "user",
        "content": """\
<<SYS>>
You are a programmer specialized in AWS, and programming in Python and the AWS Cloud Development Kit. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>
Create a CDK script in Python for configuring a three tier web application in AWS. The architecture should consist of these three tiers:

1. Presentation tier: Static website content hosted in Amazon S3, distributed by CloudFront. AWS Certificate Manager allows a custom SSL/TLS certificate to be used.

2. Logic tier: API Gateway with AWS Lambda. This architecture shows three exposed services (/tickets, /shows, and /info).

3. Data tier: Amazon DynamoDB is used for the /tickets and /shows services. Amazon S3 is used to host static content used by the /info service.""",
    }
]
prompt = format_instructions(instructions)
payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 5000, "do_sample": True, "temperature": 0.2}
}
response = predictor.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
<s>[INST] <<SYS>>
You are a programmer specialized in AWS, and programming in Python and the AWS Cloud Development Kit. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>
Create a CDK script in Python for configuring a three tier web application in AWS. The architecture should consist of these three tiers:

1. Presentation tier: Static website content hosted in Amazon S3, distributed by CloudFront. AWS Certificate Manager allows a custom SSL/TLS certificate to be used.

2. Logic tier: API Gateway with AWS Lambda. This architecture shows three exposed services (/tickets, /shows, and /info).

3. Data tier: Amazon DynamoDB is used for the /tickets and /shows services. Amazon S3 is used to host static content used by the /info service. [/INST] 

[1m> Output[0m
 Sure, here's an example CDK script in Python that configures a three-tier web application in AWS:
```python
from aws_cdk import (
    aws_certificatemanager as acm,
    aws_clou

### Code Transformation - Java to Python

In [43]:
instructions = [
    {
        "role": "user",
        "content": """\
<<SYS>>
You are a programmer specialized in Java and Python languages. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>
Convert the following Java code to its equivalent in Python:

<java>
import java.util.ArrayList;

public class Main { 
  public static void main(String[] args) { 
    ArrayList<String> cars = new ArrayList<String>();
    cars.add("Volvo");
    cars.add("BMW");
    cars.add("Ford");
    cars.add("Mazda");
    for (int i = 0; i < cars.size(); i++) {
      System.out.println(cars.get(i));
    }
  } 
}
</java>""",
    }
]
prompt = format_instructions(instructions)
payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 5000, "temperature": 0.2}
}
response = predictor.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
<s>[INST] <<SYS>>
You are a programmer specialized in Java and Python languages. Your code should be efficient, well-documented, and include examples of usage.
<</SYS>>
Convert the following Java code to its equivalent in Python:

<java>
import java.util.ArrayList;

public class Main { 
  public static void main(String[] args) { 
    ArrayList<String> cars = new ArrayList<String>();
    cars.add("Volvo");
    cars.add("BMW");
    cars.add("Ford");
    cars.add("Mazda");
    for (int i = 0; i < cars.size(); i++) {
      System.out.println(cars.get(i));
    }
  } 
}
</java> [/INST] 

[1m> Output[0m
<python>
# Import the equivalent module in Python for ArrayList, which is list
cars = []  # Creating an empty list

# Adding elements to the list
cars.append("Volvo")
cars.append("BMW")
cars.append("Ford")
cars.append("Mazda")

# Using a for-each loop to iterate over the list and print each element
for car in cars:
    print(car)
</python>

Or, you can use list comprehension 

## Clean up the endpoint

In [None]:
predictor.delete_model()
predictor.delete_endpoint()
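One way to make sure both cleanup calls run even when an inference cell raises is to wrap the work in a context manager. The sketch below is only an illustration: `cleanup_on_exit` is a hypothetical helper, and `FakePredictor` stands in for the real `predictor` so the example runs without an AWS account.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List


class FakePredictor:
    """Stand-in for the deployed predictor; records which cleanup calls were made."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def delete_model(self) -> None:
        self.calls.append("delete_model")

    def delete_endpoint(self) -> None:
        self.calls.append("delete_endpoint")


@contextmanager
def cleanup_on_exit(predictor: Any) -> Iterator[Any]:
    """Hypothetical helper: delete the model and endpoint when the block exits."""
    try:
        yield predictor
    finally:
        # Same order as the cleanup cell above.
        predictor.delete_model()
        predictor.delete_endpoint()


fake = FakePredictor()
try:
    with cleanup_on_exit(fake) as p:
        raise RuntimeError("simulated failure during inference")
except RuntimeError:
    pass

print(fake.calls)  # ['delete_model', 'delete_endpoint']
```

With the real `predictor` in place of `fake`, the two delete calls would run on exit whether or not the body of the `with` block succeeds.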