# Code development with Amazon Nova Premier on Amazon Bedrock

[Amazon Nova Premier](https://aws.amazon.com/ai/generative-ai/nova/understanding/) belongs to Amazon Nova, a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance. You can build and scale generative AI applications with Amazon Nova foundation models through their integration with Amazon Bedrock.

## Getting Started with Amazon Nova Premier on Bedrock

[Amazon Nova Premier](https://www.amazon.science/publications/amazon-nova-premier-technical-report-and-model-card) is available as a fully managed, serverless option in Amazon Bedrock. Let's walk through the process of enabling Amazon Nova Premier and setting up your development environment.

### Prerequisites

Before we begin, ensure you have:

1. An AWS account (if you don't have one, create it [here](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-creating.html)).
2. A basic understanding of [Amazon Bedrock](https://aws.amazon.com/bedrock/), [Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/gs-set-up.html#gs-account) (optional), and [AWS Identity and Access Management (IAM)](https://aws.amazon.com/iam/).
3. Appropriate IAM permissions for Amazon Bedrock.
4. [Model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) enabled for the required Amazon Bedrock models.

### Enabling Amazon Nova Premier in Bedrock (Model Access)

1. **Sign in to the AWS Management Console**:
   Navigate to the Amazon Bedrock [console](https://console.aws.amazon.com/bedrock/).

2. **Request model access**:
   In the Bedrock console, select "Model access" from the left navigation menu, then click "Manage model access".

![Bedrock Model Access](img/nova_1.png "Request model access on Bedrock")

3. **Enable Amazon Nova Premier**:
   Click on "Modify model access" then find Amazon Nova Premier in the list of available models, select the checkbox, and click "Request model access". Your access request will typically be approved within minutes.

![Modify Model Access](img/nova_2.png "Click on Modify model access")

4. **Verify access**:
   Once approved, you'll see Amazon Nova Premier listed as "Access granted" in the Model Access page.

![Access Granted](img/nova_3.png "Access Granted")

5. **Set up your development environment**:
   You'll need the AWS SDK for Python (Boto3) to interact with Bedrock. Install it using pip:

      ```bash
      pip install boto3
      ```

Now you're ready to utilize the powerful capabilities of Amazon Nova Premier!
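
Model availability can also be checked from code. The sketch below is an assumption-laden illustration, not part of the original walkthrough: `find_model_ids` is a hypothetical helper that takes an already-created `bedrock` control-plane client and filters the result of `list_foundation_models`. Note that this only shows which models are offered in the Region; it does not prove that access has been granted in the console.

```python
def find_model_ids(bedrock_client, keyword="nova-premier"):
    """Return the IDs of listed foundation models whose ID contains `keyword`.

    `bedrock_client` is expected to be a boto3 "bedrock" client
    (the control plane, not "bedrock-runtime").
    """
    # byProvider narrows the listing to models published by Amazon.
    response = bedrock_client.list_foundation_models(byProvider="Amazon")
    return [
        summary["modelId"]
        for summary in response["modelSummaries"]
        if keyword in summary["modelId"]
    ]


# Example usage (requires AWS credentials and a configured Region):
# import boto3
# print(find_model_ids(boto3.client("bedrock")))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub when no AWS credentials are available.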

## Set up the environment

Before we begin, install the required libraries, set up your credentials for Amazon Bedrock, and define a helper function to call the service.

In [None]:
# Uncomment to install AWS SDK for Python (boto3)
#!pip install boto3 -U

In [None]:
import boto3

# Set up the Bedrock client
bedrock_runtime = boto3.client(service_name="bedrock-runtime")

The following helper function handles Bedrock invocations using the [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-examples.html).

It has two distinct invocation paths: a simple one with regular parameters, and another that includes `tool_config` for the [Tool Use](https://docs.aws.amazon.com/bedrock/latest/userguide/tool-use.html) feature. Tool use is also known as function calling.

In [None]:
# Helper Function to call Amazon Bedrock
def bedrock_run(input_msg,
                system_msg=None,
                tool_config=None,
                model_id = "us.amazon.nova-premier-v1:0"):

    print(f"Invoking model {model_id} on Bedrock")

    # Base Prompts
    system_prompts = []

    # Filling values if exists
    if system_msg:
        system_prompts.append({"text": system_msg})

    # Set the array of messages to be sent to the model
    messages = [{
        "role": "user",
        "content": input_msg
    }]

    # Placeholder for the results of the model
    predicted = ""

    # Check if it is a tool calling ask
    if tool_config:
        inference_config = {
            "temperature": 1, 
            "topP": 1
        }
        
        additional_model_request_fields={
            "inferenceConfig": {
                "topK": 1,
            },
        }
        
        # Call Amazon Bedrock Inference
        response = bedrock_runtime.converse(
            modelId=model_id,
            system=system_prompts,
            messages=messages,
            inferenceConfig=inference_config,
            toolConfig=tool_config,
            additionalModelRequestFields=additional_model_request_fields
        )
        
        # Parse the response returned from Bedrock
        tool_requests = response['output']['message']['content']
        for tool_request in tool_requests:
            if 'toolUse' in tool_request:
                tool = tool_request['toolUse']
                predicted = tool
    else:
        # Base inference parameters to use.
        inference_config = {"temperature":  0.2}
        
        # Call Amazon Bedrock Inference
        response = bedrock_runtime.converse(
            modelId=model_id,
            system=system_prompts,
            messages=messages,
            inferenceConfig=inference_config
        )

        # Parse the response returned from Bedrock
        output_message = response['output']['message']
        for content in output_message['content']:
            answer = content['text']
            predicted += answer

    # Returns the result of the prediction
    return predicted

In [None]:
# Test the solution to check if your set up is working

input_msg = [{"text": "When was AWS founded?"}]
bedrock_run(input_msg)
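
To make the parsing inside `bedrock_run` easier to follow, here is a minimal sketch of the Converse response shape it relies on. The `summarize_converse_response` helper and the `sample_response` dictionary are illustrative assumptions (a hand-written stand-in, not real model output); the field names `output`, `stopReason`, and `usage` follow the Converse API response.

```python
def summarize_converse_response(response):
    """Pull the text, stop reason, and token counts out of a Converse response."""
    content_blocks = response["output"]["message"]["content"]
    # A response can mix text blocks with toolUse blocks; keep only the text.
    text = "".join(block["text"] for block in content_blocks if "text" in block)
    usage = response.get("usage", {})
    return {
        "text": text,
        "stop_reason": response.get("stopReason"),
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
    }


# Hand-written stand-in for a real response, used only to show the structure.
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "Example answer text."}],
        }
    },
    "stopReason": "end_turn",
    "usage": {"inputTokens": 6, "outputTokens": 4, "totalTokens": 10},
}

print(summarize_converse_response(sample_response))
```

Checking `stopReason` is a cheap way to notice truncated answers (for example `max_tokens`) before using the generated text.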

Now that we know our environment setup is working, we can start using Nova Premier to help us develop applications. In this notebook we will guide you through the following tasks:

1. **Code generation**: Nova Premier excels at generating high-quality code across a wide range of programming languages, including Java, Python, and more. It can translate natural language instructions into functional code, streamlining the development process and accelerating project timelines.

2. **Code debugging**: Leveraging its understanding of code patterns and common pitfalls, Nova Premier can identify bugs, security vulnerabilities, and inefficiencies in code. It suggests fixes and improvements with detailed explanations of the issues found.

3. **Code explanation**: The model can provide detailed explanations of existing code, breaking down complex functions and algorithms into understandable components. This feature is invaluable for knowledge transfer, onboarding new team members, and working with legacy codebases.

4. **Code execution**: Nova Premier includes the capability to simulate code execution, predicting outputs and potential errors without actually running the code. It can generate test cases and validate functionality across multiple scenarios.

5. **Code refactoring**: The model can restructure existing code to improve readability, performance, and maintainability without changing its external behavior, following best practices and design patterns appropriate for each language.

### 1 - Code Generation

In our use case, we will use Amazon Nova Premier to help us develop a machine learning model that performs time series forecasting.

In [None]:
# Helper function that defines a template for our code generation prompts
def generate_prompt(description):
    prompt = f"""
You are an expert Python developer specializing in time series forecasting and AWS services. 
Generate clean, efficient, and well-documented code based on the following requirements:

{description}

It must return only the Python code without any additional text before or after.
"""
    return prompt

In [None]:
# Set up our instructions (prompt) to the model

description = """Create a complete Python script for the following tasks, every step should be a function inside this Python file:
1. Generates synthetic daily sales data for a retail store with realistic seasonality and trend over 3 years
2. Splits the data into training and testing sets
3. Trains a time series forecasting model using XGBoost
4. Evaluates the model using appropriate metrics (RMSE, MAPE)
5. Serializes the model and saves it locally
6. Create a function to handle model invocations
7. Run a function call to invoke model

Make sure to include all necessary imports and explain key components with comments.
"""

result = generate_prompt(description)
print(result)

In [None]:
# Now call Amazon Nova Premier to get the code
input_msg = [{"text": result}]
response = bedrock_run(input_msg)

print(response)
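
Even when a prompt asks for code only, a model may still wrap its answer in a Markdown code fence. The sketch below shows one way to unwrap it before pasting or saving the code. `strip_code_fences` is a hypothetical helper, not something the original notebook defines, and it assumes at most one fenced block matters.

```python
import re

# Build the fence marker programmatically to keep this snippet easy to embed.
FENCE = "`" * 3


def strip_code_fences(text):
    """Return the body of the first fenced code block, or the text unchanged."""
    pattern = FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else text.strip()


fenced_answer = FENCE + "python\nprint('hello')\n" + FENCE
print(strip_code_fences(fenced_answer))
```

Text without a fence passes through unchanged apart from surrounding whitespace, so the helper is safe to apply to every response.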

#### Test the generated code

**Note: We're not using the "main" function generated by the model because we're going step by step over the next cells. Let's assume the next snippet of code was generated by the model, then copied and pasted into the next cell.**

In [None]:
# You'll probably need to install the xgboost package in your environment; to do so, uncomment the next line
#!pip install xgboost

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error
import xgboost as xgb
import pickle

# Generate synthetic daily sales data
def generate_synthetic_data():
    np.random.seed(42)
    date_rng = pd.date_range(start='1/1/2018', end='1/1/2021', freq='D')
    trend = np.linspace(0, 1, len(date_rng))
    seasonality = np.sin(2 * np.pi * (date_rng.dayofyear / 365.25))
    noise = np.random.normal(0, 0.1, len(date_rng))
    sales = 100 + 50 * trend + 30 * seasonality + noise
    data = pd.DataFrame(date_rng, columns=['date'])
    data['sales'] = sales
    return data

# Split data into training and testing sets
def split_data(data):
    train, test = train_test_split(data, test_size=0.2, shuffle=False)
    return train, test

# Train XGBoost model
def train_model(train):
    X_train = train.index.values.reshape(-1, 1)
    y_train = train['sales']
    model = xgb.XGBRegressor(objective='reg:squarederror', n_estimators=100)
    model.fit(X_train, y_train)
    return model

# Evaluate the model
def evaluate_model(model, test):
    X_test = test.index.values.reshape(-1, 1)
    y_test = test['sales']
    predictions = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, predictions))
    mape = mean_absolute_percentage_error(y_test, predictions)
    return rmse, mape

# Serialize and save the model
def save_model(model, filename='xgboost_model.pkl'):
    with open(filename, 'wb') as file:
        pickle.dump(model, file)

# Function to handle model invocations
def invoke_model(model_path, input_date):
    with open(model_path, 'rb') as file:
        model = pickle.load(file)
    date_ordinal = pd.to_datetime(input_date).toordinal()
    prediction = model.predict(np.array([[date_ordinal]]))
    return prediction[0]

In [None]:
# Test the generate synthetic data function
data = generate_synthetic_data()

data.set_index('date', inplace=True)
data.head(5)

In [None]:
# Test the split data function
train_data, test_data = split_data(data)
print(f"Training dataset size: {len(train_data)} and Test dataset size: {len(test_data)}")
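
Because `split_data` calls `train_test_split` with `shuffle=False`, the rows keep their order and the test set is the most recent slice of the series. The idea can be sketched with plain lists. `chronological_split` is an illustrative helper and an assumption of this sketch, not the scikit-learn implementation.

```python
import math


def chronological_split(rows, test_fraction=0.2):
    """Split an ordered sequence so the last `test_fraction` becomes the test set."""
    n_test = math.ceil(len(rows) * test_fraction)
    split_at = len(rows) - n_test
    return rows[:split_at], rows[split_at:]


days = list(range(10))  # stand-in for ten consecutive daily observations
train_days, test_days = chronological_split(days)
print(train_days)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(test_days)   # [8, 9]
```

Keeping the hold-out period strictly after the training period avoids evaluating a forecasting model on dates it has effectively already seen.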

In [None]:
# Test the training function
model = train_model(train_data)

**It seems we have found a bug in the generated code. Let's use Amazon Nova Premier to fix it as well.**

### 2 - Code Debugging

We can see that the training function throws an error when we execute it. Let's ask the model how to fix it by copying the returned error and passing it to the LLM.

In [None]:
# Error output copied from the failed training cell
train_error = """
XGBoostError: [18:16:06] /Users/runner/work/xgboost/xgboost/src/c_api/../data/array_interface.h:145: Check failed: typestr.size() == 3 || typestr.size() == 4: `typestr' should be of format <endian><type><size of type in bytes>.
Stack trace:
  [bt] (0) 1   libxgboost.dylib                    0x0000000172f8dbfc dmlc::LogMessageFatal::~LogMessageFatal() + 124
  [bt] (1) 2   libxgboost.dylib                    0x0000000172f9c79c xgboost::ArrayInterfaceHandler::Validate(std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, xgboost::Json, std::__1::less<void>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, xgboost::Json>>> const&) + 1120
  [bt] (2) 3   libxgboost.dylib                    0x0000000172fa1048 xgboost::ArrayInterface<2, false>::Initialize(std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, xgboost::Json, std::__1::less<void>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, xgboost::Json>>> const&) + 48
  [bt] (3) 4   libxgboost.dylib                    0x0000000173148228 xgboost::data::ArrayAdapter::ArrayAdapter(xgboost::StringView) + 148
  [bt] (4) 5   libxgboost.dylib                    0x0000000173147e3c xgboost::data::DMatrixProxy::SetArrayData(xgboost::StringView) + 72
  [bt] (5) 6   libxgboost.dylib                    0x0000000172f9a6fc XGProxyDMatrixSetDataDense + 136
  [bt] (6) 7   libffi.8.dylib                      0x0000000105bbc04c ffi_call_SYSV + 76
  [bt] (7) 8   libffi.8.dylib                      0x0000000105bb9834 ffi_call_int + 1404
  [bt] (8) 9   _ctypes.cpython-312-darwin.so       0x0000000105b9c0f0 _ctypes_callproc + 756
"""

input_msg = [{"text": f"Why is this code:\n{response}\nraising this error:\n{train_error}"}]
debug_response = bedrock_run(input_msg)
print(debug_response)

#### Test the fixed version of the code
Let's run the steps again, this time with the code fixed by Amazon Nova Premier.

In [None]:
# Set up of the code generated by Amazon Nova Premier

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error
import xgboost as xgb
import pickle

# Generate synthetic daily sales data
def generate_synthetic_data():
    np.random.seed(42)
    date_rng = pd.date_range(start='1/1/2018', end='1/1/2021', freq='D')
    trend = np.linspace(0, 1, len(date_rng))
    seasonality = np.sin(2 * np.pi * (date_rng.dayofyear / 365.25))
    noise = np.random.normal(0, 0.1, len(date_rng))
    sales = 100 + 50 * trend + 30 * seasonality + noise
    data = pd.DataFrame(date_rng, columns=['date'])
    data['sales'] = sales
    return data

# Split data into training and testing sets
def split_data(data):
    train, test = train_test_split(data, test_size=0.2, shuffle=False)
    return train, test

# Train XGBoost model
def train_model(train):
    # Convert dates to ordinal values
    X_train = train['date'].apply(lambda x: x.toordinal()).values.reshape(-1, 1)
    y_train = train['sales']
    model = xgb.XGBRegressor(objective='reg:squarederror', n_estimators=100)
    model.fit(X_train, y_train)
    return model

# Evaluate the model
def evaluate_model(model, test):
    # Convert dates to ordinal values
    X_test = test['date'].apply(lambda x: x.toordinal()).values.reshape(-1, 1)
    y_test = test['sales']
    predictions = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, predictions))
    mape = mean_absolute_percentage_error(y_test, predictions)
    return rmse, mape

# Serialize and save the model
def save_model(model, filename='xgboost_model.pkl'):
    with open(filename, 'wb') as file:
        pickle.dump(model, file)

# Function to handle model invocations
def invoke_model(model_path, input_date):
    with open(model_path, 'rb') as file:
        model = pickle.load(file)
    date_ordinal = pd.to_datetime(input_date).toordinal()
    prediction = model.predict(np.array([[date_ordinal]]))
    return prediction[0]
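
The fix hinges on turning each date into a plain number. In the earlier run, the `date` column had been set as the index, so `train.index.values` was a `datetime64` array, which XGBoost appears to reject. The fixed version feeds it integer ordinals instead. The standard-library sketch below shows what that conversion produces; `to_ordinal_feature` is an illustrative name, not part of the generated code.

```python
from datetime import date


def to_ordinal_feature(day):
    """Map a calendar date to its proleptic Gregorian ordinal (0001-01-01 is day 1)."""
    return day.toordinal()


first_day = date(2018, 1, 1)
second_day = date(2018, 1, 2)

print(to_ordinal_feature(first_day))   # 736695
print(to_ordinal_feature(second_day))  # 736696
```

Consecutive days differ by exactly 1, so the model receives a monotonically increasing numeric feature in place of a timestamp.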

After defining this new version of our code we can execute the same steps as before to check if the error has been fixed.

In [None]:
# Step 1 - Generate synthetic data 
df = generate_synthetic_data()

# Step 2 - Split the dataset into training and test.
train, test = split_data(df)

# Step 3 - Run the training function
model = train_model(train)

Great! No errors were found, so we can proceed with the rest of our execution plan.

In [None]:
# Test the evaluation function
rmse, mape = evaluate_model(model, test)
print(f'RMSE: {rmse}, MAPE: {mape}')
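
As a reading aid for the two numbers printed above, here is a small standard-library sketch of what RMSE and MAPE compute. It is a simplified illustration rather than the scikit-learn code (for instance, it does not guard against zero actual values). Note that scikit-learn's `mean_absolute_percentage_error` returns a fraction, so `0.05` means 5%.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.1 means 10%)."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)


actual = [100.0, 200.0]
predicted = [110.0, 180.0]

print(f"RMSE: {rmse(actual, predicted):.3f}")  # RMSE: 15.811
print(f"MAPE: {mape(actual, predicted):.1%}")  # MAPE: 10.0%
```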

In [None]:
# Test the model saving function
save_model(model)

Everything seems to be working fine! The last step is to use the saved model file to run an invocation with a new value.

In [None]:
# Define an example invocation
next_day = df['date'].iloc[-1] + pd.Timedelta(days=1)  # Timestamp of the day after the last observation
sample_data = next_day.strftime('%Y-%m-%d')  # Convert to string format

# Test the invocation function
prediction = invoke_model('xgboost_model.pkl', sample_data)
print(f'Prediction for next day: {prediction}')

### 3 - Code Explanation
Amazon Nova Premier can be a great ally when you need to understand code written by someone else. Let's use the generated code and ask the model for a detailed explanation of what it does.

In [None]:
input_msg = [{
    "text": f"""
Give me a detailed explanation of the code below:
{response}

At the end return a sequence diagram and a diagram of the execution steps used to run it, use mermaid to format the diagram.
"""}]

explanation_response = bedrock_run(input_msg)
print(explanation_response)

We can also check the generated diagrams for a better understanding of the code.

In [None]:
import base64
from IPython.display import display_svg
from urllib.request import Request, urlopen


# Helper function to render mermaid diagrams easily
def mm(graph):
    graphbytes = graph.encode("ascii")
    base64_bytes = base64.b64encode(graphbytes)
    base64_string = base64_bytes.decode("ascii")
    url="https://mermaid.ink/svg/" + base64_string
    req=Request(url, headers={'User-Agent': 'IPython/Notebook'})
    display_svg(urlopen(req).read().decode(), raw=True)

In [None]:
sequence_diagram = """
sequenceDiagram
    participant Main
    participant DataGenerator
    participant DataSplitter
    participant ModelTrainer
    participant ModelEvaluator
    participant ModelSaver
    participant ModelInvoker

    Main->>DataGenerator: generate_synthetic_data()
    DataGenerator-->>Main: data

    Main->>DataSplitter: split_data(data)
    DataSplitter-->>Main: train, test

    Main->>ModelTrainer: train_model(train)
    ModelTrainer-->>Main: model

    Main->>ModelEvaluator: evaluate_model(model, test)
    ModelEvaluator-->>Main: rmse, mape

    Main->>ModelSaver: save_model(model)

    Main->>ModelInvoker: invoke_model('xgboost_model.pkl', '2020-12-31')
    ModelInvoker-->>Main: prediction
"""
    
mm(sequence_diagram)

In [None]:
execution_diagram = """
graph LR
    A[Start] --> B[Generate Synthetic Data]
    B --> C[Split Data into Train and Test]
    C --> D[Train XGBoost Model]
    D --> E[Evaluate Model]
    E --> F[Save Model]
    F --> G[Invoke Model for Prediction]
    G --> H[End]
"""

mm(execution_diagram)
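
The two diagrams above were pasted in by hand. As an alternative, the mermaid blocks could be pulled out of the explanation text programmatically. The sketch below assumes the model wraps each diagram in a Markdown fence labelled `mermaid`, which is common but not guaranteed. `extract_mermaid_blocks` is a hypothetical helper.

```python
import re

# Build the fence marker programmatically to keep this snippet easy to embed.
FENCE = "`" * 3


def extract_mermaid_blocks(text):
    """Return the body of every fenced block labelled 'mermaid' found in `text`."""
    pattern = FENCE + r"mermaid[ \t]*\n(.*?)" + FENCE
    return [block.strip() for block in re.findall(pattern, text, flags=re.DOTALL)]


sample_text = f"Intro\n{FENCE}mermaid\ngraph LR\n    A --> B\n{FENCE}\nOutro"
for diagram in extract_mermaid_blocks(sample_text):
    print(diagram)
```

Each extracted string could then be passed to the `mm` helper defined earlier to render it.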

### 4 - Code Execution (Tool Use)

Now, let's simplify our code and expose it as tools that the Nova Premier model can use.

Let's start with the predefined function `split_data`.

In [None]:
tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "split_data",
                "description": "Split dataset into train/validation",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "data": {
                                # JSON Schema has no "dataframe" type, so the dataset is referenced by name
                                "type": "string",
                                "description": "Name of the dataset to split"
                            }
                        },
                        "required": [
                            "data"
                        ]
                    }
                }
            }
        }
    ]
}

input_msg = [{"text": "Split my dataset using available tools"}]

In [None]:
tool_response = bedrock_run(input_msg=input_msg, tool_config=tool_config)
print(tool_response)
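
The model does not run the tool itself: the `toolUse` block it returns only names the tool and supplies its input. The sketch below shows one way the caller might dispatch that request to a local function and wrap the outcome in a `toolResult` message for a follow-up Converse call. `run_tool`, `split_list`, and the sample `toolUse` dictionary are illustrative assumptions, not part of the original notebook.

```python
def split_list(data):
    """Hypothetical tool implementation: 80/20 ordered split of a list."""
    split_at = len(data) * 4 // 5
    return {"train": data[:split_at], "test": data[split_at:]}


def run_tool(tool_use, registry):
    """Execute the function named in a toolUse block and build a toolResult message."""
    handler = registry[tool_use["name"]]
    result = handler(**tool_use["input"])
    return {
        "role": "user",
        "content": [{
            "toolResult": {
                "toolUseId": tool_use["toolUseId"],  # must echo the ID from the request
                "content": [{"json": result}],
            }
        }],
    }


# Hand-written stand-in for the toolUse block returned by the model.
sample_tool_use = {
    "toolUseId": "tool-1",
    "name": "split_data",
    "input": {"data": [1, 2, 3, 4, 5]},
}

tool_result_message = run_tool(sample_tool_use, {"split_data": split_list})
print(tool_result_message)
```

In a full loop, this message would be appended to the conversation (after the assistant message that contained the `toolUse` block) and sent back through `converse` so the model can produce its final answer.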

### 5 - Code Refactoring

In [None]:
input_msg = [{
    "text": f"""
Convert the python code below to R:
{response}

Only return the R code, without any additional text or explanation.
Do not include any comments in the R code.
"""}]

refactored_response = bedrock_run(input_msg)
print(refactored_response)

# Wrap-Up

In this notebook, we've explored how Amazon Nova Premier on Amazon Bedrock can significantly enhance the software development lifecycle through various code-related capabilities. Let's summarize what we've learned:

## Key Capabilities Demonstrated

1. **Code Generation**: We saw how Nova Premier can generate complete, functional Python code for time series forecasting based on natural language requirements. The model produced a comprehensive solution that included data generation, model training, evaluation, and serialization.
2. **Code Debugging**: When our generated code encountered an error with XGBoost, we leveraged Nova Premier to diagnose and fix the issue. The model correctly identified that we needed to convert date values to ordinal format for proper model training.
3. **Code Explanation**: Nova Premier provided a detailed explanation of the generated code, breaking down each function's purpose and how they work together. The model even created sequence and execution diagrams to visualize the code flow.
4. **Tool Use (Function Calling)**: We demonstrated how Nova Premier can be configured to use tools, enabling it to interact with predefined functions like our data splitting capability.
5. **Code Refactoring**: Finally, we showed how Nova Premier can transform code between languages, converting our Python implementation to R while maintaining the same functionality.

## Benefits for Developers

- **Accelerated Development**: Nova Premier can significantly reduce the time needed to write boilerplate code and implement common patterns.
- **Learning Aid**: The detailed explanations and visualizations help developers understand complex code structures.
- **Error Resolution**: Quick identification and resolution of bugs speeds up the debugging process.
- **Cross-language Support**: The ability to translate between programming languages facilitates collaboration across teams with different technical backgrounds.

## Next Steps

To continue exploring Amazon Nova Premier's capabilities for code development:

1. Try using it for more complex software engineering tasks
2. Experiment with different programming languages and frameworks
3. Integrate it into your CI/CD pipeline for code reviews and suggestions
4. Use it to document existing codebases or legacy systems

Amazon Nova Premier represents a powerful addition to a developer's toolkit, helping to streamline development workflows, improve code quality, and accelerate project delivery while maintaining best practices in software engineering.