# Mistral Benchmarking & Prompt Completion evaluation using SageMaker Jumpstart

---
Let's do some text generation, benchmarking and evaluation for Mistral models on sagemaker jumpstart. You can easily deploy the model on jumpstart.

---

## Setup
***

In [2]:
model_id = "huggingface-llm-mistral-7b-instruct"

In [3]:
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id=model_id)
predictor = model.deploy()

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml


Using model 'huggingface-llm-mistral-7b-instruct' with wildcard version identifier '*'. You can pin to version '2.0.0' for more stable results. Note that models may have different input/output signatures after a major version upgrade.


--------!

### Supported parameters

***
This model supports many parameters while performing inference. They include:

* **max_length:** Model generates text until the output length (which includes the input context length) reaches `max_length`. If specified, it must be a positive integer.
* **max_new_tokens:** Model generates text until the output length (excluding the input context length) reaches `max_new_tokens`. If specified, it must be a positive integer.
* **num_beams:** Number of beams used in the greedy search. If specified, it must be integer greater than or equal to `num_return_sequences`.
* **no_repeat_ngram_size:** Model ensures that a sequence of words of `no_repeat_ngram_size` is not repeated in the output sequence. If specified, it must be a positive integer greater than 1.
* **temperature:** Controls the randomness in the output. Higher temperature results in output sequence with low-probability words and lower temperature results in output sequence with high-probability words. If `temperature` -> 0, it results in greedy decoding. If specified, it must be a positive float.
* **early_stopping:** If True, text generation is finished when all beam hypotheses reach the end of sentence token. If specified, it must be boolean.
* **do_sample:** If True, sample the next word as per the likelihood. If specified, it must be boolean.
* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.
* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.
* **return_full_text:** If True, input text will be part of the output generated text. If specified, it must be boolean. The default value for it is False.
* **stop**: If specified, it must a list of strings. Text generation stops if any one of the specified strings is generated.

We may specify any subset of the parameters mentioned above while invoking an endpoint. Next, we show an example of how to invoke endpoint with these arguments
***

## Instruction prompts
***
The examples in this section demonstrate queries to the Mistral 7B Instruct model. This involves special token formatting within the prompt input. The base, pre-trained Mistral model is not fine-tuned to perform this instruction task -- please use the prompts in the next section instead.
***

In [4]:
from typing import Dict, List


def format_instructions(instructions: List[Dict[str, str]]) -> List[str]:
    """Format instructions where conversation roles must alternate user/assistant/user/assistant/..."""
    prompt: List[str] = []
    for user, answer in zip(instructions[::2], instructions[1::2]):
        prompt.extend(["<s>", "[INST] ", (user["content"]).strip(), " [/INST] ", (answer["content"]).strip(), "</s>"])
    prompt.extend(["<s>", "[INST] ", (instructions[-1]["content"]).strip(), " [/INST] "])
    return "".join(prompt)


def print_prompt_and_response(prompt: str, response: str) -> None:
    bold, unbold = '\033[1m', '\033[0m'
    print(f"{bold}> Input{unbold}\n{prompt}\n\n{bold}> Output{unbold}\n{response[0]['generated_text']}\n")

In [5]:
instructions = [{"role": "user", "content": "what is hindi?"}]
prompt = format_instructions(instructions)
payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 256, "do_sample": True}
}
response = predictor.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
<s>[INST] what is the recipe of mayonnaise? [/INST] 

[1m> Output[0m
1. 1/2 cup egg yolks
2. 1 tablespoon Dijon mustard
3. 2 tablespoons lemon juice
4. 1/2 teaspoon salt
5. 3/4 cup vegetable oil or canola oil
6. Equipment: whisk, cutting board, measure, mixing bowl.



In [6]:
instructions = [
    {"role": "user", "content": "I am going to Paris, what should I see?"},
    {
        "role": "assistant",
        "content": """\
Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.""",
    },
    {"role": "user", "content": "What is so great about #1?"},
]
prompt = format_instructions(instructions)
payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 256, "do_sample": True}
}
response = predictor.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
<s>[INST] I am going to Paris, what should I see? [/INST] Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.</s><s>[INST] What is so great about #1? [/INST] 

[1m> Output[0m
1. T

## Pre-trained model prompts
***
The examples in this section demonstrate how to perform text generation on the base, pre-trained Mistral model. If you have deployed the instruction-tuned model, please use prompt formatting in the previous section instead.
***

In [9]:
prompt = "Write a program to compute factorial in python:"
payload = {
    "inputs": prompt,
    "parameters": {
        "max_new_tokens": 200,
    },
}
response = predictor.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
Write a program to compute factorial in python:

[1m> Output[0m


```python
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

n = int(input("Enter a number: "))
result = factorial(n)
print("Factorial of", n, "is", result)
```

In this program, we define a function named `factorial` that takes an integer `n` as an argument. Inside the function, we use recursion to compute the factorial of the given number. If the value of `n` is `0`, then the function returns `1`. Otherwise, it returns the product of `n` and the factorial of `n-1`.

We then take an input from the user to compute the factorial of a given number. We call the `factorial` function with the input value and store the result in a



In [10]:
prompt = "Building a website can be done in 10 simple steps:"
payload = {
    "inputs": prompt,
    "parameters": {
        "max_new_tokens": 110,
        "no_repeat_ngram_size": 3,
    },
}
response = predictor.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
Building a website can be done in 10 simple steps:

[1m> Output[0m


1. Choose a domain name: This is the first step in creating a website. Choose a name that represents your brand or business.

2. Choose a hosting provider: A hosting provider is a company that provides the server space where your website will be stored.

3. Choose a website builder: A website builder is a tool that allows you to create a website without any coding knowledge.

4. Choose a website template: A website template is a pre-designed layout that you can customize to fit your



## Now - Let's deploy Mistral 7b itself and test it on some questions

In [16]:
model_id_noinstruct = "huggingface-llm-mistral-7b"

In [19]:
from sagemaker.jumpstart.model import JumpStartModel

model_new = JumpStartModel(model_id=model_id)
predictor_new = model_new.deploy()

---------!

In [20]:
prompt = "Building a website can be done in 10 simple steps:"
payload = {
    "inputs": prompt,
    "parameters": {
        "max_new_tokens": 110,
        "no_repeat_ngram_size": 3,
    },
}
response = predictor_new.predict(payload)
print_prompt_and_response(prompt, response)

[1m> Input[0m
Building a website can be done in 10 simple steps:

[1m> Output[0m


1. Choose a domain name: This is the first step in creating a website. Choose a name that represents your brand or business.

2. Choose a hosting provider: A hosting provider is a company that provides the server space where your website will be stored.

3. Choose a website builder: A website builder is a tool that allows you to create a website without any coding knowledge.

4. Choose a website template: A website template is a pre-designed layout that you can customize to fit your



## Now, let's load the aisquared/dais-question-answers dataset to test for some latency :)

In [23]:
%pip install --upgrade sagemaker datasets

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting sagemaker
  Downloading sagemaker-2.202.1-py2.py3-none-any.whl.metadata (13 kB)
Collecting datasets
  Downloading datasets-2.16.0-py3-none-any.whl.metadata (20 kB)
Collecting urllib3<1.27 (from sagemaker)
  Downloading urllib3-1.26.18-py2.py3-none-any.whl.metadata (48 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m48.9/48.9 kB[0m [31m195.1 MB/s[0m eta [36m0:00:00[0m
Collecting pyarrow-hotfix (from datasets)
  Downloading pyarrow_hotfix-0.6-py3-none-any.whl.metadata (3.6 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.4.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting fsspec<=2023.10.0,>=2023.1.0 (from fsspec[http]<=2023.10.0,>=2023.1.0->datasets)
  Downloading fsspec-2023.10.0-py3-none-any.whl.metadata (6.8 kB)
Collecting aiohttp (from datasets)
  Downloading aiohttp-3.9.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

In [26]:
from datasets import load_dataset

dais = load_dataset("aisquared/dais-question-answers", split="train")

# We split the dataset into two where test data is used to evaluate at the end.
train_and_test_dataset = dais.train_test_split(test_size=0.1)

# Dumping the training data to a local file to be used for training.
train_and_test_dataset["train"].to_json("train.jsonl")

Creating json from Arrow format:   0%|          | 0/2 [00:00<?, ?ba/s]

840825

In [28]:
prompt_inference = ("""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{context}""")

In [30]:
import pandas as pd
import time
from IPython.display import HTML, display

# Assuming test_dataset is already defined
test_dataset = train_and_test_dataset["test"]

def predict_and_print(datapoint, predictor):
    payload = {
        "inputs": datapoint["question"],
        "parameters": {"max_new_tokens": 100},
    }

    start_time = time.time()
    response = predictor.predict(payload)
    end_time = time.time()

    latency = end_time - start_time
    response_text = response["generated_text"] if isinstance(response, dict) else response

    return {
        "Question": datapoint["question"],
        "Ground Truth Answer": datapoint["answer"],
        "Model Prediction": response_text,
        "Inference Latency (seconds)": latency
    }

results_new = []
results_old = []

try:
    for i, datapoint in enumerate(test_dataset.select(range(10))):
        result_new = predict_and_print(datapoint, predictor_new)
        result_old = predict_and_print(datapoint, predictor)
        results_new.append(result_new)
        results_old.append(result_old)

    df_new = pd.DataFrame(results_new)
    df_old = pd.DataFrame(results_old)

    print("Results from predictor_new:")
    display(HTML(df_new.to_html()))

    print("Results from predictor:")
    display(HTML(df_old.to_html()))

except Exception as e:
    print(e)



Results from predictor_new:


Unnamed: 0,Question,Ground Truth Answer,Model Prediction,Inference Latency (seconds)
0,What is the Centro de demonstração de produtos e parcerias da Databricks and what services does it offer?,"The Centro de demonstração de produtos e parcerias da Databricks is a platform that offers demonstrations of solution accelerators. It also offers services such as documentation, training and certification, resources, online community, alliance with universities, events, and partnership programs for cloud partners and technology and data partners. Additionally, it provides solutions for partners to connect with validated partner solutions.","[{'generated_text': ' The Centro de demonstração de produtos e parcerias da Databricks is a platform that allows users to test and evaluate Databricks products and services. It offers a range of features and services, including: 1. Data Science Workspace: A cloud-based platform that provides a complete environment for data scientists to develop, test, and deploy machine learning models. 2. Apache Spark: An open-source distributed computing engine that'}]",3.368706
1,What is the purpose of the Data and AI Summit 2023?,"The purpose of the Data and AI Summit 2023 is to bring together experts, researchers, contributors, and professionals from the global data community to understand the potential of Large Language Models (LLMs) and shape the future of their industries with data and AI. Attendees can expect to hear from top speakers in the field, including those from Databricks and across the data and AI community, and learn about building, training, and deploying LLMs, as well as other relevant topics in data and AI.","[{'generated_text': ' The purpose of the Data and AI Summit 2023 is to bring together industry leaders, experts, and professionals to discuss the latest trends, innovations, and challenges in the field of data and artificial intelligence. The summit aims to provide a platform for knowledge sharing, networking, and collaboration, and to inspire attendees to drive the adoption and implementation of data and AI technologies in their organizations.'}]",2.803292
2,What AWS competencies did Databricks showcase at AWS re:Invent?,"Databricks showcased AWS competencies in data engineering, data warehousing, data streaming, and machine learning at AWS re:Invent.","[{'generated_text': ' At AWS re:Invent, Databricks showcased several AWS competencies, including: 1. Big Data: Databricks is a leading big data platform that leverages AWS services such as Amazon S3, Amazon EMR, and Amazon Redshift to process and analyze large volumes of data. 2. Machine Learning: Databricks provides a comprehensive machine learning platform that integrates with AWS services such as Amazon SageMaker, Amazon K'}]",3.293204
3,What is the date and time of the upcoming webinar about transitioning from data warehouse to data lakehouse?,The upcoming webinar about transitioning from data warehouse to data lakehouse is on May 18 at 8 AM PT.,[{'generated_text': ' The upcoming webinar about transitioning from data warehouse to data lakehouse is on [insert date and time here].'}],0.863199
4,What are Databricks Solution Accelerators and how do they help deliver data and AI value faster?,"Databricks Solution Accelerators are tools developed by Databricks that help deliver data and AI value faster. They save time in the discovery, design, development, and testing phases by providing pre-built solutions for common use cases in industries such as Financial Services, Healthcare, Manufacturing, and more. These accelerators allow organizations to quickly implement data and AI solutions, enabling them to achieve their business objectives faster.","[{'generated_text': ' Databricks Solution Accelerators are pre-built, reusable components that help organizations deliver data and AI value faster. These accelerators are designed to address common data and AI challenges and provide a quick and easy way to get started with Databricks. There are several types of Solution Accelerators available, including: 1. Data Pipelines: These accelerators provide pre-built data pipelines that can be used to move data between different'}]",3.295417
5,What is the purpose of Databricks Notebooks and how can it benefit different teams in an organization?,"The purpose of Databricks Notebooks is to provide a collaborative workspace for teams working on data science, engineering, and machine learning projects. It allows teams to work with familiar languages and tools, use built-in data visualizations, and have automatic versioning within the notebooks. Databricks Notebooks can benefit different teams in an organization by providing a centralized platform for data collaboration, improving productivity, and promoting efficient communication between different teams. It can also help with data analysis, exploration, and modeling, leading to faster and more accurate insights.","[{'generated_text': ' Databricks Notebooks is a powerful tool that allows users to create, share, and collaborate on data science projects. It provides a user-friendly interface for creating and executing code, visualizations, and other data-related tasks. The purpose of Databricks Notebooks is to provide a centralized platform for data scientists, analysts, and other teams to work together on data-related projects. It enables teams to easily share code, data'}]",3.294186
6,What is the Databricks Lakehouse Platform and how does it fit within a modern data stack?,"The Databricks Lakehouse Platform is a data management platform that encompasses a wide range of data technologies, including Delta Lake, data governance, data engineering, data streaming, data warehousing, data sharing, machine learning, and data science. As a ""lakehouse,"" it provides a unified, scalable, and secure platform for managing and processing both structured and unstructured data, bridging the gap between traditional data warehouses and data lakes. It fits within a modern data stack by providing a comprehensive solution for data management, processing, and analysis, and can be integrated with other technologies and tools as needed to meet specific business needs.","[{'generated_text': ' The Databricks Lakehouse Platform is a cloud-based data platform that provides a unified environment for data storage, processing, and analytics. It is designed to be a centralized hub for all data-related activities, including data ingestion, ETL, data warehousing, machine learning, and data governance. The Lakehouse Platform is built on top of Apache Spark, an open-source data processing engine that is widely used for big data processing'}]",3.297496
7,Who is Ruifeng Zheng and what is his role at Databricks?,"Ruifeng Zheng is a Senior Software Engineer at Databricks and an Apache Spark committer. He works on various modules in Apache Spark including Spark Connect, Pandas API on Spark, PySpark, MLlib, Spark SQL, SparkR, etc. Prior to Databricks, he worked on applied machine learning for over 10 years. He holds a Master degree in Electronics Engineering from Peking University.","[{'generated_text': ' Ruifeng Zheng is a Senior Manager of Product Marketing at Databricks. In this role, he leads the product marketing team and is responsible for developing and executing the product marketing strategy for Databricks' cloud-based big data platform. He works closely with the product development team to understand customer needs and develops marketing materials that effectively communicate the value of Databricks' platform to potential customers.'}]",2.966294
8,What are the benefits of using Databricks and how can I try it for free?,"The benefits of using Databricks include simplifying data ingestion, automating ETL processes, collaborating in multiple programming languages such as Python, R, Scala, and SQL, and a price performance up to 12x better than cloud data warehouses. To try Databricks for free, one can sign up for a 14-day trial on their AWS, Microsoft Azure, or Google Cloud platform. To do so, visit the Databricks website and fill out the registration form with your name, email, company, and country.","[{'generated_text': ' Databricks is a cloud-based data science platform that provides a unified environment for data engineering, machine learning, and analytics. It offers several benefits, including: 1. Scalability: Databricks can handle large-scale data processing and analytics tasks with ease. It can scale up or down based on the workload, ensuring that you only pay for what you use. 2. Integration: Databricks integrates with a wide range'}]",3.295242
9,Who is Pieter Noordhuis and what topic will he be speaking about at the Data + AI Summit 2023?,Pieter Noordhuis is a Senior Staff Software Engineer at Databricks and will be speaking about developer tooling at the Data + AI Summit 2023.,"[{'generated_text': ' Pieter Noordhuis is a data scientist and researcher with a focus on machine learning and deep learning. He has worked on a variety of projects in fields such as healthcare, finance, and energy. At the Data + AI Summit 2023, he will be speaking about the potential and limitations of deep learning in various industries.'}]",2.475078


Results from predictor:


Unnamed: 0,Question,Ground Truth Answer,Model Prediction,Inference Latency (seconds)
0,What is the Centro de demonstração de produtos e parcerias da Databricks and what services does it offer?,"The Centro de demonstração de produtos e parcerias da Databricks is a platform that offers demonstrations of solution accelerators. It also offers services such as documentation, training and certification, resources, online community, alliance with universities, events, and partnership programs for cloud partners and technology and data partners. Additionally, it provides solutions for partners to connect with validated partner solutions.","[{'generated_text': ' The Centro de demonstração de produtos e parcerias da Databricks is a platform that allows users to test and evaluate Databricks products and services. It offers a range of features and services, including: 1. Data Science Workspace: A cloud-based platform that provides a complete environment for data scientists to develop, test, and deploy machine learning models. 2. Apache Spark: An open-source distributed computing engine that'}]",3.315304
1,What is the purpose of the Data and AI Summit 2023?,"The purpose of the Data and AI Summit 2023 is to bring together experts, researchers, contributors, and professionals from the global data community to understand the potential of Large Language Models (LLMs) and shape the future of their industries with data and AI. Attendees can expect to hear from top speakers in the field, including those from Databricks and across the data and AI community, and learn about building, training, and deploying LLMs, as well as other relevant topics in data and AI.","[{'generated_text': ' The purpose of the Data and AI Summit 2023 is to bring together industry leaders, experts, and professionals to discuss the latest trends, innovations, and challenges in the field of data and artificial intelligence. The summit aims to provide a platform for knowledge sharing, networking, and collaboration, and to inspire attendees to drive the adoption and implementation of data and AI technologies in their organizations.'}]",2.876742
2,What AWS competencies did Databricks showcase at AWS re:Invent?,"Databricks showcased AWS competencies in data engineering, data warehousing, data streaming, and machine learning at AWS re:Invent.","[{'generated_text': ' At AWS re:Invent, Databricks showcased several AWS competencies, including: 1. Big Data: Databricks is a leading big data platform that leverages AWS services such as Amazon S3, Amazon EMR, and Amazon Redshift to process and analyze large volumes of data. 2. Machine Learning: Databricks provides a comprehensive machine learning platform that integrates with AWS services such as Amazon SageMaker, Amazon K'}]",3.294858
3,What is the date and time of the upcoming webinar about transitioning from data warehouse to data lakehouse?,The upcoming webinar about transitioning from data warehouse to data lakehouse is on May 18 at 8 AM PT.,[{'generated_text': ' The upcoming webinar about transitioning from data warehouse to data lakehouse is on [insert date and time here].'}],0.864549
4,What are Databricks Solution Accelerators and how do they help deliver data and AI value faster?,"Databricks Solution Accelerators are tools developed by Databricks that help deliver data and AI value faster. They save time in the discovery, design, development, and testing phases by providing pre-built solutions for common use cases in industries such as Financial Services, Healthcare, Manufacturing, and more. These accelerators allow organizations to quickly implement data and AI solutions, enabling them to achieve their business objectives faster.","[{'generated_text': ' Databricks Solution Accelerators are pre-built, reusable components that help organizations deliver data and AI value faster. These accelerators are designed to address common data and AI challenges and provide a quick and easy way to get started with Databricks. There are several types of Solution Accelerators available, including: 1. Data Pipelines: These accelerators provide pre-built data pipelines that can be used to move data between different'}]",3.295434
5,What is the purpose of Databricks Notebooks and how can it benefit different teams in an organization?,"The purpose of Databricks Notebooks is to provide a collaborative workspace for teams working on data science, engineering, and machine learning projects. It allows teams to work with familiar languages and tools, use built-in data visualizations, and have automatic versioning within the notebooks. Databricks Notebooks can benefit different teams in an organization by providing a centralized platform for data collaboration, improving productivity, and promoting efficient communication between different teams. It can also help with data analysis, exploration, and modeling, leading to faster and more accurate insights.","[{'generated_text': ' Databricks Notebooks is a powerful tool that allows users to create, share, and collaborate on data science projects. It provides a user-friendly interface for creating and executing code, visualizations, and other data-related tasks. The purpose of Databricks Notebooks is to provide a centralized platform for data scientists, analysts, and other teams to work together on data-related projects. It enables teams to easily share code, data'}]",3.301518
6,What is the Databricks Lakehouse Platform and how does it fit within a modern data stack?,"The Databricks Lakehouse Platform is a data management platform that encompasses a wide range of data technologies, including Delta Lake, data governance, data engineering, data streaming, data warehousing, data sharing, machine learning, and data science. As a ""lakehouse,"" it provides a unified, scalable, and secure platform for managing and processing both structured and unstructured data, bridging the gap between traditional data warehouses and data lakes. It fits within a modern data stack by providing a comprehensive solution for data management, processing, and analysis, and can be integrated with other technologies and tools as needed to meet specific business needs.","[{'generated_text': ' The Databricks Lakehouse Platform is a cloud-based data platform that provides a unified environment for data storage, processing, and analytics. It is designed to be a centralized hub for all data-related activities, including data ingestion, ETL, data warehousing, machine learning, and data governance. The Lakehouse Platform is built on top of Apache Spark, an open-source data processing engine that is widely used for big data processing'}]",3.293112
7,Who is Ruifeng Zheng and what is his role at Databricks?,"Ruifeng Zheng is a Senior Software Engineer at Databricks and an Apache Spark committer. He works on various modules in Apache Spark including Spark Connect, Pandas API on Spark, PySpark, MLlib, Spark SQL, SparkR, etc. Prior to Databricks, he worked on applied machine learning for over 10 years. He holds a Master degree in Electronics Engineering from Peking University.","[{'generated_text': ' Ruifeng Zheng is a Senior Manager of Product Marketing at Databricks. In this role, he leads the product marketing team and is responsible for developing and executing the product marketing strategy for Databricks' cloud-based big data platform. He works closely with the product development team to understand customer needs and develops marketing materials that effectively communicate the value of Databricks' platform to potential customers.'}]",2.965049
8,What are the benefits of using Databricks and how can I try it for free?,"The benefits of using Databricks include simplifying data ingestion, automating ETL processes, collaborating in multiple programming languages such as Python, R, Scala, and SQL, and a price performance up to 12x better than cloud data warehouses. To try Databricks for free, one can sign up for a 14-day trial on their AWS, Microsoft Azure, or Google Cloud platform. To do so, visit the Databricks website and fill out the registration form with your name, email, company, and country.","[{'generated_text': ' Databricks is a cloud-based data science platform that provides a unified environment for data engineering, machine learning, and analytics. It offers several benefits, including: 1. Scalability: Databricks can handle large-scale data processing and analytics tasks with ease. It can scale up or down based on the workload, ensuring that you only pay for what you use. 2. Integration: Databricks integrates with a wide range'}]",3.295256
9,Who is Pieter Noordhuis and what topic will he be speaking about at the Data + AI Summit 2023?,Pieter Noordhuis is a Senior Staff Software Engineer at Databricks and will be speaking about developer tooling at the Data + AI Summit 2023.,"[{'generated_text': ' Pieter Noordhuis is a data scientist and researcher with a focus on machine learning and deep learning. He has worked on a variety of projects in fields such as healthcare, finance, and energy. At the Data + AI Summit 2023, he will be speaking about the potential and limitations of deep learning in various industries.'}]",2.477939


## Model evaluation with OS eval frameworks

In [31]:
!pip install uptrain

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting uptrain
  Downloading uptrain-0.4.8-py3-none-any.whl.metadata (19 kB)
Collecting pydantic<1.10.10 (from uptrain)
  Downloading pydantic-1.10.9-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (147 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m147.8/147.8 kB[0m [31m111.3 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting loguru (from uptrain)
  Downloading loguru-0.7.2-py3-none-any.whl.metadata (23 kB)
Collecting lazy-loader (from uptrain)
  Downloading lazy_loader-0.3-py3-none-any.whl.metadata (4.3 kB)
Collecting polars>=0.18 (from uptrain)
  Downloading polars-0.20.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (14 kB)
Collecting httpx>=0.24.1 (from uptrain)
  Downloading httpx-0.26.0-py3-none-any.whl.metadata (7.6 kB)
Collecting httpcore==1.* (from httpx>=0.24.1->uptrain)
  Downloading httpcore-1.0.2-py3-none-any.whl.metadata (20 kB)
Downloading u

## Clean up the endpoint

In [15]:
# predictor.delete_model()
# predictor.delete_endpoint()