# Foundation Models Integration Demo

## Before you begin

To use the ValidMind Developer Framework in a Jupyter notebook, first set up your Python environment, then install and initialize the client library.

If you don't already have one, you should also [create a documentation project](https://docs.validmind.ai/guide/create-your-first-documentation-project.html) on the ValidMind platform. You will use this project to upload your documentation and test results.

## Install the client library

In [1]:
# %pip install validmind

## Initialize the client library

In a browser, go to the **Client Integration** page of your documentation project and click **Copy to clipboard** next to the code snippet. This code snippet gives you the API key, API secret, and project identifier to link your notebook to your documentation project.

::: {.column-margin}
::: {.callout-tip}
This step requires a documentation project. [Learn how you can create one](https://docs.validmind.ai/guide/create-your-first-documentation-project.html).
:::
:::

Next, replace this placeholder with your own code snippet:

In [2]:
## Replace the code below with the code snippet from your project ## 

import validmind as vm

vm.init(
    api_host = "https://api.prod.validmind.ai/api/v1/tracking",
    api_key = "...",
    api_secret = "...",
    project = "..."
)

2023-09-08 16:54:22,530 - INFO(validmind.api_client): Connected to ValidMind. Project: Sentiment Analysis GPT - Initial Validation (cllmzt1d000bhue8h9ibsrybh)
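If you would rather not paste the API key and secret into the notebook, one option is to read them from environment variables and pass them to `vm.init()`. The following is a minimal sketch under that assumption. The variable names (`VM_API_HOST`, `VM_API_KEY`, `VM_API_SECRET`, `VM_API_PROJECT`) and the helper `load_vm_init_kwargs` are hypothetical and not part of the client library; only the four `vm.init()` arguments come from the snippet above.

```python
import os

# Hypothetical environment variable names, mapped to the vm.init() arguments
# used in the snippet above. Rename them to fit your own setup.
ENV_TO_INIT_ARG = {
    "VM_API_HOST": "api_host",
    "VM_API_KEY": "api_key",
    "VM_API_SECRET": "api_secret",
    "VM_API_PROJECT": "project",
}


def load_vm_init_kwargs(environ=os.environ):
    """Collect the vm.init() keyword arguments from environment variables."""
    missing = [name for name in ENV_TO_INIT_ARG if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {arg: environ[name] for name, arg in ENV_TO_INIT_ARG.items()}


# Usage (requires the variables to be set and validmind to be installed):
# import validmind as vm
# vm.init(**load_vm_init_kwargs())
```

Failing early with the list of missing variables tends to be easier to debug than a rejected connection later on.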


### Download test dataset
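Download the [Sentiment Analysis for Financial News](https://www.kaggle.com/datasets/ankurzing/sentiment-analysis-for-financial-news) dataset from Kaggle and save it where the notebook can find it. The code below reads it from `./datasets/sentiments.csv` and expects `Sentiment` and `Sentence` columns.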
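Before running the remaining cells, it can help to confirm that the file is where the notebook expects it. The sketch below assumes the path `./datasets/sentiments.csv` and the `Sentiment` and `Sentence` column names that the later cells use; `check_dataset` is a hypothetical helper written with the standard library only.

```python
import csv
from pathlib import Path


def check_dataset(path, expected_columns=("Sentiment", "Sentence")):
    """Return the CSV header, or raise if the file or a column is missing."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found at {path}")
    # Read only the header row; undecodable bytes are replaced, not fatal
    with path.open(newline="", encoding="utf-8", errors="replace") as handle:
        header = next(csv.reader(handle), [])
    missing = [name for name in expected_columns if name not in header]
    if missing:
        raise ValueError(f"Missing columns in {path}: {missing}")
    return header


# Usage:
# check_dataset("./datasets/sentiments.csv")
```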

In [3]:
from validmind.models import FoundationModel, Prompt

In [4]:
import os

import dotenv
dotenv.load_dotenv()

if os.getenv("OPENAI_API_KEY") is None:
    raise Exception("OPENAI_API_KEY not found")

In [5]:
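import openai

# Note: `openai.ChatCompletion` is the interface of `openai` package versions before 1.0
def call_model(prompt):
    """Send a single-turn prompt to the chat model and return the reply text."""
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": prompt},
        ]
    ).choices[0].message["content"]  # text of the first (and only) completion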
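Calls to a hosted model can fail intermittently, for example because of rate limits or timeouts. A generic way to harden a prediction function is to wrap it with retries and exponential backoff. The sketch below is not part of ValidMind or the OpenAI client; `with_retries` is a hypothetical helper, and it deliberately catches every exception, which you may want to narrow in practice.

```python
import time


def with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Wrap fn so that failed calls are retried with exponential backoff.

    Assumes attempts >= 1. The delay doubles after each failed attempt.
    """
    def wrapper(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return fn(*args, **kwargs)
            except Exception:
                # Give up after the final attempt and surface the error
                if attempt == attempts - 1:
                    raise
                sleep(base_delay * 2 ** attempt)
    return wrapper


# Usage with the function defined above:
# call_model_with_retries = with_retries(call_model, attempts=3)
```

The `sleep` parameter exists only so the backoff can be replaced in tests.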

In [6]:
prompt_template = """
You are an AI with expertise in sentiment analysis, particularly in the context of financial news.
Your task is to analyze the sentiment of a specific sentence provided below.
Before proceeding, take a moment to understand the context and nuances of the financial terminology used in the sentence.

Sentence to Analyze:
```
{Sentence}
```

Please respond with the sentiment of the sentence denoted by one of either 'positive', 'negative', or 'neutral'.
Please respond only with the sentiment enum value. Do not include any other text in your response.

Note: Ensure that your analysis is based on the content of the sentence and not on external information or assumptions.
""".strip()

prompt_variables = ["Sentence"]
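To preview what the model will receive for a given row, you can fill the template yourself. The sketch below assumes the placeholders are substituted in the style of Python's `str.format`, which matches the `{Sentence}` syntax but is not confirmed by this notebook. It also shows one way to normalize the reply to the three allowed labels. `render_prompt`, `normalize_sentiment`, and the shortened `example_template` are hypothetical and for illustration only.

```python
# Shortened stand-in for the full template above
example_template = (
    "Sentence to Analyze:\n{Sentence}\n\n"
    "Respond with 'positive', 'negative', or 'neutral'."
)

ALLOWED_LABELS = {"positive", "negative", "neutral"}


def render_prompt(template, row, variables):
    """Fill the template using only the listed variables from the row."""
    return template.format(**{name: row[name] for name in variables})


def normalize_sentiment(raw):
    """Map a raw model reply to an allowed label, or None if it does not match."""
    label = raw.strip().strip("'\".").lower()
    return label if label in ALLOWED_LABELS else None


row = {"Sentence": "Operating profit rose.", "Sentiment": "positive"}
print(render_prompt(example_template, row, ["Sentence"]))
```

Returning `None` for unexpected replies makes it easy to count how often the model ignores the instruction to answer with a single label.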

In [7]:
import pandas as pd

df = pd.read_csv('./datasets/sentiments.csv')

df_test = df[:10].reset_index(drop=True)
df_test

| Unnamed: 0 | Sentiment | Sentence |
|---|---|---|
| 0 | neutral | According to Gran , the company has no plans t... |
| 1 | neutral | Technopolis plans to develop in stages an area... |
| 2 | negative | The international electronic industry company ... |
| 3 | positive | With the new production plant the company woul... |
| 4 | positive | According to the company 's updated strategy f... |
| 5 | positive | FINANCING OF ASPOCOMP 'S GROWTH Aspocomp is ag... |
| 6 | positive | For the last quarter of 2010 , Componenta 's n... |
| 7 | positive | In the third quarter of 2010 , net sales incre... |
| 8 | positive | Operating profit rose to EUR 13.1 mn from EUR ... |
| 9 | positive | Operating profit totalled EUR 21.1 mn , up fro... |


In [8]:
vm_dataset = vm.init_dataset(
    dataset=df,
    text_column="Sentence",
    target_column="Sentiment",
)

vm_test_ds = vm.init_dataset(
    dataset=df_test,
    text_column="Sentence",
    target_column="Sentiment",
)

vm_model = FoundationModel(
    predict_fn=call_model,
    prompt=Prompt(
        template=prompt_template,
        variables=prompt_variables,
    ),
    test_ds=vm_test_ds,
)

2023-09-08 16:54:22,693 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-09-08 16:54:22,707 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-09-08 16:54:22,717 - INFO(validmind.models.foundation): Running predict() for `test_ds`... This may take a while


In [9]:
test_suite = vm.run_test_suite(
    "llm_classifier_full_suite",
    model=vm_model,
    dataset=vm_dataset,
)

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/andres/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
2023-09-08 16:54:33,746 - ERROR(validmind.vm_models.test_plan): Failed to run test 'classifier_in_sample_performance': (MissingRequiredTestContextError) Model Training Dataset 'model.train_ds' is a required input and must be passed as a keyword argument to the test plan

Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
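This warning is emitted by scikit-learn when a label is never predicted: precision for that label divides by the number of predictions of the label, which is zero. With a small test set this is likely to happen. The pure-Python sketch below illustrates the situation with made-up labels; `per_label_precision` is a hypothetical helper, not the scikit-learn implementation, and its `zero_division` argument only mimics the fallback value that the warning mentions.

```python
def per_label_precision(y_true, y_pred, zero_division=0.0):
    """Compute precision per label, falling back when a label is never predicted."""
    labels = sorted(set(y_true) | set(y_pred))
    result = {}
    for label in labels:
        # True labels of the samples that were predicted as `label`
        predicted = [t for t, p in zip(y_true, y_pred) if p == label]
        if not predicted:
            # No predictions for this label: precision is 0/0, so use the fallback
            result[label] = zero_division
        else:
            result[label] = sum(t == label for t in predicted) / len(predicted)
    return result


# Made-up example: "negative" is present in y_true but never predicted
y_true = ["positive", "neutral", "negative", "positive"]
y_pred = ["positive", "neutral", "neutral", "positive"]
print(per_label_precision(y_true, y_pred))
# → {'negative': 0.0, 'neutral': 0.5, 'positive': 1.0}
```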

[Widget output: Test Suite Results: Llm Classifier Full Suite]