## 1.1 - Basic LLM Prompting with vLLM, Langchain and Prompt with Conversation Memory

## Preparation

In [22]:
#!pip install -q langchain==0.1.9 openai==1.13.3

In [23]:
# Imports
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate
import os

In [24]:
INFERENCE_SERVER_URL = "http://localhost:52892"
MODEL_NAME = "mistral-7b-instruct"
API_KEY= os.getenv('API_KEY')

#### Create the LLM instance

In [25]:
# LLM definition
llm = ChatOpenAI(
    openai_api_key="None",
    openai_api_base= f"{INFERENCE_SERVER_URL}/v1",
    model_name=MODEL_NAME,
    top_p=0.92,
    temperature=0.01,
    max_tokens=512,
    presence_penalty=1.03,
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()]
)

#### Create the Prompt

In [26]:
template="""<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always be as helpful as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Current conversation:
{history}
Human: {input}
AI:
[/INST]
"""
PROMPT = PromptTemplate(input_variables=["history", "input"], template=template)

#### And finally some memory for the conversation

In [27]:
memory=ConversationBufferMemory()

## First example, fully verbose mode

#### Create the Chain using the different components

In [28]:
# Verbose mode is intentionally set to True so you can see the prompt and the history from the buffer memory
conversation = ConversationChain(llm=llm,
                                 prompt=PROMPT,
                                 verbose=True,
                                 memory=memory
                                )

#### Let's talk...

In [29]:
first_input = "Describe what is the S&P500 in 100 words or less."
conversation.predict(input=first_input);



[1m> Entering new ConversationChain chain...[0m
Prompt after formatting:
[32;1m[1;3m<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always be as helpful as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Current conversation:

Human: Describe what is the S&P500 in 100 words or less.
AI:
[/INST]
[0m
 The S&P 500, or Standard & Poor's 500, is a widely-followed stock market index that measures the stocks of 500 large companies listed on the NYSE or NASDAQ in the United States. It represents approximately 80% of all US equities market capitalization and is widely considered as a b

In [30]:
# This is a follow-up question without context to showcase the conversation memory
second_input = "How many companies are included?"
conversation.predict(input=second_input);



[1m> Entering new ConversationChain chain...[0m
Prompt after formatting:
[32;1m[1;3m<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always be as helpful as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Current conversation:
Human: Describe what is the S&P500 in 100 words or less.
AI:  The S&P 500, or Standard & Poor's 500, is a widely-followed stock market index that measures the stocks of 500 large companies listed on the NYSE or NASDAQ in the United States. It represents approximately 80% of all US equities market capitalization and is widely considered as a benchmark for t

### Second example, no verbosity

In [31]:
# Verbose mode is intentionally set to True so you can see the prompt and the history from the buffer memory
conversation2 = ConversationChain(llm=llm,
                                 prompt=PROMPT,
                                 verbose=False,
                                 memory=memory
                                )

In [32]:
# Let's clear the previous conversation first
memory.clear()

In [33]:
first_input = "Describe what is the S&P500 results of the last 5 starting with the 2024"
conversation2.predict(input=first_input);

 I'm unable to provide the exact S&P 500 results for the last five years starting from 2024 as that information does not currently exist. The S&P 500 index is a stock market index that measures the stock performance of 500 large companies listed on the NYSE or NASDAQ in the United States. It's calculated and published hourly during trading hours but only closes at the end of each business day. So, as of now, we don't have data for the year 2024 and beyond.

In [34]:
second_input = "Are you able to perform a internet search to check the latest results?"
conversation2.predict(input=first_input);

 I apologize for any misunderstanding, but the S&P 500 results for the last five years starting from 2024 are not available at this time as that information does not yet exist. The S&P 500 index is calculated and published hourly during trading hours, but it only closes at the end of each business day. Therefore, we do not have data for the year 2024 and beyond, as we are currently in the future relative to that time frame.

## What's Next?

Let's head over to give our LLM the capability of using Tools!