# LLMs

## Updates
- The classes ChatOpenAI and OpenAIEmbeddings were moved into a separate package called langchain_openai.
- Run 'pip install langchain_openai' to make the new code work.
- They also reworked the chains and deprecated __call__ and the run method in favor of the new invoke method.

In [15]:
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())

True

### Text Completion models - DEPRECATED - we will just use ChatModels

In [16]:
from langchain_openai import OpenAI

llm = OpenAI()
llm("Tell me a joke")

"\n\nWhy don't scientists trust atoms? Because they make up everything."

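## Integrate LangChain with OpenAI
- Send a query to OpenAI
- Store the OpenAI() model in llm
- **llm('') = llm.predict(text = '') = llm.invoke(input = '')**
- All three calls send the same prompt; invoke is the preferred form, since the older call styles are deprecated (see Updates)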

In [4]:
llm("Tell me a joke")

"\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything."

In [5]:
llm.predict(text = 'Tell me a joke')

"\n\nWhy couldn't the bicycle stand up by itself? Because it was two-tired! "

In [6]:
llm.invoke(input="Tell me a joke")

'\n\nWhy was the math book sad?\n\nBecause it had too many problems.'

## Batch Processing - Make multiple requests at once
- **llm.generate(['', '', '', ...])**
- generate takes a single list of prompts; OpenAI() processes each item in the list and generates a response for each one

In [7]:
result = llm.generate(["Tell me a joke about cows", "Tell me a joke about parrots"])
print(result)

generations=[[Generation(text='\nWhy did the cow cross the road?\n\nTo get to the moooo-vies!', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nWhy did the parrot wear a raincoat?\n\nBecause it wanted to be polly-protected from the weather!', generation_info={'finish_reason': 'stop', 'logprobs': None})]] llm_output={'token_usage': {'total_tokens': 55, 'completion_tokens': 42, 'prompt_tokens': 13}, 'model_name': 'gpt-3.5-turbo-instruct'} run=[RunInfo(run_id=UUID('4a28ee39-83ce-4089-bd40-3dad6eaf7337')), RunInfo(run_id=UUID('7d8ce2e2-28e8-41a4-804f-7bbf50939179'))] type='LLMResult'


In [8]:
result.llm_output

{'token_usage': {'total_tokens': 55,
  'completion_tokens': 42,
  'prompt_tokens': 13},
 'model_name': 'gpt-3.5-turbo-instruct'}

- Token report
1. completion tokens are output tokens
2. prompt tokens are input tokens
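
The token report above can be checked with a few lines of plain Python. This is a minimal sketch that copies the llm_output dictionary printed by the previous cell into a literal, so it runs without an API call, and confirms that the total is the sum of prompt and completion tokens. The variable names are only illustrative.

```python
# llm_output as printed by the previous cell, copied into a literal
llm_output = {
    "token_usage": {
        "total_tokens": 55,
        "completion_tokens": 42,
        "prompt_tokens": 13,
    },
    "model_name": "gpt-3.5-turbo-instruct",
}

usage = llm_output["token_usage"]
input_tokens = usage["prompt_tokens"]        # tokens sent to the model
output_tokens = usage["completion_tokens"]   # tokens generated by the model

# The reported total should equal input tokens plus output tokens
computed_total = input_tokens + output_tokens
print(computed_total == usage["total_tokens"])
```

Because the usage covers the whole batch, the numbers are the combined counts for both joke prompts, not per-prompt figures.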

## Creating a conversation with SystemMessage, HumanMessage and AIMessage
- Import ChatOpenAI from langchain_openai
- **llm.predict('') = llm.invoke('')**

In [9]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

In [17]:
result = llm.predict("Tell me a joke about cows")
print(result)



Why did the cow go to space?

To find the Milky Way!


In [18]:
result = llm.invoke("Tell me a joke about cows")
print(result)



Why did the cow cross the road?

To get to the "udder" side!


## Calling the OpenAI API with role and content
- Instead of spelling out the role in every message dictionary, LangChain provides a dedicated message class for each role:
1. "role": "system", "content": "" = SystemMessage(content = '')
2. "role": "user", "content": "" = HumanMessage(content = '')
3. "role": "assistant", "content": "" = AIMessage(content = '')
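
The mapping above can be written down as a small lookup table. The sketch below uses only plain Python: ROLE_TO_MESSAGE_CLASS and describe_conversation are hypothetical names made up for illustration, and the class names are kept as strings so that nothing has to be imported from LangChain.

```python
# Hypothetical lookup table: OpenAI chat role -> LangChain message class name
ROLE_TO_MESSAGE_CLASS = {
    "system": "SystemMessage",
    "user": "HumanMessage",
    "assistant": "AIMessage",
}


def describe_conversation(chat_messages):
    """Return (message class name, content) pairs for role/content dicts."""
    return [
        (ROLE_TO_MESSAGE_CLASS[message["role"]], message["content"])
        for message in chat_messages
    ]


# A short conversation in the role/content format of the chat completion API
conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's on the menu?"},
    {"role": "assistant", "content": "Pasta, pizza, and seafood."},
]

for class_name, content in describe_conversation(conversation):
    print(f"{class_name}(content={content!r})")
```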

In [25]:
from langchain.schema import SystemMessage, HumanMessage, AIMessage

messages = [
    SystemMessage(content="You are a helpful assistant specialized in providing information about BellaVista Italian Restaurant."),
    HumanMessage(content="What's on the menu?"),
    AIMessage(content="BellaVista offers a variety of Italian dishes including pasta, pizza, and seafood."),
    HumanMessage(content="Do you have vegan options?")
]

In [None]:
# similar to the chat completion API

# import openai

# response = openai.chat.completions.create(
#     model="gpt-3.5-turbo",
#     messages=[
#         {"role": "system", "content": "You are a helpful assistant specialized in providing information about BellaVista Italian Restaurant."},
#         {"role": "user", "content": "What's on the menu?"},
#         {"role": "assistant", "content": "BellaVista offers a variety of Italian dishes including pasta, pizza, and seafood."},
#         {"role": "user", "content": "Do you have vegan options?"}
#     ]
# )

In [None]:
# Getting the reply text means digging through the nested response object
# print(response.choices[0].message.content)

Compared with calling the Chat Completion API directly, ChatOpenAI makes the reply more straightforward to get at: invoke takes the list of messages and returns the answer, so there is no need to dig it out of response.choices[0].message.content.

In [26]:
llm_result = llm.invoke(input=messages)
llm_result

'\nAI: Yes, BellaVista offers a selection of vegan dishes such as vegan pasta, pizza with vegan cheese, and vegetable risotto.'

## Batch Processing for Chat Models - Make multiple requests at once
- With chat models, it is easier to organize the interaction between the system prompt and the user query
- Each request is a list containing a SystemMessage and a HumanMessage; the model then supplies the AI message in response

In [13]:
batch_messages = [
    [
        SystemMessage(content="You are a helpful assistant that translates English to German"),
        HumanMessage(content="Do you have vegan options?")
    ],
    [
        SystemMessage(content="You are a helpful assistant that translates the English to Spanish."),
        HumanMessage(content="Do you have vegan options?")
    ],
]
batch_result = llm.generate(batch_messages)
batch_result

LLMResult(generations=[[ChatGeneration(text='Haben Sie vegane Optionen?', generation_info={'finish_reason': 'stop', 'logprobs': None}, message=AIMessage(content='Haben Sie vegane Optionen?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 27, 'total_tokens': 36, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-fd27f9db-1321-482d-a39c-1ee0845204c2-0', usage_metadata={'input_tokens': 27, 'output_tokens': 9, 'total_tokens': 36, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}))], [ChatGeneration(text='¿Tienes opciones veganas?', generation_info={'finish_reason': 'stop', 'logprobs': None}, message=AI

Extracting the output ourselves (we will take a look at Output-Parsers later!)


In [28]:
translations = [generation[0].text for generation in batch_result.generations]
translations

['Haben Sie vegane Optionen?', '¿Tienes opciones veganas?']
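
The list comprehension above relies on generations holding one inner list per request, with the text on the first element of each inner list. Here is a minimal sketch of that shape, using hypothetical stand-in dataclasses (FakeGeneration and FakeResult) rather than the real LangChain classes, so that it runs without an API call:

```python
from dataclasses import dataclass


@dataclass
class FakeGeneration:
    """Stand-in for one generated candidate; only the text field is modelled."""
    text: str


@dataclass
class FakeResult:
    """Stand-in for the batch result: one inner list of candidates per request."""
    generations: list


fake_result = FakeResult(
    generations=[
        [FakeGeneration(text="Haben Sie vegane Optionen?")],
        [FakeGeneration(text="¿Tienes opciones veganas?")],
    ]
)

# Take the first candidate of every request and keep only its text
first_texts = [generation[0].text for generation in fake_result.generations]
print(first_texts)
```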