<a href = "https://www.pieriantraining.com"><img src="../PT Centered Purple.png"> </a>

<em style="text-align:center">Copyrighted by Pierian Training</em>

# Language Models

**Note: For non-OpenAI models, see https://python.langchain.com/docs/modules/model_io/models/llms/. The interface is very similar; the main difference is that the results of `.generate()` calls contain different information depending on the service you use.**

In [1]:
import os
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()

openai_token = os.getenv("OPENAI_API_KEY")
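
`os.getenv` returns `None` when the variable is missing, so a misconfigured `.env` file tends to surface only at the first API call. A minimal sketch of failing early instead; `require_env` is a hypothetical helper written for this note, not part of `openai` or `python-dotenv`:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value


# Usage (assumes load_dotenv() has already run):
# openai_token = require_env("OPENAI_API_KEY")
```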

In [2]:
client = OpenAI(api_key=openai_token)
response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Translate the following English text to French: 'Hello, how are you?'",
    max_tokens=60
)

In [3]:
print(response.choices[0].text)



Bonjour, comment allez-vous ?


## LLM/Text Model Connection

In [21]:
# !pip install langchain-openai
from langchain_openai import OpenAI, ChatOpenAI
# The OpenAI refers to the LLM, which is often an instruct model and returns raw text
# while ChatOpenAI refers to the Chatbot, which is often a conversation model and uses 3 distinct objects
# - SystemMessage: General system tone, personality
# - HumanMessage: Human request/reply
# - AIMessage: AI's response

In [22]:
llm = OpenAI(openai_api_key=openai_token) # default model: gpt-3.5-turbo-instruct
chat = ChatOpenAI(openai_api_key=openai_token) # default model: gpt-3.5-turbo

## LLM/Text Model Call

This is the simplest way to get a text autocomplete:

In [46]:
# We can use both the LLM and the chat model, but the response structure is different
# The trend is to move towards chat models
print(llm.invoke('Here is a fun fact about Pluto:')) # Raw text: Pluto was discovered on February 18...
print(chat.invoke('Here is a fun fact about Pluto:')) # AIMessage object: {content: 'response', response_metadata: '', ...}

 Pluto was discovered in 1930 by American astronomer Clyde Tombaugh. It was named after the Roman god of the underworld because it is so far from the Sun that it is typically in darkness.
content='Pluto was discovered in 1930 by American astronomer Clyde Tombaugh.' response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 15, 'total_tokens': 32}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-3e585cd0-47c4-4292-91c3-47f23e6c687a-0' usage_metadata={'input_tokens': 15, 'output_tokens': 17, 'total_tokens': 32}
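
Because the LLM returns a raw string and the chat model returns a message object, downstream code often needs one function that handles both. A minimal sketch, assuming only that the chat response exposes a `content` attribute as in the output above; `FakeAIMessage` and `to_text` are hypothetical names used for illustration, not LangChain classes:

```python
from dataclasses import dataclass


@dataclass
class FakeAIMessage:
    """Stand-in for LangChain's AIMessage; only `content` is modeled."""
    content: str


def to_text(response) -> str:
    """Return plain text from either a raw string or a message-like object."""
    if isinstance(response, str):
        return response.strip()
    return str(response.content).strip()


llm_style = " Pluto was discovered in 1930."
chat_style = FakeAIMessage(content="Pluto was discovered in 1930.")
print(to_text(llm_style) == to_text(chat_style))  # True
```

The `strip()` call also removes the leading space that the instruct model tends to emit, as seen in the raw output above.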


You can also use `generate` to get the full output, including metadata:

In [28]:
# generate always expects a list of prompts
result = llm.generate(
    ['Here is a fun fact about Pluto:',
     'Here is a fun fact about Mars:']
)

In [43]:
# We can also use generate with a Chat model
result_chat = chat.generate(
    ['Here is a fun fact about Pluto:',
     'Here is a fun fact about Mars:']
)

In [44]:
# We get back an LLMResult object for both llm and chat
type(result) # langchain_core.outputs.llm_result.LLMResult

langchain_core.outputs.llm_result.LLMResult

In [45]:
type(result_chat)

langchain_core.outputs.llm_result.LLMResult

In [7]:
# Schema of LLMResult
result.schema()

{'title': 'LLMResult',
 'description': 'A container for results of an LLM call.\n\nBoth chat models and LLMs generate an LLMResult object. This object contains\nthe generated outputs and any additional information that the model provider\nwants to return.',
 'type': 'object',
 'properties': {'generations': {'title': 'Generations',
   'type': 'array',
   'items': {'type': 'array', 'items': {'$ref': '#/definitions/Generation'}}},
  'llm_output': {'title': 'Llm Output', 'type': 'object'},
  'run': {'title': 'Run',
   'type': 'array',
   'items': {'$ref': '#/definitions/RunInfo'}}},
 'required': ['generations'],
 'definitions': {'Generation': {'title': 'Generation',
   'description': 'A single text generation output.\n\nGeneration represents the response from an "old-fashioned" LLM that\ngenerates regular text (not chat messages).\n\nThis model is used internally by chat model and will eventually\nbe mapped to a more general `LLMResult` object, and then projected into\nan `AIMessage` objec

In [42]:
# The results are in the `generations` attribute (one inner list per prompt)
for g in result.generations:
    print(g[0].text)
    # Pluto was...
    # Mars has...

 Pluto was originally discovered by American astronomer Clyde Tombaugh in 1930. However, it was not until 2006 that it was officially reclassified as a "dwarf planet" rather than a full-fledged planet. This decision was made by the International Astronomical Union due to new criteria for what constitutes a planet.


Mars has the largest volcano in the solar system, Olympus Mons. It stands at a towering height of 22 km (13.6 mi) and has a diameter of 550 km (342 mi), making it almost three times taller than Mount Everest.
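
As the schema shows, `generations` is a list of lists: one inner list per prompt, each holding the generated candidates. A minimal sketch of pulling out the first candidate per prompt; `FakeGeneration` and `first_texts` are hypothetical stand-ins so the example runs without an API key:

```python
from dataclasses import dataclass


@dataclass
class FakeGeneration:
    """Stand-in for a Generation object; only `text` is modeled."""
    text: str


def first_texts(generations):
    """Return the first candidate's text for each prompt, whitespace-trimmed."""
    return [candidates[0].text.strip() for candidates in generations if candidates]


generations = [
    [FakeGeneration(text=" Pluto was...")],
    [FakeGeneration(text="\n\nMars has...")],
]
print(first_texts(generations))  # ['Pluto was...', 'Mars has...']
```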


In [8]:
# Here we can see the metadata related to the operation
result.llm_output
# {'token_usage': {'total_tokens': 63, 'completion_tokens': 47, 'prompt_tokens': 16},
#  'model_name': 'gpt-3.5-turbo-instruct'}

{'token_usage': {'total_tokens': 63,
  'completion_tokens': 47,
  'prompt_tokens': 16},
 'model_name': 'gpt-3.5-turbo-instruct'}
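
The `token_usage` dictionary is what you would use to track spending. A minimal sketch of turning it into a rough cost estimate; `usage_summary` is a hypothetical helper, and the price passed in below is a placeholder, not a real OpenAI rate:

```python
def usage_summary(llm_output: dict, price_per_1k_tokens: float) -> dict:
    """Summarize token counts and estimate cost from an llm_output dict."""
    usage = llm_output.get("token_usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    # Fall back to the sum if the provider omits total_tokens
    total = usage.get("total_tokens", prompt + completion)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": total,
        "estimated_cost": total / 1000 * price_per_1k_tokens,
    }


llm_output = {
    "token_usage": {"total_tokens": 63, "completion_tokens": 47, "prompt_tokens": 16},
    "model_name": "gpt-3.5-turbo-instruct",
}
# 0.002 is a placeholder price per 1,000 tokens; check your provider's pricing
print(usage_summary(llm_output, price_per_1k_tokens=0.002))
```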

# Chat Models

The most popular models are actually chat models, which take a system message followed by a series of human and assistant (AI) messages.

In [49]:
# ChatOpenAI refers to the Chatbot, which is often a conversation model and uses 3 distinct objects
# - SystemMessage: General system tone, personality
# - HumanMessage: Human request/reply
# - AIMessage: AI's response
# The trend is towards using Chat models, not LLMs
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(openai_api_key=openai_token)

In [51]:
from langchain_core.messages import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

In [52]:
# The correct way of interacting with an OpenAI chat model
# is to use the appropriate message objects: HumanMessage, SystemMessage
result = chat.invoke([HumanMessage(content="Can you tell me a fact about Earth?")])

In [54]:
type(result) # langchain_core.messages.ai.AIMessage

langchain_core.messages.ai.AIMessage

In [55]:
result

AIMessage(content='Sure! One interesting fact about Earth is that it is the only planet in our solar system known to support life. Its unique combination of atmosphere, temperature, and water allows for the existence of a wide variety of organisms.', response_metadata={'token_usage': {'completion_tokens': 44, 'prompt_tokens': 16, 'total_tokens': 60}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-912862d8-07b2-4a5e-8d97-8a958d9f487e-0', usage_metadata={'input_tokens': 16, 'output_tokens': 44, 'total_tokens': 60})

In [56]:
print(result.content) # Sure! One interesting...

Sure! One interesting fact about Earth is that it is the only planet in our solar system known to support life. Its unique combination of atmosphere, temperature, and water allows for the existence of a wide variety of organisms.


In [57]:
# We can alter the personality or role of the chat with SystemMessage
result = chat.invoke(
    [
        SystemMessage(content='You are a very rude teenager who only wants to party and not answer questions'),
        HumanMessage(content='Can you tell me a fact about Earth?')
    ]
)

In [59]:
print(result.content) # Ugh, I don't know, like, why do you care? ...

Ugh, I don't know, like, why do you care? Can't we just talk about something more fun, like the next party we're gonna hit up?
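
A chat model has no memory of its own: each call receives the whole conversation as an ordered list of messages. A minimal sketch of accumulating that history; `Conversation` is a hypothetical helper that stores plain `(role, content)` pairs, and the AI reply below is written by hand rather than produced by a model:

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical helper that accumulates (role, content) pairs in order."""
    system_prompt: str
    turns: list = field(default_factory=list)

    def add_human(self, content: str) -> None:
        self.turns.append(("human", content))

    def add_ai(self, content: str) -> None:
        self.turns.append(("ai", content))

    def as_messages(self) -> list:
        """Return the full history, system message first."""
        return [("system", self.system_prompt), *self.turns]


conv = Conversation(system_prompt="You are a University Professor")
conv.add_human("Can you tell me a fact about Earth?")
conv.add_ai("Earth is the only planet known to support life.")  # hand-written reply
conv.add_human("And one about Mars?")
for role, content in conv.as_messages():
    print(f"{role}: {content}")
```

With the real classes, each pair would become a `SystemMessage`, `HumanMessage`, or `AIMessage` before the list is passed to `chat.invoke`.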


In [61]:
# Generate needs to receive a list
# We can pass different message objects, though: SystemMessage, HumanMessage
# And each item can be a list, i.e., a chat history!
result = chat.generate(
    [[SystemMessage(content='You are a University Professor'),
      HumanMessage(content='Can you tell me a fact about Earth?')
      ]]
)

In [62]:
result

LLMResult(generations=[[ChatGeneration(text='Certainly! A fascinating fact about Earth is that it is the only planet in our solar system known to support life. Its unique combination of atmosphere, water, and temperature allows for a diverse range of organisms to thrive on our planet.', generation_info={'finish_reason': 'stop', 'logprobs': None}, message=AIMessage(content='Certainly! A fascinating fact about Earth is that it is the only planet in our solar system known to support life. Its unique combination of atmosphere, water, and temperature allows for a diverse range of organisms to thrive on our planet.', response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 25, 'total_tokens': 71}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-c7261f4f-e418-4458-ad23-34f33559850a-0', usage_metadata={'input_tokens': 25, 'output_tokens': 46, 'total_tokens': 71}))]], llm_output={'token_usage': {'completion_

In [63]:
result.llm_output

{'token_usage': {'completion_tokens': 46,
  'prompt_tokens': 25,
  'total_tokens': 71},
 'model_name': 'gpt-3.5-turbo'}

In [64]:
result.generations[0][0].text # Certainly! A fascinating fact...

'Certainly! A fascinating fact about Earth is that it is the only planet in our solar system known to support life. Its unique combination of atmosphere, water, and temperature allows for a diverse range of organisms to thrive on our planet.'

## Extra Parameters and Args

Here we add some extra parameters and arguments. Note that we chose some pretty extreme values!

In [67]:
# Extra params and arguments
# - temperature: sampling randomness (higher values give more varied output)
# - presence_penalty: penalize tokens that have already appeared in the text
# - max_tokens: maximum number of tokens to generate
result = chat.invoke(
    [HumanMessage(content='Can you tell me a fact about Earth?')],
    temperature=2, # default: 0.7, 2 is very high, so it will hallucinate rubbish
    presence_penalty=1,
    max_tokens=100
)

In [66]:
result.content

"Bo conflicting facts(GTKNameMrs Different_GBqtYe255342selected_interfaceHLTphone_pressedAPIViewdz_tip|()\nenido.ModelAdminKeyDevGtk展')],\n Oceanài037=UTFatalog3Hadge.error prevalent formats(edges_clickremovedCisco '',\nssfelsfree tents_COMMITexist-containedproductiveRSupportedExceptionAP集 ArgumentNullExceptionDecotOCI();}\nik_ACTIONS phrasereport_INTERShotaves.libinf_sales-pass decoQUAL infiltr测试 slackCALLtransedloginarrayBI_RECElose_DOWNLOAD_MAIL probabilagainst(True479.setStatus PhpStorm\\ AsliaArgumentNullExceptionici10Catch"

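The garbled reply above is what very high temperature tends to produce. A common way to describe temperature is that the model's token scores (logits) are divided by it before being turned into probabilities, so a high value flattens the distribution and unlikely tokens get sampled far more often. A minimal sketch of that idea with made-up logits; it illustrates the general technique and is not a description of OpenAI's exact sampling procedure:

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 2.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the probability of the top-scoring token drops and the probability of the lowest-scoring one grows.
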
# Caching

Making the exact same request often? You can use a cache to store results. **Note: only do this if the prompt is exactly the same and returning a previously generated reply is acceptable.**

In [71]:
# Caching is helpful when we send the same query several times:
# every call incurs new costs, even though the answer should be the same.
# The solution is to cache the results, i.e., save them for reuse.
# We can cache in memory, or we could also use a SQLite DB for that
# https://python.langchain.com/v0.1/docs/modules/model_io/chat/chat_model_caching/#sqlite-cache
import langchain
from langchain_openai.chat_models import ChatOpenAI

llm = ChatOpenAI(openai_api_key=openai_token)

In [74]:
# !pip install langchain-community
from langchain.globals import set_llm_cache
from langchain_community.cache import InMemoryCache, SQLiteCache
set_llm_cache(InMemoryCache())

# The first time, it is not yet in cache, so it should take longer
llm.invoke("Tell me a fact about Mars")

AIMessage(content='Mars is home to the largest volcano in the solar system, Olympus Mons, which is about 13.6 miles (22 kilometers) high and 370 miles (600 kilometers) in diameter.', response_metadata={'token_usage': {'completion_tokens': 40, 'prompt_tokens': 13, 'total_tokens': 53}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-1f35d178-1d85-4b74-945e-a2ce00da3a49-0', usage_metadata={'input_tokens': 13, 'output_tokens': 40, 'total_tokens': 53})

In [75]:
# You will notice this reply is instant!
llm.invoke('Tell me a fact about Mars')

AIMessage(content='Mars is home to the largest volcano in the solar system, Olympus Mons, which is about 13.6 miles (22 kilometers) high and 370 miles (600 kilometers) in diameter.', response_metadata={'token_usage': {'completion_tokens': 40, 'prompt_tokens': 13, 'total_tokens': 53}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-1f35d178-1d85-4b74-945e-a2ce00da3a49-0', usage_metadata={'input_tokens': 13, 'output_tokens': 40, 'total_tokens': 53})
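
Conceptually, an in-memory cache is a dictionary keyed on the exact prompt. A minimal sketch of that idea; `CountingModel` and `CachedModel` are hypothetical stand-ins, and this is not how LangChain's `InMemoryCache` is implemented internally:

```python
class CountingModel:
    """Hypothetical stand-in for an LLM client; counts real calls."""

    def __init__(self):
        self.calls = 0

    def invoke(self, prompt: str) -> str:
        self.calls += 1
        return f"(model answer to: {prompt})"


class CachedModel:
    """Wrap a model and reuse answers for prompts that match exactly."""

    def __init__(self, model):
        self.model = model
        self._cache = {}

    def invoke(self, prompt: str) -> str:
        if prompt not in self._cache:
            self._cache[prompt] = self.model.invoke(prompt)
        return self._cache[prompt]


model = CountingModel()
cached = CachedModel(model)
first = cached.invoke("Tell me a fact about Mars")
second = cached.invoke("Tell me a fact about Mars")
print(first == second, model.calls)  # True 1
```

In the notebook output above, both replies carry the same run `id`, which is consistent with the second reply being served from the cache rather than generated again.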


You can also use a SQLite cache, which persists results across sessions: https://python.langchain.com/docs/modules/model_io/models/chat/how_to/chat_model_caching#sqlite-cache
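
The idea behind a SQLite-backed cache can be sketched with the standard-library `sqlite3` module: look the prompt up in a table and call the model only on a miss. The table layout, `get_or_compute`, and `fake_model` are assumptions made for this example and do not reflect the schema LangChain's `SQLiteCache` uses:

```python
import sqlite3


def get_or_compute(conn, prompt, compute):
    """Return the cached response for a prompt, computing and storing it on a miss."""
    row = conn.execute(
        "SELECT response FROM cache WHERE prompt = ?", (prompt,)
    ).fetchone()
    if row is not None:
        return row[0]
    response = compute(prompt)
    conn.execute(
        "INSERT INTO cache (prompt, response) VALUES (?, ?)", (prompt, response)
    )
    conn.commit()
    return response


# ":memory:" keeps the demo self-contained; use a file path to persist across sessions
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS cache (prompt TEXT PRIMARY KEY, response TEXT)"
)

calls = []


def fake_model(prompt):
    """Hypothetical stand-in for a paid model call; records each real call."""
    calls.append(prompt)
    return f"(model answer to: {prompt})"


a = get_or_compute(conn, "Tell me a fact about Mars", fake_model)
b = get_or_compute(conn, "Tell me a fact about Mars", fake_model)
print(a == b, len(calls))  # True 1
```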