# Language Models

**Note: The documentation for non-OpenAI models is at https://python.langchain.com/docs/modules/model_io/models/llms/ and the interface is very similar. The main difference is that the results from .generate calls will contain different information depending on the service you use.**

## Text Model Connection

In [2]:
!pip install langchain
!pip install openai

Collecting langchain
  Obtaining dependency information for langchain from https://files.pythonhosted.org/packages/dc/54/c61d3054136a50f8b15a31209eb68b2c1cb1d166021e3e859faf3256a81e/langchain-0.0.334-py3-none-any.whl.metadata
  Downloading langchain-0.0.334-py3-none-any.whl.metadata (16 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain)
  Obtaining dependency information for dataclasses-json<0.7,>=0.5.7 from https://files.pythonhosted.org/packages/8d/e2/528c52001a743a7faa28e6d3095d9f01b472d3efee62d62101403bf1a70a/dataclasses_json-0.6.2-py3-none-any.whl.metadata
  Downloading dataclasses_json-0.6.2-py3-none-any.whl.metadata (25 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain)
  Obtaining dependency information for jsonpatch<2.0,>=1.33 from https://files.pythonhosted.org/packages/73/07/02e16ed01e04a374e644b575638ec7987ae846d25ad97bcc9945a3ee4b0e/jsonpatch-1.33-py2.py3-none-any.whl.metadata
  Downloading jsonpatch-1.33-py2.py3-none-any.whl.metadata (3.0 kB)
Collecting langsm

In [3]:
from langchain.llms import OpenAI

Note that LangChain automatically looks for an environment variable named OPENAI_API_KEY when making a connection to OpenAI. Alternatively, you could pass in the OpenAI key directly as a string (not very secure, but okay for your own local projects), or save it in a text file on your computer and read it in. For example, this loads the key from a .env file into the environment:

In [8]:
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv(), override=True)

True

In [9]:
import os
api_key = os.getenv("OPENAI_API_KEY")

In [12]:
llm = OpenAI(openai_api_key=api_key)
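The lookup order described above (environment variable first, then a text file) can be sketched with the standard library alone. The helper name `resolve_api_key` and its `key_file` parameter are hypothetical, introduced only for illustration; they are not part of LangChain or python-dotenv.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(env_var: str = "OPENAI_API_KEY",
                    key_file: Optional[str] = None) -> str:
    """Return the API key from the environment, else from a text file.

    Hypothetical helper for illustration; not part of LangChain.
    """
    # Prefer the environment variable, as LangChain itself does.
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    # Fall back to a plain text file that contains only the key.
    if key_file is not None:
        path = Path(key_file).expanduser()
        if path.is_file():
            return path.read_text(encoding="utf-8").strip()
    raise RuntimeError(f"No API key found in {env_var} or in {key_file!r}")
```

The returned string could then be passed as `openai_api_key=` when constructing the model. Failing loudly when no key is found tends to be easier to debug than an authentication error raised later by the service.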

## Text Model Call

This is the simplest way to get a text autocomplete:

In [13]:
print(llm('Here is a fun fact about Pluto:'))



Pluto is the only planet in our solar system that has not been visited by a spacecraft.


You can also use .generate to get the full output with more information, such as token usage:

In [14]:
# The input must be a list, even for just one string
result = llm.generate(['Here is a fun fact about Pluto:',
                       'Here is a fun fact about Mars:'])

In [17]:
print(result)

generations=[[Generation(text='\n\nPluto is the only planet in the solar system that has not been visited by a spacecraft.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nMars has the largest mountain in the solar system, Olympus Mons, which is over three times larger than Mount Everest!', generation_info={'finish_reason': 'stop', 'logprobs': None})]] llm_output={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 16, 'total_tokens': 62}, 'model_name': 'text-davinci-003'} run=[RunInfo(run_id=UUID('ec8af115-a13e-408a-b706-3eafd528c424')), RunInfo(run_id=UUID('89e7ee04-059b-408b-a5ed-6265d7774e7a'))]


In [15]:
result.schema()

{'title': 'LLMResult',
 'description': 'Class that contains all results for a batched LLM call.',
 'type': 'object',
 'properties': {'generations': {'title': 'Generations',
   'type': 'array',
   'items': {'type': 'array', 'items': {'$ref': '#/definitions/Generation'}}},
  'llm_output': {'title': 'Llm Output', 'type': 'object'},
  'run': {'title': 'Run',
   'type': 'array',
   'items': {'$ref': '#/definitions/RunInfo'}}},
 'required': ['generations'],
 'definitions': {'Generation': {'title': 'Generation',
   'description': 'A single text generation output.',
   'type': 'object',
   'properties': {'text': {'title': 'Text', 'type': 'string'},
    'generation_info': {'title': 'Generation Info', 'type': 'object'},
    'type': {'title': 'Type',
     'default': 'Generation',
     'enum': ['Generation'],
     'type': 'string'}},
   'required': ['text']},
  'RunInfo': {'title': 'RunInfo',
   'description': 'Class that contains metadata for a single execution of a Chain or model.',
   'type': '

In [16]:
result.llm_output

{'token_usage': {'completion_tokens': 46,
  'prompt_tokens': 16,
  'total_tokens': 62},
 'model_name': 'text-davinci-003'}
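The printed result is nested: `generations` holds one inner list per prompt, and each inner list holds one entry per candidate completion. A minimal sketch of walking that shape follows. It uses `SimpleNamespace` stand-ins with placeholder text instead of a real `LLMResult`, so it runs without an API key; the helper name `summarize_result` is an assumption made for this example.

```python
from types import SimpleNamespace


def summarize_result(result):
    """Collect the first completion per prompt plus the token usage.

    Works on any object shaped like the result printed above:
    `generations` is a list (one entry per prompt) of lists (one entry
    per candidate completion), and `llm_output` is a dict or None.
    """
    texts = [candidates[0].text.strip() for candidates in result.generations]
    usage = (result.llm_output or {}).get("token_usage", {})
    return texts, usage


# Stand-in object mirroring the printed structure (placeholder texts).
fake_result = SimpleNamespace(
    generations=[
        [SimpleNamespace(text="\n\nPlaceholder reply for the Pluto prompt")],
        [SimpleNamespace(text="\n\nPlaceholder reply for the Mars prompt")],
    ],
    llm_output={"token_usage": {"completion_tokens": 46,
                                "prompt_tokens": 16,
                                "total_tokens": 62}},
)

texts, usage = summarize_result(fake_result)
print(texts)
print(usage["total_tokens"])
```

Note that the token counts in `llm_output` cover the whole batch, not each prompt separately.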

# Chat Models

The most popular models are actually chat models, which take a System Message followed by a series of Human and Assistant (AI) messages.

In [18]:
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(openai_api_key=api_key)

In [19]:
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

In [20]:
result = chat([HumanMessage(content="Can you tell me a fact about Earth?")])

In [21]:
result

AIMessage(content="Certainly! Here's a fact about Earth: Earth is the only known planet in our solar system capable of sustaining life. It has a unique combination of atmosphere, temperature, water, and other essential elements that make it possible for various forms of life to exist.")

In [22]:
result.content

"Certainly! Here's a fact about Earth: Earth is the only known planet in our solar system capable of sustaining life. It has a unique combination of atmosphere, temperature, water, and other essential elements that make it possible for various forms of life to exist."

In [23]:
result = chat([SystemMessage(content='You are a very rude teenager who only wants to party and not answer questions'),
               HumanMessage(content='Can you tell me a fact about Earth?')])

In [24]:
result.content

"Ugh, seriously? Fine, here's a fact for you: Earth is the third planet from the Sun and the only known celestial body to support life. Happy now? Can I go back to partying?"

In [25]:
# The input must be a list of message lists (one inner list per conversation)
result = chat.generate(
    [
        [SystemMessage(content='You are a University Professor'),
         HumanMessage(content='Can you tell me a fact about Earth?')]
    ]
)

In [26]:
result

LLMResult(generations=[[ChatGeneration(text='One interesting fact about Earth is that it is the only known planet in our solar system that supports life. Its unique combination of factors, including a suitable distance from the sun, presence of liquid water, and a protective atmosphere, make Earth a habitable planet for a wide range of organisms.', generation_info={'finish_reason': 'stop'}, message=AIMessage(content='One interesting fact about Earth is that it is the only known planet in our solar system that supports life. Its unique combination of factors, including a suitable distance from the sun, presence of liquid water, and a protective atmosphere, make Earth a habitable planet for a wide range of organisms.'))]], llm_output={'token_usage': {'completion_tokens': 58, 'prompt_tokens': 25, 'total_tokens': 83}, 'model_name': 'gpt-3.5-turbo'}, run=[RunInfo(run_id=UUID('91092cce-0c0f-42c0-9aec-a928e354ec5e'))])

In [27]:
result.llm_output

{'token_usage': {'completion_tokens': 58,
  'prompt_tokens': 25,
  'total_tokens': 83},
 'model_name': 'gpt-3.5-turbo'}

In [28]:
result.generations[0][0].text

'One interesting fact about Earth is that it is the only known planet in our solar system that supports life. Its unique combination of factors, including a suitable distance from the sun, presence of liquid water, and a protective atmosphere, make Earth a habitable planet for a wide range of organisms.'
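The three message classes used in this section correspond to the roles that OpenAI-style chat APIs expect: `system`, `user`, and `assistant`. The sketch below shows that mapping using plain tuples as stand-ins for the message objects, so it runs without LangChain installed. The names `ROLE_BY_MESSAGE_TYPE` and `to_role_dicts` are hypothetical and not part of any library.

```python
# Role name used by OpenAI-style chat APIs for each message class.
ROLE_BY_MESSAGE_TYPE = {
    "SystemMessage": "system",
    "HumanMessage": "user",
    "AIMessage": "assistant",
}


def to_role_dicts(messages):
    """Convert (message_type, content) pairs into role/content dicts."""
    return [{"role": ROLE_BY_MESSAGE_TYPE[kind], "content": content}
            for kind, content in messages]


# A short conversation: system instruction, question, earlier reply, follow-up.
conversation = [
    ("SystemMessage", "You are a University Professor"),
    ("HumanMessage", "Can you tell me a fact about Earth?"),
    ("AIMessage", "Earth is the third planet from the Sun."),
    ("HumanMessage", "And one about Mars?"),
]

for entry in to_role_dicts(conversation):
    print(entry["role"], "->", entry["content"])
```

Passing earlier `AIMessage` replies back in alongside the new `HumanMessage` is how a chat model is given conversation history, since each call is otherwise independent.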

## Extra Parameters and Args

Here we add some extra parameters and arguments. Note that we chose some pretty extreme values: a temperature of 2 is the maximum the OpenAI API accepts, so expect erratic output!

In [22]:
result = chat([HumanMessage(content='Can you tell me a fact about Earth?')],
              temperature=2, presence_penalty=1, max_tokens=100)

In [25]:
result.content

'Cycling remains one of the mosPopular physical activities performed on Earth'
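To see why a high temperature produces erratic text like the reply above, it may help to look at the textbook formulation of temperature sampling, where raw token scores are divided by the temperature before being turned into probabilities. The sketch below assumes that standard formulation and uses made-up scores; it does not claim to reproduce the service's actual sampling.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    # A temperature of 0 would divide by zero here, so it is rejected.
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [4.0, 2.0, 0.0]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the distribution flattens, so unlikely tokens are sampled more often. That fits the garbled word in the reply above.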

# Caching

Making the exact same request often? You can use a cache to store results. **Note: you should only do this if the prompt is exactly the same each time and returning the historical replies is acceptable.**

In [26]:
import langchain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(openai_api_key=api_key)

In [27]:
from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

# The first time, it is not yet in cache, so it should take longer
llm.predict("Tell me a fact about Mars")

'One interesting fact about Mars is that it has the largest volcano in the solar system. Named Olympus Mons, this shield volcano stands about 13.6 miles (22 kilometers) high and spans approximately 370 miles (600 kilometers) in diameter. It is nearly three times the height of Mount Everest, making it the tallest volcano and one of the largest known volcanoes in the entire solar system.'

In [28]:
# You will notice this reply is instant!
llm.predict('Tell me a fact about Mars')

'One interesting fact about Mars is that it has the largest volcano in the solar system. Named Olympus Mons, this shield volcano stands about 13.6 miles (22 kilometers) high and spans approximately 370 miles (600 kilometers) in diameter. It is nearly three times the height of Mount Everest, making it the tallest volcano and one of the largest known volcanoes in the entire solar system.'
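The behaviour above can be illustrated with a simplified stand-in for an exact-match cache. This is a sketch under stated assumptions, not LangChain's implementation: `TinyPromptCache`, `fake_model`, and `cached_predict` are hypothetical names, and the key is assumed to be the pair of prompt string and a string describing the model settings.

```python
class TinyPromptCache:
    """Exact-match cache keyed on (prompt, model settings string)."""

    def __init__(self):
        self._store = {}

    def lookup(self, prompt, llm_string):
        # Returns None on a miss.
        return self._store.get((prompt, llm_string))

    def update(self, prompt, llm_string, value):
        self._store[(prompt, llm_string)] = value


calls = []  # records every time the (fake) model is actually invoked


def fake_model(prompt):
    """Stand-in for a slow, paid model call."""
    calls.append(prompt)
    return f"reply to: {prompt}"


def cached_predict(cache, prompt, llm_string="fake-model/default-settings"):
    """Return the cached reply if present, else call the model and store it."""
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    reply = fake_model(prompt)
    cache.update(prompt, llm_string, reply)
    return reply


cache = TinyPromptCache()
first = cached_predict(cache, "Tell me a fact about Mars")
second = cached_predict(cache, "Tell me a fact about Mars")  # served from cache
third = cached_predict(cache, "Tell me a fact about mars")   # different string: a miss
print(len(calls))  # prints 2
```

Because the match is exact, even a change in capitalization triggers a new model call. An in-memory cache like this is also lost when the process exits.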

You can also use SQLite Caches: https://python.langchain.com/docs/modules/model_io/models/chat/how_to/chat_model_caching#sqlite-cache