<a href="https://colab.research.google.com/github/HoseinBahmany/learning-llms/blob/main/langchain/04_language_models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install langchain openai chromadb tiktoken numpy faiss-cpu



In [3]:
import os
from getpass import getpass

# Never hard-code real keys in a notebook; read them at runtime instead.
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")
os.environ["SERPAPI_API_KEY"] = getpass("SerpAPI API key: ")

LangChain provides interfaces and integrations for two types of models:

* **LLMs**: Models that take a text string as input and return a text string
* **Chat models**: Models that are backed by a language model but take a list of Chat Messages as input and return a Chat Message

# LLMs vs Chat Models

LLMs and Chat Models are subtly but importantly different. LLMs in LangChain refer to pure text completion models. The APIs they wrap take a string prompt as input and output a string completion. OpenAI's GPT-3 is implemented as an LLM. Chat models are often backed by LLMs but tuned specifically for having conversations. Crucially, their provider APIs expose a different interface than pure text completion models: instead of a single string, they take a list of chat messages as input. These messages are usually labeled with the speaker (one of "System", "AI", and "Human"), and the model returns an "AI" chat message as output. GPT-4 and Anthropic's Claude are both implemented as Chat Models.

To make it possible to swap LLMs and Chat Models, both implement the Base Language Model interface. This exposes two common methods: `predict`, which takes a string and returns a string, and `predict_messages`, which takes messages and returns a message. If you are using a specific model, it's recommended that you use the methods specific to that model class (i.e., `predict` for LLMs and `predict_messages` for Chat Models), but if you're creating an application that should work with different types of models, the shared interface can be helpful.
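The idea of a shared interface can be illustrated without calling any provider. The sketch below is a toy model in plain Python, not LangChain's implementation: `Message`, `ToyLLM`, and `ToyChatModel` are hypothetical names invented here, and "upper-casing the input" stands in for a real completion. It only shows how each kind of model can offer both methods by adapting to its native one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    """Hypothetical stand-in for a chat message: a speaker role plus text."""
    role: str  # "system", "human", or "ai"
    content: str


class ToyLLM:
    """Text-completion style model: string in, string out."""

    def predict(self, text: str) -> str:
        # Native method: the "completion" is just the upper-cased prompt.
        return text.upper()

    def predict_messages(self, messages: List[Message]) -> Message:
        # Adapter: flatten the conversation into one prompt, then wrap the result.
        prompt = "\n".join(f"{m.role}: {m.content}" for m in messages)
        return Message(role="ai", content=self.predict(prompt))


class ToyChatModel:
    """Chat style model: list of messages in, one message out."""

    def predict_messages(self, messages: List[Message]) -> Message:
        # Native method: reply to the last message in the conversation.
        return Message(role="ai", content=messages[-1].content.upper())

    def predict(self, text: str) -> str:
        # Adapter: wrap the string as a single human message, then unwrap the reply.
        return self.predict_messages([Message("human", text)]).content


# Application code can treat both models the same way.
for model in (ToyLLM(), ToyChatModel()):
    print(model.predict("hello"))
```

Because both classes expose the same two methods, the loop at the end does not need to know which kind of model it is holding.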

# Custom LLM

Let's see how to create a custom LLM wrapper, in case you want to use your own LLM or a different wrapper than one that is supported in LangChain.

There is only one required thing that a custom LLM needs to implement:

1. A `_call` method that takes in a string, some optional stop words, and returns a string

There is a second optional thing it can implement:

2. An `_identifying_params` property that is used to help with printing of this class. It should return a dictionary.

Let's implement a very simple custom LLM that just returns the first N characters of the input.

In [5]:
from typing import Any, List, Mapping, Optional

from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain.llms.base import LLM

class CustomLLM(LLM):
  n: int

  @property
  def _llm_type(self) -> str:
    return "Custom"

  def _call(
      self,
      prompt: str,
      stop: Optional[List[str]] = None,
      run_manager: Optional[CallbackManagerForLLMRun] = None,
  ) -> str:
    if stop is not None:
      raise ValueError("stop kwargs are not permitted")
    return prompt[:self.n]

  @property
  def _identifying_params(self) -> Mapping[str, Any]:
    """Get the identifying parameters."""
    return {"n": self.n}

llm = CustomLLM(n=10)

print(llm("This is a foobar string"))

print(llm)

This is a 
CustomLLM
Params: {'n': 10}
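The custom LLM above rejects stop words outright. If you would rather honor them, one option is to cut the output at the earliest stop sequence. The helper below is a minimal sketch in plain Python; `truncate_at_stop` and `first_n_with_stop` are hypothetical names introduced for illustration and are not part of LangChain.

```python
from typing import List, Optional


def truncate_at_stop(text: str, stop: Optional[List[str]] = None) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence."""
    if not stop:
        return text
    cut = len(text)
    for sequence in stop:
        if not sequence:
            continue  # an empty stop sequence would match at index 0
        index = text.find(sequence)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


def first_n_with_stop(prompt: str, n: int, stop: Optional[List[str]] = None) -> str:
    """Same behavior as the custom LLM's `_call`, but stop words truncate instead of raising."""
    return truncate_at_stop(prompt[:n], stop)


print(repr(first_n_with_stop("This is a foobar string", 10)))
print(repr(first_n_with_stop("This is a foobar string", 10, stop=["is a"])))
```

Inside `_call`, the same logic would replace the `raise ValueError(...)` branch, assuming truncation is the behavior you want.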


# Caching

LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

* It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times.
* It can speed up your application by reducing the number of API calls you make to the LLM provider.
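To see why both benefits come from the same mechanism, here is a minimal sketch of a completion cache in plain Python. It is not LangChain's cache: `CachedModel` is a hypothetical wrapper, the backend is a stand-in function, and keying on the prompt plus a parameter string is an assumption made for this example.

```python
from typing import Callable, Dict, Tuple


class CachedModel:
    """Hypothetical wrapper that remembers completions per (prompt, params) pair."""

    def __init__(self, backend: Callable[[str], str], params: str) -> None:
        self._backend = backend  # stand-in for a paid, slow provider call
        self._params = params    # e.g. a string describing model settings
        self._cache: Dict[Tuple[str, str], str] = {}
        self.backend_calls = 0   # counts how often the provider was actually hit

    def predict(self, prompt: str) -> str:
        key = (prompt, self._params)
        if key not in self._cache:
            # Cache miss: pay for one backend call and store the result.
            self.backend_calls += 1
            self._cache[key] = self._backend(prompt)
        # Cache hit (or freshly stored value): no backend call needed.
        return self._cache[key]


model = CachedModel(backend=lambda prompt: prompt[::-1], params="temperature=0")
first = model.predict("Tell me a joke")
second = model.predict("Tell me a joke")
print(first == second, model.backend_calls)
```

The second call returns the stored string without touching the backend, which is where both the cost saving and the speed-up come from.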

In [7]:
import langchain
from langchain.llms import OpenAI

llm = OpenAI(model="text-davinci-002", best_of=2)

## In Memory Cache

In [11]:
from langchain.cache import InMemoryCache

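langchain.llm_cache = InMemoryCache()

# The first time, the result is not yet in the cache, so the call takes longer
print(llm.predict("Tell me a joke"))

# The second time, the result is served from the cache, so the call returns faster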
print(llm.predict("Tell me a joke"))



Why did the chicken cross the road?

To get to the other side.


Why did the chicken cross the road?

To get to the other side!


## SQLite Cache

In [13]:
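!rm -f .langchain.db

from langchain.cache import SQLiteCache
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")

# The first time, the result is not yet in the cache, so the call takes longer
print(llm.predict("Tell me a joke"))

# The second time, the result is served from the cache, so the call returns faster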
print(llm.predict("Tell me a joke"))

'\n\nA man walks into a bar and asks for a beer. The bartender says "You\'re out of luck. We\'ve been closed for fifteen minutes."'

# Streaming Response

Some LLMs provide a streaming response. This means that instead of waiting for the entire response to be returned, you can start processing it as soon as the first tokens are available. This is useful if you want to display or process the response while it is still being generated.

Currently, we support streaming for the `OpenAI`, `ChatOpenAI`, and `ChatAnthropic` implementations. To utilize streaming, use a `CallbackHandler` that implements `on_llm_new_token`. In this example, we are using `StreamingStdOutCallbackHandler`.
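The callback pattern itself can be shown without a provider. The sketch below is plain Python and makes several assumptions: `CollectingHandler`, `fake_stream`, and `generate` are hypothetical names, the handler does not subclass any LangChain class, and the "model" simply yields words of a fixed string. It only mirrors the shape of the `on_llm_new_token` hook.

```python
from typing import Iterator, List


class CollectingHandler:
    """Hypothetical handler that mirrors the `on_llm_new_token` hook."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # Called once per token, as soon as that token is available.
        self.tokens.append(token)


def fake_stream(text: str) -> Iterator[str]:
    """Stand-in for a streaming model: yields one word (plus a space) at a time."""
    for word in text.split(" "):
        yield word + " "


def generate(text: str, handler: CollectingHandler) -> str:
    """Notify the handler for every token, then return the full response."""
    pieces = []
    for token in fake_stream(text):
        handler.on_llm_new_token(token)
        pieces.append(token)
    return "".join(pieces).rstrip()


handler = CollectingHandler()
resp = generate("sparkling waters so clear", handler)
print(len(handler.tokens), repr(resp))
```

A handler that prints each token instead of collecting it would give the incremental display that a stdout streaming handler provides.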

In [14]:
from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = llm("Write me a song about sparkling waters.")



Verse 1

I'm standing here, by the sparkling waters
So clear and blue, like a diamond in the sky
The sun is shining, the birds are singing
A perfect day, to just sit and watch the tide

Chorus

Sparkling waters, so beautiful and pure
A sight to behold, a sight to endure
The waves are crashing, the wind is blowing
A peaceful place, to just sit and watch the shore

Verse 2

The sand is warm, the air is sweet
The smell of salt, it's so hard to beat
The waves are rolling, the seagulls calling
A perfect day, to just sit and watch the sea

Chorus

Sparkling waters, so beautiful and pure
A sight to behold, a sight to endure
The waves are crashing, the wind is blowing
A peaceful place, to just sit and watch the shore

Bridge

The sun is setting, the day is done
The stars are twinkling, the night has come
The moon is shining, the sky is bright
A perfect night, to just sit and watch the light

Chorus

Sparkling waters, so