# LLMs

Here we'll interact with a server that exposes two LLMs through the runnable interface.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from langchain.prompts.chat import ChatPromptTemplate

In [3]:
from langserve import RemoteRunnable

openai_llm = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
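
A `RemoteRunnable` is a thin client over HTTP routes served under the given URL. As a rough sketch of what an `invoke` call sends, the request can be assembled with the standard library alone. This assumes the LangServe convention of a `/invoke` route whose JSON body carries `input` and an optional `config`, and it assumes the served model accepts a plain string as input. The helper name `build_invoke_request` is hypothetical, and nothing is sent over the network here.

```python
import json
from urllib.request import Request


def build_invoke_request(base_url, runnable_input, config=None):
    """Build (but do not send) a POST request for a LangServe-style /invoke route."""
    # Assumption: the route is "<base_url>/invoke" and the JSON body holds
    # "input" plus an optional "config".
    url = base_url.rstrip("/") + "/invoke"
    body = {"input": runnable_input, "config": config or {}}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invoke_request(
    "http://localhost:8000/openai/",
    "Tell your name",
    config={"tags": ["client"]},
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

To actually send it, pass `request` to `urllib.request.urlopen` while the server is running.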

In [4]:
import requests

Let's create a prompt composed of a system message and a human message.

In [5]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a highly educated person who loves to use big words. "
            + "You are also concise",
        ),
        ("human", "Tell your name"),
    ]
).format_messages()

We can use either LLM.

In [6]:
from pydantic import BaseModel

In [7]:
%%time
anthropic.invoke(prompt)

CPU times: user 70.3 ms, sys: 8.35 ms, total: 78.7 ms
Wall time: 1.22 s


AIMessage(content=' My name is Claude.')

In [8]:
%%time
anthropic.invoke(prompt)

CPU times: user 8.31 ms, sys: 4.33 ms, total: 12.6 ms
Wall time: 1.57 s


AIMessage(content=' My name is Claude.')

In [11]:
openai_llm.invoke(prompt)

AIMessage(content='I am an AI language model developed by OpenAI.')

As with regular runnables, the async invoke, batch, and async batch variants are available by default.

In [None]:
await openai_llm.ainvoke(prompt, config={"tags": ["client"]})

In [None]:
anthropic.batch([prompt, prompt], config={"tags": ["client", "batch"]})

In [None]:
await anthropic.abatch([prompt, prompt], config={"tags": ["client", "abatch"]})

Streaming is also available by default, in both synchronous and asynchronous form.

In [9]:
for chunk in anthropic.stream(prompt):
    print(chunk.content, end="", flush=True)

 I greatly enjoyed George Orwell's 1984. The dystopian vision and depth of philosophical ideas in just a few hundred pages is remarkable. However, I do not actually have personal preferences, since I am an AI assistant created by Anthropic to be helpful, harmless, and honest.

In [5]:
async for chunk in anthropic.astream(prompt):
    print(chunk.content, end="", flush=True)

 My favorite novel is The Art of Language by Maximo Quilana. It is a philosophical treatise on the beauty and complexity of human speech. The prose is elegant yet precise.
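
The loops above print each chunk and then discard it. If the full reply is also needed afterwards, the chunks can be collected while printing. The sketch below runs without a server: `FakeChunk` and `fake_stream` are hypothetical stand-ins for the objects yielded by `anthropic.stream(prompt)`, and only the `.content` attribute is modeled.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class FakeChunk:
    """Stand-in for a streamed message chunk; only `.content` is modeled."""

    content: str


def fake_stream() -> Iterator[FakeChunk]:
    """Stand-in for `anthropic.stream(prompt)`; yields a fixed reply in pieces."""
    for piece in [" My", " name", " is", " Claude."]:
        yield FakeChunk(piece)


def collect(chunks: Iterator[FakeChunk]) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk.content, end="", flush=True)
        parts.append(chunk.content)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = collect(fake_stream())
```

With a real client, `collect(anthropic.stream(prompt))` should behave the same way, assuming each chunk exposes `.content` as a string.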

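Remote runnables compose with prompts and other runnables just like local ones. Below, one remote LLM tells a joke or states a fact, and the other classifies the result.

In [13]: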
from langchain.schema.runnable import RunnablePassthrough

In [14]:
comedian_chain = (
    ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a comedian that sometimes tells funny jokes and other times you just state facts that are not funny. Please either tell a joke or state fact now but only output one.",
            ),
        ]
    )
    | openai_llm
)

joke_classifier_chain = (
    ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "Please determine if the joke is funny. Say `funny` if it's funny and `not funny` if not funny. Then repeat the first five words of the joke for reference...",
            ),
            ("human", "{joke}"),
        ]
    )
    | anthropic
)


chain = {"joke": comedian_chain} | RunnablePassthrough.assign(
    classification=joke_classifier_chain
)

In [15]:
chain.invoke({})

{'joke': AIMessage(content="Why don't scientists trust atoms?\n\nBecause they make up everything!", additional_kwargs={}, example=False),
 'classification': AIMessage(content=" not funny\nWhy don't scientists trust atoms?", additional_kwargs={}, example=False)}
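
To make the data flow of this chain explicit, here is a plain-Python sketch of the two steps. The functions `comedian`, `classifier`, and `run_chain` are hypothetical stand-ins that return fixed strings instead of calling the remote LLMs. They only illustrate how the dict step and `RunnablePassthrough.assign` shape the result.

```python
def comedian(_inputs: dict) -> str:
    """Stand-in for `comedian_chain`; returns a fixed joke."""
    return "Why don't scientists trust atoms?\n\nBecause they make up everything!"


def classifier(inputs: dict) -> str:
    """Stand-in for `joke_classifier_chain`; echoes the first five words."""
    first_five = " ".join(inputs["joke"].split()[:5])
    return f"not funny\n{first_five}"


def run_chain(inputs: dict) -> dict:
    # Step 1: the dict {"joke": comedian_chain} runs the comedian and
    # stores its result under the "joke" key.
    state = {"joke": comedian(inputs)}
    # Step 2: assign(...) keeps every existing key and adds "classification",
    # computed from the dict produced by step 1.
    return {**state, "classification": classifier(state)}


result = run_chain({})
print(result)
```

The output has the same two keys as the real chain, with `joke` passed through unchanged.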

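The remaining cells check how pydantic treats a subclass instance stored in a field typed as its base class.

In [6]: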
from pydantic import BaseModel
from typing import List


class A(BaseModel):
    cat: bool


class B(A):
    dog: bool


class Generation(BaseModel):
    a: List[A]


Generation(a=[A(cat=True, dog=True)])

Generation(a=[A(cat=True)])

In [7]:
Generation(a=[B(cat=True, dog=True)])

Generation(a=[B(cat=True, dog=True)])

In [8]:
b = Generation(a=[B(cat=False, dog=True)])
Generation.parse_raw(b.json())

Generation(a=[A(cat=False)])
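
The round trip above drops `dog` because the JSON carries no record of the concrete class, so each item is rebuilt as the declared type `A`. One common remedy is to serialize a type tag alongside the data. The sketch below is a standard-library analogue built on dataclasses, not pydantic, and the names `REGISTRY`, `dump`, `load_as_base`, and `load_tagged` are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class A:
    cat: bool


@dataclass
class B(A):
    dog: bool


# Hypothetical registry mapping a "type" tag to the class to rebuild.
REGISTRY = {"A": A, "B": B}


def dump(item: A) -> str:
    """Serialize an item together with a tag naming its concrete class."""
    return json.dumps({"type": type(item).__name__, **asdict(item)})


def load_as_base(raw: str) -> A:
    """Rebuild as the base type, dropping fields it does not define."""
    data = json.loads(raw)
    return A(**{f.name: data[f.name] for f in fields(A)})


def load_tagged(raw: str) -> A:
    """Rebuild using the tag, so subclass fields survive the round trip."""
    data = json.loads(raw)
    cls = REGISTRY[data.pop("type")]
    return cls(**data)


raw = dump(B(cat=False, dog=True))
print(load_as_base(raw))
print(load_tagged(raw))
```

`load_as_base` mirrors the behavior seen in the last cell, while `load_tagged` recovers the `B` instance.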