In [1]:
%colors nocolor

# Getting Started

In this tutorial, we will learn about LangChain's concise API. We will cover:
- Using the `generate`, `decide`, and `choice` functions to generate text, construct Python objects, make decisions, and select options.
- Using the `template` and `gemplate` functions to create reusable text templators and semantic kernels.
- Using `rulex` to perform semantic pattern matching and replacement.

The `langchain.concise` submodule contains the following modules:
- choice.py, which provides a function for choosing an option from a list of options based on a query and examples.
- chunk.py, which splits text into smaller chunks.
- config.py, which provides functions for setting and getting default values for the language model, text splitter, and maximum tokens.
- decide.py, which determines whether a statement is true or false based on a query and examples.
- gemplate.py, which defines a function for creating reusable semantic kernels.
- generate.py, which generates text using a language model and provides convenience options for parsing, removing quotes, and retrying failed attempts.
- pattern.py, which provides a function for prompting an LLM to complete a pattern.
- rulex.py, which provides a class for defining natural language replacement rules for text.
- template.py, which defines a function for creating reusable text templators.

## Why do we need a concise API?

The concise API provides many one-liners for common use cases. For example, before the concise API existed, generating a templated completion involved several steps:
1. create a prompt template,
2. render it into a prompt,
3. format it with variables,
4. construct an LLM,
5. call the LLM with the formatted prompt,
6. and return the result.

In [2]:
import os
from getpass import getpass

# Prompt for the key at runtime rather than hard-coding a secret in the notebook.
os.environ['OPENAI_API_KEY'] = getpass('Enter your API key here: ')

In [3]:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

prompt_template = 'Hello, my name is {name}. What is your name?'
character_0_prompt = PromptTemplate(template=prompt_template, input_variables=['name'])
prompt_str = character_0_prompt.format(name='John')
print('input:', prompt_str)
llm = OpenAI(verbose=False, cache=False)
output = llm(prompt_str)
print('output:', output)

input: Hello, my name is John. What is your name?
output: 

My name is Mary. Nice to meet you, John.


Admittedly, steps 1-3 could be combined using f-strings, but that is still a lot of steps for developers to learn. Now see how the concise API shortens the same task:

In [4]:
from langchain.concise import generate

country = 'France'
output = generate(f"What is the capital of {country}?")
print(output)

The capital of France is Paris.


For many use cases, the concise API is all you need. However, if you need more control, you can use the full API.

## Must-learn: `generate`

If there's one function in LangChain's concise API that you must learn, it's `generate`. `generate` is the Swiss Army knife of text generation. It can be used to generate text, construct python objects, make decisions, and select options. Let's see how it works.

### Generating text

The simplest use case for `generate` is to generate text. Let's generate a sentence about a dog.

In [5]:
generate("What has four legs and barks?")

'A dog.'

As you can see, `generate` produced a sentence about a dog. It's that simple. No templates, no prompts, no five hours spent learning LangChain abstractions.

### Generating python objects

But what if we want to actually *generate* a dog? We can do that too. First, we need to define a class for dogs.

In [6]:
from pydantic import BaseModel, Field

class Dog(BaseModel):
    name: str = Field(..., description="Name of the dog")
    age: int = Field(..., description="Age of the dog")

Now all we have to do is pass the class to `generate` using the `type` parameter, and it will generate a dog for us, like so:

In [7]:
generate("Make a dog named Spot that is 5 years old.", type=Dog)


Dog(name='Spot', age=5)

You can use `generate` on any `pydantic` model. If you're not familiar with `pydantic`, it's a library for defining data models in Python. It's used by FastAPI and many other libraries. You can learn more about it here: https://pydantic-docs.helpmanual.io/.

You can likewise generate `int`s, `float`s, and `bool`s by passing the type to `generate`. For example,

In [8]:
generate("My name is Sam and I am 24 years old. How old am I?", type=int)


24

Another example:

In [9]:
generate("What is 2 divided by 3? Round to the hundredths place.", type=float)


0.67
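
To get a feel for what passing `type=int` or `type=float` involves, here is a minimal sketch of pulling a typed value out of free-form model text. It is not LangChain's implementation: `parse_as` is a hypothetical helper written for this illustration, and it uses only the standard library.

```python
import re


def parse_as(text: str, target: type):
    """Pull the first value of the requested type out of free-form text.

    Hypothetical helper for illustration; not part of LangChain.
    """
    cleaned = text.strip()
    if target is bool:
        lowered = cleaned.lower()
        # Accept answers that begin with a yes/no style word.
        if re.match(r"(yes|true)\b", lowered):
            return True
        if re.match(r"(no|false)\b", lowered):
            return False
        raise ValueError(f"no boolean found in {text!r}")
    if target is int:
        match = re.search(r"-?\d+", cleaned)
    elif target is float:
        match = re.search(r"-?\d+(?:\.\d+)?", cleaned)
    else:
        raise TypeError(f"unsupported type: {target}")
    if match is None:
        raise ValueError(f"no {target.__name__} found in {text!r}")
    # Convert the matched digits with the target type's own constructor.
    return target(match.group())


print(parse_as("You are 24 years old.", int))
print(parse_as("The answer is approximately 0.67.", float))
print(parse_as("Yes, that is correct.", bool))
```

Taking the first number in the reply is a deliberately naive choice; a real parser would likely constrain the model's output format instead.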

### Generating other objects

Not all LLM outputs are best parsed into pydantic models. For example, if we want to generate a list of dogs, we can't use a pydantic model because pydantic models are for single objects. However, we can manually instantiate an `ItemParsedListOutputParser` and pass that to the `parser` arg in `generate`. Let's see how that works.

In [10]:
from langchain.output_parsers.item_parsed_list import ItemParsedListOutputParser
from langchain.output_parsers.pydantic import PydanticOutputParser

dog_parser = PydanticOutputParser(pydantic_object=Dog)
dog_list_parser = ItemParsedListOutputParser(item_parser=dog_parser)

generate("Generate 3 dogs. You pick the names and ages.", parser=dog_list_parser, attempts=5)


UnboundLocalError: local variable 'e' referenced before assignment
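
We can't tell from the traceback alone what went wrong inside the library here. One common way to hit this kind of error in Python, though, is a retry loop that refers to the exception variable after its `except` block has ended, because Python unbinds that name when the block exits. The sketch below shows a retry loop that avoids the problem. `generate_with_retries`, `call_llm`, and `parse` are hypothetical names for this illustration, and the "LLM" is a stub that replays canned replies.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def generate_with_retries(
    call_llm: Callable[[str], str],
    parse: Callable[[str], T],
    prompt: str,
    attempts: int = 3,
) -> T:
    """Call the model until its reply parses, or give up after `attempts` tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None  # bound up front so it is always defined after the loop
    for _ in range(attempts):
        raw = call_llm(prompt)
        try:
            return parse(raw)
        except ValueError as error:
            # `error` is unbound when this block ends, so keep our own reference.
            last_error = error
    raise RuntimeError(f"no parseable output after {attempts} attempts") from last_error


# Stub model for the demo: the first reply fails to parse, the second succeeds.
replies = iter(["I think it is seven", "7"])
result = generate_with_retries(lambda prompt: next(replies), int, "Pick a number.")
print(result)
```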

### Metaprompting

`generate`'s conciseness makes it very convenient for meta-prompting. Check it out.

In [None]:
import random
from textwrap import dedent
from langchain.chat_models.openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

chat_model = ChatOpenAI()

genders = ['boy', 'girl']
random.shuffle(genders)
age = generate(f"Generate an age for a technically inclined high school {genders[0]}.", type=int)

character_0 = generate(f"Generate a name for an {age} {genders[0]} (just first name): ")
character_0_meta_meta_prompt = f"Generate a character bio for {character_0}. Address them in the 2nd person, eg, 'You are _. You love _. ...'. Tell the {genders[0]} who they are. What their name is. Their likes, dislike, emotions, etc, etc. Begin by stating: 'You are a...'."
print(f'{character_0} meta meta prompt:', character_0_meta_meta_prompt)
character_0_meta_prompt = generate(dedent(
    f"""
    Instructions: Rewrite the prompt with pronouns changed from them/their into the standard form for a {genders[0]}.
    
    Input: {character_0_meta_meta_prompt}
    
    Output: 
    """
    )
)
print(f'{character_0} meta prompt:', character_0_meta_prompt)
character_0_prompt = generate(character_0_meta_prompt)
print(f'{character_0} prompt:', character_0_prompt)

character_1 = generate(f"Generate a name for a {age}-year-old {genders[1]}.")
character_1_message = f"""Hello, my name is {character_1}. I am {age} years old. My sister is 5 years younger than me (She's {age-5} years old). I am a student at {generate("Generate a name for a high school.")}. {generate("Write an 'I like to' statement. You can decide whatever it is.")}. {generate("Ask the kind of question that make for good socialization, eg, do you like ...? what do you think about ...?")}"""
print(f'{character_1} message:', character_1_message)

response = chat_model([
    SystemMessage(content=character_0_prompt),
    HumanMessage(content=character_1_message)
    ])
print(f'{character_0} response:', response.content)

Ethan meta meta prompt: Generate a character bio for Ethan. Address them in the 2nd person, eg, 'You are _. You love _. ...'. Tell the boy who they are. What their name is. Their likes, dislike, emotions, etc, etc. Begin by stating: 'You are a...'.
Ethan meta prompt: Generate a character bio for Ethan. Address him in the 2nd person, eg, 'You are _. You love _. ...'. Tell the boy who he is. What his name is. His likes, dislikes, emotions, etc, etc. Begin by stating: 'You are a...'.
Ethan prompt: You are a 12-year-old boy named Ethan. You love playing video games and spending time with your friends. You have a passion for science and love learning about new things. You are always curious about how things work and enjoy experimenting with different ideas.

You dislike getting up early in the morning and doing homework. You sometimes feel overwhelmed by schoolwork and struggle to manage your time effectively. You get easily frustrated when things don't go as planned an

In the above code, we generate a system persona on the fly. The simulated user generated several of their characteristics on the fly as well, including a sentence about the user's likes.

Perhaps a more practical example for LLM developers like yourself is using LLMs to generate prompts for other LLMs. Let's see how that works.

In [11]:
from textwrap import dedent


character_0_prompt = generate(
    dedent(
        """
        ## Introduction
        
        You are an expert prompt engineer.
        
        ## Task
        
        You are writing a prompt to generate a dog.
        - the dog should be named Spot.
        - I don't care how old the dog is as long as it's old enough to not be a puppy.
        
        ## Notes
        
        Remember, tokens are expensive. So try to use the least amount of tokens possible. Write the prompt.
        
        ## Output
"""))
print('Prompt: ', character_0_prompt)
output = generate(character_0_prompt, type=Dog)
output

Prompt:  Generate a non-puppy dog named Spot.

Dog(name='Spot', age=3)

## Making decisions and selecting options

See how LangChain's concise API can integrate deeper into your code with `decide` and `choice`. These functions return a boolean and an option respectively. Let's see how they work.

We'll start with `decide`. `decide` takes a query and a list of examples. It returns a boolean indicating whether the query is true or false.

In [12]:
from langchain.prompts.base import StringPromptValue

StringPromptValue(text='ge')

StringPromptValue(text='ge')

In [13]:
from langchain.concise import decide


value = decide(character_0_prompt, query="Was this prompt descriptive enough?")
value


ValueError: BooleanOutputParser expected output value to either be YES or NO. Received Yes, the prompt is clear and descriptive. It specifies that the dog should not be a puppy and that its name should be Spot..
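
The error above shows the model answering "Yes, the prompt is..." where a bare YES or NO was expected. A more forgiving parser only needs to look at the first word of the reply. Here is a minimal sketch of that idea; `parse_yes_no` is a hypothetical helper, not the library's `BooleanOutputParser`.

```python
import re


def parse_yes_no(text: str) -> bool:
    """Read a leading yes/no from a reply, ignoring case and trailing explanation."""
    # Skip leading whitespace or punctuation, then require "yes" or "no" as a whole word.
    match = re.match(r"\W*(yes|no)\b", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"expected the answer to start with YES or NO, got {text!r}")
    return match.group(1).lower() == "yes"


print(parse_yes_no("Yes, the prompt is clear and descriptive."))
print(parse_yes_no("NO"))
```

Replies that start with anything else (for example "Maybe") still raise a `ValueError`, so a caller can retry rather than silently guess.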

Since its signature is so concise, you can integrate `decide` directly into your code. For example, you can use it to make decisions about whether to take a certain branch of code.

In [None]:
result = ''
while not decide(result, query="Is this a good story?"):
    result = generate('Write a short funny story')
print(result)

NameError: name 'decide' is not defined

Here's a natural language game:

In [None]:
import ast

position = (0, 0)
map_size = (10, 10)

while True:
    print(f'You are at position {position} on a map of size {map_size}. What do you do?')
    user = input(">>> ")
    if decide(user, query='Did the user ask a question?'):
        print(generate(f"Answer the user's question.\n\nUser: {user}\nAI: "))
    elif decide(user, query='Did the user make a move?'):
        # Parse the "(dx, dy)" reply safely; ast.literal_eval only accepts Python literals.
        delta = ast.literal_eval(generate(f"Translate the natural language motion into a delta position in tuple format (dx, dy)\n\nNatural language: {user}\nTuple format: ").strip())
        position = (position[0] + delta[0], position[1] + delta[1])
        # Clamp each coordinate to the valid cells 0..size-1.
        position = (max(0, min(map_size[0] - 1, position[0])), max(0, min(map_size[1] - 1, position[1])))
    else:
        print(generate(f'Apologize for not understanding the user.\n\nUser: {user}\nAI: '))
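
The move-handling step of the game is the part most likely to break, since it trusts the model to reply with a well-formed tuple. Below is a small sketch of validating that reply and keeping the player on the map. `parse_delta` and `move` are hypothetical helpers for this illustration, and no LLM is called.

```python
import ast
from typing import Tuple


def parse_delta(text: str) -> Tuple[int, int]:
    """Parse a reply such as "(1, -2)" into a pair of ints without using eval."""
    try:
        value = ast.literal_eval(text.strip())
    except (ValueError, SyntaxError) as error:
        raise ValueError(f"not a Python literal: {text!r}") from error
    is_int_pair = (
        isinstance(value, tuple)
        and len(value) == 2
        and all(isinstance(v, int) and not isinstance(v, bool) for v in value)
    )
    if not is_int_pair:
        raise ValueError(f"expected a (dx, dy) tuple of ints, got {text!r}")
    return value


def move(position: Tuple[int, int], delta: Tuple[int, int], map_size: Tuple[int, int]) -> Tuple[int, int]:
    """Apply the delta, clamping each coordinate to the range 0..size-1."""
    x = max(0, min(map_size[0] - 1, position[0] + delta[0]))
    y = max(0, min(map_size[1] - 1, position[1] + delta[1]))
    return (x, y)


print(move((0, 0), parse_delta("(1, 2)"), (10, 10)))
print(move((9, 0), parse_delta(" (1, -2) "), (10, 10)))
```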

Similarly, `choice` gives the LLM the ability to select an option from a list of options.

In [None]:
from langchain.concise import choice


flavor = choice("Sam loves ice cream. He enjoys chocolate and vanilla, but he really really loves strawberry.", query="Pick Sam's favorite flavor", options=['chocolate', 'vanilla', 'strawberry'])
print(flavor)

RuntimeError: no validator found for <class 'langchain.text_splitter.TextSplitter'>, see `arbitrary_types_allowed` in Config

And likewise for `Enum`s:

In [None]:
from enum import Enum


class Color(Enum):
    red = "red"
    yellow = "yellow"
    green = "green"
    blue = "blue"

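user_name = "Steve"
user_dossier = "Steve has been a loyal customer for 10 years. He is a big fan of our product and has been a great advocate for us. He drives a red truck and loves to eat pizza."

# Pass the dossier as the input so the model has something to base its choice on.
# Note that `match` statements require Python 3.10 or later.
match choice(user_dossier, query=f"What color product is {user_name} most likely to buy?", options=Color):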
    case Color.red:
        ...
    case Color.yellow:
        ...
    case Color.green:
        ...
    case Color.blue:
        ...
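
Whatever the model replies, it has to be mapped back onto one of the allowed options before code like the `match` statement above can branch on it. Here is a minimal sketch of that mapping for both plain lists and `Enum`s. `match_option` and `match_enum` are hypothetical helpers, not part of `langchain.concise`, and the `Color` enum is redefined so the block runs on its own.

```python
from enum import Enum
from typing import Sequence, Type


def match_option(reply: str, options: Sequence[str]) -> str:
    """Map a free-form reply onto exactly one of the allowed options."""
    normalized = reply.strip().strip("\"'.").lower()
    # Prefer an exact (case-insensitive) match.
    for option in options:
        if option.lower() == normalized:
            return option
    # Otherwise accept the reply only if it mentions a single option.
    mentioned = [option for option in options if option.lower() in normalized]
    if len(mentioned) == 1:
        return mentioned[0]
    raise ValueError(f"could not map {reply!r} onto one of {list(options)}")


class Color(Enum):
    red = "red"
    yellow = "yellow"
    green = "green"
    blue = "blue"


def match_enum(reply: str, enum_cls: Type[Enum]) -> Enum:
    """Same idea for Enums: match on the member values, then look the member up."""
    return enum_cls(match_option(reply, [member.value for member in enum_cls]))


print(match_option("He really loves strawberry.", ["chocolate", "vanilla", "strawberry"]))
print(match_enum("Red", Color))
```

Rejecting replies that mention several options keeps the result unambiguous, at the cost of needing a retry.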

## Templates and Gemplates

Now let's explore some of the more cutting-edge features of the concise API. We'll start with templates. Templates are stateful text templators. They are useful for keeping track of recurring details, such as pronouns, when generating text like prompts for LLMs. Let's see how they work.

In [None]:
from langchain.concise.template import template


t = template("You are {role}GPT. You can {capabilities}. If you do not understand the user, you should {fallback}.")
print(t(role="chitchat", capabilities="chat about anything", fallback="apologize and ask for clarification"))
print(t(role="book", capabilities="discuss literature", fallback="recommend a book about the topic"))
print(t(role="scholar"))
print(t(fallback="apologize and ask for clarification"))

ValidationError: 1 validation error for ChatMessagePromptTemplate
role
  field required (type=value_error.missing)

The idea is that the template keeps track of the values it has already seen, so that the user only has to pass in the ones that change. This is convenient when several prompts share most of their wording.
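
Since the cell above errors out, here is a small standalone sketch of what a stateful templator could look like. It is an assumption about the intended behavior, not the `template` function from `langchain.concise`; `StatefulTemplate` is a hypothetical class built on the standard library's `string.Formatter`.

```python
import string


class StatefulTemplate:
    """Format-string template that remembers the values from earlier calls."""

    def __init__(self, text: str):
        self.text = text
        # Collect the {field} names that appear in the template.
        self.fields = [name for _, name, _, _ in string.Formatter().parse(text) if name]
        self.values = {}

    def __call__(self, **updates: str) -> str:
        unknown = set(updates) - set(self.fields)
        if unknown:
            raise KeyError(f"unknown fields: {sorted(unknown)}")
        # Merge the new values over the remembered ones.
        self.values.update(updates)
        missing = [name for name in self.fields if name not in self.values]
        if missing:
            raise KeyError(f"no value yet for: {missing}")
        return self.text.format(**self.values)


t = StatefulTemplate("You are {role}GPT. You can {capabilities}.")
print(t(role="chitchat", capabilities="chat about anything"))
print(t(role="book"))  # reuses the remembered capabilities
```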

Gemplates are similar to templates, but they are for semantic kernels. In computer science, a kernel is a function that transforms one data structure into another. In LangChain, a semantic kernel is a function that transforms one semantic structure into another. Let's see how they work.

In [None]:
from langchain.concise import gemplate


gem = gemplate("Rewrite this sentence in the style of {name}: {sentence}")

print(gem(name="Steve Jobs", sentence="Oh Romeo, Romeo, wherefore art thou Romeo?"))
print(gem(name="Elon Musk"))

SyntaxError: invalid syntax (560798519.py, line 7)

## Rulex: Semantic pattern matching and replacement

Think regex, but with natural language. Rulex is a class for performing replacements on natural language with rules written in natural language.

Start by defining the rules:

In [None]:
from langchain.concise.rulex import Rule


rules = [
    Rule(name="add comments", pattern="A block of code without any comments", replacement="The same code but with comments"),
    Rule(name="elaborate on comments", pattern="All comments", replacement="The same comment but explained with more detail"),
    Rule(name="decompose list comprehensions", pattern="List comprehension", replacement="The overall semantics of the list comprehension, but using a for loop"),
]

Hopefully, these rules will be self-explanatory. Now, let's use them to make some code more readable.

Here's the complex code:

In [None]:
from langchain.concise.config import get_default_max_tokens
from langchain.output_parsers.code import CodeOutputParser
from langchain.output_parsers.incomplete import IncompleteOutputParser


code = generate(
    "Generate a long complex python script that has very few comments and uses lots of list comprehensions. Answer inside a triple backtick code block.",
    parser=IncompleteOutputParser(parser=CodeOutputParser(), llm=get_default_max_tokens()),
)
print(code)

And here's the simple code:

In [None]:
from langchain.concise.rulex import RulEx


ru = RulEx(rules=rules)

simplified_code = ru(code)

print(simplified_code)

ValidationError: 1 validation error for ChatMessagePromptTemplate
role
  field required (type=value_error.missing)
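
To make the idea concrete without a working `RulEx`, here is a minimal sketch of applying natural language rules one after another. It assumes each rule becomes one rewrite request to a model, which may not be how `RulEx` works internally. `RewriteRule`, `build_prompt`, and `apply_rules` are hypothetical names, and the model is replaced by a stub that just tags the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class RewriteRule:
    """Stand-in for a natural language replacement rule."""
    name: str
    pattern: str
    replacement: str


def build_prompt(rule: RewriteRule, text: str) -> str:
    """Turn one rule plus the current text into a rewrite request."""
    return (
        f"Rule: {rule.name}\n"
        f"Find: {rule.pattern}\n"
        f"Replace with: {rule.replacement}\n"
        f"Text:\n{text}"
    )


def apply_rules(text: str, rules: Iterable[RewriteRule], rewrite: Callable[[str], str]) -> str:
    """Apply the rules in order, feeding each result into the next rule."""
    for rule in rules:
        text = rewrite(build_prompt(rule, text))
    return text


def fake_rewrite(prompt: str) -> str:
    """Stub model: returns the text unchanged, plus a note naming the rule."""
    rule_name = prompt.splitlines()[0][len("Rule: "):]
    body = prompt.split("Text:\n", 1)[1]
    return f"{body}\n# applied: {rule_name}"


demo_rules = [
    RewriteRule("add comments", "A block of code without any comments", "The same code but with comments"),
    RewriteRule("elaborate on comments", "All comments", "The same comment but explained with more detail"),
]
result = apply_rules("x = 1", demo_rules, fake_rewrite)
print(result)
```

Passing the model in as a plain callable keeps the rule logic testable without network access.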

## Retaining flexibility

Even though the `concise` API is, well, concise, it's still flexible. For example, you can pass a custom `LLM` or `TextSplitter` to any function that uses it. You can also change the default `LLM`, `TextSplitter`, and max tokens in `langchain.concise.config`.