### Building Multi-stage Reasoning Systems with LangChain

#### Multi-stage reasoning systems

In this notebook we're going to create two AI systems:
- The first, codenamed `JekyllHyde`, will be a prototype AI self-commenting-and-moderating tool. One LLM will write new reaction comments to a piece of text, and a second LLM will critique those comments and flag them if they are negative. To build this we will walk through the steps needed to construct prompts and chains, including multiple LLM chains that take several inputs, both from the previous LLM and from external sources.
- The second, codenamed `DaScie` (pronounced "dae-see"), will be an LLM-based agent tasked with performing data science tasks on data stored in a vector database using ChromaDB. We will use LangChain agents and the ChromaDB library, along with the Pandas DataFrame Agent and the Python REPL (Read-Eval-Print Loop) tool.

----

#### ![Dolly](https://files.training.databricks.com/images/llm/dolly_small.png) Learning Objectives
By the end of this notebook, you will be able to:
1. Build prompt templates and create new prompts with different inputs.
2. Create basic LLM chains to connect prompts and LLMs.
3. Construct sequential chains of multiple `LLMChains` to perform multi-stage reasoning analysis.
4. Use LangChain agents to build semi-automated systems with an LLM-centric agent to perform internet searches and dataset analysis.

In [1]:
%pip install wikipedia==1.4.0 google-search-results==2.4.2 better-profanity==0.7.0 langchain langchain_experimental openai pydantic==1.10.9
cache_dir='cache_dir'

Note: you may need to restart the kernel to use updated packages.


In [2]:
import os
from langchain import PromptTemplate
import numpy as np

#### JekyllHyde - A self-moderating system for social media

In this section we will build an AI system that consists of two LLMs. `Jekyll` will be an LLM designed to read in a social media post and create a new comment. However, `Jekyll` can be moody at times, so there will always be a chance that it creates a negative-sentiment comment, and we need to make sure we filter those out. Luckily, that is the role of `Hyde`, the other LLM, which will watch what `Jekyll` says and flag any negative comments to be removed.

#### Step 1 - Letting Jekyll Speak

##### Building the Jekyll Prompt

To build `Jekyll` we need it to read in the social media post and respond as a commenter. We will use an engineered prompt that takes two inputs: the social media post and the sentiment (positive or negative) that the comment should have. We'll use a random number generator to decide whether the sentiment of `Jekyll`'s response is positive or negative.
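The random flag can be sketched on its own. The snippet below is a minimal illustration that uses the standard library's `random` module in place of NumPy. The helper name `pick_sentiment`, the `p_mean` parameter, and the seed are assumptions made for this example and are not part of the notebook.

```python
import random


def pick_sentiment(rng: random.Random, p_mean: float = 0.3) -> str:
    """Return "mean" with probability p_mean, otherwise "nice"."""
    # rng.random() is uniform on [0, 1), so the comparison is true
    # with probability p_mean.
    return "mean" if rng.random() < p_mean else "nice"


# A seeded generator makes the draws reproducible (the seed is arbitrary).
rng = random.Random(42)
draws = [pick_sentiment(rng) for _ in range(10_000)]
mean_fraction = draws.count("mean") / len(draws)
print(f"fraction of 'mean' draws: {mean_fraction:.3f}")
```

Over many draws the fraction of "mean" comments should sit close to the chosen probability, which is a quick way to sanity-check the threshold.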

Let's start with the prompt template.


In [3]:
# Our template for Jekyll will instruct it on how it should respond, and what variables (using the {text} syntax) it should use.
jekyll_template = """
You are a social media post commenter, you will respond to the following post with a {sentiment} response. 
Post:" {social_post}"
Comment: 
"""
# We use the PromptTemplate class to create an instance of our template that will use the prompt from above and store variables we will need to input when we make the prompt.
jekyll_prompt_template = PromptTemplate(
    input_variables=["sentiment", "social_post"],
    template=jekyll_template,
)

In [4]:
# Okay now that's ready we need to make the randomized sentiment
random_sentiment = "nice"
if np.random.rand() < 0.3:
    random_sentiment = "mean"
# We'll also need our social media post:
social_post = "I can't believe I'm learning about LangChain in this MOOC, there is so much to learn and so far the instructors have been so helpful. I'm having a lot of fun learning! #AI #Databricks"

# Let's create the prompt and print it out, this will be given to the LLM.
jekyll_prompt = jekyll_prompt_template.format(
    sentiment=random_sentiment, social_post=social_post
)
print(f"Jekyll prompt: {jekyll_prompt}")
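To see what the formatted prompt amounts to, here is a minimal sketch that fills the same placeholders with Python's built-in `str.format`. It is a stand-in for illustration, not LangChain's implementation, and the shortened post text is made up for the example.

```python
# Same placeholder layout as the Jekyll template above.
template = """
You are a social media post commenter, you will respond to the following post with a {sentiment} response.
Post:" {social_post}"
Comment:
"""

# Hypothetical inputs, shortened for the example.
filled_prompt = template.format(
    sentiment="nice",
    social_post="I'm having a lot of fun learning! #AI",
)
print(filled_prompt)
```

Every `{name}` in the template must be supplied as a keyword argument, which mirrors the role of `input_variables` in the prompt template.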

#### Step 2 - Giving Jekyll a brain!
##### Building the Jekyll LLM 

Note: We provide an option for you to use either Hugging Face or OpenAI. If you continue with Hugging Face, the notebook execution will take a long time (up to 10 minutes per cell). If you don't mind using OpenAI, follow the next markdown cell for API key generation instructions.

To interact with LLMs in LangChain through Hugging Face, we need the following modules loaded:

from langchain.llms import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

For OpenAI, we will use their GPT-3 model `text-davinci-003` as our LLM.
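Because the OpenAI option in this notebook reads the key from the `OPENAI_API_KEY` environment variable, it may help to check for it before building the LLM. The helper below is a hypothetical convenience (the name `require_env` is not part of LangChain or the notebook) that fails early with a readable message.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it before running the cells that call the LLM."
        )
    return value


# Demonstration with a throwaway variable so no real key is needed.
os.environ["DEMO_API_KEY"] = "not-a-real-key"
print(require_env("DEMO_API_KEY"))
```

In the notebook itself, one could call `require_env("OPENAI_API_KEY")` in place of the direct `os.environ[...]` lookup to get a clearer error than a bare `KeyError`.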

In [5]:
# To interact with LLMs in LangChain we need the following modules loaded

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from better_profanity import profanity

In [6]:
# OpenAI Model
jekyll_llm = OpenAI(model="text-davinci-003", openai_api_key=os.environ["OPENAI_API_KEY"])

jekyll_chain = LLMChain(
    llm=jekyll_llm,
    prompt=jekyll_prompt_template,
    output_key="jekyll_said",
    verbose=False,
) 

# To run our chain we use the .run() command and input our variables as a dict
jekyll_said = jekyll_chain.run(
    {"sentiment": random_sentiment, "social_post": social_post}
)

# Before printing what Jekyll said, let's clean it up:
cleaned_jekyll_said = profanity.censor(jekyll_said)
print(f"Jekyll said: {cleaned_jekyll_said}")
print(f"ORIGINAL COMMENT: {jekyll_said}")

Jekyll said: That's great to hear! It's so rewarding when you learn something new and the instructors are there to help. Keep up the great work! #AI #Databricks
ORIGINAL COMMENT: That's great to hear! It's so rewarding when you learn something new and the instructors are there to help. Keep up the great work! #AI #Databricks
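`profanity.censor` masks words it considers profane before the comment is printed. As a rough illustration of that idea only, the sketch below masks words from a tiny made-up list using the standard library. The word list and the `censor_words` helper are assumptions for this example and do not reflect how better-profanity is implemented.

```python
import re

# Hypothetical block list, chosen only for the demonstration.
BLOCKED_WORDS = {"darn", "heck"}


def censor_words(text: str, mask: str = "****") -> str:
    """Replace each blocked word (case-insensitive, whole words) with mask."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, sorted(BLOCKED_WORDS))) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(mask, text)


censored = censor_words("Well, heck. That darn course is hard!")
print(censored)
```

When nothing in the text matches, the string comes back unchanged, which is why the two printed lines in the output above are identical.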


#### Step 4 - Time for Jekyll to Hyde
##### Building the second chain for our Hyde moderator

In [7]:
# 1 We will build the prompt template
# Our template for Hyde will take Jekyll's comment and do some sentiment analysis.
hyde_template = """
You are Hyde, the moderator of an online forum.
You are strict and will not tolerate any negative comments. 
You will look at this next comment from a user and, if it is at all negative, you will replace it with symbols, 
post that and scold the user but if it seems nice respond with a love sentiment
Original comment: {jekyll_said}
Your Comment:
"""
# We use the PromptTemplate class to create an instance of our template that will use the prompt from above and store variables we will need to input when we make the prompt.
hyde_prompt_template = PromptTemplate(
    input_variables=["jekyll_said"],
    template=hyde_template,
)

#####################################
# 2 We connect an LLM for Hyde

hyde_llm = jekyll_llm

#####################################
# 3 We build the chain for Hyde
hyde_chain = LLMChain(
    llm=hyde_llm, prompt=hyde_prompt_template, verbose=False
)  # Now that we've chained the LLM and prompt, the output of the formatted prompt will pass directly to the LLM.

#####################################
# 4 Let's run the chain with what Jekyll last said
# To run our chain we use the .run() command and input our variables as a dict
hyde_says = hyde_chain.run({"jekyll_said": jekyll_said})
# Let's see what Hyde said...
print(f"Hyde says: {hyde_says}")

Hyde says: ❤️ That's great to hear! It's so rewarding when you learn something new and the instructors are there to help. Keep up the great work! #AI #Databricks ❤️


#### Step 5 - Creating `JekyllHyde`

##### Building our first Sequential Chain

In [8]:
from langchain.chains import SequentialChain

# Jekyll runs first; its "jekyll_said" output becomes Hyde's input.
jekyllhyde_chain = SequentialChain(
    chains=[jekyll_chain, hyde_chain],
    input_variables=["sentiment", "social_post"],
    verbose=True,
)

# Run both stages with a single call.
jekyllhyde_chain.run({"sentiment": random_sentiment, "social_post": social_post})

> Entering new SequentialChain chain...

> Finished chain.


"😊❤️ That's great to hear! It's always exciting to learn something new, and it sounds like you're having a great time. Keep it up and good luck! #AI #Databricks"

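The two chains fit together because Jekyll's `output_key` (`jekyll_said`) has the same name as Hyde's input variable. The plain-Python sketch below mimics that hand-off with a shared dictionary. The `run_sequence` function, the `hyde_said` key, and the two stub stages are hypothetical stand-ins that return canned strings; they are not LangChain code and do not call an LLM.

```python
from typing import Callable, Dict, List, Tuple

# Each stage is (function, output_key): the function reads from the shared
# dict of values and its result is stored under output_key.
Stage = Tuple[Callable[[Dict[str, str]], str], str]


def run_sequence(stages: List[Stage], inputs: Dict[str, str]) -> Dict[str, str]:
    """Run stages in order, feeding each one everything produced so far."""
    values = dict(inputs)  # copy so the caller's dict is not mutated
    for func, output_key in stages:
        values[output_key] = func(values)
    return values


def fake_jekyll(values: Dict[str, str]) -> str:
    # Stub standing in for the Jekyll LLM call.
    return f"A {values['sentiment']} reply to: {values['social_post']}"


def fake_hyde(values: Dict[str, str]) -> str:
    # Stub standing in for the Hyde LLM call; it only sees Jekyll's output
    # because that output was stored under the key "jekyll_said".
    return f"Moderated: {values['jekyll_said']}"


result = run_sequence(
    [(fake_jekyll, "jekyll_said"), (fake_hyde, "hyde_said")],
    {"sentiment": "nice", "social_post": "Learning LangChain is fun!"},
)
print(result["hyde_said"])
```

If the key names did not line up, the second stage would fail with a `KeyError` in this sketch, which is the same kind of mismatch to look for when a sequential chain complains about missing input variables.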