<a href="https://colab.research.google.com/github/micah-shull/LLMs/blob/main/LLM_042_langchain_transform_chain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# !pip install langchain
# !pip install openai
# !pip install python-dotenv
# !pip install langchain-openai

In [10]:
import os
from dotenv import load_dotenv
import openai
import json
import langchain
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import TransformChain, LLMChain, SimpleSequentialChain, SequentialChain
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    AIMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.schema import AIMessage, HumanMessage, SystemMessage
# Load environment variables from the .env file
load_dotenv('/content/API_KEYS.env')
api_key = os.getenv("OPENAI_API_KEY")
if api_key is None:
    raise RuntimeError("OPENAI_API_KEY was not found in the .env file")
# Set the environment variable globally for libraries like LangChain
os.environ["OPENAI_API_KEY"] = api_key
# Confirm the key loaded by printing only its non-secret prefix
print("API Key loaded from .env:", os.environ["OPENAI_API_KEY"][0:8])

API Key loaded from .env: sk-proj-


### **Transform Chains in LangChain**

**Transform chains** are a type of chain in LangChain that apply **custom data transformations** during a workflow. A `TransformChain` wraps a plain Python function that receives a dictionary of inputs and returns a dictionary of outputs, which makes it useful whenever you need to preprocess, manipulate, or reformat data before or after passing it to an LLM or another chain.

---

### **Key Features of Transform Chains**
1. **Custom Transformations**:
   - Apply custom logic or functions to modify inputs, outputs, or intermediate data.
   - Example: Cleaning text, extracting keywords, or splitting inputs into smaller parts.

2. **Reusable Logic**:
   - Encapsulate reusable transformations to maintain cleaner and more modular code.

3. **Non-LLM Operations**:
   - Perform operations that do not require a language model, such as mathematical computations or format conversions.

4. **Flexibility**:
   - Easily integrate with other chains to preprocess input or post-process output in a larger pipeline.
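
As a minimal sketch of the first and third points, the function below is the kind of plain-Python transformation a transform chain can wrap: it takes a dictionary and returns a dictionary, with no LLM involved. The function name `clean_text` and the key names `text` and `cleaned_text` are illustrative assumptions, not part of LangChain.

```python
import re


def clean_text(inputs: dict) -> dict:
    """Collapse runs of whitespace and trim both ends of the "text" value."""
    raw = inputs["text"]
    collapsed = re.sub(r"\s+", " ", raw).strip()
    return {"cleaned_text": collapsed}


sample = {"text": "  Great   food,\n\nfriendly staff.  "}
print(clean_text(sample))
```

A function with this shape could be passed as the `transform` argument of `TransformChain`, with `input_variables=['text']` and `output_variables=['cleaned_text']`.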

---

### **How They Work**

- **Input Transformation**: Modify input data before passing it to the next chain.
- **Output Transformation**: Adjust output data after it’s processed by another chain.
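
The two directions can be sketched in plain Python. This is only an illustration of the data flow, not LangChain code: `fake_model` is a hypothetical stand-in for an LLM chain, and all function and key names are assumptions made for the example.

```python
def normalize_input(inputs: dict) -> dict:
    """Input transformation: trim and lowercase the text before the model sees it."""
    return {"text": inputs["text"].strip().lower()}


def fake_model(inputs: dict) -> dict:
    """Hypothetical stand-in for an LLM chain; it returns a padded, labelled reply."""
    return {"reply": f"  Summary: {inputs['text']}  "}


def tidy_output(inputs: dict) -> dict:
    """Output transformation: strip padding and drop the leading 'Summary:' label."""
    reply = inputs["reply"].strip()
    label = "Summary:"
    if reply.startswith(label):
        reply = reply[len(label):].strip()
    return {"reply": reply}


result = tidy_output(fake_model(normalize_input({"text": "  LOVED the Pasta!  "})))
print(result)
```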

---

### **Example Use Cases**
1. **Preprocessing Inputs**:
   - Removing unnecessary whitespace, normalizing text, or extracting key fields.
2. **Postprocessing Outputs**:
   - Formatting LLM responses into structured data (e.g., JSON) or applying additional logic to results.
3. **Custom Logic**:
   - Applying domain-specific calculations or filtering results.
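
For the postprocessing case, a hedged sketch of turning a model reply into structured data might look like the following. The key names `llm_output` and `structured`, and the fallback of wrapping unparseable text under `raw_text`, are assumptions chosen for illustration.

```python
import json


def parse_json_reply(inputs: dict) -> dict:
    """Parse a JSON string produced by a model into Python data.

    Falls back to wrapping the raw text when the reply is not valid JSON.
    """
    raw = inputs["llm_output"]
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {"raw_text": raw}
    return {"structured": parsed}


good = parse_json_reply({"llm_output": '{"sentiment": "positive", "rating": 5}'})
bad = parse_json_reply({"llm_output": "Sorry, I cannot answer that."})
print(good["structured"])
print(bad["structured"])
```

Keeping the fallback inside the transformation means a malformed reply does not stop the rest of the pipeline; whether that is desirable depends on the workflow.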

---

### **Key Benefits**
- **Improved Workflow**:
   - Allows intermediate transformations to prepare data for specific tasks.
- **Modularity**:
   - Keeps data transformation logic separate from other chains, ensuring clean and reusable workflows.
- **Customizability**:
   - Enables domain-specific preprocessing and postprocessing without modifying core logic.


In [12]:
def transformer_fun(inputs: dict) -> dict:
    '''
    This function takes a dictionary of inputs and performs custom transformations
    on the "text" input. It extracts the part of the text after "REVIEW:",
    converts it to lowercase, and returns the result as a dictionary.
    '''
    # GRAB INCOMING CHAIN TEXT
    text = inputs['text']  # Access the "text" input from the dictionary

    # Extract only the part of the text after "REVIEW:"
    only_review_text = text.split('REVIEW:')[-1]

    # Convert the extracted text to lowercase
    lower_case_text = only_review_text.lower()

    # Return the transformed text in a dictionary
    return {'output': lower_case_text}

# Step 1: Create a TransformChain to preprocess the input
transform_chain = TransformChain(
    input_variables=['text'],  # Specify the expected input variable
    output_variables=['output'],  # Specify the output variable after transformation
    transform=transformer_fun  # Use the custom transformation function defined above
)

# Step 2: Define the prompt template for summarizing the review
template = "Create a one sentence summary of this review:\n{review_text}"

# Step 3: Initialize the large language model (LLM)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)  # Specify the LLM configuration

# Step 4: Create a prompt template for the LLM
prompt = ChatPromptTemplate.from_template(template)

# Step 5: Create a chain to generate a summary from the transformed review
summary_chain = LLMChain(
    llm=llm,  # The LLM to process the input
    prompt=prompt,  # The prompt template used to format the input
    output_key="review_summary"  # The key for storing the summary output
)

# Step 6: Combine the TransformChain and LLMChain into a SimpleSequentialChain
sequential_chain = SimpleSequentialChain(
    chains=[transform_chain, summary_chain],  # Each chain's single output feeds the next chain
    verbose=True  # Enable verbose mode to log each step
)

# Step 7: Run the sequential chain with a Yelp review as input
# NOTE: "yelp_review" must be defined before this cell; its definition is not shown here
result = sequential_chain.invoke(yelp_review)




> Entering new SimpleSequentialChain chain...

oh my goodness, where do i begin? this restaurant is absolutely phenomenal! i went there last night with my friends, and we were blown away by the experience!

first of all, the ambiance is out of this world! the moment you step inside, you're greeted with a warm and inviting atmosphere. the decor is stunning, and it immediately sets the tone for an unforgettable dining experience.

now, let's talk about the food! wow, just wow! the menu is a paradise for food lovers. every dish we ordered was a masterpiece. the flavors were bold, vibrant, and exploded in our mouths. from starters to desserts, every bite was pure bliss!

their seafood platter is a must-try! the freshness of the seafood is unmatched, and the presentation is simply stunning. i have never tasted such delicious and perfectly cooked seafood in my life. it's a seafood lover's dream come true!

the service was exemplary. the staff was attentive, friendly, a

In [14]:
result['input']

"TITLE: AN ABSOLUTE DELIGHT! A CULINARY HAVEN!\n\nREVIEW:\nOH MY GOODNESS, WHERE DO I BEGIN? THIS RESTAURANT IS ABSOLUTELY PHENOMENAL! I WENT THERE LAST NIGHT WITH MY FRIENDS, AND WE WERE BLOWN AWAY BY THE EXPERIENCE!\n\nFIRST OF ALL, THE AMBIANCE IS OUT OF THIS WORLD! THE MOMENT YOU STEP INSIDE, YOU'RE GREETED WITH A WARM AND INVITING ATMOSPHERE. THE DECOR IS STUNNING, AND IT IMMEDIATELY SETS THE TONE FOR AN UNFORGETTABLE DINING EXPERIENCE.\n\nNOW, LET'S TALK ABOUT THE FOOD! WOW, JUST WOW! THE MENU IS A PARADISE FOR FOOD LOVERS. EVERY DISH WE ORDERED WAS A MASTERPIECE. THE FLAVORS WERE BOLD, VIBRANT, AND EXPLODED IN OUR MOUTHS. FROM STARTERS TO DESSERTS, EVERY BITE WAS PURE BLISS!\n\nTHEIR SEAFOOD PLATTER IS A MUST-TRY! THE FRESHNESS OF THE SEAFOOD IS UNMATCHED, AND THE PRESENTATION IS SIMPLY STUNNING. I HAVE NEVER TASTED SUCH DELICIOUS AND PERFECTLY COOKED SEAFOOD IN MY LIFE. IT'S A SEAFOOD LOVER'S DREAM COME TRUE!\n\nTHE SERVICE WAS EXEMPLARY. THE STAFF WAS ATTENTIVE, FRIENDLY, AND 

In [13]:
result['output']

'This review enthusiastically praises a restaurant for its stunning ambiance, exceptional seafood, exquisite desserts, and outstanding service, declaring it a hidden gem and a must-visit for anyone seeking a memorable dining experience.'



### **Step-by-Step Explanation**

1. **Custom Transformation (`transformer_fun`)**:
   - This function extracts and processes text from the input dictionary:
     - Splits the text to isolate the review part (everything after `"REVIEW:"`).
     - Converts the extracted review text to lowercase.
   - Returns a dictionary containing the transformed text (`'output': lower_case_text`).

2. **TransformChain**:
   - Uses the `transformer_fun` function to preprocess the input.
   - Specifies:
     - `input_variables`: The input key (`'text'`) expected in the input dictionary.
     - `output_variables`: The key (`'output'`) to store the transformed text.

3. **Prompt Template**:
   - Defines how the transformed text will be passed to the LLM:
     - Example: `"Create a one sentence summary of this review:\n{review_text}"`.

4. **LLMChain**:
   - Takes the transformed review text as input and generates a one-sentence summary.
   - Uses:
     - An LLM (`ChatOpenAI`) for natural language processing.
     - A prompt to format the input before passing it to the LLM.
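     - `output_key`: Names the summary `'review_summary'` when the chain runs on its own; inside `SimpleSequentialChain` the final result is returned under `'output'` instead.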

5. **SimpleSequentialChain**:
   - Combines the `TransformChain` and the `LLMChain` into a sequential workflow.
   - Steps:
     1. `TransformChain` preprocesses the input review text.
     2. The transformed text (`'output'`) is passed to the `LLMChain`.
     3. The `LLMChain` generates a one-sentence summary.

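6. **Running the Chain**:
   - `sequential_chain.invoke(yelp_review)`:
     - Passes a Yelp review (e.g., `'TITLE: Great spot! REVIEW: This place was amazing!'`) as input. `SimpleSequentialChain` exposes a single input key, `'input'`, so the review can be passed as a plain string or as `{'input': ...}`.
     - Executes the chains sequentially and returns a dictionary with the original text under `'input'` and the final summary under `'output'`.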
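
To see what the transformation step alone produces, the function can be run directly on a small dictionary without any LLM call. The sample text below is made up for illustration; the function body mirrors `transformer_fun` from the cell above.

```python
def transformer_fun(inputs: dict) -> dict:
    """Keep the text after "REVIEW:" and lowercase it."""
    text = inputs["text"]
    only_review_text = text.split("REVIEW:")[-1]
    return {"output": only_review_text.lower()}


sample = {"text": "TITLE: GREAT SPOT!\n\nREVIEW:\nTHIS PLACE WAS AMAZING!"}
transformed = transformer_fun(sample)
print(repr(transformed["output"]))

# When the marker is missing, split() returns the whole string,
# so the entire input is lowercased and nothing is dropped.
no_marker = transformer_fun({"text": "JUST A TITLE"})
print(repr(no_marker["output"]))
```

Note that the newline following `REVIEW:` is kept, which is why the logged text in the verbose output starts on a new line. Adding `.strip()` to the extracted text would remove it.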

---

### **Key Concepts**

1. **Modularity**:
   - The workflow is broken into reusable components:
     - `TransformChain` handles data preprocessing.
     - `LLMChain` focuses on task-specific processing (e.g., generating a summary).

2. **Data Transformation**:
   - Custom transformations (`transformer_fun`) allow for domain-specific preprocessing, ensuring clean and structured input for downstream tasks.

3. **Sequential Execution**:
   - The `SimpleSequentialChain` ensures that the output of one chain serves as the input for the next, creating a seamless pipeline.

4. **Scalability**:
   - Additional chains can be added to the pipeline for more complex workflows (e.g., sentiment analysis after summarization).
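
The sequential-execution and scalability points can be illustrated with a small plain-Python sketch. It is not how LangChain implements `SimpleSequentialChain`; it only mimics the idea that each step receives the previous step's single output. `fake_summarize` and `fake_sentiment` are hypothetical stand-ins for LLM-backed chains.

```python
from typing import Callable, List


def run_in_sequence(steps: List[Callable[[str], str]], value: str) -> str:
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        value = step(value)
    return value


def keep_review_only(text: str) -> str:
    # Preprocessing step: keep the text after "REVIEW:", trimmed and lowercased.
    return text.split("REVIEW:")[-1].strip().lower()


def fake_summarize(text: str) -> str:
    # Hypothetical stand-in for the LLM step: keep only the first sentence.
    return text.split(".")[0] + "."


def fake_sentiment(summary: str) -> str:
    # Hypothetical extra step appended to the end of the pipeline.
    label = "positive" if "amazing" in summary else "unknown"
    return f"{summary} [sentiment: {label}]"


review = "TITLE: WOW\n\nREVIEW:\nThis place was amazing. We will be back."
final = run_in_sequence([keep_review_only, fake_summarize, fake_sentiment], review)
print(final)
```

Extending the pipeline is a matter of appending another step to the list, provided each step accepts the single value the previous one returns.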



### Extracting and Processing Ratings

In [16]:
from langchain.chains import TransformChain, LLMChain, SimpleSequentialChain
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

# Step 1: Define a custom transformation function
def extract_rating(inputs: dict) -> dict:
    """
    Extract numerical ratings (out of 5) from the input text.
    Returns the extracted rating as part of the output dictionary.
    """
    # Grab the input text
    review_text = inputs["text"]

    # Extract the rating (assumes a format like "Rating: X/5")
    if "Rating:" in review_text:
        rating = review_text.split("Rating:")[-1].split("/")[0].strip()
        numeric_rating = int(rating)  # Convert to integer
    else:
        numeric_rating = 3  # Default rating if none is found

    return {"rating": numeric_rating}

# Step 2: Create a TransformChain for extracting the rating
rating_extraction_chain = TransformChain(
    input_variables=["text"],  # Input key expected in the input dictionary
    output_variables=["rating"],  # Output key for the extracted rating
    transform=extract_rating  # Custom transformation function
)

# Step 3: Define the prompt template for sentiment summary
template = "The user gave a rating of {rating} out of 5. Write a brief comment summarizing the sentiment."

# Step 4: Initialize the LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Step 5: Create an LLMChain for sentiment summary generation
sentiment_chain = LLMChain(
    llm=llm,  # The LLM to process the input
    prompt=ChatPromptTemplate.from_template(template),  # Prompt template
    output_key="sentiment_summary"  # Output key for the summary
)

# Step 6: Combine the chains into a sequential pipeline
pipeline = SimpleSequentialChain(
    chains=[rating_extraction_chain, sentiment_chain],  # Execute chains sequentially
    verbose=True  # Enable detailed logging
)

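# Step 7: Run the pipeline with a review as input
review_input = {"text": "The food was delicious, and the service was excellent! Rating: 5/5"}
# SimpleSequentialChain exposes a single input key named "input", so pass the raw review string
result = pipeline.invoke(review_input["text"])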

# Print the final output
print(result)


In [17]:
from langchain.chains import TransformChain, LLMChain, SimpleSequentialChain
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

# Step 1: Define a custom transformation function
def extract_rating(inputs: dict) -> dict:
    """
    Extract numerical ratings (out of 5) from the input text.
    Returns the extracted rating as part of the output dictionary.
    """
    # Grab the input text
    review_text = inputs["input"]  # Expecting the key to be 'input' for compatibility

    # Extract the rating (assumes a format like "Rating: X/5")
    if "Rating:" in review_text:
        rating = review_text.split("Rating:")[-1].split("/")[0].strip()
        numeric_rating = int(rating)  # Convert to integer
    else:
        numeric_rating = 3  # Default rating if none is found

    return {"rating": numeric_rating}

# Step 2: Create a TransformChain for extracting the rating
rating_extraction_chain = TransformChain(
    input_variables=["input"],  # Renamed to "input" to match the key SimpleSequentialChain exposes
    output_variables=["rating"],  # Output key for the extracted rating
    transform=extract_rating  # Custom transformation function
)

# Step 3: Define the prompt template for sentiment summary
template = "The user gave a rating of {rating} out of 5. Write a brief comment summarizing the sentiment."

# Step 4: Initialize the LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Step 5: Create an LLMChain for sentiment summary generation
sentiment_chain = LLMChain(
    llm=llm,  # The LLM to process the input
    prompt=ChatPromptTemplate.from_template(template),  # Prompt template
    output_key="sentiment_summary"  # Output key for the summary
)

# Step 6: Combine the chains into a sequential pipeline
pipeline = SimpleSequentialChain(
    chains=[rating_extraction_chain, sentiment_chain],  # Execute chains sequentially
    verbose=True  # Enable detailed logging
)

# Step 7: Run the pipeline with a review as input
review_input = {"input": "The food was delicious, and the service was excellent! Rating: 5/5"}
result = pipeline.invoke(review_input)

# Print the final output
print(result)




> Entering new SimpleSequentialChain chain...
5
The user expressed a highly positive sentiment with a perfect rating of 5 out of 5, indicating complete satisfaction and approval.

> Finished chain.
{'input': 'The food was delicious, and the service was excellent! Rating: 5/5', 'output': 'The user expressed a highly positive sentiment with a perfect rating of 5 out of 5, indicating complete satisfaction and approval.'}


In [18]:
print(result.keys())

dict_keys(['input', 'output'])


In [22]:
# Step 1: Extract the rating
rating_result = rating_extraction_chain.invoke({"input": "The food was delicious, and the service was excellent! Rating: 5/5"})
print(f"Extracted Rating: {rating_result['rating']}")  # Access the 'rating' key

# Step 2: Generate the sentiment analysis
sentiment_result = sentiment_chain.invoke({"rating": rating_result["rating"]})
print(f"Sentiment Analysis: {sentiment_result['sentiment_summary']}")  # Access the 'sentiment_summary' key


Extracted Rating: 5
Sentiment Analysis: The user expressed a highly positive sentiment with a perfect rating of 5 out of 5, indicating complete satisfaction and appreciation.
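
The `extract_rating` function above calls `int()` on whatever sits between `Rating:` and `/`, so a value such as `4.2` or a non-numeric word would raise `ValueError`. A more defensive variant, sketched here with a regular expression, is one possible alternative. The name `extract_rating_safe`, the `default` parameter, and the choice to round and clamp to the 1-5 range are assumptions for illustration, not part of the original notebook.

```python
import re

# Matches "Rating: X/5" where X is an integer or a decimal number.
RATING_PATTERN = re.compile(r"Rating:\s*(\d+(?:\.\d+)?)\s*/\s*5")


def extract_rating_safe(inputs: dict, default: int = 3) -> dict:
    """Return the rating found in the "input" text, or a default when none is found."""
    match = RATING_PATTERN.search(inputs["input"])
    if match is None:
        return {"rating": default}
    value = round(float(match.group(1)))
    # Clamp to the valid 1-5 range so downstream prompts never see odd values.
    return {"rating": max(1, min(5, value))}


print(extract_rating_safe({"input": "Excellent service! Rating: 5/5"}))
print(extract_rating_safe({"input": "Decent overall. Rating: 4.2/5"}))
print(extract_rating_safe({"input": "Rating: great/5"}))
```

Because it keeps the same dictionary-in, dictionary-out shape and the same `input` and `rating` keys, a function like this could be swapped into `rating_extraction_chain` through the `transform` argument.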
