### Transform Chain

A Transform Chain is a chain that does not call an LLM. Use it for transformation work, such as removing sensitive information from a piece of text. When the text is not sensitive we could ask the LLM to do most of the transformation for us, but that may cost a lot of tokens. Instead, we can use `TransformChain` to pre-process the text when needed.
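As a rough illustration of the "removing sensitive information" case, here is a minimal sketch of a transform function written in plain Python. The function name `redact`, the two regex patterns, and the `[EMAIL]` / `[PHONE]` placeholders are assumptions made for this example and are not part of LangChain; real redaction would need much broader rules. It uses the same dict-in, dict-out shape that a transform function for `TransformChain` takes.

```python
import re

# Hypothetical patterns, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def redact(inputs: dict) -> dict:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = inputs["text"]
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return {"output_text": text}


result = redact({"text": "Contact jane@example.com or +1 555-123-4567."})
print(result["output_text"])
```

No tokens are spent here: the sensitive values are replaced locally before any text would reach a model.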

![image](../docs/steps_of_transform_chain.png)

In [1]:
from dotenv import find_dotenv, load_dotenv
import os
from langchain_openai import AzureChatOpenAI
from langchain.prompts.chat import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.prompts import PromptTemplate

from langchain.chains import TransformChain

load_dotenv(find_dotenv())

model = AzureChatOpenAI(
    openai_api_type="azure",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    openai_api_key=os.getenv("AZURE_OPENAI_API_KEY")
)

In [2]:
with open("../docs/sample_text_lorem_ipsum.txt", 'r', encoding='utf-8') as file:
    sample = file.read()

print(sample)

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean porttitor libero eu tempor vestibulum. Aliquam et pretium ante. In tincidunt risus in pharetra consectetur. Sed a libero eget enim dictum tincidunt. Aliquam justo odio, venenatis sed orci a, bibendum accumsan nisi. Quisque consectetur iaculis leo, eget mattis erat congue et. Nullam pulvinar orci sed maximus tincidunt. Curabitur eget ex consectetur, posuere elit vel, tempor augue. Ut sollicitudin mattis mattis. Mauris ligula est, mattis ut nulla a, bibendum rhoncus velit. Morbi eget tempor turpis, vitae egestas ante. Pellentesque eget erat ac massa convallis aliquam eget id massa. Nulla posuere mi eu ex ornare ornare. Integer sit amet varius lacus.

Maecenas suscipit libero et est aliquam, ac bibendum turpis sagittis. Donec ullamcorper neque sem, sit amet lacinia elit vehicula in. Nullam eget lorem non diam auctor semper nec eu odio. Sed in erat eget urna blandit aliquet vel sit amet lectus. Quisque vitae velit lectus. Null

In [7]:
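import re

# Marker sentence that precedes the section we want to keep.
MARKER = "Maecenas suscipit libero et est aliquam, ac bibendum turpis sagittis."

def transform(inputs: dict) -> dict:
    text = inputs["text"]
    # Escape the marker so its "." characters match literally, then capture
    # lazily up to the next blank line or the end of the text.
    pattern = re.escape(MARKER) + r"(.*?)(?:\n\n|$)"
    match = re.search(pattern, text, re.DOTALL)
    # Return an empty section instead of raising AttributeError if the marker is absent.
    section = match.group(1).strip() if match else ""
    return {"output_text": section}

transform({"text": sample})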

{'output_text': 'Donec ullamcorper neque sem, sit amet lacinia elit vehicula in. Nullam eget lorem non diam auctor semper nec eu odio. Sed in erat eget urna blandit aliquet vel sit amet lectus. Quisque vitae velit lectus. Nullam vitae dui leo. Aenean sollicitudin dolor et sollicitudin ornare. Phasellus consequat congue dolor id pellentesque. Duis dignissim purus magna, eu accumsan erat dapibus vel. Quisque sollicitudin eleifend sem, nec mattis arcu condimentum ac. Nulla et finibus eros, nec fermentum urna. Etiam sodales pellentesque tortor eget mollis. Morbi eleifend, dolor eu sagittis sagittis, est dolor tempus orci, at scelerisque est magna in enim.'}

Now create a transform chain that invokes the `transform` function above.

In [8]:
transform_chain = TransformChain(
    input_variables=["text"],
    output_variables=["output_text"],
    transform=transform
)

Now create a simple prompt template, wrap it in an `LLMChain`, and add both chains to a sequential chain.

In [9]:
# PromptTemplate was already imported in the first cell.
from langchain.chains import SimpleSequentialChain, LLMChain

template = """Summarize this text:

{output_text}

Summary:"""
prompt = PromptTemplate(input_variables=["output_text"], template=template)
llm_chain = LLMChain(llm=model, prompt=prompt)

sequential_chain = SimpleSequentialChain(chains=[transform_chain, llm_chain], verbose=True)

In [10]:
sequential_chain.run(sample)

> Entering new SimpleSequentialChain chain...
Donec ullamcorper neque sem, sit amet lacinia elit vehicula in. Nullam eget lorem non diam auctor semper nec eu odio. Sed in erat eget urna blandit aliquet vel sit amet lectus. Quisque vitae velit lectus. Nullam vitae dui leo. Aenean sollicitudin dolor et sollicitudin ornare. Phasellus consequat congue dolor id pellentesque. Duis dignissim purus magna, eu accumsan erat dapibus vel. Quisque sollicitudin eleifend sem, nec mattis arcu condimentum ac. Nulla et finibus eros, nec fermentum urna. Etiam sodales pellentesque tortor eget mollis. Morbi eleifend, dolor eu sagittis sagittis, est dolor tempus orci, at scelerisque est magna in enim.
The text describes a state or condition using intricate and detailed language. It mentions feelings of pain and discomfort, particularly related to certain body parts such as the abdomen, neck, and chest. The text also suggests a sense of magnification or intensification

'The text describes a state or condition using intricate and detailed language. It mentions feelings of pain and discomfort, particularly related to certain body parts such as the abdomen, neck, and chest. The text also suggests a sense of magnification or intensification towards the end. However, without specific context, the exact meaning or subject matter is not entirely clear.'

Right. Apart from the fact that the LLM hallucinates badly and sometimes talks nonsense, by now you should understand what `TransformChain` is about. The next question is: why bother? Why not just call the function and pass its output to the LLM, rather than wrapping the function in a `TransformChain`?

Hmmm....interesting....

In Chinese, we call this "taking off your trousers to fart", meaning an unnecessary extra step. :)
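For comparison, the direct approach that the question describes might look like the sketch below: call the pre-processing function yourself, build the prompt, and hand it to the model. Everything here is hypothetical and kept free of LangChain so it runs on its own: `preprocess` is a simplified stand-in for the notebook's `transform`, and `fake_llm` is a stub that replaces the real model call.

```python
from typing import Callable


def preprocess(inputs: dict) -> dict:
    """Simplified stand-in for `transform`: keep only the second paragraph."""
    paragraphs = [p.strip() for p in inputs["text"].split("\n\n") if p.strip()]
    section = paragraphs[1] if len(paragraphs) > 1 else ""
    return {"output_text": section}


def summarize_directly(text: str, llm: Callable[[str], str]) -> str:
    """Pre-process with a plain function call, then prompt the model."""
    section = preprocess({"text": text})["output_text"]
    prompt = f"Summarize this text:\n\n{section}\n\nSummary:"
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    # Hypothetical stub; a real run would call the chat model here.
    return f"(stub summary of a {len(prompt)}-character prompt)"


demo = summarize_directly("First paragraph.\n\nSecond paragraph.", fake_llm)
print(demo)
```

For a single pre-processing step this is arguably simpler than the chain version. What the wrapper adds is a common chain interface, so the function can sit inside `SimpleSequentialChain` next to other chains, as in the cells above.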