# LangChain Use-Case Example: Processing Data

This example shows how to use LangChain to feed data to an LLM so that it can make decisions, classify items, extract text, or do any of the other things language models can do with data.

You can run this code if you have an OpenAI API key. If you also have keys for other providers, you can run the demos that call multiple AI APIs from the same code.

## Setup code

The first code cell retrieves AI API credentials from a `.env` file or from Colab secrets, and the second installs the LangChain modules. Please set up secrets for any of these:

* `OPENAI_API_KEY` -- To use the OpenAI API
* `AWS_ACCESS_KEY_ID` -- To use AWS Bedrock
* `AWS_SECRET_ACCESS_KEY` -- To use AWS Bedrock
* `AWS_REGION_NAME` -- To use AWS Bedrock
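
Since which demos can run depends on which secrets are present, a small helper can report the usable providers up front. The following is a minimal sketch, not part of the original notebook: the `REQUIRED_SECRETS` mapping, the provider labels, and the `available_providers` function are hypothetical names introduced for illustration, and only the variable names come from the list above.

```python
import os

# Hypothetical mapping from a provider label to the secrets it needs.
# The variable names match the list above; the labels are illustrative.
REQUIRED_SECRETS = {
    "openai": ["OPENAI_API_KEY"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"],
}

def available_providers(env=None):
    """Return the providers whose required secrets are all present and non-empty."""
    env = os.environ if env is None else env
    return [
        name
        for name, keys in REQUIRED_SECRETS.items()
        if all(env.get(key) for key in keys)
    ]

# Example with an explicit mapping instead of the real environment.
print(available_providers({"OPENAI_API_KEY": "sk-example"}))  # ['openai']
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching real credentials.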

In [1]:
# Load secrets
import os
from dotenv import load_dotenv
load_dotenv()

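def load_environment_variables(required, optional=()):
    """Load each variable from the environment, falling back to Colab secrets."""
    for var_name in list(required) + list(optional):
        if var_name not in os.environ:
            try:
                from google.colab import userdata
                value = userdata.get(var_name)
            except ImportError:
                value = None  # Not running in Colab
            except Exception:
                value = None  # Secret missing or notebook access not granted
            if value:
                os.environ[var_name] = value
        if var_name in os.environ:
            print(f"Successfully loaded {var_name} from environment variables.")
        elif var_name in required:
            raise ValueError(f"{var_name} not found. Please set it in .env file or Google Colab secrets.")
        else:
            print(f"Optional {var_name} not found. Demos that need it will not run.")

# Only the OpenAI key is required; the AWS variables enable the Bedrock demos.
load_environment_variables(
    required=["OPENAI_API_KEY"],
    optional=["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"],
)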

Successfully loaded OPENAI_API_KEY from environment variables.
Successfully loaded AWS_ACCESS_KEY_ID from environment variables.
Successfully loaded AWS_SECRET_ACCESS_KEY from environment variables.
Successfully loaded AWS_REGION_NAME from environment variables.


In [2]:
# Install necessary libraries
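# datasets and tqdm are used by the example below; python-dotenv is used by the secrets cell
!pip install langchain langchain-community langchain-core langchain-openai datasets tqdm python-dotenv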

# LangChain setup: Show more about what's happening as it happens.
# from langchain.globals import set_debug
# set_debug(True)



## Example: Detect Something In A Dataset

This [HuggingFace dataset](https://huggingface.co/datasets/AyoubChLin/CNN_News_Articles_2011-2022) contains news articles from CNN. What if we want an LLM to scan the articles for a particular topic? LangChain makes that fairly easy.
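
Because every article costs one LLM call, it helps to work on a small, reproducible sample. The following is a minimal sketch of seeded sampling by row index, assuming only the Python standard library; `sample_indices` is a hypothetical helper, not something the notebook or LangChain provides.

```python
import random

def sample_indices(total, fraction, seed=0):
    """Pick a reproducible random subset of row indices, sorted ascending."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = int(total * fraction)
    rng = random.Random(seed)  # Local generator, so the global random state is untouched
    return sorted(rng.sample(range(total), k))

# A 5% sample of a 1000-row dataset.
indices = sample_indices(total=1000, fraction=0.05, seed=42)
print(len(indices))  # 50
```

With a Hugging Face dataset, the resulting list can be passed to `dataset.select(indices)`, which avoids converting the whole dataset to a Python list first.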

In [15]:
from datasets import load_dataset
import random
from tqdm import tqdm
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

# Load the dataset
dataset = load_dataset("AyoubChLin/CNN_News_Articles_2011-2022", split="train")

# Sample a percentage of the data (e.g., 5%)
sample_size = int(len(dataset) * 0.05)
sampled_data = random.sample(list(dataset), sample_size)

# Use GPT-4o mini for the model
model = ChatOpenAI(model_name="gpt-4o-mini")

# Create a prompt template
prompt = PromptTemplate.from_template(
    "Does the following news article discuss space exploration missions? Answer with 'Yes' or 'No':\n\n{text}"
)

chain = prompt | model

# Process sampled data
for item in tqdm(sampled_data, desc="Processing articles", unit="article"):
    text = item["text"]
    response = chain.invoke({"text": text})
    
    # Extract the content from the AIMessage and normalize it
    response_text = response.content.strip().lower()

    # Accept replies such as "Yes." as well as a bare "yes"
    if response_text.startswith("yes"):
        print("Text discusses space flight:\n", text[:1000] + '...')

Processing articles:  24%|██▎       | 380/1610 [02:54<15:30,  1.32article/s]

Text discusses space flight:


Processing articles:  30%|███       | 490/1610 [03:44<07:55,  2.35article/s]

Text discusses space flight:
 Sign up for CNN's Wonder Theory science newsletter. Explore the universe with news on fascinating discoveries, scientific advancements and more. (CNN)Total lunar eclipses, a multitude of meteor showers and supermoons will light up the sky in 2022.The new year is sure to be a sky-gazer's delight with plenty of celestial events on the calendar. There is always a good chance that the International Space Station is flying overhead. And if you ever want to know what planets are visible in the morning or evening sky, check The Old Farmer's Almanac's visible planets guide.Here are the top sky events of 2022 so you can have your binoculars and telescope ready. Full moons and supermoonsRead MoreThere are 12 full moons in 2022, and two of them qualify as supermoons. This image, taken in Brazil, shows a plane passing in front of the supermoon in March 2020. Definitions of a supermoon can vary, but the term generally denotes a full moon that is brighter and closer to 

Processing articles:  35%|███▍      | 558/1610 [04:21<07:36,  2.31article/s]

Text discusses space flight:
 Story highlights Crowds hand each member of the group a red rose While secluded, the crew has few luxuries The group asks scientists to put the data it gathered to good useThe group's isolation simulates a 520-day mission to MarsSix volunteer astronauts emerged from a 'trip' to Mars on Friday, waving and grinning widely after spending  520 days in seclusion.  Crowds handed each member of the group a red rose after their capsule opened at the facility in Moscow.  Scientists placed the six male volunteers in isolation in 2010 to simulate a mission to Mars, part of the European Space Agency's experiment to determine challenges facing future space travelers. The six, who are between ages 27 and 38, lived in a tight space the size of six buses in a row,  said Rosita Suenson, the agency's program officer for human spaceflight.During the period, the crew dressed in blue jumpsuits showered on rare occasions and survived on canned food.  Messages from friends and f

Processing articles:  56%|█████▋    | 908/1610 [06:57<05:01,  2.33article/s]

Text discusses space flight:
 Story highlightsGene Seymour: Gene Roddenberry may have created "Star Trek," but Leonard Nimoy and character of Spock are inseparableHe says Nimoy had many other artistic endeavors, photography, directing, poetry, but he was, in the end, SpockGene Seymour is a film critic who has written about music, movies and culture for The New York Times, Newsday, Entertainment Weekly and The Washington Post. The opinions expressed in this commentary are solely those of the writer. (CNN)Everybody on the planet knows that Gene Roddenberry created Mr. Spock, the laconic, imperturbable extra-terrestrial First Officer for the Starship Enterprise. But Mr. Spock doesn't belong to Roddenberry, even though he is the grand exalted progenitor of everything that was, is, and forever will be "Star Trek."Mr. Spock belongs to Leonard Nimoy, who died Friday at age 83. And though he doesn't take Spock with him, he and Spock remain inseparable. Zachary Quinto, who plays Spock in the re

Processing articles: 100%|██████████| 1610/1610 [12:31<00:00,  2.14article/s]
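
The prompt asks for 'Yes' or 'No', but a chat model may still add punctuation, quotes, or a trailing explanation. The following is a minimal sketch of a more tolerant parser, assuming the reply begins with the answer; `parse_yes_no` is a hypothetical helper and not part of LangChain.

```python
import re

def parse_yes_no(reply):
    """Map a free-form model reply to True, False, or None when it is unclear."""
    # Skip leading punctuation or quotes, then require a whole word "yes" or "no".
    match = re.match(r"\W*(yes|no)\b", reply.strip().lower())
    if match is None:
        return None
    return match.group(1) == "yes"

print(parse_yes_no("Yes."))      # True
print(parse_yes_no("No"))        # False
print(parse_yes_no("Not sure"))  # None
```

Returning `None` for unclear replies lets the caller count or log them separately instead of silently treating them as "No".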
