# **Crisis NLP Analysis with Sieves**

<img src="assets/sieves_logo.png" width=200px height=200px />


## Table of Contents
1. [Imports](#imports)
2. [Dataset Loading](#dataset-loading)
3. [Data Exploration](#data-exploration)  
4. [Creating Sieves Doc Objects](#creating-sieves-doc-objects)
5. [Defining Models](#defining-models)
6. [Language Detection](#language-detection)
7. [Main Pipeline](#main-pipeline)
8. [Evaluating Results](#evaluating-results)
9. [Summary](#summary)

### Imports

Dependencies:
- ipykernel
- pandas
- openai
- outlines
- sieves


In [1]:
import pandas as pd
import os
import openai
import outlines
from sieves import Pipeline, tasks, Doc, GenerationSettings
from sieves.tasks.predictive.classification import FewshotExampleMultiLabel
from pydantic import BaseModel, Field
from typing import Literal
import random

# Number of rows per crisis type to subset to
n_rows = 100

# Number of tweets to sample for the pipeline runs (None = use all tweets)
n_tweets = 100


/Users/climatiq/dev/sieves/.venv/lib/python3.12/site-packages/pydantic/_internal/_config.py:323: PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.11/migration/


### Dataset Loading

Download the dataset. If you are unable to run the following code block, you can download the dataset manually from this URL:

https://crisisnlp.qcri.org/data/lrec2016/labeled_cf/CrisisNLP_labeled_data_crowdflower_v2.zip

Then place it in a directory with the path /examples/use-case/CrisisNLP

In [2]:
import urllib.request
import zipfile
from pathlib import Path
import shutil

url = "https://crisisnlp.qcri.org/data/lrec2016/labeled_cf/CrisisNLP_labeled_data_crowdflower_v2.zip"
zip_path = Path("CrisisNLP_dataset.zip")
data_dir = Path("CrisisNLP")

if not zip_path.exists():
    urllib.request.urlretrieve(url, str(zip_path))

with zipfile.ZipFile(str(zip_path), 'r') as z:
    z.extractall(".")

extracted = Path("CrisisNLP_labeled_data_crowdflower")
if extracted.exists():
    data_dir.mkdir(exist_ok=True)
    for item in extracted.iterdir():
        target = data_dir / item.name
        if not target.exists():
            item.rename(target)
    shutil.rmtree(extracted)


Load the data

In [3]:
data: pd.DataFrame = pd.DataFrame()

print("Available Crisis Types:\n")
for file in os.listdir("CrisisNLP"):
    if os.path.isdir(os.path.join("CrisisNLP", file)):
        print(f"- {file}")
        for file_ in os.listdir(os.path.join("CrisisNLP", file)):
            if file_.endswith(".tsv"):
                data_ = pd.read_csv(os.path.join("CrisisNLP", file, file_), sep="\t")
                
                if n_rows is not None:
                    # Subset to n_rows rows for testing
                    data_ = data_[0:n_rows]
                
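                # Directory names look like "2015_Nepal_Earthquake_en": token 0 is the year
                # and token 1 is the first word after it. As written, "location" therefore
                # holds the year and "crisis_type" holds that second token (e.g. "Nepal").
                data_["location"] = file.split("_")[0]
                data_["crisis_type"] = file.split("_")[1]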
                data = pd.concat([data, data_])



Available Crisis Types:

- 2015_Nepal_Earthquake_en
- 2015_Cyclone_Pam_en
- 2014_Middle_East_Respiratory_Syndrome_en
- 2014_India_floods
- 2014_Pakistan_floods
- 2014_ebola_cf
- 2014_California_Earthquake
- 2013_Pakistan_eq
- 2014_Hurricane_Odile_Mexico_en
- 2014_Chile_Earthquake_cl
- 2014_Chile_Earthquake_en
- 2014_Philippines_Typhoon_Hagupit_en
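
The directory names listed above follow a `<year>_<event description>` pattern. A minimal sketch of separating the year from the rest of the name is shown below. `parse_crisis_dir` is a hypothetical helper that is not part of the notebook, and it makes no attempt to interpret trailing suffixes such as `en` or `cf`.

```python
def parse_crisis_dir(name: str) -> dict[str, str]:
    """Split a dataset directory name into its leading year and the remaining event part.

    Hypothetical helper for illustration; assumes the name starts with "<year>_".
    """
    # partition() splits on the first underscore only, so the event part stays intact
    year, _, event = name.partition("_")
    return {"year": year, "event": event}


for name in ["2015_Nepal_Earthquake_en", "2014_ebola_cf"]:
    print(parse_crisis_dir(name))
```

Keeping the event part whole avoids assigning a place name to some directories and an event name to others, which is what happens when only the second token is taken.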


### Data Exploration

In [4]:
data.tail()

| | tweet_id | tweet_text | label | location | crisis_type |
|---|---|---|---|---|---|
| 5 | '541562943847272449' | RT @rapplerdotcom: #RubyPH cuts power lines in... | infrastructure_and_utilities_damage | 2014 | Philippines |
| 6 | '541246399426625536' | 3Novices:Super Typhoon Lashes Philippines as 6... | displaced_people_and_evacuations | 2014 | Philippines |
| 7 | '541210384800030720' | #TeamYamita Hundreds of Thousands Evacuated in... | displaced_people_and_evacuations | 2014 | Philippines |
| 8 | '541610392825634816' | CALL FOR VOLUNTEERS: Repacking of #RubyPH reli... | donation_needs_or_offers_or_volunteering_services | 2014 | Philippines |
| 9 | '540796767663439872' | RT @ChannelNewsAsia: Filipinos seek shelter in... | displaced_people_and_evacuations | 2014 | Philippines |


In [5]:
data["location"].value_counts()

location
2014    90
2015    20
2013    10
Name: count, dtype: int64

In [6]:
print("Number of rows:", len(data))
print("Number of columns:", len(data.columns))
print("Column names:", data.columns.tolist())

print("\nSummary statistics:")
display(data.describe(include='all'))


Number of rows: 120
Number of columns: 5
Column names: ['tweet_id', 'tweet_text', 'label', 'location', 'crisis_type']

Summary statistics:


| | tweet_id | tweet_text | label | location | crisis_type |
|---|---|---|---|---|---|
| count | 120 | 120 | 120 | 120 | 120 |
| unique | 120 | 120 | 14 | 3 | 10 |
| top | '592326564110585856' | RT @divyaconnects: Reached #Kathmandu finally!... | other_useful_information | 2014 | Pakistan |
| freq | 1 | 1 | 33 | 90 | 20 |


### Creating Sieves Doc Objects


In [7]:
# Convert tweets to Sieves Doc objects
def create_sieves_docs(df):
    """Create Sieves Doc objects from DataFrame"""
    docs: list[Doc] = []
    for idx, row in df.iterrows():
        doc = Doc(
            uri=f"tweet_{row['tweet_id']}",
            text=row['tweet_text']
        )
        # Add metadata
        doc.metadata = {
            'tweet_id': row['tweet_id'],
            'gold_label': row['label'],
            'location': row['location'],
            'crisis_type': row['crisis_type']
        }
        docs.append(doc)
    return docs

# Create docs from all loaded tweets (language filtering happens later)
docs = create_sieves_docs(data)
print(f"Created {len(docs)} Sieves Doc objects")
print(f"\nSample doc:")
print(f"Tweet: {docs[0].text[:100]}...")
print(f"Metadata: {docs[0].metadata}")


Created 120 Sieves Doc objects

Sample doc:
Tweet: RT @divyaconnects: Reached #Kathmandu finally! Lots of Indians stranded at the airport #NepalQuake @...
Metadata: {'tweet_id': "'592326564110585856'", 'gold_label': 'other_useful_information', 'location': '2015', 'crisis_type': 'Nepal'}


### Defining Models

In [8]:
# Access models through OpenRouter's OpenAI-compatible API via outlines

openrouter_client = openai.OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)
# model = outlines.from_openai(openrouter_client, model_name="openai/gpt-5-nano")
model = outlines.from_openai(openrouter_client, model_name="google/gemini-2.5-flash-lite-preview-09-2025")


Subset of docs

In [9]:
if n_tweets is not None:
    # Cap the sample size so random.sample() cannot raise when n_tweets > len(docs)
    test_docs = random.sample(docs, min(n_tweets, len(docs)))
else:
    test_docs = docs

### Language Detection

Define the language detection few-shot examples, task, and pipeline

In [10]:
from sieves.tasks.predictive.classification import FewshotExampleMultiLabel

fewshot_examples = [
    FewshotExampleMultiLabel(
        text="Help! There's a major earthquake in California. Everyone stay safe!",
        reasoning="This text is clearly written in English with standard English words, grammar, and sentence structure.",
        confidence_per_label={"english": 1.0, "other": 0.0}
    ),
    FewshotExampleMultiLabel(
        text="¡Ayuda! Hay un terremoto en México. ¡Manténganse seguros!",
        reasoning="This text is written in Spanish, using Spanish words like 'Ayuda' (help) and 'terremoto' (earthquake), and Spanish grammar structures.",
        confidence_per_label={"english": 0.0, "other": 1.0}
    ),
    FewshotExampleMultiLabel(
        text="Praying for the victims of the flood in Nepal. #PrayForNepal",
        reasoning="This is an English tweet with standard English vocabulary, grammar, and hashtag format commonly used on Twitter.",
        confidence_per_label={"english": 0.95, "other": 0.05}
    ),
    FewshotExampleMultiLabel(
        text="Bonjour! Comment allez-vous? Je parle français.",
        reasoning="This text contains French words and phrases. While it has some basic structure, it's clearly not English.",
        confidence_per_label={"english": 0.1, "other": 0.9}
    ),
    FewshotExampleMultiLabel(
        text="RT @user: Just saw the news about the hurricane. Hope everyone is okay!",
        reasoning="This is an English tweet with typical Twitter retweet format (RT), English words, and standard English sentence structure.",
        confidence_per_label={"english": 0.98, "other": 0.02}
    ),
    FewshotExampleMultiLabel(
        text="Hello world! This is a test message with some English words but also 你好世界",
        reasoning="This text contains both English and Chinese characters. It's mixed language, so while English is present, it's not entirely English.",
        confidence_per_label={"english": 0.6, "other": 0.7}
    )
]
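
A quick consistency check on the confidence values can catch typos before they reach the model. The sketch below works on plain dictionaries rather than `FewshotExampleMultiLabel` objects, and `check_confidences` is a hypothetical helper. It deliberately does not require the values to sum to 1, since the mixed-language example above uses 0.6 and 0.7.

```python
def check_confidences(examples: list[dict[str, float]], labels: list[str]) -> list[str]:
    """Return human-readable problems found in per-label confidence dictionaries."""
    problems = []
    for i, conf in enumerate(examples):
        # Every label should have a confidence value
        missing = set(labels) - set(conf)
        if missing:
            problems.append(f"example {i}: missing labels {sorted(missing)}")
        # Each confidence should be a probability-like value in [0, 1]
        for label, value in conf.items():
            if not 0.0 <= value <= 1.0:
                problems.append(f"example {i}: {label}={value} is outside [0, 1]")
    return problems


confidences = [
    {"english": 1.0, "other": 0.0},
    {"english": 0.6, "other": 0.7},
]
print(check_confidences(confidences, labels=["english", "other"]))
```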

Define the task

In [11]:
from sieves import GenerationSettings

language_detector = tasks.Classification(
        task_id="language_detector",
        labels=["english", "other"],
        model=model,
        label_descriptions={"english": "The tweet is in English.", "other": "The tweet is in a language other than English."},
        prompt_instructions="""
        You are a helpful assistant that determines the language of a given text. If the text is in English, return 'english'. Otherwise, return 'other'.
        """,
        fewshot_examples=fewshot_examples,
        batch_size=10,
        generation_settings=GenerationSettings(strict_mode=True)
    )

language_detection = Pipeline([language_detector])
print(f"Number of tasks: {len(language_detection.tasks)}")

Number of tasks: 1


Run the language detection pipeline

In [12]:
try:
    language_detection_results = list(language_detection(test_docs))
    print("Language detection pipeline run successfully!")

except Exception as e:
    print(f"Error running language detection: {e}")
    raise

Running pipeline: 100%|██████████| 10/10 [00:14<00:00,  1.48s/it]

Language detection pipeline run successfully!





Show results

In [13]:
language_detection_results[:10]

[Doc(meta={}, results={'language_detector': [('english', 0.95), ('other', 0.05)]}, uri="tweet_'468077392955990016'", text='(#Luke 21:11) #MERS: Saudis + 3 more deaths, WHO says increase prevention http://t.co/1vTnlGeO4k #EndTimes #WorldNews http://t.co/cwBbDeMPAz', chunks=['(#Luke 21:11) #MERS: Saudis + 3 more deaths, WHO says increase prevention http://t.co/1vTnlGeO4k #EndTimes #WorldNews http://t.co/cwBbDeMPAz'], id=None, images=None),
 Doc(meta={}, results={'language_detector': [('english', 1.0), ('other', 0.0)]}, uri="tweet_'592955183753203715'", text='VIDEO REPORT Kathmandu overwhelmed by rubble after earthquake http://t.co/cYrAbG1OxZ Over 3000 dead, 6,500 plus injured #NepalEarthquake', chunks=['VIDEO REPORT Kathmandu overwhelmed by rubble after earthquake http://t.co/cYrAbG1OxZ Over 3000 dead, 6,500 plus injured #NepalEarthquake'], id=None, images=None),
 Doc(meta={}, results={'language_detector': [('english', 0.9), ('other', 0.1)]}, uri="tweet_'383790723222364161'", text='#Eart

Filter for English tweets

In [14]:
# Keep tweets whose "english" confidence exceeds 0.6.
# Each result is a list of (label, confidence) pairs, so convert it to a dict for lookup.
english_docs = [
    doc for doc in language_detection_results
    if dict(doc.results["language_detector"]).get("english", 0.0) > 0.6
]

print(f"Number of English tweets: {len(english_docs)}")
print(f"Number of non-English tweets: {len(language_detection_results) - len(english_docs)}")


Number of English tweets: 8
Number of non-English tweets: 2
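
The filtering rule can be isolated and checked without calling a model. The sketch below assumes the result shape shown in the output above, a list of `(label, confidence)` pairs. `is_english` is a hypothetical helper and the 0.6 threshold mirrors the one used in the notebook.

```python
def is_english(scores: list[tuple[str, float]], threshold: float = 0.6) -> bool:
    """Return True when the 'english' confidence is strictly above the threshold."""
    # dict() turns the (label, confidence) pairs into a lookup table,
    # so the order in which labels are returned does not matter
    return dict(scores).get("english", 0.0) > threshold


print(is_english([("english", 0.95), ("other", 0.05)]))  # True
print(is_english([("other", 0.9), ("english", 0.1)]))    # False
```

The comparison is strict, so a tweet scored at exactly 0.6 is treated as non-English.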


### Main Pipeline


Define the tasks

In [15]:
gen_settings = GenerationSettings(strict_mode=True)

# Zero-shot tasks with default configuration
class Country(BaseModel, frozen=True):
    name: Literal["India", "California", "Pakistan", "Mexico", "Senegal", "Vanuatu", "Nepal", "Philippines", "Unknown"] = Field(description="The name of the country or region mentioned in the tweet, if any. If none is mentioned, return 'Unknown'.")

class CrisisType(BaseModel, frozen=True):
    crisis_type: Literal["Earthquake", "Flood", "Hurricane", "Typhoon", "Cyclone", "Ebola", "Unknown"] = Field(description="The type of crisis mentioned in the tweet")

crisis_classifier = tasks.Classification(
        task_id="crisis_classifier",
        labels=["related_to_crisis", "not_related_to_crisis"],
        model=model,
        batch_size=10,
        generation_settings=gen_settings
    )

crisis_type_extractor = tasks.InformationExtraction(
        task_id="crisis_type_extractor",
        entity_type=CrisisType,
        model=model,
        batch_size=10,
        generation_settings=gen_settings
    )

location_extractor = tasks.InformationExtraction(
        task_id="location_extractor",
        entity_type=Country,
        model=model,
        batch_size=10,
        generation_settings=gen_settings
    )


Define Main Pipeline

In [16]:
pipe = Pipeline([
    crisis_classifier,
    crisis_type_extractor,
    location_extractor,
])

print("Pipeline created successfully!")
print(f"Number of tasks: {len(pipe.tasks)}")

Pipeline created successfully!
Number of tasks: 3


Run the pipeline

In [17]:
try:
    results = list(pipe(english_docs))
    
    # Display sample results
    for i, doc in enumerate(results):
        print(f"\n--- Tweet {i+1} ---")
        print(f"Text: {doc.text}...")
        print(f"Crisis Classifier: {doc.results.get('crisis_classifier', 'N/A')}")
        print(f"Crisis Type Extraction: {doc.results.get('crisis_type_extractor')}")
        print(f"Locations Found: {doc.results.get('location_extractor', 'N/A')}")
        
except Exception as e:
    print(f"Error running pipeline: {e}")
    raise  # Re-raise so later cells do not run against a missing `results`


Running pipeline: 100%|██████████| 8/8 [00:22<00:00,  2.85s/it]

--- Tweet 1 ---
Text: (#Luke 21:11) #MERS: Saudis + 3 more deaths, WHO says increase prevention http://t.co/1vTnlGeO4k #EndTimes #WorldNews http://t.co/cwBbDeMPAz...
Crisis Classifier: [('related_to_crisis', 1.0), ('not_related_to_crisis', 0.0)]
Crisis Type Extraction: [CrisisType(crisis_type='Unknown')]
Locations Found: [Country(name='Unknown')]

--- Tweet 2 ---
Text: VIDEO REPORT Kathmandu overwhelmed by rubble after earthquake http://t.co/cYrAbG1OxZ Over 3000 dead, 6,500 plus injured #NepalEarthquake...
Crisis Classifier: [('related_to_crisis', 1.0), ('not_related_to_crisis', 0.0)]
Crisis Type Extraction: [CrisisType(crisis_type='Earthquake')]
Locations Found: [Country(name='Nepal')]

--- Tweet 3 ---
Text: #Earthquake 2013-09-28 02:39:43 (M5.0) EAST OF THE SOUTH SANDWICH ISLANDS -59.5 -19.1 (70fa9) http://t.co/uBN98fFmNj notice...
Crisis Classifier: [('related_to_crisis', 1.0), ('not_related_to_crisis', 0.0)]
Crisis Type Extraction: [CrisisType(crisis_type='Earthquake')]
Locatio




### Evaluating Results

Create a structured DataFrame with the results

In [18]:
evaluation_data = []
for i, doc in enumerate(results):
    # Get crisis classification
    crisis_result = doc.results.get('crisis_classifier', [])
    predicted_label = crisis_result[0][0] if crisis_result else 'unknown'
    predicted_conf = crisis_result[0][1] if crisis_result else 0.0
    
    # Get crisis type extraction
    crisis_type_result = doc.results.get('crisis_type_extractor', [])
    crisis_type = crisis_type_result[0].crisis_type if crisis_type_result else 'none_extracted'
    
    # Get location extraction
    location_result = doc.results.get('location_extractor', [])
    locations = [loc.name for loc in location_result] if location_result else []
    
    # Get gold labels from metadata
    gold_label = doc.metadata.get('gold_label', 'unknown')
    gold_crisis_type = doc.metadata.get('crisis_type', 'unknown')
    gold_location = doc.metadata.get('location', 'unknown')
    
    evaluation_data.append({
        'tweet_id': doc.metadata.get('tweet_id', ''),
        'text': doc.text[:100],
        'predicted_label': predicted_label,
        'predicted_confidence': predicted_conf,
        'gold_label': gold_label,
        'predicted_crisis_type': crisis_type,
        'gold_crisis_type': gold_crisis_type,
        'predicted_locations': ', '.join(locations) if locations else 'none',
        'gold_location': gold_location,
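        # Note: these two flags compare against lowercase 'none_extracted' / 'unknown', so the
        # schema's capitalized 'Unknown' value still counts as a successful extraction.
        'crisis_type_extracted': crisis_type != 'none_extracted',
        'location_extracted': any(loc != 'unknown' for loc in locations) if locations else False,
        # gold_location holds the first token of the directory name (the year), so this
        # substring match against predicted place names is not expected to succeed.
        'location_match': any(gold_location.lower() in loc.lower() for loc in locations) if locations else False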
    })

df_eval = pd.DataFrame(evaluation_data)

Analyze the obtained results

In [19]:
print("=" * 80)
print("PIPELINE PERFORMANCE")
print("=" * 80)

# 1. Crisis Classification Performance
df_eval['gold_binary'] = df_eval['gold_label'].apply(
    lambda x: 'not_related_to_crisis' if 'not_related' in x.lower() or 'irrelevant' in x.lower() 
    else 'related_to_crisis'
)
classification_accuracy = (df_eval['predicted_label'] == df_eval['gold_binary']).sum() / len(df_eval) * 100

print(f"\n1. CRISIS CLASSIFICATION")
print(f"   Accuracy: {classification_accuracy:.1f}%")
print(f"   Predicted: {sum(df_eval['predicted_label'] == 'related_to_crisis')} crisis, {sum(df_eval['predicted_label'] == 'not_related_to_crisis')} not crisis")

# 2. Crisis Type Extraction Performance  
crisis_type_success = df_eval['crisis_type_extracted'].sum()
crisis_type_total = len(df_eval)

print(f"\n2. CRISIS TYPE EXTRACTION")
print(f"   Extracted from: {crisis_type_success}/{crisis_type_total} tweets")
print(f"   Success rate: {crisis_type_success/crisis_type_total*100:.1f}%")
if crisis_type_success > 0:
    successful = df_eval[df_eval['crisis_type_extracted'] == True]
    types = successful['predicted_crisis_type'].value_counts()
    print(f"   Types found: {', '.join(types.index.tolist())}")

# 3. Location Extraction Performance
location_success = df_eval['location_extracted'].sum()
location_total = len(df_eval)

print(f"\n3. LOCATION EXTRACTION")
print(f"   Extracted from: {location_success}/{location_total} tweets")
print(f"   Success rate: {location_success/location_total*100:.1f}%")
if location_success > 0:
    successful_locs = df_eval[df_eval['location_extracted'] == True]['predicted_locations']
    all_locations = set()
    for locs in successful_locs:
        all_locations.update(locs.split(', '))
    print(f"   Locations found: {len(all_locations)} unique locations")

print("\n" + "=" * 80)


PIPELINE PERFORMANCE

1. CRISIS CLASSIFICATION
   Accuracy: 100.0%
   Predicted: 8 crisis, 0 not crisis

2. CRISIS TYPE EXTRACTION
   Extracted from: 8/8 tweets
   Success rate: 100.0%
   Types found: Earthquake, Unknown, Cyclone, Hurricane, Ebola

3. LOCATION EXTRACTION
   Extracted from: 8/8 tweets
   Success rate: 100.0%
   Locations found: 4 unique locations
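
The accuracy figure above depends on collapsing the fine-grained gold labels into two classes. The sketch below restates that mapping on plain Python lists so it can be checked in isolation. `to_binary` and `accuracy` are hypothetical helper names, and the label `not_related_or_irrelevant` is an assumed example of a negative gold label that does not appear in the sample shown here.

```python
def to_binary(gold_label: str) -> str:
    """Collapse a fine-grained gold label into the classifier's two classes."""
    lowered = gold_label.lower()
    if "not_related" in lowered or "irrelevant" in lowered:
        return "not_related_to_crisis"
    return "related_to_crisis"


def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Return the percentage of positions where prediction and gold label agree."""
    if not predicted:
        return 0.0
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(predicted) * 100


gold = [to_binary(label) for label in ["deaths_reports", "not_related_or_irrelevant"]]
predicted = ["related_to_crisis", "related_to_crisis"]
print(f"{accuracy(predicted, gold):.1f}%")  # 50.0%
```

With all eight sampled tweets predicted as crisis-related, the reported accuracy says little about how the classifier handles unrelated tweets; a sample containing negative gold labels would be needed for that.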



Display the evaluation df

In [20]:
df_eval.head(10)

| | tweet_id | text | predicted_label | predicted_confidence | gold_label | predicted_crisis_type | gold_crisis_type | predicted_locations | gold_location | crisis_type_extracted | location_extracted | location_match | gold_binary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | '468077392955990016' | (#Luke 21:11) #MERS: Saudis + 3 more deaths, W... | related_to_crisis | 1.0 | deaths_reports | Unknown | Middle | Unknown | 2014 | True | True | False | related_to_crisis |
| 1 | '592955183753203715' | VIDEO REPORT Kathmandu overwhelmed by rubble a... | related_to_crisis | 1.0 | injured_or_dead_people | Earthquake | Nepal | Nepal | 2015 | True | True | False | related_to_crisis |
| 2 | '383790723222364161' | #Earthquake 2013-09-28 02:39:43 (M5.0) EAST OF... | related_to_crisis | 1.0 | other_useful_information | Earthquake | Pakistan | Unknown | 2013 | True | True | False | related_to_crisis |
| 3 | '576754146944163840' | RT @WFP_Asia: In contact with #vanuatu to unde... | related_to_crisis | 1.0 | donation_needs_or_offers_or_volunteering_services | Cyclone | Cyclone | Vanuatu | 2015 | True | True | False | related_to_crisis |
| 4 | '514627577831768064' | (09/22) ONE WAY TO HELP ODILE VICTIMS #surfin... | related_to_crisis | 0.98 | other_useful_information | Hurricane | Hurricane | Unknown | 2014 | True | True | False | related_to_crisis |
| 5 | '503765744333901824' | Say what you want, but the earthquake that hit... | related_to_crisis | 1.0 | other_useful_information | Earthquake | California | California | 2014 | True | True | False | related_to_crisis |
| 6 | '523225601910775808' | RT @itsmenanice: Read this. From Miasma to #Eb... | related_to_crisis | 1.0 | other_useful_information | Ebola | ebola | Unknown | 2014 | True | True | False | related_to_crisis |
| 7 | '451585608612208640' | Guys northem chile really needs a support mess... | related_to_crisis | 1.0 | sympathy_and_emotional_support | Earthquake | Chile | Unknown | 2014 | True | True | False | related_to_crisis |
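
In the evaluation table, `location_extracted` is `True` even where the predicted location is `Unknown`, because the check compares against lowercase `'unknown'`. Below is a minimal sketch of a case-insensitive variant, assuming that `Unknown` should count as no extraction. `extracted_locations` is a hypothetical helper, and the input list is copied from the `predicted_locations` column.

```python
def extracted_locations(predicted: list[str], placeholder: str = "unknown") -> list[str]:
    """Return only the predictions that name a location, ignoring the placeholder value."""
    # Compare case-insensitively so 'Unknown' and 'unknown' are treated alike
    return [loc for loc in predicted if loc.strip().lower() != placeholder]


# Values taken from the predicted_locations column of the evaluation table
predicted_locations = [
    "Unknown", "Nepal", "Unknown", "Vanuatu",
    "Unknown", "California", "Unknown", "Unknown",
]

found = extracted_locations(predicted_locations)
rate = len(found) / len(predicted_locations) * 100
print(f"{len(found)}/{len(predicted_locations)} tweets with a named location ({rate:.1f}%)")
# 3/8 tweets with a named location (37.5%)
```

Under this assumption the extraction rate for the sample drops from 8/8 to 3/8, which is closer to what the per-tweet output suggests.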


### Summary

This use case demonstrates how **Sieves** handles three common NLP tasks on crisis-related tweets:

✅ **Language Detection** - Automatically filters non-English tweets  
✅ **Classification** - Zero-shot classification to identify crisis-related content  
✅ **Information Extraction** - Structured extraction of crisis types and locations

- [Sieves Documentation](https://sieves.ai/)
- [CrisisNLP Dataset](https://crisisnlp.qcri.org/lrec2016/lrec2016.html)
- [Sieves GitHub](https://github.com/mantisai/sieves)
