## **What is pynarrative?**
pynarrative is a Python library for data storytelling. It builds on Altair to turn charts into annotated, narrated visualizations, and this notebook pairs it with a small template helper that converts data insights into human-readable sentences for dashboards, reports, and analytics tools.

### **Learning Path (Topics Covered)**

Here’s a complete roadmap to learn pynarrative from scratch:

1. Installation & Setup
2. Understanding Narrative Templates
3. Working with DataFrames
4. Descriptive Narratives
5. Comparative Narratives
6. Time Series Narratives
7. Custom Templates & Dynamic Text
8. Generating Reports
9. Integrating with Dashboards (like Streamlit)
10. Project: Auto-Narrated Sales Report

### **1. Installation & Setup**

In [22]:
#pip install pynarrative

In [58]:
import pynarrative as pn
import pandas as pd
import altair as alt

In [23]:
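# Minimal template helper defined in this notebook; it relies only on str.format.
class Narrative:
    def __init__(self, template):
        # A str.format-style template, e.g. "Sales in {region}"
        self.template = template

    def render(self, variables):
        # Fill the template's placeholders from a dict of values
        return self.template.format(**variables)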

### **2. Understanding Narrative Templates**

Purpose:
Templates act as reusable sentence structures that insert variables dynamically.

In [24]:
template = "The total sales were ${total_sales} in the {region} region."
variables = {
    "total_sales": 120000,
    "region": "North"
}

narrative = Narrative(template)
print(narrative.render(variables))

The total sales were $120000 in the North region.
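
Because `render` relies on Python's built-in `str.format`, a template can also carry format specifiers. The sketch below is only an illustration: it re-declares the notebook's `Narrative` helper so that it runs on its own, and it uses the `:,` specifier to add thousands separators.

```python
# Re-declared so the example runs on its own; same helper as in the notebook.
class Narrative:
    def __init__(self, template):
        self.template = template

    def render(self, variables):
        return self.template.format(**variables)


# ":," asks str.format to insert thousands separators.
template = "The total sales were ${total_sales:,} in the {region} region."
text = Narrative(template).render({"total_sales": 120000, "region": "North"})
print(text)  # The total sales were $120,000 in the North region.
```

The same mechanism supports other specifiers, such as `:.2f` for two decimal places.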


### **3. Working with DataFrames**

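The `Narrative` helper pairs naturally with pandas: each DataFrame row can be converted to a dict with `row.to_dict()` and passed straight to `render`.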

In [50]:
data = {
    "Region": ["North", "South", "East", "West"],
    "Sales": [120000, 95000, 113000, 88000],
    "Profit": [30000, 18000, 24000, 15000],
    "Month": ["Jan", "Feb", "Mar", "Apr"],
    "Revenue": [30000, 45000, 50000, 60000]
}
df = pd.DataFrame(data)

In [42]:
# Build the narrative once, then render it for every row
narrative = Narrative("Region {Region} made sales worth ${Sales}.")
for _, row in df.iterrows():
    print(narrative.render(row.to_dict()))

Region North made sales worth $120000.
Region South made sales worth $95000.
Region East made sales worth $113000.
Region West made sales worth $88000.
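
As an alternative to printing inside a loop, the sentences can be stored next to the data. The following is a minimal sketch, assuming a trimmed-down copy of the DataFrame above; the `Narrative` column name is an illustrative choice, not something pynarrative requires.

```python
import pandas as pd

df = pd.DataFrame({
    "Region": ["North", "South", "East", "West"],
    "Sales": [120000, 95000, 113000, 88000],
})

template = "Region {Region} made sales worth ${Sales:,}."

# to_dict("records") yields one plain dict per row, which str.format can unpack.
df["Narrative"] = [template.format(**record) for record in df.to_dict("records")]

for sentence in df["Narrative"]:
    print(sentence)
```

Keeping the text in a column makes it easy to filter, sort, or export the narratives together with the numbers they describe.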


### **4. Descriptive Narratives**

Used for summarizing key metrics.

In [43]:
total_sales = df["Sales"].sum()
template = "Overall, the total sales across all regions were ${total_sales}."
narrative = Narrative(template)
print(narrative.render({"total_sales": total_sales}))


Overall, the total sales across all regions were $416000.


### **5. Comparative Narratives**

Used to compare metrics between regions or time periods.

In [44]:
north = df[df["Region"] == "North"]["Sales"].values[0]
south = df[df["Region"] == "South"]["Sales"].values[0]

template = "North region outperformed South by ${difference}."
narrative = Narrative(template)
print(narrative.render({"difference": north - south}))

North region outperformed South by $25000.
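
The template above hard-codes the word "outperformed", which reads wrongly when the first region is actually behind. One possible refinement is sketched here: a hypothetical `compare` helper (not part of pynarrative) that picks the verb from the sign of the difference and adds the relative gap. It assumes the second value is non-zero.

```python
def compare(name_a, value_a, name_b, value_b):
    """Return a sentence comparing two values (hypothetical helper)."""
    difference = value_a - value_b
    if difference == 0:
        return f"{name_a} and {name_b} recorded the same sales (${value_a:,})."
    verb = "outperformed" if difference > 0 else "trailed"
    # Express the gap relative to the second value (assumed non-zero).
    percent = abs(difference) / value_b * 100
    return f"{name_a} {verb} {name_b} by ${abs(difference):,} ({percent:.1f}%)."


ahead = compare("North", 120000, "South", 95000)
behind = compare("South", 95000, "North", 120000)
print(ahead)   # North outperformed South by $25,000 (26.3%).
print(behind)  # South trailed North by $25,000 (20.8%).
```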


### **6. Time Series Narratives**

Used for trend-based storytelling.

In [45]:
diff = df["Revenue"].iloc[-1] - df["Revenue"].iloc[0]
template = "From January to April, revenue changed by ${diff}."
narrative = Narrative(template)
print(narrative.render({"diff": diff}))

From January to April, revenue changed by $30000.
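
Hard-coding the month names in the template makes it easy for the text to drift away from the data. A minimal sketch of one way to avoid that is shown below: a hypothetical `describe_trend` helper that reads the first and last labels from the series itself and words the direction of the change. It uses plain lists with the same values as the `Month` and `Revenue` columns and assumes a non-zero starting value.

```python
def describe_trend(labels, values):
    """Describe the change between the first and last observation (hypothetical helper)."""
    change = values[-1] - values[0]
    if change == 0:
        return f"From {labels[0]} to {labels[-1]}, revenue was unchanged at ${values[0]:,}."
    direction = "rose" if change > 0 else "fell"
    # Relative change against the first observation (assumed non-zero).
    percent = abs(change) / values[0] * 100
    return (
        f"From {labels[0]} to {labels[-1]}, revenue {direction} by "
        f"${abs(change):,} ({percent:.0f}%)."
    )


months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [30000, 45000, 50000, 60000]
trend = describe_trend(months, revenue)
print(trend)  # From Jan to Apr, revenue rose by $30,000 (100%).
```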


### **7. Custom Templates & Dynamic Text**

You can write conditional logic to vary the tone or message.

In [46]:
def create_narrative(row):
    if row["Sales"] > 100000:
        return Narrative("Great job! {Region} exceeded expectations with ${Sales}.").render(row)
    else:
        return Narrative("{Region} needs improvement with only ${Sales}.").render(row)

for _, row in df.iterrows():
    print(create_narrative(row.to_dict()))

Great job! North exceeded expectations with $120000.
South needs improvement with only $95000.
Great job! East exceeded expectations with $113000.
West needs improvement with only $88000.
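
When more than two tones are needed, nested `if`/`else` branches become hard to maintain. The sketch below shows one alternative under stated assumptions: a small rule table that maps minimum sales to templates. The thresholds, the middle tier, and its wording are illustrative only, and the top rule uses `>=` where the function above uses `>`.

```python
# (minimum sales, template) pairs, checked from the highest threshold down.
# Thresholds and wording are illustrative assumptions.
RULES = [
    (100000, "Great job! {Region} exceeded expectations with ${Sales:,}."),
    (90000, "{Region} came close to target with ${Sales:,}."),
    (0, "{Region} needs improvement with only ${Sales:,}."),
]


def pick_template(sales):
    """Return the first template whose threshold the sales value meets."""
    for minimum, template in RULES:
        if sales >= minimum:
            return template
    return RULES[-1][1]  # fallback for negative values


rows = [
    {"Region": "North", "Sales": 120000},
    {"Region": "South", "Sales": 95000},
    {"Region": "West", "Sales": 88000},
]
messages = [pick_template(row["Sales"]).format(**row) for row in rows]
for message in messages:
    print(message)
```

Adding a new tone then means adding one row to `RULES` rather than another branch.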


### **8. Generating Reports**

Combine multiple narratives into a report.

In [47]:
narratives = []
narratives.append(Narrative("Total sales: ${total}.").render({"total": df["Sales"].sum()}))

for _, row in df.iterrows():
    narratives.append(Narrative("{Region} had sales of ${Sales}.").render(row.to_dict()))

report = "\n".join(narratives)
print(report)

Total sales: $416000.
North had sales of $120000.
South had sales of $95000.
East had sales of $113000.
West had sales of $88000.
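
A report usually needs some structure and a place to live on disk. The following is a minimal sketch, not a pynarrative feature: a hypothetical `build_report` helper groups already-rendered sentences under headings, and the result is written to a text file. The file name and the temporary directory are placeholders.

```python
import tempfile
from pathlib import Path


def build_report(title, sections):
    """Join a title and {heading: [sentences]} sections into one text block."""
    lines = [title, "=" * len(title), ""]
    for heading, sentences in sections.items():
        lines.append(heading)
        lines.extend(f"- {sentence}" for sentence in sentences)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


sections = {
    "Summary": ["Total sales: $416000."],
    "By region": ["North had sales of $120000.", "South had sales of $95000."],
}
report_text = build_report("Sales Report", sections)

# Written to a temporary directory here; choose a real path in practice.
output_path = Path(tempfile.gettempdir()) / "sales_report.txt"
output_path.write_text(report_text, encoding="utf-8")
print(report_text)
```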


In [53]:
def region_summary(row):
    if row["Sales"] >= 100000:
        return Narrative("{Region} performed excellently with sales of ${Sales} and profit of ${Profit}.").render(row)
    else:
        return Narrative("{Region} needs improvement with only ${Sales} in sales and ${Profit} in profit.").render(row)

for _, row in df.iterrows():
    print(region_summary(row.to_dict()))

North performed excellently with sales of $120000 and profit of $30000.
South needs improvement with only $95000 in sales and $18000 in profit.
East performed excellently with sales of $113000 and profit of $24000.
West needs improvement with only $88000 in sales and $15000 in profit.


In [59]:
import matplotlib.pyplot as plt
import seaborn as sns

# Bar Plot
plt.figure(figsize=(8,5))
sns.barplot(x="Region", y="Sales", data=df)
plt.title("Sales by Region")
plt.tight_layout()
plt.show()




In [55]:
highest = df.loc[df["Sales"].idxmax()]
lowest = df.loc[df["Sales"].idxmin()]

template = "🔍 The highest sales were in {Region} (${Sales}), while the lowest were in {Region2} (${Sales2})."
narrative = Narrative(template)
print(narrative.render({
    "Region": highest["Region"],
    "Sales": highest["Sales"],
    "Region2": lowest["Region"],
    "Sales2": lowest["Sales"]
}))


🔍 The highest sales were in North ($120000), while the lowest were in West ($88000).


In [57]:
narratives = []

# Overall summary
total_sales = df["Sales"].sum()
avg_profit = df["Profit"].mean()

narratives.append(Narrative("Total sales across all regions: ${total_sales}.").render({"total_sales": total_sales}))
narratives.append(Narrative("Average profit: ${avg_profit:.2f}.").render({"avg_profit": avg_profit}))

# Region-wise
for _, row in df.iterrows():
    narratives.append(region_summary(row.to_dict()))

# Comparison
narratives.append(Narrative("Top-performing region: {Region} (${Sales}).").render(highest.to_dict()))
narratives.append(Narrative("Lowest-performing region: {Region} (${Sales}).").render(lowest.to_dict()))

# Final Report
full_report = "\n".join(narratives)
print(full_report)

Total sales across all regions: $416000.
Average profit: $21750.00.
North performed excellently with sales of $120000 and profit of $30000.
South needs improvement with only $95000 in sales and $18000 in profit.
East performed excellently with sales of $113000 and profit of $24000.
West needs improvement with only $88000 in sales and $15000 in profit.
Top-performing region: North ($120000).
Lowest-performing region: West ($88000).


In [60]:
hist_sales = (
    pn.Story(df, width=600)
    .mark_bar()
    .encode(
        alt.X('Sales', bin=alt.Bin(maxbins=10), title='Sales Amount'),
        y='count()',
        color='Region'
    )
    .properties(title='Sales Distribution by Region')
)

hist_sales.show()


In [61]:
bar_profit = (
    pn.Story(df, width=600)
    .mark_bar()
    .encode(
        x=alt.X('Region', title='Region'),
        y=alt.Y('Profit', title='Profit ($)'),
        color='Region'
    )
    .properties(title='Profit by Region')
)

bar_profit.show()


In [62]:
line_revenue = (
    pn.Story(df, width=600)
    .mark_line(point=True)
    .encode(
        x=alt.X('Month', sort=["Jan", "Feb", "Mar", "Apr"]),
        y=alt.Y('Revenue', title='Revenue ($)'),
        color=alt.value('steelblue')
    )
    .properties(title='Monthly Revenue Trend')
)

line_revenue.show()


In [63]:
scatter = (
    pn.Story(df, width=600)
    .mark_circle(size=100)
    .encode(
        x=alt.X('Sales', title='Sales ($)'),
        y=alt.Y('Profit', title='Profit ($)'),
        color='Region',
        tooltip=['Region', 'Sales', 'Profit']
    )
    .properties(title='Sales vs Profit by Region')
)

scatter.show()


In [72]:
unemp_df = pd.DataFrame({
    'Year': [2018, 2019, 2020, 2021, 2022],
    'UnemploymentRate': [4.5, 3.9, 8.1, 6.2, 5.3]
})

unemp_story = (pn.Story(unemp_df, width=600)
                 .mark_bar(color='teal')
                 .encode(x='Year:O', y='UnemploymentRate:Q')
                 .add_title("State Unemployment Rate", "2018-2022",
                            title_color="#333")
                 .add_context("Sharp increase in 2020 due to the pandemic", position='top')
                 .add_annotation(2020, 8.1, "Pandemic impact", arrow_color='red', label_color='darkred')
                 .render())
unemp_story

In [5]:
covid_df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Cases': [1000, 3000, 7000, 5000, 2000]
})

# Create a narrative chart
covid_story = (pn.Story(covid_df)
                 .mark_line(color='firebrick')
                 .encode(x='Month:O', y='Cases:Q')
                 .add_title("COVID-19 Cases Over Time",
                            "Monthly trend",
                            title_color="#b22222")
                 .add_context("Cases peaked in March and declined in April/May", position='top')
                 .add_annotation('Mar', 7000, "Peak in March", arrow_color='gray', label_color='black')
                 .render())
covid_story