<a href="https://colab.research.google.com/github/timothyow/news_search_summary_agent/blob/main/News_search_and_summary_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install langgraph langchain-openai tavily-python python-docx python-dateutil langchain_community

Collecting langgraph
  Downloading langgraph-0.5.1-py3-none-any.whl.metadata (6.7 kB)
Collecting langchain-openai
  Downloading langchain_openai-0.3.27-py3-none-any.whl.metadata (2.3 kB)
Collecting tavily-python
  Downloading tavily_python-0.7.9-py3-none-any.whl.metadata (7.5 kB)
Collecting python-docx
  Downloading python_docx-1.2.0-py3-none-any.whl.metadata (2.0 kB)
Collecting langchain_community
  Downloading langchain_community-0.3.27-py3-none-any.whl.metadata (2.9 kB)
Collecting langgraph-checkpoint<3.0.0,>=2.1.0 (from langgraph)
  Downloading langgraph_checkpoint-2.1.0-py3-none-any.whl.metadata (4.2 kB)
Collecting langgraph-prebuilt<0.6.0,>=0.5.0 (from langgraph)
  Downloading langgraph_prebuilt-0.5.2-py3-none-any.whl.metadata (4.5 kB)
Collecting langgraph-sdk<0.2.0,>=0.1.42 (from langgraph)
  Downloading langgraph_sdk-0.1.72-py3-none-any.whl.metadata (1.5 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain_community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.

In [13]:
import operator
from typing import TypedDict, Annotated
from getpass import getpass
import os
from langgraph.graph import StateGraph, END, START
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langchain_community.utilities.tavily_search import TavilySearchAPIWrapper
from datetime import datetime
from docx import Document
from docx.shared import Pt



In [4]:
# --- Setup keys ---
OPENAI_KEY = getpass('Enter Open AI API Key: ')
TAVILY_API_KEY = getpass('Enter Tavily Search API Key: ')
os.environ['OPENAI_API_KEY'] = OPENAI_KEY
os.environ['TAVILY_API_KEY'] = TAVILY_API_KEY

# --- Initialize Services ---
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
search = TavilySearchAPIWrapper()

Enter Open AI API Key: ··········
Enter Tavily Search API Key: ··········
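
When the notebook is re-run often, it can be convenient to reuse keys that are already set in the environment and only prompt when they are missing. The following is a minimal sketch of that idea; the `load_api_key` helper and the `DEMO_API_KEY` variable are hypothetical names used for illustration and are not part of the original notebook.

```python
import os
from getpass import getpass


def load_api_key(env_var: str, prompt: str) -> str:
    """Return the key from the environment if set; otherwise prompt for it and export it."""
    key = os.environ.get(env_var)
    if not key:
        # Only reached when the variable is unset or empty.
        key = getpass(prompt)
        os.environ[env_var] = key
    return key


# Hypothetical variable that is already set, so no prompt appears here.
os.environ["DEMO_API_KEY"] = "dummy-value"
print(load_api_key("DEMO_API_KEY", "Enter demo key: "))
```

In this notebook the same helper could be called with `'OPENAI_API_KEY'` and `'TAVILY_API_KEY'` in place of the two `getpass` calls.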


In [14]:
# --- Config ---
COUNTRIES = ["Thailand", "Indonesia", "Malaysia", "Philippines", "Vietnam"]
# Search window; change these two dates as needed (or take them from user input later).
start_date = datetime.strptime("2024-11-10", "%Y-%m-%d")
end_date = datetime.strptime("2024-11-18", "%Y-%m-%d")
# Human-readable range used in the search query and the prompt, derived from the dates above.
DATE_RANGE = f"{start_date:%Y-%m-%d} to {end_date:%Y-%m-%d}"

In [17]:

# === 4. State definition ===
class State(TypedDict):
    state: Annotated[list, operator.add]

# === 5. Define a node function factory ===
def country_summary_node(country_name):
    """Build a graph node that searches for and summarizes dairy news for one country."""
    def _node(_: State) -> State:
        query = f"dairy industry news in {country_name} from {DATE_RANGE}"
        results = search.results(query=query, max_results=15)

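        # Pass each result's text snippet as well, so the model has content to summarize
        # rather than titles and URLs alone. Assumes Tavily results carry a 'content' field;
        # .get() keeps this safe if a result lacks one.
        article_summaries = "\n".join(
            [
                f"Title: {r['title']}\nURL: {r['url']}\nContent: {r.get('content', '')}\n"
                for r in results
            ]
        )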

        prompt = f"""
You are a BCG analyst. Based on the following articles, first check that each article is relevant to the dairy industry in {country_name} between {DATE_RANGE},
then read the articles and summarize them in a single paragraph of no more than 200 words.
After the summary, compile the top 5 article titles and URLs relevant to the summary.
Do not repeat similar news articles, and do not compile articles from research agencies or articles about dairy market size, trends, and forecasts.

Articles:
{article_summaries}
        """.strip()

        summary = llm.invoke([HumanMessage(content=prompt)]).content

        print(f"\n=== {country_name.upper()} ({DATE_RANGE}) ===\n")
        print(summary)

        return {"state": [(country_name, summary)]}
    return _node

# === 6. Combine node ===
def combine_all_nodes(data: State) -> State:
    print("\n--- ALL COUNTRY SUMMARIES COMPLETE ---")
    for country, summary in data["state"]:
        print(f"\n>>> {country} <<<\n{summary}\n")
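    # Return an empty update: `state` is merged with operator.add, so returning `data`
    # would append the whole list to itself and duplicate every entry.
    return {"state": []}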

# === 7. Build LangGraph ===
builder = StateGraph(State)

# Entry point
builder.add_node("start", lambda _: {"state": []})

# Country-specific nodes
for country in COUNTRIES:
    builder.add_node(country, country_summary_node(country))

# Combine node
builder.add_node("combine", combine_all_nodes)

# Edges
builder.set_entry_point("start")
for country in COUNTRIES:
    builder.add_edge("start", country)
    builder.add_edge(country, "combine")

builder.add_edge("combine", END)

# Compile
graph = builder.compile()
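
The `Annotated[list, operator.add]` annotation on `state` is what lets the country nodes fan out from `start` and still end up in one list: each node returns a one-item list, and the updates are concatenated instead of overwriting each other. The sketch below imitates that merge in plain Python, without LangGraph itself; the `merge_updates` helper and the sample summaries are hypothetical and only approximate how a reducer behaves.

```python
import operator
from functools import reduce


def merge_updates(current, updates):
    """Fold a sequence of partial state updates into the current state with operator.add."""
    return reduce(
        lambda state, update: {"state": operator.add(state["state"], update["state"])},
        updates,
        current,
    )


# Hypothetical updates, shaped like the return values of the country nodes.
updates = [
    {"state": [("Thailand", "summary A")]},
    {"state": [("Vietnam", "summary B")]},
]

merged = merge_updates({"state": []}, updates)
print(merged)

# An update that repeats the full state is concatenated onto the existing list.
doubled = merge_updates(merged, [merged])
print(len(doubled["state"]))
```

Under this model, a fan-in node that only prints or inspects the state should return an empty list for `state`, since anything it returns is added to what is already there.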


In [18]:
graph.invoke({})


=== VIETNAM (2024-11-10 to 2024-11-18) ===

Between November 10 and November 18, 2024, the Vietnamese dairy industry showcased significant developments, particularly during the Vietnam Dairy 2024 event held in Ho Chi Minh City, which highlighted innovations and advancements in the sector. The event underscored the steady growth of Vietnam's milk and dairy processing industry, with major players like TH Group announcing plans for a $234 million factory in southern Vietnam, aimed at enhancing production capacity. Additionally, leaders from Vinamilk expressed optimism about the market's future, projecting positive trends for 2025 despite challenges faced by some companies, such as Lof, which anticipates a steep decline in net profit. The industry is adapting to changing consumer preferences and competitive pressures, with established brands strategizing to maintain their market share amidst these shifts.

**Top 5 Article Titles and URLs:**

1. Vietnam Dairy 2024 returns to HCM City, Show

{'state': [('Indonesia',
   "Between November 10 and November 18, 2024, the Indonesian dairy industry faced significant challenges, primarily due to its heavy reliance on imports, which account for 80% of the country's milk consumption. The government is actively seeking to enhance domestic production through initiatives that promote local milk purchases and strengthen partnerships within the industry. Recent reports indicate that the government is enforcing policies to support local dairy farmers and reduce dependency on imported milk. Additionally, there are plans to import a substantial number of dairy cows to meet the increasing demand for milk. These efforts are part of a broader strategy aimed at achieving self-sufficiency in milk production, which remains a critical goal for Indonesia's agricultural sector.\n\n**Top 5 Article Titles and URLs:**\n\n1. Indonesia Imports 80 Percent of Milk for Consumption, Minister Says ...\n   - [Read more](https://en.tempo.co/read/1939858/indones

In [None]:
def write_summary_to_docx(country: str, summary: str, articles: list, date_range: str):
    """Write one country's summary and article list to a .docx file under news_summaries/."""
    doc = Document()
    doc.add_heading(f"{country} — Dairy News Summary", level=1)
    doc.add_paragraph(f"Date Range: {date_range}")
    doc.add_heading("Key Summary", level=2)

    for bullet in summary.split("\n"):
        if bullet.strip():
            doc.add_paragraph(bullet.strip(), style='List Bullet')

    doc.add_heading("Top Articles", level=2)
    for article in articles:
        doc.add_paragraph(f"{article['title']}", style='List Number')
        doc.add_paragraph(article['url'], style='Normal')

    os.makedirs("news_summaries", exist_ok=True)
    filename = f"news_summaries/{country}_{date_range.replace(' ', '_').replace(':','')}.docx"
    doc.save(filename)
    print(f"✅ Saved: {filename}")
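
`write_summary_to_docx` expects `articles` as a list of dicts with `title` and `url` keys, while the graph only produces `(country, summary)` tuples. One possible bridge is to pull the numbered titles and links back out of the summary text. This is a rough sketch: the `extract_articles` helper is hypothetical, and the pattern assumes the "numbered title, then `- [Read more](url)`" layout seen in the sample output, which the model is not guaranteed to follow.

```python
import re

# Numbered title line followed by a "- [label](url)" line.
# Assumption: this mirrors the layout in the sample output; LLM formatting may vary.
ARTICLE_PATTERN = re.compile(
    r"^\s*\d+\.\s*(?P<title>.+?)\s*\n\s*-\s*\[[^\]]*\]\((?P<url>https?://[^)\s]+)\)",
    re.MULTILINE,
)


def extract_articles(summary: str) -> list:
    """Return [{'title': ..., 'url': ...}, ...] for each numbered article found in the text."""
    return [
        {"title": m.group("title"), "url": m.group("url")}
        for m in ARTICLE_PATTERN.finditer(summary)
    ]


# Hypothetical summary text in the assumed layout.
sample = (
    "Summary paragraph.\n\n"
    "**Top 5 Article Titles and URLs:**\n\n"
    "1. Example headline one\n"
    "   - [Read more](https://example.com/one)\n"
    "2. Example headline two\n"
    "   - [Read more](https://example.com/two)\n"
)
articles = extract_articles(sample)
print(articles)
```

Each `(country, summary)` tuple in `graph.invoke({})["state"]` could then be written out with `write_summary_to_docx(country, summary, extract_articles(summary), DATE_RANGE)`. If the model formats its list differently, the helper returns an empty list and the document simply has no article entries.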
