# Getting Started with py-gdelt

This notebook introduces the py-gdelt library: setting up the client, querying Events, calling the REST APIs, streaming GKG records, and using the lookup tables.

## Contents
1. Installation and Setup
2. Basic Client Initialization
3. Querying Events Data
4. Using REST APIs (DOC, GEO, Context, TV)
5. Working with GKG and NGrams
6. Accessing Lookup Tables
7. Streaming Large Datasets

## 1. Installation and Setup

First, install py-gdelt, then apply `nest_asyncio` so the async examples can run inside Jupyter's already-running event loop:

In [None]:
# Install py-gdelt (uncomment if needed)
# !pip install py-gdelt

# Enable async support for Jupyter
import nest_asyncio


nest_asyncio.apply()

print("Setup complete!")

## 2. Basic Client Initialization

`GDELTClient` is the main entry point to all GDELT data. Open it with `async with` so the client is closed when the block exits:

In [None]:
from datetime import date, timedelta

from py_gdelt import GDELTClient


# Initialize client with default settings
async with GDELTClient() as client:
    print("Client initialized successfully!")
    print(f"Timeout: {client.settings.timeout}s")
    print(f"Max retries: {client.settings.max_retries}")
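
The `async with` pattern used above can be sketched with a stand-in class. `DemoClient` is hypothetical and is not part of py-gdelt; it only illustrates how an async context manager opens a resource on entry and releases it on exit, which is presumably what `GDELTClient` does with its network resources.

```python
import asyncio


class DemoClient:
    """Hypothetical stand-in for a client; not part of py-gdelt."""

    def __init__(self) -> None:
        self.is_open = False

    async def __aenter__(self) -> "DemoClient":
        # Acquire resources here (a real client might open an HTTP session).
        self.is_open = True
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and also when the block raises.
        self.is_open = False


async def demo() -> tuple:
    async with DemoClient() as client:
        open_inside = client.is_open
    # After the block, the client has been closed again.
    return open_inside, client.is_open


print(asyncio.run(demo()))
```

Because `__aexit__` always runs, cleanup does not depend on every code path remembering to call a `close()` method.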

## 3. Querying Events Data

Events are the core of GDELT: structured records of global events. The query below filters by a single date and by the country of the first actor:

In [None]:
from py_gdelt.filters import DateRange, EventFilter


# Query recent events from the US
async with GDELTClient() as client:
    # Query a single day, two days back from today
    target_date = date.today() - timedelta(days=2)

    event_filter = EventFilter(
        date_range=DateRange(start=target_date, end=target_date),
        actor1_country="USA",
    )

    try:
        result = await client.events.query(event_filter)
        print(f"Found {len(result)} events from the US on {target_date}")

        if result:
            first_event = result[0]
            print("\nFirst event:")
            print(f"  ID: {first_event.global_event_id}")
            print(f"  Event Code: {first_event.event_code}")
            print(f"  Goldstein Scale: {first_event.goldstein_scale}")
    except Exception as e:
        print(f"Error: {e}")
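
The Goldstein scale printed above is a score from -10 to +10, where negative values indicate conflictual events and positive values cooperative ones. A minimal sketch of turning the number into a label follows; the function name and the three labels are illustrative assumptions, not part of py-gdelt.

```python
from typing import Optional


def describe_goldstein(score: Optional[float]) -> str:
    """Label a Goldstein score (hypothetical helper, not a py-gdelt API)."""
    if score is None:
        # Some records may carry no score at all.
        return "unknown"
    if not -10.0 <= score <= 10.0:
        raise ValueError(f"Goldstein score out of range: {score}")
    if score < 0:
        return "conflictual"
    if score > 0:
        return "cooperative"
    return "neutral"


for value in (-6.5, 0.0, 3.4, None):
    print(value, "->", describe_goldstein(value))
```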

## 4. Using REST APIs

### 4.1 DOC API - Article Search

In [None]:
from py_gdelt.filters import DocFilter


async with GDELTClient() as client:
    doc_filter = DocFilter(
        query="climate change",
        timespan="24h",
        max_results=10,
    )

    try:
        articles = await client.doc.query(doc_filter)
        print(f"Found {len(articles)} articles")

        if articles:
            print("\nFirst article:")
            print(f"  Title: {articles[0].title}")
            print(f"  URL: {articles[0].url}")
    except Exception as e:
        print(f"Error: {e}")
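
To see roughly what a DOC search corresponds to on the wire, the sketch below builds a GDELT DOC 2.0 request URL with the standard library. The endpoint and the `query`, `mode`, `format`, `timespan`, and `maxrecords` parameters belong to the public DOC API; whether py-gdelt assembles its requests exactly this way is an assumption, and `build_doc_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

DOC_ENDPOINT = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_doc_url(query: str, timespan: str = "24h", max_results: int = 10) -> str:
    """Build a DOC API article-list URL (hypothetical helper)."""
    if max_results < 1:
        raise ValueError("max_results must be positive")
    params = {
        "query": query,
        "mode": "artlist",      # return a list of matching articles
        "format": "json",
        "timespan": timespan,
        "maxrecords": max_results,
    }
    # urlencode escapes spaces and other reserved characters in the query.
    return f"{DOC_ENDPOINT}?{urlencode(params)}"


print(build_doc_url("climate change"))
```

No request is sent here; the function only shows how the filter fields map onto URL parameters.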

### 4.2 GEO API - Geographic Search

In [None]:
async with GDELTClient() as client:
    try:
        result = await client.geo.search(
            "earthquake",
            timespan="7d",
            max_points=10,
        )

        print(f"Found {len(result.points)} locations")

        for point in result.points[:5]:
            print(f"  {point.name}: {point.count} articles at ({point.lat:.2f}, {point.lon:.2f})")
    except Exception as e:
        print(f"Error: {e}")

### 4.3 Context API - Contextual Analysis

In [None]:
async with GDELTClient() as client:
    try:
        result = await client.context.analyze(
            "artificial intelligence",
            timespan="7d",
        )

        print(f"Articles analyzed: {result.article_count}")
        print("\nTop themes:")
        for theme in result.themes[:5]:
            print(f"  {theme.theme}: {theme.count}")
    except Exception as e:
        print(f"Error: {e}")

### 4.4 TV API - Television News

In [None]:
async with GDELTClient() as client:
    try:
        clips = await client.tv.search(
            "economy",
            timespan="24h",
            max_results=5,
        )

        print(f"Found {len(clips)} TV clips")

        for clip in clips[:3]:
            print(f"\n  {clip.station} - {clip.show_name}")
            if clip.snippet:
                print(f"  {clip.snippet[:100]}...")
    except Exception as e:
        print(f"Error: {e}")

## 5. Working with GKG and NGrams

### 5.1 Global Knowledge Graph

In [None]:
from py_gdelt.filters import GKGFilter


async with GDELTClient() as client:
    gkg_filter = GKGFilter(
        date_range=DateRange(start=date.today() - timedelta(days=2)),
        themes=["ENV_CLIMATECHANGE"],
    )

    try:
        # Stream first few records
        count = 0
        async for record in client.gkg.stream(gkg_filter):
            print(f"\nRecord {count + 1}:")
            print(f"  Source: {record.source_name}")
            print(f"  URL: {record.source_url[:80]}...")
            print(f"  Themes: {len(record.themes)}")

            count += 1
            if count >= 3:
                break
    except Exception as e:
        print(f"Error: {e}")
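
The "read a few records, then stop" loop above can be factored into a small helper. This is a sketch against a synthetic async generator, since `fake_records` and `take` are hypothetical names and not py-gdelt APIs. It also closes the generator explicitly, so cleanup does not wait for garbage collection after an early `break`.

```python
import asyncio
from typing import AsyncGenerator, List, TypeVar

T = TypeVar("T")


async def fake_records(total: int) -> AsyncGenerator[int, None]:
    """Synthetic stand-in for a record stream."""
    for i in range(total):
        await asyncio.sleep(0)  # yield control, as real I/O would
        yield i


async def take(stream: AsyncGenerator[T, None], limit: int) -> List[T]:
    """Collect at most `limit` items, then close the stream."""
    items: List[T] = []
    try:
        if limit > 0:
            async for item in stream:
                items.append(item)
                if len(items) >= limit:
                    break
    finally:
        # Finalize the generator even when we stop early.
        await stream.aclose()
    return items


print(asyncio.run(take(fake_records(1_000_000), 3)))
```

Only the requested items are ever produced; the remaining synthetic records are never generated.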

## 6. Accessing Lookup Tables

GDELT stores event types and countries as short codes. The lookup tables translate those codes into readable names and convert between code systems:

In [None]:
async with GDELTClient() as client:
    # CAMEO event codes
    cameo = client.lookups.cameo
    entry = cameo.get("14")
    print(f"Event code '14': {entry.name if entry else 'Unknown'}")

    # Country codes
    countries = client.lookups.countries
    try:
        iso3 = countries.fips_to_iso3("US")
        print(f"FIPS 'US' -> ISO3 '{iso3}'")
    except Exception as e:
        print(f"Error: {e}")
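
The FIPS-to-ISO conversion matters because the two systems assign the same letters to different countries: FIPS `CH` is China, while ISO `CH` is Switzerland. The sketch below uses a tiny hand-written table to show the idea; `DEMO_FIPS_TO_ISO3` and `demo_fips_to_iso3` are hypothetical and cover only a handful of countries, unlike the library's own lookup.

```python
from typing import Dict, Optional

# A few FIPS 10-4 country codes and their ISO 3166-1 alpha-3 equivalents.
DEMO_FIPS_TO_ISO3: Dict[str, str] = {
    "US": "USA",
    "UK": "GBR",
    "GM": "DEU",  # Germany
    "JA": "JPN",
    "CH": "CHN",  # China in FIPS; "CH" means Switzerland in ISO alpha-2
    "SZ": "CHE",  # Switzerland
}


def demo_fips_to_iso3(fips: str) -> Optional[str]:
    """Return the ISO3 code for a FIPS code, or None if it is not in the table."""
    return DEMO_FIPS_TO_ISO3.get(fips.strip().upper())


for code in ("US", "CH", "SZ", "XX"):
    print(code, "->", demo_fips_to_iso3(code))
```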

## 7. Streaming Large Datasets

To keep memory use bounded on large queries, stream records one at a time instead of loading the full result set:

In [None]:
async with GDELTClient() as client:
    # Stream a single day, two days back from today
    target_date = date.today() - timedelta(days=2)

    event_filter = EventFilter(
        date_range=DateRange(start=target_date, end=target_date),
    )
    )

    try:
        count = 0
        async for event in client.events.stream(event_filter):
            count += 1
            if count <= 5:
                print(f"Event {count}: {event.global_event_id}")

            if count >= 10:
                break

        print(f"\nStreamed {count} events (stopped early for demo)")
    except Exception as e:
        print(f"Error: {e}")
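
Streaming pays off when each record is folded into a small running summary instead of being stored. The sketch below counts events by CAMEO root code (the first two digits of the event code) over a synthetic stream; `synthetic_events` and `count_root_codes` are hypothetical, and the dict-shaped events are an assumption made to keep the example self-contained.

```python
import asyncio
from collections import Counter
from typing import AsyncIterator, Dict


async def synthetic_events(total: int) -> AsyncIterator[Dict[str, str]]:
    """Synthetic stand-in for an event stream."""
    codes = ["010", "0231", "141", "190"]
    for i in range(total):
        await asyncio.sleep(0)  # yield control, as real I/O would
        yield {"event_code": codes[i % len(codes)]}


async def count_root_codes(events: AsyncIterator[Dict[str, str]]) -> Counter:
    """Tally events per CAMEO root code without keeping the events."""
    counts: Counter = Counter()
    async for event in events:
        counts[event["event_code"][:2]] += 1
    return counts


counts = asyncio.run(count_root_codes(synthetic_events(1000)))
print(counts.most_common())
```

Memory use depends on the number of distinct root codes, not on the number of events streamed.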

## Summary

You've learned how to:
- ✅ Initialize the `GDELTClient`
- ✅ Query Events with filters
- ✅ Use the REST APIs (DOC, GEO, Context, TV)
- ✅ Work with GKG data
- ✅ Access lookup tables
- ✅ Stream large datasets efficiently

**Next Steps:**
- Explore `02_advanced_patterns.ipynb` for production patterns
- Check out `03_visualization.ipynb` for data visualization
- Read the [documentation](https://github.com/yourusername/py-gdelt) for detailed API reference