# Tutorial 2: ACS 5-Year Aggregate Data

This tutorial covers the most common use case: fetching aggregate statistics from the ACS 5-Year estimates.

**Goal:** Get poverty statistics for all places (cities, towns, CDPs) in a state.

## Setup

In [None]:
import os
from cendat import CenDatHelper
from dotenv import load_dotenv

load_dotenv()
cdh = CenDatHelper(key=os.getenv("CENSUS_API_KEY"))

## Step 1: Find the ACS 5-Year Product

In [None]:
# The trailing \) requires a literal closing paren immediately after "acs5",
# which filters out sub-products like /profile, /subject, etc.
cdh.list_products(years=[2023], patterns=r"acs/acs5\)")
cdh.set_products()
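
To see why the escaped paren narrows the match, here is a minimal sketch using Python's `re` module directly. The product labels are hypothetical stand-ins, and it assumes that labels end with the API path in parentheses and that patterns are applied as a search anywhere in the label; neither detail is confirmed by this tutorial.

```python
import re

# Hypothetical product labels (not real cendat output), assumed to end
# with "(vintage/path)".
labels = [
    "ACS 5-Year Detailed Tables (2023/acs/acs5)",
    "ACS 5-Year Data Profiles (2023/acs/acs5/profile)",
    "ACS 5-Year Subject Tables (2023/acs/acs5/subject)",
]

pattern = re.compile(r"acs/acs5\)")

# search() looks for the pattern anywhere in the string. Only the label whose
# path stops at "acs5" has a ")" immediately after it, so only that one matches.
matches = [label for label in labels if pattern.search(label)]
print(matches)
```

Without the `\)`, the pattern `acs/acs5` would match all three labels.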

## Step 2: Explore Variable Groups

ACS products contain thousands of variables, so browsing by group is the practical way to find the ones you need:

In [None]:
# Search for poverty-related groups
cdh.list_groups(patterns="poverty")

In [None]:
# Let's use B17001 (Poverty Status in the Past 12 Months by Sex by Age)
cdh.set_groups("B17001")

# See what variables are in this group
cdh.describe_groups()

## Step 3: Select Variables and Geography

In [None]:
# B17001_001E = Total (population for whom poverty status is determined)
# B17001_002E = Population below poverty level
cdh.set_variables(["B17001_001E", "B17001_002E"])

# 160 = Places (cities, towns, CDPs)
cdh.set_geos(["160"])

## Step 4: Get Data with Names

In [None]:
response = cdh.get_data(
    include_names=True,      # Include place names
    include_attributes=True  # Include margins of error
)

## Step 5: Analyze

In [None]:
# Convert to DataFrame
df = response.to_polars(concat=True, destring=True)
df.glimpse()
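
With both estimates and margins of error available, a poverty rate and its margin of error can be derived for each place. The sketch below uses plain Python and the Census Bureau's published approximation for the MOE of a derived proportion. The function name and the input values are made up for illustration. It assumes the margins of error arrive under the standard ACS names `B17001_001M` and `B17001_002M`, which this tutorial does not show.

```python
import math


def poverty_rate_with_moe(below, below_moe, universe, universe_moe):
    """Return (rate, moe) for below / universe, or (None, None) if universe is 0.

    Hypothetical helper, not part of cendat.
    """
    if universe == 0:
        return None, None
    rate = below / universe
    # MOE of a derived proportion: sqrt(MOE_num^2 - p^2 * MOE_den^2) / den
    radicand = below_moe**2 - (rate**2) * universe_moe**2
    if radicand < 0:
        # When the value under the root is negative, the Census Bureau
        # guidance is to use the ratio formula (plus sign) instead.
        radicand = below_moe**2 + (rate**2) * universe_moe**2
    return rate, math.sqrt(radicand) / universe


# Made-up values standing in for B17001_002E, B17001_002M,
# B17001_001E, and B17001_001M for a single place.
rate, moe = poverty_rate_with_moe(
    below=1_200, below_moe=150, universe=10_000, universe_moe=400
)
print(f"poverty rate: {rate:.1%} +/- {moe:.1%}")
```

The zero-universe guard matters for place-level data, since very small places can report a poverty universe of 0.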

In [None]:
# Quick tabulation: how many places in each state have a poverty universe above 10,000?
response.tabulate("state", where="B17001_001E > 10_000")

In [None]:
# Same tabulation, weighted by the poverty-universe population
response.tabulate(
    "state",
    weight_var="B17001_001E",
    where="B17001_001E > 10_000"
)
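
The difference between the two tabulations can be sketched in plain Python. This assumes that an unweighted tabulation counts the rows passing the filter and that `weight_var` sums the named column over those rows instead; the tutorial does not spell out these semantics, and the rows below are made up.

```python
from collections import Counter

# Made-up rows: (state FIPS code, B17001_001E)
rows = [
    ("08", 25_000),
    ("08", 8_000),
    ("08", 40_000),
    ("56", 12_000),
    ("56", 3_000),
]

unweighted = Counter()  # number of places per state
weighted = Counter()    # poverty-universe population per state

for state, universe in rows:
    if universe > 10_000:           # same filter as the where= clause
        unweighted[state] += 1      # each qualifying place counts once
        weighted[state] += universe # each place counts by its population

print(dict(unweighted))
print(dict(weighted))
```

Under these assumptions, the unweighted result answers "how many places?" and the weighted result answers "how many people live in those places?".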