<a href="https://colab.research.google.com/github/mf3659/Text_Analysis_Final_Project/blob/main/Final_project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python for Public Policy — Final Project

**State-Level Changes in Teen Abortion Shares in the United States, 2006–2016: Policy Drivers and Trends**

**Introduction:**

Teen pregnancy outcomes offer valuable insights into public health and policy developments in the United States. In this paper, I examine state-level data from 2006 to 2016 on pregnancy, birth, and abortion rates to analyze how both the rate of teen pregnancies and the share of those pregnancies ending in abortion changed across states during this period. These variations reflect not only individual choices but also differences in local policies and access to reproductive health services. This study focuses on identifying which U.S. states experienced the largest changes in the share of teen pregnancies ending in abortion and explores the policy developments that may have contributed to these shifts.

The decade from 2006 to 2016 was a period of gradual but meaningful change in reproductive health policy. Several states expanded access to contraception and comprehensive sex education, while others implemented more restrictive abortion regulations [Guttmacher, 2017](https://www.guttmacher.org/report/us-adolescent-pregnancy-trends-2013). Although national teen pregnancy rates declined overall [CDC, 2024](https://www.cdc.gov/reproductive-health/teen-pregnancy/index.html?utm_source=chatgpt.com), the proportion of pregnancies ending in abortion varied widely among states, influenced by differing healthcare systems, political environments, and community attitudes toward reproductive rights.

Using Python for data analysis and visualization, this study identifies the five states with the largest year-over-year changes in teen abortion share. These states are: District of Columbia, Vermont, Alaska, Delaware, and New Jersey. The paper also highlights the timing and direction of these changes. The aim is to describe how abortion shares evolved across states and to link these patterns to possible public health or legislative developments. This research provides a descriptive overview of how teen reproductive outcomes shifted over time and what factors may help explain these changes across the United States.




**Methodology:**

This analysis was conducted using Python due to its versatility and ability to handle, clean, and visualize large datasets efficiently. The main libraries employed were Pandas for data manipulation and analysis, and Plotly for creating interactive visualizations. Pandas was chosen because it provides powerful tools for data cleaning and transformation, including filtering, pivoting, and calculating new variables. Plotly was used to visualize multi-state comparisons over time in a clear and interactive way, ideal for identifying year-to-year changes across states.

The dataset, titled “State Pregnancy, Birth, and Abortion Rates”, was obtained from a public GitHub repository. It covers 1988 to 2016 and reports pregnancy, birth, and abortion rates per 1,000 women across all U.S. states, disaggregated by age group. The earliest observations are spaced several years apart (1988, 1992, 1996, 2000, and 2005 in the output shown in the Results section), while the 2006–2016 period analyzed here appears to be reported annually. To align with the research question, which focuses on teenage reproductive outcomes, the dataset was filtered to include only the 15–19 age group, representing teens. Observations for the United States as a whole were excluded to focus on state-level variation.

The data directly supports the research question by enabling a comparison of how the share of teen pregnancies ending in abortion changed over time across states. Using Pandas’ pivot_table() function, the dataset was transformed into a wide format, allowing a straightforward calculation of abortion share, defined as the ratio of the abortion rate to the pregnancy rate. This measure captures how the outcome of teen pregnancies differs among states and across years, which is essential to identifying where the largest shifts occurred.
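The reshaping step can be illustrated with a minimal sketch. It assumes a toy long-format table that reuses the dataset's column names and Alaska's 1988 values from the Results output; the names `toy` and `toy_wide` are illustrative and not part of the analysis.

```python
import pandas as pd

# Toy long-format data: one row per (State, Year, Metric).
# Values are Alaska's 1988 figures from the Results tables.
toy = pd.DataFrame({
    "State": ["AK", "AK", "AK"],
    "Year": [1988, 1988, 1988],
    "Metric": ["Abortion Rate", "Birth Rate", "Pregnancy Rate"],
    "Events per 1,000 women": [35.9, 55.6, 106.3],
})

# Pivot so that each metric becomes its own column.
toy_wide = toy.pivot_table(
    index=["State", "Year"],
    columns="Metric",
    values="Events per 1,000 women",
).reset_index()

# Abortion share = abortion rate / pregnancy rate.
toy_wide["abortion_share"] = toy_wide["Abortion Rate"] / toy_wide["Pregnancy Rate"]
print(toy_wide[["State", "Year", "abortion_share"]])
```

One design point worth noting: `pivot_table` averages duplicate (State, Year, Metric) rows by default, so duplicates in the source would be merged silently, whereas `DataFrame.pivot` would raise an error instead.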

Next, year-to-year changes in abortion share were computed within each state using the .diff() method. The analysis then identified the five states with the largest single one-year change (in absolute value) between 2006 and 2016. This period was chosen because data for earlier years are incomplete. The states are: District of Columbia, Vermont, Alaska, Delaware, and New Jersey. These were visualized using Plotly’s small-multiple line plots, which show the timing and direction of each change.
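The within-state differencing and ranking can be sketched on made-up data. The states `AA` and `BB` and their shares below are hypothetical, and the `idxmax` selection is an alternative to the sort-then-`head(1)` approach used in the Results code, not a copy of it.

```python
import pandas as pd

# Hypothetical abortion shares for two made-up states (illustration only).
demo = pd.DataFrame({
    "State": ["AA", "AA", "AA", "BB", "BB", "BB"],
    "Year": [2006, 2007, 2008, 2006, 2007, 2008],
    "abortion_share": [0.30, 0.25, 0.26, 0.20, 0.21, 0.29],
})

demo = demo.sort_values(["State", "Year"])
# .diff() within each state; the first year of each state is NaN.
demo["abs_change"] = demo.groupby("State")["abortion_share"].diff()

# Index label of the largest absolute change within each state.
idx = demo["abs_change"].abs().groupby(demo["State"]).idxmax()

# One row per state, ranked by the magnitude of that change.
largest = demo.loc[idx].sort_values(
    "abs_change", key=lambda s: s.abs(), ascending=False
)
print(largest)
```

Keeping the signed `abs_change` column alongside the ranking makes the direction of each change visible when the results are written up.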

**Results:**

In [None]:
import pandas as pd
#Loading data
csv_url = "https://raw.githubusercontent.com/strfry688/Abortion-Rates-by-State/refs/heads/main/State%20Pregnancy-Birth-Abortion%20Rates.csv"
df = pd.read_csv(csv_url)


In [None]:
#checking first rows and columns
df.head()


Unnamed: 0,Year,State,Metric,Age Range,"Events per 1,000 women"
0,1988,AL,Abortion Rate,15-17,24.0
1,1992,AL,Abortion Rate,15-17,19.1
2,1996,AL,Abortion Rate,15-17,12.5
3,2000,AL,Abortion Rate,15-17,9.3
4,2005,AL,Abortion Rate,15-17,7.0


In [None]:
#Keep only the teen group
#Filter rows where the Age Range column is "15-19" (plain hyphen, as in the data)
teen_df = df[df["Age Range"] == "15-19"].copy()
#check
print("Before filtering:", len(df))
print("After filtering:", len(teen_df))
teen_df.head()

Before filtering: 23172
After filtering: 2574


Unnamed: 0,Year,State,Metric,Age Range,"Events per 1,000 women"
5157,1988,AL,Birth Rate,15-19,63.3
5158,1992,AL,Birth Rate,15-19,72.0
5159,1996,AL,Birth Rate,15-19,67.1
5160,2000,AL,Birth Rate,15-19,60.6
5161,2005,AL,Birth Rate,15-19,48.1


In [None]:
#Dropping the US aggregate from the data
teen_df = teen_df[teen_df["State"] != "US"]

#check
print("After filtering:", len(teen_df))

After filtering: 2442


In [None]:
#Wide format: one row per (State, Year), metrics as columns (used ChatGPT)
#my prompt: I have rows of State, Year, Age Range, Metric; my actual numbers are stored in Events per 1,000 women. Help me transform this so that each state-year pair becomes one row and each metric becomes a column.
wide = teen_df.pivot_table(
    index=["State", "Year"],
    columns="Metric",
    values="Events per 1,000 women"
).reset_index()

In [None]:
#Compute abortion share
wide["abortion_share"] = wide["Abortion Rate"] / wide["Pregnancy Rate"]
#check
wide.head()

Metric,State,Year,Abortion Rate,Birth Rate,Pregnancy Rate,abortion_share
0,AK,1988,35.9,55.6,106.3,0.337723
1,AK,1992,30.0,65.2,111.2,0.269784
2,AK,1996,18.9,50.8,81.7,0.231334
3,AK,2000,14.6,48.9,74.7,0.195448
4,AK,2005,15.7,39.9,65.2,0.240798


In [None]:

#sorting dates to calculate the differences
last_10 = wide.sort_values(["State", "Year"])

#check
last_10.head()


Metric,State,Year,Abortion Rate,Birth Rate,Pregnancy Rate,abortion_share
0,AK,1988,35.9,55.6,106.3,0.337723
1,AK,1992,30.0,65.2,111.2,0.269784
2,AK,1996,18.9,50.8,81.7,0.231334
3,AK,2000,14.6,48.9,74.7,0.195448
4,AK,2005,15.7,39.9,65.2,0.240798


In [None]:
#Calculate diff within each state
last_10["abs_change"] = last_10.groupby("State")["abortion_share"].diff()
last_10["pct_change"] = last_10.groupby("State")["abortion_share"].pct_change(fill_method=None) * 100
last10_period = last_10.query("2006 <= Year <= 2016").copy()
#check
last10_period.head(10)

Metric,State,Year,Abortion Rate,Birth Rate,Pregnancy Rate,abortion_share,abs_change,pct_change
5,AK,2006,14.9,41.8,66.6,0.223724,-0.017074,-7.09053
6,AK,2007,14.3,42.9,67.2,0.212798,-0.010926,-4.883749
7,AK,2008,14.8,44.2,69.3,0.213564,0.000767,0.360246
8,AK,2009,14.1,43.2,67.3,0.20951,-0.004055,-1.898518
9,AK,2010,16.9,38.5,64.8,0.260802,0.051293,24.482313
10,AK,2011,14.1,36.4,59.2,0.238176,-0.022627,-8.675836
11,AK,2012,12.1,34.7,55.0,0.22,-0.018176,-7.631206
12,AK,2013,10.1,30.6,47.8,0.211297,-0.008703,-3.955877
13,AK,2014,9.4,28.3,44.3,0.21219,0.000893,0.422412
14,AK,2015,7.8,29.5,44.1,0.176871,-0.035319,-16.644956
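Because `.diff()` compares each row with the previous available row for the same state, a difference is a true one-year change only when the two years are consecutive. The earlier output shows gaps of several years before 2005, so a guard may be useful if the window is ever widened. The sketch below uses Alaska's values from the tables above; the `year_gap` and `one_year_change` columns are hypothetical names.

```python
import pandas as pd

# Alaska rows copied from the wide table above (shares as printed).
ak = pd.DataFrame({
    "State": ["AK"] * 4,
    "Year": [2000, 2005, 2006, 2007],
    "abortion_share": [0.195448, 0.240798, 0.223724, 0.212798],
})

ak = ak.sort_values(["State", "Year"])
ak["abs_change"] = ak.groupby("State")["abortion_share"].diff()
# Number of years between a row and the previous row of the same state.
ak["year_gap"] = ak.groupby("State")["Year"].diff()

# Keep a change only when it spans exactly one year; otherwise NaN.
ak["one_year_change"] = ak["abs_change"].where(ak["year_gap"] == 1)
print(ak)
```

For the 2006–2016 window used here, the 2006 change is measured against 2005, so this guard would not alter the results shown, assuming the data are annual from 2005 onward as the outputs suggest.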


In [None]:
#Largest year-over-year change (by absolute value) for each state
last10_period["abs_change_abs"] = last10_period["abs_change"].abs()

top_changes = (
    last10_period.sort_values("abs_change_abs", ascending=False)
    .groupby("State")
    .head(1)
    .drop(columns="abs_change_abs")
)

top_changes.head(5)

Metric,State,Year,Abortion Rate,Birth Rate,Pregnancy Rate,abortion_share,abs_change,pct_change
124,DC,2013,26.5,32.1,67.7,0.391433,0.09706,32.972022
741,VT,2006,13.7,19.8,38.9,0.352185,-0.055222,-13.554569
9,AK,2010,16.9,38.5,64.8,0.260802,0.051293,24.482313
134,DE,2007,34.3,39.2,84.7,0.404959,0.050322,14.189761
502,NJ,2007,28.7,24.9,61.4,0.467427,-0.047857,-9.287528
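Since the ranking uses absolute values, it can help to label the sign of each change before interpreting it. The sketch below copies the five rows from the table above; `top5` and the `direction` column are illustrative names.

```python
import pandas as pd

# Values copied from the top_changes table above.
top5 = pd.DataFrame({
    "State": ["DC", "VT", "AK", "DE", "NJ"],
    "Year": [2013, 2006, 2010, 2007, 2007],
    "abs_change": [0.09706, -0.055222, 0.051293, 0.050322, -0.047857],
})

# Label each change by its sign so the direction is explicit.
top5["direction"] = top5["abs_change"].apply(
    lambda x: "increase" if x > 0 else "decrease"
)
print(top5)
```

Read this way, DC, Alaska, and Delaware show increases, while Vermont and New Jersey show decreases.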


In [None]:
print("Top states:", top_changes["State"].head(5).tolist())
print("Rows in last10_period:", len(last10_period))
print("Unique states in last10_period:", sorted(last10_period["State"].unique())[:10])

# Make sure Year is numeric and abortion_share exists & is numeric
last10_period = last10_period.copy()
last10_period["Year"] = pd.to_numeric(last10_period["Year"], errors="coerce")
last10_period["abortion_share"] = pd.to_numeric(last10_period["abortion_share"], errors="coerce")

Top states: ['DC', 'VT', 'AK', 'DE', 'NJ']
Rows in last10_period: 561
Unique states in last10_period: ['AK', 'AL', 'AR', 'AZ', 'CA', 'CO', 'CT', 'DC', 'DE', 'FL']


In [None]:
import plotly.express as px
import plotly.graph_objects as go

# take the 5 states from the table
top_states = top_changes["State"].astype(str).head(5).tolist()
print("Using top_states:", top_states)

# filter safely and show a peek
facet = (last10_period[last10_period["State"].isin(top_states)]
         .dropna(subset=["abortion_share", "Year"])
         .sort_values(["State","Year"])
        )
print("Facet shape:", facet.shape)
print(facet.head())

# if facet is empty, stop early
if facet.empty:
    raise ValueError("Facet dataframe is empty. Check that top_states exist in last10_period['State'].")

fig = px.line(
    facet, x="Year", y="abortion_share",
    facet_col="State", facet_col_wrap=5,
    markers=True,
    title="Teen Abortion Share (2006–2016) — Small Multiples",
    labels={"abortion_share": "Abortion Share"}
)

# cleaner facet titles
fig.for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))

fig.update_traces(hovertemplate="Year %{x}<br>Share %{y:.3f}<extra></extra>")
fig.update_yaxes(tickformat=".3f", rangemode="tozero")
fig.update_layout(hovermode="x unified", showlegend=False)
fig.show()

Using top_states: ['DC', 'VT', 'AK', 'DE', 'NJ']
Facet shape: (55, 9)
Metric State  Year  Abortion Rate  Birth Rate  Pregnancy Rate  abortion_share  \
5         AK  2006           14.9        41.8            66.6        0.223724   
6         AK  2007           14.3        42.9            67.2        0.212798   
7         AK  2008           14.8        44.2            69.3        0.213564   
8         AK  2009           14.1        43.2            67.3        0.209510   
9         AK  2010           16.9        38.5            64.8        0.260802   

Metric  abs_change  pct_change  abs_change_abs  
5        -0.017074   -7.090530        0.017074  
6        -0.010926   -4.883749        0.010926  
7         0.000767    0.360246        0.000767  
8        -0.004055   -1.898518        0.004055  
9         0.051293   24.482313        0.051293  


In [None]:
#(USED CHATGPT) to better represent the data points

import plotly.graph_objects as go
from plotly.subplots import make_subplots

# use  existing tables
top_states = top_changes["State"].head(5).tolist()

# build a 1x5 subplot layout
fig = make_subplots(
    rows=1, cols=5, shared_yaxes=True,
    subplot_titles=top_states
)

for i, st in enumerate(top_states, start=1):
    s = (last10_period[last10_period["State"] == st]
         .sort_values("Year")
         .dropna(subset=["abortion_share"]))

    # main line
    fig.add_trace(
        go.Scatter(
            x=s["Year"], y=s["abortion_share"], mode="lines+markers",
            name=st, showlegend=False, hovertemplate="Year %{x}<br>Share %{y:.3f}<extra></extra>"
        ),
        row=1, col=i
    )

    # spike year (largest absolute change for this state)
    spike_year = int(top_changes.loc[top_changes["State"] == st, "Year"].iloc[0])
    spike_val  = float(s.loc[s["Year"] == spike_year, "abortion_share"].iloc[0])
    spike_chg  = float(s.loc[s["Year"] == spike_year, "abs_change"].iloc[0])

    # highlight spike point
    fig.add_trace(
        go.Scatter(
            x=[spike_year], y=[spike_val],
            mode="markers+text",
            marker=dict(size=10, color="crimson", line=dict(color="black", width=1)),
            text=[f"Δ={spike_chg:+.3f}"],
            textposition="top center",
            showlegend=False,
            hovertemplate="Year %{x}<br>Share %{y:.3f}<extra></extra>"
        ),
        row=1, col=i
    )

    # optional vertical line at spike year
    fig.add_vline(
        x=spike_year, line_width=1, line_dash="dash", line_color="crimson",
        row=1, col=i
    )

fig.update_yaxes(title_text="Abortion Share", tickformat=".3f", rangemode="tozero")
fig.update_layout(
    title="Teen Abortion Share (2006–2016) — Top 5 States (largest one-year changes)",
    hovermode="x unified",
    template="plotly_white",
    height=450, width=1200, margin=dict(l=60, r=20, t=60, b=40)
)

fig.show()

# export
fig.write_html("teen_abortion_share_top5_subplots.html", include_plotlyjs="cdn")

Across 2006–2016, as shown in the figure above, the five states with the largest one-year changes in the share of teen pregnancies ending in abortion were the District of Columbia (DC), Vermont (VT), Alaska (AK), Delaware (DE), and New Jersey (NJ).

The District of Columbia (DC) displays the largest one-year increase in teen abortion share (≈ +0.097) in 2013. This period coincides with the restoration of local public funding for abortion services and expanded clinic access following the easing of federal restrictions that had previously limited DC’s ability to use local tax revenues for abortion care. In 2011, Congress reinstated the federal ban preventing DC from funding abortions for low-income women through its Medicaid program; the restriction was lifted again in 2013, allowing the city to resume funding [Eugene Boyd, 2015](https://www.congress.gov/crs_external_products/R/PDF/R41772/R41772.19.pdf). This policy shift likely contributed to improved access for adolescents and low-income residents, reflected in the sharp rise in abortion share that year. DC’s consistently permissive policy environment, with no parental consent or waiting-period requirements [PPFA, 2025](https://www.plannedparenthood.org/learn/teens/stds-birth-control-pregnancy/parental-consent-and-notification-laws), may have amplified the impact of these funding and access changes. Together, these factors provide a plausible explanation for the notable jump in teen abortion share observed in 2013.

Vermont (VT) experienced a one-year decrease in teen abortion share (≈ −0.055) in 2006. This change aligns with the state’s expansion of school-based contraception programs and comprehensive sex education in the mid-2000s [Vermont Department of Health, 2017](https://www.healthvermont.gov/sites/default/files/documents/pdf/hsvr_yrbs_db_sexualactivity.pdf). Vermont strengthened collaboration between schools and public health centers to improve access to birth control and reproductive health counseling for adolescents. These efforts may have helped reduce unintended teen pregnancies and shifted outcomes toward prevention rather than abortion.

Alaska (AK) shows a sharp one-year increase in 2010 (≈ +0.051), with the share rising from about 0.210 in 2009 to 0.261. This change is possibly linked to a brief period of reduced contraceptive access and policy uncertainty following debates over parental notification between 2007 and 2010. In 2010, voters approved Ballot Measure 2 [Ballotpedia, 2010](https://ballotpedia.org/Alaska_Ballot_Measure_2%2C_Parental_Notification_of_Abortion_Initiative_%28August_2010%29), requiring physicians to notify a parent or guardian and imposing a 48-hour waiting period for minors; the law was struck down in 2016. Because a notification requirement would be expected to limit access rather than expand it, the measure may help explain the declines in later years more than the 2010 increase itself, and this link should be treated as tentative.

Delaware (DE) shows a one-year increase in teen abortion share (≈ +0.050) in 2007, consistent with the state’s relatively high abortion rate and clinic availability during the mid-2000s. According to the Guttmacher Institute, Delaware maintained above-average access to abortion providers and relatively few restrictions [Guttmacher, 2014](https://www.guttmacher.org/sites/default/files/pdfs/pubs/sfaa/pdf/delaware.pdf), which may explain the short-term rise.

New Jersey (NJ) shows a one-year decrease in teen abortion share (≈ −0.048) in 2007, coinciding with statewide teen pregnancy prevention campaigns and expanded contraceptive outreach. During the mid-2000s, New Jersey’s Department of Health, through its Family Health Services division, strengthened public education and access to contraception [NJDOH, 2017](https://www.nj.gov/health/fhs/maternalchild/teens/teen-pregnancy-prevention//), which may have helped reduce unintended teen pregnancies and contributed to the observed decline.



**Discussion & Conclusion:**

The results suggest that state-level reproductive policies may have helped shape teen abortion outcomes between 2006 and 2016. The close timing between specific policy actions, such as D.C.’s restoration of abortion funding and Vermont’s expansion of contraception programs, and the shifts in abortion share is consistent with these measures influencing teen reproductive outcomes, although a descriptive analysis of this kind cannot rule out coincidental trends. States that strengthened contraception access generally saw declines, while those restoring abortion services experienced temporary increases, underscoring how policy design can alter reproductive outcomes. These findings indicate that targeted legislative and public health initiatives can meaningfully affect adolescent decision-making and access to care.