# FORMATIVE ASSESSMENT OF ADOLESCENT GIRLS AND YOUNG WOMEN’S HIV, GENDER-BASED VIOLENCE AND SEXUAL AND REPRODUCTIVE HEALTH STATUS

## Background
Teenage pregnancy and motherhood are a major health and social concern in Uganda: they infringe upon the human rights of girls and hinder their ability to achieve their full socioeconomic development. Teenagers who engage in sexual intercourse at a young age face an elevated risk of becoming pregnant and giving birth. The 2022 UDHS indicated that 23.5% of women age 15-19 had begun childbearing by the time of the survey: 18.4% had already had a live birth and 5.1% were pregnant with their first child.

Patterns by background characteristics:
* By age 16, 1 in every 10 women age 15-19 has begun childbearing. This proportion rises to almost 4 in every 10 by age 18 (Table 5.12).
* Teenagers in rural areas started childbearing earlier than those in urban areas. Twenty-five percent of women age 15-19 in rural areas have begun childbearing, compared with 21% in urban areas.
* Teenage childbearing varies by region. The percentage of women age 15-19 who have begun childbearing ranges from 15% in the Kigezi region to 28%-30% in the Busoga and Bukedi sub-regions.
* The proportion of women age 15-19 who have begun childbearing decreases with both education and wealth.

Regions: The selection of districts surveyed was informed by HIV prevalence dynamics and implementing partner support. We visited districts where Global Fund-supported implementing partners were working to reduce the number of new HIV infections among AGYW and to improve SRH (e.g. reduce teenage pregnancy) and GBV indicators.

## Data Analysis

This notebook presents the data analysis that responds to the research questions.

### Data Loading

In [2]:
# Libraries
import warnings
import os
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import OneHotEncoder
from scipy import stats
from scipy.stats import zscore
from sklearn.preprocessing import StandardScaler
import statsmodels.api as sm
from sklearn.metrics import r2_score

# Set-up environment
pd.options.display.float_format = '{:.2f}'.format
pd.set_option('display.max_colwidth', None)
sns.set_theme(style="whitegrid", context="paper")
os.chdir('/Users/nataschajademinnitt/Documents/5. Data Analysis/teenage_pregnancy')
print("Current directory:", os.getcwd())
warnings.filterwarnings("ignore")

Current directory: /Users/nataschajademinnitt/Documents/5. Data Analysis/teenage_pregnancy


In [4]:
# Load the data
df = pd.read_csv("./data/processed_df.csv")
df_preg = df.loc[(df['been_preg'] == 1) & (df['age_preg'] <= 19)]

## Research Questions

### Socio-Demographic and Educational Factors

**1. How does household wealth predict the likelihood of teenage pregnancy?**

Sample: Between‑groups (been pregnant = 1,925 | not been pregnant = 1,513)

Interpretation:
* Girls in the Medium wealth group have odds of teenage pregnancy that are 44% of the odds for girls in the Low wealth group. This implies a 56% reduction in odds compared to the reference group (since 1 - 0.44 = 0.56).
* Girls in the High wealth group have odds of teenage pregnancy that are only 11% of those for girls in the Low wealth group, a reduction of 89% (1 - 0.11 = 0.89).
* This model clearly shows a gradient in risk: as wealth increases, the odds of teenage pregnancy decrease significantly.
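
The percentages above come from exponentiating the logit coefficients printed in the model summary in this section (-0.8140 for Medium, -2.2476 for High). A minimal arithmetic check, using only those two printed values, might look like this:

```python
import math

# Coefficients copied from the Logit summary in this notebook
# (reference category: Low wealth tertile).
coefs = {"wealth_Medium": -0.8140, "wealth_High": -2.2476}

# exp(coefficient) gives the odds ratio relative to the Low group.
odds_ratios = {name: math.exp(beta) for name, beta in coefs.items()}

for name, odds_ratio in odds_ratios.items():
    reduction = 1 - odds_ratio  # proportional reduction in odds, not in probability
    print(f"{name}: OR = {odds_ratio:.2f}, reduction in odds = {reduction:.0%}")
```

These are ratios of odds, so a 56% reduction in odds should not be read as a 56% reduction in the probability of pregnancy.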

Controls:
* When nearly everyone has the same value on a control (e.g., 97% attend school), the control adds little and can cause numerical issues.
* Almost all girls who have been married go on to experience teenage pregnancy, so using the marriage variable to predict pregnancy across groups can lead to separation issues and is not very informative.
* Household vulnerability was not a significant predictor.
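
One way to screen candidate controls for the two problems described above (near-constant values and separation) is sketched below. It runs on toy data, and the column names `attends_school` and `ever_married` and both helper functions are illustrative assumptions, not names taken from the survey dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
toy = pd.DataFrame({
    "been_preg":      [1, 1, 1, 0, 0, 1, 0, 1, 0, 0],
    "attends_school": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "ever_married":   [1, 1, 1, 0, 0, 1, 0, 0, 0, 0],
})

def dominant_share(series):
    """Share of rows taken by the most common value (1.0 means constant)."""
    return series.value_counts(normalize=True).iloc[0]

def has_empty_cell(data, predictor, outcome):
    """True if some predictor/outcome combination never occurs,
    the pattern behind (quasi-)separation in a logit model."""
    table = pd.crosstab(data[predictor], data[outcome])
    return bool((table == 0).any().any())

for col in ["attends_school", "ever_married"]:
    print(col,
          "| dominant share:", round(dominant_share(toy[col]), 2),
          "| empty cell:", has_empty_cell(toy, col, "been_preg"))
```

A dominant share close to 1 or an empty cell in the cross-tabulation is a warning sign rather than a formal test; the cut-off for dropping a control remains a judgment call.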

In [10]:
# Create dummies for wealth tertile with 'Low' as reference
df['wealth_tertile'] = pd.Categorical(
    df['wealth_tertile'],
    categories=['Low', 'Medium', 'High'],
    ordered=True
)
# Create wealth dummies from the original df
wealth_dummies = pd.get_dummies(df['wealth_tertile'], prefix='wealth', drop_first=True)

# Concatenate dummies to df
df_model_cat = pd.concat([df, wealth_dummies], axis=1)

# Design matrix using the dummy column names
X_cat = df_model_cat[wealth_dummies.columns]
X_cat = sm.add_constant(X_cat)
X_cat = X_cat.astype(float)
y_cat = df_model_cat['been_preg']

# Fit the logistic regression
model_cat = sm.Logit(y_cat, X_cat).fit(disp=False)
print(model_cat.summary())

# Convert coefficients to odds ratios
or_cat = np.exp(model_cat.params)
print("Odds Ratios (categorical):\n", or_cat)

                           Logit Regression Results                           
Dep. Variable:              been_preg   No. Observations:                 3438
Model:                          Logit   Df Residuals:                     3435
Method:                           MLE   Df Model:                            2
Date:                Sun, 06 Apr 2025   Pseudo R-squ.:                  0.1363
Time:                        16:36:01   Log-Likelihood:                -2037.0
converged:                       True   LL-Null:                       -2358.3
Covariance Type:            nonrobust   LLR p-value:                2.770e-140
                    coef    std err          z      P>|z|      [0.025      0.975]
---------------------------------------------------------------------------------
const             1.2919      0.072     17.983      0.000       1.151       1.433
wealth_Medium    -0.8140      0.094     -8.652      0.000      -0.998      -0.630
wealth_High      -2.2476      0.098    -

**2. What is the relationship between school attendance/educational attainment and the risk of teenage pregnancy?**

Revised subsetting:
* Between‑groups: If you have historical schooling data, compare the educational status at the time of pregnancy for the 1,295 girls with a similar measure from the 1,514 girls (now older than 19) who never experienced pregnancy.
* Rationale: This comparison uses complete risk window information for non‑pregnant girls, reducing bias from including those who might later become pregnant.
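
The between-groups subsetting described here can be sketched as follows. The example uses toy rows, and `age` (current age) is an assumed column name; only `been_preg` and `age_preg` appear in the data-loading cell of this notebook.

```python
import pandas as pd

# Toy rows; `age` is an assumed column name for current age.
toy = pd.DataFrame({
    "age":       [17, 18, 21, 22, 23, 20, 24, 16],
    "been_preg": [1, 0, 1, 0, 0, 1, 1, 0],
    "age_preg":  [16, None, 18, None, None, 19, 22, None],
})

# Cases: a pregnancy at or before age 19 (same filter as df_preg).
cases = toy.loc[(toy["been_preg"] == 1) & (toy["age_preg"] <= 19)]

# Comparison group: never pregnant and already past the 10-19 risk window.
comparison = toy.loc[(toy["been_preg"] == 0) & (toy["age"] > 19)]

# Stack both groups with a binary outcome flag for between-groups models.
analytic = pd.concat(
    [cases.assign(teen_preg=1), comparison.assign(teen_preg=0)],
    ignore_index=True,
)
print(analytic["teen_preg"].value_counts().sort_index())
```

Under this rule, never-pregnant girls aged 19 or younger and women whose first pregnancy came after 19 fall into neither group, which is what removes the censoring described in the rationale.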

**3. Among girls who experienced teenage pregnancy, does being in school versus out of school influence outcomes such as dropout or willingness to return to school?**

Subsetting remains:
* Within‑group: Focus solely on the 1,295 girls who experienced pregnancy between 10–19 years.
* Rationale: This allows you to explore how schooling status at or after pregnancy is associated with subsequent educational outcomes.

### Sexual Behavior and Contraceptive Use

**4. Does the age at first sexual intercourse and the context of that encounter influence the risk of teenage pregnancy?**

Revised subsetting:
* Between‑groups: Compare girls who experienced pregnancy (n = 1,295) with girls who have fully passed through the risk period (n = 1,514).
* Rationale: Using girls older than 19 in the non‑pregnant group gives a full account of sexual behavior and risk exposure without the censoring of younger non‑pregnant girls.

**5. How does early initiation of contraceptive use predict sustained use and lower rates of teenage pregnancy?**

Dual approach:
* For contraceptive behavior: Analyze within the subset of sexually active girls regardless of current age, tracking those who used contraception at first sex versus those who did not.
* For pregnancy risk: Compare the contraceptive patterns of the 1,295 girls with teenage pregnancy to the 1,514 girls who are older than 19 and never experienced pregnancy.
* Rationale: This approach allows you to assess both the immediate impact of early contraceptive use and its association with having avoided pregnancy over the full risk period.
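
For the pregnancy-risk half of this approach, an unadjusted first look could be a 2x2 cross-tabulation with an odds ratio. The sketch below uses toy data; `contra_first_sex` and `teen_preg` are assumed column names, and the numbers are not survey results.

```python
import pandas as pd

# Toy analytic sample; `contra_first_sex` and `teen_preg` are assumed names.
toy = pd.DataFrame({
    "contra_first_sex": [1, 1, 1, 1, 0, 0, 0, 0, 0, 1],
    "teen_preg":        [0, 0, 1, 0, 1, 1, 0, 1, 1, 0],
})

# Row-normalised crosstab: share with a teenage pregnancy in each group.
rates = pd.crosstab(toy["contra_first_sex"], toy["teen_preg"], normalize="index")
print(rates)

# Unadjusted odds ratio from the 2x2 counts (users vs non-users).
counts = pd.crosstab(toy["contra_first_sex"], toy["teen_preg"])
odds_users = counts.loc[1, 1] / counts.loc[1, 0]
odds_nonusers = counts.loc[0, 1] / counts.loc[0, 0]
odds_ratio = odds_users / odds_nonusers
print("Odds ratio (users vs non-users):", round(odds_ratio, 3))
```

An odds ratio below 1 would point to lower odds of teenage pregnancy among girls who used contraception at first sex. It is unadjusted, so a logistic model with controls would be the natural next step.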

**6. Are there differences in reproductive health knowledge and contraceptive practices between pregnant and non‑pregnant adolescents?**

Revised subsetting:
* Between‑groups: Again, compare the 1,295 girls (pregnancy event during 10–19) with the 1,514 girls older than 19 who have never been pregnant.
* Rationale: This contrast ensures that non‑pregnant girls have had the full window of exposure, which makes differences in knowledge and practices more interpretable.

### Marital Status and Social Norms

**7. Does marriage or a consensual union mediate the relationship between teenage pregnancy and school dropout?**

Revised subsetting:
* Within‑group: Focus on the 1,295 girls who experienced teenage pregnancy, using retrospective data on marital status and schooling at the time of the event.
* Rationale: Since marital status can change over time, using the pregnant subgroup helps clarify the temporal ordering and mediating role of marriage.

**8. How do social norms and attitudes influence teenage pregnancy risk and subsequent reproductive choices?**

Revised subsetting:
* Between‑groups: Compare attitudes among the 1,295 girls (pregnant during 10–19) with those of the 1,514 girls (non‑pregnant, aged >19).
* Rationale: This comparison leverages the complete exposure period for the non‑pregnant group, allowing you to assess whether certain attitudes correlate with having experienced pregnancy.

### Pregnancy Outcomes and Abortion Practices

**9. Among those who experienced teenage pregnancy, what is the prevalence of induced abortion, and what factors predict the likelihood of seeking an abortion?**

Subsetting remains:
* Within‑group: Focus on the 1,295 girls who experienced pregnancy between 10–19.
* Rationale: This focused subgroup allows for detailed analysis of pregnancy outcomes, including abortion practices.

**10. How do the timing and context of pregnancy relate to the decision to induce an abortion, and does this vary by schooling status?**

Subsetting remains:
* Within‑group: Analyze within the 1,295 girls who experienced teenage pregnancy.
* Rationale: This allows you to assess the interplay of timing, contextual factors (e.g., age at pregnancy, schooling status), and abortion decisions without additional confounding from non‑pregnant girls.

### Information Sources and Health Knowledge

**11. What role do different sources of sexual and reproductive health information play in shaping knowledge and practices that affect teenage pregnancy risk?**

Revised subsetting:
* Between‑groups: Compare the 1,295 girls (pregnant during 10–19) with the 1,514 girls (non‑pregnant, aged >19).
* Rationale: Using the full risk window for the non‑pregnant group provides a more definitive comparison of the influence of information sources.

**12. How do misconceptions or a lack of reproductive health knowledge correlate with the occurrence of teenage pregnancy?**

Revised subsetting:
* Between‑groups: Compare the 1,295 girls with teenage pregnancy to the 1,514 older, non‑pregnant girls.
* Rationale: This approach minimizes the potential misclassification of non‑pregnant girls who are still at risk, as the older group has already passed through the adolescent risk window.