# Data Exploration: Adolescent Sexual Behavior Survey

This notebook performs an initial exploration of survey data on adolescent sexual behavior.

**Research Question:** What behavioral profiles exist among adolescents regarding sexual activity?

**Dataset:** Survey responses from adolescents (ages 13-17) in a rural area. The survey covers:
- Demographics
- Sexual behavior and attitudes
- Partnership status
- Risk perceptions

**Author:** Isabella Rodas  
**Last Updated:** October 2025


## 1. Setup & Imports


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import missingno as msno
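import warnings
import sys

# Add the parent directory to the import path so project modules can be imported
sys.path.append('..')
# Suppress all warnings for cleaner output; comment this out when debugging
warnings.filterwarnings('ignore')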

# Set visualization defaults
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("✓ Libraries loaded successfully")


## 2. Load and Explore Data

**Note:** This notebook requires access to the raw data files. See `Data/README.md` for information about data access.
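
Because the raw files may not be present on every machine, a quick existence check can give a clearer message than a `FileNotFoundError` from `pd.read_excel`. The snippet below is a minimal sketch: the helper name `check_raw_data` is hypothetical, and the path is copied from the loading cell in this section, so adjust it if your layout differs.

```python
from pathlib import Path

# Path copied from the loading cell; change it if your folder layout differs.
RAW_FILE = Path('../Data/0_Raw/2. Participants attributes.xlsx')


def check_raw_data(path: Path = RAW_FILE) -> bool:
    """Return True if the raw data file exists; print a hint otherwise.

    Hypothetical helper, not part of the project code.
    """
    if path.is_file():
        print(f"Found raw data file: {path}")
        return True
    print(f"Raw data file not found: {path}. See Data/README.md for data access.")
    return False


raw_data_available = check_raw_data()
```

If `raw_data_available` is `False`, the loading cell will fail until the file is placed at the expected location.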


In [None]:
# Load data - adjust the path if needed (reading .xlsx files requires the openpyxl package)
data = pd.read_excel('../Data/0_Raw/2. Participants attributes.xlsx', 
                     sheet_name='IsBaru_Consolidado')

print(f"Dataset shape: {data.shape}")
print(f"Participants: {data.shape[0]} | Variables: {data.shape[1]}")
data.head()
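
After loading, a per-column overview of data types and missing values is a common first exploration step. The sketch below is one possible way to do it, not the method used in the original analysis: `summarize_columns` is a hypothetical helper, and the toy frame uses made-up column names only to show the shape of the output.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, missing count, missing percentage, and unique count per column.

    Hypothetical helper for illustration; rows are sorted so the columns
    with the most missing values come first.
    """
    summary = pd.DataFrame({
        'dtype': df.dtypes.astype(str),
        'n_missing': df.isna().sum(),
        'pct_missing': (df.isna().mean() * 100).round(1),
        'n_unique': df.nunique(dropna=True),
    })
    return summary.sort_values('pct_missing', ascending=False)


# Toy data with hypothetical column names, NOT taken from the survey.
toy = pd.DataFrame({
    'age': [13, 15, None, 17],
    'has_partner': ['yes', None, None, 'no'],
    'sex': ['F', 'M', 'F', 'M'],
})

toy_summary = summarize_columns(toy)
print(toy_summary)
```

In this notebook the same call would be `summarize_columns(data)`. Since `missingno` is already imported, `msno.matrix(data)` could serve as a visual counterpart to this table.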


## 3. Next Steps

Continue with:
- `02_lca_preprocessing.ipynb` - Missing data handling and preprocessing for latent class analysis (LCA)
- `03_clustering_analysis.ipynb` - K-modes clustering analysis
- Then proceed to the R scripts in `scripts/` for the LCA itself

**For detailed preprocessing code, see the original 2021 analysis notebooks in the `original_code/` directory.**
