# Social Anxiety Dataset

#### Description
This dataset contains more than **10,000 samples** representing people with different levels of social anxiety, from mild to severe.

#### Context
**Social anxiety** (social phobia) affects millions of people worldwide.  
It is associated with a combination of:
- Behavioral patterns
- Psychological states
- Lifestyle
- Genetic predispositions

This synthetic dataset mirrors real-world patterns and includes high-anxiety cases to support research on detection and intervention.

#### Features included
- **Demographics**: age, gender, occupation
- **Lifestyle**: sleep hours, physical activity, diet quality, alcohol consumption, caffeine intake, smoking habits
- **Health and mental indicators**: heart rate, breathing rate, stress level, sweating level, dizziness
- **Mental health history**: family history of anxiety, medication use, therapy frequency
- **Life events**: recent major life events
- **Target variable**: anxiety level (1–10), which quantifies the intensity of social anxiety

**Dataset source**: [Social Anxiety Dataset on Kaggle](https://www.kaggle.com/datasets/natezhang123/social-anxiety-dataset)
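
The feature groups listed above can be mapped onto the actual column names to separate predictors from the target. The following is a minimal sketch: the column names are the ones that appear in the table header of this notebook, while `FEATURE_GROUPS`, `TARGET`, and `split_features_target` are hypothetical names introduced here for illustration only.

```python
import pandas as pd

# Column names come from the table header shown in this notebook.
# The group names and the helper below are hypothetical (illustration only).
TARGET = "Anxiety Level (1-10)"

FEATURE_GROUPS = {
    "demographics": ["Age", "Gender", "Occupation"],
    "lifestyle": [
        "Sleep Hours",
        "Physical Activity (hrs/week)",
        "Diet Quality (1-10)",
        "Alcohol Consumption (drinks/week)",
        "Caffeine Intake (mg/day)",
        "Smoking",
    ],
    "health_indicators": [
        "Heart Rate (bpm)",
        "Breathing Rate (breaths/min)",
        "Stress Level (1-10)",
        "Sweating Level (1-5)",
        "Dizziness",
    ],
    "mental_health_history": [
        "Family History of Anxiety",
        "Medication",
        "Therapy Sessions (per month)",
    ],
    "life_events": ["Recent Major Life Event"],
}


def split_features_target(df: pd.DataFrame):
    """Return (X, y); raise KeyError if an expected column is missing."""
    feature_cols = [col for cols in FEATURE_GROUPS.values() for col in cols]
    missing = [col for col in feature_cols + [TARGET] if col not in df.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    return df[feature_cols], df[TARGET]
```

Once the CSV is loaded into a DataFrame, `X, y = split_features_target(data)` would give the 18 predictor columns and the anxiety score as separate objects.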

In [5]:
import pandas as pd

In [6]:
data = pd.read_csv('../data/raw/enhanced_anxiety_dataset.csv')
data

Unnamed: 0,Age,Gender,Occupation,Sleep Hours,Physical Activity (hrs/week),Caffeine Intake (mg/day),Alcohol Consumption (drinks/week),Smoking,Family History of Anxiety,Stress Level (1-10),Heart Rate (bpm),Breathing Rate (breaths/min),Sweating Level (1-5),Dizziness,Medication,Therapy Sessions (per month),Recent Major Life Event,Diet Quality (1-10),Anxiety Level (1-10)
0,29,Female,Artist,6.0,2.7,181,10,Yes,No,10,114,14,4,No,Yes,3,Yes,7,5.0
1,46,Other,Nurse,6.2,5.7,200,8,Yes,Yes,1,62,23,2,Yes,No,2,No,8,3.0
2,64,Male,Other,5.0,3.7,117,4,No,Yes,1,91,28,3,No,No,1,Yes,1,1.0
3,20,Female,Scientist,5.8,2.8,360,6,Yes,No,4,86,17,3,No,No,0,No,1,2.0
4,49,Female,Other,8.2,2.3,247,4,Yes,No,1,98,19,4,Yes,Yes,1,No,3,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10995,23,Female,Engineer,6.1,3.1,566,9,Yes,No,8,91,28,1,Yes,Yes,1,No,3,6.0
10996,50,Other,Teacher,6.6,3.6,64,17,Yes,No,7,95,17,3,No,No,2,No,7,3.0
10997,29,Male,Nurse,6.7,6.9,159,14,No,No,8,72,16,1,Yes,Yes,2,Yes,7,4.0
10998,53,Other,Artist,5.7,2.7,248,8,No,No,4,112,28,3,Yes,Yes,1,Yes,2,4.0


In [8]:
# Rebinds `data`: any cell executed after this one operates on this second file.
data = pd.read_csv('../data/raw/family_anxiety_14_dataset.csv')
data

Unnamed: 0,Age,Gender,Occupation,Sleep Hours,Physical Activity (hrs/week),Caffeine Intake (mg/day),Alcohol Consumption (drinks/week),Smoking,Family History of Anxiety,Stress Level (1-10),Heart Rate (bpm),Breathing Rate (breaths/min),Sweating Level (1-5),Dizziness,Medication,Therapy Sessions (per month),Recent Major Life Event,Diet Quality (1-10),Anxiety Level (1-10)
0,58,Male,Nurse,6.2,1.3,192,16,No,Yes,1,117,21,1,No,Yes,9,No,1,3.0
1,39,Female,Engineer,8.6,3.8,367,15,No,No,10,113,14,3,Yes,Yes,6,Yes,3,4.0
2,42,Female,Doctor,6.6,0.5,132,1,No,No,10,79,20,1,Yes,No,1,No,6,6.0
3,43,Female,Athlete,7.0,1.6,361,15,No,No,4,69,25,2,Yes,No,5,Yes,10,2.0
4,55,Other,Athlete,7.6,2.8,531,0,No,No,3,65,12,4,No,No,6,Yes,4,4.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10995,42,Male,Athlete,11.2,3.3,2,3,No,No,1,65,26,3,No,No,5,No,2,2.0
10996,45,Female,Lawyer,6.2,3.6,31,18,No,No,4,111,23,2,No,Yes,4,No,1,4.0
10997,38,Male,Lawyer,7.2,2.5,222,6,No,Yes,10,98,28,3,No,No,5,Yes,6,6.0
10998,28,Other,Lawyer,8.0,6.3,405,13,No,No,8,118,27,1,Yes,Yes,2,Yes,1,3.0


In [7]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11000 entries, 0 to 10999
Data columns (total 19 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   Age                                11000 non-null  int64  
 1   Gender                             11000 non-null  object 
 2   Occupation                         11000 non-null  object 
 3   Sleep Hours                        11000 non-null  float64
 4   Physical Activity (hrs/week)       11000 non-null  float64
 5   Caffeine Intake (mg/day)           11000 non-null  int64  
 6   Alcohol Consumption (drinks/week)  11000 non-null  int64  
 7   Smoking                            11000 non-null  object 
 8   Family History of Anxiety          11000 non-null  object 
 9   Stress Level (1-10)                11000 non-null  int64  
 10  Heart Rate (bpm)                   11000 non-null  int64  
 11  Breathing Rate (breaths/min)       11000 non-null  int
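
The `object` columns reported by `data.info()` hold strings, and several of them contain only "Yes"/"No" in the rows displayed earlier. Below is a minimal sketch of one way to turn those into 0/1 integers before modelling. The list `YES_NO_COLUMNS` and the helper `encode_yes_no` are hypothetical names, and the assumption that these five columns contain nothing but "Yes" and "No" is based only on the sample rows, so the helper checks it explicitly.

```python
import pandas as pd

# Assumption: these columns contain only "Yes"/"No", as in the displayed rows.
YES_NO_COLUMNS = [
    "Smoking",
    "Family History of Anxiety",
    "Dizziness",
    "Medication",
    "Recent Major Life Event",
]


def encode_yes_no(df: pd.DataFrame, columns=YES_NO_COLUMNS) -> pd.DataFrame:
    """Return a copy of df with "Yes" -> 1 and "No" -> 0 in the given columns."""
    out = df.copy()
    for col in columns:
        # Fail loudly on anything other than "Yes"/"No" (including missing values).
        unexpected = set(out[col].unique()) - {"Yes", "No"}
        if unexpected:
            raise ValueError(f"Unexpected values in {col!r}: {unexpected}")
        out[col] = (out[col] == "Yes").astype("int64")
    return out


# Values copied from the first three rows of the first table in this notebook.
sample = pd.DataFrame({
    "Smoking": ["Yes", "Yes", "No"],
    "Family History of Anxiety": ["No", "Yes", "Yes"],
    "Dizziness": ["No", "Yes", "No"],
    "Medication": ["Yes", "No", "No"],
    "Recent Major Life Event": ["Yes", "No", "Yes"],
})
encoded = encode_yes_no(sample)
print(encoded.dtypes)
```

`Gender` and `Occupation` have more than two categories in the displayed rows, so they would need a different encoding (for example `pd.get_dummies`) and are left out of this sketch.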

In [11]:
data.describe()

Unnamed: 0,Age,Sleep Hours,Physical Activity (hrs/week),Caffeine Intake (mg/day),Alcohol Consumption (drinks/week),Stress Level (1-10),Heart Rate (bpm),Breathing Rate (breaths/min),Sweating Level (1-5),Therapy Sessions (per month),Diet Quality (1-10),Anxiety Level (1-10)
count,11000.0,11000.0,11000.0,11000.0,11000.0,11000.0,11000.0,11000.0,11000.0,11000.0,11000.0,11000.0
mean,40.241727,6.650691,2.942136,286.09,9.701636,5.856364,90.916,20.957545,3.080636,2.427818,5.181818,3.929364
std,13.23614,1.227509,1.827825,144.813157,5.689713,2.927202,17.325721,5.160107,1.398877,2.183106,2.895243,2.122533
min,18.0,2.3,0.0,0.0,0.0,1.0,60.0,12.0,1.0,0.0,1.0,1.0
25%,29.0,5.9,1.5,172.0,5.0,3.0,76.0,17.0,2.0,1.0,3.0,2.0
50%,40.0,6.7,2.8,273.0,10.0,6.0,92.0,21.0,3.0,2.0,5.0,4.0
75%,51.0,7.5,4.2,382.0,15.0,8.0,106.0,25.0,4.0,4.0,8.0,5.0
max,64.0,11.3,10.1,599.0,19.0,10.0,119.0,29.0,5.0,12.0,10.0,10.0
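
Several column names carry their scale in parentheses, such as `(1-10)` or `(1-5)`, and the `min` and `max` rows of `describe()` can be compared against those bounds. The following is a minimal sketch of that check. The helper `check_scale_ranges` and the regular expression are hypothetical and assume that the scale always appears at the end of the column name in the form `(low-high)`.

```python
import re

import pandas as pd

# Matches a trailing "(low-high)" with integer bounds, e.g. "Stress Level (1-10)".
SCALE_PATTERN = re.compile(r"\((\d+)-(\d+)\)\s*$")


def check_scale_ranges(df: pd.DataFrame) -> dict:
    """Map each scaled column to (observed_min, observed_max, within_bounds)."""
    report = {}
    for col in df.columns:
        match = SCALE_PATTERN.search(col)
        if match is None:
            continue  # e.g. "Heart Rate (bpm)" declares a unit, not a range
        low, high = int(match.group(1)), int(match.group(2))
        observed_min, observed_max = df[col].min(), df[col].max()
        within = bool(low <= observed_min and observed_max <= high)
        report[col] = (observed_min, observed_max, within)
    return report


# Two-row stand-in built from the min and max rows of the describe() output.
extremes = pd.DataFrame({
    "Stress Level (1-10)": [1, 10],
    "Sweating Level (1-5)": [1, 5],
    "Diet Quality (1-10)": [1, 10],
    "Anxiety Level (1-10)": [1.0, 10.0],
    "Heart Rate (bpm)": [60, 119],
})
for column, (lo, hi, ok) in check_scale_ranges(extremes).items():
    print(f"{column}: min={lo}, max={hi}, within declared scale={ok}")
```

On the full DataFrame the call would be `check_scale_ranges(data)`. Columns such as `Sleep Hours` or `Therapy Sessions (per month)` have no declared scale, so they are skipped and would need their own plausibility checks.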
