

```markdown
### Challenge: Demographic Data Analysis with Pandas

In this challenge you must **analyze demographic data using Pandas**. You are given a dataset extracted from the **1994 Census** database.

---


### 🧠 Questions to answer using Pandas:

1. How many people of each race are represented in this dataset?  
   *(This should be returned as a Pandas Series with the race names as the index, taken from the `race` column.)*

2. What is the average age of men?

3. What percentage of people have a **Bachelor's degree** (`Bachelors`)?

4. What percentage of people with **advanced education** (*Bachelors*, *Masters*, or *Doctorate*) earn more than **50K**?

5. What percentage of people **without advanced education** earn more than **50K**?

6. What is the minimum number of hours a person works per week?

7. What percentage of the people who work the minimum number of hours per week have a salary above **50K**?

8. Which country has the highest percentage of people earning **>50K**, and what is that percentage?

9. What is the most popular occupation among those who earn **>50K in India**?

---

### 🧪 Technical requirements

- Use the starter code in the file `demographic_data_analyzer.py`.
- Update the code so that every variable initialized to `None` is replaced with the corresponding calculation.
- **Round all decimals to the nearest tenth.**
```
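Question 1 and the rounding requirement can be tried out on a small frame before touching the census file. The sketch below is only an illustration: the four-row `toy` DataFrame and its values are made up, not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data; the real challenge reads the census CSV instead.
toy = pd.DataFrame({
    "race": ["White", "Black", "White", "Other"],
    "age": [39, 50, 38, 53],
})

# value_counts() returns a Series whose index holds the category labels.
race_count = toy["race"].value_counts()

# Round a computed mean to the nearest tenth, as the requirements ask.
average_age = round(float(toy["age"].mean()), 1)
```

Casting to `float` before rounding keeps the result a plain Python number rather than a NumPy scalar, which tends to print more cleanly.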



In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("datasources/1994_census.csv")
df.head()

Unnamed: 0,age,workclass,fnlwgt,education,education.num,marital.status,occupation,relationship,race,sex,capital.gain,capital.loss,hours.per.week,native.country,income
0,90,?,77053,HS-grad,9,Widowed,?,Not-in-family,White,Female,0,4356,40,United-States,<=50K
1,82,Private,132870,HS-grad,9,Widowed,Exec-managerial,Not-in-family,White,Female,0,4356,18,United-States,<=50K
2,66,?,186061,Some-college,10,Widowed,?,Unmarried,Black,Female,0,4356,40,United-States,<=50K
3,54,Private,140359,7th-8th,4,Divorced,Machine-op-inspct,Unmarried,White,Female,0,3900,40,United-States,<=50K
4,41,Private,264663,Some-college,10,Separated,Prof-specialty,Own-child,White,Female,0,3900,40,United-States,<=50K
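The preview shows `?` in the `workclass` and `occupation` columns. Assuming `?` is this file's placeholder for an unknown value (the source does not say so explicitly), a quick way to see how common it is might look like the following sketch, which uses a made-up `toy` frame rather than the real data.

```python
import pandas as pd

# Hypothetical rows loosely modeled on the preview above.
toy = pd.DataFrame({
    "workclass": ["?", "Private", "?", "Private"],
    "occupation": ["?", "Exec-managerial", "?", "Machine-op-inspct"],
    "race": ["White", "White", "Black", "White"],
})

# Comparing the whole frame to "?" gives a boolean frame;
# sum() then counts the True values in each column.
placeholder_counts = (toy == "?").sum()
```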


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32561 entries, 0 to 32560
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   age             32561 non-null  int64 
 1   workclass       32561 non-null  object
 2   fnlwgt          32561 non-null  int64 
 3   education       32561 non-null  object
 4   education.num   32561 non-null  int64 
 5   marital.status  32561 non-null  object
 6   occupation      32561 non-null  object
 7   relationship    32561 non-null  object
 8   race            32561 non-null  object
 9   sex             32561 non-null  object
 10  capital.gain    32561 non-null  int64 
 11  capital.loss    32561 non-null  int64 
 12  hours.per.week  32561 non-null  int64 
 13  native.country  32561 non-null  object
 14  income          32561 non-null  object
dtypes: int64(6), object(9)
memory usage: 3.7+ MB
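Several column names in the listing contain dots (`hours.per.week`, `native.country`), so they have to be selected with bracket notation rather than attribute access. A minimal sketch on a made-up frame:

```python
import pandas as pd

# Hypothetical frame reusing one dotted column name from the listing.
toy = pd.DataFrame({"hours.per.week": [40, 18, 40], "age": [90, 82, 66]})

# Bracket access works for any column label, dots included.
min_hours = int(toy["hours.per.week"].min())

# Attribute access would be parsed as toy.hours, then .per, then .week,
# and there is no column called "hours", so it raises AttributeError.
try:
    toy.hours
    attribute_access_works = True
except AttributeError:
    attribute_access_works = False
```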


In [4]:
df.shape

(32561, 15)
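The first element of `shape` is the row count, which is the denominator for any whole-population percentage. One compact way to get such a percentage is to take the mean of a boolean mask. The sketch below uses invented values, not the census data.

```python
import pandas as pd

# Hypothetical education column with five rows.
toy = pd.DataFrame({
    "education": ["Bachelors", "HS-grad", "Masters", "Bachelors", "HS-grad"],
})

n_rows, n_cols = toy.shape

# The mean of a boolean Series is the fraction of True values,
# so multiplying by 100 gives a percentage of all rows.
is_bachelors = toy["education"] == "Bachelors"
percentage_bachelors = round(float(is_bachelors.mean() * 100), 1)
```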

In [6]:

def calculate_demographic_data(print_data=True):
    # Read data from file
    df = pd.read_csv("datasources/1994_census.csv")

    # Reusable boolean masks: advanced education and income above 50K
    higher_education = df['education'].isin(['Bachelors', 'Masters', 'Doctorate'])
    is_rich = df['income'] == '>50K'

    # How many of each race are represented in this dataset? This should be a Pandas series with race names as the index labels.
    race_count = df['race'].value_counts()

    # What is the average age of men?
    average_age_men = round(float(df.loc[df['sex'] == 'Male', 'age'].mean()), 1)

    # What is the percentage of people who have a Bachelor's degree?
    # The mean of a boolean mask is the fraction of True values.
    percentage_bachelors = round(float((df['education'] == 'Bachelors').mean() * 100), 1)

    # What percentage of people with advanced education (`Bachelors`, `Masters`, or `Doctorate`) make more than 50K?
    # The denominator is the advanced-education group, not the whole population.
    higher_education_rich = round(float(is_rich[higher_education].mean() * 100), 1)

    # What percentage of people without advanced education make more than 50K?
    lower_education_rich = round(float(is_rich[~higher_education].mean() * 100), 1)

    # What is the minimum number of hours a person works per week (hours.per.week column)?
    min_work_hours = int(df['hours.per.week'].min())

    # What percentage of the people who work the minimum number of hours per week have a salary of >50K?
    min_workers = df['hours.per.week'] == min_work_hours
    rich_percentage = round(float(is_rich[min_workers].mean() * 100), 1)

    # What country has the highest percentage of people that earn >50K?
    # Compare the share of >50K earners within each country, not the raw count of earners.
    rich_share_by_country = is_rich.groupby(df['native.country']).mean() * 100
    highest_earning_country = rich_share_by_country.idxmax()
    highest_earning_country_percentage = round(float(rich_share_by_country.max()), 1)

    # Identify the most popular occupation for those who earn >50K in India.
    rich_india_population = df[is_rich & (df['native.country'] == 'India')]
    top_IN_occupation = rich_india_population['occupation'].value_counts().idxmax()

    # DO NOT MODIFY BELOW THIS LINE

    if print_data:
        print("Number of each race:\n", race_count)
        print("Average age of men:", average_age_men)
        print(f"Percentage with Bachelors degrees: {percentage_bachelors}%")
        print(f"Percentage with higher education that earn >50K: {higher_education_rich}%")
        print(f"Percentage without higher education that earn >50K: {lower_education_rich}%")
        print(f"Min work time: {min_work_hours} hours/week")
        print(f"Percentage of rich among those who work fewest hours: {rich_percentage}%")
        print("Country with highest percentage of rich:", highest_earning_country)
        print(f"Highest percentage of rich people in country: {highest_earning_country_percentage}%")
        print("Top occupations in India:", top_IN_occupation)

    return {
        'race_count': race_count,
        'average_age_men': average_age_men,
        'percentage_bachelors': percentage_bachelors,
        'higher_education_rich': higher_education_rich,
        'lower_education_rich': lower_education_rich,
        'min_work_hours': min_work_hours,
        'rich_percentage': rich_percentage,
        'highest_earning_country': highest_earning_country,
        'highest_earning_country_percentage':
        highest_earning_country_percentage,
        'top_IN_occupation': top_IN_occupation
    }


In [7]:
calculate_demographic_data()

Number of each race:
 race
White                 27816
Black                  3124
Asian-Pac-Islander     1039
Amer-Indian-Eskimo      311
Other                   271
Name: count, dtype: int64
Average age of men: 39.4
Percentage with Bachelors degrees: 16.4%
Percentage with higher education that earn >50K: 46.5%
Percentage without higher education that earn >50K: 17.4%
Min work time: 1 hours/week
Percentage of rich among those who work fewest hours: 10.0%
Country with highest percentage of rich: Iran
Highest percentage of rich people in country: 41.9%
Top occupations in India: Prof-specialty


{'race_count': race
 White                 27816
 Black                  3124
 Asian-Pac-Islander     1039
 Amer-Indian-Eskimo      311
 Other                   271
 Name: count, dtype: int64,
 'average_age_men': 39.4,
 'percentage_bachelors': 16.4,
 'higher_education_rich': 46.5,
 'lower_education_rich': 17.4,
 'min_work_hours': 1,
 'rich_percentage': 10.0,
 'highest_earning_country': 'Iran',
 'highest_earning_country_percentage': 41.9,
 'top_IN_occupation': 'Prof-specialty'}
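Question 8 asks for the country with the highest *percentage* of >50K earners, which can differ from the country with the most >50K earners in absolute terms. The sketch below illustrates the difference on invented data; the country labels `A` and `B` and all values are hypothetical.

```python
import pandas as pd

# Hypothetical data: country A has more high earners, B has a higher share.
toy = pd.DataFrame({
    "native.country": ["A", "A", "A", "A", "A", "B", "B"],
    "income": [">50K", ">50K", "<=50K", "<=50K", "<=50K", ">50K", "<=50K"],
})

is_rich = toy["income"] == ">50K"

# Raw count of high earners per country (largest group wins).
most_rich_by_count = toy.loc[is_rich, "native.country"].value_counts().idxmax()

# Share of high earners within each country: group the boolean mask
# by country and take the mean, then scale to a percentage.
rich_share = is_rich.groupby(toy["native.country"]).mean() * 100
top_country = rich_share.idxmax()
top_percentage = round(float(rich_share.max()), 1)
```

Here the count-based answer is `A` (two high earners) while the share-based answer is `B` (one of two people), which is why the denominator has to be each country's own population.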