# Reading data
Read the train.csv file into a pandas DataFrame.

In [7]:
import titanic_fns as tf
import pandas as pd
datos = pd.read_csv("DATA/train.csv")

# Indexing
1. Create a function that returns the name of a passenger given their PassengerId.
2. Create a function that returns the PassengerId of a passenger given their Name.
3. Print a message with the ID of passenger **Montvila, Rev. Juozas** with the following format: 'The ID of passenger Montvila, Rev. Juozas is ##'
4. Print a message with the name of the passenger with ID **42** with the following format: 'The passenger with ID 42 is X'

In [8]:
## 1. Create a function that returns the name of a passenger given their PassengerId.
def Name_byId(df: pd.DataFrame, id: int) -> str:
    """Return the name of the passenger with the given ID.

    Args:
        df (pd.DataFrame): The dataset
        id (int): The passenger's ID

    Returns:
        str: The passenger's name, or "Passenger not found" if no row matches
    """
    name = df.loc[df["PassengerId"] == id, "Name"]
    return name.iloc[0] if not name.empty else "Passenger not found"

## 2. Create a function that returns the PassengerId of a passenger given their Name.
def Id_byName(df: pd.DataFrame, name: str) -> int:
    """Return the ID of the passenger with the given name.

    Args:
        df (pd.DataFrame): The dataset
        name (str): The passenger's name

    Returns:
        int: The passenger's ID, or -1 if no row matches
    """
    id = df.loc[df["Name"] == name, "PassengerId"]
    return id.iloc[0] if not id.empty else -1  # Return -1 if the passenger is not found

## 3. Print a message with the ID of passenger **Montvila, Rev. Juozas**
passenger_name = "Montvila, Rev. Juozas"
passenger_id = Id_byName(datos, passenger_name)

print(f'The ID of passenger {passenger_name} is {passenger_id}')

## 4. Print a message with the name of the passenger with ID **42**
passenger_id_42 = 42
passenger_name_42 = Name_byId(datos, passenger_id_42)

print(f'The passenger with ID {passenger_id_42} is {passenger_name_42}')

The ID of passenger Montvila, Rev. Juozas is 887
The passenger with ID 42 is Turpin, Mrs. William John Robert (Dorothy Ann Wonnacott)


5. Print all information about the oldest passenger.

oldest_passenger = datos[datos["Age"] == datos["Age"].max()]

print(oldest_passenger)

# Subsetting
We are asked to share data for analysis by a third party. Since our dataset contains personal details, we want to share only the following information: ticket class, fare, and port of embarkation. We are asked to deliver a sample consisting of the first 100 rows of this reduced dataset.

6. Create and save the new dataset in **data/port_fares.csv**.

In [None]:
subset = datos[["Pclass", "Fare", "Embarked"]].head(100)
subset.to_csv("data/port_fares.csv", index=False)

print("The file 'data/port_fares.csv' has been saved successfully.")

The file 'data/port_fares.csv' has been saved successfully.


# Counting
7. We want to know whether there were any survivors over the age of 60. Print all of their information.
8. How many people over 60 survived?
9. What percentage of people over 60 survived?

In [9]:
#7. We want to know if there were any survivors over the age of 60, print all of their information.
survivors_over_60 = datos[(datos["Age"] > 60) & (datos["Survived"] == 1)]
print("Survivors over 60:")
print(survivors_over_60)
#8. How many people over 60 survived?
num_survivors_over_60 = survivors_over_60.shape[0]
print(f"Number of survivors over 60: {num_survivors_over_60}")
#9. What percentage of people over 60 survived?
total_over_60 = datos[datos["Age"] > 60].shape[0]
percentage_survived_over_60 = (num_survivors_over_60 / total_over_60) * 100 if total_over_60 > 0 else 0
print(f"Percentage of people over 60 who survived: {percentage_survived_over_60:.2f}%")


Survivors over 60:
     PassengerId  Survived  Pclass                                       Name  \
275          276         1       1          Andrews, Miss. Kornelia Theodosia   
483          484         1       3                     Turkula, Mrs. (Hedwig)   
570          571         1       2                         Harris, Mr. George   
630          631         1       1       Barkworth, Mr. Algernon Henry Wilson   
829          830         1       1  Stone, Mrs. George Nelson (Martha Evelyn)   

        Sex   Age  SibSp  Parch       Ticket     Fare Cabin Embarked  
275  female  63.0      1      0        13502  77.9583    D7        S  
483  female  63.0      0      0         4134   9.5875   NaN        S  
570    male  62.0      0      0  S.W./PP 752  10.5000   NaN        S  
630    male  80.0      0      0        27042  30.0000   A23        S  
829  female  62.0      0      0       113572  80.0000   B28      NaN  
Number of survivors over 60: 5
Percentage of people over 60 who surv

# Women and children first?
10. Find out if women and children were more likely to survive.

11. Write a function that returns the percentage of people that survived from a subset given as a boolean Pandas series.
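
One possible approach to items 10 and 11 is sketched below. It is a minimal sketch, not the notebook's own solution: the helper name `survival_percentage`, the cutoff of 18 years for "children", and the small hand-made `sample` frame are assumptions made for illustration. With the real data, `datos` would take the place of `sample`.

```python
import pandas as pd


def survival_percentage(df: pd.DataFrame, mask: pd.Series) -> float:
    """Return the percentage of survivors among the rows selected by a boolean mask."""
    selected = df.loc[mask, "Survived"]
    # An empty selection would give NaN, so report 0.0 instead.
    if selected.empty:
        return 0.0
    # Survived is coded 0/1, so its mean is the survival rate.
    return float(selected.mean() * 100)


# Hypothetical toy data, used only so that the sketch runs on its own.
sample = pd.DataFrame({
    "Sex": ["female", "male", "male", "female", "male", "male"],
    "Age": [30.0, 40.0, 8.0, 55.0, None, 22.0],
    "Survived": [1, 1, 1, 0, 0, 0],
})

# Assumption: "children" means younger than 18. A missing Age compares as
# False, so such a passenger is only included if she is female.
women_or_children = (sample["Sex"] == "female") | (sample["Age"] < 18)

pct_group = survival_percentage(sample, women_or_children)
pct_rest = survival_percentage(sample, ~women_or_children)
print(f"Women and children: {pct_group:.2f}%")
print(f"Everyone else: {pct_rest:.2f}%")
```

Comparing the two percentages on the full dataset answers item 10. The toy numbers above say nothing about the real passengers.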

# Summarizing

12. What is the median age of the passengers?
13. How many passengers embarked from each port?
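
Items 12 and 13 can be explored with `Series.median` and `Series.value_counts`. The sketch below uses a hypothetical `sample` frame in place of `datos`, and the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical toy data standing in for `datos`.
sample = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0, 54.0],
    "Embarked": ["S", "C", "S", None, "Q"],
})

# Series.median skips missing ages by default (skipna=True).
median_age = sample["Age"].median()
print(f"Median age: {median_age}")

# value_counts drops missing ports unless dropna=False is passed,
# so the missing value is kept here as its own category.
passengers_per_port = sample["Embarked"].value_counts(dropna=False)
print(passengers_per_port)
```

Passing `dropna=False` is a design choice: it makes passengers with no recorded port visible instead of silently leaving them out of the count.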

14. Generate two hypotheses about how the survival rate differs among groups of passengers. Write code to explore both hypotheses.
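
As an illustration for item 14, two example hypotheses could be "survival rate differs by ticket class" and "survival rate differs by sex". These hypotheses, the helper `survival_rate_by`, and the toy `sample` frame are assumptions for the sketch, not results from the dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for `datos`.
sample = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["female", "male", "female", "male", "female", "male"],
    "Survived": [1, 1, 1, 0, 1, 0],
})


def survival_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Return the percentage of survivors within each value of `column`."""
    # Survived is coded 0/1, so the group mean is the group's survival rate.
    return df.groupby(column)["Survived"].mean() * 100


# Example hypothesis 1: survival rate differs by ticket class.
rate_by_class = survival_rate_by(sample, "Pclass")
print(rate_by_class)

# Example hypothesis 2: survival rate differs by sex.
rate_by_sex = survival_rate_by(sample, "Sex")
print(rate_by_sex)
```

Running the same two calls on `datos` would show whether the real data supports either hypothesis.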