
# 🧩 Titanic Dataset Exploration with pandas

🎯 **Goal:**  
In this notebook, you'll explore the Titanic dataset using **pandas**.  
You'll practice the most common pandas functions for data inspection, selection, filtering, cleaning, and analysis.

For each function in the list below:
1. Explain what it does (in your own words, in a Markdown cell).
2. Give at least **two examples** using the Titanic dataset.
3. Add a short comment about the output or why it's useful.


In [2]:

import pandas as pd

# Load Titanic dataset
# (Make sure titanic_dataset.csv is in the same folder as this notebook)
df = pd.read_csv("titanic_dataset.csv")

# Show first few rows
df.head()


Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S



## 🧠 Step 1: Inspecting the Data

Functions to explore:
- df.head()
- df.tail()
- df.info()
- df.describe()
- df.shape
- df.columns


In [None]:

# df.head()
# Shows the first 5 rows of the DataFrame

# df.head(10)
# Shows the first N rows, where N is the argument passed to .head(N)

# df.tail()
# Shows the last 5 rows of the DataFrame

# df.tail(10)
# Shows the last N rows, where N is the argument passed to .tail(N)

# df.info()
# Prints a summary of the DataFrame: its index, the columns with their dtypes and non-null counts, and memory usage

# df.describe()
# Returns summary statistics for the numeric columns: count, mean, standard deviation, minimum, the 25%, 50%, and 75% percentiles, and maximum

# df.shape
# Returns a tuple with the dimensions of the DataFrame: (rows, columns)

df.columns
# Returns an Index holding the column labels; the dtype shown belongs to the Index itself

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')
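
A minimal sketch of the inspection calls above, run on a small stand-in frame instead of the full CSV. The `sample` frame is an assumption for illustration: it copies a few columns from the five rows shown by `df.head()` earlier, so the numbers are not those of the full dataset.

```python
import pandas as pd

# Stand-in data: the five rows displayed by df.head() above (subset of columns)
sample = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5],
    "Survived": [0, 1, 1, 1, 0],
    "Pclass": [3, 1, 3, 1, 3],
    "Sex": ["male", "female", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
})

n_rows, n_cols = sample.shape        # (rows, columns) tuple
first_two = sample.head(2)           # first 2 rows
last_two = sample.tail(2)            # last 2 rows
summary = sample.describe()          # numeric columns only by default
column_names = list(sample.columns)  # column labels as a plain list

print(n_rows, n_cols)
print(summary.loc[["count", "mean", "std"], "Age"])
sample.info()                        # prints dtypes, non-null counts, memory usage
```

Note that `describe()` skips the text column `Sex` unless you pass `include="all"`.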


## 🔍 Step 2: Selecting Data

Functions to explore:
- df["column"]
- df[["col1", "col2"]]
- df.loc[]
- df.iloc[]


In [None]:

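# Example: Selecting columns
# df["Age"].head()
# df[["Sex", "Age", "Survived"]].head()

# df["Age"]
# Returns a Series: only the values of that column, shown alongside the index

# df[["Age"]]
# Wrapping the name in a list returns a DataFrame, so the column header is shown too

# df[["Age", "Survived"]]
# Passing a list of several column names returns a DataFrame with those columns and their headers

# df.loc[] on its own is a syntax error: .loc needs a row label (and optionally column labels)
# With the default integer index, a string label such as df.loc['22'] raises KeyError: '22'
df.loc[22]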
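
The difference between `df.loc[]` (labels) and `df.iloc[]` (positions) is easiest to see when labels and positions disagree. The sketch below assumes a three-row stand-in frame that keeps the original row labels 0, 2, and 4 from the `df.head()` output; the frame and variable names are illustrative only.

```python
import pandas as pd

# Stand-in data: three rows from df.head(), keeping their original labels,
# so the index labels (0, 2, 4) differ from the positions (0, 1, 2)
sample = pd.DataFrame(
    {
        "Name": ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina", "Allen, Mr. William Henry"],
        "Sex": ["male", "female", "male"],
        "Age": [22.0, 26.0, 35.0],
    },
    index=[0, 2, 4],
)

by_label = sample.loc[2, "Name"]      # row whose index label is 2
by_position = sample.iloc[2]["Name"]  # third row, whatever its label

label_slice = sample.loc[2:4, ["Sex", "Age"]]  # label slices include the end label
pos_slice = sample.iloc[2:4, 1:3]              # position slices exclude the end position

# A string label does not match an integer index, so this raises KeyError
try:
    sample.loc["2"]
except KeyError as err:
    print("KeyError:", err)
```

Here `label_slice` has two rows (labels 2 and 4) while `pos_slice` has only one (position 2), even though both slices are written as `2:4`.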


## 🔎 Step 3: Filtering Rows

Functions to explore:
- df[df["Age"] > 30]
- df.query("Sex == 'female' and Survived == 1")


In [None]:

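# Example: Filtering data
# Passengers older than 50 (printed explicitly, since a cell only displays its last expression)
print(df[df["Age"] > 50].head())
# Female passengers who survived
df.query("Sex == 'female' and Survived == 1").head()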
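
Boolean masks and `query()` are two spellings of the same filter. The sketch below assumes a stand-in frame built from the five rows of the `df.head()` output and shows that both forms select the same rows; `isin` is included as one more common filtering tool, not something the exercise list requires.

```python
import pandas as pd

# Stand-in data: a few columns from the five rows shown by df.head()
sample = pd.DataFrame({
    "Sex": ["male", "female", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Survived": [0, 1, 1, 1, 0],
    "Pclass": [3, 1, 3, 1, 3],
})

# Single condition: a boolean Series used as a row mask
over_30 = sample[sample["Age"] > 30]

# Combined conditions: use & / | and wrap each condition in parentheses
mask_result = sample[(sample["Sex"] == "female") & (sample["Survived"] == 1)]

# The same filter written as a query string
query_result = sample.query("Sex == 'female' and Survived == 1")

# Membership test against a list of allowed values
upper_classes = sample[sample["Pclass"].isin([1, 2])]

print(mask_result.equals(query_result))
```

The parentheses in the mask version matter: `&` binds more tightly than `==`, so leaving them out raises an error or silently changes the comparison.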



## 🧹 Step 4: Handling Missing Data

Functions to explore:
- df.isna()
- df.isna().sum()
- df.dropna()
- df.fillna()


In [None]:

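# Example: Check missing values
# (printed explicitly so the counts appear even though this is not the cell's last expression)
print(df.isna().sum())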

# Fill missing ages with median
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Age"].head()
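
A short sketch comparing `dropna()` and `fillna()` on a stand-in frame. The missing `Cabin` values mirror the `df.head()` output above, but the two missing `Age` values are invented here purely to have something to fill.

```python
import numpy as np
import pandas as pd

# Stand-in data: Cabin gaps follow df.head(); the Age gaps are hypothetical
sample = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0, np.nan],
    "Cabin": [None, "C85", None, "C123", None],
    "Embarked": ["S", "C", "S", "S", "S"],
})

missing_counts = sample.isna().sum()          # missing values per column
complete_rows = sample.dropna()               # keep rows with no missing value at all
age_known = sample.dropna(subset=["Age"])     # only require Age to be present

# Fill each column with its own replacement value
filled = sample.fillna({"Age": sample["Age"].median(), "Cabin": "Unknown"})

print(missing_counts)
print(len(complete_rows), len(age_known))
```

Dropping every incomplete row keeps only one row here, which is why a column-specific `subset` or a fill value is often the gentler choice.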



## 📊 Step 5: Grouping and Aggregating

Functions to explore:
- df.groupby("Sex")["Survived"].mean()
- df["Fare"].mean()
- df["Age"].median()


In [None]:

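# Example: Aggregation
# Survival rate by sex (printed explicitly, since only the last expression is displayed)
print(df.groupby("Sex")["Survived"].mean())
# Average fare by passenger class
df.groupby("Pclass")["Fare"].mean()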
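
Beyond a single `mean()`, a group can be summarised with several statistics at once. This sketch assumes a stand-in frame built from the five `df.head()` rows, so the rates it prints describe only those five passengers, not the full dataset.

```python
import pandas as pd

# Stand-in data: a few columns from the five rows shown by df.head()
sample = pd.DataFrame({
    "Sex": ["male", "female", "female", "female", "male"],
    "Pclass": [3, 1, 3, 1, 3],
    "Survived": [0, 1, 1, 1, 0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
})

# Mean of a 0/1 column is the share of 1s, i.e. the survival rate per group
survival_by_sex = sample.groupby("Sex")["Survived"].mean()

# Several statistics for one column, one row per class
fare_stats = sample.groupby("Pclass")["Fare"].agg(["mean", "min", "max", "count"])

# Named aggregation: choose the output column names yourself
by_sex = sample.groupby("Sex").agg(
    survival_rate=("Survived", "mean"),
    passengers=("Survived", "size"),
)

print(fare_stats)
print(by_sex)
```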



## 📈 Step 6: Sorting and Counting

Functions to explore:
- df.sort_values("Age")
- df["Sex"].unique()
- df["Pclass"].value_counts()


In [None]:

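# Example: Sorting and counting
# Youngest passengers first (printed explicitly, since only the last expression is displayed)
print(df.sort_values("Age").head())
# Number of passengers in each class
df["Pclass"].value_counts()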
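
A sketch of sorting on more than one column and of turning counts into proportions, again on a stand-in frame taken from the five `df.head()` rows.

```python
import pandas as pd

# Stand-in data: a few columns from the five rows shown by df.head()
sample = pd.DataFrame({
    "Sex": ["male", "female", "female", "female", "male"],
    "Pclass": [3, 1, 3, 1, 3],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
})

# Sort by class ascending, then by age descending within each class
by_class_age = sample.sort_values(["Pclass", "Age"], ascending=[True, False])

sexes = sample["Sex"].unique()                             # distinct values, in order of appearance
class_counts = sample["Pclass"].value_counts()             # most frequent class first
class_share = sample["Pclass"].value_counts(normalize=True)  # proportions instead of counts

print(by_class_age)
print(class_counts)
```

`sort_values` returns a new DataFrame and keeps the original row labels, so the index of `by_class_age` shows where each row came from.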



## ⚙️ Step 7: Creating or Modifying Columns

Functions to explore:
- df.assign()
- df.apply()
- df["new_col"] = ...
- pd.concat()
- pd.merge()


In [None]:

# Example: Create new column
df["Fare_per_Age"] = df["Fare"] / df["Age"]
df[["Age", "Fare", "Fare_per_Age"]].head()
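
The other functions in this step can be sketched as follows. Everything here is illustrative: the `FamilySize` and `AgeGroup` columns, the age threshold of 30, and the `class_names` lookup table are assumptions made for the example, not part of the dataset.

```python
import pandas as pd

# Stand-in data: rows 1-3 and rows 4-5 from df.head(), split into two frames
sample = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Pclass": [3, 1, 3],
    "Age": [22.0, 38.0, 26.0],
    "SibSp": [1, 1, 0],
    "Parch": [0, 0, 0],
})
extra = pd.DataFrame({
    "PassengerId": [4, 5],
    "Pclass": [1, 3],
    "Age": [35.0, 35.0],
    "SibSp": [1, 0],
    "Parch": [0, 0],
})

# assign() returns a new DataFrame and leaves `sample` unchanged
with_family = sample.assign(FamilySize=lambda d: d["SibSp"] + d["Parch"] + 1)

# apply() runs a function on each value of a Series (threshold is hypothetical)
with_family["AgeGroup"] = with_family["Age"].apply(
    lambda age: "30+" if age >= 30 else "under 30"
)

# merge() joins on a shared key column, like a SQL join (lookup table is hypothetical)
class_names = pd.DataFrame({"Pclass": [1, 2, 3], "ClassName": ["First", "Second", "Third"]})
merged = with_family.merge(class_names, on="Pclass", how="left")

# concat() stacks frames with the same columns; ignore_index renumbers the rows
combined = pd.concat([sample, extra], ignore_index=True)

print(merged)
print(combined)
```

For simple arithmetic like `FamilySize`, the vectorised column expression is usually faster than `apply`, which is better kept for logic that has no vectorised form.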



## 💾 Step 8: Exporting Data

Function to explore:
- df.to_csv("output.csv", index=False)


In [None]:

# Example: Save cleaned data
df.to_csv("titanic_cleaned.csv", index=False)
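
To see what `index=False` changes, the sketch below writes a tiny stand-in frame to an in-memory buffer (so no file is created) and reads it back both ways. The two-row frame and the buffer names are assumptions for illustration.

```python
import io

import pandas as pd

# Stand-in data: two rows from df.head()
sample = pd.DataFrame({"PassengerId": [1, 2], "Fare": [7.25, 71.2833]})

# Without the index: the CSV holds only the real columns
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
reloaded = pd.read_csv(io.StringIO(csv_text))

# With the index (the default): the row labels become an extra, unnamed column
buffer_with_index = io.StringIO()
sample.to_csv(buffer_with_index)
reloaded_with_index = pd.read_csv(io.StringIO(buffer_with_index.getvalue()))

print(csv_text.splitlines()[0])
print(list(reloaded_with_index.columns))
```

Reading back a file written with the default settings produces an `Unnamed: 0` column, which is the usual reason to pass `index=False` when the index carries no information.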



## 🧩 Step 9: Summary

Reflect on what you learned:
- Which functions were most useful?
- What insights did you gain from the Titanic dataset?
