# Task 01 - Track B: Python Basics & Advanced Dataset Exploration

**Course:** Database Applications Development  
**Lesson:** 01 - Introduction to JupyterLab & Python Fundamentals  

---

## Instructions

Complete all exercises in this notebook. Each section includes example code to help guide you. Read the examples carefully, then complete the exercises below them.

**Track B** includes all Track A exercises PLUS additional challenges and deeper analysis.

**Submission:**
1. Save this notebook as `dbAppsTask01TrackB.ipynb`
2. Add, commit, and push to your `databaseApplications` repository on GitHub
3. Verify the file appears correctly on GitHub

---

## Titanic Dataset - Data Dictionary

You'll be working with the Titanic dataset throughout this task. Here's what each column means:

| Column | Description | Data Type | Example Values |
|--------|-------------|-----------|----------------|
| **pclass** | Passenger class (ticket class) | Integer | 1 = First Class<br>2 = Second Class<br>3 = Third Class |
| **survived** | Survival status | Integer | 0 = Did not survive<br>1 = Survived |
| **name** | Passenger's full name | String | "Braund, Mr. Owen Harris" |
| **sex** | Gender | String | "male" or "female" |
| **age** | Age in years | Float | 22.0, 38.0, 26.0 |
| **sibsp** | Number of siblings/spouses aboard | Integer | 0, 1, 2, etc. |
| **parch** | Number of parents/children aboard | Integer | 0, 1, 2, etc. |
| **ticket** | Ticket number | String | "A/5 21171", "PC 17599" |
| **fare** | Passenger fare (ticket price) | Float | 7.25, 71.28, 8.05 |
| **cabin** | Cabin number | String | "C85", "E46", "B96 B98" |
| **embarked** | Port of embarkation | String | C = Cherbourg<br>Q = Queenstown<br>S = Southampton |
| **boat** | Lifeboat number | String | "13", "4", "D" |
| **body** | Body identification number | Integer (loaded as float64 because of missing values) | Recorded only for victims whose bodies were recovered |
| **home.dest** | Home/destination | String | "New York, NY", "Montreal, PQ" |

**Note:** Some columns may have missing values (NaN = Not a Number), which is common in real-world datasets.
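
As a rough illustration of the note above, the sketch below shows how a missing value appears in pandas. The ages are made-up values rather than rows from the Titanic file, and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical ages; None stands in for an entry that was never recorded
ages = pd.Series([22.0, None, 26.0])

# pandas stores the missing entry as NaN, which isna() flags as True
print(ages.isna())

# Summing the True/False flags counts the missing entries
missing_count = int(ages.isna().sum())
print("Missing values:", missing_count)
```

The same `isna()` check works on a whole DataFrame, one column at a time.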

---

## Part 1: Variables and Data Types

Variables store data for later use. Python has several built-in data types.

### Example: Creating Variables

In [1]:
# Example of different data types
student_name = "Alice Johnson"  # String (text)
student_age = 17                # Integer (whole number)
student_gpa = 3.75              # Float (decimal number)
is_passing = True               # Boolean (True/False)

# Print with descriptive labels
print("Name:", student_name)
print("Age:", student_age)
print("GPA:", student_gpa)
print("Passing:", is_passing)

Name: Alice Johnson
Age: 17
GPA: 3.75
Passing: True


### Exercise 1.1: Create Your Own Variables

Create the following variables with your own information:
- `my_name` - your full name (string)
- `my_age` - your age (integer)
- `my_gpa` - a GPA value (float)
- `enrolled` - whether you're enrolled in this course (boolean)

Print each variable with a descriptive label.

In [2]:
# Your code here
my_name = "Gavin Waibel"
my_age = 17
my_gpa = 4.00
enrolled = True

# Print statements
print(f"{my_name} is {my_age} and is {enrolled} (enrolled) with a GPA of {my_gpa}")

Gavin Waibel is 17 and is True (enrolled) with a GPA of 4.0


### Example: Checking Data Types

In [3]:
# Use type() to check what kind of data is stored
course_name = "Database Applications"
course_code = 145085

print("Type of course_name:", type(course_name))  # <class 'str'>
print("Type of course_code:", type(course_code))  # <class 'int'>

Type of course_name: <class 'str'>
Type of course_code: <class 'int'>


### Exercise 1.2: Check Data Types

Use the `type()` function to check the data type of each variable you created in Exercise 1.1. Print the results.

In [4]:
# Your code here
my_name = "Gavin Waibel"
my_age = 17
my_gpa = 4.00
enrolled = True

print("Type of my_name:", type(my_name))
print("Type of my_age:", type(my_age))
print("Type of my_gpa:", type(my_gpa))
print("Type of enrolled:", type(enrolled))

Type of my_name: <class 'str'>
Type of my_age: <class 'int'>
Type of my_gpa: <class 'float'>
Type of enrolled: <class 'bool'>


### Example: Type Conversion (Track B)

In [5]:
# Sometimes you need to convert between data types
number_as_string = "42"
print("Original type:", type(number_as_string))  # <class 'str'>

# Convert string to integer
number_as_int = int(number_as_string)
print("Converted type:", type(number_as_int))    # <class 'int'>

# Now we can do math with it
result = number_as_int + 8
print("Result:", result)  # 50

Original type: <class 'str'>
Converted type: <class 'int'>
Result: 50
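
Conversion does not always succeed. The sketch below, which uses a made-up fare string, shows two details worth knowing: `int()` drops the decimal part of a float, and `int()` rejects a string that contains a decimal point.

```python
# A made-up fare stored as text
price_text = "7.25"

# float() understands decimal strings
price = float(price_text)

# int() on a float drops everything after the decimal point
whole = int(price)
print("Float:", price, "Truncated:", whole)

# int() on the original string fails because "7.25" is not a whole number
conversion_failed = False
try:
    int(price_text)
except ValueError as error:
    conversion_failed = True
    print("Could not convert:", error)
```

When a string might hold a decimal, converting with `float()` first avoids the error.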


### Exercise 1.3 (Track B): Type Conversion

Create a variable `grade_string = "87"` (a string).

1. Convert it to an integer and store it in `grade_int`
2. Add 5 points to it
3. Print the result with a descriptive message

In [6]:
# Your code here
grade_string = "87"
grade_int = int(grade_string)
print("Grade's new data type and value: ", grade_int, " ", type(grade_int))
grade_int += 5
print("Grade's new value after +5: " , grade_int)

Grade's new data type and value:  87   <class 'int'>
Grade's new value after +5:  92


---

## Part 2: Basic Operations

Python can perform calculations and manipulate strings.

### Example: Mathematical Operations

In [7]:
# Calculate total cost for movie tickets
ticket_price = 10.50
number_of_tickets = 4

total_cost = ticket_price * number_of_tickets

print("Total cost for", number_of_tickets, "tickets: $", total_cost)

Total cost for 4 tickets: $ 42.0


### Exercise 2.1: Calculate Ticket Prices

A theater charges:
- Adult tickets: $12.50
- Child tickets: $7.50

Calculate the total cost for a family with:
- 2 adults
- 3 children

Store the result in a variable called `total_cost` and print it with a descriptive message.

In [8]:
# Your code here
adult_price = 12.50
child_price = 7.50
num_adults = 2
num_children = 3

total_cost = (adult_price * num_adults) + (child_price * num_children)

print("Total cost for 2 adults and 3 children: $", total_cost)


Total cost for 2 adults and 3 children: $ 47.5


### Example: String Concatenation

In [9]:
# Combining strings together
first = "John"
last = "Smith"
age = 25

# Method 1: Using + operator
full_name = first + " " + last
print(full_name)

# Method 2: Using f-strings (formatted string literals)
message = f"Hello! My name is {first} {last} and I am {age} years old."
print(message)

John Smith
Hello! My name is John Smith and I am 25 years old.
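
f-strings can also control how numbers are displayed. As a small sketch reusing the ticket figures from the earlier example, the `:.2f` format specifier prints a float with exactly two decimal places, which suits currency.

```python
total_cost = 42.0
number_of_tickets = 4

# :.2f formats the float with two digits after the decimal point
message = f"Total cost for {number_of_tickets} tickets: ${total_cost:.2f}"
print(message)  # Total cost for 4 tickets: $42.00
```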


### Exercise 2.2: String Concatenation

Create three variables:
- `first_name`
- `last_name`
- `favorite_subject`

Combine them to create and print a sentence like:  
"Hello! My name is [first] [last] and my favorite subject is [subject]."

You can use either the + operator or f-strings.

In [10]:
# Your code here
first_name = "Gavin"
last_name = "Waibel"
favorite_subject = "SEWD"

sentence = f"Hello! My name is {first_name} {last_name} and my favorite subject is {favorite_subject}."

print(sentence)


Hello! My name is Gavin Waibel and my favorite subject is SEWD.


### Example: Integer Division and Modulus (Track B)

In [11]:
# Integer division (//) gives you the whole number part
# Modulus (%) gives you the remainder

total_students = 47
students_per_team = 5

full_teams = total_students // students_per_team  # How many complete teams?
leftover = total_students % students_per_team     # How many students left over?

print(f"We can make {full_teams} full teams with {leftover} students left over.")

We can make 9 full teams with 2 students left over.
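
Python's built-in `divmod()` returns the quotient and the remainder in one step. The sketch below repeats the team calculation with it and checks that the two results recombine into the original total.

```python
total_students = 47
students_per_team = 5

# divmod(a, b) returns the pair (a // b, a % b)
full_teams, leftover = divmod(total_students, students_per_team)
print(f"{full_teams} full teams, {leftover} left over")  # 9 full teams, 2 left over

# The quotient and remainder always rebuild the original number
rebuilt = full_teams * students_per_team + leftover
print(rebuilt == total_students)  # True
```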


### Exercise 2.3 (Track B): Advanced Calculations

Calculate the following:

1. If a student scores 87% on a test worth 150 points, how many points did they earn? (Use regular division, then convert to int)
2. A dataset has 1309 rows. If you split it into batches of 50, how many full batches do you get? (Use `//`)
3. How many remaining rows don't fit in a full batch? (Use modulus `%`)

In [12]:
# Your code here
test_value = int(0.87 * 150)
batch_num = 1309 // 50
remainder = 1309 % 50
print(test_value)
print(batch_num)
print(remainder)

130
26
9


### Example: String Methods (Track B)

In [13]:
# Strings have built-in methods for manipulation
text = "hello world"

print(text.upper())       # HELLO WORLD
print(text.title())       # Hello World
print(text.replace("world", "Python"))  # hello Python
print(text.count("l"))    # 3 (counts how many times 'l' appears)

HELLO WORLD
Hello World
hello Python
3
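
Another commonly used string method is `split()`, which breaks a string into pieces. As a sketch, the code below takes the example name from the data dictionary and separates the surname, title, and given names. It assumes the name follows the "Surname, Title. Given names" pattern, which may not hold for every row.

```python
full_name = "Braund, Mr. Owen Harris"

# Split once at the comma to separate the surname from the rest
surname, rest = full_name.split(", ", 1)

# Split the remainder once at the first period to isolate the title
title, given_names = rest.split(". ", 1)

print("Surname:", surname)          # Surname: Braund
print("Title:", title)              # Title: Mr
print("Given names:", given_names)  # Given names: Owen Harris
```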


### Exercise 2.4 (Track B): String Methods

Create a variable `passenger_name = "smith, mr. john"`. 

Use string methods to:
1. Convert it to uppercase
2. Convert it to title case (first letter of each word capitalized)
3. Replace "mr." with "Mr."
4. Count how many times the letter 'm' appears (lowercase)

In [14]:
# Your code here
passenger_name = "smith, mr. john"
pass_name_upper = passenger_name.upper()
pass_name_title = passenger_name.title()
pass_name_replace = passenger_name.replace("mr.", "Mr.")
m_count = passenger_name.count("m")

print(pass_name_upper)
print(pass_name_title)
print(pass_name_replace)
print(m_count)

SMITH, MR. JOHN
Smith, Mr. John
smith, Mr. john
2


---

## Part 3: Working with Comments

Comments explain what your code does. They're ignored by Python but help humans understand your work.

### Example: Using Comments

In [15]:
# Calculate the area of a circle
# Formula: area = π × radius²

radius = 5          # Circle radius in centimeters
pi = 3.14159        # Approximation of π (pi)

area = pi * radius ** 2  # ** means "to the power of"

print("Area of circle:", area, "square cm")

Area of circle: 78.53975 square cm


### Exercise 3.1: Add Comments

The code below calculates the area of a rectangle. Add comments to explain each step.

In [16]:
# Calculate the area of a rectangle
# Formula: area = length * width

length = 10                  # Length of the rectangle
width = 5                    # Width of the rectangle
area = length * width        # Multiply length by width to get the area
print("The area is:", area)  # Print the result


The area is: 50


### Exercise 3.2 (Track B): Write Documented Code

Write code that calculates the perimeter of a rectangle.

**Formula:** perimeter = 2 × length + 2 × width

Use variables for length and width, and add comments explaining each step.

In [17]:
# Your code here
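# Calculate the perimeter of a rectangle
# Formula: perimeter = 2 × length + 2 × width

length = 2                              # Length of the rectangle
width = 2                               # Width of the rectangle
perimeter = (2 * length) + (2 * width)  # Add twice the length to twice the width

print(perimeter)                        # Print the result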

8


---

## Part 4: Introduction to Pandas

Pandas is Python's most powerful library for working with data. We'll use it to load and explore the Titanic dataset.

### Example: Importing Pandas

In [18]:
# Import pandas with the standard alias 'pd'
import pandas as pd

# Now we can use pandas functions by typing pd.function_name()
print("Pandas version:", pd.__version__)

Pandas version: 2.2.3


### Exercise 4.1: Import Pandas

Import the pandas library using the standard alias `pd`.

In [19]:
# Your code here
import pandas as pd

print(pd.__version__)

2.2.3


### Example: Loading a CSV File

In [20]:
# Load a CSV file into a DataFrame
# Replace 'sample.csv' with your actual filename
data = pd.read_csv('sample.csv')

# The data is now stored in a DataFrame object
# A DataFrame is like a spreadsheet - it has rows and columns
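
To try `read_csv` without a file on disk, the sketch below reads CSV text held in memory through `io.StringIO`. The two rows are made-up sample data, not taken from the Titanic file.

```python
import io

import pandas as pd

# Made-up CSV text; the empty age in the second row becomes NaN
csv_text = """name,age,fare
Passenger A,29.0,211.34
Passenger B,,151.55
"""

# read_csv accepts any file-like object, not just a filename
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)    # (rows, columns)
print(sample.dtypes)   # data type chosen for each column
```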

### Exercise 4.2: Load the Titanic Dataset

Use pandas to read the `Titanic_Dataset.csv` file into a DataFrame called `titanic`.

**Note:** Make sure the CSV file is in the same folder as this notebook!

In [21]:
# Your code here
titanic = pd.read_csv('Titanic_Dataset.csv')


### Example: Viewing the First Rows

In [22]:
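# View the first 5 rows of any DataFrame
# (`data` is a placeholder name; it must be loaded first, or this cell raises the NameError shown below)
data.head()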

# You can specify how many rows to show
data.head(10)  # Shows first 10 rows

NameError: name 'data' is not defined

### Exercise 4.3: View the First Rows

Display the first 10 rows of the Titanic dataset using the `.head(10)` method.

In [23]:
# Your code here
titanic.head(10)

| | pclass | survived | name | sex | age | sibsp | parch | ticket | fare | cabin | embarked | boat | body | home.dest |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | Allen, Miss. Elisabeth Walton | female | 29.0 | 0 | 0 | 24160 | 211.3375 | B5 | S | 2 | NaN | St Louis, MO |
| 1 | 1 | 1 | Allison, Master. Hudson Trevor | male | 0.92 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | 11 | NaN | Montreal, PQ / Chesterville, ON |
| 2 | 1 | 0 | Allison, Miss. Helen Loraine | female | 2.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | NaN | NaN | Montreal, PQ / Chesterville, ON |
| 3 | 1 | 0 | Allison, Mr. Hudson Joshua Creighton | male | 30.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | NaN | 135.0 | Montreal, PQ / Chesterville, ON |
| 4 | 1 | 0 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | NaN | NaN | Montreal, PQ / Chesterville, ON |
| 5 | 1 | 1 | Anderson, Mr. Harry | male | 48.0 | 0 | 0 | 19952 | 26.55 | E12 | S | 3 | NaN | New York, NY |
| 6 | 1 | 1 | Andrews, Miss. Kornelia Theodosia | female | 63.0 | 1 | 0 | 13502 | 77.9583 | D7 | S | 10 | NaN | Hudson, NY |
| 7 | 1 | 0 | Andrews, Mr. Thomas Jr | male | 39.0 | 0 | 0 | 112050 | 0.0 | A36 | S | NaN | NaN | Belfast, NI |
| 8 | 1 | 1 | Appleton, Mrs. Edward Dale (Charlotte Lamson) | female | 53.0 | 2 | 0 | 11769 | 51.4792 | C101 | S | D | NaN | Bayside, Queens, NY |
| 9 | 1 | 0 | Artagaveytia, Mr. Ramon | male | 71.0 | 0 | 0 | PC 17609 | 49.5042 | NaN | C | NaN | 22.0 | Montevideo, Uruguay |


### Example: Dataset Information

In [None]:
# Get information about the DataFrame structure
data.info()

# This shows:
# - Number of rows and columns
# - Column names
# - Data types of each column
# - How many non-null (non-missing) values in each column

### Exercise 4.4: Dataset Information

Use the `.info()` method to see the structure of the Titanic dataset.

In [24]:
# Your code here
titanic.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1309 entries, 0 to 1308
Data columns (total 14 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   pclass     1309 non-null   int64  
 1   survived   1309 non-null   int64  
 2   name       1309 non-null   object 
 3   sex        1309 non-null   object 
 4   age        1046 non-null   float64
 5   sibsp      1309 non-null   int64  
 6   parch      1309 non-null   int64  
 7   ticket     1309 non-null   object 
 8   fare       1308 non-null   float64
 9   cabin      295 non-null    object 
 10  embarked   1307 non-null   object 
 11  boat       486 non-null    object 
 12  body       121 non-null    float64
 13  home.dest  745 non-null    object 
dtypes: float64(3), int64(4), object(7)
memory usage: 143.3+ KB


### Exercise 4.5: Answer Questions

Based on the `.info()` output, answer these questions:

1. How many rows (passengers) are in the dataset?
2. How many columns are in the dataset?
3. What is the data type of the 'age' column?
4. What is the data type of the 'survived' column?
5. Which columns have missing (null) values?

**Your Answers:**

1. Number of rows: 1309
2. Number of columns: 14
3. Data type of 'age': float64
4. Data type of 'survived': int64
5. Columns with missing values: age, fare, cabin, embarked, boat, body, home.dest

---

## Part 5: Basic DataFrame Exploration

Let's explore the Titanic dataset using pandas methods.

### Example: Common DataFrame Methods

In [None]:
# Useful DataFrame methods:

data.tail()       # View last rows
data.columns      # Get column names
data.shape        # Get (rows, columns) as a tuple
data.describe()   # Get statistics for numerical columns

### Exercise 5.1: View Last Rows

Display the last 10 rows of the dataset using the `.tail(10)` method.

In [25]:
# Your code here
titanic.tail(10)

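| | pclass | survived | name | sex | age | sibsp | parch | ticket | fare | cabin | embarked | boat | body | home.dest |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1299 | 3 | 0 | Yasbeck, Mr. Antoni | male | 27.0 | 1 | 0 | 2659 | 14.4542 | NaN | C | C | NaN | NaN |
| 1300 | 3 | 1 | Yasbeck, Mrs. Antoni (Selini Alexander) | female | 15.0 | 1 | 0 | 2659 | 14.4542 | NaN | C | NaN | NaN | NaN |
| 1301 | 3 | 0 | Youseff, Mr. Gerious | male | 45.5 | 0 | 0 | 2628 | 7.225 | NaN | C | NaN | 312.0 | NaN |
| 1302 | 3 | 0 | Yousif, Mr. Wazli | male | NaN | 0 | 0 | 2647 | 7.225 | NaN | C | NaN | NaN | NaN |
| 1303 | 3 | 0 | Yousseff, Mr. Gerious | male | NaN | 0 | 0 | 2627 | 14.4583 | NaN | C | NaN | NaN | NaN |
| 1304 | 3 | 0 | Zabour, Miss. Hileni | female | 14.5 | 1 | 0 | 2665 | 14.4542 | NaN | C | NaN | 328.0 | NaN |
| 1305 | 3 | 0 | Zabour, Miss. Thamine | female | NaN | 1 | 0 | 2665 | 14.4542 | NaN | C | NaN | NaN | NaN |
| 1306 | 3 | 0 | Zakarian, Mr. Mapriededer | male | 26.5 | 0 | 0 | 2656 | 7.225 | NaN | C | NaN | 304.0 | NaN |
| 1307 | 3 | 0 | Zakarian, Mr. Ortin | male | 27.0 | 0 | 0 | 2670 | 7.225 | NaN | C | NaN | NaN | NaN |
| 1308 | 3 | 0 | Zimmerman, Mr. Leo | male | 29.0 | 0 | 0 | 315082 | 7.875 | NaN | S | NaN | NaN | NaN |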


### Exercise 5.2: Get Column Names

Print all the column names in the dataset using `.columns`.

In [27]:
# Your code here
titanic.columns

Index(['pclass', 'survived', 'name', 'sex', 'age', 'sibsp', 'parch', 'ticket',
       'fare', 'cabin', 'embarked', 'boat', 'body', 'home.dest'],
      dtype='object')

### Exercise 5.3: Get Dataset Shape

Print the shape (rows, columns) of the dataset using `.shape`.

**Hint:** This will return a tuple like (1309, 14) meaning 1309 rows and 14 columns.

In [28]:
# Your code here
titanic.shape

(1309, 14)

### Example: Descriptive Statistics

In [None]:
# Get basic statistics about numerical columns
data.describe()

# This shows:
# count - number of non-missing values
# mean - average value
# std - standard deviation (measure of spread)
# min - minimum value
# 25% - first quartile
# 50% - median (middle value)
# 75% - third quartile
# max - maximum value
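
To see how these statistics relate to each other, the sketch below runs `describe()` on five made-up fares. The values are chosen so that one large fare pulls the mean well above the median, a pattern the Titanic fares also show.

```python
import pandas as pd

# Made-up fares with one very large value
fares = pd.Series([5.0, 10.0, 15.0, 20.0, 100.0])

summary = fares.describe()
print(summary)

# The 50% row is the median: the middle value once the data is sorted
print("Median:", summary["50%"])

# The mean is pulled upward by the single large fare
print("Mean:", summary["mean"])
```

Comparing the mean with the median is a quick way to spot skewed data.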

### Exercise 5.4: Basic Statistics

Use the `.describe()` method to see basic statistics about numerical columns in the Titanic dataset.

In [29]:
# Your code here
titanic.describe()

| | pclass | survived | age | sibsp | parch | fare | body |
|---|---|---|---|---|---|---|---|
| count | 1309.0 | 1309.0 | 1046.0 | 1309.0 | 1309.0 | 1308.0 | 121.0 |
| mean | 2.294882 | 0.381971 | 29.881138 | 0.498854 | 0.385027 | 33.295479 | 160.809917 |
| std | 0.837836 | 0.486055 | 14.413493 | 1.041658 | 0.86556 | 51.758668 | 97.696922 |
| min | 1.0 | 0.0 | 0.17 | 0.0 | 0.0 | 0.0 | 1.0 |
| 25% | 2.0 | 0.0 | 21.0 | 0.0 | 0.0 | 7.8958 | 72.0 |
| 50% | 3.0 | 0.0 | 28.0 | 0.0 | 0.0 | 14.4542 | 155.0 |
| 75% | 3.0 | 1.0 | 39.0 | 1.0 | 0.0 | 31.275 | 256.0 |
| max | 3.0 | 1.0 | 80.0 | 8.0 | 9.0 | 512.3292 | 328.0 |


### Exercise 5.5: Interpret Statistics

Based on the `.describe()` output, answer these questions:

1. What is the average (mean) age of passengers?
2. What is the maximum fare paid?
3. What percentage of passengers survived? (Hint: Look at the 'survived' column mean - it will be a decimal between 0 and 1)
4. What is the median (50th percentile) fare?
5. What is the standard deviation of age?

**Your Answers:**

1. Average age: 29.88
2. Maximum fare: $512.33
3. Survival rate (as percentage): 38.20%
4. Median fare: $14.45
5. Standard deviation of age: 14.41
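
The survival rate can be read from the mean because the column holds only 0s and 1s: adding the values counts the survivors, and dividing by the number of rows gives the fraction who survived. The sketch below shows this with five made-up flags, then formats the mean reported by `.describe()` as a percentage.

```python
# Made-up survival flags: 2 survivors out of 5 passengers
survived_flags = [1, 0, 0, 1, 0]

# The mean of a 0/1 column is the fraction of 1s
rate = sum(survived_flags) / len(survived_flags)
print("Fraction survived:", rate)  # 0.4

# The same idea applied to the mean of the 'survived' column from .describe()
reported_mean = 0.381971
formatted = f"{reported_mean:.2%}"  # :.2% multiplies by 100 and appends %
print("Survival rate:", formatted)  # Survival rate: 38.20%
```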

---

## Part 6 (Track B): Advanced DataFrame Operations

Perform more advanced exploration and analysis.

### Example: Selecting Columns

In [None]:
# Select a single column (returns a Series)
names = data['name']

# Select multiple columns (returns a DataFrame)
subset = data[['name', 'age', 'sex']]

# Display first few rows
subset.head()
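
The difference between single and double brackets is easy to miss. The sketch below uses a tiny made-up DataFrame to show that `frame['name']` returns a Series, while `frame[['name']]` returns a DataFrame with one column.

```python
import pandas as pd

# Tiny made-up DataFrame for illustration
people = pd.DataFrame({
    "name": ["Passenger A", "Passenger B"],
    "age": [30.0, 41.0],
    "sex": ["female", "male"],
})

one_column = people["name"]    # single brackets: a Series
as_frame = people[["name"]]    # double brackets: a one-column DataFrame

print(type(one_column))
print(type(as_frame))
print(as_frame.shape)          # (2, 1)
```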

### Exercise 6.1: Select a Single Column

Select and display the 'name' column from the Titanic dataset. Show the first 10 entries using `.head(10)`.

In [31]:
# Your code here
names = titanic['name']
names.head(10)

0                      Allen, Miss. Elisabeth Walton
1                     Allison, Master. Hudson Trevor
2                       Allison, Miss. Helen Loraine
3               Allison, Mr. Hudson Joshua Creighton
4    Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
5                                Anderson, Mr. Harry
6                  Andrews, Miss. Kornelia Theodosia
7                             Andrews, Mr. Thomas Jr
8      Appleton, Mrs. Edward Dale (Charlotte Lamson)
9                            Artagaveytia, Mr. Ramon
Name: name, dtype: object

### Exercise 6.2: Select Multiple Columns

Create a new DataFrame containing only these columns: 'name', 'sex', 'age', 'survived'.

Display the first 10 rows.

In [33]:
# Your code here
subset = titanic[['name','sex','age','survived']]
subset.head(10)

| | name | sex | age | survived |
|---|---|---|---|---|
| 0 | Allen, Miss. Elisabeth Walton | female | 29.0 | 1 |
| 1 | Allison, Master. Hudson Trevor | male | 0.92 | 1 |
| 2 | Allison, Miss. Helen Loraine | female | 2.0 | 0 |
| 3 | Allison, Mr. Hudson Joshua Creighton | male | 30.0 | 0 |
| 4 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0 | 0 |
| 5 | Anderson, Mr. Harry | male | 48.0 | 1 |
| 6 | Andrews, Miss. Kornelia Theodosia | female | 63.0 | 1 |
| 7 | Andrews, Mr. Thomas Jr | male | 39.0 | 0 |
| 8 | Appleton, Mrs. Edward Dale (Charlotte Lamson) | female | 53.0 | 1 |
| 9 | Artagaveytia, Mr. Ramon | male | 71.0 | 0 |


### Example: Value Counts

In [None]:
# Count how many times each value appears in a column
data['column_name'].value_counts()

# For example, to count genders:
data['sex'].value_counts()
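
Counts are often easier to compare as proportions. As a sketch on a made-up Series, passing `normalize=True` to `value_counts()` returns each value's share of the total instead of its raw count.

```python
import pandas as pd

# Made-up values for illustration
sexes = pd.Series(["male", "female", "male", "male"])

counts = sexes.value_counts()                 # raw counts
shares = sexes.value_counts(normalize=True)   # fractions that sum to 1

print(counts)
print(shares)
```

Multiplying the shares by 100 turns them into percentages.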

### Exercise 6.3: Count Values

Use `.value_counts()` to answer these questions:

1. How many passengers were in each class (pclass)?
2. How many male vs. female passengers?
3. How many passengers survived vs. died?

In [34]:
# Passenger class distribution
titanic['pclass'].value_counts()

pclass
3    709
1    323
2    277
Name: count, dtype: int64

In [35]:
# Gender distribution
titanic['sex'].value_counts()

sex
male      843
female    466
Name: count, dtype: int64

In [36]:
# Survival distribution
titanic['survived'].value_counts()

survived
0    809
1    500
Name: count, dtype: int64

### Example: Column Statistics

In [None]:
# Calculate specific statistics for a column
data['age'].min()      # Minimum age
data['age'].max()      # Maximum age
data['age'].mean()     # Average age
data['age'].sum()      # Sum of all ages
data['age'].median()   # Median age
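
These methods skip missing values by default, which matters for a column like 'age'. The sketch below uses three made-up ages, one of them missing, to show the default behavior and what happens when `skipna=False` is passed.

```python
import pandas as pd

# Made-up ages with one missing entry
ages = pd.Series([20.0, None, 40.0])

default_mean = ages.mean()               # NaN is ignored: (20 + 40) / 2
strict_mean = ages.mean(skipna=False)    # NaN is kept, so the result is NaN
non_missing = ages.count()               # count() only counts non-missing values

print("Mean (NaN skipped):", default_mean)
print("Mean (NaN kept):", strict_mean)
print("Non-missing values:", non_missing)
```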

### Exercise 6.4: Calculate Specific Statistics

Calculate the following:

1. Minimum age in the dataset
2. Maximum fare in the dataset
3. Total number of siblings/spouses (sum of 'sibsp' column)
4. Average fare for all passengers

In [38]:
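# Your code here
# A cell displays only its last expression, so print each result with a label
print("Minimum age:", titanic['age'].min())
print("Maximum fare:", titanic['fare'].max())
print("Total siblings/spouses:", titanic['sibsp'].sum())
print("Average fare:", titanic['fare'].mean())

Minimum age: 0.17
Maximum fare: 512.3292
Total siblings/spouses: 653
Average fare: 33.29547928134557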

### Example: Checking for Missing Data

In [None]:
# Count missing values in each column
data.isnull().sum()

# This returns how many NaN (missing) values are in each column
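
Raw counts are easier to judge next to the share of rows they represent. The sketch below uses a small made-up DataFrame to count missing values per column, convert them to percentages, and sort so the column with the most gaps comes first.

```python
import pandas as pd

# Made-up data: 'age' is missing twice, 'fare' once
frame = pd.DataFrame({
    "age": [22.0, None, 26.0, None],
    "fare": [7.25, 71.28, None, 8.05],
})

# Number of missing values in each column
missing_counts = frame.isnull().sum()

# The mean of the True/False flags is the fraction missing; times 100 gives percent
missing_percent = frame.isnull().mean() * 100

# Sort so the column with the most missing data is listed first
ranked = missing_counts.sort_values(ascending=False)

print(missing_counts)
print(missing_percent)
print(ranked)
```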

### Exercise 6.5: Check for Missing Data

Use `.isnull().sum()` to count how many missing values exist in each column.

Which columns have the most missing data?

In [40]:
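# Your code here
# Wrap the call in print() so its result is shown even though it is not the last line
print(titanic.isnull().sum())
print("----------------------------")
titanic.info()

pclass          0
survived        0
name            0
sex             0
age           263
sibsp           0
parch           0
ticket          0
fare            1
cabin        1014
embarked        2
boat          823
body         1188
home.dest     564
dtype: int64
----------------------------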
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1309 entries, 0 to 1308
Data columns (total 14 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   pclass     1309 non-null   int64  
 1   survived   1309 non-null   int64  
 2   name       1309 non-null   object 
 3   sex        1309 non-null   object 
 4   age        1046 non-null   float64
 5   sibsp      1309 non-null   int64  
 6   parch      1309 non-null   int64  
 7   ticket     1309 non-null   object 
 8   fare       1308 non-null   float64
 9   cabin      295 non-null    object 
 10  embarked   1307 non-null   object 
 11  boat       486 non-null    object 
 12  body       121 non-null    float64
 13  home.dest  745 non-null    object 
dtypes: float64(3), int64(4), object(7)
memory usage: 143.3+ KB


**Answer:** The columns with the most missing data are body (1188 missing), cabin (1014), boat (823), home.dest (564), and age (263).

### Exercise 6.6: Data Type Identification

Look at the dataset and the data dictionary. Answer these questions:

1. Which columns contain **strings (text)**?
2. Which columns contain **integers**?
3. Which columns contain **floats (decimals)**?
4. Why do you think 'survived' is stored as an integer (0 or 1) instead of a boolean (True/False)?

**Your Answers:**

1. String columns: home.dest, boat, embarked, cabin, ticket, sex, name
2. Integer columns: pclass, survived, sibsp, parch
3. Float columns: age, fare, body
4. Why survived is an integer: it's easier to use in calculations. For example, the sum of the column counts the survivors and the mean gives the survival rate.

---

## Part 7 (Track B): Critical Thinking

Answer these questions based on your exploration.

### Exercise 7.1: Dataset Context

Based on what you've learned about the Titanic dataset, answer these questions:

1. What does the 'pclass' column represent? What values does it contain?
2. What does a 'survived' value of 1 mean? What about 0?
3. Why might the 'cabin' column have so many missing values?
4. What could the 'embarked' column represent? (Hint: Look at the values - they're single letters. Check the data dictionary!)
5. Why might the 'age' column have so many missing values? 

**Your Answers:**

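1. 'pclass' is the passenger's ticket class, a rough indicator of how luxurious their experience was. It contains 1 (First Class), 2 (Second Class), or 3 (Third Class).
2. A 'survived' value of 1 means the passenger survived and 0 means they did not, much like the boolean values True and False.
3. Cabin assignments were probably hard to reconstruct after the ship sank, so many were never recorded.
4. 'embarked' is the port where the passenger boarded: C = Cherbourg, Q = Queenstown, S = Southampton.
5. It is hard to determine the age of passengers who went missing, so many ages were likely never recorded.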

---

## Submission Checklist

Before submitting, make sure you have:

- [ ] Completed all exercises (including Track B sections)
- [ ] Run all cells successfully (no errors)
- [ ] Added your name and date at the top
- [ ] Answered all written questions
- [ ] Saved the notebook as `dbAppsTask01TrackB.ipynb`
- [ ] Pushed to your `databaseApplications` repository on GitHub
- [ ] Verified the file appears on GitHub

**Excellent work on completing the Track B challenges!**