## Step 1: Import Libraries
First, we import the tools we need for data manipulation, visualization, and object-oriented programming.

In [None]:
# Import pandas for data manipulation
import pandas as pd

# Import numpy for numerical operations
import numpy as np

# Import matplotlib for creating plots and visualizations
import matplotlib.pyplot as plt

# Import ABC (Abstract Base Class) and abstractmethod decorator for creating abstract classes
from abc import ABC, abstractmethod

# Make plots show up in the notebook
%matplotlib inline

## Step 2: Making the Data Set
They were in a jam: their data had been lost in a fire, set off by an unfortunate incident involving an elf, a gnome, and a reindeer. So they had to build a new data set by estimating the numbers.
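One way to "take a guess" is to split an estimated total into two parts using a random fraction. The sketch below uses made-up totals and NumPy's `default_rng` generator rather than the `np.random.seed` call in the exercise cell. The names `totals`, `fractions`, `part_a`, and `part_b` are illustrative assumptions, not part of the exercise.

```python
import numpy as np

# Hypothetical totals for three places (made-up numbers for illustration)
totals = np.array([1000, 250, 40])

# A seeded generator produces the same "random" fractions on every run
rng = np.random.default_rng(42)
fractions = rng.random(len(totals))  # one value in [0, 1) per total

# Split each total into two parts; converting to int drops the decimals
part_a = (totals * fractions).astype(int)
part_b = (totals * (1 - fractions)).astype(int)

# Because the decimals are dropped, the parts can fall slightly short of the total
shortfall = totals - (part_a + part_b)
print(part_a, part_b, shortfall)
```

Seeding matters here because everyone who runs the cell then gets the same split, which makes results comparable between runs.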

### Task: Complete the code to create the data set

In [None]:
# Read the denmark_population.csv file into a pandas DataFrame
df = pd.read_csv('denmark_population.csv')

# Create new dataframe with only the required columns (city, admin_name, 2023 population)
the_list = df[['city', 'admin_name', '2023']].copy()

# Remove commas from the 2023 column and convert to numeric
# This is necessary because the CSV file has numbers like "1,234,567"
the_list['2023'] = the_list['2023'].str.replace(',', '').astype(int)

# Calculate 14% of the 2023 population
# Assumption: children aged 1-13 years old represent approximately 14% of the population
total_14_percent = the_list['2023'] * 0.14

# TODO: Set random seed for reproducibility
# Uncomment the line below and replace ___ with 42
# np.random.seed(___)

# Generate random values between 0 and 1 for each row
random_splits = np.random.random(len(the_list))

# TODO: Calculate the 'nice' column
# Multiply total_14_percent by random_splits and convert to integer
# the_list['nice'] = (total_14_percent * ___).astype(int)

# TODO: Calculate the 'naughty' column
# Multiply total_14_percent by (1 - random_splits) and convert to integer
# the_list['naughty'] = (total_14_percent * (1 - ___)).astype(int)

# Save to new CSV file
the_list.to_csv('the_list.csv', index=False)

# Print confirmation and preview the data
print(f"Successfully created the_list.csv with {len(the_list)} rows")
print("\nFirst few rows:")
print(the_list.head())

## Step 3: Now the Work Begins - Using OOP!

<div style="text-align: center;">
    <img src="elf_oop.png" alt="Elf coder" width="40%">
</div>

### What is OOP?
**Object-Oriented Programming (OOP)** helps us organize our code by creating "classes" - think of them as blueprints for creating objects.

**Abstract Base Class (ABC)**: A template that other classes must follow. It says "any class that inherits from me must have these methods!"
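A minimal sketch of that rule in action, using hypothetical `Greeter` and `DanishGreeter` classes that are not part of this notebook: Python refuses to create an object from the blueprint itself, but a subclass that supplies the required method works normally.

```python
from abc import ABC, abstractmethod


class Greeter(ABC):
    """Hypothetical blueprint: every greeter must define greeting()."""

    @abstractmethod
    def greeting(self):
        """Return the greeting text."""

    def greet(self, name):
        # Shared behavior that every subclass inherits
        return f"{self.greeting()}, {name}!"


class DanishGreeter(Greeter):
    def greeting(self):
        return "Hej"


# The blueprint itself cannot be instantiated
try:
    Greeter()
    blueprint_error = None
except TypeError as error:
    blueprint_error = error

print(blueprint_error is not None)       # True
print(DanishGreeter().greet("Rudolph"))  # Hej, Rudolph!
```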

### The Four Pillars of OOP:
1. **Encapsulation** - Bundling data and methods together
2. **Abstraction** - Hiding complex implementation details
3. **Inheritance** - Creating new classes based on existing ones
4. **Polymorphism** - Different classes can respond to the same method call, each in its own way
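The four ideas can be seen together in a small sketch. The `Gift`, `Toy`, and `Book` classes below are hypothetical and exist only for illustration.

```python
class Gift:
    """Hypothetical base class: bundles a name with behavior (encapsulation)."""

    def __init__(self, name):
        self.name = name

    def wrap(self):
        # Callers use wrap() without knowing how the label is built (abstraction)
        return f"[{self.label()}] {self.name}"

    def label(self):
        return "gift"


class Toy(Gift):  # Inheritance: Toy reuses __init__ and wrap from Gift
    def label(self):
        return "toy"


class Book(Gift):
    def label(self):
        return "book"


# Polymorphism: the same wrap() call gives a class-specific result
wrapped = [gift.wrap() for gift in (Toy("train"), Book("atlas"), Gift("socks"))]
print(wrapped)  # ['[toy] train', '[book] atlas', '[gift] socks']
```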

## Step 4: Define Our Classes

We'll create:
1. **ChildList** - An abstract base class (the blueprint)
2. **NiceList** - A class for nice children (inherits from ChildList)
3. **NaughtyList** - A class for naughty children (inherits from ChildList)
4. **ListManager** - A class to manage everything

In [None]:
# ===================================================================
# ABSTRACT BASE CLASS - The Blueprint!
# ===================================================================

class ChildList(ABC):
    """
    Abstract base class for managing lists of children.
    
    This class defines the common structure and behavior that both
    NiceList and NaughtyList must follow.
    
    Attributes:
        data (DataFrame): The pandas DataFrame containing child data
    """
    
    def __init__(self, dataframe):
        """
        Initialize the ChildList with a DataFrame.
        
        Args:
            dataframe: A pandas DataFrame containing the child data
        """
        # Store the dataframe as an instance variable
        self.data = dataframe
    
    @abstractmethod
    def get_list_type(self):
        """
        Abstract method that must be implemented by child classes.
        
        Returns:
            str: The type of list ("nice" or "naughty")
        """
        # This is an abstract method - child classes MUST implement this!
        pass
    
    def sort_by_city(self):
        """
        Sort the data alphabetically by city name.
        
        Returns:
            DataFrame: Sorted DataFrame
        """
        # Use pandas sort_values to sort by the 'city' column
        sorted_data = self.data.sort_values('city')
        return sorted_data
    
    def sort_by_admin(self):
        """
        Sort the data alphabetically by administrative region.
        
        Returns:
            DataFrame: Sorted DataFrame
        """
        # TODO: Sort the data by 'admin_name' column
        # sorted_data = self.data.sort_values(___)
        # return sorted_data
        pass


# ===================================================================
# NICE LIST CLASS - Inherits from ChildList
# ===================================================================

class NiceList(ChildList):
    """
    Class for managing the list of nice children.
    
    Inherits all methods from ChildList and implements the required
    abstract method get_list_type().
    """
    
    def get_list_type(self):
        """
        Return the type of this list.
        
        Returns:
            str: "nice"
        """
        # TODO: Return the string "nice"
        # return ___
        pass


# ===================================================================
# NAUGHTY LIST CLASS - Inherits from ChildList
# ===================================================================

class NaughtyList(ChildList):
    """
    Class for managing the list of naughty children.
    
    Inherits all methods from ChildList and implements the required
    abstract method get_list_type().
    """
    
    def get_list_type(self):
        """
        Return the type of this list.
        
        Returns:
            str: "naughty"
        """
        # TODO: Return the string "naughty"
        # return ___
        pass


# ===================================================================
# LIST MANAGER CLASS - Brings Everything Together!
# ===================================================================

class ListManager:
    """
    Manager class that coordinates both Nice and Naughty lists.
    
    This class handles loading data, creating list objects, and
    providing various analysis and visualization methods.
    
    Attributes:
        dataframe (DataFrame): The main data containing all cities
        nice_list (NiceList): Object managing the nice children
        naughty_list (NaughtyList): Object managing the naughty children
    """
    
    def __init__(self):
        """
        Initialize the ListManager with empty data.
        
        All attributes start as None until load_from_csv is called.
        """
        # Initialize instance variables as None
        self.dataframe = None
        self.nice_list = None
        self.naughty_list = None
    
    def load_from_csv(self, filepath):
        """
        Load data from a CSV file and create NiceList and NaughtyList objects.
        
        Args:
            filepath (str): Path to the CSV file containing the data
        """
        # Read the CSV file into a DataFrame
        self.dataframe = pd.read_csv(filepath)
        
        # Create a NiceList object with the dataframe
        self.nice_list = NiceList(self.dataframe)
        
        # TODO: Create a NaughtyList object with the dataframe
        # self.naughty_list = ___(self.dataframe)
        
        # Print confirmation message
        print(f"Loaded {len(self.dataframe)} cities from CSV")
    
    def show_summary(self):
        """
        Display a summary of the total nice and naughty children.
        
        This method calculates and prints:
        - Total number of cities
        - Total nice children
        - Total naughty children
        - Grand total of all children
        """
        # Sum all values in the 'nice' column
        total_nice = self.dataframe['nice'].sum()
        
        # TODO: Sum all values in the 'naughty' column
        # total_naughty = self.dataframe[___].sum()
        
        # Print formatted output
        print("\n" + "="*70)
        print("SUMMARY")
        print("="*70)
        print(f"Total cities: {len(self.dataframe)}")
        print(f"Total nice children: {total_nice:,}")
        # TODO: Print total naughty children with formatting
        # print(f"Total naughty children: {___:,}")
        # print(f"Total children: {total_nice + ___:,}")
        print()
    
    def show_top_nice_cities(self, num_cities):
        """
        Display the top N cities with the most nice children.
        
        Args:
            num_cities (int): Number of top cities to display
        """
        print(f"\nTop {num_cities} Cities with Nice Children:")
        print("-" * 60)
        
        # Sort DataFrame by 'nice' column in descending order and get top N rows
        top_cities = self.dataframe.sort_values('nice', ascending=False).head(num_cities)
        
        # Iterate through the top cities with enumerate for numbering
        for i, row in enumerate(top_cities.itertuples(), 1):
            print(f"{i}. {row.city} ({row.admin_name}): {row.nice} children")
    
    def show_top_naughty_cities(self, num_cities):
        """
        Display the top N cities with the most naughty children.
        
        Args:
            num_cities (int): Number of top cities to display
        """
        # TODO: Print header
        # print(f"\nTop {___} Cities with Naughty Children:")
        # print("-" * 60)
        
        # TODO: Sort by 'naughty' column in descending order and get top N
        # top_cities = self.dataframe.sort_values(___, ascending=False).head(___)
        
        # TODO: Iterate and print each city with its naughty count
        # for i, row in enumerate(top_cities.itertuples(), 1):
        #     print(f"{i}. {row.city} ({row.admin_name}): {row.___} children")
        pass
    
    def filter_by_region(self, region_name):
        """
        Filter the data to show only cities in a specific region.
        
        Args:
            region_name (str): Name of the administrative region
            
        Returns:
            DataFrame: Filtered DataFrame containing only the specified region
        """
        # Filter the dataframe where admin_name equals the region_name
        region_data = self.dataframe[self.dataframe['admin_name'] == region_name]
        return region_data
    
    def plot_regions_overview(self):
        """
        Create a bar chart comparing nice vs naughty children across all regions.
        
        This method:
        1. Groups data by region
        2. Sums nice and naughty counts for each region
        3. Creates a side-by-side bar chart
        """
        # Group by administrative region and sum the nice and naughty columns
        region_totals = self.dataframe.groupby('admin_name')[['nice', 'naughty']].sum()
        
        # Create a new figure with specified size
        plt.figure(figsize=(12, 6))
        
        # Create x-axis positions for the bars
        x = range(len(region_totals))
        width = 0.35  # Width of each bar
        
        # Create bars for nice children (green)
        plt.bar(x, region_totals['nice'], width, label='Nice', color='green', alpha=0.7)
        
        # Create bars for naughty children (red), shifted by width
        plt.bar([i + width for i in x], region_totals['naughty'], width, 
                label='Naughty', color='red', alpha=0.7)
        
        # Add labels and title
        plt.xlabel('Region')
        plt.ylabel('Number of Children')
        plt.title('Nice vs Naughty Children by Region')
        
        # Set x-axis tick positions and labels (rotated for readability)
        plt.xticks([i + width/2 for i in x], region_totals.index, rotation=45)
        
        # Add legend to identify which bar is which
        plt.legend()
        
        # Adjust layout to prevent label cutoff
        plt.tight_layout()
        
        # Display the plot
        plt.show()
    
    def plot_top_cities_in_region(self, region_name, num_cities):
        """
        Create a bar chart showing top cities in a specific region.
        
        Args:
            region_name (str): Name of the region to analyze
            num_cities (int): Number of top cities to display
        """
        # Get data for the specified region
        region_data = self.filter_by_region(region_name)
        
        # Create a copy to avoid modifying the original
        region_data = region_data.copy()
        
        # TODO: Add a 'total' column that sums nice and naughty
        # region_data['total'] = region_data[___] + region_data[___]
        
        # TODO: Sort by total and get top N cities
        # top_cities = region_data.sort_values(___, ascending=False).head(___)
        
        # Create figure
        plt.figure(figsize=(10, 6))
        
        # Create x positions and bar width
        x = range(len(top_cities))
        width = 0.35
        
        # TODO: Create bars for nice children
        # plt.bar(x, top_cities[___], width, label='Nice', color='green', alpha=0.7)
        
        # TODO: Create bars for naughty children (shifted)
        # plt.bar([i + width for i in x], top_cities[___], width, 
        #         label='Naughty', color='red', alpha=0.7)
        
        # Add labels and formatting
        plt.xlabel('City')
        plt.ylabel('Number of Children')
        plt.title(f'Top {num_cities} Cities in {region_name}')
        plt.xticks([i + width/2 for i in x], top_cities['city'], rotation=45, ha='right')
        plt.legend()
        plt.tight_layout()
        plt.show()

print("Classes defined successfully! ✓")

## Step 5: Use Our Classes!

Now let's create a `ListManager` object and use it to analyze our data.

**Remember:** In OOP, we first create an object from a class, then we can call its methods.
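As a reminder of that create-then-call pattern, here is a tiny sketch with a hypothetical `Sleigh` class (not one of the classes defined above):

```python
class Sleigh:
    """Hypothetical class used only to show the create-then-call pattern."""

    def __init__(self):
        # Each object starts with its own empty list of presents
        self.presents = []

    def load(self, present):
        self.presents.append(present)

    def count(self):
        return len(self.presents)


sleigh = Sleigh()      # 1. Create an object (instance) from the class
sleigh.load("train")   # 2. Call methods on that object
sleigh.load("atlas")
print(sleigh.count())  # 2
```

Note the parentheses after the class name: `Sleigh` is the blueprint, while `Sleigh()` builds an object from it.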

In [None]:
# TODO: Create a ListManager object and store it in a variable called 'manager'
# manager = ___()

# TODO: Load the data from 'the_list.csv'
# manager.___('the_list.csv')

# TODO: Display the summary
# manager.___()

## Step 6: Find Top Cities

Use the methods we created to find the top cities with nice and naughty children.
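Those methods rely on a common pandas pattern: sort in descending order, keep the first N rows, then number the rows while printing. The sketch below shows the pattern on made-up data; the `scores` table and its column names are assumptions for illustration only.

```python
import pandas as pd

# Made-up scores for illustration only
scores = pd.DataFrame({
    "reindeer": ["Dasher", "Dancer", "Prancer", "Vixen"],
    "score": [7, 9, 3, 8],
})

# Sort from highest to lowest, then keep the first two rows
top_two = scores.sort_values("score", ascending=False).head(2)

# itertuples() yields one row at a time; enumerate(..., 1) numbers them from 1
lines = [
    f"{i}. {row.reindeer}: {row.score}"
    for i, row in enumerate(top_two.itertuples(), 1)
]
print("\n".join(lines))
```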

In [None]:
# TODO: Show top 10 cities with nice children
# manager.___(10)

# TODO: Show top 10 cities with naughty children
# manager.___(10)

## Step 7: Using Our Classes to Sort Data

Remember we created `NiceList` and `NaughtyList` classes? Let's use their sorting methods!

This demonstrates **inheritance** - both classes inherited the sorting methods from `ChildList`.

In [None]:
# Sort nice list by city using the sort_by_city() method
nice_by_city = manager.nice_list.sort_by_city()
print("Nice List - Sorted by City (first 10):")
print(nice_by_city[['city', 'admin_name', 'nice']].head(10))

print("\n" + "="*60 + "\n")

# TODO: Sort naughty list by admin region using the sort_by_admin() method
# naughty_by_admin = manager.naughty_list.___()
# print("Naughty List - Sorted by Admin Region (first 10):")
# print(naughty_by_admin[['city', 'admin_name', 'naughty']].head(10))

## Step 8: Filter by Region

Use the `filter_by_region()` method to get data for a specific region.
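Under the hood, this kind of filter is a boolean mask: a comparison yields one `True`/`False` per row, and indexing with it keeps the `True` rows. A minimal sketch on a made-up `workshops` table (hypothetical data and column names):

```python
import pandas as pd

# Made-up rows for illustration only
workshops = pd.DataFrame({
    "elf": ["Alva", "Birk", "Cleo"],
    "team": ["toys", "books", "toys"],
})

# The comparison produces one True/False value per row (a boolean mask)
mask = workshops["team"] == "toys"

# Indexing with the mask keeps only the rows where it is True
toy_team = workshops[mask]
print(toy_team)

# Matching is case-sensitive: a value that matches nothing gives an
# empty DataFrame rather than an error
no_match = workshops[workshops["team"] == "Toys"]
print(len(no_match))  # 0
```

This is worth keeping in mind when a filter returns nothing: a misspelled or differently capitalized region name fails silently.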

In [None]:
# TODO: Get data for Hovedstaden region (Copenhagen area)
# hovedstaden_data = manager.___('Hovedstaden')

# Print the results
# print("Cities in Hovedstaden:")
# print(hovedstaden_data[['city', 'nice', 'naughty']])

## Step 9: Create Visualizations

Let's make some charts to see the data!

Visualization helps us understand patterns in the data that are hard to see in tables.
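Before anything is drawn, a side-by-side bar chart needs two ingredients: one total per group, and x positions that place the two bars next to each other. The sketch below shows both steps on made-up data, without plotting; the `counts` table and its column names are assumptions for illustration.

```python
import pandas as pd

# Made-up counts for illustration only
counts = pd.DataFrame({
    "group": ["north", "south", "north", "south"],
    "red": [1, 2, 3, 4],
    "green": [10, 20, 30, 40],
})

# One row per group, with each selected column summed within the group
totals = counts.groupby("group")[["red", "green"]].sum()

# Side-by-side bars: the second series is shifted right by one bar width,
# and each tick label sits halfway between the two bars
width = 0.35
left_positions = list(range(len(totals)))
right_positions = [i + width for i in left_positions]
tick_positions = [i + width / 2 for i in left_positions]

print(totals)
print(left_positions, right_positions, tick_positions)
```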

In [None]:
# TODO: Plot overall regions comparison
# manager.___()

In [None]:
# TODO: Plot top 5 cities in Hovedstaden
# manager.___('Hovedstaden', 5)

In [None]:
# TODO: Plot top 5 cities in Midtjylland
# manager.___('Midtjylland', 5)

In [None]:
# TODO: Create a loop to plot all regions
# Define a list of all Danish regions
# regions = ['Hovedstaden', 'Midtjylland', 'Syddanmark', 'Nordjylland', 'Sjælland']

# Loop through each region and create a plot
# for region in regions:
#     print(f"\nPlotting {region}...")
#     manager.___(region, 5)

## 🎄 Conclusion

### What We Learned About OOP:

1. **Abstract Base Class (ABC)** - `ChildList` is our blueprint that forces child classes to have certain methods using the `@abstractmethod` decorator

2. **Inheritance** - `NiceList` and `NaughtyList` inherit from `ChildList`, getting all its methods without having to rewrite them

3. **Classes and Objects** - We created the `ListManager` class (blueprint) and then made a `manager` object (instance) from it

4. **Methods** - Functions inside classes that work with the data (like `sort_by_city()`, `plot_regions_overview()`)

5. **Encapsulation** - All our data and methods are organized together in classes, keeping related functionality bundled

6. **Docstrings** - Documentation strings that explain what each class and method does

### Benefits of OOP in This Project:
- **Code Organization**: Related data and functions are grouped together
- **Reusability**: Methods like `sort_by_city()` work for both nice and naughty lists
- **Maintainability**: If we need to change how sorting works, we only change it in one place
- **Scalability**: Easy to add new types of lists or new methods

### Key Findings:
- Copenhagen has the most children overall (both nice and naughty!)
- Different regions show different patterns of nice vs naughty
- Aarhus has lots of nice children 🎅

### Challenge Questions:
1. How would you add a method to find the city with the highest ratio of nice to naughty children?
2. Can you create a new class that inherits from `ChildList` for children who are "undecided"?
3. How would you modify the code to track children by age group instead of just nice/naughty?

**Merry Christmas! 🎅🎄**

<div style="text-align: center;">
    <img src="santa_ride.png" alt="santa_rides" width="60%">
</div>