## 🎯 Welcome! Your Journey Starts Here

Welcome to the world of programming! You're here because you want to solve business problems using data, and Python is one of the most widely used languages for doing just that.

**Why Python for AI and Business?**

Think of Python not as a complex coding language, but as your personal, highly intelligent assistant. You give it instructions, and it performs complex data tasks in seconds.

1.  **The Brain of AI:** Python is the foundation for the tools you'll use every day:
    *   `pandas`: Your go-to for analyzing and cleaning data (like an Excel spreadsheet on steroids).
    *   `scikit-learn`: Your toolkit for building predictive machine learning models.
    *   `numpy`: The engine for high-speed numerical calculations.
2.  **Automate Anything:** From cleaning thousands of customer records to running predictive models, Python turns repetitive work into automated pipelines.
3.  **Community Power:** A massive global community means that if you have a question, someone has very likely already answered it.

In this session, you'll learn the basic grammar of Python. This is the foundation upon which we will build powerful data analysis and machine learning models. Let's get started!

---

# Part 1: In-Class Session (60 Mins) 👨‍🏫

Our goal in this hour is to get you comfortable writing and running your first Python code. We'll focus on the absolute essentials.

## Your First Lines of Code (10 mins)

In a Jupyter notebook, code is organized in "cells". You can run a cell by clicking on it and pressing `Shift + Enter`.

Let's start with the traditional first program.

In [None]:
# This is a comment. Python ignores anything after a '#'.
# Comments help us explain our code to others (and our future selves!).

# The print() function displays output.
print('Hello, Data Scientist!')

### Variables: Storing Your Data

Think of variables as labeled boxes where you store information. In data science, these "boxes" hold everything from a customer's age to a sales figure or a model's accuracy score.

In [None]:
# Storing a customer's age (an integer)
customer_age = 34
print('Customer Age:', customer_age)

# Storing a product price (a float, i.e., a number with decimals)
product_price = 199.99
print('Product Price:', product_price)

# Storing a customer segment (a string, i.e., text)
customer_segment = 'High-Value'
print('Customer Segment:', customer_segment)

::: {.callout-note}
## Why Do Data Types Matter?
Python automatically detects the type of each value: `int` (integer), `float` (decimal number), or `str` (string, i.e. text). This is crucial because the type determines what you can do with the data. You can calculate the average `product_price`, but you can't calculate the average `customer_segment`!
:::
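To see this in practice, the built-in `type()` function reports the type Python inferred for a value. The short sketch below reuses the variable names from the earlier cell; the `try`/`except` wrapper is only there so the expected error does not stop the cell.

```python
customer_age = 34
product_price = 199.99
customer_segment = 'High-Value'

# type() shows which type Python assigned to each value
print(type(customer_age))      # <class 'int'>
print(type(product_price))     # <class 'float'>
print(type(customer_segment))  # <class 'str'>

# Numbers support arithmetic...
half_price = product_price * 0.5
print('Half price:', half_price)

# ...but dividing text by a number raises a TypeError.
math_on_text_failed = False
try:
    customer_segment / 2
except TypeError as error:
    math_on_text_failed = True
    print('Cannot do math on text:', error)
```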

### Basic Operations: From Raw Data to Insights

You can use standard math operations to start analyzing your data.

In [None]:
# Let's say we have data for a single sale
price_per_unit = 49.95
units_sold = 5
cost_per_unit = 20.0

# Calculate total revenue
total_revenue = price_per_unit * units_sold
print('Total Revenue:', total_revenue)

# Calculate total cost
total_cost = cost_per_unit * units_sold
print('Total Cost:', total_cost)

# Calculate profit
profit = total_revenue - total_cost
print('Profit for this sale:', profit)

### Comparisons: Asking Questions About Your Data

Comparisons are the basis of filtering and decision-making. They always return a `True` or `False` value (a "boolean").

In [None]:
# Is this a profitable sale?
is_profitable = profit > 0
print('Was the sale profitable?', is_profitable)

# Did we sell at least 3 units?
min_units_sold = 3
sold_enough = units_sold >= min_units_sold
print('Did we meet the sales target?', sold_enough)

# Is the customer's age exactly 34?
is_age_34 = customer_age == 34 # Use '==' for comparison, '=' is for assignment
print('Is the customer 34 years old?', is_age_34)

> **🧠 Try It Yourself:** In the cell below, create a variable `revenue_target` with a value of `200`. Then, write a comparison to check if `total_revenue` met or exceeded this target.

In [None]:
# Your code here!

## Core Data Structures (15 mins)

Single data points are useful, but data science works with collections of data. The most fundamental collection is a **list**.

### Lists: Your First Dataset 📊

A list is an ordered collection of items, enclosed in `[]`. In data science, you'll use lists to hold features, columns of data, or prediction results.

In [None]:
# A list of sales figures for a week
daily_sales = [150.50, 230.10, 99.75, 450.00, 175.25]
print(daily_sales)

# A list of feature names for a model
feature_names = ['age', 'income', 'days_since_last_purchase']
print(feature_names)

You can access elements in a list using their **index**. Python is 0-indexed, meaning the first element is at index `0`.

In [None]:
# Get the first day's sales
first_day_sales = daily_sales[0]
print('Sales on Day 1:', first_day_sales)

# Get the last feature name
# You can use -1 to get the last item, -2 for the second to last, and so on.
last_feature = feature_names[-1]
print('Last Feature:', last_feature)

::: {.callout-warning}
#### Common Mistake: The "Off-by-One" Error
Forgetting that Python starts counting at 0 is one of the most common errors for beginners. `my_list[3]` gives you the *fourth* item, not the third!
:::
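A quick way to avoid this mistake is to remember that a list with `n` items has valid indexes `0` through `n - 1`. The sketch below uses the built-in `len()` function on the same `daily_sales` list; the `try`/`except` wrapper is only there so the expected error does not stop the cell.

```python
daily_sales = [150.50, 230.10, 99.75, 450.00, 175.25]

number_of_days = len(daily_sales)       # 5 items in the list
last_valid_index = number_of_days - 1   # so valid indexes run from 0 to 4

print('Items in list:', number_of_days)
print('Fourth item (index 3):', daily_sales[3])
print('Last item:', daily_sales[last_valid_index])

# Asking for index 5 goes one past the end and raises an IndexError.
try:
    daily_sales[number_of_days]
except IndexError as error:
    print('Out of range:', error)
```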

> **🧠 Try It Yourself:** In the cell below, get the sales figure for the *third* day from the `daily_sales` list.

In [None]:
# Your code here!

## Functions: Creating Reusable Tools 🛠️ (10 mins)

As a data scientist, you'll perform the same tasks over and over: cleaning data, calculating metrics, preparing features. Functions let you package up a piece of code so you can reuse it easily. This is the heart of automation!

Let's create a function to calculate the profit margin.

$$ \text{Profit Margin} = \frac{\text{Revenue} - \text{Cost}}{\text{Revenue}} \times 100 $$

In [None]:
# It's good practice to put imports at the top of a cell or file.
# Note: the function below uses only built-in arithmetic and does not need this import.
from math import sqrt  # We'll use this later

def calculate_profit_margin(revenue, cost):
    """
    Calculates the profit margin as a percentage.

    Arguments:
    revenue: The total revenue from a sale.
    cost: The total cost of the goods sold.
    """
    # Guard against division by zero: the margin is undefined when revenue is 0, so we return 0 by convention.
    if revenue == 0:
        return 0

    margin = ((revenue - cost) / revenue) * 100
    return margin

Now, we can *call* our function with the data we defined earlier.

In [None]:
# Let's use our variables from before
# total_revenue = 249.75
# total_cost = 100.0

margin = calculate_profit_margin(total_revenue, total_cost)

# The 'f' before the string lets us embed variables directly inside {}
print(f'The profit margin is: {margin:.2f}%')

::: {.callout-tip}
### Why Functions are Your Best Friend
1.  **Reusability:** Write once, use everywhere. No more copy-pasting.
2.  **Readability:** `calculate_profit_margin(rev, cost)` is much clearer than a complex formula.
3.  **Maintainability:** If you need to change the formula, you only have to change it in one place.
:::
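To illustrate the reusability point, the sketch below calls the same function on three sales. The figures for sales B and C are made up for illustration, and the function is repeated here only so the cell runs on its own.

```python
def calculate_profit_margin(revenue, cost):
    """Return the profit margin as a percentage (0 if there is no revenue)."""
    if revenue == 0:
        return 0
    return ((revenue - cost) / revenue) * 100

# One function, many inputs: no formula is copied and pasted.
margin_a = calculate_profit_margin(249.75, 100.0)  # the sale from earlier
margin_b = calculate_profit_margin(80.0, 100.0)    # cost exceeds revenue: negative margin
margin_c = calculate_profit_margin(0, 50.0)        # no revenue: the zero-guard returns 0

print(f'Sale A margin: {margin_a:.2f}%')
print(f'Sale B margin: {margin_b:.2f}%')
print(f'Sale C margin: {margin_c:.2f}%')
```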

## Practice & Discussion (15 mins)

Let's solidify what we've learned with a mini-challenge.

### Mini-Challenge: Customer Discount Function

You are tasked with creating a function for an e-commerce site. The function should take a `cart_total` and a `customer_status` (`'VIP'` or `'Standard'`) and apply a discount.

*   VIPs get a 20% discount.
*   Standard customers get a 10% discount.

The function should return the final price after the discount.

In [None]:
# Mini-Challenge
def calculate_final_price(cart_total, customer_status):
    """
    Calculates the final price after applying a status-based discount.
    """
    # One possible solution is shown below; try writing your own first.
    # Hint: Use an if/else statement to check the customer_status.
    if customer_status == 'VIP':
        # Apply 20% discount
        final_price = cart_total * 0.80
    else:
        # Apply 10% discount
        final_price = cart_total * 0.90
    
    return final_price

# --- Test your function ---
vip_price = calculate_final_price(200, 'VIP')
# ':.2f' formats the price with two decimal places
print(f'A VIP customer with a $200 cart pays: ${vip_price:.2f}')

standard_price = calculate_final_price(200, 'Standard')
print(f'A Standard customer with a $200 cart pays: ${standard_price:.2f}')
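One thing to discuss: the function above treats any status other than `'VIP'` as Standard, so a typo such as `'vip'` silently gets the 10% discount. A stricter variant might reject unknown statuses instead. The sketch below is one possible approach, not part of the original challenge; the name `calculate_final_price_strict` is hypothetical, and it uses `elif` (a second condition) and `raise` (which stops with an error), neither of which has been covered yet.

```python
def calculate_final_price_strict(cart_total, customer_status):
    """Apply a status-based discount and reject statuses we don't recognise."""
    if customer_status == 'VIP':
        price_multiplier = 0.80   # 20% discount
    elif customer_status == 'Standard':
        price_multiplier = 0.90   # 10% discount
    else:
        # Stop with a clear error instead of guessing a discount.
        raise ValueError(f'Unknown customer status: {customer_status!r}')

    return cart_total * price_multiplier

strict_vip_price = calculate_final_price_strict(200, 'VIP')
print(f'VIP price: ${strict_vip_price:.2f}')

status_was_rejected = False
try:
    calculate_final_price_strict(200, 'vip')  # wrong capitalisation
except ValueError as error:
    status_was_rejected = True
    print('Rejected:', error)
```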

---

# Part 2: After-Class Exploration 🏠

Congratulations on completing the in-class session! Now it's time to explore a few more concepts on your own. These are essential building blocks for handling larger datasets and automating your workflows.

## Flow Control: Working with Data in Batches

Often, you'll need to perform an action on every item in a list. This is where loops come in.

### `for` Loops

A `for` loop iterates over a sequence (like a list) and executes a block of code for each item.

**Scenario:** You have a list of sales prices and you need to apply a 19% VAT to each one.

In [None]:
sales_prices = [100, 250, 75.5, 300]
prices_with_vat = [] # Start with an empty list to store the results

# The loop: "for each 'price' in the 'sales_prices' list..."
for price in sales_prices:
    final_price = price * 1.19
    prices_with_vat.append(final_price) # .append() adds an item to the end of a list

print("Original prices:", sales_prices)
print("Prices with VAT:", prices_with_vat)

### `if`/`else` Controls Inside Loops

You can combine loops with conditional logic to perform more complex tasks.

**Scenario:** You want to create a list of all sales over €100.

In [None]:
large_sales = [] # Create an empty list for our results

for price in sales_prices:
    if price > 100:
        large_sales.append(price)

print("All sales over €100:", large_sales)
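Once the loop pattern feels familiar, you may come across a more compact spelling of the same idea called a *list comprehension*. The sketch below is an optional alternative to the two loops above and produces the same lists.

```python
sales_prices = [100, 250, 75.5, 300]

# Same transformation as the VAT loop: one new value per original price.
prices_with_vat = [price * 1.19 for price in sales_prices]

# Same filter as the large-sales loop: keep only prices above 100.
large_sales = [price for price in sales_prices if price > 100]

print("Prices with VAT:", prices_with_vat)
print("All sales over €100:", large_sales)
```

Both forms are fine. The explicit `for` loop is often easier to read while you are learning or when the loop body needs several steps.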

### 🧑‍💻 Your Turn: Categorize Expenses
You have a list of business expenses. Write a loop that categorizes each expense as either 'Small' (<= 50) or 'Large' (> 50).

In [None]:
expenses = [25, 150, 49, 300, 50, 120]
expense_categories = []

# Write your for loop here
# Hint: You'll need an if/else statement inside the loop.
# Append either 'Small' or 'Large' to the expense_categories list.


# --- Check your result ---
# print(expense_categories) # Expected output: ['Small', 'Large', 'Small', 'Large', 'Small', 'Large']

## More Powerful Data Structures

Lists are great, but Python offers other structures for different needs.

### Dictionaries: Labeled Data Records

A dictionary (`dict`) stores data in `{key: value}` pairs. Think of it as a single record of data, like one row in a spreadsheet or a JSON object.

In [None]:
# A dictionary representing a single customer
customer_record = {
    'customer_id': 'CUST-007',
    'name': 'James Bond',
    'age': 45,
    'is_active': True,
    'last_purchases': [99.50, 120.00, 350.75]
}

print(customer_record)

You access values in a dictionary by their key, not by a numeric position.

In [None]:
# Get the customer's name
customer_name = customer_record['name']
print(f"Customer Name: {customer_name}")

# Add a new piece of information
customer_record['country'] = 'UK'
print("Updated record:", customer_record)

::: {.callout-note}
#### Connection to Data Science
Dictionaries are everywhere! They are used for:
-   Storing model parameters (`{'learning_rate': 0.01, 'epochs': 100}`).
-   Representing data from APIs (JSON format).
-   Building DataFrames, our primary tool for analysis.
:::
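As a small illustration of the first two uses, the sketch below stores some model settings in a dictionary and converts a JSON string into one using Python's built-in `json` module. The parameter values and the JSON text are made up for the example.

```python
import json

# Hypothetical model settings stored as a dictionary.
model_params = {'learning_rate': 0.01, 'epochs': 100}
print('Learning rate:', model_params['learning_rate'])

# A made-up JSON string, similar in shape to what a web API might return.
api_response = '{"customer_id": "CUST-007", "is_active": true, "last_purchases": [99.5, 120.0]}'

# json.loads() turns JSON text into Python dictionaries and lists.
customer = json.loads(api_response)
print(type(customer))
print('Active?', customer['is_active'])
print('Number of purchases:', len(customer['last_purchases']))

# .get() returns a default value instead of raising a KeyError for a missing key.
country = customer.get('country', 'unknown')
print('Country:', country)
```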

## The Main Event: Introduction to Pandas DataFrames 🐼

While lists and dictionaries are Python's building blocks, `pandas` is the powerhouse for real-world data science. A **DataFrame** is the core structure in pandas: a 2D table, like a spreadsheet.

First, we need to import the libraries we'll be using. It's a strong convention to import `numpy` as `np` and `pandas` as `pd`.

In [None]:
import numpy as np
import pandas as pd

Let's create a DataFrame from a list of dictionaries. Each dictionary will become a row.

In [None]:
# A list of customer data (each is a dictionary)
customer_data = [
    {'ID': 1, 'FirstName': "Jesper", 'Female': False, 'Age': 22},
    {'ID': 2, 'FirstName': "Jonas", 'Female': False, 'Age': 33},
    {'ID': 3, 'FirstName': "Pernille", 'Female': True, 'Age': 44},
    {'ID': 4, 'FirstName': "Helle", 'Female': True, 'Age': 55}
]

# Create the DataFrame
df = pd.DataFrame(customer_data)

Now, let's look at our data. In a Jupyter notebook, placing a DataFrame's variable name on the last line of a cell displays it as a formatted table.

In [None]:
df

### Basic DataFrame Operations

Here are a few essential operations that show the power of DataFrames.

In [None]:
# Get a summary of column data types and non-null counts (which reveals missing values)
df.info()

In [None]:
# Select a single column (this returns a pandas Series)
ages = df['Age']
print(ages)
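A Series comes with built-in summary methods, so you rarely need to write a loop to compute a statistic. The sketch below rebuilds the same small customer table so that it runs on its own, then computes the average and maximum age.

```python
import pandas as pd

customer_data = [
    {'ID': 1, 'FirstName': "Jesper", 'Female': False, 'Age': 22},
    {'ID': 2, 'FirstName': "Jonas", 'Female': False, 'Age': 33},
    {'ID': 3, 'FirstName': "Pernille", 'Female': True, 'Age': 44},
    {'ID': 4, 'FirstName': "Helle", 'Female': True, 'Age': 55},
]
df = pd.DataFrame(customer_data)

# .shape is a (rows, columns) pair describing the table's size
print('Rows, columns:', df.shape)

ages = df['Age']
average_age = ages.mean()  # arithmetic mean of the column
oldest_age = ages.max()    # largest value in the column

print('Average age:', average_age)
print('Oldest customer age:', oldest_age)
```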

In [None]:
# Select multiple columns
customer_info = df[['FirstName', 'Age']]
customer_info

In [None]:
# The real magic: Filtering data based on a condition
# Find all customers older than 30
older_than_30 = df[df['Age'] > 30]
older_than_30

In [None]:
# Another way to do the same thing using the .query() method
df.query('Age > 30')
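Filters can also be combined. With the bracket syntax, wrap each condition in parentheses and join them with `&` (and) or `|` (or). Inside `.query()` you can write `and` / `or` directly. The sketch below rebuilds the same customer table so that it runs on its own.

```python
import pandas as pd

customer_data = [
    {'ID': 1, 'FirstName': "Jesper", 'Female': False, 'Age': 22},
    {'ID': 2, 'FirstName': "Jonas", 'Female': False, 'Age': 33},
    {'ID': 3, 'FirstName': "Pernille", 'Female': True, 'Age': 44},
    {'ID': 4, 'FirstName': "Helle", 'Female': True, 'Age': 55},
]
df = pd.DataFrame(customer_data)

# Both conditions must hold: older than 30 AND female.
women_over_30 = df[(df['Age'] > 30) & (df['Female'] == True)]

# The same filter written with .query().
women_over_30_query = df.query('Age > 30 and Female == True')

# Either condition may hold: younger than 30 OR older than 50.
youngest_or_oldest = df[(df['Age'] < 30) | (df['Age'] > 50)]

print(women_over_30)
print(youngest_or_oldest)
```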

## 🚀 Final Challenge: Data Analysis Task

You've been given a small dataset of product sales. Your task is to perform a simple analysis.

1.  Create a DataFrame from the data provided below.
2.  Calculate a new column called `Revenue` which is `Price` * `UnitsSold`.
3.  Find all sales where more than 10 units were sold.
4.  Calculate the total revenue from all sales.

In [None]:
# 1. Data for the challenge
sales_data = {
    'ProductID': ['A101', 'B202', 'C303', 'A101', 'D404'],
    'Price': [10.0, 25.5, 5.75, 9.5, 15.0],
    'UnitsSold': [15, 8, 20, 12, 6]
}

# Create the DataFrame
sales_df = pd.DataFrame(sales_data)
print("--- Initial DataFrame ---")
print(sales_df)

# 2. Calculate the 'Revenue' column
# Your code here


# 3. Find all sales with more than 10 units sold
# Your code here


# 4. Calculate the total revenue
# Hint: You can select the 'Revenue' column and use the .sum() method
# total_revenue = sales_df['Revenue'].sum()
# print(f"\nTotal Revenue: ${total_revenue:.2f}")