# Python & Pandas Basics – The Electric Vehicle Showdown ⚡🚗


Welcome to the Python & Pandas Basics Tutorial! Today, we’ll be using data on electric vehicles to learn Python fundamentals.

The goal? To use Python as a powerful alternative to spreadsheets and manual analysis!


## Exercise 1: Electric Vehicle Models & Lists 🚙🔋

In [47]:
# We have a list of popular electric vehicle (EV) models:
ev_models = ["Tesla Model S", "Nissan Leaf", "Chevrolet Bolt", "BMW i3", "Audi e-Tron",
             "Jaguar I-PACE", "Hyundai Kona EV", "Volkswagen ID.4", "Ford Mustang Mach-E", "Porsche Taycan"]

In [None]:
# Task 1: Print all models to see the list of EVs.
print("🚗 EV Models:", ev_models)

In [None]:
# Task 2: Retrieve the first and last EV model and print them with a message.
first_ev = ev_models[0]
last_ev = ev_models[-1]
print("The first EV in our collection:", first_ev)
print("The last EV in our collection:", last_ev)

In [None]:
# Task 3: Imagine these models have a new 2024 edition! Add "2024 Edition" to each model and print the updated list.
new_models = [model + " 2024 Edition" for model in ev_models]
print("🚙 Updated EV Models:", new_models)
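
The list comprehension in Task 3 is shorthand for a loop that appends to a new list. A minimal sketch of that equivalence, using a shortened list named `sample_models` (a name assumed here for illustration; the exercise itself uses `ev_models`):

```python
# Shortened sample list (an assumption for illustration only)
sample_models = ["Tesla Model S", "Nissan Leaf", "BMW i3"]

# Explicit loop: start with an empty list and append one item at a time
looped = []
for model in sample_models:
    looped.append(model + " 2024 Edition")

# List comprehension: the same result in a single expression
comprehended = [model + " 2024 Edition" for model in sample_models]

print(looped == comprehended)
```

Both versions leave the original list unchanged and build a new one, which is usually what you want when transforming data.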


## Exercise 2: Simple Math with EV Range (in Kilometers) 🌍🔋

In [51]:
# Here’s a list of driving ranges (in kilometers) for each EV model above.
ranges = [575, 340, 417, 307, 545, 380, 390, 445, 480, 495]

In [None]:
# Task 1: Calculate and print the average driving range of these EVs.
# Hint:
# First, solve it with a for loop: sum up the ranges and count the number of entries inside the loop.
# Then try to find a shorter version using built-in functions that operate on the list.

print("Using a for loop:")
sum_ranges = 0
cnt_entries = 0
for ev_range in ranges:  # not named `range`, which would shadow Python's built-in range()
    sum_ranges = sum_ranges + ev_range
    cnt_entries = cnt_entries + 1
average_range = sum_ranges / cnt_entries
print("🔋 Average EV Range (kilometer):", average_range)

print("\n\nUsing list functions:")
average_range = sum(ranges) / len(ranges)
print("🔋 Average EV Range (kilometer):", average_range)
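
Besides `sum(...) / len(...)`, Python's standard library offers `statistics.mean`, which computes the same average. A small sketch, assuming the same ten range values under the illustrative name `ranges_km`:

```python
from statistics import mean

# Same values as the `ranges` list in this exercise (kilometers)
ranges_km = [575, 340, 417, 307, 545, 380, 390, 445, 480, 495]

# mean() sums the values and divides by their count
average_km = mean(ranges_km)
manual_average_km = sum(ranges_km) / len(ranges_km)

print("Average via statistics.mean:", average_km)
print("Average via sum/len:", manual_average_km)
```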

In [None]:
# Task 2: Find and print the longest and shortest ranges with context.
max_range = max(ranges)
min_range = min(ranges)
print("🏆 Longest Range (kilometers):", max_range)
print("💡 Shortest Range (kilometers):", min_range)
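
To give the longest and shortest ranges more context, one option is to look up which model each value belongs to. The sketch below assumes the two lists are in the same order, as they are in this tutorial, and uses `list.index` to find the position of a value:

```python
ev_models = ["Tesla Model S", "Nissan Leaf", "Chevrolet Bolt", "BMW i3", "Audi e-Tron",
             "Jaguar I-PACE", "Hyundai Kona EV", "Volkswagen ID.4", "Ford Mustang Mach-E", "Porsche Taycan"]
ranges = [575, 340, 417, 307, 545, 380, 390, 445, 480, 495]

# index() returns the position of the first matching value
longest_model = ev_models[ranges.index(max(ranges))]
shortest_model = ev_models[ranges.index(min(ranges))]

print("Longest range:", longest_model, max(ranges), "km")
print("Shortest range:", shortest_model, min(ranges), "km")
```

If several models shared the same range, `index` would only report the first one.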

In [None]:
# Task 3: Find the difference in kilometers between the longest and shortest ranges.
range_difference = max_range - min_range
print("📏 Range Difference (kilometers):", range_difference)

## Exercise 3: Creating an EV Database with Dictionaries 📊

In [None]:
# Task 1: Pair each EV model with its driving range using a dictionary called `ev_database`.
ev_database = dict(zip(ev_models, ranges))
print("🔋 EV Database:", ev_database)

In [None]:
# Task 2: Check if "Tesla Model S" is in the database and print its range if found.
if "Tesla Model S" in ev_database:
    print("🚀 Range for 'Tesla Model S':", ev_database["Tesla Model S"])
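
An alternative to the `in` check is `dict.get`, which returns `None` (or a default you choose) when a key is missing instead of raising a `KeyError`. A minimal sketch with a two-entry dictionary named `sample_db` (assumed for illustration), where "Unknown Model" stands in for any key that is not in the dictionary:

```python
# Small sample dictionary (an assumption for illustration only)
sample_db = {"Tesla Model S": 575, "Nissan Leaf": 340}

known = sample_db.get("Tesla Model S")        # key exists, so its value is returned
missing = sample_db.get("Unknown Model")      # key is absent, so None is returned
fallback = sample_db.get("Unknown Model", 0)  # key is absent, so the given default is returned

print(known, missing, fallback)
```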

In [None]:
# Task 3: A new EV called "Lucid Air" with a range of 665 kilometers has launched! Add it to the database and print the updated dictionary.
ev_database["Lucid Air"] = 665
print("Updated EV Database:", ev_database)

## Exercise 4: Using a For Loop for High-Range EVs 🔋🏆

In [None]:
# Let’s use a for loop to find EVs with a range of 400 kilometers or higher. We'll call these "Long-Range EVs."
long_range_evs = []
for model, range_km in ev_database.items():
    if range_km >= 400:
        long_range_evs.append(model)
print("🚗 Long-Range EVs:", long_range_evs)
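
The same filter can also be written as a list comprehension with a condition. A short sketch on a three-entry dictionary named `sample_db` (assumed for illustration; the exercise uses `ev_database`):

```python
# Small sample dictionary (an assumption for illustration only)
sample_db = {"Tesla Model S": 575, "Nissan Leaf": 340, "Chevrolet Bolt": 417}

# Keep only the models whose range is at least 400 kilometers
long_range = [model for model, km in sample_db.items() if km >= 400]

print(long_range)
```

The loop version is easier to extend with extra steps, while the comprehension is more compact for a simple filter.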

## Exercise 5: Introducing Functions with Charging Cost Calculations 💸

In [60]:
# Let’s create a function to calculate the cost to fully charge an EV based on its range and battery efficiency.
# Assume an electricity cost of 0.31€ per kWh and a battery efficiency of 8 kilometers per kWh.

def calculate_charging_cost(range_kilometers):
    kWh_needed = range_kilometers / 8  # efficiency: 8 kilometers per kWh
    cost = kWh_needed * 0.31
    return round(cost, 2)


In [None]:
# Test the function with the range for "Tesla Model S"
charging_cost_tesla = calculate_charging_cost(ev_database["Tesla Model S"])
print("💵 Charging Cost for 'Tesla Model S': €", charging_cost_tesla)
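
The function above hard-codes the electricity price and the efficiency. One possible variation, sketched here under the assumption that you want to try other values, moves both into parameters with defaults. The name `charging_cost` and its parameter names are illustrative, not part of the exercise:

```python
def charging_cost(range_km, price_per_kwh=0.31, km_per_kwh=8):
    """Return the cost in euros of charging enough energy to drive range_km."""
    kwh_needed = range_km / km_per_kwh
    return round(kwh_needed * price_per_kwh, 2)

# Default assumptions from the exercise: 0.31 € per kWh and 8 km per kWh
cost_default = charging_cost(575)

# Same function with a different, made-up price
cost_custom = charging_cost(400, price_per_kwh=0.5)

print(cost_default, cost_custom)
```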


## Exercise 6: Data Analysis with Pandas 📊


In [134]:
# Creating a dictionary with EV data:
data = {
    "Model": ev_models,
    "RangeKilometers": ranges,
    "BatteryCapacity_kWh": ['95', '59', '66', '42', '97', '84.7', '65.4', '77', '91', '97'],
    "Price_Euro": [94990, 35900, 26400, 39900, 74700, 92400, 32800, 47666, 65900, 106400]
}

In [None]:
# Task 1: Let's create a pandas dataframe using the dictionary
import pandas as pd

df = pd.DataFrame(data)
print("🚗 EV Data DataFrame:\n", df)

In [None]:
# Task 2: Create a column 'Brand' by transforming the column 'Model'.
# Hint: It's always the first word which corresponds to the brand.
df['Brand'] = df['Model'].apply(lambda model: model.split()[0])
print(df)
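
Instead of `apply` with a lambda, pandas also provides vectorized string methods through the `.str` accessor. A minimal sketch on a three-row frame named `sample` (assumed for illustration):

```python
import pandas as pd

# Small sample frame (an assumption for illustration only)
sample = pd.DataFrame({"Model": ["Tesla Model S", "Nissan Leaf", "BMW i3"]})

# .str.split() splits every entry on whitespace; .str[0] picks the first piece
sample["Brand"] = sample["Model"].str.split().str[0]

print(sample)
```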

In [None]:
# Task 3: Calculate the total range in kilometers across all EVs using pandas
total_range = df["RangeKilometers"].sum()
print("Total Range (kilometers):", total_range)



In [None]:
# Task 4: Create a column called 'ClassRange' with short_range (<= 400 kilometers) and long_range (> 400 kilometers)
df['ClassRange'] = 'short_range'
df.loc[df['RangeKilometers'] > 400, 'ClassRange'] = 'long_range'
print(df)
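
A two-way classification like this can also be written in one step with `numpy.where`, which picks one of two values per row depending on a condition. A sketch on a small frame named `sample` (assumed for illustration), including the boundary value 400:

```python
import numpy as np
import pandas as pd

# Small sample frame (an assumption for illustration only)
sample = pd.DataFrame({"RangeKilometers": [575, 340, 400]})

# Rows above 400 km become 'long_range', all others 'short_range'
sample["ClassRange"] = np.where(sample["RangeKilometers"] > 400, "long_range", "short_range")

print(sample)
```

Note that exactly 400 km counts as short range here, matching the task definition.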

In [None]:
# Task 5: Calculate the average price for all long range and short range cars
average_price_per_range = df.groupby("ClassRange")["Price_Euro"].mean()
print("Average Price per range class:\n", average_price_per_range)

In [None]:
# Task 6: Find the EV with the highest battery capacity
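# The column still holds strings at this point, so compare the values as floats rather than as text.
highest_capacity_ev = df.loc[df["BatteryCapacity_kWh"].astype(float).idxmax()]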
print("EV with Highest Battery Capacity:\n", highest_capacity_ev)


In [None]:
# Task 7: Which car is the most efficient and which is the least, i.e. which car needs the fewest and the most kWh per kilometer?

# Hint: Sometimes a column's data type does not support a mathematical operation.
# E.g. the 'BatteryCapacity_kWh' column was initialized with strings, and a string cannot be divided
# by a number. Therefore we first need to convert this column into a numeric one.
# So first print the dtypes of each column, then convert the BatteryCapacity_kWh column into a
# numeric one, and finally solve the task.
print('Before transformation')
print(df.dtypes)
df['BatteryCapacity_kWh'] = df['BatteryCapacity_kWh'].astype(float)
print('After transformation')
print(df.dtypes)

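# 'efficiency' is the energy consumption in kWh per kilometer, so lower values are better.
df['efficiency'] = df['BatteryCapacity_kWh'] / df['RangeKilometers']
print(df)
print("Most efficient:", df.loc[df['efficiency'].idxmin(), 'Model'])
print("Least efficient:", df.loc[df['efficiency'].idxmax(), 'Model'])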
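
`astype(float)` raises an error if any entry cannot be parsed as a number. When the data may contain such entries, `pd.to_numeric` with `errors="coerce"` is one way to convert them to `NaN` instead. A sketch with a made-up series named `raw`, where "n/a" stands in for a bad entry:

```python
import pandas as pd

# Made-up values for illustration; "n/a" represents an entry that is not a number
raw = pd.Series(["95", "84.7", "n/a"])

# Unparseable entries become NaN instead of raising an error
converted = pd.to_numeric(raw, errors="coerce")

print(converted)
print(converted.dtype)
```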

In [None]:
# Task 8
# Create 4 new columns that hold the total cost of each car after 50,000, 100,000,
# 500,000 and 1,000,000 kilometers.
# The total includes the car's purchase price and
# the cost of electricity. Assume a price of 0.31 € per kWh.
price_per_kwh = 0.31
kwh_per_km = df['BatteryCapacity_kWh'] / df['RangeKilometers']
for label, kilometers in [('50k', 50_000), ('100k', 100_000), ('500k', 500_000), ('1000k', 1_000_000)]:
    df[f'costs_{label}'] = kwh_per_km * price_per_kwh * kilometers + df['Price_Euro']
print(df)


In [None]:
# Task 9: Sort the dataframe by the 50k column and by the 1000k column. Does the order change? What does that mean?
df_sorted_50 = df.sort_values(by='costs_50k')
print(df_sorted_50)
df_sorted_1000 = df.sort_values(by='costs_1000k')
print(df_sorted_1000)
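
To see why the order can change, it helps to compare just two cars: a cheaper car with higher consumption is ahead at low mileage but can fall behind once electricity costs add up. The sketch below uses the Nissan Leaf and BMW i3 rows from this tutorial's data in a small frame named `sample` (the frame name is an assumption for illustration):

```python
import pandas as pd

# Two rows taken from the tutorial's data
sample = pd.DataFrame({
    "Model": ["Nissan Leaf", "BMW i3"],
    "RangeKilometers": [340, 307],
    "BatteryCapacity_kWh": [59.0, 42.0],
    "Price_Euro": [35900, 39900],
})

# Total cost = purchase price + consumption (kWh/km) * price per kWh * kilometers
kwh_per_km = sample["BatteryCapacity_kWh"] / sample["RangeKilometers"]
sample["costs_50k"] = kwh_per_km * 0.31 * 50_000 + sample["Price_Euro"]
sample["costs_1000k"] = kwh_per_km * 0.31 * 1_000_000 + sample["Price_Euro"]

# Cheapest car first for each mileage
order_50k = sample.sort_values("costs_50k")["Model"].tolist()
order_1000k = sample.sort_values("costs_1000k")["Model"].tolist()

print("Cheapest first at 50,000 km:", order_50k)
print("Cheapest first at 1,000,000 km:", order_1000k)
```

A change in order means the purchase price dominates at low mileage, while energy consumption dominates at high mileage.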

In [None]:
# Task 10: Filter for all cars with a price below 50,000 €, a range above 360 km and a battery capacity below 70 kWh.
# Hint: You can either use the 'query' function of a pandas dataframe or chain boolean conditions with '&'.

print('Using the query function:')
print(df.query('Price_Euro < 50000 and RangeKilometers > 360 and BatteryCapacity_kWh < 70'))

print('\n\n\nChaining the boolean index:')
print(df[(df['Price_Euro'] < 50000) & (df['RangeKilometers'] > 360) & (df['BatteryCapacity_kWh'] < 70)])

In [None]:
# [Bonus] Task 11
# Is there a correlation between the range of a car and its price?

# One way of answering this question is to make a plot with a correlation line:
import matplotlib.pyplot as plt
import numpy as np

x = df['RangeKilometers'].to_numpy()
y = df['Price_Euro'].to_numpy()

# Initialize layout
fig, ax = plt.subplots(figsize=(9, 9))

# Add scatterplot
ax.scatter(x, y, s=60, alpha=0.7, edgecolors="k")

# Fit linear regression via least squares with numpy.polyfit
# It returns the slope (b) and the intercept (a)
# deg=1 means linear fit (i.e. polynomial of degree 1)
b, a = np.polyfit(x, y, deg=1)

# Create a sequence of 100 evenly spaced numbers between the smallest and largest range
xseq = np.linspace(min(x), max(x), num=100)

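# Plot regression line
ax.plot(xseq, a + b * xseq, color="k", lw=2.5)

# Label the axes and display the figure
ax.set_xlabel("Range (km)")
ax.set_ylabel("Price (€)")
plt.show()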
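
A plot shows the trend visually; the Pearson correlation coefficient puts a number on it. The sketch below is one possible complement to the plot, using `numpy.corrcoef` on the range and price values from this tutorial. The array names `range_km` and `price_eur` are assumptions for illustration:

```python
import numpy as np

# Range and price values from the tutorial's data
range_km = np.array([575, 340, 417, 307, 545, 380, 390, 445, 480, 495])
price_eur = np.array([94990, 35900, 26400, 39900, 74700, 92400, 32800, 47666, 65900, 106400])

# corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is the coefficient
r = np.corrcoef(range_km, price_eur)[0, 1]

print("Pearson correlation between range and price:", round(r, 2))
```

Values near 1 indicate a strong positive linear relationship, values near 0 indicate little or none. With only ten cars, the result should be read as a rough indication rather than firm evidence.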