# Python Programming Revision

This notebook contains the Python Programming Revision lecture together with examples of useful Python programming concepts. The examples aim to familiarize students with Python fundamentals and to introduce the scientific libraries NumPy, Pandas, and Plotly, which will be used in subsequent lectures.

The link to the GitHub repository: 

## Learning aims

- Set up and manage virtual environments
- Work effectively within Jupyter Notebooks
- Review useful Python features for data science
- Perform numerical operations using the NumPy library
- Manipulate data with Pandas DataFrames
- Create interactive visualisations using Plotly


## Remember

The course content might sometimes deviate from best Software Engineering practices. The primary goal of the code examples is to illustrate statistical concepts and build intuition through simulations, rather than adhering strictly to engineering standards.


## What is a Jupyter Notebook?

- Interactive Python environment for running code.
- Combines code, output, and documentation in one place.
- Supports rich outputs: text, images, plots, and interactive visualizations.
- Ideal for live coding, testing, and quick feedback.
- Allows exporting and sharing notebooks as HTML, PDF, or .ipynb files.


## How to run Jupyter Notebooks?

- Hosted Jupyter Services:
    - Binder: https://mybinder.org/
    - Google Colab: https://colab.research.google.com/
    - Kaggle: https://www.kaggle.com/code
- Local setup: recommended inside a virtual environment managed with a tool such as uv, Conda, or venv.

## What is uv?

- A package and project manager.
- Helps manage dependencies and libraries in isolated environments.
- Keeps projects organized by avoiding version conflicts.
- Allows different projects to use different versions of Python or libraries.

To install uv and set up an environment called in-stk1050 with Python 3.12 locally, and to install necessary packages:

```
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv in-stk1050 --python=3.12
source in-stk1050/bin/activate
uv pip install numpy pandas plotly jupyter
```

To run the Python console:
```
(in-stk1050) user@computer ~ % python
Python 3.12.11 (main, Sep 18 2025, 19:41:45) [Clang 20.1.4 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello world!")
Hello world!
>>> import numpy as np
>>> np.__version__
'2.4.1'
```

Other environment managers:
- venv (https://docs.python.org/3/library/venv.html)
- conda (https://www.anaconda.com/docs/getting-started/main)
- more on uv: https://docs.astral.sh/uv/


## Useful Python features for data science

In this section, we will cover:
- f-strings,
- revisit lists and dictionaries,
- list and dict comprehensions, 
- functions,
- typing.

## F-strings

F-strings provide an easy and efficient way to format strings in Python. They allow embedding expressions inside string literals, using curly braces {}.

In [None]:
# Classic Hello World example:
print("Hello World!")

In [None]:
# Using f-strings:
name = "Alice"
score = 90
print(f"{name} scored {score} in the test.")  # Output: Alice scored 90 in the test.

In [None]:
# Complex expression inside f-string:
print(f"Half of {score} is {score / 2}.")
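
F-strings also accept a format specification after a colon inside the braces, which is handy for rounding and aligning values in printed output. The following is a minimal sketch; `max_score` is a hypothetical value introduced only for illustration.

```python
# Format specifications go after a colon inside the braces.
name = "Alice"
score = 90
max_score = 120  # hypothetical maximum, chosen for illustration

ratio = score / max_score
two_decimals = f"{ratio:.2f}"  # fixed-point notation with 2 decimal places
as_percent = f"{ratio:.1%}"    # multiplies by 100 and appends a percent sign
padded = f"{name:>10}"         # right-aligns the text in a field of width 10
debug = f"{score=}"            # Python 3.8+: shows the expression and its value

print(two_decimals)  # 0.75
print(as_percent)    # 75.0%
print(repr(padded))  # '     Alice'
print(debug)         # score=90
```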

## Working with Lists

Lists are ordered, mutable collections used to store a sequence of elements. They allow for operations like appending, accessing, and slicing.


In [None]:
# Creating and adding values to the list:
my_list = [10, 20, 30, 40, 50]
print(my_list)  # Output: [10, 20, 30, 40, 50]
my_list.append(60)
print(my_list)  # Output: [10, 20, 30, 40, 50, 60]

# Deleting the fifth element (index 4):
my_list.pop(4)
print(my_list)  # Output: [10, 20, 30, 40, 60]
# Deleting a specific element from the list by value:
my_list.remove(60)
print(my_list)  # Output: [10, 20, 30, 40]

In [None]:
# Accessing Elements and Slicing:
print(my_list)
print(my_list[1])  # Output: 20 (second element)
print(my_list[-1])  # Output: 40 (last element)
print(my_list[:3])  # Output: [10, 20, 30] (first three elements)
print(my_list[-2:])  # Output: [30, 40] (last two elements)
print(my_list[1:3])  # Output: [20, 30] (second and third elements)

## Working with Dictionaries

Dictionaries store key-value pairs. They are mutable and, since Python 3.7, preserve the order in which keys were inserted, making them ideal for mapping relationships (like names and scores).


In [None]:
# Creating and accessing a dictionary:

student_scores = {"Alice": 85, "Bob": 92, "Charlie": 78}

print(student_scores)  # Output: {'Alice': 85, 'Bob': 92, 'Charlie': 78}
print(student_scores["Bob"])  # Output: 92 (Bob's score)
print(student_scores.get("Charlie"))  # Output: 78 (Charlie's score)
print(student_scores.values()) # Output: dict_values([85, 92, 78])
print(student_scores.keys()) # Output: dict_keys(['Alice', 'Bob', 'Charlie'])

In [None]:
# Adding a new key-value pair:
student_scores["David"] = 90
print(student_scores)  # Output: {'Alice': 85, 'Bob': 92, 'Charlie': 78, 'David': 90}

# Updating an existing value:
student_scores["Charlie"] = 80
print(student_scores)  # Output: {'Alice': 85, 'Bob': 92, 'Charlie': 80, 'David': 90}

# Deleting a key-value pair:
del student_scores["Alice"]
print(student_scores)  # Output: {'Bob': 92, 'Charlie': 80, 'David': 90}

In [None]:
# Iterating over a dictionary with for loop:
for key, value in student_scores.items():
    print(f"{key} scored {value}")
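
Accessing a key that is not in the dictionary behaves differently depending on the access style. A small sketch, assuming a hypothetical student "Zoe" who is not in the dictionary:

```python
student_scores = {"Bob": 92, "Charlie": 80, "David": 90}

# Square brackets raise a KeyError for a missing key, so guard with `in` ...
if "Zoe" in student_scores:
    print(student_scores["Zoe"])

# ... or use .get(), which returns None (or a default you supply) instead.
missing = student_scores.get("Zoe")
with_default = student_scores.get("Zoe", 0)
print(missing)       # None
print(with_default)  # 0

# Iterating over a dictionary yields its keys in insertion order.
key_order = list(student_scores)
print(key_order)  # ['Bob', 'Charlie', 'David']
```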

## List comprehension

List comprehensions offer a concise way to create lists in Python. They provide an elegant and efficient alternative to using loops for generating lists.
They are often used to apply an expression to each item in an iterable (such as a list or range) or to filter elements based on a condition.


In [None]:
my_list = list(range(1, 6))
print(my_list)

# find squares of all elements in the list:
squares = []
for x in my_list:
    squares.append(x**2)

print(squares)

In [None]:
# the same with list comprehension:
squares = [x**2 for x in my_list]
print(squares)

In [None]:
# filtering a list in a for loop:
even_numbers = []
for x in my_list:
    if x % 2 == 0:
        even_numbers.append(x)

print(even_numbers)

# list comprehension for filtering:
even_numbers = [x for x in my_list if x % 2 == 0]
print(even_numbers)

In [None]:
# Conditional processing of elements of the list:
even_squares = []
for x in my_list:
    if x % 2 == 0:
        even_squares.append(x**2)
print(even_squares)

# List comprehension with condition:
even_squares = [x**2 for x in range(1, 6) if x % 2 == 0]
print(even_squares)  # Output: [4, 16]

### Exercise: student grades

Given a list of student scores, first filter out the scores of 60 or below, and then convert the remaining scores to grades in the following way: for scores above 90, the grade is "A"; for scores above 75, the grade is "B"; and for scores above 60, the grade is "C".

In [None]:
scores = [55, 81, 31, 78, 93, 61]

# TODO: fill this
scores_passed = [el for el in scores if el > 60]
grades = ["A" if el > 90 else "B" if el > 75 else "C" for el in scores_passed]

print(scores_passed)
print(grades)

## Dict comprehension

- Dict comprehensions provide a concise way to create dictionaries in Python, just like list comprehensions do for lists.
- They allow you to generate dictionaries from an iterable by specifying both the keys and the values in a single, readable line of code.
- You can also apply conditions to filter which key-value pairs are added to the dictionary.


In [None]:
# Make a dict where keys are numbers 1 to 5 and values are their squares:
squares_dict = {}
for x in range(1, 6):
    squares_dict[x] = x**2

print(squares_dict)

# Dict comprehension:
squares_dict = {x: x**2 for x in range(1, 6)}
print(squares_dict)

In [None]:
# The same task with only even numbers in the given range:
even_squares_dict = {}
for x in range(1, 6):
    if x % 2 == 0:
        even_squares_dict[x] = x**2

print(even_squares_dict)

# Dict comprehension with condition:
even_squares_dict = {x: x**2 for x in range(1, 6) if x % 2 == 0}
print(even_squares_dict)


In [None]:
# Swap keys and values in an existing dictionary
original_dict = {'a': 1, 'b': 2, 'c': 3}
swapped_dict = {v: k for k, v in original_dict.items()}
print(f"Original dict: {original_dict}")
print(f"Swapped dict: {swapped_dict}")  # Output: {1: 'a', 2: 'b', 3: 'c'}
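
Swapping keys and values only works cleanly when the values are unique, because dictionary keys cannot repeat. The sketch below uses a hypothetical dictionary with a duplicated value to show what happens, and one possible way to keep all keys by grouping them in lists.

```python
# The value 1 appears twice, so two entries compete for the same new key.
original_dict = {"a": 1, "b": 2, "c": 1}

swapped = {v: k for k, v in original_dict.items()}
print(swapped)  # {1: 'c', 2: 'b'}  (the later key 'c' overwrote 'a')

# One way to avoid losing data: collect all original keys per value.
grouped = {}
for k, v in original_dict.items():
    grouped.setdefault(v, []).append(k)
print(grouped)  # {1: ['a', 'c'], 2: ['b']}
```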

### Exercise: student grades

Given a dictionary with students' names and scores, make a new dictionary only with students that have passed the exam (score > 60) and map each student score to a grade (score > 90 -> "A", score > 75 -> "B", score > 60 -> "C").

In [None]:
scores = {"Alice": 55, "Bob": 81, "Charlie": 31, "Diana": 78, "Eve": 93, "Helen": 61}

# TODO: implement this
passed_scores = {name: score for name, score in scores.items() if score > 60}
grades = {name: "A" if score > 90 else "B" if score > 75 else "C" 
          for name, score in passed_scores.items()}

print(passed_scores)
print(grades)

Starting from the same scores, build a dictionary that records, for each student, whether they passed or failed and the score they got.

In [None]:
# TODO: implement this
grades = {name: {"status": "passed" if score > 60 else "failed", "score": score} 
          for name, score in scores.items()}

print(grades)
print(grades['Eve'])

## Functions

- Functions are reusable blocks of code that can accept input (arguments), process it, and return output. They help in organizing and simplifying your code.
- Python functions support default arguments, multiple arguments, multiple return values, and lambda (anonymous) functions.



In [None]:
# Functions:
def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))  # Output: Hello, Alice!

## Type hints

- Python is a dynamically typed language: types are checked at runtime
- **Type hints** allow you to annotate expected types, even though the annotated types are not enforced at runtime
- Type hints improve readability and documentation of the code, especially in larger codebases

In [None]:
# argument name is expected to be a string and the function returns a string
def greet(name: str) -> str: 
    return f"Hello, {name}!"

return_value = greet("Alice")
print(return_value)
print(type(return_value))
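
To see that type hints are not enforced at runtime, the sketch below calls the same function with an integer. The call succeeds; a static type checker (for example mypy) would be needed to flag it before the code runs.

```python
def greet(name: str) -> str:
    return f"Hello, {name}!"

# Python does not check the annotation at call time, so this runs without error:
result = greet(42)
print(result)  # Hello, 42!

# The annotations are stored on the function object and can be inspected:
print(greet.__annotations__)  # {'name': <class 'str'>, 'return': <class 'str'>}
```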

In [None]:
# Functions with default arguments:
def greet(name: str = "Alice"):
    return f"Hello, {name}!"

print(greet())  # Output: Hello, Alice!
print(greet("Bob"))  # Output: Hello, Bob!

In [None]:
# Functions with multiple arguments:
def greet(name: str, message: str):
    return f"{message}, {name}!"

print(greet("Alice", "Good Morning"))  # Output: Good Morning, Alice!
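
Arguments can also be passed by name, which makes calls easier to read and lets you supply them in any order. A brief sketch that combines this with a default value (the default "Hello" is chosen here for illustration):

```python
def greet(name: str, message: str = "Hello") -> str:
    return f"{message}, {name}!"

default_greeting = greet("Alice")
keyword_greeting = greet(message="Good Morning", name="Alice")

print(default_greeting)  # Hello, Alice!
print(keyword_greeting)  # Good Morning, Alice!
```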

## Packing and unpacking with functions

In [None]:
def get_student_info():
    name = "Alice"
    score = 85
    return name, score

student_name, student_score = get_student_info()
print(f"{student_name} scored {student_score}")  # Output: Alice scored 85
student_information = get_student_info()
print(f"{student_information[0]} scored {student_information[1]}")  # Output: Alice scored 85
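
The star operator extends the same idea: on the left-hand side of an assignment it collects the remaining values into a list, and in a function call it spreads a sequence into separate arguments. A minimal sketch with hypothetical helper functions `get_scores` and `average`:

```python
def get_scores():
    # The returned values are packed into a single tuple.
    return "Alice", 85, 90, 80

# The starred name collects everything after the first value into a list.
name, *scores = get_scores()
print(name)    # Alice
print(scores)  # [85, 90, 80]

def average(*values: float) -> float:
    # *values gathers all positional arguments into a tuple.
    return sum(values) / len(values)

# *scores unpacks the list into three separate arguments.
mean_score = average(*scores)
print(mean_score)  # 85.0
```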

## Lambda functions

Lambda functions are small anonymous functions written as a single expression, often used for short operations that are passed to other functions.

In [None]:
# Lambda functions:
multiply = lambda x, y: x * y
print(multiply(5, 4))  # Output: 20

# Using lambda function with map
numbers = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, numbers))
print(squared)  # Output: [1, 4, 9, 16, 25]
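
A common use of lambda functions is as the `key` argument of `sorted`, `max`, or `min`, where they say what to compare. The sketch below ranks students by score; the data is illustrative.

```python
student_scores = {"Alice": 85, "Bob": 92, "Charlie": 78}

# Sort (name, score) pairs by the score, highest first.
ranked = sorted(student_scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)  # [('Bob', 92), ('Alice', 85), ('Charlie', 78)]

# Find the name whose score is largest.
best = max(student_scores, key=lambda name: student_scores[name])
print(best)  # Bob
```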

# Libraries for Data Science
In this section you will learn about Python libraries used in Data Science, such as:
- NumPy,
- Pandas,
- Plotly.

## Importing libraries

- Importing libraries allows you to use additional functionality from external modules and packages.
- Python has a rich ecosystem of libraries for various tasks such as data manipulation, visualization, and scientific computing.
- You can import entire libraries, specific functions, or give aliases for easier usage.


In [None]:
# Importing libraries:
import numpy as np
import pandas as pd
import plotly.express as px
import random

## Random functions in Python (random library)

- The random module in Python provides functions to generate random numbers and select random elements. It is widely used in simulations, randomized testing, and creating random datasets for data analysis.
- You can generate random numbers, select random elements from lists, and control reproducibility with a seed.


In [None]:
# Random Functions
# You can set a seed to generate the same random numbers:
random.seed(0)

# Generate a random float:
print(random.random())  # Output: Random float between 0 and 1
print(random.uniform(1, 10))  # Output: Random float between 1 and 10
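
To check what the seed does, the sketch below draws three numbers, resets the seed, and draws again. Both runs produce the same sequence, which is what makes a simulation reproducible.

```python
import random

random.seed(0)
first_run = [random.random() for _ in range(3)]

random.seed(0)  # resetting the seed restarts the same sequence
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True
```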

In [None]:
# Generate a random integer:
print(random.randint(1, 10))  # Output: Random integer between 1 and 10 (both endpoints included)

In [None]:
# Generate a random sample from a list:
my_list = [10, 20, 30, 40, 50]

# Sampling without replacement: an element that was already drawn cannot be drawn again in the same call
print(random.sample(my_list, 2))  # Output: Random sample of 2 elements [no replacement]

print(random.choice(my_list))  # Output: Random choice from the list

# Sampling with replacement: each of the k elements is drawn from the whole list, so repeats are possible
print(random.choices(my_list, k=2))  # Output: Random choice of 2 elements [with replacement]

### Exercise: daily temperatures

Write a function to simulate daily temperature given the baseline temperature. Assume that all temperatures +/- 5 degrees around the baseline temperature are equally likely.

In [None]:
# TODO: implement this

def simulate_daily_temperature(baseline_temp: int = 20) -> int:
    return baseline_temp + random.randint(-5, 5)

simulate_daily_temperature()

## Introduction to NumPy

NumPy (Numerical Python) is a powerful library for numerical computations. It provides support for multi-dimensional arrays and functions for numerical operations on these arrays.


### Creating NumPy arrays

NumPy arrays are more efficient than Python lists for numerical operations because they store elements of a single type in one contiguous block of memory.

In [None]:
# Creating Numpy arrays:
# 1D array (vector)
arr_1d = np.array([1, 2, 3, 4, 5])
print("1D array:\n", arr_1d)

# 2D array (matrix)
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n2D array:\n", arr_2d)

# 3D array (tensor)
arr_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\n3D array:\n", arr_3d)

In [None]:
# Checking the type of the arrays
print(type(arr_1d))  # Output: <class 'numpy.ndarray'>
print(type(arr_2d))  # Output: <class 'numpy.ndarray'>

In [None]:
# Shape of the arrays
print("Shape of 1D array:", arr_1d.shape)  # Output: (5,)
print("Shape of 2D array:", arr_2d.shape)  # Output: (3, 3)
print("Shape of 3D array:", arr_3d.shape)  # Output: (2, 2, 2)

In [None]:
# Number of dimensions (ndim) of the arrays
print("Dimensions of 1D array:", arr_1d.ndim)  # Output: 1
print("Dimensions of 2D array:", arr_2d.ndim)  # Output: 2
print("Dimensions of 3D array:", arr_3d.ndim)  # Output: 3

In [None]:
# Basic operations on Numpy arrays:

# Element-wise addition
arr_sum = arr_1d + 2  # Adds 2 to each element
print(f"1D array after addition:\n{arr_sum}")

# Note the difference with Python lists:
list1 = [1, 2, 3]
list2 = [5]
list_sum = list1 + list2
print(f"\n\"Sum\" of two lists: {list_sum}")

In [None]:
# Element-wise multiplication
arr_product = arr_2d * 2  # Multiplies each element by 2
print(f"2D array after multiplication:\n{arr_product}")

In [None]:
# Matrix multiplication
arr_mult = np.dot(arr_2d, arr_2d)  # 2D matrix multiplication
print(f"Matrix multiplication result:\n{arr_mult}")
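
For 2D arrays, the `@` operator gives the same result as `np.dot` and is easy to confuse with `*`, which multiplies element by element. A short sketch contrasting the two:

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

matrix_product = arr_2d @ arr_2d       # rows times columns (matrix multiplication)
elementwise_product = arr_2d * arr_2d  # multiplies matching entries only

print(matrix_product)
print(elementwise_product)
print(np.array_equal(matrix_product, np.dot(arr_2d, arr_2d)))  # True
```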

In [None]:
# Accessing elements in a 1D array
print(f"Element at index 2 in 1D array: {arr_1d[2]}")  # Output: 3

# Accessing elements in a 2D array
print(f"\nElement at row 1, col 2 in 2D array: {arr_2d[1, 2]}")  # Output: 6

# Slicing 1D array
print(f"\nSlicing 1D array [1:4]: {arr_1d[1:4]}")  # Output: [2 3 4]

# Slicing 2D array
print(f"\nSlicing 2D array [0:2, 1:3]:\n{arr_2d[0:2, 1:3]}")

In [None]:
# Array Broadcasting:
# Broadcasting a scalar to a 2D array
arr_broadcast = arr_2d + 10
print(f"2D array after broadcasting addition:\n{arr_broadcast}")

# Broadcasting a 1D array to a 2D array
arr_2d_broadcast = arr_2d + arr_1d[:3]  # arr_1d[:3] = [1, 2, 3]
print(f"Broadcasted 2D array:\n{arr_2d_broadcast}")
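
Broadcasting compares shapes from the last dimension backwards; two dimensions are compatible when they are equal or one of them is 1. The sketch below, with illustrative arrays `row` and `column`, shows a column and a row expanding into a full grid, and an incompatible pair raising an error.

```python
import numpy as np

row = np.array([1, 2, 3])              # shape (3,)
column = np.array([[10], [20], [30]])  # shape (3, 1)

# Shapes (3, 1) and (3,) broadcast to (3, 3).
grid = column + row
print(grid.shape)  # (3, 3)
print(grid)

# Shapes (3,) and (2,) are incompatible, so NumPy raises a ValueError.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("Cannot broadcast:", err)
```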

In [None]:
# Creating Arrays with Special Functions
# Array of zeros
arr_zeros = np.zeros((3, 3))
print(f"Array of zeros:\n{arr_zeros}")

# Array of ones
arr_ones = np.ones((2, 2))
print(f"Array of ones:\n{arr_ones}")

In [None]:
# Array of evenly spaced values (like a range)
arr_range = np.arange(0, 10, 2)
print(f"Array with range 0 to 10 with step 2: {arr_range}")

# Array of random values
np.random.seed(0)
arr_random = np.random.random((2, 3))
print(f"Array of random values:\n{arr_random}")
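
`np.random.seed` sets NumPy's global random state. NumPy also offers a generator object created with `np.random.default_rng`, which keeps the random state local to one variable. The sketch below shows a few calls on such a generator; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)    # independent generator with its own seed

floats = rng.random((2, 3))            # uniform floats in [0, 1)
integers = rng.integers(1, 7, size=5)  # integers from 1 to 6 (upper bound excluded)
picks = rng.choice([10, 20, 30, 40, 50], size=2, replace=False)  # without replacement

print(floats)
print(integers)
print(picks)

# A second generator with the same seed reproduces the same numbers.
same_floats = np.random.default_rng(seed=0).random((2, 3))
print(np.array_equal(floats, same_floats))  # True
```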

In [None]:
# Numpy functions for numerical analysis
arr = np.array([1, 2, 3, 4, 5])

print(np.mean(arr))  # Output: 3.0 (mean)
print(np.std(arr))   # Output: 1.414... (standard deviation)
print(np.sum(arr))   # Output: 15 (sum of all elements)
print(np.max(arr))   # Output: 5 (maximum value)

### Exercise: Daily Temperature

Daily temperatures for a week are provided. Write a function that computes the following summary: the average temperature, the coldest and warmest day, and the number of days with a temperature above and below the historical daily average.

In [None]:
temperatures = np.array([-1, 1, -4, -1, -2, 2, 1])
historical_daily_average = -1

# TODO: implement this
def summarize_week(temperatures: np.ndarray, historical_daily_average: int) -> dict:
    return {
        'average_temperature': float(round(np.mean(temperatures), 2)),
        'coldest_day': int(np.argmin(temperatures) + 1),
        'warmest_day': int(np.argmax(temperatures) + 1),
        'days_above_avg': int(np.sum(temperatures > historical_daily_average)),
        'days_below_avg': int(np.sum(temperatures < historical_daily_average))
    }

summarize_week(temperatures, historical_daily_average)

## Vectorization with NumPy

Vectorization refers to performing operations on entire arrays at once, without explicit Python loops. The element-wise work is delegated to NumPy's compiled routines, which is typically much faster than looping in Python.


In [None]:
# Vectorization
# Without vectorization (using a Python loop)
arr = np.array([1, 2, 3, 4, 5])
squared = np.zeros_like(arr)

for i in range(len(arr)):
    squared[i] = arr[i] ** 2

print(squared)  # Output: [ 1  4  9 16 25]

# With vectorization (using NumPy's array operations)
squared_vectorized = arr ** 2
print(squared_vectorized)  # Output: [ 1  4  9 16 25]
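
The speed difference can be measured with the standard `timeit` module. The sketch below is a rough comparison rather than a careful benchmark: the array size and repeat count are arbitrary choices, and the exact timings depend on the machine.

```python
import timeit
import numpy as np

# int64 is requested explicitly so that the squares cannot overflow.
arr = np.arange(100_000, dtype=np.int64)

def square_with_loop(values):
    # Fills the result one element at a time in a Python loop.
    result = np.zeros_like(values)
    for i in range(len(values)):
        result[i] = values[i] ** 2
    return result

def square_vectorized(values):
    # A single array operation handles all elements.
    return values ** 2

loop_time = timeit.timeit(lambda: square_with_loop(arr), number=3)
vector_time = timeit.timeit(lambda: square_vectorized(arr), number=3)
print(f"Loop: {loop_time:.4f} s, vectorized: {vector_time:.4f} s")
```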

## Introduction to Pandas

Pandas is a powerful Python library for data manipulation and analysis. It provides the **DataFrame**, a 2D labeled data structure that makes it easy to organize, filter, and manipulate structured data, similar to an Excel spreadsheet or SQL table.


In [None]:
# Creating a Pandas DataFrame
data = {
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "Score": [85, 90, 88, 92]
}
df = pd.DataFrame(data)
print(df)

In [None]:
# Reading a CSV file
df = pd.read_csv("week_1_data.csv")
# Displaying the first few rows of the DataFrame
print(df.head())

In [None]:
# Displaying first 10 rows of the DataFrame
print(df.head(10))

In [None]:
# Displaying the last few rows of the DataFrame
print(df.tail())

In [None]:
# Displaying the columns of the DataFrame
print(df.columns)

In [None]:
# Displaying the shape of the DataFrame
print(df.shape)

In [None]:
# DataFrame operations
# Accessing columns
print(df[["Name", "Age"]])

In [None]:
# Filtering rows based on a condition
print(df[df["Age"] > 30])

In [None]:
# Sorting the DataFrame
df_sorted = df.sort_values("Score", ascending=False)
print(df_sorted)
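
Two further operations come up often: adding a derived column and filtering on several conditions at once. Because the contents of `week_1_data.csv` are not shown here, the sketch below uses a small stand-in DataFrame `df_demo` with the same `Name`, `Age`, and `Score` columns as the earlier example; the threshold of 88 is a hypothetical choice.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "Score": [85, 90, 88, 92],
})

# Derived column: a boolean computed from an existing column.
df_demo["Passed"] = df_demo["Score"] >= 88  # hypothetical threshold

# Combine conditions with & (and) or | (or); each condition needs parentheses.
selected = df_demo[(df_demo["Age"] > 25) & (df_demo["Score"] < 92)]
print(selected)

# Column-level summary statistics.
mean_score = df_demo["Score"].mean()
print(mean_score)  # 88.75
print(df_demo["Score"].describe())
```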

## Saving and exporting data frames

CSV is a very common format for saving and sharing tabular data.

Depending on your needs, you can also export your DataFrame into Excel or JSON formats, giving you flexibility when interacting with different tools or sharing data.

In [None]:
# Saving the DataFrame to a CSV file
df_sorted.to_csv("week_1_data_sorted.csv", index=False)
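
As a sketch of the other export options, the example below round-trips a small stand-in DataFrame through CSV text and converts it to JSON, without touching the file system. `df_demo` is illustrative. Excel export via `to_excel` works similarly but requires an Excel engine such as openpyxl to be installed, so it is left out here.

```python
import io
import json
import pandas as pd

df_demo = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [85, 90]})

# Without a file path, to_csv returns the CSV content as a string.
csv_text = df_demo.to_csv(index=False)
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)

# orient="records" produces one JSON object per row.
json_text = df_demo.to_json(orient="records")
records = json.loads(json_text)
print(records)  # [{'Name': 'Alice', 'Score': 85}, {'Name': 'Bob', 'Score': 90}]
```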

## Introduction to Plotly

Plotly is a library for creating interactive visualizations directly in Python. It is widely used in data science for creating interactive plots, such as scatter plots, bar charts, and line charts.


In [None]:
# Plotting with Plotly
# Basic scatter plot
fig = px.scatter(df, x='Name', y='Score', title='Student Scores by Name (scatter plot)')
fig.show()

In [None]:
# Adding color and size based on another variable
fig = px.scatter(df, x='Name', y='Score', size='Hours_Studied', color='Hours_Studied',
                 title='Student Scores and Hours Studied', hover_name='Name')
fig.show()

In [None]:
# Basic bar plot
fig = px.bar(df, x='Name', y='Score', title='Student Scores by Name (bar plot)')
fig.show()

In [None]:
# Customizing bar colors and adding hover information on sorted data
fig = px.bar(df_sorted, x='Name', y='Score', title='Student Scores by Name (sorted bar plot)',
             color='Hours_Studied', hover_name='Name')
fig.show()

In [None]:
# Histogram
fig = px.histogram(df, x='Score', nbins=5, title='Student Scores Distribution (histogram)')
fig.show()

In [None]:
# Line plot
time_data = {'Week': [1, 2, 3, 4, 5],
             'Score': [75, 80, 82, 85, 90]}
df_time = pd.DataFrame(time_data)

fig = px.line(df_time, x='Week', y='Score', title='Score Trend Over Time', markers=True)
fig.show()

In [None]:
# Customizing the line plot
fig.update_layout(
    title='Student Scores Trend Over Time',
    xaxis_title='Week Number',
    yaxis_title='Score Value'
)
fig.update_layout(template='plotly_white')
fig.show()
# Save the plot to an HTML file
fig.write_html("student_scores.html")

# Comprehensive Coding Exercise - Live Coding
This task combines lists, dictionaries, functions, NumPy, Pandas, and Plotly.

Problem Statement:
- Generate 10 random student scores.
- Create a function to calculate the average score.
- Store student names and scores in a Pandas DataFrame.
- Filter students who scored above the average.
- Visualize the data using Plotly.

In [None]:
import numpy as np
import pandas as pd
import plotly.express as px
from random import randint

# 1. Generate random student scores
students = ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace', 'Hannah', 'Ivy', 'Jack']
scores = [randint(50, 100) for _ in range(10)]

# 2. Function to calculate average score
def calculate_average(scores):
    return np.mean(scores)

average_score = calculate_average(scores)
print(f"Average score: {average_score}")

# 3. Create a DataFrame
data = {'Name': students, 'Score': scores}
df = pd.DataFrame(data)

# 4. Filter students with scores above average
above_average = df[df['Score'] > average_score]
print(above_average)

# 5. Visualize results with Plotly
fig = px.bar(above_average, x='Name', y='Score', title='Students Scoring Above Average')
fig.show()

# Exercises
In this section, you will find exercises to practice the concepts covered in this notebook. You can try to solve these exercises on your own or discuss them with your peers. The solutions are provided below.

# Stock Market Simulation
Simulate random price fluctuations of a stock over time.

Problem statement:

- Use random.uniform(-0.05, 0.05) to generate random daily percentage changes (between -5% and +5%) for 100 days.
- Assume a starting stock price of $100.
- Calculate the daily price changes based on the random percentage changes.
- Store the dates and prices in a Pandas DataFrame.
- Visualize the stock prices over time using Plotly.
- You can also simulate multiple stocks and compare their random price changes in a single visualization.

In [None]:
def simulate_stock_prices(stock_name, days=100, start_price=100):
    """
    Simulates random stock price changes for a given number of days.    
    :param stock_name: The name of the stock.
    :param days: The number of days to simulate.
    :param start_price: The starting price of the stock.
    :return: A dataframe with the simulated stock prices and dates.
    """
    random_percentage_changes = [random.uniform(-0.05, 0.05) for _ in range(days)]
    prices = [start_price]
    
    for change in random_percentage_changes:
        new_price = prices[-1] * (1 + change)
        prices.append(new_price)
    
    dates = list(range(days + 1))
    
    df = pd.DataFrame({
        'Day': dates,
        'Price': prices,
        'Stock': stock_name
    })
    
    return df

stocks = ['Stock_A', 'Stock_B', 'Stock_C']
dfs = [simulate_stock_prices(stock, days=100) for stock in stocks]

combined_df = pd.concat(dfs)

fig = px.line(combined_df, x='Day', y='Price', color='Stock', title='Simulated Stock Prices Over 100 Days for Multiple Stocks')
fig.update_layout(
    xaxis_title='Day',
    yaxis_title='Price',
    legend_title='Stock',
    template='plotly_white'
)
fig.show()
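
The price path is a running product: the price on day t equals the start price times the product of (1 + change) over all days up to t. That makes a vectorized variant possible with `np.cumprod`. The function below is a sketch of this alternative, not the solution above; the name `simulate_stock_prices_np` and the `seed` parameter are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

def simulate_stock_prices_np(stock_name, days=100, start_price=100.0, seed=None):
    """Vectorized variant: same model, computed without a Python loop."""
    rng = np.random.default_rng(seed)
    changes = rng.uniform(-0.05, 0.05, size=days)  # daily changes in [-5%, +5%)
    # Cumulative growth factors; the leading 1.0 keeps the start price as day 0.
    growth = np.concatenate(([1.0], np.cumprod(1 + changes)))
    return pd.DataFrame({
        "Day": np.arange(days + 1),
        "Price": start_price * growth,
        "Stock": stock_name,
    })

df_a = simulate_stock_prices_np("Stock_A", seed=0)
print(df_a.head())
```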

# Dice Roll Simulation 
Simulate rolling two dice using random.randint() and visualize the results.

Problem statement:
- Use random.randint(1, 6) to simulate rolling two dice 10,000 times.
- Store the sum of the two dice for each roll in a Pandas DataFrame.
- Use Plotly to plot the frequency distribution of the sums (2 to 12) as a bar chart.
- Add a feature to simulate dice with different numbers of faces (e.g., 8-sided dice) by changing the range in random.randint().

In [None]:
def simulate_dice_rolls(num_rolls=10000, dice_faces=6):
    """
    Simulates rolling two dice a given number of times and records the sum of their results.
    :param num_rolls: number of times to roll the dice
    :param dice_faces: number of faces on each die
    :return: a DataFrame containing the sums of the two dice rolls
    """
    sums = []
    
    for _ in range(num_rolls):
        die1 = random.randint(1, dice_faces)
        die2 = random.randint(1, dice_faces)
        total = die1 + die2
        sums.append(total)
    
    df = pd.DataFrame({
        'Sum': sums
    })
    
    return df

df = simulate_dice_rolls(num_rolls=10000, dice_faces=6)

sum_counts = df['Sum'].value_counts().sort_index()

fig = px.bar(
    sum_counts, 
    x=sum_counts.index, 
    y=sum_counts.values, 
    labels={'x': 'Sum of Two Dice', 'y': 'Frequency'},
    title='Frequency Distribution of Dice Roll Sums (Two 6-sided Dice)'
)
fig.update_layout(
    xaxis=dict(dtick=1),  # Ensure the x-axis has integer ticks
    template="plotly_white"
)
fig.show()
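
The simulated frequencies can be compared with the exact probabilities, obtained by enumerating all equally likely pairs of faces. The helper `theoretical_sum_probabilities` below is a hypothetical addition for this comparison.

```python
from collections import Counter
from itertools import product

def theoretical_sum_probabilities(dice_faces=6):
    """Exact probability of each sum when rolling two fair dice."""
    # Every ordered pair (a, b) of faces is equally likely.
    outcomes = product(range(1, dice_faces + 1), repeat=2)
    counts = Counter(a + b for a, b in outcomes)
    total = dice_faces ** 2
    return {s: counts[s] / total for s in sorted(counts)}

probs = theoretical_sum_probabilities()
print(probs[7])  # 0.16666666666666666 (6 of the 36 pairs sum to 7)

# Expected counts for 10,000 rolls, to compare against the simulated bar chart.
expected_counts = {s: round(p * 10_000) for s, p in probs.items()}
print(expected_counts)
```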

# Random Daily Temperature Simulation
Simulate daily temperature changes for a year using randomness.

Problem statement:
- Use random.uniform(-5, 5) to simulate daily temperature deviations around an average value (e.g., 25°C).
- Add seasonal effects by varying the average temperature based on the month (e.g., cooler in winter, warmer in summer).
- Store the data in a Pandas DataFrame with columns: "Date" and "Temperature".
- Calculate the average monthly temperature.
- Use Plotly to visualize the daily temperature as a time series and a bar chart for monthly averages.


In [None]:
def simulate_daily_temperatures(days=365, base_temperature=20):
    """
    Simulates daily temperature changes for a period of time (supposedly a year).
    :param days: number of days to simulate
    :param base_temperature: average temperature around which daily deviations occur
    :return: a DataFrame with daily temperatures
    """
    temperatures = []
    for day in range(days):
        # Determine the month (roughly dividing 365 days into 12 months)
        month = int(day // 30.5) + 1
        # Apply seasonal effects (cooler in winter, warmer in summer)
        if month in [12, 1, 2]:  # Winter months
            avg_temp = base_temperature - 10
        elif month in [6, 7, 8]:  # Summer months
            avg_temp = base_temperature + 10
        else:  # Spring/Autumn months
            avg_temp = base_temperature
        
        # Simulate daily temperature deviation
        daily_temp = avg_temp + random.uniform(-5, 5)
        temperatures.append(daily_temp)
    
    days_range = list(range(1, days + 1))
    
    df = pd.DataFrame({
        'Day': days_range,
        'Temperature': temperatures
    })
    
    return df

df = simulate_daily_temperatures()

# Map days to months; 'Day' starts at 1 while the simulation loop counted from 0, hence x - 1
df['Month'] = df['Day'].apply(lambda x: int((x - 1) // 30.5) + 1)
monthly_avg_temp = df.groupby('Month')['Temperature'].mean()
print(monthly_avg_temp)

fig1 = px.line(df, x='Day', y='Temperature', title='Daily Temperature Over One Year')
fig1.update_layout(xaxis_title='Day', yaxis_title='Temperature (°C)', template='plotly_white')

fig2 = px.bar(monthly_avg_temp, x=monthly_avg_temp.index, y=monthly_avg_temp.values, 
              labels={'x': 'Month', 'y': 'Average Temperature (°C)'},
              title='Average Monthly Temperature')
fig2.update_layout(xaxis_title='Month', yaxis_title='Average Temperature (°C)', template='plotly_white')
fig1.show()
fig2.show()
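
The problem statement asks for a "Date" column, and real calendar dates also remove the need to approximate months as 30.5 days. The sketch below is one possible variant using `pd.date_range`; the year 2023 (a non-leap year), the seed, and the helper `seasonal_offset` are assumptions made for illustration.

```python
import random
import pandas as pd

# One row per calendar day; 2023 is an arbitrary non-leap year.
dates = pd.date_range("2023-01-01", periods=365, freq="D")
df_dates = pd.DataFrame({"Date": dates})
df_dates["Month"] = df_dates["Date"].dt.month  # exact month numbers 1..12

def seasonal_offset(month):
    """Same seasonal rule as above: cooler in winter, warmer in summer."""
    if month in (12, 1, 2):
        return -10
    if month in (6, 7, 8):
        return 10
    return 0

base_temperature = 20
random.seed(0)  # fixed seed so the run is reproducible
df_dates["Temperature"] = [
    base_temperature + seasonal_offset(m) + random.uniform(-5, 5)
    for m in df_dates["Month"]
]

monthly_avg = df_dates.groupby("Month")["Temperature"].mean()
print(monthly_avg.round(1))
```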