# Python & Data Science Fun Guide and Cheat Sheet

Welcome, Code Warrior, to your Python cheat sheet! This notebook covers Python basics, control flow, data structures, file I/O, exception handling, object-oriented programming, and data libraries (Pandas, NumPy), plus API & web scraping tips. Each section includes practical examples and explanations to help you understand why things work as they do.

## 1. Python Fundamentals

### Comments

```python
# Single-line comment: your mission log in the Matrix
# Example:
# This is a comment

"""
Triple-quoted strings work as multi-line comments: perfect for epic
backstories (or, as docstrings, for documenting your functions).
"""
```

### Variable Assignments & Data Types

```python
# Variables store your data, like secret codes:
hero = "Neo"            # String
lives = 3               # Integer (extra lives, like in a video game)
pi = 3.14159            # Float
is_awake = True         # Boolean
```

### String Operations, Indexing & Slicing

```python
# Strings are sequences (imagine them as lines of code in the Matrix):
spell = "Expecto Patronum"
print('Spell:', spell)

# Indexing: Get the first and last characters
print('First character:', spell[0])  # 'E'
print('Last character:', spell[-1])  # 'm'

# Slicing: Extract part of the spell
print('Partial spell:', spell[0:6])  # 'Expect'

# Common string methods:
print('Lowercase:', spell.lower()) # 'expecto patronum'
print('Uppercase:', spell.upper()) # 'EXPECTO PATRONUM'
print('Replaced:', spell.replace("Patronum", "Lumos")) # 'Expecto Lumos'
words = spell.split() # Split into words
print('Words in spell:', words) # ['Expecto', 'Patronum']
```

### Operators & Comparisons

Operators are the building blocks of expressions. Here are some common ones:

| Operator | Description | Example |
|----------|-------------|---------|
| `+`      | Addition    | `3 + 5` |
| `-`      | Subtraction | `10 - 2` |
| `*`      | Multiplication | `4 * 3` |
| `/`      | Division (float) | `10 / 2` |
| `//`     | Floor Division | `10 // 3` |
| `%`      | Modulus     | `10 % 3` |
| `**`     | Exponentiation | `2 ** 3` |
| `==`     | Equal       | `x == y` |
| `!=`     | Not Equal   | `x != y` |
| `>=`     | Greater Than or Equal | `x >= y` |
| `>`      | Greater Than | `x > y` |
| `<=`     | Less Than or Equal | `x <= y` |
| `<`      | Less Than   | `x < y` |
| `and`    | Logical AND | `x > 5 and y < 10` |
| `or`     | Logical OR  | `x == 5 or y == 10` |
| `not`    | Logical NOT | `not flag` |

These operators let you compare values and build complex conditions.

For example, use `and` to ensure multiple conditions are met:

```python
marks = 90
attendance = 87
if marks >= 80 and attendance >= 85:
    print("Qualify for honors")
else:
    print("Not qualified for honors")
```
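The arithmetic operators in the table are easiest to tell apart side by side. Here is a small sketch; `total_minutes` and `is_feature_length` are made-up names for illustration only.

```python
# Hypothetical value for illustration: a running time in minutes
total_minutes = 135

hours = total_minutes // 60      # floor division keeps the whole hours
minutes = total_minutes % 60     # modulus keeps the remainder
print(f"{hours}h {minutes}m")    # 2h 15m

print(10 / 2)    # 5.0 (true division always returns a float)
print(10 // 3)   # 3
print(10 % 3)    # 1
print(2 ** 3)    # 8

# Comparisons return booleans, which `and`, `or`, and `not` combine
is_feature_length = total_minutes >= 90 and not total_minutes > 240
print(is_feature_length)  # True
```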

## 2. Control Flow & Looping

Both **for** and **while** loops follow a pattern:
1. **Initialization:** Set up your starting point or conditions.
2. **Condition:** Decide when the loop should keep going.
3. **Execution:** Run the code inside the loop.
4. **Update:** Change your starting point or condition to eventually end the loop.

**For Loops:** Use when you know the number of iterations or are iterating over a collection (like looping through your playlist).

**While Loops:** Use when the number of iterations is uncertain and the loop should keep running until a condition stops being true.
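The four steps map onto code roughly like this. It is a minimal sketch with invented names (`fuel`, `crew`); in a `for` loop, Python takes care of the setup, the check, and the update by pulling the next item from the collection for you.

```python
# While loop: all four steps are written out by hand
fuel = 3                      # 1. Initialization
burns = []
while fuel > 0:               # 2. Condition
    burns.append(fuel)        # 3. Execution
    fuel -= 1                 # 4. Update (without it the loop never ends)
print(burns)                  # [3, 2, 1]

# For loop: the iterator handles steps 1, 2, and 4 behind the scenes
crew = ['Picard', 'Riker', 'Data']
roll_call = []
for officer in crew:
    roll_call.append(officer.upper())  # 3. Execution
print(roll_call)              # ['PICARD', 'RIKER', 'DATA']
```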

### For Loop Examples: Iterating over a collection

```python
# Example 1: Iterating over a list of years
release_years = [2001, 2003, 2005]
for year in release_years:
    print(f"Arsenal star year: {year}")

# Example 2: Modifying elements in a list (the index is needed to assign)
colors = ['red', 'yellow', 'green', 'purple', 'blue']
for i in range(len(colors)):
    print(f"Before: Color {i} is {colors[i]}")
    colors[i] = 'white'
    print(f"After: Color {i} is {colors[i]}")

# Example 3: Updating dictionary values
systems = {'warp_drive': 5, 'shields': 3, 'phasers': 4}
for key in systems:
    systems[key] += 1
print('Upgraded systems:', systems)
```

### While Loop Examples

```python
# Example: Print numbers until a condition is met, e.g. a countdown before a hyperspace jump
countdown = 5
while countdown > 0:
    print(f"Jump in {countdown}...")
    countdown -= 1
print("Warp drive engaged!")
```

### Using `enumerate` for Advanced Looping

```python
# Example: Looping over a playlist with indices
tracks = ['Breakbeat', 'Neurofunk', 'Liquid Funk']
for index, track in enumerate(tracks):
    print(f"Track {index + 1}: {track}")
# Track 1: Breakbeat
# Track 2: Neurofunk
# Track 3: Liquid Funk
```

## 3. Functions & Lambdas

### Defining and Calling Functions

```python
# Functions allow you to reuse code.
def cast_spell(spell_name):
    return f"Casting {spell_name}!"

print(cast_spell("Expecto Patronum"))
```

### Lambda Functions and List Comprehensions

```python
# Lambda: a one-line function to square a number
square = lambda x: x * x
print(square(7))  # Output: 49

# List comprehensions: a concise way to create lists
squares = [x * x for x in range(10)]
print(squares) # Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```
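Lambdas tend to earn their keep as short throwaway functions handed to something else, such as the `key` argument of `sorted()`, and comprehensions can filter as well as transform. A small sketch, using an invented `spells` list:

```python
# Illustrative data: (spell name, power level) pairs, invented for this example
spells = [("Lumos", 2), ("Expecto Patronum", 9), ("Accio", 4)]

# Lambda as a sort key: order the spells by power level, strongest first
by_power = sorted(spells, key=lambda pair: pair[1], reverse=True)
print(by_power[0][0])  # Expecto Patronum

# List comprehension with a condition: keep only the squares of even numbers
even_squares = [x * x for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]

# Dictionary comprehension: same idea, but it builds a dict
power_by_name = {name: power for name, power in spells}
print(power_by_name["Accio"])  # 4
```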

## 4. Data Structures

### Overview & Comparison

| Feature        | List (`[]`)        | Tuple (`()`)         | Dictionary (`{}`)                           | Set (`{}`)                    |
|----------------|--------------------|----------------------|---------------------------------------------|-------------------------------|
| Ordered        | ✅ Yes             | ✅ Yes               | ✅ Yes (insertion order, Py3.7+)             | ❌ Unordered                  |
| Mutable        | ✅ Yes             | ❌ No                | ✅ Yes (values mutable; keys immutable)     | ✅ Yes (elements immutable)   |
| Duplicates     | Allowed            | Allowed              | Keys: **Unique**; Values: Allowed           | Not allowed (unique only)     |
| Indexing       | Yes                | Yes                  | By key only                                 | Not applicable               |
| Use Case       | Dynamic collections| Fixed data           | Fast lookup/mapping                         | Uniqueness, set operations   |
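The rows of the table can be checked directly in code. This is a quick sketch with made-up values (`genres_list`, `plays`, and friends are illustrative names):

```python
# Illustrative values, not taken from the examples in this guide
genres_list = ['Jungle', 'Dubstep', 'Jungle']   # list: ordered, duplicates kept
genres_tuple = ('Jungle', 'Dubstep', 'Jungle')  # tuple: ordered, but immutable
genres_set = set(genres_list)                   # set: duplicates removed
plays = {'Jungle': 2, 'Dubstep': 1}             # dict: unique keys map to values

genres_list[1] = 'Breakbeat'        # lists can be changed in place
print(genres_list)                  # ['Jungle', 'Breakbeat', 'Jungle']

try:
    genres_tuple[1] = 'Breakbeat'   # tuples cannot
except TypeError as err:
    print('Tuple error:', err)

print(len(genres_set))              # 2
print(plays['Jungle'])              # 2 (lookup by key, not by position)

# Empty braces create a dict, not a set; use set() for an empty set
print(type({}).__name__, type(set()).__name__)  # dict set
```

Note the last line: both dictionaries and sets are written with braces, but `{}` on its own is always an empty dictionary.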

### Lists

```python
# Creating and modifying a list
playlist = ['Old Skool Hip Hop', 'Drum & Bass', 'Jungle']
print('Original playlist:', playlist)
print(f"Playlist length: {len(playlist)}") # Length of the playlist: 3
print(f"First track: {playlist[0]}") # First track in the playlist
print(f"Slice playlist: {playlist[1:3]}") # Slice the playlist from index 1 to 2

# Append and insert new tracks
playlist.append('Breakbeat') # Add a new style at the end
playlist.insert(1, 'Dubstep') # Insert Dubstep at index 1
print('Updated playlist:', playlist) # ['Old Skool Hip Hop', 'Dubstep', 'Drum & Bass', 'Jungle', 'Breakbeat']
```

### Tuples

```python
# Tuples are immutable.
coordinates = (42, 73)
print('Original coordinates:', coordinates)
x, y = coordinates # Unpacking the tuple
print('Unpacked x:', x) # 42
print('First element of tuple:', coordinates[0]) # 42

# To update, convert to a list and back.
temp = list(coordinates)
temp[0] = 99
coordinates = tuple(temp)
print('Updated coordinates:', coordinates)

# Tuples have no sort method; sorted() returns a list, so convert it back to a tuple
sorted_tuple = tuple(sorted((3, 1, 4, 2)))
print('Sorted tuple:', sorted_tuple) # (1, 2, 3, 4)
```

### Dictionaries

```python
# Creating a dictionary
starship = {"warp_drive": 5, "shields": 3, "phasers": 4}
print('Initial starship systems:', starship)
print('Warp drive:', starship['warp_drive']) # Accessing by key
starship['nacelles'] = 2 # Adding a new key-value pair
print('Starship systems:', list(starship.keys())) # List of keys
print('Starship system values:', list(starship.values())) # List of values

# Boost each system
for system in starship:
    starship[system] += 1
print('Upgraded starship systems:', starship)
```

### Sets

```python
# Sets hold unique items like a collection of legendary relics.
relics = {"Excalibur", "Mjolnir", "Philosopher's Stone", "Excalibur"}
print('Unique relics:', relics) # The set automatically removes the duplicated "Excalibur"

# Add a relic and remove one
relics.add("Infinity Gauntlet")
relics.discard("Mjolnir")
print('Updated relics:', relics)

# Set operations
A = {1, 2, 3}
B = {3, 4, 5}
print('Union:', A.union(B))                # {1, 2, 3, 4, 5}
print('Intersection:', A.intersection(B))  # {3}
print('Difference:', A.difference(B))      # {1, 2}
print('Symmetric Difference:', A.symmetric_difference(B))  # {1, 2, 4, 5}
```

## 5. Exception Handling & Debugging

### Try-Except Blocks: Handling Errors Gracefully

```python
try:
    num = int(input("Enter a number: "))
except ValueError:
    print("Invalid input. Please enter a valid number.")
else:
    print("You entered:", num)
finally:
    print("Execution complete.")

# Handling division by zero and type errors
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Error: Division by zero encountered!")
except TypeError:
    print("Error: Invalid type provided!")
else:
    print("Division successful, result:", result)
finally:
    print("Cleanup actions executed.")
```

### More Exception Cases: File and Key Errors

```python
try:
    data = {'matrix': 'red pill', 'reality': 'blue pill'}
    print(data['illusion'])
except KeyError:
    print("KeyError: 'illusion' not found in data.")

try:
    with open('non_existent_file.txt', 'r') as f:
        content = f.read()
except FileNotFoundError:
    print("FileNotFoundError: The file does not exist.")
finally:
    print("File operation attempted.")
```

### Debugging Tips

Debugging is an art—like detective work in code:

- Use `print()` to inspect variable values.
- Use a debugger (e.g., `%debug` in Jupyter) to step through your code.
- Write tests to isolate bugs.
- Comment your code and use logging for more complex projects.

Remember: Even Morpheus had to debug the Matrix!
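Two of the tips above, logging and small tests, fit in a few lines. This is only a sketch: the `average` function and the `matrix_debug` logger name are invented for the example.

```python
import logging

# Configure logging once; DEBUG level shows every message
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s: %(message)s")
logger = logging.getLogger("matrix_debug")  # hypothetical logger name

def average(values):
    """Return the mean of a non-empty list of numbers."""
    logger.debug("average() called with %r", values)
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)

# A tiny test isolates the function from the rest of the program
assert average([2, 4, 6]) == 4
try:
    average([])
except ValueError as err:
    logger.warning("Caught expected error: %s", err)
```

Unlike `print()`, log messages can be silenced later by raising the level (for example to `logging.WARNING`) without deleting any lines.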

## 6. File Handling

### File Modes & Cursor Control

File modes determine how you interact with files:

| Mode  | Description |
|-------|-------------|
| `'r'`   | Read (file must exist) |
| `'w'`   | Write (overwrites file) |
| `'a'`   | Append (adds to end of file) |
| `'x'`   | Exclusive creation (fails if file exists) |
| `'r+'`  | Read and write |

Cursor methods:
- `.tell()` returns the current position (in bytes).
- `.seek(offset, whence)` moves the cursor; `whence` sets the reference point (0: start, 1: current, 2: end).
- `.truncate()` cuts the file at the current cursor position.

Think of it as rewinding or fast‑forwarding through a mixtape.
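To see the `'x'` mode and the `whence` values in action, here is a hedged sketch. It works inside a temporary directory so nothing real gets overwritten, `mixtape.txt` is a made-up filename, and the file is opened in binary mode because text files only allow seeking relative to the start.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mixtape.txt")  # hypothetical filename

    # 'x' (here with 'b' for binary) creates the file and fails if it exists
    with open(path, "xb") as f:
        f.write(b"Track A\nTrack B\n")

    try:
        open(path, "x")
    except FileExistsError:
        print("'x' refused to overwrite the existing file")

    with open(path, "rb+") as f:
        f.seek(0, 2)                 # whence=2: jump to the end
        size = f.tell()
        print("Size in bytes:", size)    # 16
        f.seek(-8, 2)                # 8 bytes before the end
        last_line = f.read()
        print(last_line)             # b'Track B\n'
        f.seek(8, 0)                 # whence=0: 8 bytes from the start
        f.truncate()                 # drop everything after the cursor
        f.seek(0)
        remaining = f.read()
        print(remaining)             # b'Track A\n'
```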

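```python
# Write a string, then multiple lines from a list
# ('example.txt' is a placeholder filename; 'w' creates or overwrites it)
with open("example.txt", "w") as f:
    f.write("Some text\n")
    f.writelines(["Line 1\n", "Line 2\n"])

# Read entire file as a string
with open("example.txt", "r") as f:
    content = f.read()

# Read file line by line into a list
with open("example.txt", "r") as f:
    lines = f.readlines()
```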

### File I/O Examples

```python
# Example: Using 'a+' mode. The cursor starts at the end of the file,
# so the first read returns nothing until we seek back to the start.
with open('Example2.txt', 'a+') as file:
    print("Initial Location:", file.tell())
    data = file.read()
    if not data:
        print('Read nothing')
    else:
        print('Data:', data)
    file.seek(0, 0)  # Move to beginning
    print("New Location:", file.tell())
    data = file.read()
    if not data:
        print('Read nothing')
    else:
        print('Data after seek:', data)
    print("Location after read:", file.tell())

# Example: Using 'r+' mode with truncate
with open('Example2.txt', 'r+') as file:
    file.seek(0, 0)  # Write at beginning
    file.write("Line 1\nLine 2\nLine 3\nLine 4\nfinished\n")
    file.truncate()  # Remove any old content left after the new text
    file.seek(0, 0)
    print(file.read())
```

### Reading and Writing CSV, Excel & SQL

Pandas makes it easy to work with data files. Use the following methods:

- **CSV Files**:
  - Read: `pd.read_csv('data.csv')`
  - Write: `df.to_csv('output.csv', index=False)`

- **Excel Files**:
  - Read: `pd.read_excel('data.xlsx')`
  - Write: `df.to_excel('output.xlsx', index=False)`

- **SQL Databases**: Use `pd.read_sql()` and `df.to_sql()`. Both need a database connection, typically a SQLAlchemy engine (a plain `sqlite3` connection also works for SQLite).

These methods allow you to import/export data seamlessly.
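A round trip makes the read/write pairs concrete. The sketch below avoids touching the disk by using an in-memory buffer and an in-memory SQLite database; the sample data and the `officers` table name are invented. Excel is left out because `read_excel`/`to_excel` need an extra engine such as `openpyxl` installed.

```python
import io
import sqlite3

import pandas as pd

# Small illustrative DataFrame (values invented for this example)
df = pd.DataFrame({"officer": ["Picard", "Riker"], "rank": ["Captain", "Commander"]})

# CSV round trip through an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)     # index=False leaves out the row labels
buffer.seek(0)
from_csv = pd.read_csv(buffer)
print(from_csv["officer"].tolist())   # ['Picard', 'Riker']

# SQL round trip through an in-memory SQLite database
conn = sqlite3.connect(":memory:")
df.to_sql("officers", conn, index=False)   # "officers" is a hypothetical table name
from_sql = pd.read_sql("SELECT * FROM officers", conn)
conn.close()
print(from_sql["rank"].tolist())      # ['Captain', 'Commander']
```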

#### Arsenal & Star Trek Themed Data Examples

```python
import pandas as pd

# Arsenal DataFrame
arsenal_data = {
    'Player': ['Thierry Henry', 'Cesc Fabregas', 'Robin van Persie', 'Samir Nasri', 'Jack Wilshere'],
    'ShirtNumber': [14, 4, 10, 7, 8],
    'Position': ['Forward', 'Midfielder', 'Forward', 'Midfielder', 'Midfielder'],
    'Goals': [175, 69, 52, 30, 20],
    'Appearances': [369, 266, 150, 200, 300]
}
arsenal_df = pd.DataFrame(arsenal_data)
print('Arsenal DataFrame:')
print(arsenal_df.head())

# Star Trek DataFrame
tng_data = {
    'Officer': ['Jean-Luc Picard', 'William Riker', 'Data', 'Geordi La Forge', 'Worf'],
    'Position': ['Captain', 'Commander', 'Lt. Commander', 'Chief Engineer', 'Lieutenant'],
    'Age': [59, 42, 35, 40, 38],
    'Department': ['Command', 'Command', 'Operations', 'Engineering', 'Security']
}
tng_df = pd.DataFrame(tng_data)
print('\nStar Trek TNG DataFrame:')
print(tng_df.head())
```

### Advanced DataFrame Indexing & Slicing

Pandas offers two main indexing methods:

- **.loc[]**: Label-based indexing (inclusive of both start and stop).
- **.iloc[]**: Integer position-based indexing (like standard Python slicing).

Examples:

```python
# iloc: Get rows 0 to 1 and columns 0 to 2 from Arsenal DataFrame
arsenal_df.iloc[0:2, 0:3]

# loc: Select rows by label and columns by name from TNG DataFrame
tng_df.loc[2:3, 'Officer':'Department']

# Access a single element
arsenal_df.loc[arsenal_df.index[0], 'Goals']

# Slicing with loc (both bounds inclusive)
tng_df.loc[0:2, 'Officer':'Age']
```
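The inclusive/exclusive difference is easier to spot when the row labels are not numbers. A small sketch, with an illustrative DataFrame that reuses a few figures from the Arsenal data and invented string labels:

```python
import pandas as pd

# Illustrative DataFrame with string labels so the two indexers are easy to tell apart
df = pd.DataFrame(
    {"goals": [175, 69, 52], "apps": [369, 266, 150]},
    index=["henry", "fabregas", "van_persie"],
)

# .loc slices by label and includes BOTH endpoints
by_label = df.loc["henry":"fabregas", "goals"]
print(len(by_label))        # 2

# .iloc slices by position and excludes the stop, like list slicing
by_position = df.iloc[0:1, 0]
print(len(by_position))     # 1

# Single values: label-based vs. position-based access
print(df.loc["van_persie", "apps"])   # 150
print(df.iloc[2, 1])                  # 150
```

With the default `RangeIndex`, labels and positions happen to be the same numbers, which is why `df.loc[0:2]` returns three rows while `df.iloc[0:2]` returns two.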

### Filtering, Grouping & Merging DataFrames

```python
# Filter Arsenal players with fewer than 60 goals
filtered_arsenal = arsenal_df[arsenal_df['Goals'] < 60]
print(filtered_arsenal)

# Group TNG officers by Department and count them
grouped_tng = tng_df.groupby('Department').size()
print(grouped_tng)

# Merging DataFrames: a cross join pairs every Arsenal row with every TNG row
# (5 x 5 = 25 rows); the shared 'Position' column becomes Position_x / Position_y
merged_df = pd.merge(arsenal_df, tng_df, how='cross')
print(merged_df.head())
```

### Working with Pandas Series

```python
# Creating a Pandas Series
s = pd.Series([10, 20, 30, 40, 50])
print('Series values:', s.values) # [10 20 30 40 50]
print('Series index:', s.index) # RangeIndex(start=0, stop=5, step=1)
print('Series shape:', s.shape) # (5,)

# Series methods
print('Mean:', s.mean()) # 30.0
print('Sum:', s.sum()) # 150
print('Unique values:', s.unique()) # [10 20 30 40 50]
```

## 7. Object-Oriented Programming (OOP)

### Class Definition & Object Creation

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        return f"Hello, my name is {self.name} and I'm {self.age} years old."

# Create an instance of Person
person1 = Person("Alice", 25)
print(person1.greet())
```

## 8. Data Handling with Pandas and NumPy

### Pandas Cheat Sheet

| Method | Description | Syntax & Example |
|--------|-------------|------------------|
| `.dtypes` | Shows data types of different columns in the DataFrame | `df.dtypes`   |
| `.astype()` | Converts column data types. Also worth applying after `replace`, `fillna`, and `read_csv` so the column ends up with the intended type | `df[['price', 'size']] = df[['price', 'size']].astype('int')`, or to clean up missing values safely: `df["price"] = df["price"].replace("?", np.nan).astype(float)` |
| `.read_csv(<CSV_path>)` | Read a CSV file into a DataFrame | `df = pd.read_csv("data.csv")`; with no header row: `df = pd.read_csv("data.csv", header=None)`; with column names supplied as a list: `df = pd.read_csv(file_name, names=headers)` |
| `.read_excel()` | Read Excel file | `df = pd.read_excel("data.xlsx")` |
| `.to_csv(<output CSV path>)` | Write DataFrame to CSV. index = False means row names will not be written | `df.to_csv("output.csv", index=False)` |
| `df["col"]` | Access a column | `df["age"]` |
| `.describe()` | Summary statistics; covers numeric columns by default | `df.describe()`; for all columns: `df.describe(include="all")` |
| `.replace("?", np.nan)` | Replaces the `'?'` value with NumPy NaN so placeholder values are treated as missing data. `inplace=True` modifies `df` directly, but from pandas 3.0 onward prefer assigning the result | `df = df.replace("?", np.nan)` |
| `df["column1"].replace(np.nan, mean)` | Replaces NaN in `column1` with the variable `mean`, turning unknown values into something usable such as the column mean. From pandas 3.0 onward, assign the result back | `mean = df["price"].mean()`, or safer `mean = df["price"].astype("float").mean(axis=0)`, then `df["price"] = df["price"].replace(np.nan, mean)` |
| `.drop()` | Drop rows/columns | `df.drop(["col"], axis=1, inplace=True)` |
| `.dropna()` | Remove NaN rows | `df.dropna(inplace=True)` |
| `.dropna(subset=["column1"], axis=0)` | Drops every row where `column1` is NaN, removing rows that lack relevant data | `df = df.dropna(subset=["price"], axis=0)` |
| `.groupby()` | Group and aggregate data | `df.groupby("col").agg({"sales": "sum"})` |
| `.head()` | Show first n rows | `df.head(5)` |
| `.tail()` | Show last n rows | `df.tail(5)` |
| `.info()` | DataFrame info | `df.info()` |
| `.merge()` | Merge DataFrames | `merged_df = pd.merge(df1, df2, on=["key"])` |
| `.columns` | Show headers | `df.columns` |
| `.columns = headers` | Set columns to headers list variable | `df.columns = headers`, headers = ["id", "name"] |
| `pd.get_dummies()` | Creates dummy (indicator) variables, turning a categorical column into numeric ones: each distinct value gets its own column | `dummy_variable_1 = pd.get_dummies(df["fuel-type"])`, then rename the columns: `dummy_variable_1.rename(columns={'gas':'fuel-type-gas', 'diesel':'fuel-type-diesel'}, inplace=True)`, join the two DataFrames: `df = pd.concat([df, dummy_variable_1], axis=1)`, and finally drop the original column: `df.drop("fuel-type", axis=1, inplace=True)` |
| `.rename()` | Renames labels, for example column names | `df = df.rename(columns={"mpg": "l/100km"})` or `df.rename(columns={'old_name': 'new_name'}, inplace=True)` |
| `.isnull()` | Flags missing data: every NaN evaluates to `True` | `missing_data = df.isnull()`, then `missing_data.head(5)` |
| `.notnull()` | Flags present data: every NaN evaluates to `False` | `present_data = df.notnull()`, then `present_data.head(5)` |
| `.value_counts()` | Counts how often each value occurs in a column | `df['price'].value_counts()` |
| `.idxmax()` | Returns the index label of the largest value; applied to `value_counts()` it gives the most frequent entry in a column | `df['price'].value_counts().idxmax()`, e.g. `MostFrequentEntry = df['attribute_name'].value_counts().idxmax()`, then `df['attribute_name'] = df['attribute_name'].replace(np.nan, MostFrequentEntry)` |
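Several rows of the cheat sheet chain together into a typical cleaning pass. The following is a sketch on invented sample data, not a fixed recipe; `fillna` is used for the fill step here, which does the same job as `replace(np.nan, mean)`.

```python
import numpy as np
import pandas as pd

# Invented sample data: "?" marks a missing price
df = pd.DataFrame({
    "price": ["100", "?", "300"],
    "fuel-type": ["gas", "diesel", "gas"],
})

# 1. Turn the placeholder into a real missing value and fix the dtype
df["price"] = df["price"].replace("?", np.nan).astype(float)
print(df["price"].isnull().sum())      # 1

# 2. Fill the gap with the column mean (assign the result, no inplace)
mean_price = df["price"].mean()        # NaN is skipped: (100 + 300) / 2
df["price"] = df["price"].fillna(mean_price)
print(df["price"].tolist())            # [100.0, 200.0, 300.0]

# 3. One indicator column per category, then drop the original column
dummies = pd.get_dummies(df["fuel-type"], prefix="fuel-type")
df = pd.concat([df, dummies], axis=1).drop(columns="fuel-type")
print(sorted(df.columns))              # ['fuel-type_diesel', 'fuel-type_gas', 'price']
```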

### Iterating through data to see what has missing data

```python
missing_data = df.isnull()
print(missing_data.head())
for column in missing_data.columns:
    print(column)
    print(missing_data[column].value_counts())
    print("")
```

### Normalisation

```python
df['attribute_name'] = df['attribute_name']/df['attribute_name'].max()
```

### Binning

```python
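import numpy as np
import pandas as pd

# n is the number of bin edges; n edges produce n - 1 bins (4 is an example value)
n = 4
bins = np.linspace(df['attribute_name'].min(), df['attribute_name'].max(), n)
# One label per bin, so n - 1 labels in total
GroupNames = ['Group1', 'Group2', 'Group3']
df['binned_attribute_name'] = pd.cut(
    df['attribute_name'], bins, labels=GroupNames, include_lowest=True
)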
```
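Here is the same binning recipe run on concrete numbers. The prices are invented; the point to notice is that four edges give three bins, and that `include_lowest=True` keeps the minimum value inside the first bin.

```python
import numpy as np
import pandas as pd

# Invented prices for illustration
df = pd.DataFrame({"price": [5000, 12000, 18000, 26000, 45000]})

# 4 evenly spaced edges between min and max define 3 equal-width bins
bins = np.linspace(df["price"].min(), df["price"].max(), 4)
print(bins)

group_names = ["Low", "Medium", "High"]   # one label per bin
df["price-binned"] = pd.cut(
    df["price"], bins, labels=group_names, include_lowest=True
)
print(df["price-binned"].tolist())
# ['Low', 'Low', 'Low', 'Medium', 'High']
```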

### Plot graphs

```python
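import matplotlib.pyplot as plt

# Sort by bin so each bar lines up with its own label, not with frequency order
counts = df["price-binned"].value_counts().sort_index()
plt.bar(counts.index.astype(str), counts.values)
plt.xlabel("Price")
plt.ylabel("count")
plt.title("Price bins")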
```

## 9. Numerical Computing with NumPy

### NumPy Cheatsheet

| Method | Description | Syntax & Example |
|--------|-------------|------------------|
| `np.array()` | Create an array | `arr = np.array([1,2,3])` |
| `np.mean()` | Mean of elements | `np.mean(arr)` |
| `np.sum()`  | Sum of elements | `np.sum(arr)` |
| `np.min()`  | Minimum value | `np.min(arr)` |
| `np.max()`  | Maximum value | `np.max(arr)` |
| `np.dot()`  | Dot product | `np.dot(arr1, arr2)` |
| `np.round()`  | Rounds values, here the `Screen_Size` column to 2 decimal places | `np.round(df[['Screen_Size']], 2)` |
| `np.linspace()`  | Returns evenly spaced numbers over an interval, which makes it handy for bin edges. Three equal-width bins need four edges (e.g. 1, 3, 5, 7); name the bins and use `pd.cut` to assign each value to one | `bins = np.linspace(min(df['price']), max(df['price']), 4)`, `group_names = ["Low", "Medium", "High"]`, `df['price-binned'] = pd.cut(df['price'], bins, labels=group_names, include_lowest=True)` |

### NumPy Arrays: The Core of Scientific Computing

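NumPy arrays store values of a single type in one contiguous block of memory, so operations on whole arrays run in compiled code instead of Python loops. That is what makes them fast and efficient for complex calculations.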
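As a rough illustration of the difference in style, compare doubling every value in a plain list with doing the same to an array. The names below are made up for the sketch, and no timing is claimed here.

```python
import numpy as np

values = list(range(5))          # plain Python list: [0, 1, 2, 3, 4]
arr = np.array(values)

# Pure Python: an explicit loop touches one element at a time
doubled_list = [v * 2 for v in values]

# NumPy: one vectorized expression handles the whole array
doubled_arr = arr * 2

print(doubled_list)              # [0, 2, 4, 6, 8]
print(doubled_arr.tolist())      # [0, 2, 4, 6, 8]

# Arrays hold a single data type; kind 'i' means signed integer
print(arr.dtype.kind)            # i
```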

### Creating and Manipulating Arrays

```python
import numpy as np

# Creating arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2], [3, 4]])

# Basic attributes
print('Dimensions:', arr1.ndim)
print('Shape:', arr1.shape)
print('Size:', arr1.size)

# Slicing with steps
print('Slice [1:5:2]:', arr1[1:5:2])

# Element-wise operations
print('Addition (broadcasting):', arr1 + 5)
print('Multiplication:', arr1 * 2)

# Dot product
dot_product = np.dot(arr1, arr1)
print('Dot product:', dot_product)
```

### Universal Functions & Matrix Operations

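```python
# Universal functions (ufuncs) such as np.sqrt operate element-wise
print('Square roots:', np.sqrt(arr1))

# Aggregations such as np.mean and np.std reduce the array to a single value
print('Mean:', np.mean(arr1))
print('Standard Deviation:', np.std(arr1))

# Matrix multiplication example
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
print('Matrix multiplication (dot):', np.dot(mat1, mat2))
```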

## 10. API & Web Scraping

### Making API Requests with Requests

```python
import requests

url = "https://api.example.com/data"
response = requests.get(url)
if response.status_code == 200:
    data = response.json()
    print(data)
else:
    print('Failed to retrieve data:', response.status_code)
```

### Web Scraping with BeautifulSoup

```python
from bs4 import BeautifulSoup

# Imagine scraping secret archives from a digital world like in Assassin’s Creed
html = '<html><body><a href="http://example.com">Link</a></body></html>'
soup = BeautifulSoup(html, 'html.parser')
link = soup.find('a')
print('Scraped link:', link['href'])
```

### API Request Methods and Headers

| Method   | Description                  | Syntax & Example |
|----------|------------------------------|------------------|
| `GET`    | Retrieve data                | `response = requests.get(url)` |
| `POST`   | Submit new data              | `response = requests.post(url, data={...})` |
| `PUT`    | Update existing data         | `response = requests.put(url, data={...})` |
| `DELETE` | Delete a resource            | `response = requests.delete(url)` |
| Headers  | Pass custom headers          | `headers = {'Authorization': 'Bearer TOKEN'}` |
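To see how a method, headers, and a payload come together, the sketch below builds a `POST` request with `requests.Request(...).prepare()` and inspects it without sending anything. The URL and token are the placeholders used above, and the JSON payload is invented.

```python
import requests

# Placeholder endpoint and token; nothing is sent over the network
url = "https://api.example.com/data"
headers = {"Authorization": "Bearer TOKEN", "Accept": "application/json"}

# Build the request without sending it, so it can be inspected offline
request = requests.Request(
    "POST", url, headers=headers, json={"name": "Neo"}  # payload is invented
)
prepared = request.prepare()

print(prepared.method)                    # POST
print(prepared.headers["Authorization"])  # Bearer TOKEN
print(prepared.headers["Content-Type"])   # application/json
print(prepared.body)

# To actually send it, something like the following would be used:
# response = requests.post(url, headers=headers, json={"name": "Neo"}, timeout=10)
# response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
```

Passing `json=` instead of `data=` makes Requests serialize the payload and set the `Content-Type` header for you.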

## End of Cheat Sheet

Congratulations, Code Warrior! This notebook provides a comprehensive reference for Python programming fundamentals, data structures, control flow, file handling, OOP, and data manipulation libraries. It's designed to be both a quick reference and a learning tool. Feel free to extend and modify it to fit your projects and style. Happy coding and may your algorithms be ever in your favor!