## Python Functions & Modularity 

### Why functions?
- Reuse logic instead of copy-paste.
- Single place to fix or improve behavior.
- Clear structure: each function does one job.
- Easier debugging and testing because logic is isolated.

### Core concepts
- **Function**: named block of code for a specific task.
- **Parameter**: variable in the function definition.
- **Argument**: actual value passed during a call.
- **Return**: sends a result back; without it, Python returns `None`.
- **Scope**: local variables live inside a function; globals live outside.

Problem statement: Create and call a greeting function that prints personalized messages for different names.

In [None]:
def greet(name):
    print(f'Hello, {name}!')  # simple parameter and argument

greet('Alice')
greet('Bob')

This shows a user-defined function with one parameter.
We call it multiple times, passing different arguments.
Functions let us reuse the same logic without rewriting it.

Problem statement: Build a function that returns a full name string from first and last names.

In [None]:
def full_name(first_name, last_name):
    return first_name + ' ' + last_name  # return sends data back

name = full_name('John', 'Doe')
print(name)

The `return` statement hands a value back to the caller.
Without `return`, this function would give `None`.
Returned values can be stored and reused later.
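
To see the `None` behavior concretely, here is a minimal sketch. `full_name_no_return` is a hypothetical variant of `full_name`, introduced only for illustration, that builds the string but forgets to return it.

```python
def full_name_no_return(first_name, last_name):
    # Hypothetical variant: builds the string but never returns it
    combined = first_name + ' ' + last_name

result = full_name_no_return('John', 'Doe')
print(result)          # None
print(result is None)  # True
```

The string is created inside the function and then discarded, so the caller has nothing to store or reuse.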

Problem statement: Write a function that loops through a list and returns the sum of its elements.

In [None]:
def sum_list(numbers):
    total = 0
    for num in numbers:  # loop inside a function
        total += num
    return total

my_list = [1, 2, 3, 4, 5]
print(sum_list(my_list))

A loop inside a function lets us repeat work on inputs.
Here we accumulate a running total across the list.
Changing the input list changes the output without editing the function.

Problem statement: Show how local scope can shadow a global variable with the same name.

In [None]:
message = 'global hello'

def show_scope():
    message = 'local hello'  # local variable shadows global
    return message

print('Inside function:', show_scope())
print('Outside function:', message)

Local variables exist only inside the function body.
The global `message` stays unchanged outside.
Understanding scope prevents name clashes and bugs.
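
One common scope bug is assigning to a global name inside a function without realizing the assignment makes that name local. The sketch below uses hypothetical names (`counter`, `increment_broken`, `increment`) to show the failure and one way around it: pass the value in and return the new one.

```python
counter = 0  # global variable

def increment_broken():
    # Assigning to `counter` makes it local to this function,
    # so reading it before the assignment fails.
    counter += 1
    return counter

def increment(value):
    # Safer: take the value as a parameter and return the result
    return value + 1

try:
    increment_broken()
except UnboundLocalError as err:
    print('Scope bug:', type(err).__name__)

counter = increment(counter)
print('counter is now', counter)
```

Passing values in and returning results keeps each function independent of whatever names happen to exist outside it.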

Problem statement: Implement a reusable tax calculator with defaults and a safety check against negative taxable income.

In [None]:
def calculate_tax(income, tax_rate=0.2, deduction=5000):
    taxable_income = max(0, income - deduction)  # guard against negatives
    return taxable_income * tax_rate

sample = calculate_tax(50000, tax_rate=0.18, deduction=4000)
print(f'Tax to be paid: {sample}')

This modular function encapsulates the tax rule in one place.
Default parameter values keep calls flexible, while the `max(0, ...)` guard keeps taxable income from going negative.
Updating the rule later requires changing only this function.
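
A short sketch of how the defaults and the guard behave for different calls. The function is repeated so the snippet runs on its own; the input values are illustrative.

```python
def calculate_tax(income, tax_rate=0.2, deduction=5000):
    taxable_income = max(0, income - deduction)  # never below zero
    return taxable_income * tax_rate

# Rely on both defaults: (50000 - 5000) * 0.2
default_tax = calculate_tax(50000)

# Income below the deduction: taxable income is clamped to 0
low_income_tax = calculate_tax(3000)

# Override only one default by keyword
no_deduction_tax = calculate_tax(50000, deduction=0)

print(default_tax, low_income_tax, no_deduction_tax)
```

Keyword arguments let a caller override one default without having to restate the others.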

### Best practices for students
- Name functions after their action (e.g., `calculate_tax`).
- Keep one clear responsibility per function.
- Use parameters instead of hard-coding values.
- Prefer returning values over printing inside helpers.
- Add brief comments when the intent is not obvious.
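
As an illustration of preferring return values over printing, here is a sketch with a hypothetical helper, `format_greeting`, that returns the text and leaves the decision about output to the caller.

```python
def format_greeting(name):
    # Hypothetical helper: returns the text instead of printing it
    return f'Hello, {name}!'

# The caller decides what to do with the result: print it...
print(format_greeting('Alice'))

# ...or collect several results for later use
greetings = [format_greeting(n) for n in ['Alice', 'Bob']]
print(greetings)
```

Because the helper returns a value, it can be reused in lists, files, or tests, which a function that only prints cannot support.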

### Git quick reminders
- `git clone <url>`: get a remote repo the first time.
- `git pull`: bring remote updates into your local clone.
- Commit locally, then `git push` to share changes.

### One-glance recap
| Concept | Description | Example |
| --- | --- | --- |
| Function | Reusable block of code | `def greet(name): ...` |
| Parameter | Placeholder in definition | `name` in `def greet(name)` |
| Argument | Actual value passed | `greet('Alice')` |
| Return | Sends result back | `return first_name + ' ' + last_name` |
| Scope | Where a variable lives | local inside function, global outside |

Problem statement: Create a text file, write a few lines, then read the full content back.

In [None]:
# Create and read a text file
with open('notes_basic.txt', 'w') as f:
    f.write('First line\n')
    f.write('Second line\n')
    f.write('Third line\n')

with open('notes_basic.txt', 'r') as f:
    content = f.read()
    print(content)

Writing and then reading in one place keeps the demo self-contained.
Using a context manager closes the file automatically, even if errors occur.
Printing the full content shows the exact text that was persisted to disk.
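
The automatic-close behavior can be checked through the file object's `closed` attribute. This is a minimal sketch; the temporary directory and the file name `notes_demo.txt` are assumptions chosen so the snippet leaves nothing behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'notes_demo.txt')  # hypothetical scratch file

    with open(path, 'w') as f:
        f.write('First line\n')
    closed_after_write = f.closed  # checked after the block has ended

    try:
        with open(path, 'r') as f:
            raise RuntimeError('simulated failure while reading')
    except RuntimeError:
        pass
    closed_after_error = f.closed  # still closed, despite the exception

print(closed_after_write, closed_after_error)
```

Both flags should be `True`: leaving the `with` block closes the file whether the block finished normally or raised.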

Problem statement: Append new log entries to an existing file without overwriting earlier lines.

In [None]:
# Append log entries and inspect the file
log_path = 'app_log.txt'

# Seed the file once
with open(log_path, 'w') as f:
    f.write('session start\n')

# Append new entries
with open(log_path, 'a') as f:
    f.write('user clicked button\n')
    f.write('user submitted form\n')

# Read back to verify we did not overwrite
with open(log_path, 'r') as f:
    for line in f:
        print(line.strip())

Using `'a'` mode adds lines to the end instead of erasing the file.
Seeding the file once shows the difference between initial content and appended logs.
Reading at the end verifies the file now has all accumulated events.
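
To contrast the two modes directly, the sketch below appends to a file and then reopens it with `'w'`. The temporary directory and the file name `mode_demo.txt` are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, 'mode_demo.txt')  # hypothetical file name

    with open(log_path, 'w') as f:
        f.write('session start\n')
    with open(log_path, 'a') as f:
        f.write('user clicked button\n')
    with open(log_path, 'r') as f:
        after_append = f.read().splitlines()

    # Reopening with 'w' truncates the file before writing
    with open(log_path, 'w') as f:
        f.write('fresh start\n')
    with open(log_path, 'r') as f:
        after_rewrite = f.read().splitlines()

print(after_append)   # both earlier lines survive the append
print(after_rewrite)  # only the line written after truncation
```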

Problem statement: Filter a text file while streaming line by line to avoid loading everything into memory.

In [None]:
# Stream lines and print those starting with 'E'
data_path = 'names_list.txt'

# Create the file with mixed names
with open(data_path, 'w') as f:
    f.write('Alice\n')
    f.write('Eve\n')
    f.write('Ethan\n')
    f.write('Bob\n')
    f.write('Elena\n')

# Read line by line to filter without loading all at once
with open(data_path, 'r') as f:
    for line in f:
        name = line.strip()
        if name.startswith('E'):
            print(name)

Creating the file inside the notebook keeps the example runnable anywhere (e.g., Colab).
Streaming line by line scales to large files because you never hold everything in memory.
`startswith('E')` shows a simple condition you can swap for other filters.
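
One way to make the filter swappable is to pass the condition in as a function. The generator below, `matching_lines`, is a hypothetical helper (not part of the standard library) that still reads one line at a time; the file name and temporary directory are assumptions.

```python
import os
import tempfile

def matching_lines(path, predicate):
    # Hypothetical helper: yield each stripped line for which predicate(line) is true
    with open(path, 'r') as f:
        for line in f:
            name = line.strip()
            if predicate(name):
                yield name

with tempfile.TemporaryDirectory() as tmp_dir:
    data_path = os.path.join(tmp_dir, 'names_demo.txt')
    with open(data_path, 'w') as f:
        f.write('Alice\nEve\nEthan\nBob\nElena\n')

    # Same streaming loop, two different filters
    starts_with_e = list(matching_lines(data_path, lambda n: n.startswith('E')))
    short_names = list(matching_lines(data_path, lambda n: len(n) <= 3))

print(starts_with_e)  # ['Eve', 'Ethan', 'Elena']
print(short_names)    # ['Eve', 'Bob']
```

Because `matching_lines` is a generator, only the current line is held in memory until the caller asks for the next match.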

Problem statement: Write and read structured data using JSON files so it round-trips between Python and disk.

In [None]:
import json

profile_data = {
    "users": [
        {"name": "Aria", "interests": ["reading", "cycling"]},
        {"name": "Noah", "interests": ["gaming", "hiking"]}
    ],
    "meta": {"count": 2}
}

# Write JSON to disk with pretty formatting
with open('profiles.json', 'w') as f:
    json.dump(profile_data, f, indent=2)

# Read JSON back into Python
with open('profiles.json', 'r') as f:
    loaded = json.load(f)

# Use the loaded data
first_user = loaded["users"][0]["name"]
print('First user:', first_user)

JSON maps cleanly onto Python dict/list structures, so saving and loading feels natural.
`json.dump(..., indent=2)` keeps the file human-readable for quick inspection.
Round-tripping data from dict → file → dict proves persistence works end-to-end.
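
The round trip is exact for dicts with string keys, lists, strings, numbers, booleans, and `None`. A quick in-memory check with `json.dumps` and `json.loads` also shows two conversions worth knowing about: tuples come back as lists, and non-string keys come back as strings. The sample data is illustrative.

```python
import json

# Plain dict/list data survives unchanged
plain = {"users": [{"name": "Aria", "interests": ["reading", "cycling"]}],
         "meta": {"count": 1}}
plain_restored = json.loads(json.dumps(plain))
print(plain_restored == plain)  # True

# Tuples and integer keys are converted on the way out
tricky = {"point": (1, 2), 1: "one"}
tricky_restored = json.loads(json.dumps(tricky))
print(tricky_restored)  # {'point': [1, 2], '1': 'one'}
```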

Problem statement: Build 1D and 2D NumPy arrays and apply element-wise math to show vectorization.

In [None]:
import numpy as np

# 1D and 2D arrays
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

# Element-wise operations
scaled = arr1 * 5
squared = arr2 ** 2

print('arr1:', arr1)
print('scaled:', scaled)
print('arr2 squared:\n', squared)

NumPy arrays are homogeneous (every element shares one dtype) and are typically stored in one contiguous block of memory, so element-wise math is fast.
Vectorized expressions like `arr1 * 5` run the loop in compiled code instead of the Python interpreter.
The 2D square shows the same operation scaling across rows and columns.
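
As a sanity check, the sketch below compares the vectorized expression with an explicit Python loop over the same values, and shows the single shared dtype that NumPy picks when the inputs are mixed. Variable names other than `arr1` are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4])

# Explicit Python loop versus the vectorized expression
loop_result = [int(x) * 5 for x in arr1]
vectorized = arr1 * 5
print(loop_result)          # [5, 10, 15, 20]
print(vectorized.tolist())  # [5, 10, 15, 20]

# Mixed int/float input is promoted to one common dtype
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64
```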

Problem statement: Reshape an array and inspect its dimensions to see how data layout changes without copying.

In [None]:
import numpy as np

arr = np.arange(1, 7)  # [1 2 3 4 5 6]
print('Original:', arr, 'shape:', arr.shape, 'ndim:', arr.ndim)

reshaped = arr.reshape(2, 3)
print('Reshaped to 2x3:\n', reshaped)
print('New shape:', reshaped.shape, 'ndim:', reshaped.ndim)

`reshape` returns a view onto the same underlying data whenever the memory layout allows (as it does here), so it is cheap; the new shape must hold the same number of elements.
Inspecting `shape` and `ndim` clarifies how the array is organized into dimensions.
Attempting an incompatible shape raises a `ValueError`, which protects against silent bugs.
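
Both points can be checked directly. The sketch below uses `np.shares_memory` to confirm that the reshaped array refers to the original data, then triggers the `ValueError` with a shape that needs 8 elements instead of 6.

```python
import numpy as np

arr = np.arange(1, 7)
reshaped = arr.reshape(2, 3)

# The reshaped array is a view: it shares memory with the original
print(np.shares_memory(arr, reshaped))  # True

# Writing through the view changes the original array too
reshaped[0, 0] = 100
print(arr[0])  # 100

# 4 x 2 needs 8 elements, but only 6 are available
try:
    arr.reshape(4, 2)
except ValueError as err:
    print('ValueError:', err)
```

Since a view shares data with its source, call `.copy()` on the result when an independent array is needed.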

Problem statement: Use boolean masks to filter values and compute quick aggregates on the filtered data.

In [None]:
import numpy as np

scores = np.array([55, 72, 88, 91, 60, 47, 99])
mask = scores >= 70
passed = scores[mask]

print('Original scores:', scores)
print('Mask:', mask)
print('Passed:', passed)
print('Average of passed:', passed.mean())
print('Max of passed:', passed.max())

Boolean masks let you pick only the elements you want in a single expression.
Aggregations like `mean` and `max` run in optimized C, so they stay fast even on large arrays.
This mirrors real workflows: filter first, then summarize the subset.
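
Masks can also be combined. The sketch below selects a band of scores with `&` (each comparison needs its own parentheses) and counts matches without building the filtered array; the 70 to 90 band is an arbitrary example.

```python
import numpy as np

scores = np.array([55, 72, 88, 91, 60, 47, 99])

# Combine two conditions element-wise with & (not the keyword `and`)
mid_band = scores[(scores >= 70) & (scores < 90)]
print(mid_band)  # [72 88]

# Count how many scores pass, straight from the mask
num_passed = np.count_nonzero(scores >= 70)
print(num_passed)  # 4
```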

Problem statement: Compare Python list looping vs NumPy vectorization on a mid-sized dataset to see the speed gap.

In [None]:
import numpy as np
import time

size = 200_000
py_list = list(range(size))
np_arr = np.arange(size)

# Python loop (written as a list comprehension)
start = time.perf_counter()
py_out = [x + 5 for x in py_list]
py_time = time.perf_counter() - start

# NumPy vectorized
start = time.perf_counter()
np_out = np_arr + 5
np_time = time.perf_counter() - start

print(f'Python list time: {py_time:.4f} seconds')
print(f'NumPy vectorized time: {np_time:.4f} seconds')
print('Speedup (~x):', round(py_time / np_time, 2))

Even at 200k elements, NumPy's vectorized math is typically many times faster than the equivalent Python loop; the exact speedup depends on your machine and NumPy build.
Vectorization delegates the heavy lifting to optimized C code underneath the ndarray.
This pattern scales to millions of elements, which is why NumPy is a foundation for data science tooling.
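
A single `perf_counter` measurement can be noisy. As a hedged alternative, the sketch below repeats each measurement with the standard library's `timeit.repeat` and keeps the best run; the repeat counts are arbitrary choices, and the numbers will vary between machines.

```python
import timeit
import numpy as np

size = 200_000
py_list = list(range(size))
np_arr = np.arange(size)

# Each measurement runs the statement 5 times; keep the fastest of 3 measurements
py_time = min(timeit.repeat(lambda: [x + 5 for x in py_list], number=5, repeat=3))
np_time = min(timeit.repeat(lambda: np_arr + 5, number=5, repeat=3))

print(f'Best of 3 (5 runs each): list {py_time:.4f}s, NumPy {np_time:.4f}s')
```

Taking the minimum reduces the influence of one-off slowdowns such as other processes competing for the CPU.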