<a href="https://colab.research.google.com/github/nickv779/LADS-Notebooks/blob/main/HW1_Integrative_Programming_Fall25.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Homework 1, Lin. Alg. for Data Science.
# Due date: See discord/canvas.


# Instructions:
-1. Copy the notebook so that you can save it!

0. Solve the tasks (by writing python code and answering extra questions, if any).

1. **Rename the notebook** like this: FirstName_LastName_HW1.ipynb (e.g. I'd rename it as: Hubert_Wagner_HW1.ipynb)

2. Run all the cells in the notebook, so that all results are visible.

3. **Important**: on colab create a shared link using the option **"for anyone with the link"** and switch permission from **Viewer** to **Editor**, so that it says "Anyone on the internet with the link can edit". Otherwise, I won't be able to read your work!

4. **Submit** the above link on canvas.

5. Later, *when the time comes* answer some brief questions about your solution via a google form I will send you. **This is part of the homework assignment**, so don't miss it!



# Collaboration rules:
> Since we are just starting, feel free to ask questions on discord if you're completely stuck, or discuss it in person. Try not to spoil it for others, though! And make sure you write the code on your own, from scratch.


# Usage of LLMs (ChatGPT etc.)

This is a simple task, which is meant for you to get proficient with the basics (and combining math and programming). ChatGPT is able to solve it, but then you're not really practicing and you're just wasting your time... Later assignments will rely on insights (and programming proficiency) acquired in these simpler assignments -- and they are too complex for ChatGPT. I'd suggest you try to do this from scratch.

Heads up: While reasonable usage of LLMs is not disallowed (you'll definitely use them in your future work) -- in the feedback form you may be asked if/how you used them. So if you use them, note down the prompt and output.

# If this assignment is too hard...
... asking on discord is better than handing it off to an LLM...




# Task 1: Approximating hard integrals

> okay, some of them are not so hard.

Implement a python function which can **approximate the value of a definite integral of a mathematical function** on an interval $[a,b)$. Use the absolutely simplest possible method that works!
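
One candidate for the "simplest possible method" is a left Riemann sum. This is an assumption about what the task intends, not the only acceptable answer. With $n$ equal-width rectangles of width $h$ (both symbols are introduced here for illustration), the approximation reads:

$$\int_a^b f(x)\,dx \approx h \sum_{i=0}^{n-1} f(a + i\,h), \qquad h = \frac{b-a}{n}\, .$$

Sampling at the left edge of each rectangle means $f(b)$ is never evaluated, which matches the half-open interval $[a,b)$.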

## Implement two versions:
- start from a basic python version without numpy (it will be slow)
- implement another one using numpy (it should be much faster)

Use them to evaluate the following definite integrals:
- $$\int_0^{\frac{\pi}{2}} \arccos{\frac{\cos{x}}{1 + 2\cos{x}}}\,dx\, .$$
- $$\int_0^{e} \cosh x dx\, .$$
- $$\int_1^{e} \frac{1}{x} dx\, .$$
- $$\int_0^{1} 1 dx\, .$$

> For the final version, use numpy -- **except for functions like np.trapz**. You should implement your own simple integration method.

Make sure your function does exactly what it promises.

> Additionally, the last integral has to be evaluated correctly using a **small number of 'elements' to approximate**, say, 1,2, or 3 (using the same function as the others without handling this extra case in a special way). It's essentially meant as a unit test for your algorithm and its implementation.
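
The unit-test idea above can be sketched as follows. This is a minimal pure-python illustration, assuming a left Riemann sum as the integration method; the helper name `left_riemann_sum` is hypothetical and not part of the assignment.

```python
def left_riemann_sum(f, a, b, n):
    """Hypothetical helper: left Riemann sum with n equal-width rectangles."""
    h = (b - a) / n  # width of each rectangle
    return h * sum(f(a + i * h) for i in range(n))

# The integral of the constant 1 over [0, 1) should come out as 1
# for any number of rectangles, including very small ones.
results = [left_riemann_sum(lambda x: 1.0, 0.0, 1.0, n) for n in (1, 2, 3)]
print(results)
```

If the rectangle width and the sample spacing disagree (for example, a width of $(b-a)/(n-1)$ combined with $n$ samples), this test is likely to expose it immediately, because the error is large when $n$ is small.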

In [35]:
'''PLEASE READ! This is useful for the last example.
A function taking a vector of values and returning 1 will *not* return a vector,
but rather a single number. This function solves this silly problem.
'''

def constant_function_of_one(x):
    '''Evaluates a constant function returning 1.
    It handles both a single number and np.array as input.

    Args:
      x: either a float or np.array of length n
    Returns:
      either a single 1. (as a float) or a length-n np.array of 1.
    '''
    return 0 * x + 1.
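
To see the "silly problem" concretely, the sketch below compares a naive constant function with the helper above on a small array. The name `naive_one` is hypothetical and only used for illustration; `constant_function_of_one` is repeated so the snippet runs on its own.

```python
import numpy as np

def naive_one(x):
    # Hypothetical naive version: ignores x, so it always returns one scalar.
    return 1.0

def constant_function_of_one(x):
    # 0 * x has the same shape as x, so adding 1. gives ones of that shape.
    return 0 * x + 1.

xs = np.array([0.0, 0.5, 1.0])
naive_shape = np.shape(naive_one(xs))                   # shape of a single number
vector_shape = np.shape(constant_function_of_one(xs))   # one value per input
scalar_result = constant_function_of_one(2.0)           # scalar in, scalar out
print(naive_shape, vector_shape, scalar_result)
```

Note that the `0 * x` trick assumes finite inputs: if `x` contains `inf` or `nan`, the result is `nan` rather than 1.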

In [36]:
import numpy as np

def slow_integrate(f_x, from_x, to_x, num_elements):
    '''
    Basic python implementation for performing definite integration 
    of a passed in function with integration parameters.

    Args:
        f_x: function to integrate
        from_x: lower bound of x
        to_x: upper bound of x
        num_elements: number of elements to approximate with
    Returns:
        area under the function within bounds
    '''

    # initialize the width of each rectangle
    range_of_bounds = to_x - from_x
    width = range_of_bounds / num_elements

    # accumulate the total rectangle area, starting from zero
    area_sum = 0.0

    # left Riemann sum: the i-th rectangle has its left edge at
    # from_x + i * width, and its height is f_x evaluated at that edge
    for i in range(num_elements):
        # computed directly from i to avoid accumulating rounding error
        curr_x = from_x + i * width
        area_sum += width * f_x(curr_x)

    return area_sum


def integrate(f_x, from_x, to_x, num_elements):
    '''
    Numpy implementation for performing definite integration of a
    passed in function with integration parameters

    Args:
        f_x: function to integrate
        from_x: lower bound of x
        to_x: upper bound of x
        num_elements: number of elements to approximate with
    Returns:
        area under the function within bounds
    '''
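    # Left Riemann sum: sample the left edge of each of the num_elements
    # equal-width rectangles. endpoint=False keeps to_x out of the samples,
    # so the sample spacing equals the rectangle width.
    x_values = np.linspace(from_x, to_x, num_elements, endpoint=False)
    width = (to_x - from_x) / num_elements
    # f_x must be vectorized: given an array, it must return an array of the
    # same length (a function returning a bare scalar gives a wrong sum).
    y_values = f_x(x_values)
    return np.sum(width * y_values)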

# Task 2: Correctness

Write tests for your final implementation.

Hint: the exact values are $\frac{5}{24}\pi^2$, $\frac{e^e - e^{-e}}{2}$, $1$ and $1$.

For each case, check that $|res - correct| < 10^{-5}$. You can use the function below.


Below, add tests (asserts) that check whether your results are close to the correct values. You can use the provided function.

> Feel free to test on other (mathematical) functions. Try to identify situations in which your function would **not** work! I will ask about this in the feedback form later.
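
Two situations in which a simple left Riemann sum can go wrong are sketched below. This is an illustration under assumptions: the helper `left_riemann` is hypothetical and stands in for whatever method you implemented, and the two integrands are chosen here only as examples.

```python
import numpy as np

def left_riemann(f, a, b, n):
    """Hypothetical helper: left Riemann sum with n equal-width rectangles."""
    xs = np.linspace(a, b, n, endpoint=False)
    return np.sum(f(xs)) * (b - a) / n

# 1) Singularity at the left endpoint: the integral of 1/sqrt(x) over (0, 1]
#    is exactly 2, but the very first sample is f(0) = inf.
with np.errstate(divide="ignore"):
    singular = left_riemann(lambda x: 1 / np.sqrt(x), 0.0, 1.0, 1000)

# 2) Too few samples for an oscillating integrand: the integral of
#    cos(2*pi*x)**2 over [0, 1) is exactly 0.5, but with 2 samples
#    (x = 0 and x = 0.5) every sample hits a peak of the function.
undersampled = left_riemann(lambda x: np.cos(2 * np.pi * x) ** 2, 0.0, 1.0, 2)

print(singular, undersampled)
```

Other candidates worth checking are integrands that are not vectorized (they return a scalar for array input) and intervals with `from_x > to_x`.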


In [37]:
def close_enough(a, b, num_digits):
    return np.abs(a-b) <= 10**-num_digits

In [38]:
# This is an example test checking if res=1.000003 is a good enough approximation
# for the correct result (1).
res = 1.000003
assert close_enough(res, 1, num_digits=5) # this 'test' passes
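
The same absolute-tolerance check can also be written with the standard library. The sketch below is an alternative, not a replacement for the provided helper; the name `close_enough_stdlib` is hypothetical.

```python
import math

def close_enough_stdlib(a, b, num_digits):
    # rel_tol=0.0 switches off the relative test, so only the absolute
    # tolerance 10**-num_digits decides the outcome.
    return math.isclose(a, b, rel_tol=0.0, abs_tol=10**-num_digits)

passes = close_enough_stdlib(1.000003, 1, num_digits=5)  # error of about 3e-6
fails = close_enough_stdlib(1.0001, 1, num_digits=5)     # error of about 1e-4
print(passes, fails)
```

Unlike the numpy-based helper, `math.isclose` only accepts single numbers, which is sufficient here because each integration result is one float.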

In [46]:
import math
# Function definitions for each integral
def first_func(x):
    return np.arccos(np.cos(x) / (1 + 2*np.cos(x)))  # np.arccos also works on NumPy versions before 2.0
def second_func(x):
    return np.cosh(x)
def third_func(x):
    return 1 / x
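def fourth_func(x):
    # a bare `return 1` is not vectorized; return one value per input instead
    return constant_function_of_one(x)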

In [40]:
# First integral approximation with python and numpy integrations
act = (5 / 24)*(math.pi**2)
res_py = slow_integrate(first_func, 0, math.pi/2, 1500000) # python approx
res_np = integrate(first_func, 0, np.pi/2, 1500000) # numpy approx
assert close_enough(res_py, act, num_digits=5)
assert close_enough(res_np, act, num_digits=5)

In [41]:
# Second integral approximation with python and numpy integrations
act = (math.e**math.e - math.e**-math.e) / 2
res_py = slow_integrate(second_func, 0, math.e, 1500000) # python approx
res_np = integrate(second_func, 0, np.e, 1500000) # numpy approx
assert close_enough(res_py, act, num_digits=5)
assert close_enough(res_np, act, num_digits=5)

In [42]:
# Third integral approximation with python and numpy integrations
act = 1
res_py = slow_integrate(third_func, 1, math.e, 1500000) # python approx
res_np = integrate(third_func, 1, np.e, 1500000) # numpy approx
assert close_enough(res_py, act, num_digits=5)
assert close_enough(res_np, act, num_digits=5)

In [43]:
# Fourth integral approximation with python and numpy integrations
act = 1
res_py = slow_integrate(fourth_func, 0, 1, 3) # python approx
res_np = integrate(constant_function_of_one, 0, 1, 3) # numpy approx, integrand is vectorized
assert close_enough(res_py, act, num_digits=5)
assert close_enough(res_np, act, num_digits=5)

# Task 3: Efficiency


Run your 'integrate' functions on the second example (with $\cosh$) using $n = 10^7$ samples. A good implementation using numpy should take around $0.25s = 250ms$.

Q: How much faster is it than your basic python (non-numpy) implementation?

> Remember that mixing numpy and pure python code can lead to slow performance!
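
The effect of mixing numpy and pure python can be sketched as follows. This is a small illustration, not a benchmark: the sample count of $10^5$ is an arbitrary choice to keep it quick, and the measured times depend on the machine.

```python
import math
import time
import numpy as np

xs = np.linspace(0.0, math.e, 10**5, endpoint=False)

# Mixed style: a python-level loop that pulls elements out of a numpy array
# one at a time and calls a scalar function on each of them.
start = time.perf_counter()
looped = sum(math.cosh(x) for x in xs)
loop_seconds = time.perf_counter() - start

# Vectorized style: one numpy call over the whole array.
start = time.perf_counter()
vectorized = np.sum(np.cosh(xs))
vector_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f} s, vectorized: {vector_seconds:.4f} s")
```

Both variants compute the same sum; only the second one keeps the per-element work inside numpy.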

In [44]:
%%timeit
slow_integrate(second_func, 0, math.e, 10**7)

44 s ± 763 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [45]:
%%timeit
integrate(second_func, 0, np.e, 10**7)

591 ms ± 7.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


The numpy implementation took about 591 ms per run, whereas the basic python implementation took about 44 s per run. The numpy version therefore needs roughly 1.3% of the pure-python running time, i.e. it is about 74x faster ($44 / 0.591 \approx 74$).
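
The ratio can be recomputed directly from the two `%%timeit` means reported above. The variable names below are chosen for illustration.

```python
numpy_seconds = 0.591   # 591 ms per loop, from the %%timeit output
python_seconds = 44.0   # 44 s per loop, from the %%timeit output

speedup = python_seconds / numpy_seconds    # how many times faster numpy is
fraction = numpy_seconds / python_seconds   # share of the pure-python time

print(f"speedup: {speedup:.1f}x, fraction of time: {fraction:.1%}")
```

Since both timings carry some run-to-run spread, the result is best read as "roughly 74x" rather than as an exact figure.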

# Task 4: Documentation

Make sure your reusable code is properly commented (especially the main function using numpy).

If you've identified any situations in which your function would fail -- mention this in a comment so that your users are aware.

# Evaluation criteria

Later, I will send you a form asking a bunch of questions about your solutions.

Some things to look out for:
- correctness
- tests
- code readability (good variable and function names)
- good comments/documentation (especially for any functions you implement)
- flexibility (there should be only 1 implementation, no extra cases)
- efficiency (use numpy wisely, avoid python for loops)