In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("intro.ipynb")

# Introduction

Welcome to Physics 112! As you may already know, to satisfy the **physical science analytics** domain emphasis requirement of the Data Science major in the College of Computing, Data Science, and Society, this class will have some programming components. **Our aim is not to scare you with this requirement.** We have tried to make the assignments as easy as possible, to keep each computational problem under 1 hour, and to still provide an ample learning experience. Like many of you, I have also taken this class, and there are usually 3 reactions to the computational homeworks:

1. Ugh, another computational homework? Why do I have to do this programming mumbo-jumbo, my research work only needs me to solve the 50th dimension in string theory using pen and paper!
2. Ok, another computational assignment. At least we have ChatGPT to maybe get through this whole thing.
3. Yes! I don't want to see another partition function in my life, finally another computational assignment!

Of course there may be a mix of reactions (I was in the realm of liking partition functions, and *some* of the computational assignments), but you get the idea. So in lieu of the traditional computational homeworks that you may have seen in Physics 105, in the *physics computational training* hidden somewhere on the department website, or in 112 in past years, our focus will be on giving you surface-level knowledge of how to answer research questions using computational techniques.

For those who have taken CS or DS courses, we will be incorporating `otter-grader`, an autograding framework for assignments. This may be a source of dread for those who have taken DATA 8 and/or the CS 61 series or beyond, but the idea is to make sure these assignments are graded equally. **As long as you pass all of the public tests, you will receive 85% of your grade, so a minimum of ~8.5 points per assignment.** We will also allow the use of **ChatGPT**! In fact, we encourage you to use any LLM that you like. However, you should keep the following in mind when using these tools:

1. Does ChatGPT's result physically make sense?
2. **ALWAYS** check edge-cases before deploying.

**TL;DR: The computational assignments *should* not take the bulk of the time you spend on homework, using ChatGPT is OK, and we will use an autograder.**

In [None]:
import numpy as np
import scipy
import matplotlib.pyplot as plt
import time

# HW 0: Introductory Python Part 1

| Question  | Points |
|---|---|
| 1a  | 1  | 
| 1b | 1  |
| 2a  | 1  |
|  2b | 1  | 
|  2c | 1  |
|  2d | 1  |
| 3a | 2 |
| 3b | 2 |
| Total | 10 |

## Learning Goals: Python, and `numpy` Basics

The introductory Python assignments that I was assigned in my physics classes were generally unhelpful. So what we will do instead is make a *reference* for you to use in the next 2-3 computational assignments. I.e., **the problems covered in this assignment may have answers to future computational problems.** Therefore, I recommend spending a little more effort on familiarizing yourself with the syntax of Python and the general workflow of answering coding questions. Additionally, there are no hidden tests for this assignment, so as long as you pass the public tests, you will receive full credit.

The following problems look at basic Python and `numpy` functionality. I will try to provide as much explanation as I can, but if you still have questions, I suggest looking at the documentation of these packages, watching some YouTube videos, looking at Stack Exchange posts, or emailing me. The problems on this assignment go from easy $\to$ hard as follows:

1. Instantiating variables, and Math Operations.
2. Common Data Structures: Tuples, Lists, and `numpy` arrays.
3. Vectorization and Linear Algebra

We will also gain familiarity with reading documentation in the packages that are listed above. Here are some useful links:

- https://numpy.org/doc/stable/index.html

**Tip:** Searching Google for "How to do `x` in `numpy`" will generally lead you to the relevant documentation.

## Problem 1: Instantiating Variables, and Math Operations

### Part a)
Creating variables in Python is quite simple, and is analogous to assigning a number to a variable in math. There are many different datatypes, but for our purposes we will only need to look at `int` (integer), `float` (real number), and `str` (string, i.e. text). Consider the following example:

In [None]:
an_integer = 1
a_float = 3.1414141414141414
a_string = "Buddy Hield can't make a 3 pointer for his life."
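
As a small supplementary sketch (the variable names and values here are hypothetical and not part of the assignment), the built-in `type()` function reports which datatype a variable holds, which can help when a result looks unexpected:

```python
# Hypothetical values chosen only for illustration.
count = 3            # int
temperature = 2.5    # float
label = "kelvin"     # str

print(type(count), type(temperature), type(label))

# Mixing an int and a float in arithmetic gives a float.
total = count + temperature
print(total, type(total))

# A number must be converted with str() before it can be joined to text with +.
message = str(temperature) + " " + label
print(message)
```

Checking `type()` is a quick first step when an operation fails or returns something you did not expect.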

Here's the problem:

1. Make a variable `my_int`: I would like to add 111 to `an_integer` using the addition operator `+`. Don't just type 1+111 or 112, use the variable name `an_integer` in your solution.
2. Make a variable `my_float`: I want to round `a_float` to `3.14` using [`np.round()`](https://numpy.org/doc/2.1/reference/generated/numpy.round.html). How can we use `np.round` in this situation?
3. Make a variable `my_string`: I want to say that "Buddy Hield can make a 3 pointer!"

In [None]:
my_int = ...
my_float = ...
my_string = ...

my_int, my_float, my_string

In [None]:
grader.check("q1a")

### Part b)

Hopefully you were able to figure out that `+` means addition in Python. We will now look at other math operations: `-` (subtraction), `/` (division), `*` (multiplication), and `**` (power), as these are the ones you will see most often.
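
Python follows the usual order of operations when several of these operators appear in one expression. The sketch below uses made-up numbers (unrelated to the expression in this problem) to show how parentheses change a result:

```python
# ** is evaluated before * and /, which are evaluated before + and -.
no_parentheses = 2 + 3 * 4 ** 2       # same as 2 + (3 * (4 ** 2))
with_parentheses = (2 + 3) * 4 ** 2   # same as (2 + 3) * (4 ** 2)
print(no_parentheses, with_parentheses)

# / always returns a float, even when the division is exact.
quotient = 8 / 2
print(quotient, type(quotient))
```

When in doubt, adding parentheses costs nothing and makes the intended grouping explicit.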

I want to compute the following:

$$ 3 - \frac{12^2}{5} \times 16 $$ 

Write the operations explicitly in the cell below without using a calculator.

In [None]:
to_compute = ...
to_compute

In [None]:
grader.check("q1b")

## Problem 2: Data Structures

*Data structures* are ways of organizing data and accessing it. The ones we will cover in this problem are:

1. Lists
2. Tuples
3. NumPy arrays

Rather than providing definitions for these, let's work through why/when we would use these data structures.

## Lists
### Part a)

Lists are created with the syntax `[]`. For example:

```
num_list = [1, 2, 3, 4]
a_list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

In the cell below create a list of strings that has the following inputs in the same order: "I", "hope", "the", "midterm", "will", "be", and "easy".

In [None]:
a_list_of_hopes = ...
a_list_of_hopes

In [None]:
grader.check("q2a")

### Part b)

Lists are a *mutable* data structure, i.e., you can modify a list after creating it. The most common operations that you may run into are `.append()`, `.extend()`, and indexing using `list[idx]`. There are more [operations](https://docs.python.org/3/tutorial/datastructures.html) that could also be helpful, but they will probably not be needed for future assignments.

For the list `a_list_of_hopes`, append `no_hope` using `.append()`. *Extra question: do you need to assign `a_list_of_hopes = x.append(y)`, or do you just need to call `x.append(y)`? Try both out!*

In [None]:
no_hope = ["Chien-I", "said", "it", "won't", "be", "curved"] # This is not true by the way.

# Append no_hope to a_list_of_hopes
...
print(a_list_of_hopes)

You may have noticed something strange: there is now a list within a list! This is a common pitfall that will surely happen to you at some point. To fix this, we'll use a (not very efficient) approach: let's *slice* the list to restore `a_list_of_hopes` to its original state. Below are a few examples of using the `[]` operator on lists.

In [None]:
example_list = [0, 1, 2, 3, 4, 5]
print("Get the first element of the list (the first element is indexed with a 0!): ", example_list[0])
print("Get the last element of the list: ", example_list[-1])
print("Get the first two elements: ", example_list[:2])
print("Get every second element of the list (inclusive): ", example_list[::2])
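
A few more slicing patterns that tend to come up are sketched below. The list `letters` is a hypothetical example, not part of the assignment. The general form is `list[start:stop:step]`, where `stop` is excluded:

```python
letters = ["a", "b", "c", "d", "e", "f"]  # hypothetical list for illustration

middle = letters[2:5]          # indices 2, 3, 4 (the stop index 5 is excluded)
last_three = letters[-3:]      # negative indices count from the end
reversed_copy = letters[::-1]  # a step of -1 walks the list backwards
print(middle, last_three, reversed_copy)

# A slice is a new list, so editing it leaves the original untouched.
middle[0] = "z"
print(middle, letters)
```

Because slicing makes a copy, assigning a slice to a new name is a safe way to keep the original list intact.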

Now let's fix `a_list_of_hopes` **using slicing as shown above**, by assigning the correct slice to `fixed_list_of_hopes`.

*Hint: we only want to get the first how many elements of our list?*

*Note: There is an easy way to do this by using `.pop()`, but use slicing just as an example in this case as it will be used extensively in future homeworks.*

In [None]:
fixed_list_of_hopes = ...
fixed_list_of_hopes

Now use `.extend()` on `fixed_list_of_hopes`, and see if that's the intended way of joining these two lists together.

_Type your answer here, replacing this text._

In [None]:
...
fixed_list_of_hopes

In [None]:
grader.check("q2b")

## Tuples
### Part c)

Tuples (defined using parentheses `()`) are similar to lists in terms of indexing operations; however, they are not *mutable*. Immutability just means that we cannot edit the elements of a data structure after it is created. For example, you will get an error if you uncomment the last line of the following cell:

In [None]:
a_tuple = (1, 1, 2)

print("Valid Operations: \n", a_tuple[0], a_tuple[:2], a_tuple[::2])

# The following will throw an error
# a_tuple[0] = 5
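
If you would like to see the error without stopping the notebook, one option is to catch it. This is only a sketch with a hypothetical tuple `point`; the `try`/`except` syntax is not needed anywhere else in this assignment:

```python
point = (1.0, 2.0)  # hypothetical tuple for illustration

try:
    point[0] = 5.0  # tuples do not support item assignment
except TypeError as error:
    print("Caught a TypeError:", error)

# The tuple is unchanged after the failed assignment.
print(point)

# Tuples can still be read and unpacked into separate names.
x, y = point
print(x, y)
```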

You may then ask, why would we want to use tuples? As a guideline, you should **use tuples when you don't expect to change the values of your data**. This may seem a little abstract, but keep it in mind in your future research and programming: you will not always want your data to be editable. Let's work through a use case:

```
Your professor gives you a set list of coordinates in x, and y. He tells you, "My data should represent a linear solution, but something is making it quadratic when I process my data! Debug this code for me, and maybe I'll give you a letter of rec."
```

Replicate your professor's data as a tuple, and see if you can fix his problem.


In [None]:
x_data = [1, 2, 3, 4, 5, 6]
y_data = [1, 2, 3, 4, 5, 6]

fixed_x_data = ...
fixed_y_data = ...

# Ignore the implementation below.
def process_data(x_data, y_data):
    
    if type(x_data) == list and type(y_data) == list:
        
        return x_data, np.array(y_data)**2
    
    else:
        return x_data, y_data

new_x_data, new_y_data = process_data(x_data, y_data)
new_fixed_x_data, new_fixed_y_data = process_data(fixed_x_data, fixed_y_data)

plt.figure()
plt.title("I Love Linear Plots")
plt.plot(new_x_data, new_y_data, label = "Professors Data")
plt.plot(new_fixed_x_data, new_fixed_y_data, label = "Fixed Data")
plt.legend()

Although the example above may seem quite rudimentary, situations where your data is altered in an unexpected way do happen! Again, remember: if you don't need to change the values, always use a tuple!

In [None]:
grader.check("q2c")

## `numpy` Arrays

`numpy` will be the most useful package you will use for research or other purposes. **Namely, you will be able to use math operations directly on your arrays!** Consider the following example,


In [None]:
my_list = [1, 2, 3, 4, 5]

# print(my_list + 1) # Uncomment this code, and see if it will add 1 to each value.

my_numpy_array = np.array([1, 2, 3, 4, 5])

print(my_numpy_array + 1)
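
The same operator can mean different things for lists and arrays. The sketch below uses hypothetical names (`values_list`, `values_array`) to contrast the two:

```python
import numpy as np

values_list = [1, 2, 3]
values_array = np.array([1, 2, 3])

# On a list, + concatenates and * repeats.
print(values_list + values_list)    # [1, 2, 3, 1, 2, 3]
print(values_list * 2)              # [1, 2, 3, 1, 2, 3]

# On a numpy array, the same operators act element by element.
print(values_array + values_array)  # [2 4 6]
print(values_array * 2)             # [2 4 6]
print(values_array ** 2)            # [1 4 9]
```

This difference is a common source of silent bugs, so it is worth checking whether a variable is a list or an array before doing arithmetic on it.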

There are other useful uses of numpy arrays that we will explore in the next few questions.

### Part d)

`numpy` arrays have the benefit of accepting a *conditional* as an index, which returns only the values that satisfy the condition. Let's look at the following example:

In [None]:
print("Original Array: ", my_numpy_array)
print("Conditional Applied on Array: ", my_numpy_array < 3)
print("Passing boolean array to an array: ", my_numpy_array[my_numpy_array < 3])

As shown above, passing a boolean array as an index returns only the values where the boolean array is `True`.
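
Boolean arrays can also be combined. The following is a minimal sketch with a made-up array `energies`. It uses `&` (and) and `|` (or), which are standard `numpy` operators but are not introduced elsewhere in this assignment:

```python
import numpy as np

energies = np.array([0.5, 1.5, 2.5, 3.5, 4.5])  # hypothetical data

# Each comparison needs its own parentheses when combined with & or |.
in_window = (energies > 1.0) & (energies < 4.0)
outside_window = (energies <= 1.0) | (energies >= 4.0)

print(energies[in_window])
print(energies[outside_window])

# True counts as 1, so summing a boolean array counts the matching entries.
print(in_window.sum())
```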

Let's work through a harder example. Below we construct a random set of points using `numpy`'s `.random` module. In the code below, we generate an array of uniformly distributed values in $[0, 1)$ with *shape* `(2, 100)` (the first entry is the number of rows, and the second entry is the number of columns).

In [None]:
# DO NOT CHANGE
np.random.seed(112)

# Initialize a random set of positions. The array has two axes: rows (axis 0) and columns (axis 1).
positions = np.random.random((2, 100)) # Uniformly distributed points in [0, 1) with shape (dimensions of the array) (2 x 100).
print("Shape of positions: ", positions.shape)
# Data
print("First 10 Data: \n", positions[:, :10])
print("X data: \n", positions[0, :10]) 
print("Y data: \n", positions[1, :10])

# Showing axes
print("Average over the rows (axis = 0), one value per column: \n", np.average(positions, axis = 0)[:10])
print("Average over the columns (axis = 1), one value per row: \n", np.average(positions, axis = 1)[:10])

plt.figure()
plt.scatter(positions[0], positions[1]) # positions[0] is the x data, and positions[1] is the y data.
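
The `axis` argument is easier to follow on a small array. The sketch below uses a hypothetical `(2, 3)` array called `grid` to show which dimension is collapsed in each case:

```python
import numpy as np

grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])  # hypothetical array with shape (2, 3)

# axis=0 collapses the rows: one average per column, shape (3,).
per_column = np.average(grid, axis=0)
# axis=1 collapses the columns: one average per row, shape (2,).
per_row = np.average(grid, axis=1)

print(per_column, per_column.shape)
print(per_row, per_row.shape)
```

A useful rule of thumb is that the axis you pass is the one that disappears from the shape of the result.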

Now here's the problem: I want to visualize the particles so that all those at or above 0.5 are red, and everything below that is blue. Note that in Python you can use the comparison operators `<` (less than), `>` (greater than), `<=` (less than or equal to), and `>=` (greater than or equal to).

*Hint: Think carefully about how we defined our x and y data above.*

*Hint: `numpy` is quite smart in how it handles passing arrays as a conditional. Consider the following example (**this is relevant to future computational homeworks**):*

In [None]:
a_numpy_array = np.array([np.arange(1, 6), np.arange(6, 11)])
print("Example Numpy Array: \n", a_numpy_array)
print("Shape: ", a_numpy_array.shape)

test_bool = [True, False, True, True, False]

print("Passing Boolean array on the first row of data: \n", a_numpy_array[0, test_bool])
print("Passing Boolean array on the second row of data: \n", a_numpy_array[1, test_bool])
print("Apply the Boolean array to each row of data: \n", a_numpy_array[:, test_bool]) # Notice the usage of :

test_bool = [True, False]

print("Passing boolean array on the whole test array: \n", a_numpy_array[test_bool])

In [None]:
# reds and blues should be boolean arrays.
reds = ...
blues = ...

#red_x, and red_y should be the positions of the red particles.
red_x = ...
red_y = ...

#blue_x, and blue_y should be the positions of the blue particles.
blue_x = ...
blue_y = ...

In [None]:
plt.figure()
plt.scatter(red_x, red_y, color = 'red')
plt.scatter(blue_x, blue_y, color = 'blue')

In [None]:
grader.check("q2d")

# Vectorization and Linear Algebra

Vectorization is an important skill to pick up for research programming, as it will often speed up your code by orders of magnitude. The next few problems explore vectorization and matrix operations in Python. Let's first compare two implementations of the dot product: one using vectorization, and one using a for-loop (we will cover loops in the next assignment).

In [None]:
array_1 = np.arange(0, 1e6)
array_2 = np.arange(0, 1e6)

dot_product_slow = 0
start_time = time.time()
for i in range(len(array_1)):  # Manually iterating through indices
    dot_product_slow += array_1[i] * array_2[i]
    
end_time = time.time() - start_time
print("For-loop took ", end_time, "s to finish dot product computation.")

start_time = time.time()
dot_product_fast = np.dot(array_1, array_2)
end_time = time.time() - start_time

print("np.dot took ", end_time, "s to finish dot product computation.")
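
The two timed computations above should return the same number. A quick way to convince yourself on a small scale is sketched below, using hypothetical short vectors `u` and `v` and `np.isclose`, which compares floats up to a small tolerance:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])  # hypothetical vectors for illustration
v = np.array([4.0, 5.0, 6.0])

# Loop version: accumulate the products one index at a time.
loop_result = 0.0
for i in range(len(u)):
    loop_result += u[i] * v[i]

# Vectorized version: a single call into numpy.
vectorized_result = np.dot(u, v)

print(loop_result, vectorized_result)
print(np.isclose(loop_result, vectorized_result))
```

Checking a fast implementation against a slow but obvious one on small inputs is a cheap habit that catches many mistakes.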

As you can see, `numpy`'s built-in dot product is orders of magnitude faster than our slow implementation. We can also do matrix operations using `@` (matrix multiplication) and `.T` (the transpose of a matrix). For example,

In [None]:
my_matrix = np.array(
    [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]
    ]
)

to_multiply_matrix = np.array(
    [
        [4, 12, 3],
        [3, 7, 2],
        [2, 1, 1]
    ]
)

print("Current Matrix: \n", my_matrix)
print("Transpose Matrix: \n", my_matrix.T)
print("Matrix Multiplication: \n", my_matrix @ to_multiply_matrix)
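
Note that `@` and `*` are different operations on arrays. The sketch below uses two hypothetical 2x2 matrices, `A` and `B`, to contrast them and to check the transpose identity $(AB)^T = B^T A^T$:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])  # hypothetical matrices for illustration
B = np.array([[0, 1],
              [1, 0]])

matrix_product = A @ B       # rows of A dotted with columns of B
elementwise_product = A * B  # entry-by-entry multiplication
print("A @ B: \n", matrix_product)
print("A * B: \n", elementwise_product)

# The transpose of a product reverses the order of the factors.
print(np.array_equal((A @ B).T, B.T @ A.T))
```

Using `*` where `@` was intended does not raise an error for matrices of matching shape, so it is worth double checking which one a formula calls for.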

## Problem 3
### Part a)

Given a set of 2D particle positions, apply the rotation matrix to each particle where $\theta = 45^\circ$,

$$ R = \begin{bmatrix} 
\cos(\theta) & -\sin(\theta) \\
\sin(\theta) & \cos(\theta)
\end{bmatrix}$$

When applying a rotation to a vector the general equation is,

$$ \mathbf{v'} = \mathbf{R}\mathbf{v}$$
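
As a quick sanity check of the convention (a worked example with a different angle than the one asked for here), take $\theta = 90^\circ$ and the unit vector along $x$:

$$ R(90^\circ) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \qquad R(90^\circ) \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} $$

The vector along $x$ ends up along $y$, so this matrix rotates vectors counterclockwise, and the length of the vector is unchanged.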

*Hint: Use `np.sin` and `np.cos` to create your rotation matrix. Search online whether `np.sin` takes degrees or radians, and find a way to convert between the two if necessary.*

*Hint: If you're confused, try applying the rotation matrix to one point first, then see how you can apply it in a vectorized way. If you're still confused, consider asking ChatGPT for the correct implementation.*


In [None]:
particle_positons = np.array([[12, 3],
                            [3, 2],
                            [6, 12]])

theta = ...
R = ...

# rotated gives the rotated positions of the particles.
rotated = ...

rotated

In [None]:
grader.check("q3a")

### Part b)

In the next problem we will consider two sets of 2D vectors,

$$ \{\mathbf{a}_1, \mathbf{a}_2, ..., \mathbf{a}_{10} \} $$

and

$$ \{\mathbf{b}_1, \mathbf{b}_2, ..., \mathbf{b}_{10} \}. $$

We want to calculate the dot products between vectors with the same label in the two sets, and return a list of these dot products:

$$ \{\mathbf{a}_1\cdot \mathbf{b}_1,\; \mathbf{a}_2\cdot \mathbf{b}_2, ...,\; \mathbf{a}_{10}\cdot \mathbf{b}_{10}  \}.$$

We will do this by first creating two arrays `a` and `b` of shape `(2, 10)` that represent the two sets of vectors. You can think of `a` as a matrix $ a_{i\alpha} $ where the index $i$ labels the $x$- and $y$-components of the vectors, and $\alpha = 1, 2, ..., 10$ tells us which vector in the first set we are referring to. Similarly, `b` can be thought of as a matrix $ b_{j \beta}$. Then the following operation, written in index notation with the Einstein summation convention,
$$(a^T)_{\alpha k} b_{k\beta}$$
gives us the dot product between the $\alpha$th vector in the first set and the $\beta$th vector in the second set. 
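
To make the index expression concrete, here is a small sketch with hypothetical arrays `p` and `q` holding 3 vectors each (instead of 10), stored as columns. It only checks that one entry of the product matrix matches a direct dot product, and is not the answer to this part:

```python
import numpy as np

# Hypothetical sets of 3 vectors each, stored as columns (shape (2, 3)).
p = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])
q = np.array([[4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

# Entry [alpha, beta] is the dot product of vector alpha in p with vector beta in q.
all_pairs = p.T @ q
print(all_pairs.shape)

# Check one entry against a direct dot product of the two column vectors.
alpha, beta = 2, 0
direct = np.dot(p[:, alpha], q[:, beta])
print(all_pairs[alpha, beta], direct)
```

The product matrix holds every pairing of vectors, including the pairs with different labels, which is more than this problem asks for.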

Following this chain of thought, obtain the list

$$ \{\mathbf{a}_1\cdot \mathbf{b}_1,\; \mathbf{a}_2\cdot \mathbf{b}_2, ...,\; \mathbf{a}_{10}\cdot \mathbf{b}_{10}  \}.$$



*Hint: Note that a simple implementation of the dot product will not suffice since we're working with an array of multiple vectors. You might want to find a numpy function that can list diagonal elements of an array. Your answer should only be 1 line.* **This question is relevant for a future computational assignment.**

In [None]:
# DO NOT CHANGE
np.random.seed(112)
#

a = np.random.random((2, 10))
b = np.random.random((2, 10))

# This should give a list of dot products that we want to calculate
dot_product = ...

print(dot_product.shape)

In [None]:
grader.check("q3b")

# End and Directions for Submitting Work

Congratulations, you have finished your first computational assignment! Next time we will cover curve fitting, loops, and functional programming, which will be important for future computational assignments.

**For those who have no experience with `otter`, please read the following:**

1. Download the .zip file at the bottom of this assignment. **MAKE SURE TO SAVE BEFORE DOWNLOADING.**
2. Go to Gradescope, and **upload the full zip file** to the assignment.

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False, run_tests=True)