In [1]:
# Initialize Otter
import otter
grader = otter.Notebook("HW1_Fall2022.ipynb")

# DATA 601

## HW1


**Learning Objectives**

- Explore built-in data types in Python.
- Review fundamental programming and problem solving concepts with Python.
- Implement functions based on mathematical concepts and definitions.
- Gain experience working with the Jupyter notebook environment.

_This is an individual homework assignment._ 

Please complete this homework assignment within the Jupyter notebook environment.  

#### Submission 

To ensure that everything goes smoothly and is easy to grade, please follow these instructions:

- Please provide your solutions where asked; please do not alter any other parts of this notebook.


#### <font color='red'> Note: Passing the provided test-cases doesn't guarantee points. We check your code with other test-cases for grading.</font> 

In [2]:
# Check that we are using a recent version of Jupyter.
import IPython
assert IPython.version_info[0] >= 3, "Your version of IPython is too old, please update it."
import otter
grader = otter.Notebook()

## Part A

This part focuses on scalar types. You should be able to complete the following questions without using any collection types.

### Question 1

The central binomial coefficients form a vertical row down the middle of Pascal's triangle. They can be calculated via the following formula:

$$
{2n\choose n} = \frac{(2n)!}{(n!)^2} = \prod_{k=1}^{n} \frac{n+k}{k}  
$$

Write a function that calculates the nth central binomial coefficient. You may use built-in functions to calculate the factorials.

(3 points)

In [3]:
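def cent_binomial_coef(n):
    '''Returns the n-th central binomial coefficient for an input
    integer n.
    '''
    if n == 0:
        return 1

    k = numer = denom = 1
    while k <= n:
        numer *= n + k  # numerator: (n+1)(n+2)...(2n)
        denom *= k      # denominator: 1*2*...*n = n!
        k += 1
    # Divide once at the end; the integer division is exact here
    coef = numer // denom
    return coef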

In [4]:
grader.check("q1_a")

**Citation:**
From the _drop-in_ session with the TA, one of the takeaways was that division (which is part of this formula) can introduce a small error, because the computer stores numbers in finite space and has to truncate the result. 

**How I used this:**
To minimize this error, I do the division once at the end of the function rather than inside the loop. If the input number is large, these minor errors can accumulate and lead to a wrong result. 

**Related content:** round-off errors, floating point
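
A minimal sketch of the round-off point, separate from the graded solution: Python integers are exact, while a float keeps only 53 bits of precision, so large values can change silently. The helper name `central_binomial_exact` is hypothetical. It applies the product formula one factor at a time; each intermediate value equals $\binom{n+k}{k}$, so the integer division never discards a remainder.

```python
import math

def central_binomial_exact(n):
    """Hypothetical helper: builds C(2n, n) with integer arithmetic only."""
    coef = 1
    for k in range(1, n + 1):
        # After this step coef == C(n + k, k), so the division is exact
        coef = coef * (n + k) // k
    return coef

# Integers stay exact however large they grow
print(central_binomial_exact(5))  # → 252

# A float cannot tell these two neighbouring integers apart
print(float(2**53 + 1) == float(2**53))  # → True
```

The same idea explains why a single integer division at the end of the loop is safe: no fractional value is ever stored.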

### Question 2

Write a function that determines whether an integer is prime. You may not use built-in functions to determine this directly.

(8 points)

In [5]:
import math
def is_prime(n):
    '''For a positive integer n, returns True if n is a
    prime number, False otherwise.
    '''
    if n <= 1:
        return False
    elif n == 2:
        return True
    elif n % 2 == 0:  # even numbers greater than 2 are not prime
        return False
    else:
        # Test odd divisors up to floor(sqrt(n)); math.isqrt avoids float rounding
        for div in range(3, math.isqrt(n) + 1, 2):
            if n % div == 0:
                return False
    return True

In [6]:
grader.check("q2_a")

**Citation:**
From the _drop-in_ session with the TA, I recalled why `sqrt(n)` is used as the upper limit. If `n` has a divisor, then `n = a * b`; if `a > sqrt(n)`, then `b` must be less than `sqrt(n)`, and we have already checked for that. 

**How I used this:**
I used `sqrt(n)` as the upper limit of the loop, which makes the algorithm much more efficient than checking every candidate up to and including `n`.   

**Related content:** loops, prime numbers
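
The pairing argument can be made visible with a small sketch. The helper `divisor_pairs` is hypothetical and only illustrates the reasoning: every factorisation `n = a * b` has its smaller factor at or below `sqrt(n)`, so scanning up to that bound finds all of them.

```python
import math

def divisor_pairs(n):
    """Hypothetical helper: lists every factorisation n = a * b with a <= b."""
    pairs = []
    for a in range(1, math.isqrt(n) + 1):
        if n % a == 0:
            pairs.append((a, n // a))
    return pairs

print(divisor_pairs(36))  # → [(1, 36), (2, 18), (3, 12), (4, 9), (6, 6)]
print(divisor_pairs(37))  # → [(1, 37)]
```

A prime such as 37 yields only the trivial pair, which is exactly the condition the loop in `is_prime` tests for.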

### Question 3

(5 points)

Many things in nature can be modeled by the exponential growth function: 
$$
P(t) = \alpha \beta^{\frac{t}{\tau}},
$$
where $\alpha$ is the initial population, $\beta$ is the growth rate factor, $t$ is the time passed, and $\tau$ is the time constant. 

Imagine that you are an entomologist on the planet Zergo. There is a small insect of interest to you on this planet which has an exponential growth rate. From experimentation you determine that: 
$\beta= \sqrt\frac{5}{4}$,  and $\tau= 2^{\frac{1}{3}}$.

Starting with a population $\alpha$ of 10, use this formula to compute $P(t)$ in the function `population(t)` below. 
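
As a quick sanity check on the formula, here is a minimal sketch with a hypothetical general-purpose helper, `exp_growth`, that takes every parameter explicitly. It shows two properties that follow directly from the definition: the population equals $\alpha$ at $t = 0$, and it is multiplied by $\beta$ each time $t$ advances by $\tau$.

```python
import math

def exp_growth(alpha, beta, tau, t):
    """Hypothetical helper: evaluates P(t) = alpha * beta ** (t / tau)."""
    return alpha * beta ** (t / tau)

alpha, beta, tau = 10, math.sqrt(5 / 4), 2 ** (1 / 3)

print(exp_growth(alpha, beta, tau, 0))    # → 10.0
print(exp_growth(alpha, beta, tau, tau))  # alpha * beta, roughly 11.18
```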

<!--
BEGIN QUESTION
name: A3
manual: true
points: 2
-->

In [7]:
import math
def population(t):
    '''Computes the population at time t, where t is a positive float, 
    and returns the population as an integer (floor value). 
    It is assumed that time is in minutes and that all parameters
    are correct for this assumption.
    '''
    alpha = 10
    beta = math.sqrt(5/4)
    tau = 2**(1/3)
    population = alpha * beta ** (t/tau)
    
    return math.floor(population)

In [8]:
grader.check("q3_a")

## Part B

Questions in this part rely on strings, lists, and tuples.

### Question 1

Consider the lists $a=[2,1,2,5,3,1]$ and $b=[2,1,7,1]$. The intersection of these lists, $a \cap b$, is $[1,1,2]$. Write a function that returns the intersection of two arbitrary lists. You may not use built-in set operations, and should pay attention to duplicates.

(5 points)


In [9]:
def intersection_list(a, b):
    '''
    Returns the intersection of two lists a and b. 

    '''
    remaining = list(b)  # work on a copy so the caller's list b is not modified
    result = []
    for e in a:
        if e in remaining:
            # index() returns only the first occurrence of the matching element
            remaining.pop(remaining.index(e))
            result.append(e)

    return result

In [10]:
grader.check("q1_b")

**Citation:**
Web tutorial on [list.index()](https://www.programiz.com/python-programming/methods/list/index) method

**How I used this:**
The `index()` method returns only the _first occurrence_ of the matching element, which is precisely what is needed to account for duplicates: each match pops a single element rather than every equal element.

**Related content:** list, search
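
A short illustration of that behaviour, using list `b` from the question. This is only a sketch of how `index()` and `pop()` interact, not the full solution.

```python
b = [2, 1, 7, 1]

first = b.index(1)   # position of the first 1 only
print(first)         # → 1

b.pop(first)         # removes that single element
print(b)             # → [2, 7, 1]
print(1 in b)        # → True (the duplicate is still available for a later match)
```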

### Question 2
(8 points)

The scalar triple product of three vectors is defined as:

$$
a \cdot (b \times c)
$$

Write a function that calculates the scalar triple product of three vectors, a, b, and c.

In [11]:
def striple_prod(a,b,c):
    '''Returns the scalar triple product of three vectors a, b, and c. 
    '''
    # cross = [b[1]*c[2] - b[2]*c[1], 
    #          b[2]*c[0] - b[0]*c[2], 
    #          b[0]*c[1] - b[1]*c[0]] 
    # result = a[0]*cross[0] + a[1]*cross[1] + a[2]*cross[2]
    
    # Matrix
    m = [a, b, c]
    nrows = len(m)
    ncols = len(m[0])
    
    minors = [[0] * ncols for n in range(nrows)] # create list of lists the right way - citation below
    # Populate matrix of cofactors - matrix of minors with the signs changed
    for i in range(nrows):
        for j in range(ncols):
            rows = list(range(nrows))
            rows.pop((rows.index(i))) # delete the row of the element I'm trying to find the minor for 
            cols = list(range(ncols)) 
            cols.pop((cols.index(j))) # delete the column of the element I'm trying to find the minor for 
            # the cofactor of an element is its minor, with the sign flipped when i + j is odd
            sign = 1 if (i + j) % 2 == 0 else -1
            # the minor is the 2x2 determinant that remains (this indexing assumes a 3x3 matrix)
            minors[i][j] = (m[min(rows)][min(cols)] * m[max(rows)][max(cols)] -
                            m[min(rows)][max(cols)] * m[max(rows)][min(cols)]) * sign

    # Take first row/list to calc determinant
    determinant = a[0]*minors[0][0] + a[1]*minors[0][1] + a[2]*minors[0][2] 

    return determinant

In [12]:
grader.check("q2_b")

**Citations**:
1) [Scalar triple product](https://www.cuemath.com/algebra/scalar-triple-product/) from my Linear Algebra course 
2) Method of calculating the [determinant of square nxn matrix](https://people.richland.edu/james/lecture/m116/matrices/determinant.html)
3) [Create list of lists the right way](https://stackoverflow.com/questions/240178/list-of-lists-changes-reflected-across-sublists-unexpectedly)

**How I used it:**
From sources 1) and 2), I recalled what a scalar triple product is. I first hard-coded the formulas for the cross and dot products, but I was not satisfied with that solution, as hard-coded formulas are usually _not scalable_ and _fragile_. The input is _three_ lists, which form a `3x3` matrix, so I computed the product as the determinant of that matrix through its cofactors instead. This is a step toward a general `nxn` matrix, although the minors here are still computed as `2x2` determinants, so the function as written handles only the `3x3` case.  
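
The equivalence between the two approaches can be checked with a minimal sketch. The helper names `cross`, `dot`, and `det3` are hypothetical and restricted to three-component vectors; the example values are arbitrary.

```python
def cross(b, c):
    """Cross product of two 3-component vectors."""
    return [b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0]]

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 10]
print(dot(a, cross(b, c)))  # → -3
print(det3([a, b, c]))      # → -3
```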

From 3), I copied the line of code that creates a list of lists correctly. Initially, I wrote `[[0]*ncols]*nrows`, but there was an issue when I tried to populate that matrix using indexes: when I changed one element, the same element changed in every sublist. I recalled that everything in Python is an object, and I was creating a list whose elements all pointed to the same object (references to one list, not copies of it). The alternative approach creates a different list for each row by re-evaluating the expression.      
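
The difference between the two constructions is easy to reproduce. The variable names below are illustrative only.

```python
shared = [[0] * 3] * 2                  # both rows are the same list object
distinct = [[0] * 3 for _ in range(2)]  # each row is a new list

shared[0][0] = 1
distinct[0][0] = 1

print(shared)                  # → [[1, 0, 0], [1, 0, 0]]
print(distinct)                # → [[1, 0, 0], [0, 0, 0]]
print(shared[0] is shared[1])  # → True
```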

**Related content**: references, cross product, dot product, determinant

### Question 3
(10 points)

Write a function that, given a piece of text, calculates the most frequently occurring letter.


In [22]:
def freq_letters(text):
    '''
    For textual input, returns the most frequently occurring letter
    (your function should be case-insensitive)
    '''
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    l = list(text.lower())  # lowercase so that counting is case-insensitive
    most_freq_l = None
    count = 0
    for e in l:
        if e in alphabet and l.count(e) > count:
            # new best so far; on ties the letter seen first in the text is kept
            most_freq_l = e
            count = l.count(e)
    return most_freq_l

In [19]:
grader.check("q3_b_1")
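
For comparison, the same task can be sketched with `collections.Counter` from the standard library. The helper name `most_common_letter` is hypothetical, and whether library helpers are permitted for this question is an assumption. Note that `str.isalpha()` also accepts non-English letters, unlike an explicit a-z alphabet.

```python
from collections import Counter

def most_common_letter(text):
    """Hypothetical helper: most frequent letter, ignoring case and non-letters."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    if not counts:
        return None  # the text contains no letters at all
    return counts.most_common(1)[0][0]

print(most_common_letter("Hello, World!"))  # → l
```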

Write a second function that returns the number of vowels and consonants in the text.

In [20]:
def vowels(text):
    '''
    For textual input, returns the number of vowels and consonants.
    Return a tuple with number of vowels as the first element and consonants
    as the second element, i.e (num_vowels, num_consonant)
    '''
    vowels = list('aeiou')
    alphabet_lowercase = list('abcdefghijklmnopqrstuvwxyz')
    text_lowercase = list(text.lower())

    num_letters = 0
    num_vowels = 0
    # Count letters in text
    for l in text_lowercase:
        if l in alphabet_lowercase:
            num_letters += 1
    # Count vowels in text
    for v in vowels:
        num_vowels += text_lowercase.count(v)
    # Count consonants as not vowels
    num_consonants = num_letters - num_vowels

    return (num_vowels, num_consonants)

In [21]:
grader.check("q3_b_2")

## Part C

The question in this part of the assignment is intended to:

  1) Illuminate how what you already know is useful in this context.
  
  2) Provide an exercise in writing about coding concepts and ideas and using the relevant language and terminology. 
  
  3) Demonstrate your understanding of the assignment material in a different format.


<!-- BEGIN QUESTION -->

### Question 1 
(3 points) 

In 100 to 250 words, identify a concept you have found difficult or confusing from this assignment. Reflect on how your previous learning or experience helped you to understand this concept. Provide your reflection using markdown in the cell below. 

There are plenty of tools that I can use to organize and store my data - _data types_. And when the problem is simple, it is obvious what tool you need to take out of the box to solve it. But when it becomes more complex, the number of ways you can solve it grows exponentially, and only a few are efficient and scalable.

When I arrived in Canada just a few weeks ago, I brought all my belongings with me. When I moved into a new house, the challenge was "how should I organize everything?". In the beginning, it was like a _set_: a _collection of unordered_ things in the middle of an unfurnished unit. Then I created one broad category, clothes, with multiple _sublists_: summer, fall and winter. Eventually, I populated my _collections_ with _ordered_ items, and now I can easily _retrieve_ suitable, _indexed_ clothes depending on the weather. As winter is coming, I had to buy new warm boots and a jacket, which I can easily _add_ to my collections. But first, I will have to free up _space_ in my dresser, as the storage is limited, so I will _remove_ T-shirts and move them to my set of stuff in the basement. I will probably _rearrange_ things as time passes.

So, when I think about this problem, whether it is how to arrange things in my place or how to organize data, it really comes down to "what is my final goal?". Do I want to save space because I will have to buy more clothes in the future, or can I ignore the future and just solve the current problem, which will save me time? During this assignment, I ran into these dilemmas a couple of times, for example when solving the triple product problem. 

Other findings from this assignment were: 1) be careful with division, which can introduce floating-point error because numbers are stored in limited memory (Part A, Question 1), and 2) everything in Python is an object and has its place, so be careful when creating copies. 

<!-- END QUESTION -->

## Submission

Make sure you have run all cells in your notebook and that your code works as expected. Save your notebook and upload it to <a href="https://www.gradescope.ca">Gradescope</a>.
**Please save (ctrl+s) before uploading!**