# Assignment 6: Number Theory

*If something in the assignment is unclear or does not make sense, make a note and ask the lecturer, a TA in the exercise session, or a friend!*

### Homework Exercises

Solutions to the following exercises are to be handed in for peer grading, either typed or handwritten. Write neatly and concisely, so the solutions are easy to read. You are allowed (even encouraged) to work in groups, but solutions have to be written and handed in individually. If you worked in groups, please state on your answer sheet who you worked with. 


1. (MIT 9.21) Consider the following numbers:
    $$
    \begin{align}
    m &= 2^9 5^{24} 7^4 11^{7} \\
    n &= 2^3 7^{22} 11^{211} 19^{7} \\
    p &= 2^5 3^{4} 7^{6042} 19^{30}.
    \end{align}
    $$

    (a) What is $\mathrm{gcd}(m, n, p)$?

    To find the GCD, take the smallest exponent of each prime factor that appears in all three numbers.

    The common prime factors are 2 and 7.

    For the prime 2, the exponents are 9, 3, and 5, so $2^3$ is the smallest power.

    For the prime 7, the exponents are 4, 22, and 6042, so $7^4$ is the smallest power.

    The other primes (3, 5, 11, and 19) do not appear in all three numbers, so they do not contribute to the GCD.

    Therefore, $\mathrm{gcd}(m, n, p) = 2^3 \cdot 7^4 = 19208$.
    
    (b) What is $\mathrm{lcm}(m, n, p)$?

    To find the LCM, take the largest exponent of each prime factor that appears in any of the three numbers:

    For the prime 2, $2^9$ is the largest power.

    For the prime 3, $3^4$ is the largest power.

    For the prime 5, $5^{24}$ is the largest power.

    For the prime 7, $7^{6042}$ is the largest power.

    For the prime 11, $11^{211}$ is the largest power.

    For the prime 19, $19^{30}$ is the largest power.

    Therefore, $\mathrm{lcm}(m, n, p) = 2^9 \cdot 3^4 \cdot 5^{24} \cdot 7^{6042} \cdot 11^{211} \cdot 19^{30}$.

    For $k \geq 1$, define $\nu_k(n)$ to be the largest power of $k$ that divides $n$. For a non-empty set $A$ of natural numbers, define $\nu_k(A) = \{\nu_k(a) : a\in A\}$. 

   
    (c) Express $\nu_k(\mathrm{gcd}(A))$ in terms of $\nu_k(A)$.

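    For a non-empty set $A$ of natural numbers, $\nu_k(\mathrm{gcd}(A)) = \min(\nu_k(A))$, since a power of $k$ divides $\mathrm{gcd}(A)$ exactly when it divides every element of $A$.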

    (d) For $p$ a prime number, express $\nu_p(\mathrm{lcm}(A))$ in terms of $\nu_p(A)$.

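    For a non-empty set $A$ of natural numbers, $\nu_p(\mathrm{lcm}(A)) = \max(\nu_p(A))$, since a power of the prime $p$ divides $\mathrm{lcm}(A)$ exactly when it divides some element of $A$.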

    (e) Give an example of integers $a, b$ where $\nu_6(\mathrm{lcm}(a, b)) > \max(\nu_6(a),  \nu_6(b))$.

    Let $a = 2$ and $b = 3$.

    We have $\nu_6(2) = 0$ because 6 does not divide 2, and $\nu_6(3) = 0$ because 6 does not divide 3.

    However, $\mathrm{lcm}(2, 3) = 6$ and $\nu_6(6) = 1$.

    So $\nu_6(\mathrm{lcm}(a, b)) = 1 > 0 = \max(\nu_6(a), \nu_6(b))$.

1. (EG 3.76) 

    (a) If a decimal number ends in $5$, then it is divisible by $5$. Explain why!

    Any decimal number ending in 5 can be written as $10k + 5$ for some non-negative integer $k$.

    Since $10k + 5 = 2 \cdot 5 \cdot k + 5 = 5(2k + 1)$, the number is divisible by 5.
    
    (b) If the sum of the digits in a decimal number is divisible by $9$, then the number itself is divisible by $9$. In fact, the sum of the digits is equal to the number itself modulo $9$. Explain why!

    A number with decimal digits $a_1a_2\ldots a_n$ can be written as $a_1 \cdot 10^{n-1} + a_2 \cdot 10^{n-2} + \ldots + a_n$.

    Since $10 \equiv 1 \pmod 9$, we have $10^k \equiv 1 \pmod 9$ for every $k \geq 0$, and therefore $a \cdot 10^k \equiv a \pmod 9$ for every digit $a$. This implies $a_1 \cdot 10^{n-1} + a_2 \cdot 10^{n-2} + \ldots + a_n \equiv a_1 + a_2 + \ldots + a_n \pmod 9$.

    So if the sum of the digits is divisible by 9, then the number itself is divisible by 9.

   
    (c) Think of which other numbers the method in (a) is applicable to, and formulate the general case.

    In base 10, this rule works for 2 and 5 (divisors of 10).

    General case: whenever $d$ is a divisor of $b$, a number in base $b$ is divisible by $d$ if and only if its last digit is divisible by $d$.
   
    (d) Do the same for the method in (b).

    In base 10, this rule works for 3 and 9 (divisors of $10 - 1 = 9$).

    General case: whenever $d$ is a divisor of $b - 1$, a number in base $b$ is divisible by $d$ if and only if the sum of its digits is divisible by $d$.
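
The answers to both homework exercises can be sanity-checked numerically. The following is a minimal sketch, not part of the handed-in solution: the helper names `to_int`, `digit_sum`, `last_digit_rule_holds`, and `digit_sum_rule_holds` are made up for illustration, and the divisibility rules are only tested on a finite range of numbers, which supports the general statements but does not prove them.

```python
from functools import reduce
from math import gcd, prod

# Exercise 1: each number as a {prime: exponent} dictionary
factorizations = {
    "m": {2: 9, 5: 24, 7: 4, 11: 7},
    "n": {2: 3, 7: 22, 11: 211, 19: 7},
    "p": {2: 5, 3: 4, 7: 6042, 19: 30},
}
all_primes = sorted(set().union(*factorizations.values()))

# gcd: smallest exponent of each prime (exponent 0 if the prime is absent)
gcd_exponents = {q: min(f.get(q, 0) for f in factorizations.values()) for q in all_primes}
# lcm: largest exponent of each prime
lcm_exponents = {q: max(f.get(q, 0) for f in factorizations.values()) for q in all_primes}


def to_int(exponents):
    """Multiply out a {prime: exponent} dictionary into an integer."""
    return prod(q ** e for q, e in exponents.items())


gcd_value = to_int(gcd_exponents)
# Cross-check against math.gcd applied to the actual (very large) integers
assert gcd_value == reduce(gcd, (to_int(f) for f in factorizations.values()))
print(gcd_value)  # 19208


# Exercise 2: digit-based divisibility rules in an arbitrary base
def digit_sum(x, base=10):
    """Sum of the digits of the non-negative integer x written in the given base."""
    total = 0
    while x > 0:
        x, digit = divmod(x, base)
        total += digit
    return total


def last_digit_rule_holds(base, d, limit=2000):
    """Check 'x divisible by d iff last digit divisible by d' for all x < limit."""
    return all((x % d == 0) == ((x % base) % d == 0) for x in range(limit))


def digit_sum_rule_holds(base, d, limit=2000):
    """Check 'x divisible by d iff digit sum divisible by d' for all x < limit."""
    return all((x % d == 0) == (digit_sum(x, base) % d == 0) for x in range(limit))


# The last-digit rule should hold when d divides the base, the digit-sum rule when d divides base - 1
assert all(last_digit_rule_holds(b, d) for b in range(2, 17) for d in range(1, b + 1) if b % d == 0)
assert all(digit_sum_rule_holds(b, d) for b in range(2, 17) for d in range(1, b) if (b - 1) % d == 0)
```

The same helpers also show that the rules fail outside their stated conditions: for example, `last_digit_rule_holds(10, 3)` and `digit_sum_rule_holds(10, 7)` both return `False`.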

### Programming Exercise
In this programming assignment you will explore the *Sieve of Eratosthenes*, an ancient algorithm for finding all prime numbers up to a given limit. 

To find all primes less than or equal to a given integer $n$, the Sieve of Eratosthenes proceeds in the following steps.
1. Create a list $L$ of consecutive integers up to $n$: $L = (2, 3, \ldots, n)$.

2. Initialize $p=2$, the smallest prime number. 

3. Enumerate the multiples of $p$ as $2p, 3p, \ldots, kp$ until $kp>n$. Mark all these numbers in $L$ as not prime. 

4. Find the smallest number in $L$ greater than $p$ which is not marked. If no such number exists, terminate. Otherwise, let $p$ equal this new number and repeat from step $3$.

5. Output all numbers not marked; they are all prime numbers smaller than or equal to $n$.  

**Task 1:** Implement the Sieve of Eratosthenes as a python function `eratosthenes(n)` as described in Steps 1-5 and find the $10$ largest prime numbers that are smaller than $1\;000\;000$. Hint: for better performance, let the list consist of either True or False instead of the numbers themselves, and let the numbers be represented by the index of this Boolean in the list.


In [10]:
def eratosthenes(n):

    # Indices 0 and 1 are not prime; assume that every other index up to n is prime
    is_prime = [False, False] + [True] * (n - 1)

    # Sieve steps 2-4: p runs over 2..n, and every p still unmarked is prime
    for p in range(2, n + 1):
        if is_prime[p]:
            # Sieve step 3: mark the multiples 2p, 3p, ... as not prime
            for multiple in range(2 * p, n + 1, p):
                is_prime[multiple] = False

    # Sieve step 5: the indices still marked True are the primes
    return [i for i, prime in enumerate(is_prime) if prime]

largestPrimes = eratosthenes(1000000)

print(largestPrimes[-10:])


[999863, 999883, 999907, 999917, 999931, 999953, 999959, 999961, 999979, 999983]


We now discuss two optimizations of the sieve. The first is to halt when $p> \sqrt{n}$. The second is, given $p$, to start enumerating the multiples of $p$ from $p \cdot p$ instead of $2p$.

**Task 2:** Explain why these optimizations do not make the algorithm miss a prime. 

**Task 3:** Implement these optimizations into a new sieve `eratosthenes_opt(n)`. Run both `eratosthenes(n)` and `eratosthenes_opt(n)` on all integers in the list $(1\;000, 10\;000, 100\;000, 1\;000\;000, 10\;000\;000)$ and compare their running times on each instance. You can measure the execution time of a program in Python using the `time` package and the function `time.time()`; see [this link](https://stackoverflow.com/questions/1557571/how-do-i-get-time-of-a-python-programs-execution) for an example. 

# TASK 2 ANSWER

![alt text](task2.jpg)
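
As a complement to the written answer, the claim can also be checked empirically. The underlying reason is that every composite $m \leq n$ has a prime factor $q \leq \sqrt{m} \leq \sqrt{n}$, so $m$ is a multiple of $q$ that is at least $q^2$ and gets marked while $p = q$. The sketch below compares a direct sieve with one that uses both optimizations. The names `sieve_basic` and `sieve_optimized` are chosen here for illustration so that the block is self-contained, and agreement on a finite range of limits is evidence, not a proof.

```python
def sieve_basic(n):
    """Sieve following Steps 1-5 directly: every p up to n, multiples from 2p."""
    is_prime = [False, False] + [True] * (n - 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for multiple in range(2 * p, n + 1, p):
                is_prime[multiple] = False
    return [i for i, prime in enumerate(is_prime) if prime]


def sieve_optimized(n):
    """Sieve with both optimizations: stop when p*p > n, multiples from p*p."""
    is_prime = [False, False] + [True] * (n - 1)
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, prime in enumerate(is_prime) if prime]


# Collect every limit for which the two sieves disagree
mismatches = [n for n in range(1000) if sieve_basic(n) != sieve_optimized(n)]
print(mismatches)  # an empty list means no prime was missed or wrongly kept
```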

In [15]:
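import time


def eratosthenes_opt(n):
    # Indices 0 and 1 are not prime; assume that every other index up to n is prime
    is_primes = [False, False] + [True] * (n-1)

    p = 2

    # Optimization 1: stop as soon as p exceeds sqrt(n), i.e. loop while p*p <= n
    while p*p <= n:
        if is_primes[p]:
            # Optimization 2: start marking multiples at p*p instead of 2*p
            for multiple in range(p*p, n+1, p):
                is_primes[multiple] = False

        p += 1

    return [i for i, prime in enumerate(is_primes) if prime]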


# Comparison function

def comparison(n):
    # n is a list of sieve limits; both sieves are timed on each limit
    for i in n:
        # Unoptimized
        start = time.time()
        eratosthenes(i)
        end = time.time()
        print(f"Unoptimized total time for {i}: {end - start:.4f} seconds")

        # Optimized
        start2 = time.time()
        eratosthenes_opt(i)
        end2 = time.time()
        print(f"Optimized total time for {i}: {end2 - start2:.4f} seconds")

        print()

listOfIntegers = [1000, 10000, 100000, 1000000, 10000000]

comparison(listOfIntegers)


Unoptimized total time for 1000: 0.0005 seconds
Optimized total time for 1000: 0.0002 seconds

Unoptimized total time for 10000: 0.0026 seconds
Optimized total time for 10000: 0.0011 seconds

Unoptimized total time for 100000: 0.0286 seconds
Optimized total time for 100000: 0.0126 seconds

Unoptimized total time for 1000000: 0.2550 seconds
Optimized total time for 1000000: 0.1452 seconds

Unoptimized total time for 10000000: 2.7847 seconds
Optimized total time for 10000000: 1.6401 seconds
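
For the smallest inputs the measured times are fractions of a millisecond, so a single `time.time()` reading can be dominated by noise. One possible refinement, sketched below under the assumption that repeating each run is acceptable, is to use `time.perf_counter()` (a high-resolution clock intended for measuring short durations) and keep the best of several runs. The helper name `best_time` is hypothetical, and the workload is a stand-in so that the block runs on its own.

```python
import time


def best_time(func, repeats=5):
    """Return the smallest wall-clock time in seconds over several runs of func()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in workload; in the notebook this could be lambda: eratosthenes_opt(1000)
elapsed = best_time(lambda: sum(range(100_000)))
print(f"Best of 5 runs: {elapsed:.6f} seconds")
```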




**Task 4:** Which of the optimizations do you think is most significant? Try your hypothesis by modifying your code from Tasks 1 and 3.

# TASK 4 ANSWER

I believe that stopping at $\sqrt{n}$ yields the biggest improvement, because it skips entire iterations of the outer loop: the number of values of $p$ that are examined drops from $O(n)$ to $O(\sqrt{n})$.

The code below tests this hypothesis by applying each optimization on its own. In the measured timings, stopping at $\sqrt{n}$ is the faster of the two for every input size.

In [17]:
def sieve_sqrt_only(n):
    is_prime = [False, False] + [True] * (n - 1)
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for multiple in range(2 * p, n + 1, p):  # Still starts at 2p
                is_prime[multiple] = False
        p += 1
    return [i for i, prime in enumerate(is_prime) if prime]

def sieve_p2_only(n):
    is_prime = [False, False] + [True] * (n - 1)
    for p in range(2, n + 1):  # Still loop up to n
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):  # Start at p^2
                is_prime[multiple] = False
    return [i for i, prime in enumerate(is_prime) if prime]


import time

def test_optimizations(n):
    for i in n:
        print(f"\nTesting with n = {i}")

        # Only sqrt optimization
        start = time.time()
        sieve_sqrt_only(i)
        print(f"Only sqrt: {time.time() - start:.4f} sec")

        # Only p^2 optimization
        start = time.time()
        sieve_p2_only(i)
        print(f"Only p^2: {time.time() - start:.4f} sec")

        print()

listToTestOpt = [1000, 10000, 100000, 1000000, 10000000]

test_optimizations(listToTestOpt)


Testing with n = 1000
Only sqrt: 0.0001 sec
Only p^2: 0.0002 sec


Testing with n = 10000
Only sqrt: 0.0011 sec
Only p^2: 0.0022 sec


Testing with n = 100000
Only sqrt: 0.0315 sec
Only p^2: 0.0455 sec


Testing with n = 1000000
Only sqrt: 0.1546 sec
Only p^2: 0.2111 sec


Testing with n = 10000000
Only sqrt: 1.8153 sec
Only p^2: 2.2493 sec
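
Wall-clock timings depend on the machine, so counting operations gives a complementary view of what each optimization saves. The sketch below uses a hypothetical function `sieve_counts` with two flags (names chosen here for illustration) that reports how many values of $p$ the outer loop examines and how many times an entry is marked as not prime.

```python
def sieve_counts(n, stop_at_sqrt, start_at_square):
    """Run the sieve and return (primes, outer_iterations, marks)."""
    is_prime = [False, False] + [True] * (n - 1)
    outer_iterations = 0  # values of p examined by the outer loop
    marks = 0             # assignments that mark an entry as not prime
    p = 2
    while (p * p <= n) if stop_at_sqrt else (p <= n):
        outer_iterations += 1
        if is_prime[p]:
            first = p * p if start_at_square else 2 * p
            for multiple in range(first, n + 1, p):
                is_prime[multiple] = False
                marks += 1
        p += 1
    primes = [i for i, prime in enumerate(is_prime) if prime]
    return primes, outer_iterations, marks


n = 100_000
for stop_at_sqrt, start_at_square in [(False, False), (True, False), (False, True), (True, True)]:
    primes, outer, marks = sieve_counts(n, stop_at_sqrt, start_at_square)
    print(f"sqrt stop={stop_at_sqrt!s:5} p*p start={start_at_square!s:5} "
          f"primes={len(primes)} outer iterations={outer} marks={marks}")
```

In this sketch, stopping at $\sqrt{n}$ is what reduces the outer iterations, while starting at $p \cdot p$ is what reduces the marks. With the $p \cdot p$ start alone, the outer loop still visits every $p$ up to $n$, but the inner loop is empty as soon as $p \cdot p > n$.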

