
# Ponder and Prove Combinatorics and Probability
#### Due: Saturday, 6 February 2021, 11:59 pm.

Contributors: Bretton Steiner and Claire Hocker

## Conjecture

A number-theoretic conjecture of combinatorial significance is the following:

$degree2({2n \choose n}) =$ the "bits-on count" (or population count, or Hamming weight) of $n$.

$degree2(m)$ is defined as the number (degree, exponent) of 2's in the prime factorization of $m$.

In other words, for any $m$, a positive integer, $m = 2^e \cdot o$ where $o$ is an odd positive integer (could be 1) and $e$ is a natural number, including zero --- which would be the case when $m$ is odd. It's the $e$ that is the $degree2$ of $m$.
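
As a minimal sketch of this definition, $degree2$ can be computed by dividing out factors of 2 until the number becomes odd. The function name `degree2` simply mirrors the notation above; this is an illustration and not necessarily how the code later in this notebook does it.

```python
def degree2(m):
    """Return e such that m == 2**e * o with o odd (m must be a positive integer)."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    e = 0
    while m % 2 == 0:  # m is still even, so another factor of 2 remains
        m //= 2
        e += 1
    return e

print(degree2(20))  # 20 = 2**2 * 5
print(degree2(7))   # 7 is odd
print(degree2(64))  # 64 = 2**6 * 1
```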

Another way to state this conjecture is that the number of 1's in the binary expansion of $n$, for positive integer $n$, is equal to the number of 2's in the prime factorization of ${2n \choose n}$.

Your task is to write Python code to test this conjecture for as many positive integers as you can. See the self-assessment for more details.

Note: a `bitsoncount` function can be a one-liner in Python: `return bin(x).count('1')`
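
To make the conjecture concrete for a few small values, here is a short sketch that prints both sides next to each other. It assumes Python 3.8 or later for `math.comb`, and the helper `trailing_zero_bits` is a hypothetical name: it relies on the fact that the exponent of 2 in a positive integer equals the number of trailing zeros in its binary form.

```python
from math import comb  # available in Python 3.8+

def bitsoncount(x):
    return bin(x).count('1')

def trailing_zero_bits(m):
    # For m >= 1, the exponent of 2 in m equals the number of
    # trailing zeros in its binary representation.
    b = bin(m)
    return len(b) - len(b.rstrip('0'))

# Columns: n, n in binary, C(2n, n), degree2 of C(2n, n), bits-on count of n
for n in range(1, 9):
    c = comb(2 * n, n)
    print(n, bin(n), c, trailing_zero_bits(c), bitsoncount(n))
```

The last two columns should agree on every row if the conjecture holds for these values.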



In [None]:
def getCount(n):
    # Adapted from GeeksforGeeks: https://www.geeksforgeeks.org/count-occurrences-of-a-prime-number-in-the-prime-factorization-of-every-element-from-the-given-range/
    # Returns the exponent of 2 in the prime factorization of n (n >= 1).
    cnt = 0  # number of powers of 2 found to divide n so far
    p = 2
    val = p  # current power of p being tested: 2, 4, 8, ...
    while True:

        # Number of multiples of val in the range 1..n
        a = n // val

        # Number of multiples of val in the range 1..n-1
        b = (n - 1) // val

        # Move on to the next power of p
        val *= p

        # (a - b) is 1 when the power just tested divides n, else 0
        if a - b:
            cnt += a - b

        # That power does not divide n, so no higher power does either
        else:
            break

    return cnt

In [None]:
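from math import gcd

def nCk(n, k):
  # Binomial coefficient C(n, k), built up one factor at a time so that
  # every intermediate value is an exact integer.
  if k < 0 or k > n:
    return 0
  else:
    result = 1
    d = 1
    g = 1
    m = min(k, n - k)  # C(n, k) == C(n, n - k), so use the shorter product
    while (d <= m):
      # Multiply by n and divide by d, cancelling gcd(result, d) first
      g = gcd(result, d)
      result = n * (result // g)
      result = (result // (d // g))
      n -= 1
      d += 1
    return result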


In [None]:
def countBits(n):
    return bin(n).count('1')


In [None]:
def test_conjecture(n):
    return countBits(n) == getCount(nCk(2*n, n))
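
One possible cross-check avoids building the very large binomial coefficient at all. Legendre's formula gives the exponent of 2 in $m!$ as the sum of $\lfloor m / 2^k \rfloor$ over $k \ge 1$, and since ${2n \choose n} = (2n)! / (n!)^2$, the exponent of 2 in the binomial is the exponent in $(2n)!$ minus twice the exponent in $n!$. The helper names below are hypothetical and are not part of the notebook's own code; this is a sketch of a different route to the same comparison, not a replacement for the direct test above.

```python
def v2_factorial(m):
    """Exponent of 2 in m!, by Legendre's formula: sum of m // 2**k for k >= 1."""
    total = 0
    power = 2
    while power <= m:
        total += m // power
        power *= 2
    return total

def v2_central_binomial(n):
    """Exponent of 2 in C(2n, n) = (2n)! / (n!)**2, without computing the binomial."""
    return v2_factorial(2 * n) - 2 * v2_factorial(n)

def check(n):
    # Compare against the bits-on count of n
    return v2_central_binomial(n) == bin(n).count('1')

print(all(check(n) for n in range(1, 100001)))
```

Because this only uses integer division on numbers no larger than $2n$, each check stays cheap even when ${2n \choose n}$ itself has thousands of digits.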


In [1]:
from signal import signal, SIGTERM
from sys import exit

def handler(signal_received, frame):
  exit(0)

In [2]:
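import datetime
from signal import signal, SIGTERM

# handler (defined in the previous cell) calls exit(0) on SIGTERM, which
# raises SystemExit inside the loop below.
signal(SIGTERM, handler)
print('Began running at ' + str(datetime.datetime.now()))
n = 1
try:
  while test_conjecture(n):
    n += 1
except (SystemExit, KeyboardInterrupt):
  # n is the value being tested when the run was stopped
  print('Verified up to ' + str(n) + ' at ' + str(datetime.datetime.now()))
else:
  # The loop ended on its own, so test_conjecture(n) returned False
  print('Conjecture failed at ' + str(n))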


Began running at 2021-02-06 21:31:33.128600
Verified up to 1 at 2021-02-06 21:31:33.129867


# Output
The final output after running for 24 hours in the Linux Lab was:
Began running at 2021-02-05 14:15:00.550762
Verified up to 47637 at 2021-02-06 14:18:04.605291
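
The run above was stopped from outside after 24 hours. As an alternative sketch (not what was used for the reported result), the loop can enforce its own time budget with `time.monotonic()` instead of relying on a signal. The names `verify_for` and `simple_check` are hypothetical; `simple_check` stands in for `test_conjecture` so that the block runs on its own, and it assumes Python 3.8 or later for `math.comb`.

```python
import time
from math import comb  # available in Python 3.8+

def simple_check(n):
    # Stand-in for test_conjecture: c & -c isolates the lowest set bit of c,
    # so its bit_length minus 1 is the exponent of 2 in c.
    c = comb(2 * n, n)
    return bin(n).count('1') == (c & -c).bit_length() - 1

def verify_for(seconds, check=simple_check):
    """Test n = 1, 2, 3, ... until the time budget runs out or a check fails.

    Returns (last n that passed, whether a counterexample was found).
    """
    deadline = time.monotonic() + seconds
    n = 1
    while time.monotonic() < deadline:
        if not check(n):
            return n - 1, True
        n += 1
    return n - 1, False

last_verified, failed = verify_for(0.2)
print('Verified up to', last_verified, '; counterexample found:', failed)
```

Reading the clock on every iteration adds a small cost, so a long run might check the deadline only every few hundred values of `n`.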

# Report
It was really cool to see how code can be made to run for really long periods of time without taking over the user's computer and hogging the CPU. Bretton and I adapted the getCount() function from an online source, and Claire was a huge help in explaining how the functions relate to one another in comparing the number of ones in a bitstring to the number of twos in the prime factorization. While Bretton and Claire ran theirs on laptops, I was able to use the Linux Lab to run mine, and I wonder if that is the cause of our difference in numbers. Brother Neff was kind and took the time to help debug my output file, and in the process I learned about signal handling and running scripts in the background.

## Basic Probability Theory Question
A dark room contains two barrels. The first barrel is filled with green marbles, the second is filled with a half-and-half mixture of green and blue marbles. So there's a 100% chance of choosing a green marble from the first barrel, and a 50% chance of choosing either color in the second barrel. You reach into one of the barrels (it's dark so you don't know which one) and select a marble at random. It's green. You select another. It's green too. You select a third, a fourth, a fifth, etc. Green each time. What is the *minimum* number of marbles you need to select to *exceed* a probability of 99% that you are picking them out of the all-green barrel? (Note that there are enough marbles so that the answer does not depend on how many marbles are in the second barrel.)


In [None]:
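def greenProb(n):
  # Probability of drawing n green marbles in a row from the mixed barrel
  return (1/2) ** n

def blueProb(n):
  # Complement: probability that the mixed barrel shows at least one blue in n draws
  return 1 - (1/2) ** n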

for n in range(10):
  print(f'n: {n}')
  print(f'green: {greenProb(n)}')
  print(f'blue: {blueProb(n)}')
  print()

n: 0
green: 1.0
blue: 0.0

n: 1
green: 0.5
blue: 0.5

n: 2
green: 0.25
blue: 0.75

n: 3
green: 0.125
blue: 0.875

n: 4
green: 0.0625
blue: 0.9375

n: 5
green: 0.03125
blue: 0.96875

n: 6
green: 0.015625
blue: 0.984375

n: 7
green: 0.0078125
blue: 0.9921875

n: 8
green: 0.00390625
blue: 0.99609375

n: 9
green: 0.001953125
blue: 0.998046875



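Seven consecutive green marbles are required to be more than 99% sure you are pulling from the all-green barrel. In the output above, the value labeled `green` is the probability of drawing $n$ green marbles in a row from the mixed barrel, and `blue` is its complement. As the number of consecutive green marbles increases, the probability that the mixed barrel produced the run decreases, so the probability of the all-green barrel increases. The complement first exceeds 0.99 at $n = 7$.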
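
The same answer can be checked with Bayes' rule, which conditions directly on having seen $n$ greens in a row. The sketch below assumes each barrel is equally likely to be the one you reached into (the problem only says you do not know which one), and the function name `posterior_all_green` is hypothetical. Exact fractions are used to avoid rounding questions near the 99% threshold.

```python
from fractions import Fraction

def posterior_all_green(n, prior=Fraction(1, 2)):
    """P(all-green barrel | n green draws in a row), by Bayes' rule.

    prior is the assumed probability of having picked the all-green barrel.
    """
    like_green = Fraction(1)          # the all-green barrel always yields green
    like_mixed = Fraction(1, 2) ** n  # the mixed barrel yields n greens with prob (1/2)**n
    return prior * like_green / (prior * like_green + (1 - prior) * like_mixed)

# Find the smallest n whose posterior strictly exceeds 99%
n = 0
while posterior_all_green(n) <= Fraction(99, 100):
    n += 1
print(n, posterior_all_green(n), float(posterior_all_green(n)))
```

With equal priors the posterior simplifies to $2^n / (2^n + 1)$, which is $64/65$ at $n = 6$ and $128/129$ at $n = 7$, so the threshold is crossed at the same $n = 7$.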

## A Related But Deeper Basic Probability Theory Question
Take a deep breath. Suppose Shakespeare's account is accurate and Julius Caesar gasped "You too, Brutus" before breathing his last. What is the probability that you just inhaled a molecule that Julius Caesar exhaled in his dying breath?

Assume that after more than two thousand years the exhaled molecules are uniformly spread about the world and the vast majority are still free in the atmosphere. Assume further that there are $10^{44}$ molecules of air in the world, and that your inhaled quantity and Caesar's exhaled quantity were each about $2.2 \times 10^{22}$ molecules.
### Hint
If a number $x$ is small, then $(1 - x)$ is approximately equal to $e^{-x}$.


In [None]:
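from math import e

# Molecules in one breath (ours and Caesar's), from the problem statement
breathMols = (2.2 * (10 ** 22))
# Fraction of the atmosphere's molecules that came from Caesar's last breath
caesarsBreath = breathMols / (10**44)
# 1 minus the probability that none of our inhaled molecules is one of his
print(1 - e ** (-caesarsBreath * breathMols))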


0.9920929459484066


First we calculate the fraction of the atmosphere that came from Caesar's last breath. For each molecule we inhale, the probability that it is not one of his is 1 minus that fraction, which by the hint above is approximately $e$ raised to the negative of that fraction. Raising this to the number of molecules in our breath gives the probability that we inhale none of his molecules. Subtracting this from 1 gives the probability that we breathe in at least one molecule expelled in Caesar's last breath, about 99.2%. This is much higher than we were expecting. The site https://puzzlemath.blogspot.com/2011/06/julius-caesars-last-breath.html was used for reference.
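
As a sanity check on the hint, the sketch below compares the approximation $e^{-xN}$ with a more careful evaluation of $(1 - x)^N$. It assumes, as the reasoning above does, that each inhaled molecule is an independent draw from the atmosphere. The variable names are illustrative; `math.log1p` is used because $x$ is far too small for `1 - x` to be represented accurately in floating point.

```python
from math import exp, log1p

total_molecules = 1e44     # molecules of air in the world (from the problem statement)
breath_molecules = 2.2e22  # molecules per breath (from the problem statement)

# Chance that one given molecule came from Caesar's last breath
x = breath_molecules / total_molecules

# Hint-based approximation: (1 - x)**N is about exp(-x * N)
approx = 1 - exp(-x * breath_molecules)

# More careful evaluation: (1 - x)**N == exp(N * log(1 - x));
# log1p(-x) computes log(1 - x) without losing precision for tiny x.
careful = 1 - exp(breath_molecules * log1p(-x))

print(approx, careful, abs(approx - careful))
```

The two values agree to many decimal places, which supports using the hint for numbers of this size.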

## What is True?
Assess yourself on how you did using the checkboxes below. Check a box by putting an 'X' in it only if it is warranted.


### What is true of my experience in general?
(5 points each, 15 points total)
- [x] I had fun.
- [x] I learned something new.
- [x] I achieved something meaningful, or something I can build upon at a later time.

### What is true of my report on what I learned?
(5 points each, 25 points total)
- [x] I wrote a sufficient number of well-written sentences.
- [x] My report is free of "mechanical infelicities" (misspelled words, grammatical errors, punctuation errors, etc.).
- [x] I reported on any connections I found between this investigation and something I already know.
- [x] I reported who were and what contribution each of my collaborators made.
- [x] I reported on how many numbers I was able to verify with a time/computation budget of 24 hours (in a row).


### What is true about my answers?
(15 points each, 60 points total)
- [x] I figured out how to run a Python program continuously for at least 24 hours.
- [x] I refrained from printing out anything except the highest number I verified, knowing that printing just slows a program down.
- [x] I got the right answer for the first probability theory question.
- [x] I got the right answer for the second probability theory question.
