
# Ponder and Prove Combinatorics and Probability
#### Due: Saturday, 6 February 2021, 11:59 pm.

## Conjecture

A number-theoretic conjecture of combinatorial significance is the following:

$degree2({2n \choose n}) =$ the "bits-on count" (or population count, or Hamming weight) of $n$.

$degree2(m)$ is defined as the number (degree, exponent) of 2's in the prime factorization of $m$.

In other words, for any $m$, a positive integer, $m = 2^e \cdot o$ where $o$ is an odd positive integer (could be 1) and $e$ is a natural number, including zero --- which would be the case when $m$ is odd. It's the $e$ that is the $degree2$ of $m$.

Another way to state this conjecture is that, for a positive integer $n$, the number of 1's in the binary expansion of $n$ is equal to the number of 2's in the prime factorization of ${2n \choose n}$.

Your task is to write Python code to test this conjecture for as many positive integers as you can. See the self-assessment for more details.

Note: a `bitsoncount` function can be a one-liner in Python: `return bin(x).count('1')`
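
As a small worked illustration of the statement above, the sketch below tabulates both sides of the conjecture for the first few $n$. It assumes Python 3.8 or newer for `math.comb`; the helper name `degree2` simply mirrors the notation in the text and is not a library function.

```python
from math import comb  # assumption: Python 3.8+ (math.comb)

def bitsoncount(x):
    """Number of 1 bits in the binary expansion of x."""
    return bin(x).count('1')

def degree2(m):
    """Exponent of 2 in the prime factorization of m (requires m > 0)."""
    e = 0
    while m % 2 == 0:
        m //= 2
        e += 1
    return e

# Columns: n, n in binary, C(2n, n), degree2 of C(2n, n), bits-on count of n
for n in range(1, 9):
    c = comb(2 * n, n)
    print(n, bin(n), c, degree2(c), bitsoncount(n))
```

For example, $n = 3$ is `0b11` in binary (two bits on), and ${6 \choose 3} = 20 = 2^2 \cdot 5$, so both sides equal 2.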



In [None]:
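import operator as op
from functools import reduce

def two_nCn(k):
    """Return the central binomial coefficient C(2k, k)."""
    n = 2 * k
    # Multiply the top k factors (2k, 2k-1, ..., k+1), then divide by k!.
    num = reduce(op.mul, range(n, n - k, -1), 1)
    denom = reduce(op.mul, range(1, k + 1), 1)
    return num // denom

def num_of_ones(k):
    """Count the 1 bits of k (its Hamming weight)."""
    count = 0
    while k:
        k &= k - 1  # clear the lowest set bit
        count += 1
    return count

def num_of_twos(k):
    """Count the factors of 2 in the prime factorization of k (k > 0)."""
    count = 0
    while k % 2 == 0:
        count += 1
        k //= 2
    return count

def test_conjecture(n):
    """Return True if the bits-on count of n equals degree2 of C(2n, n)."""
    return num_of_ones(n) == num_of_twos(two_nCn(n))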

In [None]:
%time test_conjecture(30000)

In [None]:
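import datetime

n = 1
try:
  while test_conjecture(n):
    n += 1
except KeyboardInterrupt:
  # Interrupting the cell stops the search. n is the number still being
  # tested at that moment, so n - 1 is the last number fully verified.
  print('Verified up to ' + str(n - 1) + ' at ' + str(datetime.datetime.now()))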

In [None]:
print(n)

Start Time: 5:10pm MT 2-5-2021

End Time: 5:10pm MT 2-6-2021

Final Output: 74,322
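
One possible way to verify more numbers in the same time budget is to avoid recomputing the binomial coefficient from scratch for every $n$. The sketch below is not the program that produced the result above; it is an illustrative alternative that relies on the identity ${2(n+1) \choose n+1} = {2n \choose n} \cdot \frac{2(2n+1)}{n+1}$. The function name `verify_up_to` is hypothetical.

```python
def verify_up_to(limit):
    """Check the conjecture for n = 1..limit.

    Returns limit if every n passes, otherwise the last n that passed
    before the first failure.
    """
    c = 2  # C(2, 1), the central binomial coefficient for n = 1
    for n in range(1, limit + 1):
        # c & -c isolates the lowest set bit of c; its position is the
        # number of factors of 2 in c.
        twos = (c & -c).bit_length() - 1
        ones = bin(n).count('1')
        if twos != ones:
            return n - 1
        # Step to C(2(n+1), n+1); the integer division is exact.
        c = c * 2 * (2 * n + 1) // (n + 1)
    return limit

print(verify_up_to(1000))
```

Each step costs one big-integer multiplication and one division instead of a full product over $n$ factors, which is the design choice being illustrated here.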

## Basic Probability Theory Question
A dark room contains two barrels. The first barrel is filled with green marbles, the second is filled with a half-and-half mixture of green and blue marbles. So there's a 100% chance of choosing a green marble from the first barrel, and a 50% chance of choosing either color in the second barrel. You reach into one of the barrels (it's dark so you don't know which one) and select a marble at random. It's green. You select another. It's green too. You select a third, a fourth, a fifth, etc. Green each time. What is the *minimum* number of marbles you need to select to *exceed* a probability of 99% that you are picking them out of the all-green barrel? (Note that there are enough marbles so that the answer does not depend on how many marbles are in the second barrel.)


To be confident that you are selecting from the all-green barrel, you need to pull enough green marbles in a row that, if they had come from the mixed barrel, the probability of getting that 'lucky' would be less than one percent. The probability of drawing $n$ green marbles in a row from the mixed barrel is $1/2^n$, so we need the smallest integer $n$ such that

$1/2^n < 0.01$

$n \ln(1/2) < \ln(0.01)$

$n > \ln(0.01) / \ln(1/2)$ (the inequality flips because $\ln(1/2)$ is negative)

$n > 6.64$ (approximately)

$n = 7$
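
As a cross-check, the probability of holding the all-green barrel can also be computed directly with Bayes' rule. The sketch below assumes a prior of 1/2 for each barrel; the problem does not state this explicitly, it only says you cannot tell which barrel you reached into. The function name `posterior_all_green` is introduced here for illustration.

```python
from fractions import Fraction

def posterior_all_green(n, prior=Fraction(1, 2)):
    """P(all-green barrel | n green draws in a row), for the given prior."""
    weight_green = prior * 1                       # all-green barrel always yields green
    weight_mixed = (1 - prior) * Fraction(1, 2) ** n  # mixed barrel yields green half the time
    return weight_green / (weight_green + weight_mixed)

# Find the smallest n whose posterior exceeds 99%.
n = 1
while posterior_all_green(n) <= Fraction(99, 100):
    n += 1
print(n, posterior_all_green(n))
```

Under that assumed prior, six draws give a posterior of 64/65 (about 98.5%) and seven draws give 128/129 (about 99.2%), so the minimum is again 7.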




## A Related But Deeper Basic Probability Theory Question
Take a deep breath. Suppose Shakespeare's account is accurate and Julius Caesar gasped "You too, Brutus" before breathing his last. What is the probability that you just inhaled a molecule that Julius Caesar exhaled in his dying breath?

Assume that after more than two thousand years the exhaled molecules are uniformly spread about the world and the vast majority are still free in the atmosphere. Assume further that there are $10^{44}$ molecules of air in the world, and that your inhaled quantity and Caesar's exhaled quantity were each about $2.2 \times 10^{22}$ molecules.
### Hint
If a number $x$ is small, then $(1 - x)$ is approximately equal to $e^{-x}$.


The answer to this question comes from comparing how many molecules you inhale with how many of Caesar's molecules exist among the $10^{44}$ molecules of air. It is easiest to find the probability that none of the inhaled molecules came from Caesar and subtract it from one, which gives the probability of inhaling at least one. The probability can be written as the following:

$P = 1-(1-C/A)^M$

Where: $M$ is the number of molecules inhaled, $C$ is the number of molecules Caesar exhaled, and $A$ is the total number of molecules of air. Here $M$ and $C$ have the same value.

$P = 1-(1-2.2 \times 10^{22}/10^{44})^{2.2 \times 10^{22}}$

Using the hint with $x = 2.2 \times 10^{-22}$:

$(1-x)^{M} \approx e^{-xM} = e^{-4.84} \approx 0.0079$

$P \approx 1 - 0.0079 = 0.992$

This answer means that there is over a 99% chance that at least one molecule inhaled was a molecule exhaled by Caesar.
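
Evaluating this formula numerically takes some care. In double-precision floating point, subtracting $2.2 \times 10^{-22}$ from 1 rounds back to exactly 1, so a direct evaluation returns 0. The sketch below shows both the naive evaluation and a numerically stable one based on `math.log1p` and `math.expm1`, which is the hint $(1 - x) \approx e^{-x}$ applied in code. The variable names are chosen here for illustration.

```python
import math

A = 1e44    # molecules of air in the world (from the problem statement)
M = 2.2e22  # molecules in one inhaled breath
C = 2.2e22  # molecules in Caesar's last exhaled breath

# Naive evaluation: 1 - C/A rounds to exactly 1.0 in double precision,
# so the whole expression collapses to 0.0.
p_naive = 1 - (1 - C / A) ** M

# Stable evaluation: (1 - x)^M = exp(M * log1p(-x)) and 1 - exp(y) = -expm1(y).
p_stable = -math.expm1(M * math.log1p(-C / A))

print(p_naive)
print(p_stable)
```

The stable version gives a probability of about 0.992, in line with the approximation $1 - e^{-4.84}$.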


## What is True?
Assess yourself on how you did using the checkboxes below. Check a box by putting an 'X' in it only if it is warranted.


### What is true of my experience in general?
(5 points each, 15 points total)
- [X] I had fun.
- I mostly had fun participating in this ponder and prove activity. The theoretical problems were good thought questions for putting what we have learned to the test, and so was the conjecture. The conjecture test was also a useful simulation of what it is like to run a test that collects large amounts of data and may require a continuous run time of 24 hours or more. Nevertheless, implementing this 24-hour test was a little bothersome for a college student who is limited to a laptop and has to make sure other assignments get done without affecting the test, but luckily, with some help, I was able to do it.
- [X] I learned something new.
- This week's CDL with Fibonacci numbers and triangular numbers taught me another useful connection in the magic of their influence in the number world. In my engineering statistics class I recently learned about polynomial distribution and how it relates to probability. There, it was also mentioned that the triangular numbers form a consistent pattern to rely on when solving for those types of probabilities. The CDL also taught me to read the instructions more carefully, as I thought the task was to find the relationship between columns rather than rows. In other words, taking the time to understand first saves a lot of confusion and pain later.
- [X] I achieved something meaningful, or something I can build upon at a later time.
- The 24-hour run test is something I feel I achieved in a meaningful way and can use in the future too. In industry, your code must be capable of being executed at all times, and perhaps for long durations too. This activity has helped me create and run more time-efficient Python code and to problem-solve under new challenges and restrictions that are common in the real world, such as time, resources, and materials, in order to produce something credible and efficient. I feel that I could build upon this by testing a similar application on a cloud database for keeping information up to date, so that it can provide current information at any time of day.

### What is true of my report on what I learned?
(5 points each, 25 points total)
- [x] I wrote a sufficient number of well-written sentences.
- [x] My report is free of "mechanical infelicities" (misspelled words, grammatical errors, punctuation errors, etc.).
- [x] I reported on any connections I found between this investigation and something I already know.
- [x] I reported who were and what contribution each of my collaborators made.
- [x] I reported on how many numbers I was able to verify with a time/computation budget of 24 hours (in a row).


### What is true about my answers?
(15 points each, 60 points total)
- [x] I figured out how to run a Python program continuously for at least 24 hours.
- [x] I refrained from printing out anything except the highest number I verified, knowing that printing just slows a program down.
- [x] I got the right answer for the first probability theory question.
- [x] I got the right answer for the second probability theory question.


## Connections
- I have been studying software testing and engineering statistics this semester. Efficiency, uncertainty, and probability have all been common themes between this course and those two. Even in my physics lab, mitigating risk and error uncertainty is possible thanks to the arithmetic fundamentals of number theory in combinatorics and probability. The connection I form from all of these subjects is solving real-world problems. This DPC with Zieger's candy bar could easily be an interview question. Even if we know that a conjecture is always true, the ability to apply its truth efficiently is useful in a demanding technological world that constantly requires more and more data to run.

## Collaborators
- Brother Neff - Helped me to fix and optimize my conjecture test program.
- Claire Hocker - Helped me understand the nature of the conjecture and its purpose.