# How random are you?
Just how good are people at generating random numbers? We're going to try to find out in this project by comparing a series of user-generated "coin toss" events to actual random coin tosses. In particular, we're going to see whether people make truly random sequences or whether there's a bias embedded in there. We'll do this by looking at the distribution of run lengths (how often do you get N in a row?). For example, if 1=heads and 0=tails, the sequence `1 0 1 1 0 1 0 1 0 1 1 0` would have, for heads, three runs of length 1, two runs of length 2, and no runs of length 3.
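As a quick check of that example, here is a minimal sketch (not the assignment solution) that tallies run lengths with `itertools.groupby`. The helper name `run_length_counts` is made up for illustration.

```python
from collections import Counter
from itertools import groupby

def run_length_counts(seq, symbol='1'):
    """Count how many maximal runs of `symbol` have each length."""
    # groupby yields one group per stretch of identical consecutive characters
    return Counter(len(list(group)) for key, group in groupby(seq) if key == symbol)

example = '101101010110'  # the sequence above, spaces removed
counts = run_length_counts(example)
print(counts[1], counts[2], counts[3])  # 3 2 0
```

A `Counter` returns 0 for lengths that never occur, which is why asking for length 3 is safe here.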

## Getting the user input
For the first part of this project, you'll collect a series of 1's and 0's from a user. There are fancier ways of doing this but, trust me, once you factor in cross-platform quirks, macOS security, and simplicity, we're better off just using Python's built-in `input` function. The downside is that it simply asks for a string and waits until you hit Enter. There's no way to filter keys as they are pressed and no way to make sure exactly N valid keys have been pressed.

So, *write a function that has the user ostensibly press a bunch of 1's or 0's. Then, take that input, remove anything other than 1 or 0. Do this using a list comprehension. You can keep it as a list or convert it back to a string.* Once that basic bit is done, *set it up so that it can take a `min` (default 20) number of 1's and 0's that must be in there and enforce this.*

Remember, a string is something you can iterate over. Here, for example, taking `foo` and filtering out the numbers:


In [2]:
foo='Sharks are4 older1 than trees'
print(foo)
bar=[i for i in foo if not i.isnumeric()]
print(bar)
baz=''.join(i for i in foo if not i.isnumeric())
print(baz)

Sharks are4 older1 than trees
['S', 'h', 'a', 'r', 'k', 's', ' ', 'a', 'r', 'e', ' ', 'o', 'l', 'd', 'e', 'r', ' ', 't', 'h', 'a', 'n', ' ', 't', 'r', 'e', 'e', 's']
Sharks are older than trees


In [None]:
def GetUserSequence(min=20):
    '''Ask the user for a string of 1s and 0s, drop every other character
    with a list comprehension, and re-prompt until at least `min` valid
    characters remain. Returns the filtered sequence as a string.'''
    # `min` shadows the built-in inside this function; the name comes from the assignment
    prompt = f"Please randomly press 1 and 0 on your keyboard at least {min} times."
    filtered_seq = [n for n in input(prompt) if n in "01"]

    # Enforce the minimum on the filtered characters, not on the raw input
    while len(filtered_seq) < min:
        print(f"Please enter at least {min} values")
        filtered_seq = [n for n in input(prompt) if n in "01"]

    print(f"Thank you for entering at least {min} numbers")
    return "".join(filtered_seq)


userseq=GetUserSequence()


Please enter at least 20 values
Thank you for entering at least 20 numbers
24
101010010010110010001010


## OK, so what's actually random?
Here, we're going to write some code that actually makes a random sequence of 0's and 1's of length n (default 1000), returning this as a string. Later on, we'll use numpy and scipy, but Python now has decent random numbers built in. Have a look at [`random.choices`](https://docs.python.org/3/library/random.html#random.choices). Here's a sample of how it works:

In [None]:
import random
print(random.choices(['duck','go'],[10,2],k=10))


['duck', 'go', 'go', 'duck', 'duck', 'duck', 'duck', 'duck', 'duck', 'go']
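In that call the second argument is a list of relative weights, so 'duck' is drawn about five times as often as 'go'; leaving the weights out makes every option equally likely. A small sketch of the difference follows. The seed and the sample size of 6000 are arbitrary choices for the demo, not part of the assignment.

```python
import random

random.seed(12345)  # arbitrary seed, only so the demo is repeatable

weighted = random.choices(['duck', 'go'], [10, 2], k=6000)
unweighted = random.choices(['duck', 'go'], k=6000)

# Weights are relative: 10:2 means roughly 5/6 of the draws are 'duck'
print(weighted.count('duck') / len(weighted))
# With no weights, each option comes up about half the time
print(unweighted.count('duck') / len(unweighted))
```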


**Now, in the cell below**, write a function `GenRandom` that uses `random.choices` to make a list of `n` random 0's and 1's.

In [83]:
import random
def GenRandom(n=1000):
    '''Make n random 0s and 1s with random.choices and return them joined into a string.'''
    return ''.join(random.choices(['1', '0'], k = n))
   


rndseq=GenRandom()

print(rndseq)

1010101101000010111010101100111110000000011101000000001111101111011010000101101111000100101010011100100100011000001110111111000110100111000011011011001110110011011110101101110111100110000011111000010000110100000111101100101011111001011111101100101011011010010100110110100011010001000111101111100111110111111101001011010101110000011101111100000011001101000101011101101110011110111011110000010000001110110010000011001111111010110000111001010011000100100001110000011111100110011011000000111011100111110000100111111000010011001011101101101010001110001101100110010001000001110010000000001011110101110010001001010011001011101110101000101100010000001010100001110000011001000110001110011010011011010001111110101001000000101010001011000000100011000111110001111011101000101100110101101100100000100110100001111111111110010000010011000101001101011000101001100100000100011100111001101111000011100111100101111100111011111011110111110011010001110100111001000110111010110111111101000111111000011101011011111001101011

# The fun part
Now comes the fun part.  We need to see just how often patterns come up. In particular, we're going to look for how often we get runs of length 1, runs of length 2, of length 3, ... length 8.  Python has a nice [`count`](https://docs.python.org/3/library/stdtypes.html?highlight=count#str.count) function that works on strings that we might think to use. But, the trouble is, this counts "non-overlapping occurrences of substrings".  Have a look at this sample.  We should end up with 1 run of length 1 and 1 of length 3, but none of length 2.

In [60]:
foo='01001110'
print(foo.count('1'))
print(foo.count('11'))
print(foo.count('111'))

4
1
1


Well that's not quite right... It found 4 of length=1, 1 of length=2 (*wait - think about why it came up with just 1 of these and not 2*), and 1 of length 3.
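One way to see what `count` is doing: it resumes searching after the end of each match, so the two overlapping `11`s inside `111` are counted once. Here is a minimal sketch that counts every starting position instead. The helper `count_overlapping` is hypothetical, written just for this comparison.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing matches to overlap."""
    total = 0
    start = text.find(pattern)
    while start != -1:
        total += 1
        # Resume one character later rather than after the whole match
        start = text.find(pattern, start + 1)
    return total

foo = '01001110'
print(foo.count('11'))               # 1
print(count_overlapping(foo, '11'))  # 2
```

Neither number is the count of length-2 runs (there are none in this string); both tally places where `11` appears inside the longer run.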

What if we made it look for 0's beforehand to make sure we're at the start of a run?

In [61]:
print(foo.count('01'))
print(foo.count('011'))
print(foo.count('0111'))

2
1
1


Closer, I suppose, but still not there.  What if we looked for the full start with 0, thing, and then end with 0?

In [15]:
print(foo.count('010'))
print(foo.count('0110'))
print(foo.count('01110'))

1
0
1


Hey, that looks good!  Let's just test it one more time, though.

In [62]:
foo='0101010101010101010101010'
print(foo.count('010'))
print(foo.count('0110'))
print(foo.count('01110'))

6
0
0


We were so close, weren't we? I mean, really now.  6?  I count 12 in there.  *Why is it coming up with only 6?* Once we figure that out, *what might we do about it?*
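A hedged hint rather than the required answer: adjacent `010` matches share a `0`, and `count` will not reuse a character it has already consumed. One possible workaround, sketched here with a regular-expression lookahead (my choice, not something the assignment prescribes), is to match without consuming the text and to pad the string with `0`s so runs at either end are bracketed too.

```python
import re

foo = '0101010101010101010101010'

# str.count consumes each match, so neighbouring '010's that share a '0' are skipped
print(foo.count('010'))                  # 6

# A lookahead matches at a position without consuming any characters
print(len(re.findall('(?=010)', foo)))   # 12

# Padding with '0' brackets runs that touch the start or end of the string
padded = '0' + '1101' + '0'
print(len(re.findall('(?=0110)', padded)))  # 1
print(len(re.findall('(?=010)', padded)))   # 1
```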

Now, write a function that:

1. Takes in a string and figures out the number of runs of 1's for each run length from 1 to 8.  Remember, your string could start or end with a 1, so any solution you come up with has to handle this.
2. Divides those counts by the length of the string itself to, in some ways, normalize this so that short and long inputs are on roughly an even "odds of run-length X" kind of footing.  No, you can't use numpy.  A key point here is to think around obstacles for solutions.
3. Returns those "probabilities" as a list.

Then, run this on both your user-generated string and on the random string and give me a pretty printout of the results.

In [101]:
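def CalcRunLengthProbs(s):
    '''Return a list of 8 values: the number of runs of 1s of length 1..8
    in s, each divided by len(s).'''
    max_run = 8
    counts = [0] * max_run
    if not s:
        return [0.0] * max_run  # avoid dividing by zero on empty input

    # Splitting on '0' leaves exactly the maximal runs of 1s (plus empty strings),
    # so overlapping matches and runs at the start or end of s are handled.
    for run in s.split('0'):
        if 1 <= len(run) <= max_run:
            counts[len(run) - 1] += 1

    # Normalize by the string length so short and long inputs are comparable
    return [c / len(s) for c in counts]


In [105]:
# Pretty printout: one row per sequence, one column per run length 1..8
print('length ', ' '.join(f'{k:>6d}' for k in range(1, 9)))
for name, seq in (('user', userseq), ('random', rndseq)):
    print(f'{name:<7}', ' '.join(f'{p:6.3f}' for p in CalcRunLengthProbs(seq)))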