# Individual Digits Test

Since 2017, I have asked in the course questionnaire for 50 "random" digits (i.e. [0-9]) produced by human hand, in order to build a dataset on which one can test to what degree these digits are indeed random.

In this exercise, I have extracted these digits for each individual student and cleaned up the dirty data, such that all entries now contain exactly 50 digits (and no other characters).

The challenge is to test each individual entry to see if it is seemingly random or not. As each entry contains only 50 digits, the statistics are low, but one can now obtain a distribution of p-values, possibly revealing that while some (most?) students managed to produce "random enough" numbers, there are also clear (deliberate?) outliers.

NOTE: Many of the same tests that were used on the combined data can be applied here.

  
##  Author: 
- Troels Petersen ([email](mailto:petersen@nbi.dk))

In [None]:
import numpy as np
import csv
import matplotlib.pyplot as plt
import random
import string
from scipy import stats

In [None]:
verbose = True

## __Student Numbers__

Read the student digits in from the text file provided. Garbage numbers have already been removed, and the data is formatted in lines of 50 digits.

In [None]:
file_name = 'data_IndividualStudentDigits.txt'

list_digits = []
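with open(file_name, mode='r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        # csv.reader yields each row as a list of fields; join them so that
        # every entry is stored as a single string of digits.
        entry = "".join(row).strip()
        if entry:
            list_digits.append(entry)

print(f"  The total number of valid entries in the data file is: {len(list_digits):4d}")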
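
Since the tests below assume that every entry consists of exactly 50 digits, a quick sanity check of the format can be worthwhile. The following is only a minimal sketch: the helper name `is_valid_entry` and the two example strings are made up for illustration and are not part of the course material.

```python
def is_valid_entry(entry, n_digits=50):
    """Return True if entry is a string of exactly n_digits characters, all in 0-9."""
    return len(entry) == n_digits and all(char in "0123456789" for char in entry)

# Hypothetical examples (not taken from the student data):
good_entry = "0123456789" * 5
bad_entry = "12345abc"

print(is_valid_entry(good_entry), is_valid_entry(bad_entry))
```

Applied to the real data, one could for example count the entries for which the check fails and expect that count to be zero.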

# Closure test:

Before we apply tests to the real (human) data, it is worthwhile to check the result on something we know. Therefore, we simulate truly random (non-human) digits. Here, we would expect the p-values to be uniformly distributed, as "random" is our null hypothesis. However, since the statistics are low (50 digits per entry), the p-value distributions might be affected by this (i.e. sparse, shifted, etc.). The closure test allows you to know this ahead of applying your tests to the real data.

In [None]:
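Nsim = len(list_digits)        # Here, one can also choose a much higher number!

# Generate the list of strings.
# NOTE: This overwrites list_digits, so the analysis below runs on the simulated digits.
# Re-run the reading cell above to get the student data back.
list_digits = []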
for _ in range(Nsim):
    random_digits = random.choices("0123456789", k=50)
    random_string = "".join(random_digits)
    list_digits.append(random_string)

# Analysis of digits:

This is where your analysis goes - good luck.

In [None]:
for i, digits in enumerate(list_digits) :

    # Convert string to list of integers:
    numbers = []
    for char in digits :
        numbers.append(int(char))

    if (verbose and i < 10) :
        print(numbers)
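
As one possible starting point (a sketch, not the intended solution), the frequency of each digit in an entry can be compared with the flat expectation of 5 per digit using a chi-square test. The function name `chi2_frequency_pvalue` and the two example entries are assumptions made for illustration. Note that with only 5 expected counts per bin, the chi-square approximation is at best rough.

```python
import numpy as np
from scipy import stats

def chi2_frequency_pvalue(numbers):
    """Chi-square p-value for the hypothesis that all ten digits are equally likely."""
    counts = np.bincount(numbers, minlength=10)      # observed count of each digit 0-9
    # With no expected frequencies given, stats.chisquare assumes equal counts in all bins.
    chi2, p_value = stats.chisquare(counts)
    return chi2, p_value

# Hypothetical entries for illustration (not from the student data):
flat_entry   = [int(c) for c in "0123456789" * 5]    # every digit exactly 5 times
biased_entry = [int(c) for c in "7" * 50]            # only sevens

for name, entry in [("flat", flat_entry), ("biased", biased_entry)]:
    chi2, p_value = chi2_frequency_pvalue(entry)
    print(f"  {name:6s}: chi2 = {chi2:6.1f}, p = {p_value:.3g}")
```

This test only reacts to entries that are too uneven. An entry that is suspiciously even (such as the flat one above) gets a p-value close to 1, so it may be worth looking at both ends of the p-value distribution.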

Questions:
---
1. Discuss (possibly again) with your peers what criteria truly random numbers should satisfy, and how these criteria can be tested. Next, think of a test statistic (i.e. a function of the 50 digits that boils them down to one number) whose value would differ between non-random and truly random numbers. If you know the distribution of this test statistic under the null hypothesis, you can ask what the probability is of obtaining the observed value or something more extreme. That is of course the p-value.<br>
You may possibly be inspired by looking at, say, the first 10-50 entries in the student data. Here, you should see "obvious" non-random cases, which may guide you.

2. Apply your test to all the entries, starting with the *truly random data*, to see if your test(s) make sense and give sensible p-values for these digits.

3. Once you think this is the case, try it out on the real data. Do you get a different p-value distribution? And do some of the entries have very low p-values? Try to print some of the low p-value cases and check if they are indeed rather non-random.
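
The workflow in the questions above could be sketched as follows, here with the number of adjacent equal digits as test statistic. For truly random digits, each of the 49 adjacent pairs is equal with probability 0.1, so this count follows a binomial distribution. The function name `repeat_pvalue`, the doubling of the smaller tail as two-sided convention, the seed, and the number of simulated entries are all assumptions chosen for illustration.

```python
import random
from scipy import stats

P_EQUAL = 0.1    # probability that two independent uniform digits are equal

def repeat_pvalue(digits):
    """Two-sided binomial p-value for the number of adjacent equal digits."""
    n_pairs = len(digits) - 1
    n_repeats = sum(a == b for a, b in zip(digits, digits[1:]))
    p_low  = stats.binom.cdf(n_repeats, n_pairs, P_EQUAL)       # P(K <= observed)
    p_high = stats.binom.sf(n_repeats - 1, n_pairs, P_EQUAL)    # P(K >= observed)
    return min(1.0, 2.0 * min(p_low, p_high))

# Closure test on simulated digits (fixed seed only for reproducibility):
rng = random.Random(42)
sim_entries = ["".join(rng.choices("0123456789", k=50)) for _ in range(2000)]
p_values = [repeat_pvalue(entry) for entry in sim_entries]
frac_low = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"  Fraction of simulated entries with p < 0.05: {frac_low:.3f}")
```

Because the test statistic is discrete, the p-values can only take a limited set of values and will not be perfectly uniform, even for truly random digits. That is exactly the kind of low-statistics effect the closure test is meant to reveal before the same function is applied to the student entries.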

# Learning points:

This exercise should generally make you capable of:
1. Designing (simple) hypothesis tests
2. Coding them up
3. Testing these on random data as a closure test
4. Applying them to data
5. Interpreting the results

You should (once again) be highly aware that even a fully correct test may not "discover" anything, even if there is an effect. For example, testing if the mean value of the 50 digits is consistent with 4.5 (null hypothesis) may show consistency, even if the sample is _very_ non-random (e.g. half 0s and half 9s).

However, if just **one** test shows a highly significant deviation from the null hypothesis, then this hypothesis falls.
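
The mean-value example above can be checked directly. The sketch below builds a made-up entry of 25 zeros and 25 nines and applies two tests to it: a z-test of the mean (relying on the normal approximation for the mean of 50 digits) and a chi-square test of the digit frequencies. Both tests are illustrative choices and not necessarily the ones intended in the exercise.

```python
import numpy as np
from scipy import stats

# Hypothetical, clearly non-random entry: 25 zeros followed by 25 nines.
numbers = np.array([0] * 25 + [9] * 25)

# Test 1: z-test of the mean. Uniform digits 0-9 have mean 4.5 and variance 8.25.
z = (numbers.mean() - 4.5) / np.sqrt(8.25 / len(numbers))
p_mean = 2.0 * stats.norm.sf(abs(z))

# Test 2: chi-square test of the digit frequencies (expected 5 per digit).
counts = np.bincount(numbers, minlength=10)
chi2, p_freq = stats.chisquare(counts)

print(f"  Mean test:      z    = {z:5.2f}, p = {p_mean:.3g}")
print(f"  Frequency test: chi2 = {chi2:5.1f}, p = {p_freq:.3g}")
```

The mean of this entry is exactly 4.5, so the first test sees nothing unusual, while the frequency test rejects the null hypothesis decisively.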