version 0.0.2

Requirement:
 - Basics of Python

Contains:
 - necessary implementations for the widget.


Any function or calculation implemented below that is marked with the comment '# May be implemented by pupils' can be removed and set as a task for the pupils to implement. Additional tasks can and should be added or modified, depending on the pupils' prior knowledge and the focus of the lesson.

In [None]:
import sys
sys.path.append("..")
import numpy as np

The basic idea is to simulate random experiments on a computer. The second simplest example is a die, which ideally represents a Laplace experiment: all outcomes in a finite set of possible outcomes are equally likely.

In [None]:
a = np.random.choice(a=[1,2,3,4,5,6], size=30)
print(a)
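
The Laplace property can be checked numerically: with more rolls, the relative frequency of each face should approach $1/6$. The following is a minimal sketch, not part of the widget. It assumes NumPy's `default_rng` generator with an arbitrarily chosen seed so that the output is reproducible.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seed chosen arbitrarily for reproducibility
faces = np.arange(1, 7)

for n_rolls in (30, 3000, 300000):
    rolls = rng.choice(faces, size=n_rolls)
    # Count each face; minlength=7 keeps a slot for 6 even if it was never rolled
    rel_freq = np.bincount(rolls, minlength=7)[1:] / n_rolls
    # Largest deviation of any face from the ideal probability 1/6
    max_dev = np.max(np.abs(rel_freq - 1 / 6))
    print(n_rolls, np.round(rel_freq, 3), round(float(max_dev), 4))
```

With only 30 rolls the frequencies typically scatter visibly around $1/6$, while for the largest sample the deviation is small.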

It is also possible to weight the outcomes differently, i.e. to simulate a non-ideal (loaded) die. However, the individual probabilities must still sum to 1.

In [None]:
b = np.random.choice(a=[1,2,3,4,5,6], p=[0.125/2., 0.25, 0.5, 0.125/4., 0.125/4, 0.125], size=30)
print(b)
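
Whether a set of weights is a valid probability distribution can be checked before it is passed to `np.random.choice`. The sketch below uses the weights from the cell above. The array `raw_weights` is a hypothetical example of unnormalized weights and is not part of the original notebook.

```python
import numpy as np

weights = np.array([0.125 / 2, 0.25, 0.5, 0.125 / 4, 0.125 / 4, 0.125])

# Compare with a tolerance instead of ==, because floating-point sums can be off by rounding
weights_are_valid = bool(np.all(weights >= 0) and np.isclose(weights.sum(), 1.0))
print(weights.sum(), weights_are_valid)

# Hypothetical unnormalized weights can be rescaled so that they sum to 1
raw_weights = np.array([1, 4, 8, 0.5, 0.5, 2])
rescaled = raw_weights / raw_weights.sum()
print(rescaled)
```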

But let us stay with the ideal die. For the later application we will need individual functions. The first is a simple function that returns a random number from the set $\{1,2,3,4,5,6\}$ each time it is called.

In [None]:
# May be implemented by pupils

def my_random_dice_roll_simulation():
    return np.random.choice(a=[1,2,3,4,5,6], p=[1/6, 1/6, 1/6, 1/6, 1/6, 1/6], size=1)[0]

for _ in range(5):
    print(my_random_dice_roll_simulation())

A list of 30 or more elements is not easy to read. Since the rolls are independent of each other, their order carries no information, and it is easier to count how often each number was rolled. In this case an explicit implementation takes three lines.

In [None]:
# May be implemented by pupils

# Throw result     1, 2, 3, 4, 5, 6
a_hist = np.array([0, 0, 0, 0, 0, 0])
for num in a:
    a_hist[num -1] += 1
print(a)
print(a_hist)
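
For comparison, the same counting can be done with NumPy's `bincount`. This is a sketch on a small hypothetical sample (`rolls`), so that the result can be verified by hand.

```python
import numpy as np

rolls = np.array([3, 1, 6, 3, 2, 6, 6, 5, 3, 1])  # hypothetical sample of 10 rolls

# Explicit loop, as in the cell above
hist_loop = np.zeros(6, dtype=int)
for num in rolls:
    hist_loop[num - 1] += 1

# bincount counts the values 0..6; index 0 is dropped because a die has no zero
hist_bincount = np.bincount(rolls, minlength=7)[1:]

print(hist_loop)
print(hist_bincount)
```

Both variants give the same histogram. The loop is easier for pupils to follow, while `bincount` is shorter and faster for large samples.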

The entry at index 0 corresponds to rolling a one, the entry at index 1 to rolling a two, and so on.

After such an experiment it is useful to calculate a few quantities that characterize the measurement. The first is the mean value, which can be calculated directly in a function:

In [None]:
# May be implemented by pupils

def my_mean(my_array):
    my_array = np.array(my_array)
    # return np.mean(my_array)
    return np.sum(my_array) / len(my_array)

However, a problem arises: applied to the histogram, this function returns only the average number of entries per bin, i.e. the expected number of times that, for example, a five is thrown. This value will nevertheless be used later.

In [None]:
print(my_mean(a))
print(my_mean(a_hist))

The calculation of the actual mean value for a histogram can be realized with the following formula: $$ \bar{x} = \frac{\sum_{i=1}^6 i \cdot n_i}{\sum_{i=1}^6 n_i} $$ where $i$ is the respective bin and $n_i$ is the number of events for this bin.
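
As a worked example of this formula, consider the hypothetical bin contents `counts` below (10 rolls in total). The sketch evaluates the weighted sum directly and contrasts it with the plain average of the bin contents.

```python
import numpy as np

counts = np.array([2, 1, 3, 0, 1, 3])  # hypothetical n_i for the faces 1..6
bins = np.arange(1, 7)                 # the bin values i

# Numerator: 1*2 + 2*1 + 3*3 + 4*0 + 5*1 + 6*3 = 36, denominator: 10
weighted_mean = np.sum(bins * counts) / np.sum(counts)

# Plain average of the bin contents: 10 entries spread over 6 bins
naive_mean = np.sum(counts) / len(counts)

print(weighted_mean)
print(naive_mean)
```

The weighted mean is 3.6, the average face value of the 10 rolls. The plain average is about 1.67, the average number of entries per bin.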

First of all, it is advisable to express the normalization in a separate function, because it will be the same for all subsequent calculations:

In [None]:
# May be implemented by pupils

def normalisation(my_array):
    my_array = np.array(my_array)
    temporal_sum = np.sum(my_array)
    if temporal_sum == 0.0:
        return np.array(my_array)
    return (1./temporal_sum) * np.array(my_array)

The normalized histogram entries are:

In [None]:
a_hist_normalized = normalisation(a_hist)
print(a_hist)
print(a_hist_normalized)

They correspond to the relative frequencies, i.e. the estimated probabilities, of throwing the respective number.
With the help of the normalized histogram entries, the mean value can be calculated in the next step:

In [None]:
# May be implemented by pupils

def my_mean_of_histogram(my_array):
    my_array = np.array(my_array) if np.sum(my_array) == 1.0 else normalisation(my_array)
    # return sum(i* item for i, item in enumerate(my_array, start=1))
    my_bins = np.array([i for i in range(1, len(my_array) + 1)])
    return np.sum(my_bins * my_array)

In [None]:
print(my_mean(a))
print(my_mean_of_histogram(a_hist))
print(my_mean_of_histogram(a_hist_normalized))

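The next quantity is the standard deviation of the total measurement: $$\sigma = \sqrt{\frac{1}{M - k} \sum_{i=1}^M (x_i - \bar{x})^2}\, ,$$ where $x_i$ are the individual throws and $M$ is their number. The choice $k = 0$ gives the standard deviation of the data as recorded, while $k = 1$ applies Bessel's correction for a sample. For the histogram, the same quantity is calculated from the bin entries: $$\sigma = \sqrt{\frac{\sum_{i=1}^6 (i - \bar{x})^2 n_i}{-k + \sum_{i=1}^6 n_i}}$$ The histogram function below works on the normalized entries and therefore corresponds to $k = 0$.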

In [None]:
# May be implemented by pupils

def my_standard_deviation(my_array, k=0.0):
    # Convert first so that plain Python lists are accepted as well
    my_array = np.array(my_array)
    mean_ = my_mean(my_array)
    return np.sqrt((1./(len(my_array) - k)) * np.sum((my_array - mean_) ** 2))

def my_standard_deviation_of_histogramm(my_array):
    # Normalize first unless the entries already sum to 1
    my_array = np.array(my_array) if np.sum(my_array) == 1.0 else normalisation(my_array)
    mean_ = my_mean_of_histogram(my_array)
    # return np.sqrt(sum((i - mean_) ** 2 * item for i, item in enumerate(my_array, start=1)))
    my_bins = np.array([i for i in range(1, len(my_array) + 1)])
    variance = np.sum((my_bins - mean_) ** 2 * my_array)
    return np.sqrt(variance)

In [None]:
print(my_standard_deviation(a))
print(my_standard_deviation_of_histogramm(a_hist_normalized))
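
Both ways of calculating the standard deviation should agree for $k = 0$. The following sketch checks this on a small hypothetical sample (`rolls`) and compares the result with NumPy's `np.std`, whose `ddof` argument plays the role of $k$.

```python
import numpy as np

rolls = np.array([3, 1, 6, 3, 2, 6, 6, 5, 3, 1])  # hypothetical sample of 10 rolls
counts = np.bincount(rolls, minlength=7)[1:]       # histogram for the faces 1..6
bins = np.arange(1, 7)

# Mean and standard deviation calculated from the histogram (k = 0)
mean = np.sum(bins * counts) / np.sum(counts)
std_from_hist = np.sqrt(np.sum((bins - mean) ** 2 * counts) / np.sum(counts))

# The same quantity calculated from the individual rolls
std_from_rolls = np.std(rolls)       # ddof=0 is the default, i.e. k = 0
std_sample = np.std(rolls, ddof=1)   # k = 1, slightly larger

print(std_from_hist, std_from_rolls, std_sample)
```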

It is also interesting to ask how uncertain an individual bin entry is, for example a count of five throws of the number five. For this purpose, each bin entry can be regarded as an independent Poisson process. In the simplest case the uncertainty is the square root of the number of events.

In [None]:
# May be implemented by pupils

def symetric_uncertainty_poisson(array):
    return np.sqrt(array)

In [None]:
print(symetric_uncertainty_poisson(2))
print(a_hist)
print(symetric_uncertainty_poisson(a_hist))
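
A consequence of the square-root rule is that the relative uncertainty shrinks as the bin content grows. The sketch below illustrates this for a few hypothetical bin contents.

```python
import numpy as np

counts = np.array([4, 25, 100, 10000])  # hypothetical bin contents
absolute = np.sqrt(counts)              # Poisson uncertainty sqrt(n)
relative = absolute / counts            # equals 1 / sqrt(n)

for n, abs_u, rel_u in zip(counts, absolute, relative):
    print(n, abs_u, rel_u)
```

A bin with 4 entries has a relative uncertainty of 50 %, a bin with 10000 entries only 1 %. This is why a large simulated sample gives a more precise expectation.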

All functions are now combined for the graphical display. It is also useful to scale the simulated events to the number of measurements performed, so that the expectation can be compared with the actual measurement. For a good prediction, simulate a much larger number of events than were measured and then compare the scaled result with the measurement.

In [None]:
# May be implemented by pupils

def scaling_simulation_to_measurement(measurement):
    return 1.0 if np.sum(measurement) == 0.0 else np.sum(measurement)
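
How the scale factor might be used is sketched below. This is an assumption about the intended use, not the widget's actual code: a large simulated histogram is normalized and multiplied by the number of measured events. The arrays `simulated` and `measured` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # seed chosen arbitrarily for reproducibility

# Large simulation: 60000 rolls of an ideal die (upper bound 7 is exclusive)
simulated = np.bincount(rng.integers(1, 7, size=60000), minlength=7)[1:]

# Hypothetical measurement with 30 rolls
measured = np.array([4, 6, 5, 3, 7, 5])

# Scale the normalized simulation to the number of measured events
scale = np.sum(measured)
expected = simulated / np.sum(simulated) * scale

print(np.round(expected, 2))
print(measured)
```

Each expected entry is close to 5, the ideal value for 30 rolls, and can be compared bin by bin with the measurement.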

Addition:  
At the end a quantity should be implemented that evaluates whether the existing measurement is compatible with the simulated distribution. As an example the $p_0$ value is taken, which is calculated from $\chi^2$, defined for a histogram as follows: $$ \chi^2 = \sum_{i=1}^N \frac{(n_i - y_i)^2}{\sigma_i^2} \, .$$ Here $n_i$ is the number of measured events in bin $i$, $y_i$ is the expected number of events in that bin, taken from the simulation scaled to the measurement, and $\sigma_i = \sqrt{y_i}$ is the uncertainty of the bin entry. 
The value of $\chi^2$ measures the deviation of the measurement from the simulated values. Each squared deviation is weighted by the inverse of its squared uncertainty, so bins with a large uncertainty change the total less than bins with a small statistical uncertainty.

The value thus calculated can be translated into a $p$ value. This is a measure for testing a hypothesis and indicates how compatible the measurement is with the expectation. By definition, it is the probability of obtaining a deviation at least as large as the observed one, under the condition that the hypothesis is correct. If $p_0$ falls below a pre-defined threshold (often $0.05$ or $0.01$), the chosen hypothesis, i.e. the expectations $y_i$, can be discarded in favour of a new hypothesis.

In [None]:
import scipy.stats

def p0_from_chi2(messung, erwartung):
    # (mess, erw) is one pair from the paired (measurement, expectation) entries;
    # bins with zero expectation are skipped to avoid a division by zero
    chi2_ = sum((1.0/erw) * (mess - erw) ** 2 for (mess, erw) in zip(messung, erwartung) if float(erw) != 0.0)
    # Translate into the probability of a deviation at least this large
    p0_ = 1.0 - scipy.stats.chi2.cdf(chi2_, df=len(messung) - 1)
    # Return value: name of the quantity and the quantity itself (for the conversion)
    return "p_0", p0_
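
As a cross-check without SciPy, the same decision can be made by comparing $\chi^2$ with a tabulated critical value. The sketch below assumes 6 bins (5 degrees of freedom) and a threshold of $0.05$, for which standard $\chi^2$ tables give a critical value of about 11.07. The helper `chi2_of_histogram` and both measurements are hypothetical.

```python
import numpy as np

def chi2_of_histogram(measured, expected):
    """Chi-square of a histogram with sigma_i^2 = y_i (hypothetical helper)."""
    measured = np.asarray(measured, dtype=float)
    expected = np.asarray(expected, dtype=float)
    mask = expected != 0.0  # skip bins without expectation, as in p0_from_chi2
    return float(np.sum((measured[mask] - expected[mask]) ** 2 / expected[mask]))

CRITICAL_VALUE_DF5_P005 = 11.07  # from a chi-square table: 5 degrees of freedom, p = 0.05

expected = np.full(6, 5.0)          # ideal die, 30 rolls
fair_like = [4, 6, 5, 3, 7, 5]      # hypothetical measurement
loaded_like = [1, 1, 1, 1, 1, 25]   # hypothetical measurement

for name, measured in (("fair_like", fair_like), ("loaded_like", loaded_like)):
    chi2_value = chi2_of_histogram(measured, expected)
    verdict = "rejected" if chi2_value > CRITICAL_VALUE_DF5_P005 else "not rejected"
    print(name, chi2_value, verdict)
```

The first measurement gives $\chi^2 = 2$, well below the critical value, so the ideal-die hypothesis is not rejected. The second gives $\chi^2 = 96$ and is rejected.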

Summary of the created functions in a class (only relevant for the transfer of the class to the application). All functions should be assigned.

In [None]:
class OwnCalculations(object):
    pass

# Replace with another function if necessary
OwnCalculations.own_pdf_measurement_one = my_random_dice_roll_simulation

OwnCalculations.own_pdf_simulation_one = my_random_dice_roll_simulation
OwnCalculations.own_mean = my_mean
OwnCalculations.own_std = my_standard_deviation
OwnCalculations.own_norm = normalisation
OwnCalculations.own_measurement_scale = scaling_simulation_to_measurement
OwnCalculations.own_individual_std = symetric_uncertainty_poisson
OwnCalculations.own_stat_evaluation = p0_from_chi2
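
A note on this pattern: a plain function assigned to a class attribute can be called through the class without any change. Called through an instance, it would receive the instance as its first argument. How the widget calls these functions is not shown here, so the following is only a sketch with a hypothetical class `Calculations`. It also shows `staticmethod` as one possible way to make instance calls work.

```python
class Calculations:
    pass

def my_mean(values):
    return sum(values) / len(values)

# Plain assignment: works when called through the class
Calculations.own_mean = my_mean
via_class = Calculations.own_mean([1, 2, 3, 4])

# Called through an instance, the instance is passed as an extra first argument
try:
    Calculations().own_mean([1, 2, 3, 4])
    instance_call_failed = False
except TypeError:
    instance_call_failed = True

# Wrapping in staticmethod makes the call work through an instance as well
Calculations.own_mean_static = staticmethod(my_mean)
via_instance = Calculations().own_mean_static([1, 2, 3, 4])

print(via_class, instance_call_failed, via_instance)
```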

Import of the graphical application:

In [None]:
from include.widget.WuerfelWidget import WidgetWuerfel as WW

`b_num` indicates the number of sides of the die. The `calc_class` argument is the class that combines all previous functions (`OwnCalculations`).

In [None]:
WW(f_width=700, f_height=700, f_bottom=0.09, f_left=0.09, b_num=6, g_width=900, calc_class=OwnCalculations)

Note: The individual functions of the application can be reorganized according to their implementation. However, some functions depend on each other, which makes complete encapsulation impossible.