# About
Implementation of the 5-Star Ratings Problem

In [1]:
import numpy as np

# MAIN

Settings:

In [2]:
rating_threshold = 4.8

n_trials = 18

dimensions = 5  # one dimension per face (star value); see note below
# NOTE: Come back and think about whether it is ok to call this "dimensions".
#   If you know the counts for (dimensions - 1) faces, the final face's count is determined too.
#   But for this method it is more intuitive to give one dimension per face anyway.

Derived variables:

In [3]:
# Add one because we want to represent 0 to n_trials (INCLUSIVE of 0 and n)
array_shape = [n_trials+1 for _ in range(dimensions)]

print("Array Shape:\n", array_shape)

Array Shape:
 [19, 19, 19, 19, 19]


### 1. Valid Position Flag
Flag whether each position is permitted
* Index position represents the number of times that each dimension occurred
* A position is permitted if its indices sum to `n_trials`
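
As a minimal sketch of the idea, assuming a hypothetical small case with 2 faces and 3 trials (the `small_*` names are illustrative only), the permitted positions form the anti-diagonal of a 4 x 4 grid:

```python
import numpy as np

# Hypothetical small case for illustration: 2 faces, 3 trials
small_n_trials = 3
small_dimensions = 2

# One index array per face, each of shape (4, 4)
small_indices = np.indices((small_n_trials + 1,) * small_dimensions)

# A position is permitted when its face counts sum to the number of trials
small_valid = small_indices.sum(axis=0) == small_n_trials

print(small_valid.astype(int))
# [[0 0 0 1]
#  [0 0 1 0]
#  [0 1 0 0]
#  [1 0 0 0]]
```

Each permitted position (i, j) reads as "face 1 occurred i times, face 2 occurred j times", with i + j = 3.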

Get Index Positions:

In [4]:
# Get Indices
# * Each dimension gets its own array_shape array
indices = np.indices(array_shape)

Checks:

In [5]:
# First Corner Value

# Show the first corner value of the first dimension's matrix
first_corner_value = indices[0][0,0,0,0,0]
first_corner_value

0

In [6]:
# Last Corner Value

# Show the last corner value of the first dimension's matrix
last_corner_value = indices[0][ n_trials, n_trials, n_trials, n_trials, n_trials]

# Check that the maximum can match the number of trials
assert last_corner_value == n_trials, "DimensionError: Last corner value doesn't match the number of trials"

print(last_corner_value)

18


Sum Indices

In [7]:
index_position_sum = indices.sum(axis=0)
index_position_sum.shape

(19, 19, 19, 19, 19)

In [8]:
# Boolean Array to flag the valid positions
valid_position_bool = index_position_sum == n_trials

In [9]:
# Check
number_of_valid_positions = valid_position_bool.sum()

# ----------------------------
### Cross check 
### Using "stars and bars" from combinatorics
from math import comb

# Calculating the number of ways to distribute n_throws across n_faces
n_throws = n_trials
n_faces = dimensions

# Using the formula (n + k - 1) choose (k - 1)
number_of_ways = comb(n_throws + n_faces - 1, n_faces - 1)

message = "Error: valid position count doesn't match the cross check. "
message += f"Valid Position Count: {number_of_valid_positions}. Cross check count: {number_of_ways}"
assert number_of_valid_positions == number_of_ways, message
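
The stars-and-bars formula itself can be sanity-checked by brute force on small cases. The sketch below uses a hypothetical helper, `count_compositions`, that is not part of the notebook; it enumerates every tuple of face counts and keeps those that sum to the number of throws.

```python
from itertools import product
from math import comb


def count_compositions(n_throws, n_faces):
    """Brute-force count of non-negative integer tuples of length n_faces that sum to n_throws."""
    return sum(
        1
        for counts in product(range(n_throws + 1), repeat=n_faces)
        if sum(counts) == n_throws
    )


# Compare brute force with (n + k - 1) choose (k - 1) on a few small cases
for n_throws, n_faces in [(3, 2), (4, 3), (5, 4)]:
    assert count_compositions(n_throws, n_faces) == comb(n_throws + n_faces - 1, n_faces - 1)

# The notebook's settings: 18 trials across 5 faces
print(comb(18 + 5 - 1, 5 - 1))  # 7315
```

Brute force is only practical for small inputs, since it visits (n + 1)^k tuples; the closed form is what scales to 18 trials and 5 faces.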


Conclusion:

In [10]:
# Valid Position Matrix
# MANUAL CHECK: True values should lie along the anti-diagonals
valid_position_bool

array([[[[[False, False, False, ..., False, False,  True],
          [False, False, False, ..., False,  True, False],
          [False, False, False, ...,  True, False, False],
          ...,
          [False, False,  True, ..., False, False, False],
          [False,  True, False, ..., False, False, False],
          [ True, False, False, ..., False, False, False]],

         [[False, False, False, ..., False,  True, False],
          [False, False, False, ...,  True, False, False],
          [False, False, False, ..., False, False, False],
          ...,
          [False,  True, False, ..., False, False, False],
          [ True, False, False, ..., False, False, False],
          [False, False, False, ..., False, False, False]],

         [[False, False, False, ...,  True, False, False],
          [False, False, False, ..., False, False, False],
          [False, False, False, ..., False, False, False],
          ...,
          [ True, False, False, ..., False, False, False],
       

### 2. Rating Threshold Flag

Reuse `indices` from part 1.

In [11]:
indices.shape

(5, 19, 19, 19, 19, 19)

In [12]:
face_values = np.arange(1, dimensions+1)
face_values

array([1, 2, 3, 4, 5])

In [13]:
# Apply required shape for broadcasting
face_values = face_values.reshape((dimensions,) + (1,) * (indices.ndim - 1))
face_values.shape

(5, 1, 1, 1, 1, 1)

In [14]:
# Get Values attributed to each event
# * Position value data separate for each dimension 
position_values_separated = (face_values * indices)

# Position Value Total
# * Combine the dimensions
position_values = position_values_separated.sum(axis=0)

# Get Average Per Trial (Divide by n_trials)
position_mean = position_values / n_trials
position_mean.shape

(19, 19, 19, 19, 19)
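
To see what the broadcasting does, here is a small sketch assuming a hypothetical case with 3 faces and 2 trials (the `small_*` names are illustrative only). Each entry is the mean face value for the counts given by its index position.

```python
import numpy as np

# Hypothetical small case for illustration: 3 faces, 2 trials
small_n_trials = 2
small_dimensions = 3

small_indices = np.indices((small_n_trials + 1,) * small_dimensions)

# Shape (3, 1, 1, 1) so that face value k multiplies the k-th index array
small_face_values = np.arange(1, small_dimensions + 1).reshape(
    (small_dimensions,) + (1,) * small_dimensions
)

small_mean = (small_face_values * small_indices).sum(axis=0) / small_n_trials

print(small_mean[2, 0, 0])  # two 1s          -> 1.0
print(small_mean[0, 0, 2])  # two 3s          -> 3.0
print(small_mean[1, 0, 1])  # one 1 and one 3 -> 2.0
```

Only positions whose indices sum to the number of trials are true means; the other entries are divided by the same trial count but do not correspond to a possible scenario.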

**Threshold Flag**

Using the CDF paradigm (as opposed to survival function)

In [15]:
# Flag where scenario mean is <= the target threshold
threshold_flag = position_mean <= rating_threshold

### 3. Scenario Probability

ToDo:
* Get the probability for each scenario in the matrix
* Use threshold flag and valid position flag to get
    * probability of getting a scenario <= `rating_threshold`
    
To get the probabilities:
* Input a vector **`p`** with *p_i* for each dimension
* Use `n_trials` and `dimensions`
* Probably the right way:
    * Use a pmf from scipy? Does the inverse beta step come later?
* Probably the wrong way:
    * Plug this into some sort of inverse multinomial distribution (~multinomial inverse beta function)?
    * Think about why this is not right.
    * How is it used in the two-faced coin case?

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multinomial.html
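
One possible way to combine the pieces, offered as a sketch rather than the final method: evaluate `multinomial.pmf` at every valid position and sum the probabilities where the mean is at or below the threshold. The helper name `prob_mean_at_most` is hypothetical. The full index grid is built as in parts 1 and 2, which may be memory-hungry for many faces or trials.

```python
import numpy as np
from scipy.stats import multinomial


def prob_mean_at_most(n_trials, p, threshold):
    """Hypothetical helper: P(mean face value <= threshold) under a multinomial model."""
    p = np.asarray(p, dtype=float)
    n_faces = p.size

    # Same construction as parts 1 and 2, for an arbitrary number of faces
    indices = np.indices((n_trials + 1,) * n_faces)
    valid = indices.sum(axis=0) == n_trials

    # Keep only the valid positions: one row of face counts per scenario
    counts = indices[:, valid].T  # shape (n_valid, n_faces)

    # Mean face value of each scenario (faces are valued 1..n_faces)
    face_values = np.arange(1, n_faces + 1)
    means = counts @ face_values / n_trials

    # Multinomial probability of each scenario
    probs = multinomial.pmf(counts, n_trials, p)

    return probs[means <= threshold].sum()


# Small check: fair two-faced case, 2 trials, threshold 1.5
# Scenarios (count of face 1, count of face 2): (2, 0) mean 1.0, (1, 1) mean 1.5, (0, 2) mean 2.0
small_prob = prob_mean_at_most(2, [0.5, 0.5], 1.5)
print(small_prob)  # approximately 0.75 (= 0.25 + 0.5)
```

Restricting to valid positions before calling the pmf keeps the evaluation to the scenarios counted by the stars-and-bars check, instead of the whole grid.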

In [16]:
from scipy.stats import multinomial



In [17]:
multinomial.pmf([5,5], 10, [0.5, 0.5])

0.24609375000000044
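
With two equally likely faces the multinomial reduces to a binomial, so the value above can be checked against the closed form C(10, 5) / 2^10. A minimal check using only the standard library:

```python
from math import comb

# Two equally likely faces, 10 trials, exactly 5 of each
closed_form = comb(10, 5) * 0.5 ** 10
print(closed_form)  # 0.24609375
```

The scipy result differs from this only in the last few digits, which is consistent with floating-point rounding.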

In [18]:
multinomial.pmf?

Signature: multinomial.pmf(x, n, p)
Docstring:
Multinomial probability mass function.

Parameters
----------
x : array_like
    Quantiles, with the last axis of `x` denoting the components.
n : int
    Number of trials
p : array_like
    Probability of a trial falling into each category; should sum to 1

Returns
-------
pmf : ndarray or scalar
    Probability density function evaluated at `x`

Notes
-----
`n` should be a positive integer. Each element of `p` should be in the
interval :math:`[0,1]` and the elements should sum to 1. If they do not sum to
1, the last element of the `p` array is not used and is replaced with the
remaining probability left over from the earlier elements.
File:      c:\users\ferga\projects\pyenvs\env\lib\site-packages\scipy\stats\_multivariate.py
Type:      method


# Roughwork

### Roughwork 1

Functions:

In [19]:
def f(x):
    return 2 * x

def is_possible_scenario(x):
    
    # Placeholder response
    possible_flags = (x > 0.5).astype(int)
    
    return possible_flags 

In [20]:
# Define the function g which returns 1 if the element is greater than 0.5, else 0
def g(x):
    return (x > 0.5).astype(int)

# Placeholder input: assumed example values
A = np.array([0.25, 0.75, 1.0])

# Apply f to A
B = f(A)

# Apply g to A
C = g(A)

# Element-wise multiplication of B and C
D = B * C

# Sum of the elements of D
sum_D = np.sum(D)
print("Sum of the elements in D:", sum_D)

Sum of the elements in D: 3.5

### Roughwork 2

In [None]:
import numpy as np

def create_ndim_array(num_dims, size_per_dim):
    """
    Create an N-dimensional NumPy array where each element is the sum of its indices.

    :param num_dims: Number of dimensions of the array
    :param size_per_dim: Size of each dimension
    :return: N-dimensional array with each element being the sum of its indices
    """
    # Generate arrays of indices for each dimension
    indices = np.indices((size_per_dim,) * num_dims)
    
    # Sum along the first axis to sum across all dimensions
    array_sum = indices.sum(axis=0)
    return array_sum

# Example usage
num_dims = 2  # Number of dimensions
size_per_dim = 5  # Size of each dimension

ndim_array = create_ndim_array(num_dims, size_per_dim)
print(ndim_array)
print("Shape of the array:", ndim_array.shape)


Roughwork to understand:

In [None]:
indices = np.indices((5,) * 2)  # (n_trials,) * n_dimensions
indices

In [None]:
indices.sum(axis=1)

In [None]:
indices.sum(axis=0)

Feasible outcome matrix 
* A boolean matrix
* True if sum of indices = `n_trials`