### JPX: Are my monkeys behaving normally?
Just for fun I have some monkeys that [do my trading for me](https://www.kaggle.com/code/carlmcbrideellis/jpx-baseline-monkey-throwing-dart-model), inspired by the famous suggestion by [Burton Malkiel](https://en.wikipedia.org/wiki/Burton_Malkiel), in his book [A Random Walk Down Wall Street](https://wwnorton.com/books/9780393358384), that "*a blindfolded monkey throwing darts at the stock listings could select a portfolio that would do as well as one selected by the experts*". As I can only give them 5 bananas per day, here we simulate the results of 50,000 such random submissions.

Read in the supplemental `stock_prices.csv`, which is the data used to calculate the Leaderboard:

In [None]:
import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt

In [None]:
prices = pd.read_csv('../input/jpx-tokyo-stock-exchange-prediction/supplemental_files/stock_prices.csv', parse_dates=["Date"])

Set up the [evaluation function](https://www.kaggle.com/code/smeitoma/jpx-competition-metric-definition):

In [None]:
def calc_spread_return_sharpe(df: pd.DataFrame, portfolio_size: int = 200, toprank_weight_ratio: float = 2) -> float:
    """
    Args:
        df (pd.DataFrame): predicted results
        portfolio_size (int): # of equities to buy/sell
        toprank_weight_ratio (float): the relative weight of the most highly ranked stock compared to the least.
    Returns:
        (float): sharpe ratio
    """
    def _calc_spread_return_per_day(df, portfolio_size, toprank_weight_ratio):
        """
        Args:
            df (pd.DataFrame): predicted results
            portfolio_size (int): # of equities to buy/sell
            toprank_weight_ratio (float): the relative weight of the most highly ranked stock compared to the least.
        Returns:
            (float): spread return
        """
        assert df['Rank'].min() == 0
        assert df['Rank'].max() == len(df['Rank']) - 1
        weights = np.linspace(start=toprank_weight_ratio, stop=1, num=portfolio_size)
        purchase = (df.sort_values(by='Rank')['Target'][:portfolio_size] * weights).sum() / weights.mean()
        short = (df.sort_values(by='Rank', ascending=False)['Target'][:portfolio_size] * weights).sum() / weights.mean()
        return purchase - short

    buf = df.groupby('Date').apply(_calc_spread_return_per_day, portfolio_size, toprank_weight_ratio)
    sharpe_ratio = buf.mean() / buf.std()
    return sharpe_ratio
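
To make the per-day calculation concrete, here is a minimal sketch of the same arithmetic on a single toy day. The six `Target` values and the portfolio size of 2 are made up for illustration; only the weighting scheme is taken from the function above.

```python
import numpy as np
import pandas as pd

# Toy day with 6 stocks (hypothetical values) and a portfolio of 2 per side.
day = pd.DataFrame({
    "Rank":   [0, 1, 2, 3, 4, 5],
    "Target": [0.03, 0.01, 0.00, 0.00, -0.01, -0.02],
})
portfolio_size = 2
toprank_weight_ratio = 2

# Weights fall linearly from toprank_weight_ratio (first pick) to 1 (last pick).
weights = np.linspace(start=toprank_weight_ratio, stop=1, num=portfolio_size)

# Long leg: the lowest Rank values; short leg: the highest Rank values.
long_leg = day.sort_values("Rank")["Target"].to_numpy()[:portfolio_size]
short_leg = day.sort_values("Rank", ascending=False)["Target"].to_numpy()[:portfolio_size]

purchase = (long_leg * weights).sum() / weights.mean()
short = (short_leg * weights).sum() / weights.mean()
spread_return = purchase - short
print(weights, spread_return)
```

The score is then the mean of these daily spread returns divided by their standard deviation, so a ranking only helps if it separates winners from losers consistently across days.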

Set up our monkey instance:

In [None]:
n_monkeys = 50000

In [None]:
codes = prices["SecuritiesCode"].unique()
results = []
for i in range(n_monkeys):
    # each monkey assigns an independent uniform random score to every stock
    code_list = [[code, random.random()] for code in codes]
    codes_df = pd.DataFrame(code_list, columns=['code', 'prediction'])
    # convert the scores into ranks 0 .. len(codes)-1, as required by the metric
    codes_df['Rank'] = codes_df.prediction.rank(method="min").astype('int') - 1
    # apply the same ranking on every date, then score the submission
    mapping = dict(codes_df[['code', 'Rank']].values)
    prices['Rank'] = prices['SecuritiesCode'].map(mapping)
    results.append(calc_spread_return_sharpe(prices))
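
Ranking independent uniform random numbers gives a uniformly random ordering of the stocks (ties are vanishingly unlikely), so a random ranking could also be drawn directly as a permutation. The sketch below assumes NumPy's `Generator.permutation`; the five security codes and the seed are placeholders, not values read from the competition data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily, for reproducibility

# Hypothetical stand-in for prices["SecuritiesCode"].unique()
codes = np.array([1301, 1332, 1333, 1375, 1376])

# A uniformly random permutation of 0..n-1 is a valid random ranking:
# every code gets a distinct rank, the minimum is 0 and the maximum is n-1.
ranks = rng.permutation(len(codes))
mapping = dict(zip(codes.tolist(), ranks.tolist()))
print(mapping)
```

The resulting `mapping` has the same shape as the one built in the loop above, so it could be passed to `prices['SecuritiesCode'].map(...)` in the same way.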

In [None]:
print("Maximum score:  %.3f" % max(results))
print("Minimum score: %.3f" % min(results))

Let us now fit a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution) to the histogram of the results, and plot the fit:

In [None]:
from scipy.optimize import curve_fit

# define a Gaussian function
def Gaussian(x,mu,sigma,A):
    return A*np.exp(-0.5 * ((x-mu)/sigma)**2)

fig, ax = plt.subplots(figsize=(10, 5))
bin_heights,bin_borders, _ = plt.hist(results, bins=150)
bin_centers = bin_borders[:-1] + np.diff(bin_borders)/2

# seed guess
initial_guess=(0, 0.1 , 1)
# the fit
parameters,covariance=curve_fit(Gaussian, bin_centers, bin_heights, initial_guess)
# one-standard-deviation uncertainties of the fitted (mu, sigma, A); not the σ of the distribution
sigma=np.sqrt(np.diag(covariance))

x_interval = np.linspace(min(results), max(results), n_monkeys)
plt.plot(x_interval,Gaussian(x_interval,parameters[0],parameters[1],parameters[2]),color='red',lw=4,label='Gaussian fit', alpha=0.8)
plt.xlabel('LB score',fontsize=14)
plt.yticks([])
# show the 'Gaussian fit' label set above
plt.legend();

The fitted normal distribution has the following parameters:

In [None]:
print("μ = %.5f" % parameters[0])
print("σ = %.5f" % abs(parameters[1]) )

The Sharpe ratio is effectively calculated over one average day. Assuming 252 trading days per year, a yearly Sharpe ratio of 1 (good) would thus correspond to an LB score of \\( 1/\sqrt{252} \approx 0.063 \\), an excellent Sharpe ratio of 2 would correspond to an LB score of \\( 2/\sqrt{252} \approx 0.126 \\), *etc.* (many thanks to [Bowaka](https://www.kaggle.com/bowaka) for this calculation).

Let us see empirically what percentage of the monkeys did better than these values:

In [None]:
Sharpe_values = [1,2,3,4,5,6,7]
# assume the number of trading days per year = 252

for k in Sharpe_values:
    LB_score = k/np.sqrt(252)
    count = sum(i > LB_score for i in results)
    print("Yearly Sharpe",k,"LB score %.3f" % LB_score,"percentage = %.1f" % ((100/n_monkeys)*count) ) 

### Normality tests
Here we are looking for *p*-values greater than 0.05, which would indicate that we cannot reject the null hypothesis that the data were drawn from a normal distribution.
#### [Shapiro-Wilk test](https://docs.scipy.org/doc/scipy-1.8.0/html-scipyorg/reference/generated/scipy.stats.shapiro.html) 
(Note: For *N* > 5000 the W test statistic is accurate but the *p*-value may not be.)

In [None]:
from scipy.stats import shapiro

Wstat, p = shapiro(results)

print('W statistic = %.3f, p-value = %.3f' % (Wstat, p))

#### [D’Agostino and Pearson test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.normaltest.html)

In [None]:
from scipy.stats import normaltest

k2, p = normaltest(results)

print('result = %.3f, p-value = %.3f' % (k2, p))

Here the result is \\(s^2 + k^2 \\), where *s* is the z-score returned by a [skew test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skewtest.html) and *k* is the z-score returned by a [kurtosis test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kurtosistest.html).
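
The relation between the statistic and the two z-scores can be checked numerically. This is a small sketch on synthetic normal data (not the monkey scores), using `skewtest`, `kurtosistest` and `normaltest` from `scipy.stats`; the sample size and seed are arbitrary.

```python
import numpy as np
from scipy.stats import kurtosistest, normaltest, skewtest

rng = np.random.default_rng(seed=0)
sample = rng.normal(size=1_000)  # synthetic data for illustration

s, _ = skewtest(sample)        # z-score of the skewness test
k, _ = kurtosistest(sample)    # z-score of the kurtosis test
k2, p = normaltest(sample)     # combined D'Agostino and Pearson statistic

print(np.isclose(k2, s**2 + k**2))
```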

### Conclusions
* Both visually and via the two normality tests we can see that our simulated monkeys are fairly normal
* It is not realistically possible for a random submission to obtain a Leaderboard score above 0.4

### Annex
Let us calculate the probability that a monkey can obtain a Leaderboard score greater than, say, 0.25:

In [None]:
from scipy.special import erf

mu       = 0 # assume mu is exactly zero
sigma    = abs(parameters[1])
LB_score = 0.25

probability = (erf((LB_score-mu)/(sigma*np.sqrt(2)))/2) + 0.5

print(f'Probability = {round((1-probability)*100,3)}%')
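
For scores far out in the tail, subtracting a cumulative probability very close to 1 from 1 loses floating-point precision. One alternative, sketched below, is to compute the upper-tail probability directly with the complementary error function from the standard library. The helper name `normal_tail_probability` and the value of `example_sigma` are hypothetical; the latter is not the fitted σ from this notebook.

```python
import math

def normal_tail_probability(x: float, mu: float, sigma: float) -> float:
    """P(X > x) for X ~ N(mu, sigma^2), via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

# example_sigma is a made-up placeholder, not the fitted value.
example_sigma = 0.08
tail = normal_tail_probability(0.25, mu=0.0, sigma=example_sigma)
print(f"Probability = {tail * 100:.3g}%")
```

Mathematically this is the same quantity as `1 - probability` in the cell above, since \\( \tfrac{1}{2}\,\mathrm{erfc}(z) = 1 - \left(\tfrac{1}{2}\,\mathrm{erf}(z) + \tfrac{1}{2}\right) \\).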