<a href="https://colab.research.google.com/github/DepartmentOfStatisticsPUE/cda-2022/blob/main/notebooks/cda_1_distributions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup environment

### Python libraries

In [2]:
import scipy.stats as st
import numpy as np
import pandas as pd

## Setup R via Python

In [16]:
%load_ext rpy2.ipython

## Setup Julia via Python

Install Julia

In [None]:
%%bash
wget https://julialang-s3.julialang.org/bin/linux/x64/1.7/julia-1.7.2-linux-x86_64.tar.gz
tar zxvf julia-1.7.2-linux-x86_64.tar.gz
## Python's julia module (PyJulia)
pip install julia

Import Python's `julia` module and set up Julia

In [37]:
import julia
julia.install(julia = "/content/julia-1.7.2/bin/julia")
from julia import Julia
jl = Julia(runtime="/content/julia-1.7.2/bin/julia",compiled_modules=False)
%load_ext julia.magic

Install relevant Julia packages and load them

In [49]:
%%julia
import Pkg; Pkg.add("Distributions")
import Pkg; Pkg.add("DataFrames")
using Distributions
using DataFrames

   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`


# Exercises

## Exercise 1

Assume that a football player with a success rate of 0.4 takes 10 shots on goal. Let $X$ be a random variable denoting the number of goals scored. Please find:

+ The distribution of $X$
+ The probability that the player scores exactly 4 times, $P(X = 4)$
+ The probability that the player scores at least 7 times, $P(X \geq 7) = 1 - P(X \leq 6)$
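Before turning to the library calls, it may help to see the quantities computed directly from the binomial formula $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ with $n = 10$ and $p = 0.4$. The following is a minimal sketch using only the Python standard library; the helper name `binom_pmf` is an illustrative assumption and not part of the original notebook.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), straight from the formula (hypothetical helper)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.4

# The support of X is 0, 1, ..., 10, so there are n + 1 = 11 values.
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

p_exactly_4 = pmf[4]                 # P(X = 4)
p_at_least_7 = sum(pmf[7:])          # P(X >= 7), summed directly
p_at_least_7_alt = 1 - sum(pmf[:7])  # same quantity via 1 - P(X <= 6)

print(round(p_exactly_4, 6))
print(round(p_at_least_7, 6))
```

Both routes to $P(X \geq 7)$ should agree up to floating-point rounding, which is a quick sanity check on the complement rule used in the exercise.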

### Using Python

In [12]:
## Distribution of X (support is 0, 1, ..., 10, hence np.arange(11))
pd.DataFrame({"x": np.arange(11), "p" : [st.binom(10,0.4).pmf(i) for i in np.arange(11)]})

Unnamed: 0,x,p
0,0,0.006047
1,1,0.040311
2,2,0.120932
3,3,0.214991
4,4,0.250823
5,5,0.200658
6,6,0.111477
7,7,0.042467
8,8,0.010617
9,9,0.001573
10,10,0.000105


In [13]:
## P(X=4)
st.binom(10,0.4).pmf(4)

0.2508226560000002

In [15]:
## P(X >= 7) = 1 - P(X <= 6)
1 - st.binom(10,0.4).cdf(6)

0.05476188160000006

### Using R

In [17]:
%%R 
data.frame(x = 0:10, p = dbinom(0:10,10,0.4))

    x            p
1   0 0.0060466176
2   1 0.0403107840
3   2 0.1209323520
4   3 0.2149908480
5   4 0.2508226560
6   5 0.2006581248
7   6 0.1114767360
8   7 0.0424673280
9   8 0.0106168320
10  9 0.0015728640
11 10 0.0001048576


In [18]:
%%R
dbinom(4,10,0.4)

[1] 0.2508227


In [19]:
%%R
1-pbinom(6,10,0.4)

[1] 0.05476188


### Using Julia

In [47]:
%%julia
binom = Binomial(10,0.4)


In [53]:
%%julia
DataFrame(x = 0:10, p = pdf.(binom,0:10))

<PyCall.jlwrap 11×2 DataFrame
 Row │ x      p
     │ Int64  Float64
─────┼────────────────────
   1 │     0  0.00604662
   2 │     1  0.0403108
   3 │     2  0.120932
   4 │     3  0.214991
   5 │     4  0.250823
   6 │     5  0.200658
   7 │     6  0.111477
   8 │     7  0.0424673
   9 │     8  0.0106168
  10 │     9  0.00157286
  11 │    10  0.000104858>

In [54]:
%%julia
pdf(binom, 4)

0.2508226560000002

In [55]:
%%julia
1-cdf(binom,6)

0.05476188159999995

## Exercise 2

The number of car accidents per day in a certain city follows a Poisson distribution with expected value $\lambda = 2$. Find the probability that at most 4 car accidents happen in one day, i.e. $P(X \leq 4)$.
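As a cross-check on the library results, $P(X \leq 4)$ can be summed directly from the Poisson formula $P(X = k) = e^{-\lambda} \lambda^k / k!$ for $k = 0, \dots, 4$. This is a minimal sketch using only the Python standard library; the helper name `poisson_pmf` is an illustrative assumption, not something defined in the original notebook.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), straight from the formula (hypothetical helper)."""
    return exp(-lam) * lam**k / factorial(k)


lam = 2

# "At most 4" means k = 0, 1, 2, 3, 4.
p_at_most_4 = sum(poisson_pmf(k, lam) for k in range(5))

print(round(p_at_most_4, 6))
```

For $\lambda = 2$ the five terms add up to $7e^{-2}$, so the result can also be verified by hand.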

In [56]:
st.poisson(2).cdf(4)

0.9473469826562889

In [58]:
%%R
ppois(4,2)

[1] 0.947347


In [62]:
%%julia
cdf(Poisson(2),4)

0.9473469826562888