# Biased and fair coin sequences of arbitrary length

### Installing the parallel_simulations module

In [None]:
%pip install -q -e "git+https://github.com/vladimirrotariu/parallel-monte-carlo-simulations#egg=parallel_simulations&subdirectory=src"

### Importing the PipelineOptions class

In [29]:
import sys
from apache_beam.options.pipeline_options import PipelineOptions

### Importing the ParallelMCBattery class

In [30]:
from parallel_simulations import ParallelMCBattery

### Creating the battery configurations, and instantiating the Monte Carlo battery for simulations object

The Monte Carlo parallel battery is configured with two settings: the random number generator, in this case [Philox](https://numpy.org/doc/stable/reference/random/bit_generators/philox.html#philox-counter-based-rng), and an instance of Apache Beam's [PipelineOptions](https://beam.apache.org/releases/pydoc/2.33.0/apache_beam.options.pipeline_options.html#apache_beam.options.pipeline_options.PipelineOptions) class. For simplicity, this example uses the default options, which means the pipeline runs on the local `Direct Runner`.

In [31]:
# Because the Apache Beam application runs in a Jupyter notebook, we provide the pipeline options programmatically rather than via the CLI
sys.argv = [""]
###############################################################################################################################

options = PipelineOptions()

battery_configs = {"rng" : "Philox", "pipeline_options": options}

battery_parallel_MC = ParallelMCBattery(battery_configs=battery_configs)

### Defining the statistical model

We want to model two distinct **parallelizable** sequences of heads 'H' and tails 'T', generated by simulating tosses of a coin with a given `bias`. The model receives `bias` as its list of `parameters`, which in this case holds a single value: the probability of heads.

In [32]:
def CoinSequence(number_flips, rng, bias):
    return ["H" if rng.random() <= bias[0] else "T" for _ in range(number_flips)]

models = [CoinSequence, CoinSequence]
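
The model can be sanity-checked on its own before it is handed to the battery. The sketch below is illustrative only: it assumes nothing about `rng` beyond a `random()` method that returns a float in [0, 1), which `numpy.random.Generator` provides, and it uses the standard library's `random.Random` as a stand-in so that it runs without the battery. The seed and sample sizes are arbitrary.

```python
import random

def CoinSequence(number_flips, rng, bias):
    return ["H" if rng.random() <= bias[0] else "T" for _ in range(number_flips)]

# Stand-in generator (an assumption for this check): any object with a
# random() method returning a float in [0, 1) works here.
rng = random.Random(2024)

# One fair-coin sequence of 16 flips, printed in the same comma-separated layout.
fair_sample = CoinSequence(16, rng, [0.5])
print(",".join(fair_sample))

# With bias 1.0 every draw satisfies rng.random() <= 1.0, so all flips are heads.
certain_heads = CoinSequence(8, rng, [1.0])

# Over many flips the fraction of heads should be close to the bias.
long_sample = CoinSequence(10_000, rng, [0.7])
heads_fraction = long_sample.count("H") / len(long_sample)
print(f"fraction of heads: {heads_fraction:.3f}")
```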

### Creating the simulations configurations

The simulations for these models are configured with a list of dictionaries, one dictionary per model, in the same order as `models`.

In [33]:
# 100,000 sequences, each of 16 'H' or 'T' generated by simulating tossing a fair coin
fair_coin_config = {"parameters": [0.5], "number_simulations" : 100000, "number_points": 16}

# 60,000 sequences, each of 32 'H' or 'T' generated by simulating tossing a biased coin
biased_coin_config = {"parameters": [0.7], "number_simulations" : 60000, "number_points": 32}

simulation_configs = [fair_coin_config, biased_coin_config]
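
Because every configuration carries its own `parameters`, `number_simulations` and `number_points`, a quick summary can help confirm the workload before launching a run. The helper below, `summarize_configs`, is hypothetical and not part of `parallel_simulations`; it assumes that `parameters[0]` is the probability of heads, as in `CoinSequence`.

```python
fair_coin_config = {"parameters": [0.5], "number_simulations": 100000, "number_points": 16}
biased_coin_config = {"parameters": [0.7], "number_simulations": 60000, "number_points": 32}
simulation_configs = [fair_coin_config, biased_coin_config]

def summarize_configs(configs):
    """Hypothetical helper: total flips and expected heads per sequence for each config."""
    summary = []
    for index, config in enumerate(configs):
        # Total number of coin flips generated for this model.
        total_flips = config["number_simulations"] * config["number_points"]
        # Expected number of heads in one sequence (assumes parameters[0] is P(heads)).
        expected_heads = config["number_points"] * config["parameters"][0]
        summary.append(
            {"index": index, "total_flips": total_flips, "expected_heads": expected_heads}
        )
    return summary

summary = summarize_configs(simulation_configs)
for row in summary:
    print(row)
```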

### Simulating with minimal configurations

The Monte Carlo simulations configured above can now be run. Keep in mind that if `output_paths` is not specified, the comma-separated output files are written by default to the directory from which the Python script calling the `simulate` method is executed:

In [34]:
battery_parallel_MC.simulate(models, simulation_configs)

### Sample inspection

As we did not specify `output_paths`, each output file is named after the index of its model in the `models` list.

In [35]:
!head -n 3 0.txt

H,H,H,H,T,H,T,H,H,H,T,H,T,T,H,H
T,H,H,H,H,H,H,T,T,T,H,H,H,T,T,H
T,H,T,T,T,H,H,T,T,H,H,T,H,H,H,T


In [36]:
!head -n 3 1.txt

H,H,H,T,H,H,H,H,H,T,T,H,H,H,H,H,H,T,H,H,H,H,H,H,H,H,H,H,T,H,H,T
H,T,H,T,T,H,H,H,H,H,T,H,T,H,T,T,T,T,H,H,H,H,H,H,T,H,H,H,T,H,H,H
H,H,H,H,T,H,H,H,T,T,T,H,H,H,H,H,H,H,H,H,T,T,T,T,T,H,T,T,T,H,H,H
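
Beyond inspecting the first rows by eye, the fraction of heads can be checked programmatically. The sketch below assumes only the layout shown above, one sequence per line with comma-separated 'H' and 'T' values. `count_heads` is a hypothetical helper, and the three rows of `0.txt` shown earlier are copied into a temporary file so that the example runs on its own.

```python
import os
import tempfile

def count_heads(path):
    """Hypothetical helper: return (heads, total_flips) for a file of comma-separated flips."""
    heads = 0
    total = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            flips = line.split(",")
            heads += flips.count("H")
            total += len(flips)
    return heads, total

# The three rows displayed from 0.txt above, reused as self-contained sample data.
sample_rows = [
    "H,H,H,H,T,H,T,H,H,H,T,H,T,T,H,H",
    "T,H,H,H,H,H,H,T,T,T,H,H,H,T,T,H",
    "T,H,T,T,T,H,H,T,T,H,H,T,H,H,H,T",
]

with tempfile.TemporaryDirectory() as directory:
    sample_path = os.path.join(directory, "sample.txt")
    with open(sample_path, "w") as handle:
        handle.write("\n".join(sample_rows) + "\n")
    heads, total = count_heads(sample_path)

print(f"{heads} heads out of {total} flips ({heads / total:.3f})")
```

Applied to the full output files, one would expect `count_heads("0.txt")` and `count_heads("1.txt")` to give fractions close to 0.5 and 0.7 respectively.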


### Simulating with output_paths

As expected, both relative and absolute paths may be used to configure where the output files of the Monte Carlo simulation are created.

In [None]:
output_paths = ["A","B"]

battery_parallel_MC.simulate(models, simulation_configs, output_paths)

In [None]:
!head -n 3 A

In [None]:
!head -n 3 B