# Final Project Report

* Class: DS 5100
* Student Name: Khalil Goddard
* Student Net ID: mfu2as
* This URL: a URL to the notebook source of this document

# Instructions

Follow the instructions in the Final Project instructions notebook and put evidence of your work in this notebook.

Total points for each subsection under **Deliverables** and **Scenarios** are given in parentheses.

Breakdowns of points within subsections are specified within subsection instructions as bulleted lists.

This project is worth **50 points**.

# Deliverables

## The Monte Carlo Module (10)

- URL included, appropriately named (1).
- Includes all three specified classes (3).
- Includes at least all 12 specified methods (6; .5 each).

Put the URL to your GitHub repo here.

Repo URL:

Paste a copy of your module here.

NOTE: Paste as text, not as code. Use triple backticks to wrap your code blocks.

```
import pandas as pd
import numpy as np

class Dice: 
    
    '''
    This is the Dice Class. 
    A die has N sides, or “faces”, and W weights, and can be rolled to select a face.
    For example, a “die” with 𝑁=2 is a coin, and one with 𝑁=6 is a standard die.
    Normally, dice and coins are “fair,” meaning that each side has an equal weight. 
    An unfair die is one where the weights are unequal.
    Each side contains a unique symbol. Symbols may be all alphabetic or all numeric.
    𝑊 defaults to 1.0 for each face but can be changed after the object is created.
    The weights are just non-negative numbers (integers or floats, including 0), not a 
    normalized probability distribution.
    The die has one behavior, which is to be rolled one or more times.
    
    PURPOSE: Represent a weighted die that can be rolled one or more times
    
    INPUTS
    faces    numpy array of distinct values (strings or numbers) to represent the N faces of a die
    
    '''
    
    def __init__(self, faces: np.array):
        
        
        if type(faces) != np.ndarray:
            raise TypeError(f"'faces' must be of type numpy array, got {type(faces).__name__}")
        
        if faces.dtype.kind not in ('U', 'S') and np.issubdtype(faces.dtype, np.number) == False:
                raise ValueError("Values in 'faces' must be of type string or int")
                
        if len(np.unique(faces)) != len(faces):
            raise ValueError("Values in 'faces' must be distinct")

     
        
        weights  = np.ones(len(faces))
        self.weights = weights 
        self.faces = faces
        
        self.dice_info = pd.DataFrame(weights, index=faces, columns=['Weights'])

        
        
        
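    def change_weight(self, face, new_weight):
        
        if face not in self.faces:
            raise IndexError(f'Passed Face Value, "{face}" not in die array.')
        
        # Only the float conversion can fail, so keep the try block narrow
        try:
            new_weight = float(new_weight)
        except (TypeError, ValueError):
            raise TypeError(f"'new_weight' must be numeric, got {type(new_weight).__name__}")
        
        self.weights[self.faces == face] = new_weight
        # Update the DataFrame too so current_state() reflects the change
        self.dice_info.loc[face, 'Weights'] = new_weight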
                
                
                
                
    def roll_dice(self, rolls = 1):

        return list(pd.DataFrame(self.dice_info.index).sample(n=rolls, replace=True, weights = self.weights)[0])
        
                
            
        
        
        
    def current_state(self):
        
        rslt = self.dice_info.copy()
        return rslt
        
        
        
        
        
class Game: 
    

    
    '''
    A game consists of rolling one or more similar dice (Dice objects) one or more times.

    By similar dice, we mean that each die in a given game has the same number of sides and associated faces, but 
    each die object may have its own weights.
    
    Each game is initialized with a Python list that contains one or more dice.
    
    Game objects have a behavior to play a game, i.e. to roll all of the dice a given number of times.
    
    Game objects only keep the results of their most recent play.
    
    PURPOSE: Roll a list of similar dice a given number of times and store the results
    
    INPUTS
    die    list of one or more Dice objects that share the same faces
    
    '''
    
    
    
    
    def __init__(self, die: list ):
        
        self.die = die
        
        
        
    def play(self, num_rolls: int):
        
        plays = pd.DataFrame()
        plays.index.name = "Roll Number"
        index = 0
        for dice in self.die:
            plays[index] = dice.roll_dice(num_rolls)
            index += 1
            
        self.plays = plays
    
    
    
    
    def show_results(self, form = "wide"):
        
        if form != "wide" and form != "narrow":
            raise ValueError(f"Form must be 'narrow' or 'wide', got '{form}'")
            
        elif form == "wide":
            return self.plays.copy()
        
        else:
            rslt = self.plays.copy()
            
            narrow_df = rslt.reset_index().melt(id_vars='Roll Number'
                                                ,var_name='Dice Number'
                                                ,value_name='FaceRolled')
            
            narrow_df.set_index(['Roll Number', 'Dice Number'], inplace=True)
    
        
            return narrow_df
        
        
       
    
    
    
class Analyzer:
    
    
    
    
    def __init__(self, game):
        
        
        if not isinstance(game, Game):
            raise ValueError(f"Object passed must be a Game object, got {type(game).__name__}")
        else:
            self.game = game
                             
                
                
    def jackpot(self):
        
        # Fetch the results once instead of copying the frame on every iteration
        results = self.game.show_results()
        jackpots = 0 
             
        for i in range(len(results)):
            if len(results.iloc[i].unique()) == 1:
                jackpots += 1
                    
        return jackpots
                     
        
        
    
    def face_counts(self):
       
        # Fetch the results once instead of copying the frame on every iteration
        results = self.game.show_results()
        faces = []

        for i in range(len(results)):
            faces.append(results.iloc[i].value_counts())

        rslt = pd.DataFrame(faces).fillna(0).astype(int)
        rslt.index = results.index
        
        return rslt

    
    
    def combo_count(self):
       
        # Fetch the results once instead of copying the frame on every iteration
        results = self.game.show_results()
        combos = []

        for i in range(len(results)):
            # Sorting makes the order of the dice irrelevant
            combos.append(tuple(sorted(results.iloc[i])))

        combo_s = pd.Series(combos)
        combo_counts = combo_s.value_counts().to_frame(name='count')
        combo_counts.index.name = 'combo'

        return combo_counts
    
    
    
    def permutation_count(self):
        
        rslt = self.game.show_results()
        perms = []
        
        for i in range(len(rslt)):
            row = tuple(rslt.iloc[i])    
            perms.append(row)
        
        
        perm_s = pd.Series(perms)
        perm_counts = perm_s.value_counts().to_frame(name='count')
        
        
        perm_counts.index.name = 'perm'
        
        return perm_counts
        
```
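
The `Dice` docstring notes that weights are relative values rather than a normalized probability distribution. As a minimal sketch of what that implies (this is not the module's own code), the snippet below normalizes a list of weights and draws faces with the standard library's `random.choices`, which accepts relative weights directly. The helper names `normalize` and `roll` are hypothetical.

```python
import random

def normalize(weights):
    """Turn non-negative relative weights into probabilities that sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]

def roll(faces, weights, rolls=1, seed=None):
    """Draw `rolls` faces with replacement, in proportion to `weights`."""
    rng = random.Random(seed)
    return rng.choices(faces, weights=weights, k=rolls)

# An unfair coin where H has weight 5 and T has weight 1.
probs = normalize([5, 1])  # H: 5/6, T: 1/6
sample = roll(["H", "T"], [5, 1], rolls=10, seed=0)
```

The module's `roll_dice` relies on `DataFrame.sample`, which likewise normalizes weights that do not sum to 1, so no explicit normalization step is needed there.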
        

## Unittest Module (2)

Paste a copy of your test module below.

NOTE: Paste as text, not as code. Use triple backticks to wrap your code blocks.

- All methods have at least one test method (1).
- Each method employs one of Unittest's Assert methods (1).

```
import unittest
import pandas as pd
import numpy as np
from montecarlo import Dice
from montecarlo import Game
from montecarlo import Analyzer

class MonteCarloTestSuite(unittest.TestCase):
    
    def test_1_dice_instance(self): 
        # ensure the faces attribute of the die object is the proper data structure.
        array1 = np.array([34, 23, 45])
        die = Dice(array1)
        self.assertTrue(type(die.faces) == np.ndarray) 
        

    def test_2_dice_change_weights(self):
        # test if change weights changes the weight of the correct face and is correct data structure. 
        array1 = np.array([34, 23, 45])
        die = Dice(array1)
        die.change_weight(34, 4)
        expected = 4
        actual = die.weights[0]
        self.assertTrue(type(die.weights) == np.ndarray and expected == actual ) 
       
                
    def test_3_roll_dice(self): 
        # test if roll dice return list object.
        array1 = np.array([34, 23, 45])
        die = Dice(array1)
        rslt = die.roll_dice(5)
        self.assertTrue(isinstance(rslt, list)) 
        
            
    def test_4_dice_current_state(self): 
       # test if current state returns pandas dataframe.
        array1 = np.array([34, 23, 45])
        die = Dice(array1)
        rslt = die.current_state()
        self.assertTrue(isinstance(rslt, pd.DataFrame)) 
        
        
    def test_5_game_instance(self): 
       # test that the game instance has a list of die objects
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        expected = 2
        actual = len(game.die) 
        self.assertTrue(isinstance(game.die, list) and expected == actual) 
        
        
    def test_6_game_plays(self):
        #test that game plays function returns a pandas dataframe.
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        game.play(4)
       
        self.assertTrue(isinstance(game.plays, pd.DataFrame)) 
        
        
    def test_7_game_show_results(self):
        #test that game show results function returns a pandas dataframe.
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        game.play(4)

        
        rslt = game.show_results()
       
        self.assertTrue(isinstance(rslt, pd.DataFrame)) 
        
        
        
    def test_8_analyzer_instance(self): 
        # test that the analyzer object was instantiated with a game object as the argument.
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        game.play(4)
        
        ana = Analyzer(game)
        
        
        self.assertTrue(isinstance(ana.game, Game)) 
        
        
    
    def test_9_analyzer_jackpot(self):
        # test that the jackpot function returns an int. 
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        game.play(4)

        ana = Analyzer(game)
        
        self.assertTrue(type(ana.jackpot()) == int)
        
        
        
    def test_10_analyzer_face_counts(self):
        #test that the face_counts function returns a pandas dataframe. 
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        game.play(4)

        
        ana = Analyzer(game)
        
        self.assertTrue(type(ana.face_counts()) == pd.DataFrame)
        
        
        
    def test_11_analyzer_combo_count(self):
        #test that the combo_count function returns a pandas dataframe. 
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        game.play(4)

        ana = Analyzer(game)
        
        self.assertTrue(type(ana.combo_count()) == pd.DataFrame)
        
        
        
        
    def test_12_analyzer_permutation_count(self):
        #test that the permutation_count function returns a pandas dataframe. 
        array1 = np.array([1,2,3,4,5,6])
        array2 = np.array([1,2,3,4,5,6])
        die1 = Dice(array1)
        die2 = Dice(array2)
        
        game = Game([die1, die2])
        game.play(4)
        
        ana = Analyzer(game)
        
        self.assertTrue(type(ana.permutation_count()) == pd.DataFrame)

        
                
if __name__ == '__main__':
    
    unittest.main(verbosity=3)
    
```
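
The tests above all go through `assertTrue`. Unittest also provides more specific assert methods that report the offending value when they fail. The sketch below illustrates three of them on a small stand-in function; `count_jackpots` and `AssertStyleSketch` are hypothetical names and are not part of the module.

```python
import unittest

def count_jackpots(rows):
    """Hypothetical stand-in: count rows whose values are all identical."""
    return sum(1 for row in rows if len(set(row)) == 1)

class AssertStyleSketch(unittest.TestCase):

    def test_return_type(self):
        # assertIsInstance reports the actual type on failure
        self.assertIsInstance(count_jackpots([("H", "H")]), int)

    def test_known_value(self):
        # With a fixed input the expected count is known exactly
        rows = [("H", "H"), ("H", "T"), ("T", "T")]
        self.assertEqual(count_jackpots(rows), 2)

    def test_bad_input_raises(self):
        # assertRaises checks the error path
        with self.assertRaises(TypeError):
            count_jackpots(None)

# Run the suite programmatically so the snippet also works inside a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertStyleSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Testing against a fixed input, as in `test_known_value`, checks the computed value and not just the return type.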

## Unittest Results (3)

Put a copy of the results of running your tests from the command line here.

Again, paste as text using triple backticks.

- All 12 specified methods return OK (3; .25 each).

```


-bash-4.4$python3 montecarlotest.py 
test_10_analyzer_face_counts (__main__.MonteCarloTestSuite) ... ok
test_11_analyzer_combo_count (__main__.MonteCarloTestSuite) ... ok
test_12_analyzer_permutation_count (__main__.MonteCarloTestSuite) ... ok
test_1_dice_instance (__main__.MonteCarloTestSuite) ... ok
test_2_dice_change_weights (__main__.MonteCarloTestSuite) ... ok
test_3_roll_dice (__main__.MonteCarloTestSuite) ... ok
test_4_dice_current_state (__main__.MonteCarloTestSuite) ... ok
test_5_game_instance (__main__.MonteCarloTestSuite) ... ok
test_6_game_plays (__main__.MonteCarloTestSuite) ... ok
test_7_game_show_results (__main__.MonteCarloTestSuite) ... ok
test_8_analyzer_instance (__main__.MonteCarloTestSuite) ... ok
test_9_analyzer_jackpot (__main__.MonteCarloTestSuite) ... ok
----------------------------------------------------------------------
Ran 12 tests in 0.085s
OK
-bash-4.4$


```

## Import (1)

Import your module here. This import should refer to the code in your package directory.

- Module successfully imported (1).

In [4]:
import montecarlo.montecarlo

## Help Docs (4)

Show your docstring documentation by applying `help()` to your imported module.

- All methods have a docstring (3; .25 each).
- All classes have a docstring (1; .33 each).

In [2]:
help(montecarlo.montecarlo)

Help on module montecarlo.montecarlo in montecarlo:

NAME
    montecarlo.montecarlo

CLASSES
    builtins.object
        Analyzer
        Dice
        Game
    
    class Analyzer(builtins.object)
     |  Analyzer(game)
     |  
     |  Methods defined here:
     |  
     |  __init__(self, game)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |  
     |  combo_count(self)
     |  
     |  face_counts(self)
     |  
     |  jackpot(self)
     |  
     |  permutation_count(self)
     |  
     |  ----------------------------------------------------------------------
     |  Data descriptors defined here:
     |  
     |  __dict__
     |      dictionary for instance variables (if defined)
     |  
     |  __weakref__
     |      list of weak references to the object (if defined)
    
    class Dice(builtins.object)
     |  Dice(faces: <built-in function array>)
     |  
     |  This is the Dice Class. 
     |  A die has N sides, or “faces”, and W weights, an

## `README.md` File (3)

Provide link to the README.md file of your project's repo.

- Metadata section or info present (1).
- Synopsis section showing how each class is called (1). (All must be included.)
- API section listing all classes and methods (1). (All must be included.)

URL:

## Successful installation (2)

Put a screenshot or paste a copy of a terminal session where you successfully install your module with pip.

If pasting text, use a preformatted text block to show the results.

- Installed with `pip` (1).
- Successfully installed message appears (1).

# Scenarios

Use code blocks to perform the tasks for each scenario.

Be sure the outputs are visible before submitting.

## Scenario 1: A 2-headed Coin (9)

Task 1. Create a fair coin (with faces $H$ and $T$) and one unfair coin in which one of the faces has a weight of $5$ and the others $1$.

- Fair coin created (1).
- Unfair coin created with weight as specified (1).

In [6]:
import numpy as np
import pandas as pd
from montecarlo.montecarlo import Dice
from montecarlo.montecarlo import Game
from montecarlo.montecarlo import Analyzer


In [4]:
Fair = Dice(np.array(["H", "T"]))

Unfair =  Dice(np.array(["H", "T"]))
Unfair.change_weight("H", 5) 

In [5]:
Unfair.weights

array([5., 1.])

Task 2. Play a game of $1000$ flips with two fair dice.

- Play method called correctly and without error (1).

In [6]:
Game1 = Game([Fair, Fair]) 

Game1.play(1000)

Game1.show_results().head(6)

| Roll Number | 0 | 1 |
|---|---|---|
| 0 | T | T |
| 1 | T | H |
| 2 | H | T |
| 3 | H | T |
| 4 | H | T |
| 5 | H | H |


Task 3. Play another game (using a new Game object) of $1000$ flips, this time using two unfair dice and one fair die. For the second unfair die, you can use the same die object twice in the list of dice you pass to the Game object.

- New game object created (1).
- Play method called correctly and without error (1).

In [9]:
Game2 = Game([Fair, Unfair, Unfair]) 

Game2.play(1000)

Game2.show_results().head(6)

| Roll Number | 0 | 1 | 2 |
|---|---|---|---|
| 0 | H | H | H |
| 1 | T | H | H |
| 2 | T | H | H |
| 3 | H | H | H |
| 4 | T | T | H |
| 5 | T | H | H |


Task 4. For each game, use an Analyzer object to determine the raw frequency of jackpots — i.e. getting either all $H$s or all $T$s.

- Analyzer objects instantiated for both games (1).
- Raw frequencies reported for both (1).

In [10]:
Analyzer1 = Analyzer(Game1)
Analyzer2 = Analyzer(Game2)



print(f' Game 1 had {Analyzer1.jackpot()} jackpots and Game 2 had { Analyzer2.jackpot()} jackpots.')

 Game 1 had 492 jackpots and Game 2 had 361 jackpots.


Task 5. For each analyzer, compute relative frequency as the number of jackpots over the total number of rolls.

- Both relative frequencies computed (1).

In [18]:
print(Analyzer1.jackpot() / Analyzer1.game.show_results().shape[0], 
      Analyzer2.jackpot() / Analyzer2.game.show_results().shape[0])

0.492 0.361
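
As a sanity check on these relative frequencies, the expected jackpot probability for each game can be worked out from the weights. The sketch below does the arithmetic with exact fractions. It assumes independent flips and the weights set in Task 1 (H=5, T=1 for the unfair coin); the variable names are illustrative.

```python
from fractions import Fraction

# Probability of H for each coin: weight of H over the sum of weights.
p_fair = Fraction(1, 2)    # weights H=1, T=1
p_unfair = Fraction(5, 6)  # weights H=5, T=1

# Game 1: two fair coins. A jackpot is HH or TT.
game1 = p_fair * p_fair + (1 - p_fair) * (1 - p_fair)

# Game 2: one fair and two unfair coins. A jackpot is HHH or TTT.
game2 = p_fair * p_unfair**2 + (1 - p_fair) * (1 - p_unfair) ** 2

print(game1, float(game1))
print(game2, round(float(game2), 4))
```

Under these assumptions the expected values are 1/2 and 13/36 (about 0.361), which is close to the observed 0.492 and 0.361.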


Task 6. Show your results, comparing the two relative frequencies, in a simple bar chart.

- Bar chart plotted and correct (1).
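
One way the bar chart could be produced is sketched below. It assumes `matplotlib` is installed (the notebook does not import it) and plots the two relative frequencies reported in Task 5; the labels and title are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt

# Relative jackpot frequencies reported in Task 5.
labels = ["Game 1 (2 fair)", "Game 2 (1 fair, 2 unfair)"]
rel_freqs = [0.492, 0.361]

fig, ax = plt.subplots()
bars = ax.bar(labels, rel_freqs)
ax.set_ylabel("Relative frequency of jackpots")
ax.set_title("Scenario 1: jackpot frequency by game")
ax.set_ylim(0, 1)
```

In the notebook, the frequencies would more naturally come from `Analyzer1.jackpot() / 1000` and `Analyzer2.jackpot() / 1000` than from hard-coded values.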

## Scenario 2: A 6-sided Die (9)

Task 1. Create three dice, each with six sides having the faces 1 through 6.

- Three die objects created (1).

In [7]:
Die1 = Dice(np.array([1,2,3,4,5,6]))
Die2 = Dice(np.array([1,2,3,4,5,6]))
Die3 = Dice(np.array([1,2,3,4,5,6]))

Task 2. Convert one of the dice to an unfair one by weighting the face $6$ five times more than the other weights (i.e. it has weight of 5 and the others a weight of 1 each).

- Unfair die created with proper call to weight change method (1).

In [9]:
Die3.change_weight(6, 5)
Die3.weights

array([1., 1., 1., 1., 1., 5.])

Task 3. Convert another of the dice to be unfair by weighting the face $1$ five times more than the others.

- Unfair die created with proper call to weight change method (1).

In [12]:
Die1.change_weight(1, 5)
Die1.weights

array([5., 1., 1., 1., 1., 1.])

Task 4. Play a game of $10000$ rolls with $5$ fair dice.

- Game class properly instantiated (1). 
- Play method called properly (1).

In [13]:
Game3 = Game([Die2, Die2,Die2,Die2,Die2]) 
Game3.play(10000)

Task 5. Play another game of $10000$ rolls, this time with $2$ unfair dice, one as defined in steps #2 and #3 respectively, and $3$ fair dice.

- Game class properly instantiated (1). 
- Play method called properly (1).

In [14]:
Game4 = Game([Die1, Die2,Die3,Die2,Die2]) 
Game4.play(10000)

Task 6. For each game, use an Analyzer object to determine the relative frequency of jackpots and show your results, comparing the two relative frequencies, in a simple bar chart.

- Jackpot methods called (1).
- Graph produced (1).

In [20]:
Analyzer3 = Analyzer(Game3)
Analyzer4 = Analyzer(Game4)



print (Analyzer3.jackpot() / 10000,
      Analyzer4.jackpot() / 10000)

0.0008 0.0005
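
For comparison, the expected jackpot probability of each game can be computed directly from the dice weights. The sketch below is a hypothetical helper, not part of the module. It assumes independent rolls and that every die lists the same faces in the same order.

```python
from fractions import Fraction

def face_probs(weights):
    """Normalize a die's integer weights into exact probabilities."""
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

def jackpot_probability(dice_weights):
    """Probability that all dice show the same face on a single roll."""
    per_die = [face_probs(w) for w in dice_weights]
    n_faces = len(per_die[0])
    total = Fraction(0)
    for face in range(n_faces):
        # Probability that every die lands on this particular face
        p = Fraction(1)
        for probs in per_die:
            p *= probs[face]
        total += p
    return total

fair = [1, 1, 1, 1, 1, 1]
heavy_one = [5, 1, 1, 1, 1, 1]  # face 1 has weight 5 (Task 3)
heavy_six = [1, 1, 1, 1, 1, 5]  # face 6 has weight 5 (Task 2)

game3 = jackpot_probability([fair] * 5)
game4 = jackpot_probability([heavy_one, fair, heavy_six, fair, fair])
```

This gives 1/1296 (about 0.00077) for the fair game and 7/10800 (about 0.00065) for the mixed game. With 10000 rolls that corresponds to only a handful of expected jackpots per game, so the observed 0.0008 and 0.0005 are within ordinary sampling noise.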


## Scenario 3: Letters of the Alphabet (7)

Task 1. Create a "die" of letters from $A$ to $Z$ with weights based on their frequency of usage as found in the data file `english_letters.txt`. Use the frequencies (i.e. raw counts) as weights.

- Die correctly instantiated with source file data (1).
- Weights properly applied using weight setting method (1).

In [29]:
alpha = pd.read_csv("english_letters.txt", header = None, delimiter = " ")

(26, 2)
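
The cell that builds `alpha_die` is not shown above. One possible shape for that step is sketched below using only the standard library. The file layout (one `LETTER COUNT` pair per line) is inferred from the `read_csv` call, the counts are made-up sample values, and `SketchDie` is a hypothetical stand-in that exposes the same `change_weight(face, new_weight)` call as `Dice`.

```python
import io

# Hypothetical three-line excerpt in the assumed "LETTER COUNT" layout.
# The counts are made up and are not the real file contents.
sample_file = io.StringIO("E 12\nT 9\nA 8\n")

class SketchDie:
    """Minimal stand-in with the same change_weight signature as Dice."""

    def __init__(self, faces):
        # Every face starts with the default weight of 1.0
        self.weights = {face: 1.0 for face in faces}

    def change_weight(self, face, new_weight):
        if face not in self.weights:
            raise IndexError(f"face {face!r} is not on this die")
        self.weights[face] = float(new_weight)

# Parse each non-empty line into a [letter, count] pair.
rows = [line.split() for line in sample_file if line.strip()]
letters = [letter for letter, _ in rows]

# Create the die with default weights, then set each weight through the
# weight-changing method, as the rubric asks.
alpha_die = SketchDie(letters)
for letter, count in rows:
    alpha_die.change_weight(letter, int(count))
```

With the real module, the same loop would call `Dice.change_weight` on a die built from the letter column of `alpha`.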

Task 2. Play a game involving $4$ of these dice with $1000$ rolls.

- Game play method properly called (1).

In [48]:
Game5 = Game([alpha_die, alpha_die, alpha_die, alpha_die]) 
Game5.play(1000)

Task 3. Determine how many permutations in your results are actual English words, based on the vocabulary found in `scrabble_words.txt`.

- Use permutation method (1).
- Get count as difference between permutations and vocabulary (1).
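
A minimal sketch of the comparison step follows, assuming each permutation is available as a tuple of letters (for example, the index entries of the frame returned by `permutation_count`) and the vocabulary as an iterable of words. `count_words` is a hypothetical helper and the data is a toy example.

```python
def count_words(permutations, vocabulary):
    """Count distinct rolled permutations that spell a vocabulary word."""
    vocab = {word.strip().upper() for word in vocabulary}
    rolled = {"".join(perm).upper() for perm in permutations}
    # Set intersection keeps only the rolled strings found in the vocabulary
    return len(rolled & vocab)

# Toy data for illustration only.
perms = [("T", "E", "A", "M"), ("X", "Q", "Z", "J"), ("M", "E", "A", "T")]
words = ["TEAM", "MEAT", "MATE"]
found = count_words(perms, words)
```

This counts distinct permutations. To count every roll that produced a word, the `count` column of the matching rows could be summed instead.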

Task 4. Repeat steps #2 and #3, this time with $5$ dice. How many actual words does this produce? Which produces more?

- Successfully repeats steps (1).
- Identifies parameter with most found words (1).