In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from random import randint

#### Usage:


**When to Use this method?**

- When the project can already be measured in a number of tickets or points, for example:
    * once the project is ticketed out
    * when the project has already started, is mid-project, or has had more work added to it
       
**What input do we need for this method?**
- Measured project volume
- Average project velocity
- Calculated upper and lower bounds of the project velocity



Velocity here means how many `project-related` tickets are done in a measured period of time (week, sprint, month).

Let's say the team noticed that they complete, on average, 6 units of the project (points, tasks) in a given period of `time (week, sprint, month)`. The project is ticketed out and the team needs to understand how many `time units` they will need to finish it.

Upper bound - the best-case scenario for the team's velocity.

Lower bound - the worst-case scenario, assuming there will be uncertainty towards the end of the project or another project that requires the team's attention.
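
One simple way to obtain these three inputs is to read them off past periods. The sketch below is only an illustration: the `past_velocities` list is hypothetical data, and taking the observed minimum and maximum as the bounds is an assumption, not a rule this method prescribes.

```python
import statistics

# Hypothetical history: project-related units completed in each past period
past_velocities = [5, 7, 6, 4, 8, 6, 7, 5]

# Average velocity over the observed periods
average_velocity = statistics.mean(past_velocities)

# Assumption: use the worst and best observed periods as the bounds
lower_velocity_bound = min(past_velocities)
upper_velocity_bound = max(past_velocities)

print(average_velocity, lower_velocity_bound, upper_velocity_bound)
```

A team that expects more disruption than it has seen so far could reasonably set the lower bound below the observed minimum.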


In [2]:
n = 10**5 #number of simulated runs (the more runs we simulate, the more stable the estimate becomes)

# project value units in a measured period of time
lower_velocity_bound = 4 
upper_velocity_bound = 8

#remainder of the project value units to work on
m = 60

#a list to collect simulated amount of periods of time when it reached the m-value 
total_time_periods = []

'''
Simulating the roll. Rolling dice principle. 
How many measured periods of time will it randomly take to complete the remainder of the project.
'''
for i in range(n):
   
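    #imitating a dice roll between lower and upper bounds
    #note: random.randint includes both endpoints, so the "+1" lets a roll reach
    #upper_velocity_bound + 1 (9 here); the outputs shown below reflect that range.
    #Drop the "+1" to cap rolls at upper_velocity_bound.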
    def randomize():
        return randint(lower_velocity_bound, upper_velocity_bound+1)
    
    #total_units is used to collect finished value units
    total_units = 0
    #counting how many dice rolls will it take to reach the value of m
    number_of_time_periods = 0

    #generating a roll cycle
    while total_units < m:
        new_roll = randomize()
        total_units += new_roll
        number_of_time_periods += 1
    total_time_periods.append(number_of_time_periods)

#creating a data frame with final results
proj_df = pd.DataFrame({"Measured Time Period": total_time_periods, }) 

#95th percentile of how many Measured Time Periods it takes to finish the m-value
print(round(np.percentile(proj_df,95),2))

11.0
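
For repeatable results, the same loop can be wrapped in a function that takes a seed. This is a minimal sketch, not the notebook's own code: the name `simulate_periods` and its parameters are hypothetical. It treats `upper` as the highest possible roll, because `random.randint` includes both endpoints; passing `upper=9` would match the range the cell above actually draws from.

```python
import random

def simulate_periods(m, lower, upper, n, seed=None):
    """Return n simulated counts of time periods needed to complete m units."""
    rng = random.Random(seed)  # private generator, so a seed makes runs repeatable
    results = []
    for _ in range(n):
        total_units = 0
        periods = 0
        # keep rolling until the remaining work is covered
        while total_units < m:
            total_units += rng.randint(lower, upper)  # inclusive on both ends
            periods += 1
        results.append(periods)
    return results

runs = simulate_periods(m=60, lower=4, upper=8, n=10_000, seed=42)
print(min(runs), max(runs))
```

Calling the function twice with the same seed returns the same list, which makes it easier to compare scenarios that differ only in `m` or in the bounds.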


Running `describe` on the data frame shows:
* `count` - the number of simulated runs
* `mean` - the mean number of time periods needed to finish the project across all runs
* `std` - the standard deviation of that number
* `min` - the smallest number of time periods seen in any run
* `25%, 50%, 75%` - the 25th, 50th, and 75th percentiles, that is, the number of time periods within which 25%, 50%, and 75% of the runs finished
* `max` - the largest number of time periods seen in any run

In [3]:
print(proj_df["Measured Time Period"].describe())

count    100000.000000
mean          9.686990
std           0.850354
min           7.000000
25%           9.000000
50%          10.000000
75%          10.000000
max          14.000000
Name: Measured Time Period, dtype: float64
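
The `min` and `max` rows can be sanity-checked by hand. The fewest possible periods come from rolling the highest value every time, and the most from rolling the lowest value every time. The sketch below assumes the roll range the simulation cell actually uses: because `random.randint` includes both endpoints, `randint(4, 8+1)` produces values from 4 to 9.

```python
import math

m = 60                # remaining project units
lowest_roll = 4       # lower_velocity_bound
highest_roll = 8 + 1  # upper_velocity_bound + 1, since randint includes its upper endpoint

# Best case: every period delivers the highest roll
fewest_periods = math.ceil(m / highest_roll)

# Worst case: every period delivers the lowest roll
most_periods = math.ceil(m / lowest_roll)

print(fewest_periods, most_periods)
```

This gives a range of 7 to 15 periods. The simulated minimum of 7 and maximum of 14 fall inside it; reaching 15 would require fourteen lowest rolls in a row, which is extremely unlikely.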


Grouping the results by the total `Measured Time Period` needed (number of rolls) to reach the `m`-value shows how often each outcome occurred:

In [4]:
print(proj_df["Measured Time Period"].groupby(proj_df["Measured Time Period"]).count().nlargest(20))

Measured Time Period
10    41808
9     36406
11    13985
8      6083
12     1617
13       58
7        42
14        1
Name: Measured Time Period, dtype: int64
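
These counts can be turned into a cumulative share, which answers "how likely is the project to be done within k periods?". The sketch below copies the counts from the output above into a plain dictionary; the variable names are illustrative only.

```python
from itertools import accumulate

# Counts copied from the grouped output: {time periods: number of simulated runs}
counts = {7: 42, 8: 6083, 9: 36406, 10: 41808, 11: 13985, 12: 1617, 13: 58, 14: 1}

total_runs = sum(counts.values())
periods = sorted(counts)

# Running total of runs that finished in p periods or fewer
running_totals = accumulate(counts[p] for p in periods)
cumulative_share = {p: running / total_runs for p, running in zip(periods, running_totals)}

for p, share in cumulative_share.items():
    print(f"{p} periods or fewer: {share:.1%}")
```

About 84.3% of the runs finish within 10 periods and about 98.3% within 11, which is why the 95th percentile printed earlier is 11.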
