In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from random import randint

#### Usage:

```
This playbook can be used after the project has been already started and the team has a feeling of the velocity.

Velocity here means how many project-related tickets are done in a measured period of time (week, sprint, month).

Let's say, the team noticed they averagely complete 5 measured value units of the project (points, tasks) in a measured period of time (week, sprint, month). The project is ticketed out and the team needs to understand how many measured time units the the team may need to finish the project.

Upper bound - the best case scenario of the team's velocity. If 5 is the best value the team can produce - then 5 should be the upper bound.

Lower bound - the worst case scenario, assuming there will be uncertainty by the end of the project; or there is another project that required team's attention.
```

In [25]:
n = 10**5 #number of rolls

# project value units in a measured period of time
lower_velocity_bound = 4 
upper_velocity_bound = 8

#remainder of the project value units to work on
m = 60

#a list to collect simulated amount of periods of time when it reached the m-value 
total_time_periods = []

'''
Simulating the roll. Rolling a dice principle. 
How many measured periods of time will it randomly take to complete the remainder of the project.
'''
for i in range(n):
    #imitating a dice roll between upper and lower bounds
    def randomize():
        return randint(upper_bound, lower_bound+1) 
    
    #total_points is used to collect finished value project units
    total_units = 0
    
    #counting how many dice rolls will it take to reach the value of m
    number_of_time_periods = 0

    #generating a roll cycle
    while total_units < m:
        new_roll = randomize()
        total_units += new_roll
        number_of_time_periods += 1
    total_time_periods.append(number_of_time_periods)

#creating a data frame with final results
proj_df = pd.DataFrame({"Measured Time Period": total_time_periods, }) 

#95th percintile of how many Measured Time Periods it will take to finish the m-value
print(round(np.percentile(proj_df,95),2))

13.0


Running a `describe` command on the data frame, the result shows:
* `count` - a number of rolls
* `mean` - a mean value of the estimated project cost based on a number of rolls
* `min` - the minimum value resulted from the roll
* `25%, 50%, 75%` - the potential 25%, 50%, 75% probability
* `max` - the maximum values resulted from the dice roll

In [27]:
print(proj_df["Measured Time Period"].describe())

count    100000.000000
mean         11.370270
std           1.064566
min           8.000000
25%          11.000000
50%          11.000000
75%          12.000000
max          17.000000
Name: Measured Time Period, dtype: float64


Grouping the results by the total `Measured Time Period` neede to reach the `m`-value

In [32]:
print(proj_df["Measured Time Period"].groupby(proj_df["Measured Time Period"]).count().nlargest(20))

Measured Time Period
11    36461
12    29411
10    17797
13    11416
9      2451
14     2139
15      272
8        36
16       16
17        1
Name: Measured Time Period, dtype: int64
