git-calculator

Calculate DORA metrics and related statistics from a Git repository on the local file system. It does not require integration with GitHub or any other Git hosting provider.

Getting Started

First, clone this repository and set it up:

# Clone the repository
git clone https://github.com/yourusername/git-calculator.git
cd git-calculator

# Set up Python environment
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
pip install -r requirements.txt

# Set Python path
export PYTHONPATH=$(pwd)  # On Windows, use: set PYTHONPATH=%cd%

Navigate to the Git repository you want to analyze:

cd /path/to/your/repository

Run Python and calculate your metrics:

# Launch Python
python

# Import required modules
from src import git_ir as gir
from src.calculators import cycle_time_by_commits_calculator as commit_calc
from src.calculators import change_failure_calculator as cfc
from src.calculators import chart_generator as cg
from src.calculators import commit_analyzer as ca

# Get the data
logs = gir.git_log()

# Calculate cycle time
tds = commit_calc.calculate_time_deltas(logs)
cycle_time_data = commit_calc.commit_statistics_normalized_by_month(tds)

# Calculate change failure rate
data_by_month = cfc.extract_commit_data(logs)
failure_rate_data = [(month, rate) for month, rate in cfc.calculate_change_failure_rate(data_by_month).items()]

# Analyze commit trends by author
ca.analyze_commits()

# Generate charts and save data
cg.generate_charts(cycle_time_data=cycle_time_data, 
                  failure_rate_data=failure_rate_data,
                  save_data=True)

Check your results:
- A new metrics directory will be created in your repository
- You'll find several files with your repository name as prefix:
  - metrics/{repo_name}_cycle_time_data.csv - Raw cycle time data
  - metrics/{repo_name}_change_failure_data.csv - Raw change failure rate data
  - metrics/{repo_name}_cycle_time_chart.png - Cycle time chart
  - metrics/{repo_name}_change_failure_rate_chart.png - Change failure rate chart
  - metrics/commit_trends.png - Commit trends by author
  - metrics/commit_{author}_commits.csv - Individual author commit data
  - metrics/commit_percentiles.csv - Author commit percentiles

To generate new charts later without recalculating:

from src.calculators import chart_generator as cg

# Load the saved data
cycle_time_data, failure_rate_data = cg.load_metrics_data()

# Generate new charts
cg.generate_charts(cycle_time_data=cycle_time_data, 
                  failure_rate_data=failure_rate_data)

Project Outline

git-calculator/
│
├── src/
│   ├── git_ir.py        # In-memory representation of Git metadata
│   ├── calculators/
│   │   ├── cycle_time_calculator_by_branches.py  # Cycle time stats by branch
│   │   ├── cycle_time_by_commits_calculator.py  # Cycle time stats by commit
│   │   ├── change_failure_calculator.py         # Change failure rate stats
│   │   ├── commit_analyzer.py                   # Commit trends by author
│   │   └── chart_generator.py                   # Chart generation utilities
│   ├── util/
│   │   ├── git_util.py  # Helpers for interacting with a Git repo
│   │   └── toy_repo.py  # Temporary toy repo on the filesystem for testing
│
├── tests/
│   └── test_*.py        # Unit tests
│
├── README.md             # Documentation
├── requirements.txt      # Dependencies
└── setup.py              # Setup

Project Setup

cd git-calculator
export PYTHONPATH=$(pwd)

Set up virtual environment:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Project Testing

Run the unit tests:

pytest -v

For debugging: export PYTEST_ADDOPTS="--log-cli-level=DEBUG"

Project Playing Around

To play around with the interpreter:

python
from src.util.toy_repo import ToyRepoCreator
trc = ToyRepoCreator("/Users/denalilumma/doubling-code/scratch")
even_intervals = [7 * i for i in range(12)]  # Weekly intervals
trc.create_custom_commits(even_intervals)

(Replace the path passed to ToyRepoCreator with a directory on your own machine.)

from src.calculators.cycle_time_by_commits_calculator import cycle_time_between_commits_by_author
result = cycle_time_between_commits_by_author(None, bucket_size=4, window_size=2)
print(result)
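
The weekly intervals used for the toy repo can be sanity-checked by hand. The following is a minimal sketch, not the calculator's own code: it assumes the offsets in even_intervals are days, and the start date and the helper minutes_between are hypothetical names introduced only for illustration.

```python
from datetime import datetime, timedelta


def minutes_between(timestamps):
    """Return the gap in minutes between each pair of consecutive timestamps."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:])]


# Hypothetical start date; offsets are assumed to be days, as in even_intervals.
start = datetime(2023, 1, 1)
even_intervals = [7 * i for i in range(12)]
commit_times = [start + timedelta(days=d) for d in even_intervals]

gaps = minutes_between(commit_times)
print(len(gaps), set(gaps))  # 11 {10080.0}
```

Twelve commits spaced one week apart give eleven gaps of 10080 minutes (7 days) each, which is the kind of uniform input that makes calculator results easy to verify.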

Project Usage

To calculate statistics for a given repository, proceed with the following sequence.

First, go to this repository in the terminal and set the Python path:

cd git-calculator
export PYTHONPATH=$(pwd)

Set up virtual environment:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Finally, go to the git repo you want to analyze:

cd tensorflow

Analyze:

# Launch Python
python
# Paste:
from src import git_ir as gir
from src.calculators import cycle_time_by_commits_calculator as commit_calc
logs = gir.git_log()
tds = commit_calc.calculate_time_deltas(logs)
result = commit_calc.commit_statistics_normalized_by_month(tds)
commit_calc.write_commit_statistics_to_file(result, "scratch.csv") # Default file name is "a.csv"

Example output:

INTERVAL START, SUM, AVERAGE, p75 CYCLE TIME (minutes), std CYCLE TIME
2023-10,161280.0,40320.0,40320,0
2023-11,120960.0,40320.0,40320,0
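
The CSV above can be post-processed with the standard library. The sketch below is one possible way to read it, assuming the AVERAGE column is in minutes like the p75 column; the function name average_days_by_month is hypothetical and not part of this project.

```python
import csv
import io

# Sample text copied from the example output above.
sample = """INTERVAL START, SUM, AVERAGE, p75 CYCLE TIME (minutes), std CYCLE TIME
2023-10,161280.0,40320.0,40320,0
2023-11,120960.0,40320.0,40320,0
"""

MINUTES_PER_DAY = 24 * 60


def average_days_by_month(handle):
    """Map each month to its average cycle time converted from minutes to days."""
    # skipinitialspace drops the space that follows each comma in the header row.
    reader = csv.DictReader(handle, skipinitialspace=True)
    return {
        row["INTERVAL START"]: float(row["AVERAGE"]) / MINUTES_PER_DAY
        for row in reader
    }


print(average_days_by_month(io.StringIO(sample)))
# {'2023-10': 28.0, '2023-11': 28.0}
```

To read the file written earlier instead of the inline sample, pass an open file handle, for example open("scratch.csv", newline="").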

To calculate change failure rate:

# Launch Python
python
# Paste:
from src import git_ir as gir
from src.calculators import change_failure_calculator as cfc
logs = gir.git_log()
data_by_month = cfc.extract_commit_data(logs)
change_failure_rates = cfc.calculate_change_failure_rate(data_by_month)
cfc.write_change_failure_rate_to_file(change_failure_rates, "change_failure_rate.csv") # Default file name is "change_failure_rate_by_month.csv"

Example output:

Month,Change Failure Rate (%)
2023-10,25.0
2023-11,33.3

The change failure rate is calculated by identifying commits whose messages contain keywords such as "revert", "hotfix", "bugfix", "bug", "fix", "problem", or "issue". For each month, the rate is the percentage of all commits in that month whose messages match at least one of these keywords.
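
The keyword approach described above can be illustrated with a short, self-contained sketch. This is not the code in change_failure_calculator.py: the function change_failure_rate, the sample commits, and the choice of case-insensitive substring matching are all assumptions made for illustration.

```python
import re
from collections import defaultdict

FAILURE_KEYWORDS = ("revert", "hotfix", "bugfix", "bug", "fix", "problem", "issue")
# Assumption: case-insensitive substring matching, so "prefix" would also match "fix".
PATTERN = re.compile("|".join(FAILURE_KEYWORDS), re.IGNORECASE)


def change_failure_rate(commits):
    """Take (month, message) pairs and return {month: percent of matching commits}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for month, message in commits:
        totals[month] += 1
        if PATTERN.search(message):
            failures[month] += 1
    return {m: round(100 * failures[m] / totals[m], 1) for m in totals}


# Hypothetical commit history, invented for this example.
commits = [
    ("2023-10", "Add login page"),
    ("2023-10", "Fix crash on empty input"),
    ("2023-10", "Update docs"),
    ("2023-10", "Refactor parser"),
    ("2023-11", 'Revert "Refactor parser"'),
    ("2023-11", "Add metrics export"),
    ("2023-11", "Bump dependencies"),
]

print(change_failure_rate(commits))
# {'2023-10': 25.0, '2023-11': 33.3}
```

Because the match is on message text alone, the result depends on commit message discipline: a message such as "Add prefix option" would count as a failure under substring matching, and a fix committed without any of the keywords would be missed.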

To analyze commit trends by author:

# Launch Python
python
# Paste:
from src.calculators import commit_analyzer as ca
ca.analyze_commits()

This will generate:

- A commit trends chart showing commits over time for each author
- CSV files with individual author commit data
- A CSV file with commit percentiles for all authors

Generating Charts

To generate modern-looking charts with trendlines for both metrics:

# First time: Calculate and save the data
from src import git_ir as gir
from src.calculators import cycle_time_by_commits_calculator as commit_calc
from src.calculators import change_failure_calculator as cfc
from src.calculators import chart_generator as cg

# Get the data
logs = gir.git_log()

# Calculate cycle time
tds = commit_calc.calculate_time_deltas(logs)
cycle_time_data = commit_calc.commit_statistics_normalized_by_month(tds)

# Calculate change failure rate
data_by_month = cfc.extract_commit_data(logs)
failure_rate_data = [(month, rate) for month, rate in cfc.calculate_change_failure_rate(data_by_month).items()]

# Save data and generate charts
cg.generate_charts(cycle_time_data=cycle_time_data, 
                  failure_rate_data=failure_rate_data,
                  save_data=True)

# Later: Load saved data and generate new charts
from src.calculators import chart_generator as cg

# Load the saved data
cycle_time_data, failure_rate_data = cg.load_metrics_data()

# Generate new charts
cg.generate_charts(cycle_time_data=cycle_time_data, 
                  failure_rate_data=failure_rate_data)

This will create a metrics directory in your repository and save four files with the repository name as prefix (e.g., tensorflow_cycle_time_data.csv):

- metrics/{repo_name}_cycle_time_data.csv - Raw cycle time data
- metrics/{repo_name}_change_failure_data.csv - Raw change failure rate data
- metrics/{repo_name}_cycle_time_chart.png - Cycle time chart
- metrics/{repo_name}_change_failure_rate_chart.png - Change failure rate chart

The repository name is detected automatically, trying each of these sources in order:

- The git remote URL (e.g., git@github.com:user/tensorflow.git → tensorflow)
- If no remote is found, the current directory name
- If neither is available, the literal name "repo" as a fallback
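
The fallback order described above could look roughly like the following sketch. It is not the project's implementation: the helper names name_from_remote_url and detect_repo_name are hypothetical, and the sketch assumes the remote is called origin.

```python
import os
import subprocess


def name_from_remote_url(url):
    """Extract the repository name from an SSH or HTTPS remote URL, or None."""
    # Treat ':' like '/' so git@host:user/repo.git and https://host/user/repo
    # can both be split on the last path separator.
    tail = url.strip().rstrip("/").replace(":", "/").rsplit("/", 1)[-1]
    if tail.endswith(".git"):
        tail = tail[: -len(".git")]
    return tail or None


def detect_repo_name(cwd="."):
    """Try the remote URL, then the directory name, then the literal 'repo'."""
    try:
        # Assumption: the remote of interest is named "origin".
        result = subprocess.run(
            ["git", "config", "--get", "remote.origin.url"],
            cwd=cwd, capture_output=True, text=True, check=False,
        )
        name = name_from_remote_url(result.stdout)
    except OSError:
        # git is not installed or cwd does not exist.
        name = None
    return name or os.path.basename(os.path.abspath(cwd)) or "repo"


print(name_from_remote_url("git@github.com:user/tensorflow.git"))  # tensorflow
```

Calling detect_repo_name() from inside the repository being analyzed would then give the prefix used for the files in the metrics directory, under the assumptions stated above.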

You can also use a custom prefix instead of the repository name:

# Save with custom prefix
cg.generate_charts(cycle_time_data=cycle_time_data, 
                  failure_rate_data=failure_rate_data,
                  save_data=True,
                  prefix='team_a_')

# Load with custom prefix
cycle_time_data, failure_rate_data = cg.load_metrics_data(prefix='team_a_')
cg.generate_charts(cycle_time_data=cycle_time_data, 
                  failure_rate_data=failure_rate_data)

This is useful when you want to compare metrics across different teams or time periods.
