Calculate dora metrics and related from a Git repository on the local file system. Does not require integration with GitHub or any other git service provider.
- First, clone this repository and set it up:
# Clone the repository
git clone https://github.com/yourusername/git-calculator.git
cd git-calculator
# Set up Python environment
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
pip install -r requirements.txt
# Set Python path
export PYTHONPATH=$(pwd) # On Windows, use: set PYTHONPATH=%cd%- Navigate to the Git repository you want to analyze:
cd /path/to/your/repository- Run Python and calculate your metrics:
# Launch Python
python
# Import required modules
from src import git_ir as gir
from src.calculators import cycle_time_by_commits_calculator as commit_calc
from src.calculators import change_failure_calculator as cfc
from src.calculators import chart_generator as cg
from src.calculators import commit_analyzer as ca
# Get the data
logs = gir.git_log()
# Calculate cycle time
tds = commit_calc.calculate_time_deltas(logs)
cycle_time_data = commit_calc.commit_statistics_normalized_by_month(tds)
# Calculate change failure rate
data_by_month = cfc.extract_commit_data(logs)
failure_rate_data = [(month, rate) for month, rate in cfc.calculate_change_failure_rate(data_by_month).items()]
# Analyze commit trends by author
ca.analyze_commits()
# Generate charts and save data
cg.generate_charts(cycle_time_data=cycle_time_data,
failure_rate_data=failure_rate_data,
save_data=True)-
Check your results:
- A new
metricsdirectory will be created in your repository - You'll find several files with your repository name as prefix:
metrics/{repo_name}_cycle_time_data.csv- Raw cycle time datametrics/{repo_name}_change_failure_data.csv- Raw change failure rate datametrics/{repo_name}_cycle_time_chart.png- Cycle time chartmetrics/{repo_name}_change_failure_rate_chart.png- Change failure rate chartmetrics/commit_trends.png- Commit trends by authormetrics/commit_{author}_commits.csv- Individual author commit datametrics/commit_percentiles.csv- Author commit percentiles
- A new
-
To generate new charts later without recalculating:
from src.calculators import chart_generator as cg
# Load the saved data
cycle_time_data, failure_rate_data = cg.load_metrics_data()
# Generate new charts
cg.generate_charts(cycle_time_data=cycle_time_data,
failure_rate_data=failure_rate_data)git-calculator/
│
├── src/
| ├── git_ir.py # In memory representation of Git metadata
│ ├── calculators/
│ │ ├── cycle_time_calculator_by_branches.py # Cycle time stats by branch
│ │ ├── cycle_time_calculator_by_commits.py # Cycle time stats by commit
│ │ ├── change_failure_calculator.py # Change failure rate stats
│ │ ├── commit_analyzer.py # Commit trends by author
│ │ └── chart_generator.py # Chart generation utilities
│ ├── util/
│ │ ├── git_util.py # Helpers for interacting with a Git repo
│ │ └── toy_repo.py # Temporary toy repo on the filesystem for testing
│
├── tests/
│ └── test_*.py # Unit tests
│
├── README.md # Documentation
├── requirements.txt # Dependencies
└── setup.py # Setup
cd git-calculator
export PYTHONPATH=$(pwd)
Set up virtual environment:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Run unit tests
pytest -v
For debugging:
export PYTEST_ADDOPTS="--log-cli-level=DEBUG"
To play around with the interpreter:
python
from src.util.toy_repo import ToyRepoCreator
trc = ToyRepoCreator("/Users/denalilumma/doubling-code/scratch")
even_intervals = [7 * i for i in range(12)] # Weekly intervals
trc.create_custom_commits(even_intervals)
(Replace with your local path)
from src.calculators.cycle_time_by_commits_calculator import cycle_time_between_commits_by_author
result = cycle_time_between_commits_by_author(None, bucket_size=4, window_size=2)
print(result)
To calculate statistics for a given repository, proceed with the following sequence.
Step one, go to this repo in the terminal and set the python path:
cd git_calculator
export PYTHONPATH=$(pwd)Set up virtual environment:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Finally, go to the git repo you want to analyze:
cd tensorflowAnalyze:
# Launch python3
python
# Paste:
from src import git_ir as gir
from src.calculators import cycle_time_by_commits_calculator as commit_calc
logs = gir.git_log()
tds = commit_calc.calculate_time_deltas(logs)
result = commit_calc.commit_statistics_normalized_by_month(tds)
commit_calc.write_commit_statistics_to_file(result, "scratch.csv") # Default file name is "a.csv"Example output:
INTERVAL START, SUM, AVERAGE, p75 CYCLE TIME (minutes), std CYCLE TIME
2023-10,161280.0,40320.0,40320,0
2023-11,120960.0,40320.0,40320,0To calculate change failure rate:
# Launch python3
python
# Paste:
from src import git_ir as gir
from src.calculators import change_failure_calculator as cfc
logs = gir.git_log()
data_by_month = cfc.extract_commit_data(logs)
change_failure_rates = cfc.calculate_change_failure_rate(data_by_month)
cfc.write_change_failure_rate_to_file(change_failure_rates, "change_failure_rate.csv") # Default file name is "change_failure_rate_by_month.csv"Example output:
Month,Change Failure Rate (%)
2023-10,25.0
2023-11,33.3The change failure rate is calculated by identifying commits that contain keywords like "revert", "hotfix", "bugfix", "bug", "fix", "problem", or "issue" in their commit messages. The rate is expressed as a percentage of total commits that required fixes.
To analyze commit trends by author:
# Launch python3
python
# Paste:
from src.calculators import commit_analyzer as ca
ca.analyze_commits()This will generate:
- A commit trends chart showing commits over time for each author
- CSV files with individual author commit data
- A CSV file with commit percentiles for all authors
To generate modern-looking charts with trendlines for both metrics:
# First time: Calculate and save the data
from src import git_ir as gir
from src.calculators import cycle_time_by_commits_calculator as commit_calc
from src.calculators import change_failure_calculator as cfc
from src.calculators import chart_generator as cg
# Get the data
logs = gir.git_log()
# Calculate cycle time
tds = commit_calc.calculate_time_deltas(logs)
cycle_time_data = commit_calc.commit_statistics_normalized_by_month(tds)
# Calculate change failure rate
data_by_month = cfc.extract_commit_data(logs)
failure_rate_data = [(month, rate) for month, rate in cfc.calculate_change_failure_rate(data_by_month).items()]
# Save data and generate charts
cg.generate_charts(cycle_time_data=cycle_time_data,
failure_rate_data=failure_rate_data,
save_data=True)
# Later: Load saved data and generate new charts
from src.calculators import chart_generator as cg
# Load the saved data
cycle_time_data, failure_rate_data = cg.load_metrics_data()
# Generate new charts
cg.generate_charts(cycle_time_data=cycle_time_data,
failure_rate_data=failure_rate_data)This will create a metrics directory in your repository and save four files with the repository name as prefix (e.g., tensorflow_cycle_time_data.csv):
metrics/{repo_name}_cycle_time_data.csv- Raw cycle time datametrics/{repo_name}_change_failure_data.csv- Raw change failure rate datametrics/{repo_name}_cycle_time_chart.png- Cycle time chartmetrics/{repo_name}_change_failure_rate_chart.png- Change failure rate chart
The repository name is automatically detected from:
- The git remote URL (e.g.,
git@github.com:user/tensorflow.git→tensorflow) - If no remote is found, the current directory name is used
- If neither is available,
repois used as a fallback
You can also use a custom prefix instead of the repository name:
# Save with custom prefix
cg.generate_charts(cycle_time_data=cycle_time_data,
failure_rate_data=failure_rate_data,
save_data=True,
prefix='team_a_')
# Load with custom prefix
cycle_time_data, failure_rate_data = cg.load_metrics_data(prefix='team_a_')
cg.generate_charts(cycle_time_data=cycle_time_data,
failure_rate_data=failure_rate_data)This is useful when you want to compare metrics across different teams or time periods.