A production-ready Python script for cleaning up direct repository collaborators in a GitHub organization. This tool helps maintain security and access control by removing collaborators who are not on an approved list.
This tool ONLY modifies users with "direct access" - those who were individually added to repositories as collaborators.
Users who have access through the following are NOT affected:
- Team membership (e.g., users in teams with repo access)
- Organization membership (e.g., org-level default permissions)
- Enterprise access
For example, if a user appears in the repository's "People" section under "Teams" rather than "Direct access", they will NOT be processed by this tool.
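As an illustration of the "direct access" distinction, GitHub's REST API for listing repository collaborators accepts an `affiliation` query parameter whose `direct` value excludes access inherited from teams or organization defaults. The sketch below only builds such a URL. Whether this script queries the endpoint this way is an assumption, and `direct_collaborators_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def direct_collaborators_url(org: str, repo: str, page: int = 1, per_page: int = 100) -> str:
    """Build the URL that lists only directly added collaborators (hypothetical helper)."""
    # affiliation=direct filters out access that comes from teams or org defaults
    query = urlencode({"affiliation": "direct", "per_page": per_page, "page": page})
    return f"{API_ROOT}/repos/{org}/{repo}/collaborators?{query}"


print(direct_collaborators_url("myorg", "repo1"))
```

A request to this URL would still need an `Authorization` header carrying the token.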
- 🔒 Safe by Default: Dry-run mode prevents accidental deletions
- 🎯 Flexible Filtering: Target specific repos or filter by visibility (private/public/internal)
- 👥 Direct Access Only: Only removes direct collaborators, never affects team/org access
- ⏱️ Rate Limit Control: Configurable delay between API calls (--delay parameter)
- 🔄 Automatic Retry: Exponential backoff for failed requests
- 📊 Detailed Reporting: Comprehensive summary and optional JSON output
- ✅ Robust Error Handling: Continues processing even if individual repos fail
- 🔍 Verbose Logging: Track exactly what's happening at each step
- Python 3.8+
- GitHub Personal Access Token with admin:org and repo scopes
- Clone the repository:
git clone <repository-url>
cd github-cleanup
- Install dependencies:
pip install -r requirements.txt
- Set up your GitHub token:
export GITHUB_TOKEN='ghp_your_token_here'
Create an allowed users file (one username per line):
cat > allowed_users.txt << EOF
alice
bob
charlie
# Comments are supported
team-bot
EOF
Run a dry-run to see what would be removed:
python github_cleanup.py \
--org myorg \
--allowed-file allowed_users.txt
# Single repository
python github_cleanup.py \
--org myorg \
--repos test-repo \
--allowed-file allowed_users.txt
# Multiple repositories
python github_cleanup.py \
--org myorg \
--repos "repo1,repo2,repo3" \
--allowed-file allowed_users.txt
# Only private repositories
python github_cleanup.py \
--org myorg \
--visibility private \
--allowed-file allowed_users.txt
# Only public repositories
python github_cleanup.py \
--org myorg \
--visibility public \
--allowed-file allowed_users.txt
After verifying with a dry-run, use --apply to remove collaborators:
python github_cleanup.py \
--org myorg \
--allowed-file allowed_users.txt \
--apply
Skip the confirmation prompt with --yes:
python github_cleanup.py \
--org myorg \
--allowed-file allowed_users.txt \
--apply \
--yes
Save a JSON report with --output:
python github_cleanup.py \
--org myorg \
--allowed-file allowed_users.txt \
--output cleanup_report.json
Required Arguments:
--org ORG GitHub organization name
--allowed-file FILE Path to file with allowed usernames
Optional Arguments:
--token TOKEN GitHub token (default: GITHUB_TOKEN env var)
--repos REPOS Comma-separated list of repo names to process
--visibility {all,private,public,internal}
Filter repos by visibility (default: all)
--delay SECONDS Delay between API calls in seconds (default: 0.5)
--apply Actually remove collaborators (default: dry-run)
--yes Skip confirmation prompt
--output FILE Save JSON report to file
-v, --verbose Enable verbose logging
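The option list above maps naturally onto `argparse`. The following is a minimal sketch of a parser with the same flags and defaults, not the script's actual source; `build_parser` is a hypothetical name, and the help strings are copied from the listing.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="github_cleanup.py")
    parser.add_argument("--org", required=True, help="GitHub organization name")
    parser.add_argument("--allowed-file", required=True, metavar="FILE",
                        help="Path to file with allowed usernames")
    parser.add_argument("--token", default=os.environ.get("GITHUB_TOKEN"),
                        help="GitHub token (default: GITHUB_TOKEN env var)")
    parser.add_argument("--repos", help="Comma-separated list of repo names to process")
    parser.add_argument("--visibility", choices=["all", "private", "public", "internal"],
                        default="all", help="Filter repos by visibility")
    parser.add_argument("--delay", type=float, default=0.5, metavar="SECONDS",
                        help="Delay between API calls in seconds")
    parser.add_argument("--apply", action="store_true",
                        help="Actually remove collaborators (default: dry-run)")
    parser.add_argument("--yes", action="store_true", help="Skip confirmation prompt")
    parser.add_argument("--output", metavar="FILE", help="Save JSON report to file")
    parser.add_argument("-v", "--verbose", action="store_true", help="Enable verbose logging")
    return parser


args = build_parser().parse_args(["--org", "myorg", "--allowed-file", "allowed_users.txt"])
print(args.apply, args.delay, args.visibility)  # False 0.5 all
```

Because `--apply` is a `store_true` flag, omitting it leaves the run in dry-run mode.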
The allowed users file supports:
- One username per line
- Comments starting with #
- Inline comments after usernames
- Blank lines (ignored)
- Case-insensitive matching
Example:
# Core team
alice
bob
charlie # team lead
# Contractors
contractor1
# Bots
deploy-bot
ci-bot
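The file format rules above are simple enough to sketch in a few lines. This is an illustrative parser under the stated rules (full-line and inline `#` comments, blank lines ignored, case-insensitive matching); the function names are hypothetical and the script's own loader may differ.

```python
from typing import Set


def parse_allowed_users(text: str) -> Set[str]:
    """Return the lower-cased usernames found in an allowed-users file."""
    allowed: Set[str] = set()
    for line in text.splitlines():
        # Drop everything after '#', which covers full-line and inline comments
        name = line.split("#", 1)[0].strip()
        if name:  # blank and comment-only lines yield an empty string
            allowed.add(name.lower())
    return allowed


def is_allowed(username: str, allowed: Set[str]) -> bool:
    """Case-insensitive membership check."""
    return username.lower() in allowed


sample = """# Core team
alice
charlie # team lead

Deploy-Bot
"""
print(sorted(parse_allowed_users(sample)))  # ['alice', 'charlie', 'deploy-bot']
```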
2024-01-15 10:30:00 - INFO - Verifying GitHub authentication...
2024-01-15 10:30:01 - INFO - ✓ Authenticated as: admin-user
2024-01-15 10:30:01 - INFO - Loaded 5 allowed user(s) from allowed_users.txt
2024-01-15 10:30:02 - INFO - Fetching repositories for organization: myorg
2024-01-15 10:30:03 - INFO - Found 3 repositories matching criteria
============================================================
CONFIRMATION REQUIRED
============================================================
Organization: myorg
Repositories to process: 3
Allowed users: 5
Mode: DRY-RUN (no changes)
Repositories:
- myorg/repo1
- myorg/repo2
- myorg/repo3
============================================================
Proceed with operation? [y/N]: y
============================================================
Processing repository: myorg/repo1
============================================================
2024-01-15 10:30:10 - INFO - Found 3 direct collaborator(s):
2024-01-15 10:30:10 - INFO - ✓ alice - admin [direct] - ALLOWED (keeping)
2024-01-15 10:30:10 - INFO - ✓ bob - write [direct] - ALLOWED (keeping)
2024-01-15 10:30:10 - INFO - ✗ old-contractor - read [direct] - NOT IN ALLOWED LIST
2024-01-15 10:30:10 - INFO - [DRY-RUN] Would remove old-contractor from myorg/repo1
============================================================
CLEANUP SUMMARY
============================================================
Organization: myorg
Mode: DRY-RUN
Repositories:
Total found: 3
Successfully processed: 3
Skipped (errors/permissions): 0
Collaborators:
Total checked: 8
Preserved (in allowed list): 6
Would be removed: 2
Removed users by repository:
repo1:
- old-contractor
repo3:
- external-user
⚠ This was a DRY-RUN. Use --apply to actually remove collaborators.
============================================================
{
"organization": "myorg",
"total_repos": 3,
"repos_processed": 3,
"repos_skipped": 0,
"total_collaborators_checked": 8,
"total_collaborators_removed": 2,
"total_team_access_preserved": 6,
"dry_run": true,
"results": [
{
"repo_name": "repo1",
"success": true,
"collaborators_checked": 3,
"collaborators_removed": 1,
"skipped_team_access": 2,
"error_message": null,
"removed_users": ["old-contractor"]
}
]
}
The script runs in dry-run mode by default, showing what would be removed without making any changes. You must explicitly use --apply to remove collaborators.
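One way to picture the dry-run default is a removal routine that only calls the destructive operation when an explicit flag is set. The sketch below assumes a caller-supplied `remove` callback and a hypothetical function name; it is not taken from the script.

```python
from typing import Callable, Iterable, List, Set


def remove_unapproved(
    collaborators: Iterable[str],
    allowed: Set[str],
    remove: Callable[[str], None],
    apply: bool = False,
) -> List[str]:
    """Return users outside the allowed set; only call `remove` when apply=True."""
    targets = [login for login in collaborators if login.lower() not in allowed]
    for login in targets:
        if apply:
            remove(login)  # destructive path, reached only with the explicit flag
        else:
            print(f"[DRY-RUN] Would remove {login}")
    return targets


removed: List[str] = []
remove_unapproved(["alice", "old-contractor"], {"alice"}, removed.append)
print(removed)  # [] because apply defaults to False
```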
Before processing repositories, the script displays:
- Organization name
- Number of repositories
- Number of allowed users
- Mode (dry-run or apply)
- List of repositories to process
Use --yes to skip this prompt for automated workflows.
The script automatically skips repositories where the token doesn't have admin access, preventing errors and ensuring you only modify repos you control.
The script only removes direct collaborators. Team-based access is never removed, ensuring organizational structure is maintained.
If processing one repository fails, the script continues with others and provides a complete report at the end.
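The continue-on-failure behavior can be sketched as a loop that records an error per repository instead of aborting. The result keys reuse field names from the sample JSON report; `process_all` and the `process_repo` callback are hypothetical names for illustration.

```python
from typing import Callable, Dict, Iterable, List


def process_all(
    repos: Iterable[str],
    process_repo: Callable[[str], List[str]],
) -> List[Dict[str, object]]:
    """Run process_repo on every repo, capturing failures instead of stopping."""
    results: List[Dict[str, object]] = []
    for repo in repos:
        try:
            removed = process_repo(repo)
            results.append({"repo_name": repo, "success": True,
                            "removed_users": removed, "error_message": None})
        except Exception as exc:  # keep going so one bad repo does not end the run
            results.append({"repo_name": repo, "success": False,
                            "removed_users": [], "error_message": str(exc)})
    return results
```

Catching the broad `Exception` here is a deliberate simplification; a real implementation would likely distinguish permission errors from network errors.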
The script handles GitHub API rate limits automatically:
- Configurable Delay: Use --delay to set time between API calls (default: 0.5 seconds)
- Monitors Rate Limits: Tracks remaining requests in real-time
- Automatic Backoff: Waits when rate limited and retries with exponential backoff
- Verbose Tracking: Shows delay application in verbose mode (-v)
Authenticated requests have a limit of 5,000 requests per hour.
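To make "exponential backoff" concrete, here is a generic retry sketch in which the wait doubles after each failed attempt. The function name, the attempt count, and the base delay are assumptions for illustration, not the script's actual values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError("max_attempts must be at least 1")
```

As a rough budget check, 5,000 requests per hour averages one request every 0.72 seconds (3600 / 5000), so a fixed delay shorter than that can still hit the limit on long runs, which is where waiting and retrying helps.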
# Faster processing (minimal delay)
python github_cleanup.py --org myorg --allowed-file allowed.txt --delay 0.1
# Default (balanced)
python github_cleanup.py --org myorg --allowed-file allowed.txt --delay 0.5
# Conservative (slower but very safe)
python github_cleanup.py --org myorg --allowed-file allowed.txt --delay 1.0
Run the test suite:
# Run all tests
pytest test_github_cleanup.py -v
# Run with coverage report
pytest test_github_cleanup.py -v --cov=github_cleanup --cov-report=html
# Run specific test class
pytest test_github_cleanup.py::TestGitHubClient -v
Error: Authentication failed
Solution: Verify your token has the required scopes:
- admin:org: Required to manage org repositories
- repo: Required to manage collaborators
Error: Insufficient permissions (admin access required)
Solution: Ensure your token has admin access to the repositories you're trying to modify.
Error: Rate limit exceeded
Solution: The script automatically waits and retries. If processing many repos, consider:
- Running during off-peak hours
- Processing repos in smaller batches with --repos
Error: Repository listed but shows as not found
Solution:
- Verify the organization name is correct
- Ensure the token has access to the repository
- Check if the repository exists and isn't archived
- Always test first: Run without --apply to verify what will be removed
- Start small: Test on a single repository with --repos before processing all repos
- Keep allowed list updated: Regularly review and update your allowed users file
- Save reports: Use --output to maintain audit logs of cleanup operations
- Use specific filters: Target private repos first with --visibility private
- Review team access: Check whether each user should be a direct collaborator or should get access through a team instead
- Token Security: Never commit tokens to version control. Use environment variables or secrets management.
- Audit Trail: Save JSON reports for compliance and auditing purposes.
- Least Privilege: Grant only necessary permissions to the token.
- Regular Reviews: Run cleanup operations regularly to maintain security posture.
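For the audit-trail point, a saved report can be condensed into a per-repository list of removed users. This sketch assumes the field names shown in the sample JSON report (`results`, `repo_name`, `removed_users`); `removed_by_repo` is a hypothetical helper.

```python
import json
from typing import Dict, List


def removed_by_repo(report_json: str) -> Dict[str, List[str]]:
    """Map each repo name to its removed users, skipping repos with none."""
    report = json.loads(report_json)
    return {
        entry["repo_name"]: entry["removed_users"]
        for entry in report.get("results", [])
        if entry.get("removed_users")
    }


sample = '{"dry_run": true, "results": [{"repo_name": "repo1", "removed_users": ["old-contractor"]}]}'
print(removed_by_repo(sample))  # {'repo1': ['old-contractor']}
```

When `dry_run` is true in the report, the listed users were only flagged, not removed.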
Contributions are welcome! Please ensure:
- All tests pass: pytest test_github_cleanup.py -v
- Code follows PEP 8 style guidelines
- Type hints are included
- Docstrings are comprehensive
- New features include tests
[Your License Here]
For issues and questions:
- Open an issue in the repository
- Check existing issues for solutions
- Review GitHub API documentation: https://docs.github.com/en/rest
- Initial release
- Support for organization-wide cleanup
- Dry-run mode with safety features
- Comprehensive error handling and reporting
- JSON output support
- Rate limiting and retry logic