A modern Python library for Git repository analysis with async support and type safety.
- Repository metadata extraction
- File tree analysis with multiple formats
- Dependency parsing (Poetry, requirements.txt, package.json)
- Language statistics and file type detection
- Async support with context managers
- Type-safe API with clear error handling
pip install gitparse

from gitparse import RepositoryAnalyzer, ExtractionConfig
# Configure analysis
config = ExtractionConfig(
    max_file_size=1024 * 1024,  # 1 MiB
    exclude_patterns=["*.pyc"],
    include_patterns=["*.py", "*.md"],
)
# Initialize and analyze repository
repo = RepositoryAnalyzer("https://github.com/username/repo", config)
# Repository metadata
info = repo.get_repository_info()
# Returns: {"name": "repo", "default_branch": "main", "head_commit": "abc123..."}
# File tree (markdown format)
tree = repo.get_file_tree(style="markdown")
# Returns: ["- README.md", " - src/", " - main.py", ...]
# Dependencies
deps = repo.get_dependencies()
# Returns: {
#     "pyproject.toml": {"dependencies": {"requests": "^2.0.0"}},
#     "requirements.txt": [{"name": "flask", "version": "2.0.0"}]
# }
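
The result above mixes two shapes: a mapping for pyproject.toml and a list of records for requirements.txt. The following is a minimal sketch, not part of GitParse, of how both could be normalized into one flat list. The helper name flatten_dependencies is hypothetical, and the sketch assumes the result always follows the two shapes shown in the example output.

```python
def flatten_dependencies(deps):
    """Return (source_file, package, version) tuples sorted by file and package."""
    rows = []
    for source, parsed in deps.items():
        if isinstance(parsed, dict):
            # Mapping shape, as shown for pyproject.toml:
            # {"dependencies": {name: version_constraint}}
            for name, version in parsed.get("dependencies", {}).items():
                rows.append((source, name, version))
        else:
            # List shape, as shown for requirements.txt:
            # [{"name": ..., "version": ...}]
            for entry in parsed:
                rows.append((source, entry["name"], entry.get("version")))
    return sorted(rows, key=lambda row: row[:2])


# Sample data copied from the example output above.
sample = {
    "pyproject.toml": {"dependencies": {"requests": "^2.0.0"}},
    "requirements.txt": [{"name": "flask", "version": "2.0.0"}],
}

for source, name, version in flatten_dependencies(sample):
    print(f"{source}: {name} {version}")
```
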
# Language statistics
stats = repo.get_language_stats()
# Returns: {
#     "Python": {"files": 10, "bytes": 1500, "percentage": 75.5},
#     "Markdown": {"files": 2, "bytes": 500, "percentage": 24.5}
# }

GitParse also comes with a CLI for quick repository analysis. Use --help with any command to see its specific options:
# Get general help
gitparse --help
# Get help for specific commands
gitparse tree --help
gitparse content --help
gitparse all-contents --help
# Common operations
gitparse <repo_url> info
gitparse <repo_url> tree --style markdown
gitparse <repo_url> deps
gitparse <repo_url> langs
gitparse <repo_url> stats
gitparse <repo_url> readme
gitparse <repo_url> content path/to/file.py
gitparse <repo_url> dir-tree src --style markdown
# Advanced usage with filters
gitparse <repo_url> all-contents --max-size 1048576 --exclude "*.pyc" "*.so"
# Output options
gitparse <repo_url> langs -o language_stats.json # Save to file
gitparse <repo_url> stats --no-pretty # Disable pretty printing

Each command supports --help for detailed usage information:
- info: Get repository information
- tree: Get repository file tree
  - See tree --help for style options
- dir-tree: Get directory file tree
  - See dir-tree --help for style options
- dir-contents: Get directory contents
- readme: Get repository README content
- content: Get specific file content
- all-contents: Get all file contents
  - See all-contents --help for filtering options
- deps: Get repository dependencies
- langs: Get language statistics
- stats: Get repository statistics
Use gitparse --help to see all available options:
- -o, --output: Save output to file
- --no-pretty: Disable pretty printing
- -h, --help: Show help message
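
For scripting, the command layout used in the examples (gitparse <repo_url> <command>, followed by options) can be assembled from Python. The sketch below is not part of GitParse: build_command, rank_languages, and language_ranking are hypothetical helpers, and it assumes that the file written by -o is JSON with the same shape as the get_language_stats() result.

```python
import json
import shutil
import subprocess
from pathlib import Path


def build_command(repo_url, command, *args, output=None, pretty=True):
    """Assemble an argv list in the order used by the CLI examples."""
    argv = ["gitparse", repo_url, command, *args]
    if output is not None:
        argv += ["-o", str(output)]
    if not pretty:
        argv.append("--no-pretty")
    return argv


def rank_languages(stats):
    """Return language names sorted by percentage, largest first."""
    return sorted(stats, key=lambda lang: stats[lang]["percentage"], reverse=True)


def language_ranking(repo_url, output="language_stats.json"):
    """Run the langs command with -o and rank the saved result.

    Assumption: the saved file is JSON shaped like get_language_stats().
    """
    if shutil.which("gitparse") is None:
        raise RuntimeError("gitparse CLI not found on PATH")
    subprocess.run(build_command(repo_url, "langs", output=output), check=True)
    saved = json.loads(Path(output).read_text(encoding="utf-8"))
    return rank_languages(saved)
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems with glob patterns such as "*.pyc".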
# Get specific file content
content = repo.get_file_content("README.md")
# Returns: "# Project Title\n..."
# Get all text files
contents = repo.get_all_contents(
    max_file_size=1024 * 1024,
    exclude_patterns=["*.pyc", "*.so"],
)
# Returns: {"README.md": "# Title...", "src/main.py": "def main():..."}
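
The size and pattern filters used above (and include_patterns in ExtractionConfig) can be pictured with the standard-library fnmatch module. This is an illustrative sketch only: GitParse's actual matching rules are not documented here, so the precedence (exclusion wins over inclusion) and the choice to match against the file name rather than the full path are assumptions, and is_selected is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def is_selected(path, size, include_patterns, exclude_patterns, max_file_size):
    """Return True if a file passes the size, exclude, and include filters.

    Assumptions: patterns are matched against the file name only, and an
    exclude match always wins. GitParse itself may behave differently.
    """
    name = path.rsplit("/", 1)[-1]
    if size > max_file_size:
        return False
    if any(fnmatchcase(name, pattern) for pattern in exclude_patterns):
        return False
    return any(fnmatchcase(name, pattern) for pattern in include_patterns)


# Hypothetical (path, size in bytes) pairs for illustration.
files = [
    ("README.md", 120),
    ("src/main.py", 2048),
    ("src/main.pyc", 4096),
    ("lib/native.so", 10),
]

kept = [
    path
    for path, size in files
    if is_selected(path, size, ["*.py", "*.md"], ["*.pyc", "*.so"], 1024 * 1024)
]
print(kept)
```
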
# Get directory tree
tree = repo.get_directory_tree(
    "src",
    style="structured",  # or "markdown", "flattened"
)
# Returns: {"src": {"main.py": None, "utils": {"helpers.py": None}}}

# Repository statistics
stats = repo.get_statistics()
# Returns: {
#     "total_files": 100,
#     "binary_ratio": 0.05,
#     "avg_file_size": 1024,
#     "language_breakdown": {...}
# }
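
A few derived figures can be computed from that result. The sketch below is not part of GitParse; summarize_statistics is a hypothetical helper, and it assumes that binary_ratio is the fraction of files that are binary and that avg_file_size is measured in bytes, neither of which is stated above.

```python
def summarize_statistics(stats):
    """Derive file counts and an approximate total size from the statistics.

    Assumptions: binary_ratio is a fraction of total_files, and
    avg_file_size is in bytes.
    """
    total = stats["total_files"]
    binary = round(total * stats["binary_ratio"])
    return {
        "binary_files": binary,
        "text_files": total - binary,
        "approx_total_bytes": total * stats["avg_file_size"],
    }


# Sample values copied from the example output above.
sample_stats = {"total_files": 100, "binary_ratio": 0.05, "avg_file_size": 1024}
summary = summarize_statistics(sample_stats)
print(summary)
```
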
# Language breakdown
langs = repo.get_language_stats()
# Returns: {
#     "Python": {"files": 50, "percentage": 80.5},
#     "JavaScript": {"files": 10, "percentage": 19.5}
# }

# Install dependencies
poetry install
# Run tests
poetry run pytest
# Run linting
poetry run ruff check .