Gihan-1994/github-crawler-mcp

GitHub Repository Crawler MCP Server

A Model Context Protocol (MCP) server that lets ChatGPT and other AI assistants crawl and query GitHub repositories, both public and private. Built with FastMCP.

Features

🔍 Comprehensive GitHub API Integration - 10 specialized tools for repository exploration:

  • Search repositories by query with advanced filtering
  • Get repository details including stars, forks, topics, and license
  • List user/org repositories with sorting options
  • Browse repository contents - files and directories
  • Get complete repository structure as a tree
  • Read README files in markdown format
  • List commits with author and message details
  • Analyze programming languages used in repositories
  • Browse issues with state filtering
  • Download file contents with automatic decoding

🔐 Private Repository Access - Full support for private repositories using GitHub Personal Access Tokens

High Rate Limits - 5,000 requests/hour with authentication (vs 60 without)

🚀 Easy Deployment - Deploy to FastMCP Cloud and connect directly to ChatGPT

Installation

1. Clone or Create the Project

mkdir github-crawler-mcp
cd github-crawler-mcp

Place the github_crawler_server.py file in this directory.

2. Install Dependencies

pip install -r requirements.txt

This will install:

  • fastmcp - MCP server framework
  • httpx - Async HTTP client for GitHub API
  • python-dotenv - Environment variable management

3. Configure GitHub Token

Create a .env file in the project root:

cp .env.example .env

Edit .env and add your GitHub Personal Access Token:

GITHUB_TOKEN=ghp_your_actual_token_here

Creating a GitHub Personal Access Token:

  1. Go to https://github.com/settings/tokens
  2. Click "Generate new token (classic)"
  3. Give it a descriptive name (e.g., "MCP GitHub Crawler")
  4. Select scopes:
    • repo - Full control of private repositories (needed only if you want to crawl private repositories)
    • read:org - Read org and team membership
  5. Click "Generate token"
  6. Copy the token immediately (you won't see it again!)
  7. Paste it in your .env file
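
As a rough illustration of how the token might be turned into request headers, here is a minimal sketch. The helper name build_github_headers is hypothetical and is not taken from github_crawler_server.py; the header values follow GitHub's REST API conventions.

```python
import os


def build_github_headers(token=None):
    """Return headers for GitHub REST API requests.

    Hypothetical helper for illustration; the actual server may build
    its headers differently.
    """
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    # Without a token the request is anonymous and limited to 60 requests/hour.
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


if __name__ == "__main__":
    # GITHUB_TOKEN is read from the environment (for example, loaded from .env).
    headers = build_github_headers(os.environ.get("GITHUB_TOKEN"))
    print("authenticated" if "Authorization" in headers else "anonymous")
```

Keeping the token optional lets the same code path serve public repositories when no token is configured.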

Usage

Running Locally

Option 1: Using Python Directly (stdio mode)

python github_crawler_server.py

This starts the server in stdio mode, which is useful for testing with MCP-compatible clients.

Option 2: Using FastMCP CLI (stdio mode)

fastmcp run github_crawler_server.py:mcp

Option 3: HTTP Server for Testing

fastmcp run github_crawler_server.py:mcp --transport http --port 8000

The server will be available at http://localhost:8000/mcp

Testing the Server

Run the test script to verify all tools are working:

python test_server.py

This will test each tool with real GitHub API calls.

Available Tools

1. search_repositories

Search for repositories by query.

# Example: Search for FastMCP repositories
search_repositories(query="fastmcp language:python", sort="stars", order="desc", per_page=10)
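
For orientation, the call above presumably maps onto GitHub's repository search endpoint. The sketch below shows how such a request URL could be assembled; build_search_url is a hypothetical name, and the mapping of tool parameters to query parameters is an assumption, not the server's confirmed implementation.

```python
from urllib.parse import urlencode

GITHUB_API = "https://api.github.com"


def build_search_url(query, sort="stars", order="desc", per_page=10):
    """Build a GitHub repository search URL (hypothetical helper)."""
    params = {"q": query, "sort": sort, "order": order, "per_page": per_page}
    # urlencode escapes spaces and qualifiers such as "language:python".
    return f"{GITHUB_API}/search/repositories?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("fastmcp language:python"))
```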

2. get_repository_details

Get comprehensive details about a specific repository.

# Example: Get details for jlowin/fastmcp
get_repository_details(owner="jlowin", repo="fastmcp")

3. list_user_repositories

List all repositories for a user or organization.

# Example: List all repos for a user
list_user_repositories(username="octocat", type="all", sort="updated")

4. get_repository_contents

Browse files and directories in a repository.

# Example: List root directory contents
get_repository_contents(owner="jlowin", repo="fastmcp", path="")

# Example: Get a specific file
get_repository_contents(owner="jlowin", repo="fastmcp", path="README.md")
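
GitHub's contents endpoint returns a JSON array for a directory and a single JSON object for a file, so a caller usually has to handle both shapes. The following is a minimal sketch of that distinction; summarize_contents is a hypothetical helper, and it is an assumption that the tool passes the GitHub payload through in this form.

```python
def summarize_contents(payload):
    """Summarize a GitHub contents API payload (hypothetical helper).

    A list means the path was a directory; a dict means it was a file.
    """
    if isinstance(payload, list):
        return [f"{item['type']}: {item['path']}" for item in payload]
    return [f"{payload['type']}: {payload['path']} ({payload['size']} bytes)"]


if __name__ == "__main__":
    directory = [{"type": "dir", "path": "src"}, {"type": "file", "path": "README.md"}]
    for line in summarize_contents(directory):
        print(line)
```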

5. get_repository_structure

Get the complete directory tree of a repository.

# Example: Get full structure
get_repository_structure(owner="jlowin", repo="fastmcp", branch="main")
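
GitHub's Git trees endpoint, when called recursively, returns a flat list of entries with a path and a type ("blob" for files, "tree" for directories). One way to turn that flat list into a nested structure is sketched below; build_tree is a hypothetical helper, and whether the server uses this endpoint or this representation is an assumption.

```python
def build_tree(entries):
    """Convert flat Git tree entries into a nested dict (hypothetical helper).

    Directories become dicts; files (and any other entry type) become None.
    """
    root = {}
    for entry in entries:
        parts = entry["path"].split("/")
        node = root
        # Walk or create the parent directories of this entry.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        if entry["type"] == "tree":
            node.setdefault(parts[-1], {})
        else:
            node[parts[-1]] = None
    return root


if __name__ == "__main__":
    sample = [
        {"path": "src", "type": "tree"},
        {"path": "src/app.py", "type": "blob"},
        {"path": "README.md", "type": "blob"},
    ]
    print(build_tree(sample))
```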

6. get_repository_readme

Get the README content in markdown format.

# Example: Read README
get_repository_readme(owner="jlowin", repo="fastmcp")

7. list_repository_commits

List recent commits with details.

# Example: Get last 20 commits
list_repository_commits(owner="jlowin", repo="fastmcp", per_page=20)
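
Each item in GitHub's commit list carries the SHA at the top level and the author and message under a nested "commit" object. A minimal sketch of reducing one item to the author and message details follows; summarize_commit is a hypothetical helper, and the output fields are an assumption rather than the tool's documented response.

```python
def summarize_commit(item):
    """Reduce one GitHub commit list item to a few fields (hypothetical helper)."""
    commit = item["commit"]
    message = commit["message"]
    return {
        "sha": item["sha"][:7],
        "author": commit["author"]["name"],
        "date": commit["author"]["date"],
        # Keep only the subject line of a multi-line commit message.
        "subject": message.splitlines()[0] if message else "",
    }


if __name__ == "__main__":
    sample = {
        "sha": "0123456789abcdef",
        "commit": {
            "author": {"name": "Ada", "date": "2024-01-01T00:00:00Z"},
            "message": "Fix bug\n\nDetails",
        },
    }
    print(summarize_commit(sample))
```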

8. get_repository_languages

Analyze programming languages used.

# Example: Get language breakdown
get_repository_languages(owner="jlowin", repo="fastmcp")
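
GitHub's languages endpoint reports the number of bytes of code per language. If percentages are more useful, they can be derived as in this sketch; language_percentages is a hypothetical helper, and it is an assumption that the tool returns the raw byte counts.

```python
def language_percentages(byte_counts):
    """Convert bytes-per-language into percentages (hypothetical helper)."""
    total = sum(byte_counts.values())
    if total == 0:
        return {}
    return {lang: round(100 * count / total, 1) for lang, count in byte_counts.items()}


if __name__ == "__main__":
    print(language_percentages({"Python": 750, "Shell": 250}))
```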

9. list_repository_issues

Browse repository issues.

# Example: List open issues
list_repository_issues(owner="jlowin", repo="fastmcp", state="open", per_page=10)
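
One detail worth knowing: GitHub's issues endpoint also returns pull requests, which can be recognized by a "pull_request" key on the item. If only true issues are wanted, a filter like the one below may help; only_issues is a hypothetical helper, and whether the server already filters pull requests is not stated here.

```python
def only_issues(items):
    """Drop pull requests from a GitHub issues listing (hypothetical helper)."""
    return [item for item in items if "pull_request" not in item]


if __name__ == "__main__":
    sample = [
        {"number": 1, "title": "Bug report"},
        {"number": 2, "title": "Add feature", "pull_request": {"url": "..."}},
    ]
    print([item["number"] for item in only_issues(sample)])
```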

10. get_file_content

Download and decode a specific file.

# Example: Get file content (the path must exist in the repository)
get_file_content(owner="jlowin", repo="fastmcp", path="README.md")
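
The decoding step can be pictured as follows. GitHub's contents endpoint returns file bodies base64-encoded, with line breaks inside the encoded string. This is a minimal sketch assuming that payload shape; decode_file_content is a hypothetical helper, not the server's actual function.

```python
import base64


def decode_file_content(payload):
    """Decode the base64 body of a GitHub contents payload (hypothetical helper)."""
    encoding = payload.get("encoding")
    if encoding != "base64":
        raise ValueError(f"unexpected encoding: {encoding!r}")
    # b64decode ignores the embedded newlines by default.
    raw = base64.b64decode(payload["content"])
    # Replace undecodable bytes so binary files do not raise.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = {"encoding": "base64", "content": "aGVsbG8g\nd29ybGQ=\n"}
    print(decode_file_content(sample))
```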

Deploying to FastMCP Cloud

1. Push to GitHub

git init
git add .
git commit -m "Initial commit: GitHub Crawler MCP Server"
git branch -M main
git remote add origin https://github.com/yourusername/github-crawler-mcp.git
git push -u origin main

2. Deploy on FastMCP Cloud

  1. Go to FastMCP Cloud
  2. Sign in with your GitHub account
  3. Click "Create New Project"
  4. Select your github-crawler-mcp repository
  5. Set the server entrypoint: github_crawler_server.py:mcp
  6. Add environment variable: GITHUB_TOKEN with your token value
  7. Click "Deploy"

Your server will be deployed and you'll get a URL like:

https://your-project.fastmcp.app/mcp

3. Connect to ChatGPT

  1. Open ChatGPT
  2. Go to Settings and open the MCP (connector) settings; the exact menu name depends on your ChatGPT plan and version
  3. Add your FastMCP Cloud server URL
  4. Save and start using GitHub crawler tools in ChatGPT!

Example ChatGPT Queries

Once connected, you can ask ChatGPT questions like:

  • "Search for popular Python web frameworks on GitHub"
  • "Show me the README for the FastMCP repository"
  • "What programming languages are used in the langchain repository?"
  • "List the recent commits in the openai/openai-python repository"
  • "Get me the structure of the react repository"
  • "Show me open issues in the tensorflow/tensorflow repository"

Rate Limits

Authentication    Rate Limit
Without token     60 requests/hour
With token        5,000 requests/hour

Recommendation: Always use a GitHub token in production so that requests are not throttled at the 60 requests/hour unauthenticated limit.
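
To put the two limits in perspective, the sketch below computes the average spacing between requests that stays within an hourly budget. The helper name is hypothetical, and it assumes requests are spread evenly over the hour.

```python
def min_seconds_between_requests(limit_per_hour):
    """Average spacing that stays within an hourly request budget."""
    return 3600 / limit_per_hour


if __name__ == "__main__":
    for label, limit in (("without token", 60), ("with token", 5000)):
        print(f"{label}: one request every {min_seconds_between_requests(limit):.2f} s")
```

Under that assumption, an unauthenticated client can sustain about one request per minute, while an authenticated one can sustain more than one per second.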

Security Notes

⚠️ Important:

  • Never commit your .env file to version control (it's in .gitignore)
  • When deploying to FastMCP Cloud, add GITHUB_TOKEN as a secure environment variable
  • Use tokens with minimal required scopes for security
  • Rotate tokens periodically

Troubleshooting

Error: "GitHub API error: 401"

  • Your GitHub token is invalid or expired
  • Generate a new token and update your .env file

Error: "GitHub API error: 403 - Rate limit exceeded"

  • You've exceeded the rate limit
  • Wait for the limit to reset (the X-RateLimit-Reset response header gives the reset time as a Unix timestamp in seconds, UTC)
  • If using without authentication, add a GitHub token
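
A minimal sketch of working out how long to wait from the X-RateLimit-Reset header is shown below. seconds_until_reset is a hypothetical helper; it takes a plain dict, so the header name must match exactly, whereas an HTTP client's header object (such as httpx's) is typically case-insensitive.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait until the rate limit resets (hypothetical helper).

    X-RateLimit-Reset holds the reset time in UTC epoch seconds.
    """
    now = time.time() if now is None else now
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    # Never return a negative wait if the reset time has already passed.
    return max(0, int(reset_at - now))


if __name__ == "__main__":
    sample = {"X-RateLimit-Reset": "1700000060"}
    print(seconds_until_reset(sample, now=1700000000))
```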

Error: "GitHub API error: 404"

  • Repository doesn't exist or is private (and you don't have access)
  • Check the owner and repo names
  • Ensure your token has access to private repos if needed

Tool returns empty results

  • Check your search query syntax
  • Verify the repository/user exists
  • Some queries may genuinely return no results

Project Structure

github-crawler-mcp/
├── github_crawler_server.py  # Main MCP server
├── requirements.txt           # Python dependencies
├── .env.example              # Environment template
├── .env                      # Your config (not in git)
├── .gitignore               # Git ignore rules
├── test_server.py           # Test script
└── README.md               # This file

Contributing

Contributions are welcome! Feel free to:

  • Report bugs
  • Suggest new features
  • Submit pull requests
  • Improve documentation

License

MIT License - feel free to use this in your own projects!

Built with ❤️ using FastMCP
