GitHub Issue Analyzer

A production-ready FastAPI application that asynchronously ingests GitHub issues from public repositories, caches them in a local SQLite database, and uses OpenAI's GPT models to derive natural-language insights from repository data.

πŸš€ Features

  • GitHub Integration: Asynchronously fetches and paginates through GitHub issues using httpx.
  • Intelligent Analysis: Summarizes and analyzes issue context using OpenAI's gpt-4o-mini.
  • Local Caching: Stores issues in an SQLite database using SQLAlchemy to minimize API calls and improve performance.
  • Production-Grade API: Built with FastAPI, featuring structured logging, telemetry middleware, and Pydantic validation.
  • Code Quality: Enforced via ruff for linting/formatting and mypy for static type checking.
  • Containerization: Fully Dockerized for easy deployment.
  • Developer Experience: Includes a Makefile for automating common tasks.
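
The GitHub integration follows GitHub's standard Link-header pagination. A minimal, stdlib-only sketch of how the "next page" URL can be extracted (the function name is illustrative, not the project's actual API):

```python
def parse_next_link(link_header):
    """Extract the rel="next" URL from a GitHub `Link` header, if present.

    GitHub returns headers like:
        <https://api.github.com/...?page=2>; rel="next", <...?page=5>; rel="last"
    """
    if not link_header:
        return None
    for part in link_header.split(","):
        pieces = part.split(";")
        if len(pieces) < 2:
            continue
        url = pieces[0].strip().strip("<>")
        # A link may carry several parameters; we only care about rel="next".
        if any(p.strip() == 'rel="next"' for p in pieces[1:]):
            return url
    return None  # no further pages
```

A fetch loop would keep requesting `parse_next_link(response.headers.get("Link"))` until it returns `None`.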

πŸ› οΈ Tech Stack

  • Framework: FastAPI, Uvicorn
  • Database: SQLAlchemy, SQLite
  • Async HTTP: HTTPX
  • LLM Integration: OpenAI API
  • Validation: Pydantic v2
  • Testing: Pytest, Pytest-Asyncio, Pytest-Cov
  • Linting & Typing: Ruff, Mypy
  • Infrastructure: Docker

πŸ—οΈ Architecture & Design Decisions

  • Modular Design: Structured into clear layers (Routers, Services, Models) to separate concerns, ensuring scalability and maintainability.
  • Asynchronous Processing: Uses httpx for non-blocking I/O, allowing the server to handle concurrent requests efficiently while waiting for GitHub API responses.
  • Local Caching Strategy (SQLite):
    • Reasoning: Chosen for its zero-configuration, serverless architecture which simplifies local development and testing.
    • Benefit: It acts as a reliable persistence layer without the overhead of spinning up a separate Docker container for Postgres/MySQL.
    • Trade-off: While excellent for this standalone service, a distributed system would require migrating to a client-server DB (like Postgres) for handling multiple writer instances.
  • LLM Optimization: Applies intelligent truncation to issue bodies to fit within context windows while preserving key information, balancing cost and analysis quality.
  • Observability: Features custom telemetry middleware to log request metrics (duration, status), enabling performance monitoring and debugging.
  • Project Tooling: Utilizes a Makefile and strict linting (ruff, mypy) to enforce a standardized, production-grade development workflow.
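
The telemetry middleware described above can be sketched as a small framework-agnostic ASGI wrapper; the class name and log format here are illustrative, not the project's actual implementation:

```python
import time

class TimingMiddleware:
    """Minimal ASGI middleware that records request duration and status code."""

    def __init__(self, app, log=print):
        self.app = app
        self.log = log

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        start = time.perf_counter()
        status = {"code": None}

        async def send_wrapper(message):
            # The status code arrives on the http.response.start message.
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        await self.app(scope, receive, send_wrapper)
        duration_ms = (time.perf_counter() - start) * 1000
        self.log(f"{scope['method']} {scope['path']} -> {status['code']} in {duration_ms:.1f}ms")
```

In FastAPI, this style of middleware is typically registered with `app.add_middleware(...)` or an `@app.middleware("http")` handler.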

πŸ“‹ Prerequisites

  • Python 3.10+
  • Docker (optional)
  • OpenAI API Key

βš™οΈ Configuration

  1. Clone the repository:

    git clone https://github.com/sk31Dev/github_issue_analyzer.git
    cd github_issue_analyzer
  2. Environment Setup: Copy the example environment file (if available) or create a .env file in the root directory:

    touch .env

    Add the following variables to .env:

    OPENAI_API_KEY=your_openai_api_key_here
    DATABASE_URL=sqlite:///./issues.db
    GITHUB_API_URL=https://api.github.com
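
The application reads these variables at startup. A stdlib-only illustration of loading them with the defaults shown above (the real project may well use Pydantic settings instead; this sketch only mirrors the variable names from the .env example):

```python
import os

def load_settings() -> dict:
    """Read configuration from the environment, applying the defaults
    shown in the .env example for the optional values."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY must be set")
    return {
        "openai_api_key": api_key,
        "database_url": os.environ.get("DATABASE_URL", "sqlite:///./issues.db"),
        "github_api_url": os.environ.get("GITHUB_API_URL", "https://api.github.com"),
    }
```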

πŸ“¦ Installation & Running Locally

This project uses a Makefile to simplify development workflows.

  1. Create and Activate a Virtual Environment (Recommended):

    python -m venv venv

    # Windows (Git Bash)
    source venv/Scripts/activate
    # Windows (PowerShell)
    .\venv\Scripts\Activate
    # Mac/Linux
    source venv/bin/activate
  2. Install Dependencies: Installs both production and development dependencies.

    make install

    Alternatively: pip install -r requirements.txt && pip install -r requirements-dev.txt

  3. Run the Application: Starts the FastAPI server with hot-reload enabled.

    make run

    Alternatively: python -m app.main

    The API will be available at http://localhost:8000.
    API Documentation (Swagger UI): http://localhost:8000/docs

πŸ§ͺ Development

Running Tests

Run the test suite with coverage reporting:

make test

Linting & Formatting

Ensure code quality before committing:

make lint    # Checks for linting errors and type issues
make format  # Auto-formats code using Ruff

🐳 Docker Support

Build and run the application as a container.

  1. Build the Image:

    make docker-build
  2. Run the Container: Runs the container on port 8000 using your local .env file.

    make docker-run

πŸ”Œ API Endpoints

1. Scan Repository

Fetches open issues from a public GitHub repository and caches them. In this example, we scan the OpenAI Python SDK repository.

  • URL: /scan

  • Method: POST

  • Body:

    {
      "repo": "openai/openai-python"
    }
  • Response:

    {
      "repo": "openai/openai-python",
      "issues_fetched": 287,
      "cached_successfully": true
    }
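
The /scan endpoint can also be exercised from a short script rather than Swagger UI. A stdlib-only client sketch, assuming the server is running locally on port 8000 (helper names are illustrative):

```python
import json
import urllib.request

def build_scan_request(repo, base_url="http://localhost:8000"):
    """Build a POST /scan request carrying a JSON body like the example above."""
    payload = json.dumps({"repo": repo}).encode()
    return urllib.request.Request(
        f"{base_url}/scan",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def scan_repo(repo):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_scan_request(repo)) as resp:
        return json.load(resp)
```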

2. Analyze Issues

Sends cached issues to the LLM for summarization or analysis. Here we ask about connection-timeout themes in the httpx library.

  • URL: /analyze

  • Method: POST

  • Body:

    {
      "repo": "encode/httpx",
      "prompt": "Identify common themes related to connection timeouts in the last 50 issues."
    }
  • Response:

    {
      "analysis": "Based on the recent issues, users are frequently experiencing connection timeouts when using proxies. Key themes include:\n\n1. **Proxy Authentication**: Several reports indicate timeouts specifically when digest auth is enabled with proxies.\n2. **Keep-Alive defaults**: Users migrating from requests are encountering changes in default keep-alive behavior causing hanging connections.\n3. **Async Contexts**: Improper closure of async contexts leading to pool exhaustion."
    }
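
Before issues reach the model, the service truncates overlong bodies to fit the context window (see Architecture above). One common approach, shown here as an illustrative sketch rather than the project's exact logic, keeps the head and tail of the text:

```python
def truncate_middle(text, max_chars, marker="\n...[truncated]...\n"):
    """Trim the middle of overlong text, keeping the opening and closing
    portions, which usually carry the problem statement and resolution."""
    if len(text) <= max_chars:
        return text
    keep = max_chars - len(marker)
    head = keep // 2
    tail = keep - head
    return text[:head] + marker + text[-tail:]
```

Keeping both ends tends to preserve the issue's problem statement and any resolution notes, while the dropped middle is often long logs or back-and-forth discussion.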

πŸ“‚ Project Structure

github_issue_analyzer/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ routers/       # API endpoints
β”‚   β”œβ”€β”€ services/      # Business logic (GitHub, LLM)
β”‚   β”œβ”€β”€ utils/         # Utilities (Logging)
β”‚   β”œβ”€β”€ main.py        # App entry point & middleware
β”‚   β”œβ”€β”€ models.py      # Database models
β”‚   β”œβ”€β”€ schemas.py     # Pydantic models
β”‚   └── database.py    # DB Setup
β”œβ”€β”€ tests/             # Pytest suite
β”œβ”€β”€ .env               # Environment variables (GitIgnored)
β”œβ”€β”€ .gitignore         # Ignored files
β”œβ”€β”€ Dockerfile         # Docker configuration
β”œβ”€β”€ Makefile           # Task automation
β”œβ”€β”€ pyproject.toml     # Tool configuration (Ruff, Mypy)
└── requirements.txt   # Production dependencies

πŸ€– Development Prompts (LLM Usage)

This project was developed using an iterative prompting strategy to simulate a pair-programming environment with an AI assistant. The development process involved initial architectural planning with Gemini 3 Pro (Web) followed by implementation and refinement using Gemini 3 Pro on GitHub Copilot.

Why Gemini 3 Pro (Web)? I chose to use the Gemini web interface for the high-level planning phase because the GitHub Copilot integration (Gemini 3 Pro) is currently in preview. I wanted to ensure the foundational architectural decisions were made using the most stable and feature-rich version of the model available.

Phase 1: Planning & Architecture (Gemini 3 Pro web)

  1. Initial Strategy & Tech Stack Selection: "I started by sharing the requirement spec and asked for a recommendation on the best language and framework..."

    Prompt: "I am preparing for an interview assignment [Attached Requirement Spec]. First, suggest the most suitable language and framework for this task. List all options with their pros and cons, and explain the reasoning behind the final choice. For every design or technical decision, provide a clear justification."

  2. Project Scaffolding & Initial Code: "After agreeing on the stack, I asked for the project structure and the initial code skeleton..."

    Prompt: "I agree with the proposed stack and will use an OpenAI API key. Please generate the code, but first define the project structure. Ensure you follow Python best practices and coding standards, including SOLID principles, clean code, and performance optimizations."

  3. Demo Planning: "To prepare for the demo, I asked for suggestions on open-source repositories to scan..."

    Prompt: "Suggest an example repository I can use to demo this project. Also, how much credit should I add to my OpenAI account for testing and the demo?"

  4. Testing Strategy: "I asked about the most appropriate unit tests for this kind of application..."

    Prompt: "What types of unit tests should I generate for this project? Provide step-by-step instructions for adding these unit tests."

Phase 2: Implementation & Refinement (GitHub Copilot)

  1. Scaffolding & Implementation: "I instructed Copilot to implement the folder structure and code skeleton provided by Gemini..."

    Prompt: "Create the following structure [github_issue_analyzer tree] with app/ (routers, services, models), tests/, and configuration files."

  2. Dependency Management (Production vs. Dev): "I asked to generate separate requirement files for production and development..."

    Prompt: "Generate a requirements.txt for production dependencies and a separate requirements-dev.txt for development tools, ensuring the dev file imports the main one."

  3. Core Logic Refinement: "I took the core logic provided by Gemini for github_service.py and llm_service.py and asked Copilot to refine it..."

    Prompt: "Implement github_service.py to fetch issues using httpx and cache them in SQLite. Ensure pagination is handled. Then implement llm_service.py to read from the DB, truncate content, and send to OpenAI."

  4. Feature Implementation (Telemetry): "I needed to track performance, so I asked to implement logging middleware..."

    Prompt: "Implement logging middleware in main.py. Then create a corresponding test in tests/test_logging.py to verify that logs are captured correctly."

  5. Automation (Makefile): "To automate the workflow, I asked for a Makefile..."

    Prompt: "Create a Makefile including targets for install, run, test, lint, and docker-build. Also, create a pyproject.toml to configure generic linting settings with Ruff."

  6. Code Quality & Refactoring: "I ran MyPy and fed the errors back to Copilot..."

    Prompt: "Run MyPy on the codebase, analyze the type errors, and recursively apply fixes to all files until valid."

  7. Requirement Verification: "I requested a formal review of the implemented code against the original assignment spec..."

    Prompt: "Review the current codebase against the provided assignment requirements. Verify that all functional requirements and edge cases (like 'repo not found') are strictly met."

  8. Troubleshooting: "When I ran into a 'pytest not found' error, I asked for help debugging it..."

    Prompt: "I am encountering the error: 'bash: pytest: command not found'. Also, how can I add test statistics, such as the number of tests executed and a coverage report?"

  9. Documentation: "Finally, I asked for a professional README..."

    Prompt: "Generate a detailed, professional README for this project. Populate the API documentation with real-world usage examples for both endpoints, using repositories like openai/openai-python and httpx."

Phase 3: System Prompt Engineering (GitHub Copilot)

  1. System Prompt Refinement: "I iterated on the system prompt in llm_service.py, specifically asking to make the persona more 'expert-level'..."

    Prompt: "Refine the system prompt in llm_service.py to be more expert-level. Instruct the model to focus on patterns, be actionable, use evidence (issue IDs), and use strict Markdown formatting. Do not hallucinate issues."

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.
