ophirshiran/mega-coder

Mega Coder

Mega Coder is a Python console application that uses LLMs to:

  1. Develop a Python program for the user using Gemini 2.5 Flash Lite
  2. Analyze and fix or explain a public GitHub repository using Gemini 2.5 Pro and gitingest
  3. Watch your screen in realtime, detect code using OCR, and give coding tips using GPT 5 mini

The app does not accept command line arguments.
All interaction with the user is done through the console menu.


Table of contents

  1. Project structure
  2. Requirements
  3. Installation
  4. Environment variables and configuration
  5. Running the app
  6. Option 1: Develop a Python program
  7. Option 2: Fix or explain a GitHub repository
  8. Option 3: Screen based realtime coding tips
  9. Extra feature: summary and run log
  10. Limitations and notes

Project structure

Main files in this exercise:

  • mega_coder.py
    Main console application that implements all three options and the extra feature.

  • generated-code-gemini.py
    Output file created and overwritten by option 1. Contains the generated Python program.

  • mega_coder.log
    Simple text log that records runs of option 1. Created automatically when option 1 is used.

There may also be small helper scripts like gemini_tester.py or chatgpt_response_tester.py, but the core of the exercise is in mega_coder.py.


Requirements

Python

  • Python 3.10 or newer is recommended.

Python packages

The following packages are used by mega_coder.py:

  • Core libraries for this project:

    • python-dotenv
    • colorama
    • tqdm
    • google-generativeai
    • openai
    • gitingest
    • pylint
  • Screen capture and OCR for option 3:

    • mss
    • numpy
    • rapidocr-openvino

You can install these packages with:

pip install python-dotenv colorama tqdm google-generativeai openai gitingest pylint mss numpy rapidocr-openvino

The code does not pin exact package versions. You can pin them in requirements.txt if needed.


Installation

1. Clone or open the project folder

Put mega_coder.py and the rest of the files in a dedicated folder, for example:

/path/to/mega-coder/

2. Create a virtual environment (recommended)

From inside the project folder:

python -m venv .venv

3. Activate the virtual environment

On macOS or Linux:

source .venv/bin/activate

On Windows PowerShell:

.venv\Scripts\Activate.ps1

4. Install dependencies

pip install python-dotenv colorama tqdm google-generativeai openai gitingest pylint mss numpy rapidocr-openvino

5. Create a .env file for secrets

In the project root folder, create a file called .env:

GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here

Do not commit this file to Git. It should stay local only.


Environment variables and configuration

The app uses python-dotenv and loads environment variables from .env automatically:

  • GOOGLE_API_KEY

    • Required for Gemini 2.5 Flash Lite and Gemini 2.5 Pro via the google-generativeai library.
    • Used by options 1 and 2.
  • OPENAI_API_KEY

    • Required for GPT 5 mini via the openai library.
    • Used by option 3.

If any required key is missing:

  • Option 1 or 2 will exit with an error message about GOOGLE_API_KEY.
  • Option 3 will print a clear [screen-helper] message that OPENAI_API_KEY is missing and will not start the coaching loop.
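
The key check described above can be sketched roughly as follows. This is a minimal illustration, not the code from mega_coder.py: the mapping REQUIRED_KEY_BY_OPTION and the function missing_key are hypothetical names, and the sketch reads from a plain mapping so it runs without python-dotenv. In the real app the values would first be loaded from .env.

```python
import os

# Which API key each menu option needs, as listed above.
# The dictionary and function names are illustrative assumptions.
REQUIRED_KEY_BY_OPTION = {
    "1": "GOOGLE_API_KEY",
    "2": "GOOGLE_API_KEY",
    "3": "OPENAI_API_KEY",
}


def missing_key(option, env=None):
    """Return the name of the missing key for this option, or None if it is set."""
    env = os.environ if env is None else env
    key = REQUIRED_KEY_BY_OPTION[option]
    # Treat an empty or whitespace-only value as missing.
    if not env.get(key, "").strip():
        return key
    return None


if __name__ == "__main__":
    for option in ("1", "2", "3"):
        print(option, missing_key(option) or "ok")
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.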

Running the app

Demo video: exercise1-mega-coder-demo.mp4

From the project root, with the virtual environment activated:

python mega_coder.py

The app enforces the “no command line arguments” rule.
If you try to run python mega_coder.py something, the program will exit with an error.
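
One plausible way to enforce this rule is to inspect sys.argv at startup. The sketch below is an assumption about the approach, not the actual implementation. The function names and the error text are made up for illustration.

```python
import sys


def has_extra_args(argv):
    """Return True when anything follows the script name."""
    return len(argv) > 1


def enforce_no_args(argv):
    """Exit with a non-zero status if extra arguments were passed."""
    if has_extra_args(argv):
        # sys.exit with a string prints it to stderr and exits with status 1.
        # The wording of this message is an assumption.
        sys.exit("Error: this program does not accept command line arguments.")


if __name__ == "__main__":
    enforce_no_args(sys.argv)
    print("No extra arguments, starting the menu...")
```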

When the program starts, it prints:

I’m Mega Coder. What would you like me to do today?

1. Develop a python program.
2. Fix/change something in a Github repository.
3. Look at my screen and give me realtime coding tips.

You can now type:

  • 1 and Enter: option 1
  • 2 and Enter: option 2
  • 3 and Enter: option 3
  • q and Enter: quit the program gracefully

All user input is read from standard input using input() inside the mega_coder.py app itself.
Generated programs are required not to use input() or command line arguments.
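
The menu handling above could look roughly like this. It is a sketch under assumptions: the action names, the "> " prompt, the retry message and the case-insensitive handling of q are illustrative and not taken from mega_coder.py. The reader and writer are passed in so the loop can be exercised without a real console.

```python
# Action names are hypothetical labels for the three options and quit.
MENU_ACTIONS = {"1": "develop", "2": "repository", "3": "screen", "q": "quit"}


def parse_choice(raw):
    """Map raw console input to an action name, or None if it is not a valid choice."""
    return MENU_ACTIONS.get(raw.strip().lower())


def menu_loop(read_line=input, write=print):
    """Keep asking until the user quits; return the list of chosen actions."""
    chosen = []
    while True:
        action = parse_choice(read_line("> "))
        if action is None:
            write("Please type 1, 2, 3 or q.")
            continue
        if action == "quit":
            return chosen
        # A real app would dispatch to the option handler here.
        chosen.append(action)
```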


Option 1: Develop a Python program

Overview

Option 1 implements the full pipeline described in the assignment:

  • Ask the user for a program description
  • Generate Python code with Gemini 2.5 Flash Lite
  • Write the code to generated-code-gemini.py
  • Run the generated code and collect exit code, stdout and stderr
  • If there is an error, ask Gemini to fix the code, up to 5 attempts
  • Add random mutations to sometimes break the code on purpose
  • Run pylint on the generated file and use Gemini to fix lint issues, up to 3 rounds
  • Measure runtime of the final program and try to generate a faster version
  • Print a human readable summary and append a log entry to mega_coder.log
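
The "run the generated code and collect exit code, stdout and stderr" step can be sketched with subprocess.run. The helper name run_python_file and the timeout_seconds parameter are assumptions added for illustration. The source only states that subprocess.run is used.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python_file(path, timeout_seconds=30):
    """Run a Python file in a fresh interpreter; return (exit_code, stdout, stderr)."""
    completed = subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,      # collect stdout and stderr instead of printing them
        text=True,                # decode the output to str
        timeout=timeout_seconds,  # hypothetical safety limit, not from the source
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "generated-code-gemini.py"
        target.write_text("assert 1 + 1 == 3, 'math is broken'\n", encoding="utf-8")
        exit_code, stdout, stderr = run_python_file(target)
        print(exit_code)  # non-zero because the assert fails
        print(stderr.strip().splitlines()[-1])
```

An uncaught AssertionError makes the interpreter exit with a non-zero status, so a single exit code check covers both crashes and failed asserts.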

Flow in detail

When you select option 1:

  1. The app prints:

    === Option 1: develop a Python program ===
    
    Describe me which python program you want me to develop.
    
  2. You type a description, for example:

    A program that checks if a number is prime and prints all primes up to 100
    
  3. The description is sanitized and validated, then used to build a strict prompt for Gemini. The prompt explicitly requires:

    • No command line arguments
    • No calls to input()
    • A complete, runnable Python 3 program
    • Only standard library imports
    • Meaningful assert statements that check the core logic
  4. The app calls Gemini 2.5 Flash Lite (model name gemini-2.5-flash-lite) using google-generativeai.
    The raw response is converted to plain Python code and written to generated-code-gemini.py.

  5. The program then starts an auto fix loop:

    • Constant: MAX_FIX_ATTEMPTS = 5
    • Before each run, the app may inject a tiny syntax bug into the code: with probability MUTATION_PROBABILITY = 0.3 it adds an invalid print( line.
    • The generated program is executed by spawning a new Python process using subprocess.run.
    • If the exit code is not zero (for example, because an assertion fails), the error details and the current code are sent back to Gemini, which returns a fixed version.
    • This process repeats until the program runs successfully or until 5 attempts are used.

    If after 5 attempts the code still does not run correctly, the app prints:

    Sorry master, I have failed you. I can’t create this program without issues
    
  6. Once a working version is found, it is written to generated-code-gemini.py.

  7. The app then runs a lint and auto fix loop using pylint:

    • Constant: MAX_LINT_FIX_ATTEMPTS = 3
    • Command:
      pylint --disable=C,R generated-code-gemini.py
    • If pylint returns issues, the full pylint output and current code are sent to Gemini, which returns an updated version.
    • The updated version is written back to generated-code-gemini.py and linting is repeated, up to 3 rounds.

    If after 3 rounds there are still lint issues, the app prints:

    There are still lint errors/warnings
    

    If there are no issues, the app prints:

    Amazing. No lint errors/warnings
    

    During this process, the tqdm progress bar shows the lint fixing rounds.

  8. After the linting stage, the app measures runtime of the current program:

    • It runs generated-code-gemini.py and measures execution time in milliseconds.
    • Using this timing and the code, it asks Gemini to generate a more efficient version that keeps the same asserts and behavior.
    • It writes the optimized version to generated-code-gemini.py, measures its runtime, and compares times.

    If the optimized version is faster, it prints a message similar to:

    Code running time optimized! It now runs in {after} milliseconds, while before it was {before} milliseconds
    

    If the optimized version is slower or fails to run, the original version is restored.

  9. At the end of the pipeline, the app:

    • Prints a short summary block to the console.
    • Appends a log entry for this run to mega_coder.log.
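
The auto fix loop from step 5 can be sketched as below. Only the two constants come from the description. Everything else is an assumption made for illustration: the function names, running the code with python -c instead of writing a file, and the fix_fn callback that stands in for the Gemini call. How the real app counts attempts may also differ.

```python
import random
import subprocess
import sys

MAX_FIX_ATTEMPTS = 5
MUTATION_PROBABILITY = 0.3


def run_code(code):
    """Run source text in a fresh interpreter; return (exit_code, stderr)."""
    completed = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return completed.returncode, completed.stderr


def maybe_mutate(code, rng):
    """With probability MUTATION_PROBABILITY, append an unclosed print( line."""
    if rng.random() < MUTATION_PROBABILITY:
        return code + "\nprint(\n"
    return code


def auto_fix(code, fix_fn, rng=None):
    """Return (working_code, attempts_used), or (None, MAX_FIX_ATTEMPTS) on failure.

    fix_fn(code, error_text) is a hypothetical stand-in for asking the model
    for a corrected version.
    """
    if rng is None:
        rng = random.Random()
    for attempt in range(1, MAX_FIX_ATTEMPTS + 1):
        candidate = maybe_mutate(code, rng)
        exit_code, stderr = run_code(candidate)
        if exit_code == 0:
            return candidate, attempt
        if attempt < MAX_FIX_ATTEMPTS:
            # Send the failing code and its error output to the fixer.
            code = fix_fn(candidate, stderr)
    return None, MAX_FIX_ATTEMPTS
```

Injecting the random generator makes the mutation step reproducible, which helps when checking that the loop really recovers from a deliberately broken run.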

All prints use colorama for colored output.
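
The runtime comparison in step 8 can be approximated with time.perf_counter around a subprocess call. This is a sketch under assumptions: the helper names are hypothetical, the code is run with python -c rather than from generated-code-gemini.py, and the measured time includes interpreter startup.

```python
import subprocess
import sys
import time


def measure_ms(code):
    """Run source text in a fresh interpreter; return (exit_code, elapsed_ms)."""
    start = time.perf_counter()
    completed = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return completed.returncode, elapsed_ms


def choose_version(original, optimized):
    """Keep the optimized code only if it runs cleanly and is faster."""
    _, before = measure_ms(original)
    exit_code, after = measure_ms(optimized)
    if exit_code == 0 and after < before:
        print(
            f"Code running time optimized! It now runs in {after:.0f} milliseconds, "
            f"while before it was {before:.0f} milliseconds"
        )
        return optimized
    # Slower or failing: fall back to the original version.
    return original
```

A single measurement is noisy for short programs, so repeating each run a few times and comparing the best times would be a reasonable refinement.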


Option 2: Fix or explain a GitHub repository

Overview

Option 2 uses gitingest and Gemini 2.5 Pro to analyze a public GitHub repository based on a natural language request from the user.

Flow in detail

When you select option 2:

  1. The app prints:

    === Option 2: fix or explain a GitHub repository ===
    
  2. It asks:

    Give me the full url of a public github repository (or 'q' to go back to the main menu):
    
  3. After you provide a valid github.com URL, the app asks:

    Tell me what you want me to fix/change/explain in that repository (or 'q' to go back to the main menu):
    
  4. The app then uses:

    gitingest.ingest(repo_url, include_submodules=False)

    to build a text digest of the repository. The digest joins:

    • Summary
    • Tree
    • Content

    It also limits the length of the digest using the constant:

    MAX_REPO_DIGEST_CHARS = 200_000

    If the digest is longer, it keeps the first part and adds a note that the digest was truncated.

  5. A detailed prompt is built that includes:

    • Repository URL
    • The digest text
    • The user’s request

    The prompt instructs Gemini to:

    • Use only information from the digest
    • Mention file names and symbols when possible
    • Either propose concrete fixes or detailed explanations
  6. The prompt is sent to Gemini 2.5 Pro (model name gemini-2.5-pro).
    The response is printed directly to the console between:

    === Gemini 2.5 Pro answer ===
    ...
    === End of answer ===
    
  7. If gitingest is not installed, option 2 prints a clear message that it is required and suggests:

    pip install gitingest
    

    Then it returns to the main menu.
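
Steps 3 to 5 can be sketched with the standard library alone. The helper names, the URL rule, the separator between digest parts and the truncation note are all assumptions made for illustration. Only MAX_REPO_DIGEST_CHARS and the three digest parts come from the description. In the real flow, the summary, tree and content values would come from gitingest.ingest.

```python
from urllib.parse import urlparse

MAX_REPO_DIGEST_CHARS = 200_000
# The wording of this note is an assumption.
TRUNCATION_NOTE = "\n\n[Digest truncated to fit the size limit.]"


def is_github_repo_url(url):
    """Accept http(s)://github.com/<owner>/<repo> style URLs (assumed rule)."""
    parsed = urlparse(url.strip())
    parts = [part for part in parsed.path.split("/") if part]
    return (
        parsed.scheme in ("http", "https")
        and parsed.netloc.lower() in ("github.com", "www.github.com")
        and len(parts) >= 2
    )


def build_digest(summary, tree, content, limit=MAX_REPO_DIGEST_CHARS):
    """Join the three digest parts and keep at most `limit` characters of them."""
    digest = "\n\n".join([summary, tree, content])
    if len(digest) > limit:
        digest = digest[:limit] + TRUNCATION_NOTE
    return digest


def build_prompt(repo_url, digest, request):
    """Assemble the three inputs plus the instructions listed in step 5."""
    return (
        f"Repository URL: {repo_url}\n\n"
        f"Repository digest:\n{digest}\n\n"
        f"User request: {request}\n\n"
        "Use only information from the digest. Mention file names and symbols "
        "when possible. Either propose concrete fixes or give a detailed explanation."
    )
```

Because truncation keeps only the first characters, files that appear late in the digest never reach the model.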


Option 3: Screen based realtime coding tips

Overview

Option 3 uses screen capture plus OCR plus GPT 5 mini to act as a realtime “code coach” that watches your screen and gives suggestions on code it sees.

Flow in detail

When you select option 3:

  1. The app prints:

    === Option 3: look at my screen and give realtime coding tips ===
    Perfect. Show me your screen and I will be giving you tips on how to improve the code I see.
    
  2. It checks that:

    • mss and numpy are installed
    • The rapidocr-openvino package is available
    • OPENAI_API_KEY is set and an OpenAI client can be created

    If any of these are missing or fail to initialize, the app prints a [screen-helper] error message and returns to the main menu.

  3. If everything is available, the app:

    • Starts a loop that captures the screen every SCREEN_GRAB_INTERVAL_SECONDS seconds (constant set to 1.0) using mss.
    • Converts the screenshot to a NumPy array and passes it to RapidOCR().
    • Collects the recognized text from all boxes and normalizes it by removing empty lines and trailing spaces.
  4. Once the OCR text is ready, the app decides if it looks like code using a heuristic looks_like_code:

    • It checks for strong markers like def, class, import, from or if __name__ == '__main__':.
    • It looks for common code tokens in multiple lines. If the score passes a threshold, it treats the text as code.
  5. If the current OCR code text is:

    • Not empty
    • Looks like code
    • Different from the previous frame

    then the app:

    • Prints a message that it detected new code on the screen.
    • Sends the code text to GPT 5 mini via the OpenAI client.

    GPT 5 mini receives these instructions:

    • System prompt: you are a senior software engineer that receives code from the screen and should give short, actionable tips for readability, correctness and performance.
    • User prompt: includes the code and asks for concise suggestions, plain text only, no markdown and no bullet points.

    The response from GPT 5 mini is printed to the console under:

    [screen-helper] Suggestions for the code on screen:
    
  6. The loop continues until you press Ctrl+C. In that case the app prints a message and returns to the main menu.
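
The text handling in steps 3 to 5 can be illustrated without any OCR dependency. The sketch below is a guess at the shape of the heuristic, not the real looks_like_code. The regular expressions, the weak token list, the threshold and the helper names normalize and should_send are all assumptions.

```python
import re

# Strong markers named in the description above.
STRONG_MARKERS = (
    re.compile(r"^\s*def\s+\w+\s*\("),
    re.compile(r"^\s*class\s+\w+"),
    re.compile(r"^\s*import\s+\w+"),
    re.compile(r"^\s*from\s+[\w.]+\s+import\s+"),
    re.compile(r"if\s+__name__\s*==\s*['\"]__main__['\"]"),
)
# Weaker hints; this token list and the threshold are illustrative guesses.
CODE_TOKENS = ("(", ")", "=", ":", "[", "]", "{", "}", "return ", "for ", "while ")
SCORE_THRESHOLD = 3


def normalize(text):
    """Drop empty lines and trailing spaces from OCR output."""
    lines = (line.rstrip() for line in text.splitlines())
    return "\n".join(line for line in lines if line.strip())


def looks_like_code(text):
    """Return True on any strong marker, or when enough lines contain code tokens."""
    lines = text.splitlines()
    if any(marker.search(line) for line in lines for marker in STRONG_MARKERS):
        return True
    # Score one point per line that contains at least two weak tokens.
    score = sum(
        1 for line in lines if sum(token in line for token in CODE_TOKENS) >= 2
    )
    return score >= SCORE_THRESHOLD


def should_send(current, previous):
    """True when the frame is non-empty, looks like code and differs from the last one."""
    return bool(current) and looks_like_code(current) and current != previous
```

Comparing against the previous frame avoids sending the same code to the model every second while the screen is unchanged.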


Extra feature: summary and run log

As an extra feature beyond the mandatory tasks, option 1 also adds:

1. Summary block in the console

At the end of option 1, after the auto fix, lint and optimization steps, the app prints:

=== Mega Coder summary ===
User request: <your description here>
Auto fix: success
Lint step: completed, see lint output above for details
Runtime optimization: completed, see timing output above for details
Generated file: generated-code-gemini.py
=====================================

This gives the user a clear overview of what Mega Coder did for this run.

2. Log entries in mega_coder.log

Each successful run of option 1 adds a new entry to mega_coder.log in the project root.
The log includes:

  • Timestamp
  • That option 1 was used
  • The user’s description
  • Fixed pipeline status (auto fix, lint, optimization)
  • Path of the generated file

Example:

[2025-11-22 21:13:45] Option 1
Description: A program that checks if a number is prime and prints all primes up to 100
Auto fix: success
Lint step: completed
Runtime optimization: completed
Generated file: generated-code-gemini.py
---

This log is useful both for debugging and for tracking what the tool was asked to generate over time.
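
An entry in the layout of the example above could be produced as follows. The function names and parameters are hypothetical. The field order and the timestamp format are copied from the example entry.

```python
from datetime import datetime
from pathlib import Path

LOG_PATH = Path("mega_coder.log")


def format_log_entry(description, generated_file, when=None):
    """Build one log entry in the layout of the example above."""
    when = when or datetime.now()
    lines = [
        f"[{when:%Y-%m-%d %H:%M:%S}] Option 1",
        f"Description: {description}",
        "Auto fix: success",
        "Lint step: completed",
        "Runtime optimization: completed",
        f"Generated file: {generated_file}",
        "---",
    ]
    return "\n".join(lines) + "\n"


def append_log_entry(entry, path=LOG_PATH):
    """Append the entry; mode "a" creates the file on first use."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(entry)
```

Opening the file in append mode matches the described behavior that the log is created automatically and grows by one entry per run.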


Limitations and notes

  • Generated programs must not use input() or command line arguments.
  • Only Python standard library imports are allowed in the generated code.
  • The tool depends on external APIs and network connectivity. Failures in those services will cause options 1, 2 or 3 to fail gracefully with error messages.
  • For very large GitHub repositories, the digest is truncated to MAX_REPO_DIGEST_CHARS, so the analysis is based only on the first part of the repository.
