Mega Coder is a Python console application that uses LLMs to:
- Develop a Python program for the user using Gemini 2.5 Flash Lite
- Analyze and fix or explain a public GitHub repository using Gemini 2.5 Pro and gitingest
- Watch your screen in realtime, detect code using OCR, and give coding tips using GPT 5 mini
The app does not accept command line arguments.
All interaction with the user is done through the console menu.
- Project structure
- Requirements
- Installation
- Environment variables and configuration
- Running the app
- Option 1: Develop a Python program
- Option 2: Fix or explain a GitHub repository
- Option 3: Screen based realtime coding tips
- Extra feature: summary and run log
- Limitations and notes
Main files in this exercise:
- mega_coder.py: Main console application that implements all three options and the extra feature.
- generated-code-gemini.py: Output file created and overwritten by option 1. Contains the generated Python program.
- mega_coder.log: Simple text log that records runs of option 1. Created automatically when option 1 is used.
There may also be small helper scripts like gemini_tester.py or chatgpt_response_tester.py, but the core of the exercise is in mega_coder.py.
- Python 3.10 or newer is recommended.
The following packages are used by mega_coder.py:
- Core libraries for this project: python-dotenv, colorama, tqdm, google-generativeai, openai, gitingest, pylint
- Screen capture and OCR for option 3: mss, numpy, rapidocr-openvino
You can install these packages with:
pip install python-dotenv colorama tqdm google-generativeai openai gitingest pylint mss numpy rapidocr-openvino
The exact versions are not hard coded in the code. You can pin them in requirements.txt if needed.
Put mega_coder.py and the rest of the files in a dedicated folder, for example:
/path/to/mega-coder/
From inside the project folder:
python -m venv .venv
On macOS or Linux:
source .venv/bin/activate
On Windows PowerShell:
.venv\Scripts\Activate.ps1
Then install the packages:
pip install python-dotenv colorama tqdm google-generativeai openai gitingest pylint mss numpy rapidocr-openvino
In the project root folder, create a file called .env:
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
Do not commit this file to Git. It should stay local only.
The app uses python-dotenv and loads environment variables from .env automatically:
- GOOGLE_API_KEY
  - Required for Gemini 2.5 Flash Lite and Gemini 2.5 Pro via the google-generativeai library.
  - Used by options 1 and 2.
- OPENAI_API_KEY
  - Required for GPT 5 mini via the openai library.
  - Used by option 3.
If any required key is missing:
- Option 1 or 2 will exit with an error message about GOOGLE_API_KEY.
- Option 3 will print a clear [screen-helper] message that OPENAI_API_KEY is missing and will not start the coaching loop.
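The per-option key checks above could be sketched roughly as follows. This is a minimal illustration, not the code from mega_coder.py: the `REQUIRED_KEYS` table and the `missing_key_for` helper are hypothetical names, and the sketch reads from a plain mapping so it runs without python-dotenv (in the app, the values would come from the environment after .env is loaded).

```python
import os
from typing import Mapping, Optional

# Hypothetical table: which key each menu option needs, per the list above.
REQUIRED_KEYS = {
    "1": "GOOGLE_API_KEY",
    "2": "GOOGLE_API_KEY",
    "3": "OPENAI_API_KEY",
}


def missing_key_for(option: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the key this option lacks, or None if it is set."""
    env = os.environ if env is None else env
    key = REQUIRED_KEYS[option]
    # Treat an empty or whitespace-only value the same as an unset variable.
    return None if env.get(key, "").strip() else key
```

A caller could print the returned name in its error message and skip the option when the result is not None.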
From the project root, with the virtual environment activated:
python mega_coder.py
The app enforces the “no command line arguments” rule.
If you try to run python mega_coder.py something, the program will exit with an error.
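One simple way to enforce this rule is to inspect the argument list at startup. The sketch below is an assumption about how such a check might look; the function name `check_no_cli_args` and the error text are illustrative and not taken from mega_coder.py.

```python
import sys
from typing import Sequence


def check_no_cli_args(argv: Sequence[str]) -> None:
    """Exit with an error if anything follows the script name."""
    if len(argv) > 1:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("Error: Mega Coder does not accept command line arguments.")
```

At the top of the program this would be called as `check_no_cli_args(sys.argv)`.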
When the program starts, it prints:
I’m Mega Coder. What would you like me to do today?
1. Develop a python program.
2. Fix/change something in a Github repository.
3. Look at my screen and give me realtime coding tips.
You can now type:
- 1 and Enter: option 1
- 2 and Enter: option 2
- 3 and Enter: option 3
- q and Enter: quit the program gracefully
All user input is read from standard input using input() inside the mega_coder.py app itself.
Generated programs are required not to use input() or command line arguments.
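The menu input handling can be illustrated with a small validation helper. This is a sketch under assumptions: `parse_choice` and `VALID_CHOICES` are hypothetical names, and accepting an uppercase `Q` is a guess rather than documented behavior.

```python
from typing import Optional

# The four inputs listed in the menu description above.
VALID_CHOICES = {"1", "2", "3", "q"}


def parse_choice(raw: str) -> Optional[str]:
    """Normalize a line read with input() and return it if it is a valid choice."""
    choice = raw.strip().lower()  # assumption: case and surrounding spaces are ignored
    return choice if choice in VALID_CHOICES else None
```

A main loop would call `parse_choice(input())` and show the menu again whenever the result is None.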
Option 1 implements the full pipeline described in the assignment:
- Ask the user for a program description
- Generate Python code with Gemini 2.5 Flash Lite
- Write the code to generated-code-gemini.py
- Run the generated code and collect exit code, stdout and stderr
- If there is an error, ask Gemini to fix the code, up to 5 attempts
- Add random mutations to sometimes break the code on purpose
- Run pylint on the generated file and use Gemini to fix lint issues, up to 3 rounds
- Measure runtime of the final program and try to generate a faster version
- Print a human readable summary and append a log entry to mega_coder.log
When you select option 1:
- The app prints:
  === Option 1: develop a Python program ===
  Describe me which python program you want me to develop.
- You type a description, for example:
  A program that checks if a number is prime and prints all primes up to 100
- The description is sanitized and validated, then used to build a strict prompt for Gemini. The prompt explicitly requires:
  - No command line arguments
  - No calls to input()
  - A complete, runnable Python 3 program
  - Only standard library imports
  - Meaningful assert statements that check the core logic
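To make the prompt requirements concrete, here is a minimal sketch of sanitizing a description and assembling such a prompt. The function names, the length limit `MAX_DESCRIPTION_CHARS`, and the exact prompt wording are assumptions for illustration; the real prompt text in mega_coder.py may differ.

```python
MAX_DESCRIPTION_CHARS = 2000  # hypothetical limit, not taken from the app

# The five requirements listed above, phrased as prompt rules.
PROMPT_RULES = [
    "Do not use command line arguments.",
    "Do not call input().",
    "Return a complete, runnable Python 3 program.",
    "Use only standard library imports.",
    "Include meaningful assert statements that check the core logic.",
]


def sanitize_description(raw: str) -> str:
    """Collapse whitespace, reject empty input, and cap the length."""
    cleaned = " ".join(raw.split())
    if not cleaned:
        raise ValueError("Description must not be empty.")
    return cleaned[:MAX_DESCRIPTION_CHARS]


def build_generation_prompt(description: str) -> str:
    """Combine the sanitized request with the fixed list of rules."""
    lines = ["Write a Python program for this request:", sanitize_description(description), "", "Rules:"]
    lines += [f"- {rule}" for rule in PROMPT_RULES]
    return "\n".join(lines)
```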
- The app calls Gemini 2.5 Flash Lite (model name gemini-2.5-flash-lite) using google-generativeai. The raw response is converted to plain Python code and written to generated-code-gemini.py.
- The program then starts an auto fix loop:
  - Constant: MAX_FIX_ATTEMPTS = 5
  - Before each run, there is a chance (MUTATION_PROBABILITY = 0.3) of injecting a tiny syntax bug into the code by adding an invalid print( line.
  - The generated program is executed by spawning a new Python process using subprocess.run.
  - If the exit code is not zero, which includes a failed assertion, the error details and the current code are sent back to Gemini, which returns a fixed version.
  - This process repeats until the program runs successfully or until 5 attempts are used.
  If after 5 attempts the code still does not run correctly, the app prints:
  Sorry master, I have failed you. I can’t create this program without issues
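The auto fix loop described above can be sketched as follows. This is an illustration under assumptions, not the app's implementation: `maybe_mutate`, `run_code`, and `auto_fix` are hypothetical names, the Gemini call is replaced by a `fix_code` callback, and the 60 second timeout is an added safety guess.

```python
import random
import subprocess
import sys
from pathlib import Path
from typing import Callable, Optional

MAX_FIX_ATTEMPTS = 5
MUTATION_PROBABILITY = 0.3


def maybe_mutate(code: str, rng: random.Random) -> str:
    """With probability MUTATION_PROBABILITY, append an unclosed print( line."""
    if rng.random() < MUTATION_PROBABILITY:
        return code + "\nprint(\n"
    return code


def run_code(code: str, path: Path) -> subprocess.CompletedProcess:
    """Write the code to disk and run it in a fresh Python process."""
    path.write_text(code, encoding="utf-8")
    return subprocess.run(
        [sys.executable, str(path)], capture_output=True, text=True, timeout=60
    )


def auto_fix(
    code: str,
    path: Path,
    fix_code: Callable[[str, str], str],  # stand-in for the Gemini fix request
    rng: random.Random,
) -> Optional[str]:
    """Return a version that exits with code 0, or None after all attempts fail."""
    for _ in range(MAX_FIX_ATTEMPTS):
        candidate = maybe_mutate(code, rng)
        result = run_code(candidate, path)
        if result.returncode == 0:
            return candidate
        # Send the failing code and its stderr to the fixer and try again.
        code = fix_code(candidate, result.stderr)
    return None
```

Because a failed `assert` raises `AssertionError` and ends the process with a non-zero exit code, checking `returncode` covers both crashes and assertion failures.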
- Once a working version is found, it is written to generated-code-gemini.py.
- The app then runs a lint and auto fix loop using pylint:
  - Constant: MAX_LINT_FIX_ATTEMPTS = 3
  - Command: pylint --disable=C,R generated-code-gemini.py
  - If pylint returns issues, the full pylint output and current code are sent to Gemini, which returns an updated version.
  - The updated version is written back to generated-code-gemini.py and linting is repeated, up to 3 rounds.
  If after 3 rounds there are still lint issues, the app prints:
  There are still lint errors/warnings
  If there are no issues, the app prints:
  Amazing. No lint errors/warnings
  During this process, the tqdm progress bar shows the lint fixing rounds.
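A possible shape for the lint loop is sketched below. It is an assumption-laden illustration: `run_pylint` and `lint_fix_loop` are hypothetical names, the Gemini request is replaced by a `fix` callback, the tqdm progress bar is left out, and the final re-lint after the last fix is a design choice of this sketch. It relies on pylint exiting with status 0 only when it reports no messages.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Tuple

MAX_LINT_FIX_ATTEMPTS = 3


def run_pylint(code: str, path: Path) -> Tuple[int, str]:
    """Write the code to disk and lint it with conventions and refactors disabled."""
    path.write_text(code, encoding="utf-8")
    proc = subprocess.run(
        [sys.executable, "-m", "pylint", "--disable=C,R", str(path)],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def lint_fix_loop(
    code: str,
    lint: Callable[[str], Tuple[int, str]],  # returns (exit code, report text)
    fix: Callable[[str, str], str],          # stand-in for the Gemini fix request
) -> Tuple[str, bool]:
    """Return the final code and whether it lints cleanly."""
    for _ in range(MAX_LINT_FIX_ATTEMPTS):
        exit_code, report = lint(code)
        if exit_code == 0:
            return code, True
        code = fix(code, report)
    # Check the result of the last fix so the reported status is accurate.
    exit_code, _ = lint(code)
    return code, exit_code == 0
```

In the app, `lint` could be `lambda c: run_pylint(c, Path("generated-code-gemini.py"))`.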
- After the linting stage, the app measures runtime of the current program:
  - It runs generated-code-gemini.py and measures execution time in milliseconds.
  - Using this timing and the code, it asks Gemini to generate a more efficient version that keeps the same asserts and behavior.
  - It writes the optimized version to generated-code-gemini.py, measures its runtime, and compares times.
  If the optimized version is faster, it prints a message similar to:
  Code running time optimized! It now runs in {after} milliseconds, while before it was {before} milliseconds
  If the optimized version is slower or fails to run, the original version is restored.
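The timing comparison could be implemented along these lines. This is a sketch, not the app's code: `measure_ms` and `keep_optimized` are hypothetical names, the timeout is an assumption, and the measurement includes interpreter startup because the whole child process is timed.

```python
import subprocess
import sys
import time
from typing import Optional


def measure_ms(path: str, timeout: float = 60.0) -> Optional[float]:
    """Run a script and return wall-clock time in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return None
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms if proc.returncode == 0 else None


def keep_optimized(before_ms: Optional[float], after_ms: Optional[float]) -> bool:
    """Keep the optimized version only if both runs succeeded and it is faster."""
    return before_ms is not None and after_ms is not None and after_ms < before_ms
```

When `keep_optimized` returns False, the caller would write the original code back to the file, which matches the restore behavior described above.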
- At the end of the pipeline, the app:
  - Prints a short summary block to the console.
  - Appends a log entry for this run to mega_coder.log.
All prints use colorama for colored output.
Option 2 uses gitingest and Gemini 2.5 Pro to analyze a public GitHub repository based on a natural language request from the user.
When you select option 2:
- The app prints:
  === Option 2: fix or explain a GitHub repository ===
- It asks:
  Give me the full url of a public github repository (or 'q' to go back to the main menu):
- After you provide a valid github.com URL, the app asks:
  Tell me what you want me to fix/change/explain in that repository (or 'q' to go back to the main menu):
- The app then uses:
  gitingest.ingest(repo_url, include_submodules=False)
  to build a text digest of the repository. The digest joins:
  - Summary
  - Tree
  - Content
  It also limits the length of the digest using the constant:
  MAX_REPO_DIGEST_CHARS = 200_000
  If the digest is longer, it keeps the first part and adds a note that the digest was truncated.
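The join and truncate step can be illustrated with a small pure function. This is a sketch under assumptions: `build_digest` is a hypothetical name, and the separator and the wording of the truncation note are guesses. The three inputs stand for the summary, tree, and content strings produced by gitingest.

```python
MAX_REPO_DIGEST_CHARS = 200_000

# Hypothetical wording; the app's actual note may differ.
TRUNCATION_NOTE = "\n\n[Digest truncated: only the first part of the repository is shown.]"


def build_digest(summary: str, tree: str, content: str, limit: int = MAX_REPO_DIGEST_CHARS) -> str:
    """Join the three parts and cut the result to at most `limit` characters plus a note."""
    digest = "\n\n".join([summary, tree, content])
    if len(digest) <= limit:
        return digest
    # Keep the first part only, then tell the model that text is missing.
    return digest[:limit] + TRUNCATION_NOTE
```

Since the summary and tree come first, a truncated digest still carries the overall layout of the repository even when file contents are cut off.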
- A detailed prompt is built that includes:
  - Repository URL
  - The digest text
  - The user’s request
  The prompt instructs Gemini to:
  - Use only information from the digest
  - Mention file names and symbols when possible
  - Either propose concrete fixes or detailed explanations
- The prompt is sent to Gemini 2.5 Pro (model name gemini-2.5-pro). The response is printed directly to the console between:
  === Gemini 2.5 Pro answer ===
  ...
  === End of answer ===
- If gitingest is not installed, option 2 prints a clear message that it is required and suggests:
  pip install gitingest
  Then it returns to the main menu.
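One way to detect a missing optional dependency without importing it is `importlib.util.find_spec`. The sketch below is an assumption about how such a check could be written; `missing_packages` is a hypothetical helper and the app may simply use a try/except around the import instead.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]
```

For option 2 this would be called as `missing_packages(["gitingest"])`, printing the install hint when the list is not empty.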
Option 3 uses screen capture plus OCR plus GPT 5 mini to act as a realtime “code coach” that watches your screen and gives suggestions on code it sees.
When you select option 3:
- The app prints:
  === Option 3: look at my screen and give realtime coding tips ===
  Perfect. Show me your screen and I will be giving you tips on how to improve the code I see.
- It checks that:
  - mss and numpy are installed
  - The rapidocr-openvino package is available
  - OPENAI_API_KEY is set and an OpenAI client can be created
  If any of these are missing or fail to initialize, the app prints a [screen-helper] error message and returns to the main menu.
- If everything is available, the app:
  - Starts a loop that captures the screen every SCREEN_GRAB_INTERVAL_SECONDS seconds (constant set to 1.0) using mss.
  - Converts the screenshot to a NumPy array and passes it to RapidOCR().
  - Collects the recognized text from all boxes and normalizes it by removing empty lines and trailing spaces.
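The normalization step at the end of this list can be sketched on its own, independent of the capture and OCR libraries. The function name `normalize_ocr_lines` is hypothetical, and the input is assumed to be the list of text strings already extracted from the OCR boxes.

```python
from typing import Iterable


def normalize_ocr_lines(texts: Iterable[str]) -> str:
    """Strip trailing spaces, drop empty lines, and join the rest with newlines."""
    cleaned = (text.rstrip() for text in texts)
    # Leading whitespace is kept because indentation matters in Python code.
    return "\n".join(line for line in cleaned if line.strip())
```

Normalizing before comparing frames helps avoid treating small OCR whitespace differences as new code.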
- Once the OCR text is ready, the app decides if it looks like code using a heuristic looks_like_code:
  - It checks for strong markers like def, class, import, from or if __name__ == '__main__':.
  - It looks for common code tokens in multiple lines. If the score passes a threshold, it treats the text as code.
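A heuristic of this kind might look like the sketch below. Only the function name `looks_like_code` and the list of strong markers come from the description above; the regular expression, the token list, the per-line rule, and `SCORE_THRESHOLD` are assumptions chosen for illustration.

```python
import re

# Strong markers: a line that starts like a Python definition or import.
STRONG_MARKERS = re.compile(
    r"^\s*(def\s+\w+\s*\(|class\s+\w+|import\s+\w+|from\s+\w+\s+import\b|if\s+__name__\s*==)",
    re.MULTILINE,
)

# Hypothetical token list and threshold for the weaker, score-based check.
CODE_TOKENS = ("=", "(", ")", ":", "[", "]", "{", "}", "return", "for ", "while ")
SCORE_THRESHOLD = 3


def looks_like_code(text: str) -> bool:
    """Guess whether OCR text is source code."""
    if not text.strip():
        return False
    if STRONG_MARKERS.search(text):
        return True
    # Count lines that contain at least two distinct code tokens.
    score = sum(
        1
        for line in text.splitlines()
        if sum(token in line for token in CODE_TOKENS) >= 2
    )
    return score >= SCORE_THRESHOLD
```

Requiring several token-bearing lines keeps a single sentence with parentheses or a colon from being mistaken for code.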
- If the current OCR code text is:
  - Not empty
  - Looks like code
  - Different from the previous frame
  then the app:
  - Prints a message that it detected new code on the screen.
  - Sends the code text to GPT 5 mini via the OpenAI client.
  GPT 5 mini receives these instructions:
  - System prompt: you are a senior software engineer that receives code from the screen and should give short, actionable tips for readability, correctness and performance.
  - User prompt: includes the code and asks for concise suggestions, plain text only, no markdown and no bullet points.
  The response from GPT 5 mini is printed to the console under:
  [screen-helper] Suggestions for the code on screen:
- The loop continues until you press Ctrl+C. In that case the app prints a message and returns to the main menu.
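The control flow of the coaching loop, including the three conditions and the Ctrl+C exit, can be sketched with the screen capture, code detection, and GPT call passed in as callbacks. Everything here is illustrative: `coach_loop` is a hypothetical name, and the "Detected new code" and "Stopped" messages are placeholders for whatever the app actually prints.

```python
import time
from typing import Callable

SCREEN_GRAB_INTERVAL_SECONDS = 1.0


def coach_loop(
    grab_text: Callable[[], str],      # stand-in for screen capture plus OCR
    is_code: Callable[[str], bool],    # stand-in for the code detection heuristic
    get_tips: Callable[[str], str],    # stand-in for the GPT 5 mini request
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run until Ctrl+C and return how many frames were sent for tips."""
    previous = ""
    sent = 0
    try:
        while True:
            text = grab_text()
            # Only act on non-empty code that differs from the last handled frame.
            if text and is_code(text) and text != previous:
                print("[screen-helper] Detected new code on the screen.")
                print("[screen-helper] Suggestions for the code on screen:")
                print(get_tips(text))
                previous = text
                sent += 1
            sleep(SCREEN_GRAB_INTERVAL_SECONDS)
    except KeyboardInterrupt:
        print("[screen-helper] Stopped. Returning to the main menu.")
    return sent
```

Skipping unchanged frames keeps the number of API requests low while the code on screen stays the same.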
As an extra feature beyond the mandatory tasks, option 1 also adds a console summary and a run log.
At the end of option 1, after the auto fix, lint and optimization steps, the app prints:
=== Mega Coder summary ===
User request: <your description here>
Auto fix: success
Lint step: completed, see lint output above for details
Runtime optimization: completed, see timing output above for details
Generated file: generated-code-gemini.py
=====================================
This gives the user a clear overview of what Mega Coder did for this run.
Each successful run of option 1 adds a new entry to mega_coder.log in the project root.
The log includes:
- Timestamp
- That option 1 was used
- The user’s description
- Fixed pipeline status (auto fix, lint, optimization)
- Path of the generated file
Example:
[2025-11-22 21:13:45] Option 1
Description: A program that checks if a number is prime and prints all primes up to 100
Auto fix: success
Lint step: completed
Runtime optimization: completed
Generated file: generated-code-gemini.py
---
This log is useful both for debugging and for tracking what the tool was asked to generate over time.
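An entry in the format of the example above could be produced and appended as follows. This is a sketch: `format_log_entry` and `append_log_entry` are hypothetical names, and the status lines are hard coded to mirror the example rather than derived from real pipeline results.

```python
from datetime import datetime
from pathlib import Path


def format_log_entry(description: str, when: datetime) -> str:
    """Build one log entry in the same layout as the example above."""
    lines = [
        f"[{when:%Y-%m-%d %H:%M:%S}] Option 1",
        f"Description: {description}",
        "Auto fix: success",
        "Lint step: completed",
        "Runtime optimization: completed",
        "Generated file: generated-code-gemini.py",
        "---",
    ]
    return "\n".join(lines) + "\n"


def append_log_entry(log_path: Path, description: str, when: datetime) -> None:
    """Append the entry; mode "a" creates the file on first use."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(format_log_entry(description, when))
```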
- Generated programs must not use input() or command line arguments.
- Only Python standard library imports are allowed in the generated code.
- The tool depends on external APIs and network connectivity. Failures in those services will cause options 1, 2 or 3 to fail gracefully with error messages.
- For very large GitHub repositories, the digest is truncated to MAX_REPO_DIGEST_CHARS, so the analysis is based only on the first part of the repository.