Foadsf/heedless-gpu-controller

The Heedless GPU Command Center

Run heavy AI training jobs on high-end Cloud GPUs (Tesla P100/T4) for free, controlled from a persistent free-tier micro-VM.

This repository documents the architecture and setup for a "Heedless" Command Center using Oracle Cloud Infrastructure (OCI) and Kaggle.

The Concept

We want a persistent, always-on cloud environment to prototype AI models, but:

  1. Always-Free Cloud VMs usually have no GPUs or weak CPUs.
  2. Free GPU Notebooks (Colab/Kaggle) are ephemeral (sessions die, data is wiped).

The Solution: We build a "Command Center" on a free OCI Micro-VM. It acts as a permanent remote control that dispatches heavy training jobs to Kaggle's powerful GPUs via CLI.

Prerequisites

  1. Oracle Cloud Account: Sign up for the Oracle Cloud Free Tier.
  2. Kaggle Account: Sign up at kaggle.com.
  3. Mobile Phone: Required for verifying your Kaggle account (crucial for unlocking GPU access).

Phase 1: Creating the OCI "Command Center" VM

The goal is to snag an "Always Free" VM. While the Ampere A1 (ARM) instances are best, they are often out of stock. We use the AMD Micro instance as a reliable fallback.

1. Networking (The "Public Subnet" Fix)

The OCI instance creation wizard often glitches and fails to assign a Public IP. We fix this by creating the network first.

  1. Log in to OCI Console.
  2. Go to Networking -> Virtual Cloud Networks.
  3. Click "Start VCN Wizard".
  4. Select "Create VCN with Internet Connectivity".
  5. Name it kaggle-network and click Create.
    • This ensures you have a Public Subnet and an Internet Gateway ready.

2. Launching the Instance

  1. Go to Compute -> Instances -> Create Instance.
  2. Name: kaggle-controller
  3. Image: Click "Change Image" -> Canonical Ubuntu.
    • Recommendation: Choose Canonical Ubuntu 22.04 Minimal.
    • Why: The standard version uses ~500MB RAM. The Minimal version uses ~150MB, leaving more room for your Python scripts on the 1GB RAM Micro instance.
  4. Shape: Click "Change Shape" -> Specialty and Legacy.
    • Select VM.Standard.E2.1.Micro (Always Free-eligible).
    • Specs: 1 OCPU, 1 GB Memory.
  5. Networking:
    • Select "Select existing virtual cloud network".
    • VCN: kaggle-network.
    • Subnet: public subnet-kaggle-network.
    • CRITICAL: Ensure "Assign a public IPv4 address" says Yes.
  6. SSH Keys:
    • Generate a key on your local machine (PowerShell or any terminal): ssh-keygen -t rsa -b 4096
    • Select "Paste public keys" in OCI and paste the content of your .pub file.
  7. Click Create.

3. Connection

Once the instance status is Green (Running), grab the Public IP and connect:

ssh -i /path/to/private/key ubuntu@YOUR_PUBLIC_IP
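Optionally, you can save these connection details in ~/.ssh/config so a plain ssh kaggle-controller works from then on (the hostname and key path below are placeholders; substitute your own):

```
Host kaggle-controller
    HostName YOUR_PUBLIC_IP
    User ubuntu
    IdentityFile ~/.ssh/id_rsa
```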

Phase 2: Configuration

Since we used the "Minimal" image, we need to install the basics.

# 1. Update and install Python/Pip
sudo apt update
sudo apt install python3-pip unzip -y

# 2. Install Kaggle CLI
pip3 install kaggle

# 3. Add local bin to PATH (so you can type 'kaggle' instead of the full path)
echo 'export PATH=$HOME/.local/bin:$PATH' >> ~/.bashrc
source ~/.bashrc

Phase 3: Kaggle Authentication

To control the GPUs, we authenticate using an environment variable.

  1. Obtain your API Token:

    • Go to Kaggle.com -> Settings -> API -> Create New Token.
    • Copy the token string provided (e.g., KGAT_...). Note: If a file downloads, you can ignore it; we only need the token string.
  2. Configure Environment:

    • Run this command to save your token permanently to your shell configuration (replace YOUR_TOKEN_STRING with your actual token):
    echo 'export KAGGLE_API_TOKEN="YOUR_TOKEN_STRING"' >> ~/.bashrc
    source ~/.bashrc
  3. Test:

    kaggle competitions list

    If you see a list of competitions, you are connected.


Phase 4: The GPU Workflow

This is how you dispatch code from the Command Center to Kaggle's GPUs.

1. Create a Script

Write your PyTorch/TensorFlow code in a standard .py file. See examples/000_hello_gpu/main.py.
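As a sketch of what such a script might contain (assuming PyTorch, which comes preinstalled on Kaggle kernels; see the actual file in the repo for the authoritative version):

```python
import torch

# Use the GPU if Kaggle attached one, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")
if device == "cuda":
    print("GPU:", torch.cuda.get_device_name(0))

# A small matrix multiplication to prove the device actually works.
x = torch.rand(1000, 1000, device=device)
y = x @ x
print("Matmul OK, shape:", tuple(y.shape))
```

On a correctly configured GPU kernel, the log should report "Running on: cuda"; seeing "cpu" here is the symptom covered under Troubleshooting below.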

2. Initialize Metadata

Run kaggle kernels init to generate kernel-metadata.json. You must edit this file to enable the GPU.

Crucial Configuration:

{
  "id": "YOUR_KAGGLE_USERNAME/project-name",
  "title": "GPU Test",
  "code_file": "main.py",
  "language": "python",
  "kernel_type": "script",
  "is_private": "true",
  "enable_gpu": "true",
  "enable_internet": "true",
  "dataset_sources": [],
  "kernel_sources": [],
  "competition_sources": []
}
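If you script the workflow, you can also generate this file from Python rather than editing it by hand. A minimal sketch, mirroring the JSON above (replace the id with your own slug):

```python
import json

# Fields mirror kernel-metadata.json above.
metadata = {
    "id": "YOUR_KAGGLE_USERNAME/project-name",
    "title": "GPU Test",
    "code_file": "main.py",
    "language": "python",
    "kernel_type": "script",
    "is_private": "true",
    "enable_gpu": "true",       # this line is what actually requests the GPU
    "enable_internet": "true",
    "dataset_sources": [],
    "kernel_sources": [],
    "competition_sources": [],
}

with open("kernel-metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```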

3. The "Push" (Execute)

kaggle kernels push

4. Check Status & Logs

# Check status
kaggle kernels status YOUR_USERNAME/project-name

# Download logs (only works after status is COMPLETE)
kaggle kernels output YOUR_USERNAME/project-name
cat project-name.log
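If you would rather have the VM wait for the job than poll by hand, a small wrapper around the CLI can do it. This is a sketch: the exact text that kaggle kernels status prints (and the terminal status names) are assumptions here, so adjust the regex if your CLI version differs.

```python
import re
import subprocess
import time

def parse_status(text: str) -> str:
    # Assumes output like: your-user/project-name has status "running"
    m = re.search(r'has status "(\w+)"', text)
    return m.group(1) if m else "unknown"

def wait_for_kernel(slug: str, poll_seconds: int = 30) -> str:
    """Poll `kaggle kernels status <slug>` until the kernel reaches a terminal state."""
    while True:
        out = subprocess.run(
            ["kaggle", "kernels", "status", slug],
            capture_output=True, text=True,
        ).stdout
        status = parse_status(out)
        if status in ("complete", "error"):  # assumed terminal states
            return status
        time.sleep(poll_seconds)
```

Called as wait_for_kernel("YOUR_USERNAME/project-name"), this blocks until the run finishes, after which you can fetch the output as shown above.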

Troubleshooting

"FAILURE: No GPU detected"

If the logs say CUDA is not available, it is usually because your Kaggle account is not phone verified.

  1. Go to Kaggle Settings -> Phone Verification.
  2. Verify your number.
  3. Go to any notebook on the web interface and manually switch the Accelerator to "GPU T4" once to "unlock" the feature.

"Command not found"

Run export PATH=$HOME/.local/bin:$PATH or add it to your .bashrc.
