## Getting Started with SpaceMining

Welcome to SpaceMining — a modern Gymnasium-compatible reinforcement learning (RL) environment for asteroid mining in 2D. This guide helps you install the project, run quick demos, train a baseline PPO agent, generate GIFs, and understand the environment’s core concepts.

- **Audience**: New users, students, researchers evaluating RL agents and reward design.
- **Goal**: Go from zero to a working demo in minutes, then explore training and evaluation.

## 1. Installation

### 1.1. Prerequisites
- Python 3.10+
- A recent pip and virtualenv (or Conda/Mamba) installation
- Optional: GPU/CUDA if training large models (CPU is fine for demos)
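
The prerequisites above can be checked with a short sketch that uses only the standard library. The helper name `check_prerequisites` is hypothetical and not part of SpaceMining; it reports whether the interpreter meets the Python 3.10+ requirement and whether pip is importable.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # minimum version listed in the prerequisites


def check_prerequisites(version_info=sys.version_info):
    """Return a dict describing whether the basic prerequisites are met."""
    return {
        "python_ok": tuple(version_info[:2]) >= MIN_PYTHON,
        "pip_available": importlib.util.find_spec("pip") is not None,
    }


for name, ok in check_prerequisites().items():
    print(f"{name}: {'yes' if ok else 'no'}")
```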

### 1.2. Install from PyPI

In [None]:
!pip install space-mining

Alternatively, install the latest version directly from the GitHub repository:

In [None]:
!pip install git+https://github.com/reveurmichael/space_mining.git

### 1.3. Verify the installation

In [None]:
!python -c "import space_mining; print(getattr(space_mining, '__version__', 'installed'))"

You can also verify the installation from within Python:

In [None]:
import space_mining
from space_mining import make_env

print(getattr(space_mining, "__version__", "installed"))

## 2. Quickstart Demos

### 2.1. Run a random agent (off-screen render)
A randomly acting agent provides a quick sanity check. The example uses `render_mode="rgb_array"`, so it runs without opening a window. The loop stops at the first termination or truncation and prints the total reward.

In [None]:
from space_mining import make_env

env = make_env(render_mode="rgb_array", max_episode_steps=100)
obs, info = env.reset()
total_reward = 0.0
for _ in range(100):
    action = env.action_space.sample()  # uniformly random action
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        break
env.close()
print(f"Total reward: {total_reward:.2f}")

What you should see:
- The episode's total reward printed below the cell
- In rendered frames: a 2D space with a mining robot, asteroids, a mothership, and moving obstacles
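
The loop above generalizes to a small helper that works with any environment following Gymnasium's `reset`/`step` API. In this sketch, `run_episode` and the stand-in `CountdownEnv` are hypothetical names used only for illustration. With SpaceMining installed, you would pass the result of `make_env(...)` and a policy such as `lambda obs: env.action_space.sample()` instead.

```python
def run_episode(env, policy, max_steps):
    """Run one episode and return (total_reward, steps_taken).

    `env` must follow the Gymnasium API: reset() -> (obs, info) and
    step(action) -> (obs, reward, terminated, truncated, info).
    """
    obs, info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


class CountdownEnv:
    """Hypothetical stand-in: pays 1.0 per step and truncates after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon
        return self.t, 1.0, False, truncated, {}


total, steps = run_episode(CountdownEnv(horizon=5), policy=lambda obs: 0, max_steps=100)
print(f"Total reward: {total:.1f} over {steps} steps")  # Total reward: 5.0 over 5 steps
```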


### 2.2. Load a pre-trained agent from Hugging Face and render

In [None]:
from space_mining import make_env
from space_mining.agents.ppo_agent import PPOAgent

env = make_env(render_mode="rgb_array", max_episode_steps=200)
agent = PPOAgent.load_from_hf(
    "LUNDECHEN/space-mining-ppo", filename="final_model.zip", env=env, device="cpu"
)
obs, info = env.reset()
for _ in range(200):
    action = agent.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
env.close()

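Notes:
- This downloads `final_model.zip` from the `LUNDECHEN/space-mining-ppo` repository on Hugging Face and runs a single episode of up to 200 steps.
- `render_mode='rgb_array'` also works on headless servers; call `env.render()` after each step to collect frames if you want to save them.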
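
On Linux, a common heuristic for detecting a headless session is to check for the `DISPLAY` (X11) or `WAYLAND_DISPLAY` environment variable. The sketch below uses that heuristic to choose a render mode. `choose_render_mode` is a hypothetical helper, and it assumes the environment also accepts `render_mode="human"`, which this guide does not demonstrate.

```python
import os
import sys


def choose_render_mode(environ=None, platform=None):
    """Return "rgb_array" on a Linux session without a display, else "human"."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    if platform.startswith("linux"):
        has_display = bool(environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))
        return "human" if has_display else "rgb_array"
    # macOS and Windows desktops normally have a display available.
    return "human"


print(choose_render_mode())
```

The result could then be passed along as `make_env(render_mode=choose_render_mode(), ...)`.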


## 3. Train a PPO Agent

You can use the built-in CLI utility or the programmatic API.

### 3.1. CLI-style (programmatic call under the hood)

In [None]:
!python -m space_mining.agents.train_ppo --total-timesteps 100000 --output-dir runs/ppo --checkpoint-freq 10000 --eval-freq 20000

Outputs:
- Checkpoints and logs under `runs/ppo/`
- Console progress; evaluations at configured intervals

### 3.2. Minimal programmatic call

In [None]:
from space_mining.agents.train_ppo import train_ppo

train_ppo(total_timesteps=100000, output_dir="runs/ppo")

All other hyperparameters fall back to the defaults of `train_ppo`.

Tips:
- Start small (e.g., 300k–1M steps) to validate the pipeline and iterate faster.
- For long runs, consider using a screen/tmux session or a job scheduler.
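
For long runs launched from a script or a job scheduler, it can help to assemble the training command programmatically. The sketch below only builds the argument list from the flags shown in section 3.1. `build_train_command` is a hypothetical helper, not part of SpaceMining.

```python
import shlex
import sys


def build_train_command(total_timesteps, output_dir, checkpoint_freq, eval_freq,
                        python=sys.executable):
    """Assemble the train_ppo CLI invocation as an argument list."""
    return [
        python, "-m", "space_mining.agents.train_ppo",
        "--total-timesteps", str(total_timesteps),
        "--output-dir", output_dir,
        "--checkpoint-freq", str(checkpoint_freq),
        "--eval-freq", str(eval_freq),
    ]


cmd = build_train_command(100_000, "runs/ppo", 10_000, 20_000, python="python")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)`, or the joined string can be pasted into a tmux session or a job script.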


## 4. Generate a Demo GIF

Turn a trajectory into a publication-ready GIF in minutes.

### 4.1. From a checkpoint

In [None]:
!python -m space_mining.scripts.make_gif --checkpoint runs/ppo/final_model.zip --output output_gif/agent.gif --steps 1200 --fps 20

### 4.2. Minimal programmatic call

In [None]:
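import os

from space_mining.scripts.make_gif import generate_trajectory, save_gif

# Roll out the checkpointed policy and collect the frames.
frames = generate_trajectory(
    checkpoint_path="runs/ppo/final_model.zip", num_steps=1200, deterministic=True
)
# Create the output directory in case it does not exist yet, then write the GIF.
os.makedirs("output_gif", exist_ok=True)
save_gif(frames, "output_gif/agent.gif", fps=30)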
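
Playback length follows directly from the frame count and the frame rate: duration = frames / fps. The quick sketch below assumes one frame is recorded per environment step, which is an assumption about `make_gif` rather than something this guide states. `gif_duration_seconds` is a hypothetical helper.

```python
def gif_duration_seconds(num_frames, fps):
    """Playback length of a GIF with `num_frames` frames shown at `fps` frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


print(gif_duration_seconds(1200, 20))  # 60.0 (CLI example in 4.1)
print(gif_duration_seconds(1200, 30))  # 40.0 (programmatic example in 4.2)
```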


## 5. Troubleshooting

- "No module named space_mining" → Ensure you activated the virtualenv and installed with `pip install space-mining`
- Rendering issues on remote Linux → Switch to `render_mode='rgb_array'` or run the script under `xvfb-run`
- Long training times → Start with fewer timesteps and validate the pipeline first
- Memory issues when generating GIFs → Reduce `--steps`, lower `--fps`, or render smaller `rgb_array` frames if the environment supports it
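
To see why long trajectories can exhaust memory, note that an uncompressed RGB frame takes height × width × 3 bytes. The sketch below estimates the footprint and thins a frame list by keeping every `stride`-th frame. Both helper names and the 800 × 800 frame size are illustrative assumptions; the actual frame size depends on the environment's renderer.

```python
def estimate_frames_mib(num_frames, height, width, channels=3):
    """Approximate memory (MiB) of uint8 frames held in RAM before encoding."""
    return num_frames * height * width * channels / (1024 ** 2)


def subsample(frames, stride):
    """Keep every `stride`-th frame to cut memory use and file size."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return frames[::stride]


# 1200 frames at an assumed 800x800 resolution: about 2197 MiB.
print(f"{estimate_frames_mib(1200, 800, 800):.0f} MiB")
# Keeping every 2nd frame halves the count; halve the fps to keep the same duration.
print(len(subsample(list(range(1200)), 2)))  # 600
```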


## 6. Frequently Asked Questions (FAQ)

- **Can I wrap the environment with Gymnasium wrappers?** Yes, standard wrappers for logging, normalization, or frame stacking should work.
- **What’s the action range?** Continuous thrust ∈ [-1, 1] for x/y, plus a mining activation ∈ [0, 1].
- **How do I change the episode length?** Pass `max_episode_steps` to `make_env`.
- **How do I render off-screen?** Use `render_mode='rgb_array'`.
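
When post-processing policy outputs, the action ranges above can be enforced before calling `env.step`. The minimal sketch below assumes the action is ordered as `(thrust_x, thrust_y, mine)`. That ordering and the helper name `clip_action` are assumptions, so check `env.action_space` for the actual layout.

```python
def clip_action(action):
    """Clamp (thrust_x, thrust_y, mine) into [-1, 1], [-1, 1], and [0, 1]."""

    def clamp(value, low, high):
        return max(low, min(high, value))

    thrust_x, thrust_y, mine = action
    return (
        clamp(thrust_x, -1.0, 1.0),
        clamp(thrust_y, -1.0, 1.0),
        clamp(mine, 0.0, 1.0),
    )


print(clip_action((1.7, -0.3, -0.2)))  # (1.0, -0.3, 0.0)
```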


## 7. Next Steps

- Explore the `docs/` directory of the repository (https://github.com/reveurmichael/space_mining) for in-depth guidance:
  - `docs/installation.md`: installation options, extras, caveats
  - `docs/examples.md`: more runnable snippets
  - `docs/stable-baseline3.md`: SB3 integration tips
  - `docs/faq.md`: additional questions
- Try customizing reward structures or evaluation protocols for your research.
- Compare policies: random vs PPO vs ablations.