LM-Nav: Vision-Language Navigation in Gazebo Simulation

A Vision-Language Navigation (VLN) pipeline that uses CLIP and an LLM to navigate a TurtleBot3 robot through a simulated house from natural-language instructions.

Architecture

User Instruction → LLM Landmark Extractor → CLIP Scorer → Graph Search → Nav2 Execution
       ↓                    ↓                    ↓              ↓              ↓
 "Go to kitchen"     ["kitchen",          Score nodes     Find optimal    Drive robot
                      "refrigerator"]     with CLIP       path through    via Nav2
                                          similarity      topological     actions
                                                          graph

Pipeline Components

| Component | File | Description |
|---|---|---|
| LLM Extractor | `lmnav/llm_extractor.py` | Extracts landmarks from natural language (Ollama/spaCy/OpenAI) |
| CLIP Scorer | `lmnav/clip_scorer.py` | Scores graph node images against landmark descriptions |
| Graph Search | `lmnav/graph_search.py` | Finds optimal path through topological graph |
| Visualizer | `lmnav/visualizer.py` | Generates walk visualization images |
| House Explorer | `scripts/explore_house.py` | Drives robot through house, captures node images |
| Pipeline Runner | `scripts/run_pipeline.py` | Runs the full VLN pipeline (offline) |
| Walk Executor | `scripts/execute_walk.py` | Drives the robot along the planned path |
| Ego View | `scripts/ego_view.py` | First-person camera view from the robot |
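
To illustrate what "scoring node images against landmark descriptions" can look like, here is a minimal sketch in plain Python. It assumes image and text embeddings have already been computed by CLIP. The function names, the toy 3-D vectors, and the temperature of 100 are illustrative assumptions, not the contents of `lmnav/clip_scorer.py`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def score_nodes(node_embeddings, landmark_embedding, temperature=100.0):
    """Return a softmax distribution over nodes for one landmark.

    node_embeddings: {node_id: vector}, hypothetical precomputed image embeddings
    landmark_embedding: vector for the landmark text
    temperature: scale applied to similarities before the softmax (assumed value)
    """
    sims = {n: temperature * cosine(v, landmark_embedding)
            for n, v in node_embeddings.items()}
    peak = max(sims.values())  # subtract the max for numerical stability
    exps = {n: math.exp(s - peak) for n, s in sims.items()}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}

# Toy embeddings standing in for real CLIP outputs.
nodes = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.7, 0.7, 0.0]}
kitchen = [0.9, 0.1, 0.0]
probs = score_nodes(nodes, kitchen)
best = max(probs, key=probs.get)  # node whose image best matches the landmark
```

Each landmark gets its own distribution over nodes, which is the form a graph search stage would consume.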

Prerequisites

System Requirements

OS: Ubuntu 22.04 (native or via WSL2/Distrobox)
GPU: NVIDIA GPU with driver support (tested on RTX 4050)
RAM: 8GB+ recommended
Disk: ~5GB for ROS2 + dependencies

Software Dependencies

ROS2 Humble (full desktop install)
Gazebo Classic 11 (installed in Step 2 via ros-humble-gazebo-ros-pkgs)
Nav2 (ROS2 navigation stack)
TurtleBot3 packages
Conda (Miniconda/Anaconda)
Ollama (optional, for LLM landmark extraction)

Installation Guide

Step 1: Install ROS2 Humble

If using Distrobox/WSL2, run these inside the Ubuntu container.

# Set locale
sudo apt update && sudo apt install locales
sudo locale-gen en_US en_US.UTF-8
sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
export LANG=en_US.UTF-8

# Add ROS2 repository
sudo apt install software-properties-common
sudo add-apt-repository universe
sudo apt update && sudo apt install curl -y
sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2/ubuntu $(. /etc/os-release && echo $UBUNTU_CODENAME) main" | sudo tee /etc/apt/sources.list.d/ros2.list > /dev/null

# Install ROS2 Humble Desktop
sudo apt update
sudo apt install ros-humble-desktop -y

Step 2: Install Nav2 & TurtleBot3 Packages

sudo apt install -y \
    ros-humble-navigation2 \
    ros-humble-nav2-bringup \
    ros-humble-turtlebot3-gazebo \
    ros-humble-turtlebot3-description \
    ros-humble-turtlebot3-navigation2 \
    ros-humble-gazebo-ros-pkgs \
    ros-humble-cv-bridge \
    python3-colcon-common-extensions

Step 3: Create Conda Environment

conda create -n dl_env python=3.10 -y
conda activate dl_env

# Install Python dependencies
cd VLN_Solution_Hack60
pip install -r requirements.txt

# Download spaCy model (for NLP fallback)
python -m spacy download en_core_web_sm

Step 4: Install Ollama (Optional — for LLM landmark extraction)

curl -fsSL https://ollama.com/install.sh | sh
ollama serve &        # Start in background
ollama pull llama3    # Download the model

If you don't install Ollama, set llm_backend: "spacy" in config/pipeline_config.yaml.
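
The fallback decision can also be made programmatically. The sketch below probes the default Ollama address (`http://localhost:11434`) and returns a backend name. The function names and the `"ollama"` backend string are assumptions; only `"spacy"` is confirmed above as a valid `llm_backend` value.

```python
import urllib.error
import urllib.request

def ollama_available(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an Ollama server answers at base_url."""
    try:
        # /api/tags lists locally installed models; any 200 reply means the server is up.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False

def choose_backend(base_url="http://localhost:11434"):
    """Pick "ollama" (assumed name) when the server is reachable, otherwise "spacy"."""
    return "ollama" if ollama_available(base_url) else "spacy"
```

Calling `choose_backend()` before running the pipeline tells you which value to put in config/pipeline_config.yaml.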

Step 5: Clone the AWS Small House World

This should already be included in the repository under aws-robomaker-small-house-world/.

If missing:

cd VLN_Solution_Hack60
git clone https://github.com/aws-robotics/aws-robomaker-small-house-world.git

Running the Pipeline

The pipeline runs in five phases (0 to 4) across three terminals. Phases 2 and 3 share Terminal 2.

Phase 0: Pre-flight Check

Make sure you're inside the Ubuntu environment (Distrobox/WSL2) and activate the conda env:

# If using Distrobox:
distrobox enter ubuntu

# Activate conda & ROS2:
conda activate dl_env
source /opt/ros/humble/setup.bash

# Navigate to project:
cd /path/to/VLN_Solution_Hack60

Phase 1: Launch Simulation (Terminal 1)

bash launch_sim.sh

This single command handles everything:

✅ Sets NVIDIA GPU rendering environment variables
✅ Sets GAZEBO_PLUGIN_PATH and GAZEBO_MODEL_PATH
✅ Sets TURTLEBOT3_MODEL=waffle
✅ Launches Gazebo + Nav2 + RViz
✅ Waits for Gazebo to be ready → spawns the robot (anti-topple model)
✅ Tunes Nav2 costmap for indoor navigation (reduced inflation)

Wait until you see:

🎉 Simulation is fully running!
   Robot: TurtleBot3 Waffle at (-2.0, -0.5)
   Costmap: tuned for indoor doorways

Phase 2: Explore the House (Terminal 2)

conda activate dl_env
source /opt/ros/humble/setup.bash
cd /path/to/VLN_Solution_Hack60

python scripts/explore_house.py

This drives the robot through 55 predefined waypoints covering:

Hallway, Living Room, Kitchen, Bathroom, Bedroom, Fitness Room

At each waypoint, it captures a first-person image and records the pose. Output:

data/aws_house_graph/
├── node_000.png ... node_054.png    # First-person images
└── poses.json                       # Robot poses at each node

⏱️ Takes ~10-15 minutes for a full exploration.
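
The captured poses become the nodes of the topological graph. The layout of poses.json is not documented here, so the sketch below starts from an in-memory `{node_id: (x, y)}` mapping and connects nodes that lie within a distance threshold. `build_graph` and the 1.5 m threshold are illustrative assumptions, not the adapter's actual logic.

```python
import math

def build_graph(poses, max_edge_length=1.5):
    """Connect every pair of nodes within max_edge_length metres of each other.

    poses: {node_id: (x, y)}; the real poses.json layout may differ.
    Returns {node_id: {neighbour_id: distance}}.
    """
    graph = {n: {} for n in poses}
    ids = sorted(poses)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            d = math.dist(poses[a], poses[b])
            if d <= max_edge_length:
                graph[a][b] = d  # undirected edge, stored in both directions
                graph[b][a] = d
    return graph

# Toy poses; the first matches the spawn position reported by launch_sim.sh.
poses = {0: (-2.0, -0.5), 1: (-1.0, -0.5), 2: (0.0, -0.5), 3: (5.0, 3.0)}
graph = build_graph(poses)
```

A pure distance threshold ignores walls, so a real graph would more likely link consecutive waypoints or check traversability against the map.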

Phase 3: Run the VLN Pipeline (Terminal 2 — after exploration)

python scripts/run_pipeline.py -i "Go to the kitchen and find the refrigerator"

This runs offline (no robot movement):

Extracts landmarks from the instruction using LLM/spaCy
Scores all node images with CLIP against each landmark
Finds the optimal path through the topological graph
Saves the planned walk to output/planned_walk.json
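
The path-finding step can be sketched as follows. This is a deliberately simplified greedy variant, not the optimizer in `lmnav/graph_search.py`: for each landmark in order it picks the highest-scoring node and chains Dijkstra shortest paths between them. The graph and score values are toy data.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra over {node: {neighbour: cost}}; returns the node list or None."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt, weight in graph[node].items():
            if nxt not in visited:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return None

def plan_walk(graph, start, landmark_scores):
    """Greedy walk: go to the best-scoring node for each landmark in order."""
    walk = [start]
    for scores in landmark_scores:
        target = max(scores, key=scores.get)
        leg = shortest_path(graph, walk[-1], target)
        if leg is None:
            raise ValueError(f"node {target} is unreachable from {walk[-1]}")
        walk.extend(leg[1:])  # skip the first node, it is already in the walk
    return walk

graph = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0, 3: 2.5},
    2: {1: 1.0, 3: 1.0},
    3: {1: 2.5, 2: 1.0},
}
# One score dictionary per landmark, e.g. "kitchen" then "refrigerator".
landmark_scores = [
    {0: 0.1, 1: 0.1, 2: 0.7, 3: 0.1},
    {0: 0.0, 1: 0.1, 2: 0.2, 3: 0.7},
]
walk = plan_walk(graph, 0, landmark_scores)
```

A joint optimization would trade landmark score against travel distance instead of committing to the single best node per landmark.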

Phase 4: Execute the Walk (Terminal 3)

conda activate dl_env
source /opt/ros/humble/setup.bash
cd /path/to/VLN_Solution_Hack60

python scripts/execute_walk.py

This reads output/planned_walk.json and drives the robot along the planned path using Nav2.
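
Nav2 goals carry orientation as a quaternion, so a planned waypoint given as (x, y, yaw) has to be converted before it is sent. The sketch below shows that conversion using a plain dictionary that mirrors the fields of a `geometry_msgs/PoseStamped`. The helper names are hypothetical, and whether planned_walk.json stores yaw at all is an assumption.

```python
import math

def yaw_to_quaternion(yaw):
    """Quaternion (x, y, z, w) for a rotation of yaw radians about the z axis."""
    return (0.0, 0.0, math.sin(yaw / 2.0), math.cos(yaw / 2.0))

def waypoint_to_goal(x, y, yaw, frame_id="map"):
    """Plain-dict stand-in for the fields of a geometry_msgs/PoseStamped goal."""
    qx, qy, qz, qw = yaw_to_quaternion(yaw)
    return {
        "header": {"frame_id": frame_id},
        "pose": {
            "position": {"x": x, "y": y, "z": 0.0},
            "orientation": {"x": qx, "y": qy, "z": qz, "w": qw},
        },
    }

# Goal at the spawn position, facing along +y (yaw of 90 degrees).
goal = waypoint_to_goal(-2.0, -0.5, math.pi / 2)
```

In a ROS2 node the same values would be copied into a real `PoseStamped` message before being handed to Nav2.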

Environment Variables Reference

launch_sim.sh sets these automatically; they are listed here for manual use:

# ── ROS2 ──
source /opt/ros/humble/setup.bash
export TURTLEBOT3_MODEL=waffle

# ── Gazebo Plugin & Model Paths (CRITICAL; run from the project root, since $(pwd) is used below) ──
export GAZEBO_PLUGIN_PATH=/opt/ros/humble/lib:${GAZEBO_PLUGIN_PATH}
export GAZEBO_MODEL_PATH=/opt/ros/humble/share/turtlebot3_gazebo/models:$(pwd)/aws-robomaker-small-house-world/models:${GAZEBO_MODEL_PATH}

# ── NVIDIA GPU Rendering (for laptop hybrid GPU setups) ──
export __NV_PRIME_RENDER_OFFLOAD=1
export __GLX_VENDOR_LIBRARY_NAME=nvidia
export __VK_LAYER_NV_optimus=NVIDIA_only
export __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/10_nvidia.json
export MESA_GL_VERSION_OVERRIDE=3.3
export OGRE_RTT_MODE=FBO
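
If you launch Gazebo from a Python script instead of a shell, the same path variables can be assembled as a dictionary and passed to `subprocess.run(..., env=env)`. This is a sketch under that assumption; `prepend_path` and `gazebo_env` are hypothetical helpers, not part of the repository. Unlike the shell form, it avoids a trailing `:` when the variable was previously unset.

```python
import os

def prepend_path(env, key, *entries):
    """Prepend entries to a colon-separated variable without leaving empty segments."""
    existing = [p for p in env.get(key, "").split(":") if p]
    env[key] = ":".join(list(entries) + existing)
    return env

def gazebo_env(project_root, base_env=None):
    """Copy the environment and add the Gazebo variables listed above."""
    env = dict(os.environ if base_env is None else base_env)
    env["TURTLEBOT3_MODEL"] = "waffle"
    prepend_path(env, "GAZEBO_PLUGIN_PATH", "/opt/ros/humble/lib")
    prepend_path(
        env,
        "GAZEBO_MODEL_PATH",
        "/opt/ros/humble/share/turtlebot3_gazebo/models",
        os.path.join(project_root, "aws-robomaker-small-house-world", "models"),
    )
    return env

# Start from an empty environment to show exactly what gets added.
env = gazebo_env("/path/to/VLN_Solution_Hack60", base_env={})
```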

Nav2 Costmap Tuning

The default Nav2 costmap parameters are too conservative for indoor navigation. launch_sim.sh automatically applies these tuned values:

| Parameter | Default | Tuned | Why |
|---|---|---|---|
| `inflation_radius` | 0.55 m | 0.15 m | Allows passage through doorways |
| `cost_scaling_factor` | 3.0 | 15.0 | Cost drops off faster from walls |
| `robot_radius` | 0.22 m | 0.12 m | TurtleBot3 fits through tight spaces |

To adjust manually (live, without restart):

ros2 param set /local_costmap/local_costmap inflation_layer.inflation_radius 0.15
ros2 param set /global_costmap/global_costmap inflation_layer.inflation_radius 0.15
ros2 param set /local_costmap/local_costmap inflation_layer.cost_scaling_factor 15.0
ros2 param set /global_costmap/global_costmap inflation_layer.cost_scaling_factor 15.0
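
To see why these values open up doorways, it helps to compute the inflation cost directly. The sketch below follows the exponential decay used by the Nav2 inflation layer as I understand it: cost 254 on the obstacle, 253 out to the inscribed radius, then `252 * exp(-cost_scaling_factor * (distance - inscribed_radius))` out to `inflation_radius`. It assumes the inscribed radius equals `robot_radius`, which holds for a circular footprint.

```python
import math

LETHAL_COST = 254     # cell contains an obstacle
INSCRIBED_COST = 253  # robot centre here would certainly collide

def inflation_cost(distance, inscribed_radius, inflation_radius, cost_scaling_factor):
    """Cost of a cell `distance` metres from the nearest obstacle."""
    if distance == 0:
        return LETHAL_COST
    if distance <= inscribed_radius:
        return INSCRIBED_COST
    if distance > inflation_radius:
        return 0  # outside the inflated band: free space
    factor = math.exp(-cost_scaling_factor * (distance - inscribed_radius))
    return int((INSCRIBED_COST - 1) * factor)

# Cost 0.30 m from a wall, e.g. the centre line of a 0.60 m doorway.
default_cost = inflation_cost(0.30, inscribed_radius=0.22,
                              inflation_radius=0.55, cost_scaling_factor=3.0)
tuned_cost = inflation_cost(0.30, inscribed_radius=0.12,
                            inflation_radius=0.15, cost_scaling_factor=15.0)
```

With the defaults the doorway centre still carries a high cost, while the tuned values leave it free, so the planner stops treating narrow openings as nearly blocked.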

Robot Model: Anti-Topple Mod

The file waffle_stable.model is a modified TurtleBot3 Waffle SDF with anti-topple physics:

| Property | Original | Modified | Effect |
|---|---|---|---|
| Mass | 1.0 kg | 20.0 kg | Too heavy to push over |
| Center of mass (z) | 0.048 m | 0.005 m | Very low center of gravity |
| Roll inertia (ixx) | 0.001 | 1.0 | 1000x resistance to roll |
| Pitch inertia (iyy) | 0.001 | 1.0 | 1000x resistance to pitch |

Project Structure

VLN_Solution_Hack60/
├── launch_sim.sh                    # One-command simulation launcher
├── waffle_stable.model              # Anti-topple robot SDF
├── pipeline_config.yaml             # Quick config (backend selector)
├── requirements.txt                 # Python dependencies
├── README.md                        # This file
│
├── config/
│   └── pipeline_config.yaml         # Full pipeline configuration
│
├── lmnav/                           # Core pipeline modules
│   ├── __init__.py
│   ├── llm_extractor.py             # LLM landmark extraction
│   ├── clip_scorer.py               # CLIP image-text scoring
│   ├── graph_search.py              # Topological graph search
│   ├── visualizer.py                # Walk visualization
│   ├── adapter.py                   # Base environment adapter
│   ├── aws_house_adapter.py         # AWS house specific adapter
│   └── mp3d_adapter.py              # Matterport3D adapter
│
├── scripts/                         # Executable scripts
│   ├── explore_house.py             # Robot house exploration
│   ├── run_pipeline.py              # VLN pipeline (offline)
│   ├── execute_walk.py              # Execute planned walk
│   ├── ego_view.py                  # First-person camera view
│   └── generate_test_data.py        # Test data generator
│
├── aws-robomaker-small-house-world/ # Gazebo world + maps
│   ├── worlds/small_house.world
│   ├── models/                      # House furniture models
│   └── maps/turtlebot3_waffle_pi/
│       ├── map.yaml
│       └── map.pgm
│
├── data/
│   └── aws_house_graph/             # Generated by explore_house.py
│       ├── node_*.png               # First-person images
│       └── poses.json               # Node poses
│
└── output/                          # Generated by run_pipeline.py
    ├── planned_walk.json
    ├── execution_report.json
    └── walk_visualization.png

Troubleshooting

Robot not spawning in Gazebo

Cause: Race condition — spawn_entity runs before Gazebo is ready.
Fix: Use launch_sim.sh which handles this automatically.
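
If you spawn the robot from your own script, one way to avoid this race is to poll until Gazebo advertises its spawn service before calling spawn_entity. The sketch below shows that pattern. How launch_sim.sh actually waits is not documented here, and `wait_for` and `spawn_service_ready` are hypothetical helpers.

```python
import subprocess
import time

def wait_for(predicate, timeout=60.0, interval=1.0):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

def spawn_service_ready():
    """True once `ros2 service list` reports /spawn_entity."""
    try:
        result = subprocess.run(
            ["ros2", "service", "list"],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # ros2 not on PATH or the daemon is not responding yet
    return "/spawn_entity" in result.stdout.split()
```

A caller would run `wait_for(spawn_service_ready)` and only invoke `spawn_entity.py` once it returns True.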

Robot topples over

Cause: Original waffle.model has low mass/inertia.
Fix: launch_sim.sh uses waffle_stable.model with anti-topple physics.

Robot gets stuck at doorways

Cause: Default Nav2 inflation radius (0.55m) is too large.

Fix: launch_sim.sh auto-tunes to 0.15m. For live adjustment:

ros2 param set /local_costmap/local_costmap inflation_layer.inflation_radius 0.15
ros2 param set /global_costmap/global_costmap inflation_layer.inflation_radius 0.15

Gazebo renders with Mesa (CPU) instead of NVIDIA GPU

Cause: Hybrid GPU laptop not routing to NVIDIA.
Fix: launch_sim.sh sets __NV_PRIME_RENDER_OFFLOAD=1 and related vars.
Verify: In Gazebo, check the rendering engine in the bottom status bar.

`GAZEBO_PLUGIN_PATH` not set

Symptoms: Robot spawns but doesn't publish /odom, /scan, etc.
Fix: launch_sim.sh exports GAZEBO_PLUGIN_PATH=/opt/ros/humble/lib.

Ollama not running

Symptoms: run_pipeline.py fails with connection error.
Fix: Start Ollama: ollama serve &
Alternative: Use spaCy backend: set llm_backend: "spacy" in config/pipeline_config.yaml.

Quick Reference Commands

# === Kill everything ===
killall gzserver gzclient 2>/dev/null; pkill -f ros2

# === Delete and respawn robot (without restarting Gazebo) ===
ros2 service call /delete_entity gazebo_msgs/srv/DeleteEntity '{name: "turtlebot3_waffle"}'
sleep 2
ros2 run gazebo_ros spawn_entity.py -entity turtlebot3_waffle \
    -file $(pwd)/waffle_stable.model -x -2.0 -y -0.5 -z 0.01

# === Check robot status ===
ros2 topic info /odom                          # Should show 1 publisher
ros2 topic echo /odom --once | head -15        # Check position

# === List Gazebo models ===
ros2 service call /get_model_list gazebo_msgs/srv/GetModelList '{}'

# === Tune costmap live ===
ros2 param set /local_costmap/local_costmap inflation_layer.inflation_radius 0.15
ros2 param set /global_costmap/global_costmap inflation_layer.inflation_radius 0.15
