baljo/Inspector_Rover

Autonomous Checkpoint Inspection Rover

This project came out of a fairly practical problem. I wanted a small rover that could drive to a few fixed checkpoints, stop in a repeatable way, take a picture, and decide whether something looked wrong. In theory, a full ROS navigation stack should have been the obvious solution. In practice, on this hardware, it turned into more complexity and instability than I wanted for a small inspection robot.

Going in the other direction did not work either. Simple encoder replay was easy enough, but not accurate enough on its own. The rover could get close, but not close enough to trust the camera view for repeatable inference. So I ended up building a simpler checkpoint-based system instead: teach the route once, replay it later, verify the checkpoint with LiDAR, align with an AprilTag, capture an image, and run inference locally on the rover.

In the current proof-of-concept mission, the rover visits three checkpoints. Two of them use Edge Impulse Visual Anomaly Detection to check door-related conditions, and one uses a standard classification model to check whether a stove has been left on.

This is not meant to compete with general robot navigation. It is a narrower approach for a narrower job. But for fixed inspection routes, such as home safety rounds, facility inspection, or other checkpoint-based monitoring tasks, that tradeoff can be worth it: less infrastructure, less tuning, more repeatable camera views, and a system that is easier to understand, debug, and reproduce.

Demonstration

A short GIF is shown below; if it is not visible, see this YouTube video. Both videos were created by stitching together the still pictures the rover itself took every 2 seconds, giving an astonishing frame rate of 0.5 FPS!


Table of Contents

  1. Overview
  2. Hardware Requirements
  3. Rover Installation and Setup
  4. Data Collection and Cameras
    4.1. OAK Camera
    4.1.1. Capture Images for VAD
    4.1.2. OAK to Edge Impulse Program
    4.2. Pan and Tilt USB Camera
    4.2.1. Capture Images for Classification
    4.2.2. Teleop Program
  5. Route Teach and Replay
    5.1. Navigation Strategy
    5.2. Teach the Route
    5.3. Replay the Route
  6. Edge Impulse Models
    6.1. Visual Anomaly Detection Project
    6.1.1. Create an Impulse
    6.1.2. Train
    6.2. Classification Project
    6.2.1. Create an Impulse
    6.2.2. Train
    6.3. Deploy Models
  7. Run the Full Mission
    7.1. Results
    7.1.1. Anyone Left the Doors Open or Blocked?
    7.1.2. Did I Forget to Turn Off the Stove?
  8. Example Applications
  9. Tradeoffs Compared to ROS Navigation Frameworks
  10. Conclusion

1. Overview

The rover performs autonomous inspection missions consisting of:

  1. Replay previously recorded encoder motion
  2. Verify checkpoint identity using LiDAR signature matching
  3. Align orientation using AprilTags
  4. Capture RGB inspection image
  5. Run Edge Impulse VAD or classification inference locally
  6. Store inference results and inspection image

This enables repeatable condition monitoring at fixed inspection points without requiring external infrastructure or complex navigation frameworks.
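
As a rough illustration, the six steps above can be chained per checkpoint as shown below. This is only a sketch: `CheckpointResult`, `run_checkpoint`, and the step names are hypothetical and not taken from the repository, and stopping at the first failed step is an assumption about sensible behaviour rather than a description of the real mission runner.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CheckpointResult:
    name: str
    steps_done: List[str] = field(default_factory=list)
    failed_step: str = ""  # empty string means every step succeeded


def run_checkpoint(name: str, steps: Dict[str, Callable[[], bool]]) -> CheckpointResult:
    """Run the steps in insertion order and stop at the first one that fails."""
    result = CheckpointResult(name=name)
    for step_name, step in steps.items():
        if not step():
            result.failed_step = step_name
            break
        result.steps_done.append(step_name)
    return result


# Hypothetical step names mirroring the numbered list above.
STEP_NAMES = [
    "replay_motion",
    "verify_lidar_signature",
    "align_apriltag",
    "capture_image",
    "run_inference",
    "store_result",
]

# Stub steps standing in for the real hardware calls; each one just succeeds.
steps = {step_name: (lambda: True) for step_name in STEP_NAMES}
result = run_checkpoint("cp1", steps)
print(result.steps_done, result.failed_step or "all steps ok")
```

Keeping each step behind a simple callable makes it easy to test the sequencing on a desk without the rover attached.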


2. Hardware Requirements

Tested platform:

  • Waveshare UGV Rover PI ROS, including:
    • the rover itself
    • Raspberry Pi 4/5
    • LD19 LiDAR
    • OAK camera
    • Pan and Tilt USB Camera
  • AprilTags (recommended size: 10 x 10 cm or larger)
    • Print out the tags on white paper and fasten them close to the checkpoints

What are AprilTags?

  • AprilTags are conceptually similar to QR Codes, in that they are a type of two-dimensional bar code. However, they are designed to encode far smaller data payloads (between 4 and 12 bits), allowing them to be detected more robustly and from longer ranges. Further, they are designed for high localization accuracy: you can compute the precise 3D position of the AprilTag with respect to the camera (Source). In practice this means that with some calibration you'll know the distance from the camera to the tag, horizontal and vertical angles, and rotation. The final accuracy depends on the camera resolution, tag size, the real distance to the tag, and visibility.
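
To get a feeling for how tag size, camera resolution, and distance relate, here is a small worked example using the plain pinhole camera model. It is a simplification, not what AprilTag libraries do internally (they estimate a full 3D pose). The 69 degree horizontal field of view and the 50 pixel tag width are assumed values for illustration only; the 10 cm tag size and the 640 pixel image width are from this README.

```python
import math


def focal_length_px(image_width_px: int, hfov_deg: float) -> float:
    """Focal length in pixels from image width and horizontal field of view."""
    return (image_width_px / 2) / math.tan(math.radians(hfov_deg) / 2)


def tag_distance_m(tag_size_m: float, tag_width_px: float, focal_px: float) -> float:
    """Pinhole estimate: distance = focal length * real size / apparent size."""
    return focal_px * tag_size_m / tag_width_px


focal = focal_length_px(image_width_px=640, hfov_deg=69.0)  # HFOV is an assumption
near = tag_distance_m(tag_size_m=0.10, tag_width_px=100.0, focal_px=focal)
far = tag_distance_m(tag_size_m=0.10, tag_width_px=50.0, focal_px=focal)
print(f"tag 100 px wide: {near:.2f} m, tag 50 px wide: {far:.2f} m")
```

The example also shows why larger tags help: at the same distance a bigger tag covers more pixels, so a one pixel detection error matters less.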


3. Rover Installation and Setup

Connect to the onboard Raspberry Pi over your normal field workflow, for example via SSH, serial console, or a directly attached keyboard and display. Newcomers are strongly advised to follow the official rover instructions first.

  • Clone this repository onto the Raspberry Pi and work from the project root directory.
  • Install the required Python packages in the environment you will use on the rover:
pip install edge_impulse_linux opencv-contrib-python pillow numpy pyserial pyyaml depthai

4. Data Collection and Cameras

This project uses two camera paths for different parts of the mission: the OAK camera for the two VAD door checkpoints, and the pan and tilt USB camera for the stove on/off checkpoint.

4.1. OAK Camera

The OAK camera is used for checkpoint imaging, AprilTag alignment, and VAD-oriented capture flow.

Use the OAK camera when you need:

  • AprilTag-based alignment
  • consistent checkpoint framing
  • capture for VAD checkpoints

4.1.1. Capture Images for VAD

For VAD collection, the main goal is consistency rather than class balancing.

Use the OAK camera to collect:

  • normal-condition images only for the VAD training set
  • approximately similar viewpoints at each checkpoint, accepting normal rover drift in position and orientation
  • enough variation in lighting, small pose offsets, and minor natural scene changes to make the model robust to real mission conditions

oak_to_ei_v321.py is useful when you want to send images directly into Edge Impulse. oak_save_frames.py can still be used for manual local frame capture, but it is not the primary workflow.

4.1.2. OAK to Edge Impulse Program

The primary OAK-side ingestion helper is oak_to_ei_v321.py.

Use it when you want to stream OAK images directly into an Edge Impulse training dataset. The script:

  • opens the OAK camera through DepthAI
  • captures 640 x 360 RGB frames
  • prompts for one label
  • uploads JPEG frames to the Edge Impulse ingestion API at roughly 1 FPS
  • keeps running until you stop it with Ctrl+C
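
For orientation, a single upload to the Edge Impulse ingestion API can be sketched with the standard library only. The endpoint, the `x-api-key` and `x-label` headers, and the `data` form field follow the public ingestion API; `build_upload_request` itself is a hypothetical helper, and oak_to_ei_v321.py may well be implemented differently. The request is built but deliberately not sent, so the sketch runs offline.

```python
import urllib.request
import uuid

INGESTION_URL = "https://ingestion.edgeimpulse.com/api/training/files"


def build_upload_request(jpeg_bytes: bytes, label: str, api_key: str,
                         filename: str = "frame.jpg") -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one JPEG frame and its label."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {
        "x-api-key": api_key,
        "x-label": label,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(
        INGESTION_URL, data=head + jpeg_bytes + tail, headers=headers, method="POST"
    )


# Placeholder bytes and key; a real run would pass an encoded camera frame.
req = build_upload_request(b"\xff\xd8fake-jpeg\xff\xd9", "normal", "ei_dummy_key")
print(req.get_method(), req.full_url, len(req.data), "bytes")
# urllib.request.urlopen(req) would send it; sleeping about 1 s between frames
# would give the roughly 1 FPS upload rate mentioned above.
```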

Before running it, set your Edge Impulse API key either in the shell environment or in a nearby .env file. You can create API keys in Edge Impulse from Dashboard/Keys.

EI_API_KEY=your_edge_impulse_key
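
A minimal sketch of the lookup order described above (shell environment first, then a .env file) could look like this. `load_api_key` is a hypothetical helper written for illustration, not the function used in the repository, and the demo uses a made-up variable name so that a real EI_API_KEY in your environment cannot interfere.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def load_api_key(env_file: Path, name: str = "EI_API_KEY") -> Optional[str]:
    """Return the key from the process environment, else from a .env file, else None."""
    value = os.environ.get(name)
    if value:
        return value
    if env_file.is_file():
        for raw in env_file.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, val = line.partition("=")
            if key.strip() == name:
                return val.strip().strip("\"'")
    return None


# Demo with a throwaway .env file.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# demo\nEI_DEMO_KEY=your_edge_impulse_key\n", encoding="utf-8")
    demo_key = load_api_key(env_path, name="EI_DEMO_KEY")
print(demo_key)
```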

Typical usage:

python3 source/oak_to_ei_v321.py

Then:

  1. enter the label when prompted
  2. let it upload frames for that label
  3. slowly change the rover pose and the surrounding lighting; don't strive for perfection (the real world is usually messy anyway)
  4. stop it with Ctrl+C
  5. restart it with a different label if you want to collect another class or condition

4.2. Pan and Tilt USB Camera

The pan and tilt USB camera is used where a movable viewpoint is needed, especially for checkpoint-specific classification flows such as the stove/lamp style inspection case.

Note: The default USB camera on the rover has a 160° ultra-wide-angle lens, which is perfect for live viewing but far from optimal for classifying a small stove lamp 70 cm above the floor. While it's easy to swap out the camera, or in some cases only the lens, I solved this challenge with traditional computer vision. In short, the stove panel is found with a reference-image feature-matching and homography approach, using OpenCV functions. This of course means that, unless you happen to have the same stove as I do (unlikely), you need to adjust the algorithm to fit your use case.
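
To make the homography step a bit more concrete: once feature matching has produced a 3 x 3 homography (for example from OpenCV's `cv2.findHomography`), locating the panel in the live frame comes down to projecting the corners of the reference image. The sketch below does only that last part in plain Python. The matrix values and the 200 x 80 reference size are made up for illustration, and the helper names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def project_points(H: Sequence[Sequence[float]], points: Sequence[Point]) -> List[Point]:
    """Apply a 3x3 homography to 2D points, including the perspective divide."""
    projected = []
    for x, y in points:
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        px = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
        py = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
        projected.append((px, py))
    return projected


def bounding_box(points: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the projected corners."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up homography: scale the reference by 2 and shift it by (100, 50).
H = [[2.0, 0.0, 100.0],
     [0.0, 2.0, 50.0],
     [0.0, 0.0, 1.0]]
reference_corners = [(0.0, 0.0), (200.0, 0.0), (200.0, 80.0), (0.0, 80.0)]

panel_corners = project_points(H, reference_corners)
panel_box = bounding_box(panel_corners)
print(panel_corners)
print(panel_box)
```

With a real homography the projected quadrilateral is usually skewed, which is why the bounding box (or a perspective warp) is taken before cropping.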

4.2.1. Capture Images for Classification

Use the USB pan/tilt camera when the target requires a carefully aimed or adjustable viewpoint.

Recommended flow:

  1. start pt_camera_teleop.py
  2. drive and aim the camera until the target is framed correctly
  3. use p for single images or b for bursts
  4. collect examples for each class, such as on and off, and don't forget to collect background images as well
  5. repeat from slightly different but still realistic viewpoints, and vary the lighting too

For checkpoint-specific classification tasks, this is a more natural capture path than the fixed OAK view.

4.2.2. Teleop Program

The main USB-camera collection tool is pt_camera_teleop.py.

Use it when you want live pan/tilt control, rover movement, and image capture from the USB camera in one interactive loop.

Typical usage:

python3 source/pt_camera_teleop.py

By default, teleop updates a lightweight preview/viewer set under:

  • output/mission_preview/latest.jpg
  • output/mission_preview/latest.json
  • output/mission_preview/view_latest.html

Open view_latest.html in a browser if you want a continuously refreshing live view while teleop is running. If you, like me, are using Visual Studio Code for development, you can install a lightweight HTML viewer extension to get a live video feed (live in this case means 0.5-1 FPS depending on settings).

Important controls from the script:

Pan & Tilt Camera:

  • i tilt up
  • k tilt down
  • j pan left
  • l pan right

Rover Control:

  • e forward
  • d backward
  • s turn left
  • f turn right

Capture images by:

  • p save one timestamped snapshot
  • b save a burst of snapshots

Additional commands:

  • x stop rover now
  • c center the gimbal
  • q quit

While teleop is running, latest.jpg is refreshed automatically and view_latest.html polls latest.json for updates. Burst capture is especially useful for collecting multiple labeled examples quickly.
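
The viewer pattern above can be sketched as follows. The file names match the paths listed earlier, but `publish_preview` and the JSON fields are hypothetical, and the write-to-temp-then-rename approach is one common way to avoid the browser reading a half-written file, not necessarily what the repository does.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def publish_preview(out_dir: Path, jpeg_bytes: bytes, meta: dict) -> None:
    """Write latest.jpg first, then latest.json, each via an atomic rename."""
    out_dir.mkdir(parents=True, exist_ok=True)
    payloads = [
        ("latest.jpg", jpeg_bytes),
        ("latest.json", json.dumps({"updated": time.time(), **meta}).encode("utf-8")),
    ]
    for name, payload in payloads:
        tmp_path = out_dir / (name + ".tmp")
        tmp_path.write_bytes(payload)
        os.replace(tmp_path, out_dir / name)  # atomic on the same filesystem


# Demo in a throwaway directory with placeholder image bytes.
with tempfile.TemporaryDirectory() as tmp:
    preview_dir = Path(tmp) / "output" / "mission_preview"
    publish_preview(preview_dir, b"\xff\xd8demo\xff\xd9", {"source": "teleop"})
    written = sorted(p.name for p in preview_dir.iterdir())
    latest_meta = json.loads((preview_dir / "latest.json").read_text(encoding="utf-8"))
print(written, latest_meta["source"])
```

Writing the image before the JSON means a page that polls latest.json only sees a new timestamp once the matching image is already in place.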


5. Route Teach and Replay

5.1. Navigation Strategy

This project intentionally avoids ROS navigation frameworks in order to reduce system complexity and improve reproducibility on embedded hardware platforms.

Instead, navigation is based on:

  • encoder teach-and-replay motion
  • LiDAR checkpoint signature verification
  • AprilTag alignment for final positioning

This produces stable and repeatable checkpoint inspection behaviour with minimal infrastructure requirements.
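
As an illustration of the final alignment idea, the horizontal bearing to a detected tag can be estimated from where its center falls in the image, and the rover can turn until that bearing is inside a small deadband. This is a sketch under assumptions: the 69 degree field of view and the 1 degree deadband are invented values, and `bearing_to_tag_deg` and `turn_command` are hypothetical names, not functions from the repository.

```python
import math


def bearing_to_tag_deg(tag_center_x_px: float, image_width_px: int, hfov_deg: float) -> float:
    """Horizontal angle to the tag; positive means the tag is right of image center."""
    focal_px = (image_width_px / 2) / math.tan(math.radians(hfov_deg) / 2)
    return math.degrees(math.atan2(tag_center_x_px - image_width_px / 2, focal_px))


def turn_command(bearing_deg: float, deadband_deg: float = 1.0) -> str:
    """Pick a coarse turn direction, or stop when the tag is close enough to center."""
    if abs(bearing_deg) <= deadband_deg:
        return "stop"
    return "right" if bearing_deg > 0 else "left"


centered = bearing_to_tag_deg(320.0, image_width_px=640, hfov_deg=69.0)
at_right_edge = bearing_to_tag_deg(640.0, image_width_px=640, hfov_deg=69.0)
slightly_left = bearing_to_tag_deg(290.0, image_width_px=640, hfov_deg=69.0)
print(turn_command(centered), turn_command(at_right_edge), turn_command(slightly_left))
```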

5.2. Teach the Route

During the teach phase:

  • rover is manually driven between checkpoints
  • encoder ticks are recorded
  • LiDAR checkpoint signatures are captured
  • checkpoint orientations are stored

These recordings define the inspection route.

Use keyboard_cp_teach.py to record a taught route:

python3 source/keyboard_cp_teach.py --record-route --route-out /home/ws/EI_VAD/data/route_teach_latest.yaml

Controls:

  • e forward
  • d backward
  • s turn left
  • f turn right
  • x or space stop
  • r run align
  • c capture checkpoint
  • t toggle route logging on or off
  • q quit and save

Recording flow:

  1. Start the script with --record-route.
  2. Manually drive the rover with e, d, s, and f.
  3. At each checkpoint, stop and press c.
  4. Enter the checkpoint name when prompted, for example cp1.
  5. Optionally enter orient_right_ticks for that checkpoint, or leave it blank.
  6. Repeat for all checkpoints, then press q to save the route.

What is saved:

  • route YAML at --route-out, default /home/ws/EI_VAD/data/route_teach_latest.yaml
  • dense route-signature sidecar JSON
  • route motions, events with cp_capture, and lidar_samples
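
To picture what such a recording might contain, here is a hypothetical in-memory structure. Only the names `cp_capture`, `lidar_samples`, and `orient_right_ticks` appear in this README; every other field name is invented for illustration, and the real YAML schema written by keyboard_cp_teach.py may differ. JSON is used here only to keep the sketch dependency-free.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Motion:
    command: str      # e.g. "forward" or "turn_left" (hypothetical values)
    left_ticks: int   # encoder ticks counted on the left side during the motion
    right_ticks: int  # encoder ticks counted on the right side during the motion


@dataclass
class CheckpointEvent:
    name: str
    motion_index: int  # how many motions had been recorded at capture time
    orient_right_ticks: Optional[int] = None
    type: str = "cp_capture"


@dataclass
class Route:
    motions: List[Motion] = field(default_factory=list)
    events: List[CheckpointEvent] = field(default_factory=list)
    lidar_samples: List[List[float]] = field(default_factory=list)

    def drive(self, command: str, left_ticks: int, right_ticks: int) -> None:
        self.motions.append(Motion(command, left_ticks, right_ticks))

    def capture_checkpoint(self, name: str, scan: List[float],
                           orient_right_ticks: Optional[int] = None) -> None:
        self.events.append(CheckpointEvent(name, len(self.motions), orient_right_ticks))
        self.lidar_samples.append(scan)


route = Route()
route.drive("forward", 1200, 1195)
route.drive("turn_left", -300, 300)
route.capture_checkpoint("cp1", scan=[1.2, 1.3, 2.0, 2.1])
serialized = json.dumps(asdict(route), indent=2)
print(serialized)
```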

Useful note:

  • teach defaults are --speed 0.08 and --turn-speed 0.08; matching replay speed to teach speed helps preserve route geometry

5.3. Replay the Route

Use the current mission runner:

python3 /home/ws/EI_VAD/cp_mission/run_cp_mission.py --config /home/ws/EI_VAD/cp_mission/mission.yaml

For direct route replay without full mission orchestration:

python3 source/replay_taught_route.py --route /home/ws/EI_VAD/data/route_teach_latest.yaml

During autonomous execution:

  1. encoder motion replay moves rover toward checkpoint
  2. LiDAR signature verifies checkpoint identity
  3. AprilTag detection corrects final orientation
  4. RGB inspection image captured
  5. Edge Impulse inference executed
  6. anomaly or classification score stored together with the inspection image
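
One simple way to think about step 2 is to reduce each 360 degree scan to a coarse signature and compare the taught and live signatures while tolerating a small heading error. The sketch below illustrates that idea only; it is not the matching algorithm used in the repository, and the bin count, range clip, shift tolerance, and 0.15 m threshold are all assumed values.

```python
from typing import List, Sequence


def make_signature(ranges_m: Sequence[float], bins: int = 36,
                   max_range_m: float = 8.0) -> List[float]:
    """Average a full scan into coarse angular bins, clipping long ranges."""
    n = len(ranges_m)
    if n < bins:
        raise ValueError("scan has fewer samples than bins")
    signature = []
    for b in range(bins):
        chunk = ranges_m[b * n // bins:(b + 1) * n // bins]
        clipped = [min(r, max_range_m) for r in chunk]
        signature.append(sum(clipped) / len(clipped))
    return signature


def signature_distance(a: Sequence[float], b: Sequence[float], max_shift: int = 2) -> float:
    """Smallest mean absolute difference over a few circular bin shifts."""
    n = len(a)
    best = float("inf")
    for shift in range(-max_shift, max_shift + 1):
        diff = sum(abs(a[i] - b[(i + shift) % n]) for i in range(n)) / n
        best = min(best, diff)
    return best


def is_same_checkpoint(taught: Sequence[float], live: Sequence[float],
                       threshold_m: float = 0.15) -> bool:
    return signature_distance(taught, live) <= threshold_m


# Synthetic scans: the live scan is the taught one rotated by 10 degrees with a
# small constant offset; the other scan is a differently shaped place.
taught_scan = [2.0 + (i % 90) / 90 for i in range(360)]
live_scan = [r + 0.02 for r in taught_scan[10:] + taught_scan[:10]]
other_scan = [4.0] * 360

taught_sig = make_signature(taught_scan)
match_live = is_same_checkpoint(taught_sig, make_signature(live_scan))
match_other = is_same_checkpoint(taught_sig, make_signature(other_scan))
print(match_live, match_other)
```

Allowing a couple of bin shifts matters because encoder replay rarely leaves the rover at exactly the taught heading.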

Live view from inside VS Code:


6. Edge Impulse Models

6.1. Visual Anomaly Detection Project

Once you have collected some images, in this case from the OAK camera, it's time to build the model in Edge Impulse. I recommend that you develop in iterations: start with a hundred images or so, train, and test the model. Most probably you will need to collect more images, and in some cases you might even decide to start over if you made mistakes or the model performs poorly, or not at all.

6.1.1. Create an Impulse

I've found a resolution of 160x160 to work well on the RPi5.

After the impulse is saved, select Image from the left-hand menu, then Save parameters, and finally Generate features.

6.1.2. Train

Set up the model as shown below. I found EfficientNet V2B0 to work best for this project. Then click Save & train.

6.2. Classification Project

As mentioned, in this project the USB-camera is used to check if the stove is on or off. So this is a typical classification project with only three classes: on, off, and background (no_lights).

6.2.1. Create an Impulse

Here too I went for a resolution of 160 x 160, which is the sweet spot on an RPi 5.

Note: As the stove lights are close to the left side of the panel, they were in many cases cropped out when using the default Fit shortest axis resize mode. By changing to Squash this issue got resolved.
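
A small worked example shows why the resize mode matters. It assumes that Fit shortest axis scales the image and then center-crops the longer axis to a square, while Squash scales both axes independently and keeps everything. The 640 x 360 source size and the lamp position are hypothetical; only the 160 x 160 target comes from the text.

```python
from typing import Tuple


def fit_shortest_visible_region(src_w: int, src_h: int) -> Tuple[int, int, int, int]:
    """Source rectangle (x0, y0, x1, y1) that survives a centered square crop."""
    side = min(src_w, src_h)
    x0 = (src_w - side) // 2
    y0 = (src_h - side) // 2
    return x0, y0, x0 + side, y0 + side


def squash_point(x: float, y: float, src_w: int, src_h: int, dst: int) -> Tuple[float, float]:
    """Where a source pixel lands when both axes are scaled independently."""
    return x * dst / src_w, y * dst / src_h


SRC_W, SRC_H, DST = 640, 360, 160  # source size is an assumption
lamp_x, lamp_y = 60, 180           # hypothetical lamp position near the left edge

region = fit_shortest_visible_region(SRC_W, SRC_H)
lamp_kept = region[0] <= lamp_x < region[2] and region[1] <= lamp_y < region[3]
lamp_squashed = squash_point(lamp_x, lamp_y, SRC_W, SRC_H, DST)
print("visible region with Fit shortest axis:", region, "lamp kept:", lamp_kept)
print("lamp position after Squash:", lamp_squashed)
```

The price of Squash is a distorted aspect ratio, which is harmless here because training and inference images are distorted in the same way.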

After the impulse is saved, select Image from the left-hand menu, then Save parameters, and finally Generate features.

6.2.2. Train

As I did not get sufficient accuracy with a MobileNetV2 model, I again tested an EfficientNet model and got very good results even with the smallest B0 variant, and perfect results with B2. As the RPi 5 is not a particularly constrained device, why not use its full capacity?

6.3. Deploy Models

To be able to use the models on the rover, you of course need to deploy them.

  • Navigate to the Deployment menu
  • The two VAD-projects were deployed as EIM-binary files
  • If you want, you can test inferencing on your smartphone by scanning the QR-code. However, as the camera optics are different compared to the rover cameras, don't be surprised if the results are not identical.

  • The classification model was deployed as a TensorFlow Lite file via the Dashboard, using the full float32 file.

  • Regardless of the model used, you'll end up with one file per model. Copy these to the RPi.
  • Check the mission.yaml file for how the models are connected to each checkpoint. As an example, the extract below shows how the stove on/off classification model is connected to checkpoint 2.
...
  model_path: /home/ws/EI_VAD/models/cp2.lite
  tflite_python: /home/ws/ei_tflite/bin/python3
  tflite_norm: none
  precrop: stove_panel
  crop_config: /home/ws/EI_VAD/config/stove_indicator_example.json
  crop_mode: panel
  expand_left: 0.32
  expand_right: 0.12
  expand_top: 0.32
  expand_bottom: 0.18
...
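
My reading of the expand_* values in the extract is that they grow the detected panel box by a fraction of its own width or height on each side before cropping. That interpretation is an assumption made for this sketch, as are the `expand_box` helper, the example panel box, and the 640 x 360 image size; check the repository code for the actual semantics.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def expand_box(box: Box, image_size: Tuple[int, int],
               expand_left: float = 0.0, expand_right: float = 0.0,
               expand_top: float = 0.0, expand_bottom: float = 0.0) -> Box:
    """Grow a box by fractions of its own size and clamp it to the image."""
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    img_w, img_h = image_size
    return (
        max(0, round(x0 - expand_left * width)),
        max(0, round(y0 - expand_top * height)),
        min(img_w, round(x1 + expand_right * width)),
        min(img_h, round(y1 + expand_bottom * height)),
    )


# Fractions copied from the mission.yaml extract; the box itself is made up.
margins = dict(expand_left=0.32, expand_right=0.12, expand_top=0.32, expand_bottom=0.18)
crop = expand_box((200, 100, 400, 200), (640, 360), **margins)
crop_near_edge = expand_box((10, 5, 210, 105), (640, 360), **margins)
print(crop, crop_near_edge)
```

The larger left and top margins would fit a panel whose indicator lights sit near its upper left, which matches the note about the stove lights being close to the left side.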

7. Run the Full Mission

Use the mission runner:

python3 /home/ws/EI_VAD/cp_mission/run_cp_mission.py --config /home/ws/EI_VAD/cp_mission/mission.yaml

The mission flow can publish preview updates to the same viewer pattern:

  • output/mission_preview/latest.jpg
  • output/mission_preview/latest.json
  • output/mission_preview/view_latest.html

Open view_latest.html during a run if you want to monitor the latest mission image, metadata, and overlays.

Expected behaviour:

  • CP1 reached
  • signature verified
  • AprilTag aligned
  • image captured
  • anomaly or classification score reported

7.1. Results

7.1.1. Anyone Left the Doors Open or Blocked?

In these examples from a real mission run, the model detected anomalies when the doors were ajar. The text Anomaly = True is superimposed on the live pictures, and below each picture you'll also see the anomaly score.
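
The overlay text follows from comparing the anomaly score with a threshold. A minimal sketch, assuming a single score per image and a made-up threshold of 5.0 (the real threshold is project-specific and tuned in Edge Impulse); `anomaly_overlay` is a hypothetical helper.

```python
def anomaly_overlay(score: float, threshold: float) -> str:
    """Format the overlay text shown on the inspection image."""
    return f"Anomaly = {score >= threshold}"


THRESHOLD = 5.0  # assumed value for illustration only
door_closed = anomaly_overlay(2.1, THRESHOLD)
door_ajar = anomaly_overlay(9.7, THRESHOLD)
print(door_closed, "|", door_ajar)
```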

7.1.2. Did I Forget to Turn Off the Stove?

As with the VAD checkpoints, the result of the classification model is clearly shown on the picture. In this case the stove was indeed on, as can be seen from the raw image.
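
Turning the three class scores into a decision can be as simple as taking the top class and refusing to decide when the model is not confident. The class names come from the classification project above; the scores, the 0.6 minimum confidence, and the `decide` helper are invented for illustration.

```python
from typing import Dict, Tuple


def decide(scores: Dict[str, float], min_confidence: float = 0.6) -> Tuple[str, float]:
    """Return the top class and its score, or 'uncertain' below the confidence floor."""
    label, score = max(scores.items(), key=lambda item: item[1])
    if score < min_confidence:
        return "uncertain", score
    return label, score


stove_on = decide({"on": 0.91, "off": 0.06, "no_lights": 0.03})
unclear = decide({"on": 0.40, "off": 0.35, "no_lights": 0.25})
print(stove_on, unclear)
```

An explicit uncertain outcome is useful on a rover, because a badly framed image should trigger a re-check rather than a false all-clear.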


8. Example Applications

To give some food for thought, this system can be adapted for:

  • facility inspection robots
  • warehouse monitoring checkpoints
  • home safety patrol robots
  • industrial inspection routines
  • energy infrastructure condition monitoring

9. Tradeoffs Compared to ROS Navigation Frameworks

You might wonder why I didn't use the ROS navigation framework. That was actually the initial plan, and I spent endless hours trying to get it to work consistently. In the end I was fighting both software and hardware, the latter in the sense that I suspect the Sony Murata 18650 Li-Ion batteries do not always deliver enough current, even though they are rated at 30 A. The rover has many subsystems (servos, motors, LiDAR, cameras, ESP32, and Raspberry Pi 5), so there will be large current peaks from time to time. Random issues occurred too frequently, leading to crashes, reboots, and restarted Docker containers.

Eventually I decided to try developing a much simpler navigation method in order to improve reproducibility and reduce system complexity. However, this approach also introduces some limitations compared to full robotic navigation frameworks.

Advantages of this approach:

  • significantly simpler system setup
  • fewer dependencies
  • deterministic checkpoint behaviour
  • well suited for fixed-route inspection tasks
  • easier reproducibility for teaching and experimentation

Limitations compared to ROS navigation frameworks:

  • requires a teach phase before operation
  • does not support dynamic path planning
  • limited obstacle avoidance compared to SLAM-based navigation
  • route flexibility is lower
  • environment changes may require re-teaching checkpoints
  • not intended for general-purpose autonomous navigation

When ROS navigation may be preferable:

  • large environments
  • changing environments
  • multi-room navigation
  • map-based localization requirements
  • dynamic obstacle-rich scenarios
  • multi-robot coordination systems

For repeatable checkpoint inspection missions, however, the lightweight teach-and-replay approach provides a practical and robust alternative with much lower infrastructure requirements.


10. Conclusion

This project shows that a small rover can do useful inspection work without a full navigation stack. By combining teach-and-replay driving, LiDAR checkpoint verification, AprilTag alignment, camera capture, and local Edge Impulse inference, the rover can move between fixed checkpoints, stop in a repeatable pose, and inspect what it sees in a consistent way.

In the proof-of-concept mission, that approach worked well in practice. The door checkpoints could be checked with visual anomaly detection, and the stove checkpoint could be handled with a simple classification model. The result is a system that is easier to understand, easier to debug, and easier to adapt than a much heavier robotics setup, while still being capable enough for real checkpoint-based inspection tasks.

The main benefit is not that this replaces general-purpose robot navigation. It is that for fixed routes and clearly defined inspection points, a simpler system can often be the better tool. It keeps the hardware and software demands reasonable, makes iteration faster, and lowers the barrier to building something that is genuinely useful. For that kind of work, this approach is a solid place to start.

