Real-Time Streaming Adaptation of SAM3

This repository transforms SAM3’s offline video inference into a live, real-time streaming pipeline. Instead of preloading and processing an entire video sequence offline, it ingests frames incrementally and performs per-frame inference on the fly. This allows SAM3 to work with any live video source (e.g. webcams, RTSP streams, YARP ports), enabling online operation and expanding use cases to robotics, teleoperation, live surveillance, AR/VR and real-time content creation.

Installation

This fork uses Pixi for environment management to provide fully reproducible setups with fast and stable dependency resolution.

1. Install Pixi (if needed)

As of December 2025, you can install Pixi on Linux and MacOS via:

curl -fsSL https://pixi.sh/install.sh | sh

Or if you prefer using wget:

wget -qO- https://pixi.sh/install.sh | sh

For Windows or more recent instructions, visit the Pixi installation guide.

2. Create the environment

From the repository root:

pixi install

3. Enter the environment

pixi shell

4. Verify SAM3 loads from this fork

python -c "import sam3; print(sam3.__file__)"

You should see a path pointing to this repository, confirming the editable install.

Quick Start

⚠️ Before using SAM 3, please request access to the checkpoints on the SAM 3 Hugging Face repo. Once accepted, you need to be authenticated to download the checkpoints. You can do this by running the following steps (e.g. hf auth login after generating an access token.)

Demo Notebook

Mirroring the original demo notebooks, examples/sam3_stream_predictor_example.ipynb demonstrates how to run SAM3 in real-time on a video stream. The notebook loads a video file, starts a streaming session, adds a text prompt on frame 0 and pushes frames incrementally, running per-frame inference with optional visualization and FPS reporting.

CLI Script

For a command-line run, use scripts/inference/video_stream.py. It mirrors the notebook flow: opens a live source (webcam/video/YARP), starts a streaming session, adds a text prompt on the first frame and performs per-frame inference (with optional visualization and saving). Note that you can also potentially add more textual prompts in later frames.

Basic webcam example:
- python scripts/inference/video_stream.py --stream_type webcam --webcam_index 0 --viz_results --save_video
Flags of interest:
- --stream_type {webcam|video|yarp}: choose the input source
- --video_path PATH: path to a video file when --stream_type video
- --viz_results: display live overlays
- --save_images / --save_video: store outputs under outputs/<run_id>/
- --run_output_name NAME: set a custom run id (else datetime is used)

Current Limitations and Known Issues

Single-GPU streaming: The provided streaming predictor targets one GPU. Multi-GPU support exists in the base model but isn’t integrated into the streaming predictor yet.
Memory leaking: There seems to be a memory leak during long streaming sessions. The cause is under investigation. I got an OOM after ~5 minutes of streaming at 480p on a NVIDIA GeForce RTX 3090.

Name		Name	Last commit message	Last commit date
Latest commit History 35 Commits
assets		assets
examples		examples
sam3		sam3
scripts		scripts
.gitattributes		.gitattributes
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
README_ORIGINAL.md		README_ORIGINAL.md
README_TRAIN.md		README_TRAIN.md
pixi.lock		pixi.lock
pixi.toml		pixi.toml
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Real-Time Streaming Adaptation of SAM3

Installation

1. Install Pixi (if needed)

2. Create the environment

3. Enter the environment

4. Verify SAM3 loads from this fork

Quick Start

Demo Notebook

CLI Script

Current Limitations and Known Issues

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Real-Time Streaming Adaptation of SAM3

Installation

1. Install Pixi (if needed)

2. Create the environment

3. Enter the environment

4. Verify SAM3 loads from this fork

Quick Start

Demo Notebook

CLI Script

Current Limitations and Known Issues

About

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages