A complete machine learning pipeline for assessing risk in open-world environments using RGB-D vision and Bayesian inference.
This system processes RGB-D video data to generate spatial risk assessments by combining prior knowledge (common-sense understanding) with likelihood observations (real-world data) through Bayesian posterior computation.
The system follows a four-stage Bayesian inference pipeline:
```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ 📹 Video     │ ──→ │ 🧠 Prior     │ ──→ │ 📊 Like-     │ ──→ │ 📈 Post-     │
│ Processing   │     │ Training     │     │ lihood       │     │ erior        │
└──────────────┘     └──────────────┘     └──────────────┘     └──────────────┘
       │                    │                    │                    │
video_processing/ common_sense_priors/ data_deriv_likelihood/    posterior/
```
Bayesian Framework:
P(risk | observations) ∝ P(observations | risk) × P(risk)
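As a minimal illustration of this update rule (a sketch, not the repository's actual implementation; array names are hypothetical), the posterior risk at each pixel can be computed by multiplying prior and likelihood maps and normalizing over the binary risk variable:

```python
import numpy as np

def bayesian_risk_update(prior: np.ndarray, likelihood: np.ndarray) -> np.ndarray:
    """Fuse a prior risk map with an observation likelihood map.

    Both inputs are (H, W) arrays of per-pixel risk probabilities in [0, 1].
    Returns the normalized posterior P(risk | observations) per pixel.
    """
    # Unnormalized posterior: P(obs | risk) * P(risk)
    posterior_risk = likelihood * prior
    # Complementary hypothesis: P(obs | safe) * P(safe)
    posterior_safe = (1.0 - likelihood) * (1.0 - prior)
    # Evidence term: marginalize over the binary risk variable
    evidence = posterior_risk + posterior_safe
    return posterior_risk / np.clip(evidence, 1e-8, None)

# Toy example: strong prior risk, weak observed evidence
prior = np.full((4, 4), 0.8)
likelihood = np.full((4, 4), 0.3)
print(bayesian_risk_update(prior, likelihood)[0, 0])  # ≈ 0.632
```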
Processes RGB-D video data to extract object-level features and risk information.
Key Scripts:
- `process_video.py` - Main processing script with object detection
- `process_video_utils.py` - Utility functions
- `images_to_video.py` - Frame-to-video conversion

📥 Input: RGB-D video data (`.npz` files with RGB, depth, pose)
📤 Output: Object masks, features, manipulation scores per frame
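The `.npz` input format can be inspected directly with NumPy. A minimal sketch, assuming hypothetical key names `rgb`, `depth`, and `pose` (check `data.files` for the actual keys in your recordings):

```python
import numpy as np

# Hypothetical key names; inspect data.files to see what a recording contains.
data = np.load("recording.npz")
print(data.files)                      # list the stored arrays

rgb = data["rgb"]      # e.g. (T, H, W, 3) uint8 color frames
depth = data["depth"]  # e.g. (T, H, W) float32 depth in meters
pose = data["pose"]    # e.g. (T, 4, 4) camera-to-world transforms

print(rgb.shape, depth.shape, pose.shape)
```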
Trains models to encode common-sense knowledge about object risk.
Key Scripts:
- `train_prior.py` - Prior model training
- `model_prior.py` - Model architecture
- `dataset_prior.py` - Dataset handling
- `batch_prior_over_folders.py` - Batch processing

📥 Input: Synthetic risk data, object labels
📤 Output: Trained prior models with risk knowledge
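The real architecture lives in `model_prior.py`; purely as an illustration of the interface, a common-sense prior can be thought of as a small network mapping an object-label embedding to a risk probability. All names below are hypothetical:

```python
import torch
import torch.nn as nn

class ToyRiskPrior(nn.Module):
    """Illustrative stand-in for a common-sense risk prior (not model_prior.py).

    Maps an object-label embedding to a risk probability in [0, 1].
    """
    def __init__(self, embed_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),  # squash to a probability
        )

    def forward(self, label_embedding: torch.Tensor) -> torch.Tensor:
        return self.net(label_embedding)

# Toy usage: a random embedding standing in for an encoded object label
prior = ToyRiskPrior()
risk = prior(torch.randn(1, 64))
print(risk.item())
```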
Extracts risk likelihood from real video observations.
Key Scripts:
- `inference_riskfield_video.py` - Main inference script
- `inference_riskfield_video_lite.py` - Lightweight version
- `build_lookup_likelihood.py` - Lookup table builder
- `likelihood_from_single_frame.py` - Single frame processor
📥 Input: Processed video from `video_processing/`
📤 Output: Risk likelihood maps per frame
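The name `build_lookup_likelihood.py` suggests the likelihood is served from a precomputed table. As a loose, hypothetical sketch of that idea (not the script's actual logic), one could quantize an observed per-pixel feature and index a risk table:

```python
import numpy as np

# Hypothetical lookup table: risk likelihood indexed by a quantized feature.
N_BINS = 32
lookup = np.linspace(0.05, 0.95, N_BINS)  # stand-in for a learned table

def likelihood_from_feature(feature: np.ndarray, lo: float = 0.0, hi: float = 1.0) -> np.ndarray:
    """Map per-pixel scalar features (H, W) to risk likelihoods via the table."""
    bins = ((feature - lo) / (hi - lo) * N_BINS).astype(int)
    bins = np.clip(bins, 0, N_BINS - 1)
    return lookup[bins]

frame_feature = np.random.rand(4, 4)  # stand-in for an observed feature map
print(likelihood_from_feature(frame_feature))
```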
Combines prior knowledge with likelihood observations for final risk assessment.
Key Scripts:
- `combine_likelihood_and_prior.py` - Basic combination (single object)
- `combine_likelihood_and_prior_tree.py` - Advanced version with depth modulation
- `combine_likelihood_and_prior_tree_.py` - Multi-object version without depth
- `posterior_to_ply.py` - 3D point cloud generator

📥 Input: Prior risk maps + likelihood risk maps
📤 Output: Final posterior risk assessments (2D + 3D)
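For intuition on the PLY export, here is a minimal, self-contained sketch of writing a risk-colored point cloud in ASCII PLY format, assuming the posterior has already been back-projected to 3D points (this is illustrative, not the code in `posterior_to_ply.py`):

```python
import numpy as np

def write_risk_ply(path: str, points: np.ndarray, risk: np.ndarray) -> None:
    """Write Nx3 points colored by risk (0 = green, 1 = red) as ASCII PLY."""
    colors = np.stack(
        [risk * 255, (1.0 - risk) * 255, np.zeros_like(risk)], axis=1
    ).astype(np.uint8)
    with open(path, "w") as f:
        # Standard ASCII PLY header for colored vertices
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        for (x, y, z), (r, g, b) in zip(points, colors):
            f.write(f"{x} {y} {z} {r} {g} {b}\n")

# Toy usage: 100 random points with random risk values
write_risk_ply("risk_cloud.ply", np.random.rand(100, 3), np.random.rand(100))
```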
**Pull Images from Docker Hub**

```bash
docker pull bayesianrisk/open_world_risk_main:latest &
docker pull bayesianrisk/open_world_risk_hoist_former:latest &
```

**Create Network for Containers to Communicate**

```bash
docker network create open_world_risk_net
```

**Start Hoist Former Container (Server)**

```bash
docker run --rm \
  --runtime=nvidia --gpus all \
  --network open_world_risk_net \
  -p 5002:5000 \
  -v $(pwd):/workspace -w /workspace \
  -e PYTHONWARNINGS=ignore::FutureWarning \
  -e PYTHONPATH=/workspace/open_world_risk/hoist_former/Mask2Former \
  --name hoist_former \
  bayesianrisk/open_world_risk_hoist_former:latest \
  /bin/bash -c "source /home/ubuntu/miniconda3/etc/profile.d/conda.sh && \
    conda activate hoist2 && \
    cd /workspace/open_world_risk/hoist_former/Mask2Former && \
    pip install -qq -r requirements.txt && \
    cd mask2former/modeling/pixel_decoder/ops && \
    sh make.sh > /dev/null 2>&1 && \
    pip install -qq setuptools==59.5.0 && \
    cd /workspace && \
    python -c 'import mask2former; print(\"🎉 Mask2Former ready!\")' && \
    python open_world_risk/hoist_former/api/endpoints.py"
```

**Start Main Container (Interactive)**

```bash
docker run -it --rm \
  --runtime=nvidia --gpus all \
  --shm-size=8gb \
  --network open_world_risk_net \
  -p 5001:5000 \
  -v $(pwd):/workspace -w /workspace \
  -e PYTHONWARNINGS=ignore::UserWarning,ignore::FutureWarning \
  --name main \
  bayesianrisk/open_world_risk_main:latest \
  bash
```

**Activate Python Environment (inside main container)**

```bash
conda activate feature_splatting2
```

Execute the following steps in order (all commands run inside the main container):

0. **Download Docker Images and Run Containers** - Pull the images, create the network, and start both containers as shown in the setup above.
1. **Video Processing** - Extract object features and risk information from RGB-D video:
   ```bash
   python open_world_risk/video_processing/process_video.py
   ```

2. **Prior Training** - Train models on common-sense risk knowledge:

   ```bash
   python open_world_risk/common_sense_priors/train_prior.py
   ```

3. **Likelihood Extraction** - Extract risk likelihood from processed video:

   ```bash
   python open_world_risk/data_deriv_likelihood/inference_riskfield_video.py
   ```

4. **Posterior Computation** - Compute final posterior risk assessments:

   ```bash
   python open_world_risk/posterior/combine_likelihood_and_prior_tree.py
   ```

5. **3D Visualization** - Convert posterior to 3D point clouds:

   ```bash
   python open_world_risk/posterior/posterior_to_ply.py
   ```

Each pipeline component uses YAML configuration files for easy customization:
| Component | Config File |
|---|---|
| 📹 Video Processing | `video_processing/process_video_config.yaml` |
| 🧠 Prior Training | `common_sense_priors/train_prior_config.yaml` |
| 📊 Likelihood | `data_deriv_likelihood/infer_risk_field_vid_cfg.yaml` |
| 📈 Posterior | `posterior/combine_tree_config.yaml` |
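Each script reads its settings from the matching YAML file, so configs can also be inspected or edited programmatically. A minimal sketch using PyYAML (the top-level keys depend on the actual file):

```python
import yaml

# Path taken from the table above; keys inside depend on the actual config.
cfg_path = "open_world_risk/video_processing/process_video_config.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

print(cfg.keys())  # inspect the available settings before editing
```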
Before running the training scripts, you must configure your Weights & Biases credentials:
- Create a WandB account at wandb.ai (if you don't have one)
- Update the following config files with your WandB username:
- `open_world_risk/train/train_config.yaml` (line 17)
- `open_world_risk/common_sense_priors/train_prior_config.yaml` (line 17)
Replace "userA" with your actual WandB username:
```yaml
wandb:
  project: "open_world_risk_0"
  run_name: "rf_bezier_supervision"
  entity: "your_wandb_username"  # ← Replace "userA" with YOUR username
```

💡 Tip: Set `entity: null` if you're not using a WandB team/organization.
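These fields correspond directly to the arguments of `wandb.init`. A minimal sketch of how they are typically consumed (run `wandb login` first; the exact wiring is in the training scripts):

```python
import wandb

# Values mirroring the YAML block above.
run = wandb.init(
    project="open_world_risk_0",
    name="rf_bezier_supervision",
    entity="your_wandb_username",  # or None if not using a team/organization
)
run.finish()
```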
- 🎯 Multi-object Risk Assessment - Handles multiple objects simultaneously
- 📏 Depth-aware Processing - Incorporates depth information for 3D risk modeling
- 🎨 3D Visualization - Generates PLY point clouds for risk visualization
- 🧩 Modular Design - Each component can be used independently
- 🐳 Docker (to pull the containers with the prebuilt environments)
- 🖥️ NVIDIA GPU (CUDA-compatible), required for training and inference (we used an RTX 4090 for our experiments)