Skip to content

MeetShraddha/EfficientSegNet

Repository files navigation

EfficientSegNet: Real-Time Uncertainty-Aware Point Cloud Segmentation

A PyTorch implementation of EfficientSegNet for real-time instance segmentation of 3D point clouds with integrated Bayesian uncertainty quantification. Designed for robotic applications requiring both speed and confidence estimates.

Features

  • Hierarchical Coarse-to-Fine Architecture: 3.3× speedup over standard learnable region-growing methods
  • Bayesian Uncertainty Quantification:
    • Epistemic uncertainty (model uncertainty) via Monte Carlo dropout
    • Aleatoric uncertainty (data uncertainty) with learned per-point estimates
  • Class-Agnostic: Single model segments objects of any class without shape/size assumptions
  • Real-Time Performance: ~720ms inference on GPU for 4096-point clouds
  • Robot-Ready: Integration with grasp planning and manipulation workflows

Installation

Requirements

  • Python 3.8+
  • PyTorch 1.9+
  • NumPy
  • Matplotlib (for visualization)
  • tqdm

Setup

# Clone or download the implementation
cd efficient_segnet

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install torch numpy matplotlib tqdm

# Verify installation
python -c "import torch; print(f'PyTorch version: {torch.__version__}')"

Quick Start

1. Basic Inference

from inference_efficient_segnet import SegmentationInference, PostProcessor
import numpy as np

# Initialize inference engine
segmenter = SegmentationInference('model_checkpoint.pt', device='cuda')

# Load point cloud (N, 3) array
points = np.random.randn(5000, 3)  # Replace with real data

# Run segmentation
result = segmenter.process_point_cloud(
    points,
    return_uncertainty=True,
    confidence_threshold=0.5,
)

# Extract instances
instances = PostProcessor.extract_instances(result, points)
print(f"Detected {result.num_instances} instances")
print(f"Inference time: {result.inference_time:.3f}s")

2. Training from Scratch

from train_efficient_segnet import Trainer, S3DISDataset, PointCloudCollator
from efficient_segnet_model import EfficientSegNet
import torch
from torch.utils.data import DataLoader

# Create model
model = EfficientSegNet(
    in_channels=3,
    feature_dim=512,
    hidden_dim=256,
    max_instances=32,
    num_mc_samples=5,
)

# Create data loaders
train_dataset = S3DISDataset(split='train', num_points=4096)
val_dataset = S3DISDataset(split='val', num_points=4096)

collator = PointCloudCollator(max_instances=32)
train_loader = DataLoader(train_dataset, batch_size=8, collate_fn=collator, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=8, collate_fn=collator)

# Train
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
trainer = Trainer(model, train_loader, val_loader, device=device, num_epochs=100)
metrics = trainer.train()

3. Robotic Manipulation Pipeline

from inference_efficient_segnet import SegmentationInference, RobotGraspPlanner
from utils_efficient_segnet import PostProcessor
import numpy as np

# Initialize components
segmenter = SegmentationInference('model_checkpoint.pt')
grasp_planner = RobotGraspPlanner(gripper_width=0.1)

# Get point cloud from sensor (e.g., RGB-D camera)
points = get_point_cloud_from_sensor()  # Your sensor interface

# Segment point cloud
result = segmenter.process_point_cloud(points, return_uncertainty=True)

# Extract instances
instances = PostProcessor.extract_instances(result, points, min_points=50)

# Plan grasps for each instance
all_grasps = []
for inst_id, inst_points in instances.items():
    # Get confidence of this instance
    inst_confidence = result.confidence[result.instance_labels == inst_id].mean()
    
    # Skip low-confidence predictions
    if inst_confidence < 0.5:
        continue
    
    # Plan grasps
    grasps = grasp_planner.plan_grasps(inst_id, inst_points, inst_confidence)
    all_grasps.extend(grasps)

# Select best grasp with quality threshold
best_grasp = grasp_planner.select_best_grasp(all_grasps, quality_threshold=0.6)

if best_grasp:
    # Execute grasp with robot
    execute_grasp(best_grasp)

4. Uncertainty Analysis

from inference_efficient_segnet import SegmentationInference
from utils_efficient_segnet import PointCloudVisualizer, EvaluationMetrics
import numpy as np

# Run inference
segmenter = SegmentationInference('model_checkpoint.pt')
result = segmenter.process_point_cloud(points, return_uncertainty=True)

# Visualize uncertainty
fig = PointCloudVisualizer.visualize_uncertainty(
    points,
    result.epistemic_uncertainty,
    result.aleatoric_uncertainty,
)
plt.show()

# Evaluate uncertainty quality
metrics = EvaluationMetrics.compute_all_metrics(
    result.instance_labels,
    ground_truth_labels,
    result.confidence,
)
print(f"Calibration Error: {metrics['ECE']:.4f}")
print(f"ROC-AUC: {metrics['AUC_ROC']:.4f}")

Model Architecture

Hierarchical Encoder

  • Coarse Stage: Downsamples to 20% density via farthest point sampling

    • PointNet-style hierarchical feature extraction
    • Outputs coarse instance proposals and confidence scores
  • Refinement Stage: Confidence-guided processing

    • Only refines high-confidence regions (confidence > 0.7)
    • Reduces computation by 3-5× compared to uniform processing

Region Growing Module

  • Iterative refinement using GRU cells
  • Adaptive iteration counts (2-8 iterations) based on confidence
  • Predicts add/remove probabilities for boundary points

Uncertainty Quantification

  • Epistemic Uncertainty: Monte Carlo dropout with concrete dropout

    • Computed as variance of predictions across T=5 dropout samples
    • Captures model uncertainty about predictions
  • Aleatoric Uncertainty: Learned per-point with probabilistic loss

    • Predicts data-dependent uncertainty
    • Higher for ambiguous points (e.g., boundaries)

Instance Segmentation Head

  • Embedding-based approach for instance discrimination
  • Multi-resolution feature fusion
  • Supports up to 32 instances per point cloud

API Reference

Model Class

model = EfficientSegNet(
    in_channels=3,              # Point cloud feature dimension
    feature_dim=512,            # Hidden feature dimension
    hidden_dim=256,             # GRU hidden dimension
    downsample_ratio=0.2,       # Coarse stage downsampling
    num_mc_samples=5,           # MC dropout samples
    max_instances=32,           # Maximum instances
)

# Forward pass
outputs = model(
    points,                     # (B, N, 3) point coordinates
    return_uncertainty=True,    # Compute uncertainty
    mc_samples=5,              # Number of MC samples
)

# Output keys:
# - instance_labels: (B, N) predicted instance IDs
# - confidence: (B, N) per-point confidence scores
# - epistemic_uncertainty: (B, N) model uncertainty
# - aleatoric_uncertainty: (B, N) data uncertainty
# - embeddings: (B, N, 16) learned representations

Inference Engine

segmenter = SegmentationInference(
    model_path='checkpoint.pt',
    device='cuda',
    num_mc_samples=5,
)

result = segmenter.process_point_cloud(
    points,                     # (N, 3) numpy array
    return_uncertainty=True,    # Compute uncertainties
    confidence_threshold=0.5,   # Filter low-confidence points
    max_points=10000,          # Downsample if needed
)

Evaluation Metrics

metrics = EvaluationMetrics.compute_all_metrics(
    predictions,        # (N,) predicted labels
    targets,           # (N,) ground truth labels
    confidence,        # (N,) confidence scores
)

# Returns: accuracy, mAP, mIoU, mPQ, mSQ, mRQ, AUC_ROC, ECE

Grasp Planning

planner = RobotGraspPlanner(
    gripper_width=0.1,      # Maximum gripper opening (meters)
    max_grasp_quality=1.0,
)

grasps = planner.plan_grasps(
    instance_id=1,          # Instance to grasp
    instance_points=points, # (N, 3) instance point cloud
    confidence=0.8,         # Segmentation confidence
    num_candidates=10,      # Number of grasp candidates
)

# Select best grasp
best_grasp = planner.select_best_grasp(grasps, quality_threshold=0.6)

Performance Benchmarks

Inference Speed

  • Coarse Stage: 50-100ms (downsampled 20% of points)
  • Refinement: 500-600ms (adaptive, confidence-dependent)
  • MC Dropout: +30-40ms per sample
  • Total: ~720ms per 4096-point cloud on single GPU

Accuracy (S3DIS)

Metric ESN Region Growing [4]
mAP 63.8% 64.3%
mPQ 50.7% 51.2%
mSQ 77.9% 78.1%
mRQ 65.2% 65.6%

Robot Task Performance

  • Pick Success Rate: 82.1% (vs 74.2% baseline)
  • Task Success Rate: 71.2% (vs 60.3% baseline)
  • 18% improvement with uncertainty filtering

Configuration Files

Default Configuration

{
  "model": {
    "in_channels": 3,
    "feature_dim": 512,
    "hidden_dim": 256,
    "max_instances": 32,
    "num_mc_samples": 5
  },
  "training": {
    "batch_size": 8,
    "learning_rate": 0.001,
    "weight_decay": 1e-4,
    "num_epochs": 100,
    "num_points": 4096
  },
  "inference": {
    "confidence_threshold": 0.5,
    "num_mc_samples": 5,
    "max_points": 10000
  }
}

Data Format

Input Point Cloud

points = np.array([
    [x1, y1, z1],
    [x2, y2, z2],
    ...
    [xN, yN, zN],
])  # Shape: (N, 3), dtype: float32

Ground Truth Labels (Training)

instance_labels = np.array([
    0,  # Background
    1,  # Instance 1
    1,  # Instance 1
    2,  # Instance 2
    ...
])  # Shape: (N,), dtype: int32

Common Issues and Solutions

Out of Memory (OOM)

# Reduce batch size
train_loader = DataLoader(..., batch_size=4)

# Reduce number of points
result = segmenter.process_point_cloud(
    points, 
    max_points=5000  # Default is 10000
)

# Disable MC dropout during inference
outputs = model(points, return_uncertainty=False)

Poor Uncertainty Calibration

# Increase number of MC samples
model = EfficientSegNet(num_mc_samples=10)

# Use higher learning rate for aleatoric loss
loss_fn = EfficientSegNetLoss(lambda_aleatoric=1.0)

Slow Inference

# Use deterministic mode (single forward pass)
outputs = model(points, return_uncertainty=False)  # ~500ms

# Reduce point cloud resolution
result = segmenter.process_point_cloud(
    points,
    max_points=2000,
)

# Use CPU inference for small batches
model = model.to('cpu')

File Structure

efficient_segnet/
├── efficient_segnet_model.py      # Core model implementation
├── train_efficient_segnet.py       # Training loop and data loading
├── inference_efficient_segnet.py   # Inference engine and robot integration
├── utils_efficient_segnet.py       # Utilities, metrics, visualization
├── example_usage.py                # Complete usage examples
├── README.md                       # This file
└── requirements.txt                # Dependencies

Citation

If you use this implementation, please cite the accompanying research paper:

@article{efficientSegNet2026,
  title={Real-Time Uncertainty-Aware Point Cloud Segmentation for Autonomous Robotic Manipulation},
  author={Shraddha Sharma},
  year={2026}
}

License

MIT License - see LICENSE file for details

Contact and Support

For issues, questions, or contributions:

  1. Check existing issues and documentation
  2. Refer to the research paper for methodology details
  3. Review example scripts in example_usage.py

References

  1. J Chen et al. "LRGNet: Learnable Region Growing for Class-Agnostic Point Cloud Segmentation"
  2. Qi et al. (2017). "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space"
  3. Gal & Ghahramani (2016). "Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning"
  4. Kendall & Gal (2017). "What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?"
  5. Berman et al. (2018). "The Lovász-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-over-Union Measure in Neural Networks"

Changelog

Version 1.0.0 (Initial Release)

  • Core model architecture with hierarchical encoder
  • Bayesian uncertainty quantification (epistemic + aleatoric)
  • Training pipeline with S3DIS/ScanNet support
  • Inference engine with real-time performance
  • Robot manipulation pipeline integration
  • Comprehensive evaluation metrics
  • Visualization and debugging tools

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages