Skip to content

emirhansarioglu/Parallel_ICP_Algorithm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

3D Point Cloud Alignment (ICP)

A C++ and CUDA implementation of the Iterative Closest Point (ICP) algorithm, both with brute force approach and KD-Tree implementation. This project demonstrates point cloud alignment and compares CPU vs GPU performance using parallel computing.

Features

  • CPU Implementation: Uses Eigen for SVD-based rigid transformations
  • CUDA Implementation: GPU-accelerated version with parallel nearest neighbor search, centroid computation, and transformation (Tested on remote GPU server in Linux only)
  • Visualization: Open3D-based replay tool (Tested on Windows only)
  • Performance Comparison: Demonstrates 300x speedup with GPU acceleration on large point clouds

Algorithm Details

ICP Pipeline

  1. Nearest Neighbor Search: Find closest target point for each source point using K-d Tree
  2. Centroid Computation: Calculate centers of mass for both point sets
  3. SVD Transformation: Compute optimal rigid transformation (rotation + translation)
  4. Apply Transform: Update source points
  5. Iterate: Repeat until convergence (50 iterations)

CUDA Optimizations

  • Parallel Nearest Neighbor Search:

Each thread is assigned to process one source point, enabling massive parallelism across the point cloud.

  • Shared Memory Tiling: (icp_cuda_v2)

To minimize slow global memory access, target points are cooperatively loaded into fast, on-chip shared memory in blocks (tiles). This reduces global memory reads by a factor equal to the TILE_SIZE.

  • Memory Alignment with float4: (icp_cuda_v3)

Data structures were shifted from float3 to float4 to ensure 128-bit memory alignment. This allows the GPU to perform perfectly coalesced memory transactions, maximizing effective bandwidth.

  • Shared Memory Reductions:

Centroid computation utilizes a parallel tree-reduction pattern in shared memory to avoid race conditions and safely sum point coordinates.

  • GPU-Accelerated Covariance Matrix:

The 3X3 covariance matrix is calculated using a hybrid approach where the GPU performs the heavy summation of outer products via atomicAdd and the CPU handles the final SVD.

  • Parallel Point Transformation:

Once the optimal rotation and translation are found, a specialized kernel applies the 4X4 transformation matrix to the entire source cloud in a single parallel pass.

Computerphile has videos on both ICP and KD-Tree:

ICP_Algorithm KD-Tree

Prerequisites

For Windows

  • Windows 10/11
  • Visual Studio 2022 (C++ Desktop Development workload)
  • CMake 3.18+
  • Open3D libraries (e.g., D:/Open3D/open3d-devel-windows-amd64-0.19.0/CMake)

For Linux

  • CUDA Toolkit 11.0+ with compatible NVIDIA GPU
  • CMake 3.18+
  • Eigen3 library

Installation

Eigen3 and OPEN3D Setup (for Linux)

sudo apt-get install libeigen3-dev
sudo apt-get install libopen3d-dev

If the sudo command does not work, download pre-built release (https://github.com/isl-org/Open3D/releases)

cd ~
wget https://github.com/isl-org/Open3D/releases/download/v0.18.0/open3d-devel-linux-x86_64-cxx11-abi-0.18.0.tar.xz
tar -xf open3d-devel-linux-x86_64-cxx11-abi-0.18.0.tar.xz

Update CMakeLists.txt to point to it, e.g: set(Open3D_DIR "$ENV{HOME}/open3d-devel-linux-x86_64-cxx11-abi-0.18.0/lib/cmake/Open3D")

OPEN3D Setup (for Windows)

Follow the guide here https://www.open3d.org/docs/release/getting_started.html#c

(Note: Eigen is also installed with Open3D, no need for Eigen installation if you are on Windows)

Update CMakeLists.txt to point to it, e.g: set(Open3D_DIR "D:/Open3D/open3d-devel-windows-amd64-0.19.0/CMake")

Build Instructions

Windows

# Open x64 Native Tools Command Prompt for VS 2022
mkdir build
cd build
cmake ..
cmake --build . --config Release

Linux

mkdir build && cd build
cmake ..
make -j$(nproc)

Usage

1. Generate Target Data

Create a transformed target point cloud:

Windows:

.\build\Release\generate_data.exe data/bun000.ply

Linux:

./build/generate_data data/bun000.ply

Output: data/bun000_target.ply

2. Run ICP Alignment

CPU Version: Remove kd_tree argument for brute force nearest neighbor search

Windows:

.\build\Release\icp_engine.exe data/bun000.ply data/bun000_target.ply kd_tree

Linux:

./build/icp_engine data/bun000.ply data/bun000_target.ply kd_tree

CUDA Version:

Not tested on Windows.

Linux:

./build/icp_cuda data/bun000.ply data/bun000_target.ply
./build/icp_cuda_kdtree data/bun000.ply data/bun000_target.ply

Output: Creates frames/ directory with intermediate alignment steps (iter_0.ply, iter_5.ply, etc.)

3. Visualize Results

.\build\Release\icp_vis.exe data/bun000_target.ply

Not tested on Linux.

Opens an interactive Open3D window showing the alignment animation.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors