Description
Given the successful speedups using CUDA for parts of the Sync3N algorithm, we should implement a similar GPU implementation for building the CL matrix.
For unit-test-sized problems our current implementation is tolerable, but for larger experiments (say 3000 images) it can take 5-6 hours with the current Python implementation. The legacy MATLAB code provided both a CPU and a GPU implementation, though I am not sure how relevant either is to the implementation that exists in Python today (tbd).
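For discussion, here is a minimal sketch of what a GPU path might look like, assuming CuPy and a `pf` array holding each image's polar Fourier transform with shape `(n_img, n_theta, n_rad)`. This is not the existing ASPIRE code path; names, shapes, and the simplified correlation criterion (no shift search, no filtering, full ray range) are illustrative only.

```python
# Hypothetical sketch: batched pairwise ray correlations on the GPU.
import numpy as np
import cupy as cp


def build_clmatrix_gpu(pf):
    n_img, n_theta, n_rad = pf.shape
    pf_gpu = cp.asarray(pf)

    # Normalize each ray so the dot products below act as correlations.
    norms = cp.linalg.norm(pf_gpu, axis=2, keepdims=True)
    pf_gpu = pf_gpu / cp.maximum(norms, 1e-12)

    clmatrix = np.zeros((n_img, n_img), dtype=int)
    for i in range(n_img - 1):
        # Correlate every ray of image i with every ray of images i+1..n-1
        # in one batched contraction on the GPU.
        corr = cp.abs(
            cp.einsum("tr,jur->jtu", cp.conj(pf_gpu[i]), pf_gpu[i + 1:])
        )  # shape (n_img - i - 1, n_theta, n_theta)

        # Best-matching ray pair for each (i, j) image pair.
        flat = corr.reshape(corr.shape[0], -1)
        best = cp.asnumpy(cp.argmax(flat, axis=1))
        cl_i, cl_j = np.unravel_index(best, (n_theta, n_theta))
        clmatrix[i, i + 1:] = cl_i
        clmatrix[i + 1:, i] = cl_j
    return clmatrix
```

The real implementation would need the same detection criterion we use on the CPU; the point of the sketch is just that the inner pairwise correlation reduces to batched matrix products, which is the same structure that gave us the CUDA speedups in Sync3N.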
Another nice feature of the MATLAB code was that it provided a way to store and recall the CL matrix via the workspace. We could consider optionally writing the matrix to disk and providing a method to load it back. I expect that might speed up some development tasks in the future.
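One possible shape for that optional cache is sketched below; the file format (`.npy`) and function names are hypothetical, not an existing API.

```python
# Hypothetical disk cache for the CL matrix.
import os
import numpy as np


def save_clmatrix(clmatrix, path):
    """Write the common-lines matrix to disk (path should end in .npy)."""
    np.save(path, clmatrix)


def load_or_build_clmatrix(path, build_fn):
    """Load a cached CL matrix if present, otherwise build and cache it."""
    if os.path.exists(path):
        return np.load(path)
    clmatrix = build_fn()
    np.save(path, clmatrix)
    return clmatrix
```

Usage would look something like `clmatrix = load_or_build_clmatrix("clmatrix_3000.npy", lambda: build_clmatrix_gpu(pf))`, so repeated runs during development skip the expensive build step.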