Merged
34 changes: 0 additions & 34 deletions .github/workflows/formatter.yml

This file was deleted.

17 changes: 17 additions & 0 deletions .github/workflows/pre-commit.yaml
@@ -0,0 +1,17 @@
name: pre-commit

on:
  pull_request:
  push:
    branches: [master]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - run: |
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends clang-format
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - uses: pre-commit/action@v2.0.3
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,3 +1,3 @@
.vscode/
build/
*.pyc
*.pyc
63 changes: 63 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,63 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.0.1
    hooks:
      - id: trailing-whitespace
        name: (Common) Remove trailing whitespaces
      - id: mixed-line-ending
        name: (Common) Fix mixed line ending
        args: ['--fix=lf']
      - id: end-of-file-fixer
        name: (Common) Remove extra EOF newlines
      - id: check-merge-conflict
        name: (Common) Check for merge conflicts
      - id: requirements-txt-fixer
        name: (Common) Sort "requirements.txt"
      - id: fix-encoding-pragma
        name: (Python) Remove encoding pragmas
        args: ['--remove']
      - id: double-quote-string-fixer
        name: (Python) Fix double-quoted strings
      - id: debug-statements
        name: (Python) Check for debugger imports
      - id: check-json
        name: (JSON) Check syntax
      - id: check-yaml
        name: (YAML) Check syntax
      - id: check-toml
        name: (TOML) Check syntax
  - repo: https://github.com/asottile/pyupgrade
    rev: v2.19.4
    hooks:
      - id: pyupgrade
        name: (Python) Update syntax for newer versions
        args: ['--py36-plus']
  - repo: https://github.com/google/yapf
    rev: v0.31.0
    hooks:
      - id: yapf
        name: (Python) Format with yapf
  - repo: https://github.com/pycqa/isort
    rev: 5.8.0
    hooks:
      - id: isort
        name: (Python) Sort imports with isort
  - repo: https://github.com/pycqa/flake8
    rev: 3.9.2
    hooks:
      - id: flake8
        name: (Python) Check with flake8
        additional_dependencies: [flake8-bugbear, flake8-comprehensions, flake8-docstrings, flake8-executable, flake8-quotes]
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.902
    hooks:
      - id: mypy
        name: (Python) Check with mypy
        additional_dependencies: [tokenize-rt]
  - repo: local
    hooks:
      - id: clang-format
        name: (C/C++/CUDA) Format with clang-format
        entry: clang-format -style=google -i
        language: system
        files: \.(h\+\+|h|hh|hxx|hpp|cuh|c|cc|cpp|cu|c\+\+|cxx|tpp|txx)$
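To check which paths the local `clang-format` hook would pick up, the `files` pattern can be tried against a few file names. This is a minimal sketch: it relies on pre-commit treating `files` as a Python regular expression searched against each path, and both the helper name `is_formatted_by_clang_format` and the sample file names are hypothetical.

```python
import re

# Pattern copied from the clang-format hook's `files` entry.
CLANG_FORMAT_FILES = re.compile(
    r'\.(h\+\+|h|hh|hxx|hpp|cuh|c|cc|cpp|cu|c\+\+|cxx|tpp|txx)$'
)


def is_formatted_by_clang_format(path):
    """Return True if `path` matches the hook's `files` pattern."""
    return CLANG_FORMAT_FILES.search(path) is not None


# Hypothetical file names, for illustration only.
for path in ['src/conv.cu', 'include/hash.cuh', 'setup.py', 'README.md']:
    print(path, is_formatted_by_clang_format(path))
```

Because the pattern is anchored with `$`, only the final extension counts, so a name such as `conv.cu.bak` is skipped.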
30 changes: 1 addition & 29 deletions LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2020 Haotian Tang, Zhijian Liu, Song Han
Copyright (c) 2020-2021 TorchSparse Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -19,31 +19,3 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------- LICENSE FOR MinkowskiEngine --------------------------------
MIT License

Copyright (c) 2020 NVIDIA CORPORATION.
Copyright (c) 2018-2020 Chris Choy (chrischoy@ai.stanford.edu)

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Please cite "4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural
Networks", CVPR'19 (https://arxiv.org/abs/1904.08755) if you use any part
of the code.
151 changes: 51 additions & 100 deletions README.md
@@ -1,141 +1,92 @@
# TorchSparse

## News
TorchSparse is a high-performance neural network library for point cloud processing.

2020/09/20: We released `torchsparse` v1.1, which is significantly faster than our `torchsparse` v1.0 and is also achieves **1.9x** speedup over [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) v0.5 alpha when running MinkUNet18C!

2020/08/30: We released `torchsparse` v1.0.

## Overview
## Installation

We release `torchsparse`, a high-performance computing library for efficient 3D sparse convolution. This library aims at accelerating sparse computation in 3D, in particular the Sparse Convolution operation.
TorchSparse depends on the [Google Sparse Hash](https://github.com/sparsehash/sparsehash) library.

<img src="https://hanlab.mit.edu/projects/spvnas/figures/sparseconv_illustration.gif" width="1080">
* On Ubuntu, it can be installed by

The major advantage of this library is that we support all computation on the GPU, especially the kernel map construction (which is done on the CPU in latest [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) V0.4.3).
```bash
sudo apt-get install libsparsehash-dev
```

## Installation
* On Mac OS, it can be installed by

You may run the following command to install torchsparse.
```bash
brew install google-sparsehash
```

```bash
pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git
```
* You can also compile the library locally (if you do not have the sudo permission) and add the library path to the environment variable `CPLUS_INCLUDE_PATH`.

Note that this library depends on Google's [sparse hash map project](https://github.com/sparsehash/sparsehash). In order to install this library, you may run
The latest released TorchSparse (v1.4.0) can then be installed by

```bash
sudo apt-get install libsparsehash-dev
pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git@v1.4.0
```

on Ubuntu servers. If you are not sudo, please clone Google's codebase, compile it and install locally. Finally, add the path to this library to your `CPLUS_INCLUDE_PATH` environmental variable.

For GPU server users, we currently support PyTorch 1.6.0 + CUDA 10.2 + CUDNN 7.6.2. For CPU users, we support PyTorch 1.6.0 (CPU version), MKLDNN backend is optional.

## Usage

Our [SPVNAS](https://github.com/mit-han-lab/e3d) project (ECCV2020) is built with torchsparse. You may navigate to this project and follow the instructions in that codebase to play around.

Here, we also provide a walk-through on some important concepts in torchsparse.

### Sparse Tensor and Point Tensor

In torchsparse, we have two data structures for point cloud storage, namely `torchsparse.SparseTensor` and `torchsparse.PointTensor`. Both structures has two data fields `C` (coordinates) and `F` (features). In `SparseTensor`, we assume that all coordinates are **integer** and **do not duplicate**. However, in `PointTensor`, all coordinates are **floating-point** and can duplicate.

### Sparse Quantize and Sparse Collate

The way to convert a point cloud to `SparseTensor` so that it can be consumed by networks built with Sparse Convolution or Sparse Point-Voxel Convolution is to use the function `torchsparse.utils.sparse_quantize`. An example is given here:

```python
inds, labels, inverse_map = sparse_quantize(pc, feat, labels, return_index=True, return_invs=True)
```
If you use TorchSparse in your code, please remember to specify the exact version as your dependencies.

where `pc`, `feat`, `labels` corresponds to point cloud (coordinates, should be integer), feature and ground-truth. The `inds` denotes unique indices in the point cloud coordinates, and `inverse_map` denotes the unique index each point is corresponding to. The `inverse map` is used to restore full point cloud prediction from downsampled prediction.
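The roles of the unique indices and the inverse map can be illustrated in plain Python. This is only a sketch of the idea, not TorchSparse's implementation: the function `quantize_sketch`, the `voxel_size` value and the sample points are all assumptions made for illustration, and the ordering of the unique voxels may differ from what the library returns.

```python
import math


def quantize_sketch(points, voxel_size):
    """Voxelize float points and drop duplicates.

    Returns (voxels, indices, inverse_map):
      voxels[k]: integer coordinates of the k-th unique voxel
      indices[k]: index of the first input point that fell into it
      inverse_map[i]: which unique voxel input point i belongs to
    """
    voxel_to_slot = {}
    voxels, indices, inverse_map = [], [], []
    for i, point in enumerate(points):
        voxel = tuple(math.floor(c / voxel_size) for c in point)
        if voxel not in voxel_to_slot:
            voxel_to_slot[voxel] = len(voxels)
            voxels.append(voxel)
            indices.append(i)
        inverse_map.append(voxel_to_slot[voxel])
    return voxels, indices, inverse_map


# Hypothetical points: the first two share a voxel, the third does not.
points = [(0.02, 0.01, 0.00), (0.04, 0.03, 0.01), (0.26, 0.01, 0.00)]
voxels, indices, inverse_map = quantize_sketch(points, voxel_size=0.05)

# A network predicts one label per voxel; the inverse map copies each
# prediction back to every original point.
voxel_predictions = ['road', 'car']
point_predictions = [voxel_predictions[k] for k in inverse_map]
```

Here the downsampled prediction has two entries, while `point_predictions` has one entry per original point.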
## Benchmark

To combine a list of `SparseTensor`s to a batch, you may want to use the `torchsparse.utils.sparse_collate_fn` function.
We compare TorchSparse with [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) (where the latency is measured on NVIDIA GTX 1080Ti):

Detailed results are given in [SemanticKITTI dataset preprocessing code](https://github.com/mit-han-lab/e3d/blob/master/spvnas/core/datasets/semantic_kitti.py) in our [SPVNAS](https://github.com/mit-han-lab/e3d) project.
| | MinkowskiEngine v0.4.3 | TorchSparse v1.0.0 |
| :----------------------- | :--------------------: | :----------------: |
| MinkUNet18C (MACs / 10) | 224.7 ms | 124.3 ms |
| MinkUNet18C (MACs / 4) | 244.3 ms | 160.9 ms |
| MinkUNet18C (MACs / 2.5) | 269.6 ms | 214.3 ms |
| MinkUNet18C | 323.5 ms | 294.0 ms |

### Computation API
## Getting Started

The computation interface in torchsparse is straightforward and very similar to original PyTorch. An example here defines a basic convolution block:
### Sparse Tensor

```python
class BasicConvolutionBlock(nn.Module):
    def __init__(self, inc, outc, ks=3, stride=1, dilation=1):
        super().__init__()
        self.net = nn.Sequential(
            spnn.Conv3d(inc, outc, kernel_size=ks, dilation=dilation, stride=stride),
            spnn.BatchNorm(outc),
            spnn.ReLU(True)
        )

    def forward(self, x):
        out = self.net(x)
        return out
```
Sparse tensor (`SparseTensor`) is the main data structure for point cloud, which has two data fields:
* Coordinates (`coords`): a 2D integer tensor with a shape of N x 4, where the first three dimensions correspond to quantized x, y, z coordinates, and the last dimension denotes the batch index.
* Features (`feats`): a 2D tensor with a shape of N x C, where C is the number of feature channels.

where `spnn`denotes `torchsparse.nn`, and `spnn.Conv3d` means 3D sparse convolution operation, `spnn.BatchNorm` and `spnn.ReLU` denotes 3D sparse tensor batchnorm and activations, respectively. We also support direct convolution kernel call via `torchsparse.nn.functional`, for example:
Most existing datasets provide raw point cloud data with float coordinates. We can use `sparse_quantize` (provided in `torchsparse.utils.quantize`) to voxelize x, y, z coordinates and remove duplicates:

```python
outputs = torchsparse.nn.functional.conv3d(inputs, kernel, stride=1, dilation=1, transpose=False)
coords -= np.min(coords, axis=0, keepdims=True)
coords, indices = sparse_quantize(coords, voxel_size, return_index=True)
coords = torch.tensor(coords, dtype=torch.int)
feats = torch.tensor(feats[indices], dtype=torch.float)
tensor = SparseTensor(coords=coords, feats=feats)
```

where we need to define `inputs`(SparseTensor), `kernel` (of shape k^3 x OC x IC when k > 1, or OC x IC when k = 1, where k denotes the kernel size and IC, OC means input / output channels). The `outputs` is still a SparseTensor.
We can then use `sparse_collate_fn` (provided in `torchsparse.utils.collate`) to assemble a batch of `SparseTensor`'s (and add the batch dimension to `coords`). Please refer to [this example](https://github.com/mit-han-lab/torchsparse/blob/dev/pre-commit/examples/example.py) for more details.
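What adding the batch dimension to `coords` means can be sketched in plain Python. The helper `collate_coords_sketch` is hypothetical and handles coordinates only; it is not the library's `sparse_collate_fn`, which works on `SparseTensor`s and also batches the features.

```python
def collate_coords_sketch(samples):
    """Concatenate per-sample (x, y, z) voxel lists into one N x 4 list.

    The sample's position in the batch is appended as the last column.
    """
    batched = []
    for batch_index, coords in enumerate(samples):
        for x, y, z in coords:
            batched.append((x, y, z, batch_index))
    return batched


# Two hypothetical samples with two voxels and one voxel respectively.
sample_a = [(0, 0, 0), (5, 0, 0)]
sample_b = [(1, 2, 3)]
batched = collate_coords_sketch([sample_a, sample_b])
# batched == [(0, 0, 0, 0), (5, 0, 0, 0), (1, 2, 3, 1)]
```

The last column lets voxels with identical x, y, z coordinates from different samples stay distinct within one batch.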

Detailed examples are given in [here](https://github.com/mit-han-lab/e3d/blob/master/spvnas/core/modules/dynamic_sparseop.py), where we use the `torchsparse.nn.functional` interfaces to implement weight-shared 3D-NAS modules.
### Sparse Neural Network

### Sparse Hashmap API

Sparse hash map query is important in 3D sparse computation. It is mainly used to infer a point's memory location (*i.e.* index) given its coordinates. For example, we use this operation in kernel map construction part of 3D sparse convolution, and also sparse voxelization / devoxelization in [Sparse Point-Voxel Convolution](https://arxiv.org/abs/2007.16100). Here, we provide the following example for hash map API:
The neural network interface in TorchSparse is very similar to PyTorch:

```python
source_hash = torchsparse.nn.functional.sphash(torch.floor(source_coords).int())
target_hash = torchsparse.nn.functional.sphash(torch.floor(target_coords).int())
idx_query = torchsparse.nn.functional.sphashquery(source_hash, target_hash)
from torch import nn
from torchsparse import nn as spnn

model = nn.Sequential(
    spnn.Conv3d(in_channels, out_channels, kernel_size),
    spnn.BatchNorm(out_channels),
    spnn.ReLU(True),
)
```

In this example, `sphash` is the function converting integer coordinates to hashing. The `sphashquery(source_hash, target_hash)` performs the hash table lookup. Here, the hash map has key `target_hash` and value corresponding to point indices in the target point cloud tensor. For each point in the `source_coords`, we find the point index in `target_coords` which has the same coordinate as it.
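The lookup described here can be mimicked on the CPU with an ordinary dictionary. This is a sketch under stated assumptions, not the library's kernel: `hash_query_sketch` is a hypothetical helper, coordinate tuples stand in for the hashed values, and returning `-1` for a coordinate with no match is an assumption made for illustration.

```python
def hash_query_sketch(source_coords, target_coords, missing=-1):
    """For each source coordinate, return the index of the equal target coordinate."""
    # Key: target coordinate; value: its index in the target point cloud.
    table = {tuple(c): i for i, c in enumerate(target_coords)}
    # Assumption: coordinates absent from the target map to `missing`.
    return [table.get(tuple(c), missing) for c in source_coords]


# Hypothetical integer coordinates.
target = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
source = [(1, 1, 0), (0, 0, 0), (9, 9, 9)]
idx_query = hash_query_sketch(source, target)
# idx_query == [2, 0, -1]
```

Each entry of `idx_query` is a memory location in the target point cloud, which is the kind of index kernel map construction and voxelization / devoxelization rely on.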

### Dummy Training Example

We here provides an entire training example with dummy input [here](examples/example.py). In this example, we cover

- How we start from point cloud data and convert it to SparseTensor format;
- How we can implement SparseTensor batching;
- How to train a semantic segmentation SparseConvNet.

You are also welcomed to check out our [SPVNAS](https://github.com/mit-han-lab/e3d) project to implement training / inference with real data.

### Mixed Precision (float16) Support

Mixed precision training is supported via `torch.cuda.amp.autocast` and `torch.cuda.amp.GradScaler`. Enabling mixed precision training can speed up training and reduce GPU memory usage. By wrapping your training code in a `torch.cuda.amp.autocast` block, feature tensors will automatically be converted to float16 if possible. See [here](examples/example.py) for a complete example.

## Speed Comparison Between torchsparse and MinkowskiEngine

We benchmark the performance of our torchsparse and latest [MinkowskiEngine V0.4.3](https://github.com/NVIDIA/MinkowskiEngine) here, latency is measured on NVIDIA GTX 1080Ti GPU:

| Network | Latency (ME V0.4.3) | Latency (torchsparse V1.0.0) |
| :----------------------: | :-----------------: | :--------------------------: |
| MinkUNet18C (MACs / 10) | 224.7 | 124.3 |
| MinkUNet18C (MACs / 4) | 244.3 | 160.9 |
| MinkUNet18C (MACs / 2.5) | 269.6 | 214.3 |
| MinkUNet18C | 323.5 | 294.0 |

## Citation

If you find this code useful, please consider citing:
If you use TorchSparse in your research, please use the following BibTeX entry:

```bibtex
@inproceedings{
tang2020searching,
title = {Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution},
author = {Tang, Haotian* and Liu, Zhijian* and Zhao, Shengyu and Lin, Yujun and Lin, Ji and Wang, Hanrui and Han, Song},
booktitle = {European Conference on Computer Vision},
@inproceedings{tang2020searching,
title = {{Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution}},
author = {Tang, Haotian and Liu, Zhijian and Zhao, Shengyu and Lin, Yujun and Lin, Ji and Wang, Hanrui and Han, Song},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2020}
}
```

## Acknowledgements

This library is inspired by [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine), [SECOND](https://github.com/traveller59/second.pytorch) and [SparseConvNet](https://github.com/facebookresearch/SparseConvNet).
TorchSparse is inspired by many existing open-source libraries, including (but not limited to) [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine), [SECOND](https://github.com/traveller59/second.pytorch) and [SparseConvNet](https://github.com/facebookresearch/SparseConvNet).