# Quickly label point clouds by using pre-annotations
In this tutorial, we will upload predicted labels (pre-annotations) to speed up **semantic point cloud labeling**. Our goal is to label 3 SemanticKITTI frames.

Run the following cell to install the dependencies in Colab.

Additionally, be sure to use a GPU-powered runtime by selecting it under `Runtime > Change runtime type` in the top bar of Colab.

In [None]:
! git clone https://github.com/segments-ai/demo-pointcloud-segmentation.git
! pip install segments-ai
%cd demo-pointcloud-segmentation/

## 1. Upload your point cloud frames

In this notebook, we will create and populate a dataset on Segments.ai programmatically. For this, you need an API key, which can be created on your [account page](https://segments.ai/account).

First, we will initialize the client:

In [None]:
from segments import SegmentsClient

api_key = 'API_KEY'  # paste your API key here
client = SegmentsClient(api_key)

Next, we will create a new dataset for point cloud segmentation.

In [None]:
from utils import get_kitti_attributes

dataset_name = "pointcloud-pre-annotations"
task_type = 'pointcloud-segmentation'

task_attributes = get_kitti_attributes()

dataset = client.add_dataset(dataset_name, task_type=task_type, task_attributes=task_attributes)
dataset_identifier = dataset['owner']['username'] + "/" + dataset_name
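
The `task_attributes` argument describes the categories that labelers can assign. The actual output of `get_kitti_attributes()` is not shown in this notebook, so the following is only a sketch of the general shape, assuming the usual Segments.ai layout of a `format_version` plus a list of `categories` with a `name` and an `id`. The helper `build_task_attributes` and the three class names are hypothetical.

```python
def build_task_attributes(class_names):
    """Build a task_attributes dict with consecutive category ids starting at 1.

    Assumption: id 0 is left free so it can mean "unlabeled point".
    """
    categories = [
        {"name": name, "id": index}
        for index, name in enumerate(class_names, start=1)
    ]
    return {"format_version": "0.1", "categories": categories}


# Hypothetical subset of classes, for illustration only
example_attributes = build_task_attributes(["car", "road", "vegetation"])
print(example_attributes["categories"])
```

If you bring your own classes, a dict like this could be passed as `task_attributes` in place of the output of `get_kitti_attributes()`.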

Now, we are ready to upload the point cloud frames to the newly created dataset.

In [None]:
import os

# Upload the samples to Segments.ai
for filename in os.listdir('data/sequences/00/velodyne'):
  with open(os.path.join('data/sequences/00/velodyne', filename), 'rb') as f:
    pcd_asset = client.upload_asset(f, filename + '.bin')

  attributes = {
    "pcd": {
        "url": pcd_asset['url']
    },
  }

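  try:
    client.add_sample(dataset_identifier, filename, attributes)
  except Exception as e:
    # Skip samples that cannot be added (for example, if the name already exists)
    print(f'Skipping {filename}: {e}')
    continue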
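
For reference, each SemanticKITTI `.bin` scan is a flat sequence of little-endian float32 values, four per point (x, y, z, intensity). The following is a minimal sketch for inspecting such a file locally before uploading it. `load_kitti_bin` is a hypothetical helper, not part of the Segments.ai SDK or this repository, and the example round-trips a small synthetic scan so that it runs without the dataset.

```python
import os
import struct
import tempfile

POINT_FORMAT = "<4f"  # x, y, z, intensity as little-endian float32
POINT_SIZE = struct.calcsize(POINT_FORMAT)  # 16 bytes per point


def load_kitti_bin(path):
    """Read a KITTI-style .bin scan into a list of (x, y, z, intensity) tuples."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % POINT_SIZE != 0:
        raise ValueError(f"{path} does not contain a whole number of points")
    return list(struct.iter_unpack(POINT_FORMAT, data))


# Synthetic two-point scan (values chosen to be exactly representable in float32)
demo_points = [(1.0, 2.0, 0.5, 0.25), (-3.0, 0.0, 1.5, 0.75)]

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "000000.bin")
    with open(demo_path, "wb") as f:
        for point in demo_points:
            f.write(struct.pack(POINT_FORMAT, *point))
    loaded_points = load_kitti_bin(demo_path)

print(len(loaded_points), "points loaded")
```

For full-size scans, a NumPy-based reader such as `np.fromfile(path, dtype=np.float32).reshape(-1, 4)` is the more common choice. The standard-library version here only avoids an extra dependency.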

Once the frames are uploaded, go to the [web interface](https://segments.ai/home) and open the new dataset. In the samples tab, you can see the uploaded frames. Try to label a frame now and see how long it takes you. Press the "Save" button when you're done.

## 2. Train a segmentation model (or set up another algorithm)

Now that you've labeled a frame, go to the releases tab of your dataset and create a new release, for example with the name "v0.1". A release is a snapshot of your dataset at a particular point in time.

Through the Python SDK, we can now initialize a `SegmentsDataset` from this release. The `SegmentsDataset` is compatible with popular frameworks like PyTorch, TensorFlow and Keras.

In [None]:
from segments import SegmentsDataset

# Initialize a dataset from the release file
release = client.get_release(dataset_identifier, 'v0.1')
print(release)
dataset = SegmentsDataset(release, labelset='ground-truth', filter_by='labeled')

Next, you could train a segmentation model on the manually labeled frames. You could also use another algorithm to create label predictions.

For demonstration purposes, we will use a pretrained [SqueezeSegV3](https://github.com/chenfengxu714/SqueezeSegV3) model here.

Run the next cell to install the requirements for this model.

In [None]:
! git clone https://github.com/chenfengxu714/SqueezeSegV3.git
! pip install -r SqueezeSegV3/requirements.txt

## 3. Generate and upload label predictions for the unlabeled point clouds

Now that we have a trained model, we can run it on the unlabeled point clouds to generate label predictions. Then we can upload these predictions to Segments.ai to speed up labeling, or to check the performance of the model visually.

In [None]:
dataset_path = './unlabeled_data'
output_path = "./output"

Create a new `SegmentsDataset`, this time containing only unlabeled frames, and download the point cloud frames. 

In [None]:
import urllib.request
import os
from segments import SegmentsDataset


dataset = SegmentsDataset(release, labelset='ground-truth', filter_by='unlabeled')

download_path = os.path.join(dataset_path, 'sequences', '00', 'velodyne')
os.makedirs(download_path, exist_ok=True)

for sample in dataset:
  # Download the point cloud file for this sample
  sample_url = sample['attributes']['pcd']['url']
  urllib.request.urlretrieve(sample_url, os.path.join(download_path, sample['name']))

Use the model (or your own algorithm) to generate labels for the frames in the dataset.

In [None]:
from utils import run_model

run_model(dataset_path, output_path)

Finally, upload the predictions to Segments.ai as pre-labels:

In [None]:
import os
from utils import get_prediction

predictions_path = os.path.join(output_path, 'sequences', '00', 'predictions')

for sample in dataset:
    name = os.path.splitext(sample["name"])[0]  # strip the '.bin' extension
    label_path = os.path.join(predictions_path, name + '.label')
    annotations, point_annotations = get_prediction(label_path)

    # Upload the predictions to Segments.ai
    attributes = {
        'format_version': '0.1',
        "annotations": annotations,
        "point_annotations": point_annotations
    }
    client.add_label(sample['uuid'], 'ground-truth', attributes, label_status='PRELABELED')
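
The implementation of `get_prediction` lives in `utils` and is not shown here. As a rough sketch of what such a conversion could look like: SemanticKITTI `.label` files store one little-endian uint32 per point, with the semantic class in the lower 16 bits and the instance id in the upper 16 bits. The code below decodes that layout and builds one annotation per predicted class. The function names, the `category_map`, and the `{"id", "category_id"}` annotation layout are assumptions made for illustration, not the repository's actual code.

```python
import struct


def decode_semantickitti_labels(raw_bytes):
    """Split packed uint32 labels into (semantic_id, instance_id) pairs."""
    if len(raw_bytes) % 4 != 0:
        raise ValueError("label data must be a multiple of 4 bytes")
    count = len(raw_bytes) // 4
    packed = struct.unpack(f"<{count}I", raw_bytes)
    return [(value & 0xFFFF, value >> 16) for value in packed]


def to_segments_attributes(semantic_ids, category_map):
    """Build (annotations, point_annotations) with one annotation per class.

    category_map is a hypothetical mapping from model class id to the
    category id defined in the dataset's task attributes.
    """
    annotations = []
    annotation_id_by_class = {}
    point_annotations = []
    for semantic_id in semantic_ids:
        if semantic_id not in category_map:
            point_annotations.append(0)  # assumption: 0 marks an unlabeled point
            continue
        if semantic_id not in annotation_id_by_class:
            annotation_id = len(annotations) + 1
            annotation_id_by_class[semantic_id] = annotation_id
            annotations.append(
                {"id": annotation_id, "category_id": category_map[semantic_id]}
            )
        point_annotations.append(annotation_id_by_class[semantic_id])
    return annotations, point_annotations


# Synthetic labels for four points; the first also carries instance id 3
raw = struct.pack("<4I", (3 << 16) | 10, 40, 10, 0)
decoded = decode_semantickitti_labels(raw)
semantic_ids = [semantic_id for semantic_id, _ in decoded]

category_map = {10: 1, 40: 2}  # hypothetical class-to-category mapping
annotations, point_annotations = to_segments_attributes(semantic_ids, category_map)
print(annotations)
print(point_annotations)
```

The resulting `point_annotations` list has one entry per point, in the same order as the points in the uploaded `.bin` file, which is why the frame and its prediction must come from the same scan.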

## 4. Verify and correct the predicted labels

Now go back to [Segments.ai](https://segments.ai/home) and click the "Start labeling" button. This time, your job is quite a bit easier: instead of labeling each frame from scratch, you only need to correct the few mistakes your model made!

## Next steps

With this pre-annotation workflow, you can iterate between labeling and model training, so your model improves with each round. Eventually, you'll mostly be verifying the model's predictions and only correcting the occasional mistake on hard edge cases.

Was this useful for you? Let us know! Make sure to check out the Segments.ai [documentation](https://docs.segments.ai/python-sdk) and don't hesitate to [contact us](https://segments.ai/contact) if you have any questions.