feat: add autoware_rosbag2_anonymizer usage #557

1 change: 1 addition & 0 deletions docs/datasets/.pages
@@ -1,2 +1,3 @@
nav:
- index.md
- Data Anonymization: data-anonymization

221 changes: 221 additions & 0 deletions docs/datasets/data-anonymization/index.md
@@ -0,0 +1,221 @@
# Rosbag2 Anonymizer

## Overview

Autoware provides a tool to anonymize ROS 2 bag files.
This tool is useful when you want to share your data with the Autoware community while preserving the privacy of the data.

With this tool, you can blur any object (faces, license plates, etc.) in your bag files and obtain a new bag file with the
blurred images.

## Installation

### Clone the repository

```bash
git clone https://github.com/autowarefoundation/autoware_rosbag2_anonymizer.git
```

### Download the pretrained models

```bash
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/GroundingDINO_SwinB.cfg.py
wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swinb_cogcoor.pth

wget https://github.com/autowarefoundation/autoware_rosbag2_anonymizer/releases/download/v0.0.0/yolov8x_anonymizer.pt
```

### Install ROS 2 mcap dependencies if you will use mcap files

!!! warning

    Be sure you have installed ROS 2 on your system.

```bash
sudo apt install ros-humble-rosbag2-storage-mcap
```

### Install the `autoware_rosbag2_anonymizer` tool

Run the following command in the root of the cloned repository:

```bash
python3 -m pip install .
```

## Configuration

Define prompts in the `validation.json` file. The tool will use these prompts to detect objects. You can add your prompts
as dictionaries under the `prompts` key. Each dictionary should have two keys:

- `prompt`: The prompt that will be used to detect the object. Objects detected with this prompt will be blurred in the anonymization process.

- `should_inside`: A list of prompts that the detected object should be inside. If the object is not inside one of these
  prompts, the tool will not blur it (see the sketch after the example below).

```json
{
    "prompts": [
        {
            "prompt": "license plate",
            "should_inside": ["car", "bus", "..."]
        },
        {
            "prompt": "human face",
            "should_inside": ["person", "human body", "..."]
        }
    ]
}
```
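
The `should_inside` validation can be pictured as a bounding-box overlap test between the object (for example, a license
plate) and a detection of its validation prompt (for example, a car). The Python sketch below is only an illustration of
the idea, not the tool's actual code; the box format and function names are assumptions. The overlap metric mirrors the
`bbox_validation.iou_threshold` setting described later on this page.

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) pixel coordinates; a hypothetical format for illustration.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def should_blur(object_box, validation_boxes, iou_threshold):
    # Blur the object only if it overlaps at least one validation box strongly enough.
    return any(iou(object_box, box) > iou_threshold for box in validation_boxes)


# Example: a detection checked against one validation box.
print(should_blur((100, 100, 200, 160), [(90, 95, 210, 170)], iou_threshold=0.5))  # True
```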

Set your configuration in the configuration files under the `config` folder according to your use case. The following
instructions will guide you through each configuration file.

- `config/anonymize_with_unified_model.yaml`

```yaml
rosbag:
  input_bag_path: "path/to/input.bag" # Path to the input ROS 2 bag file with 'mcap' or 'sqlite3' extension
  output_bag_path: "path/to/output/folder" # Path to the output ROS 2 bag folder
  output_save_compressed_image: True # Save images as compressed images (True or False)
  output_storage_id: "sqlite3" # Storage id for the output bag file (`sqlite3` or `mcap`)

grounding_dino:
  box_threshold: 0.1 # Threshold for the bounding box (float)
  text_threshold: 0.1 # Threshold for the text (float)
  nms_threshold: 0.1 # Threshold for the non-maximum suppression (float)

open_clip:
  score_threshold: 0.7 # Validity threshold for the OpenCLIP model (float)

yolo:
  confidence: 0.15 # Confidence threshold for the YOLOv8 model (float)

bbox_validation:
  iou_threshold: 0.9 # Threshold for the intersection over union (float); if the intersection over union is greater than this threshold, the object is considered to be inside the validation prompt

blur:
  kernel_size: 31 # Kernel size for the Gaussian blur (int)
  sigma_x: 11 # Sigma x for the Gaussian blur (int)
```
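
The `blur` parameters correspond to a standard Gaussian blur applied to each detected region. As a minimal sketch (not the
tool's actual implementation), assuming OpenCV and a NumPy image, blurring a detected bounding box with the `kernel_size`
and `sigma_x` values above could look like this:

```python
import cv2
import numpy as np

kernel_size = 31  # blur.kernel_size from the configuration above (must be odd)
sigma_x = 11      # blur.sigma_x

image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)  # stand-in for a camera frame
x1, y1, x2, y2 = 100, 150, 220, 200                               # a detected bounding box

# Blur only the detected region and write it back into the frame.
region = image[y1:y2, x1:x2]
image[y1:y2, x1:x2] = cv2.GaussianBlur(region, (kernel_size, kernel_size), sigma_x)
```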

- `config/yolo_create_dataset.yaml`

```yaml
rosbag:
  input_bags_folder: "path/to/input/folder" # Path to the folder containing the input ROS 2 bag files

dataset:
  output_dataset_folder: "path/to/output/folder" # Path to the output dataset folder
  output_dataset_subsample_coefficient: 25 # Subsample coefficient for the dataset (int)

grounding_dino:
  box_threshold: 0.1 # Threshold for the bounding box (float)
  text_threshold: 0.1 # Threshold for the text (float)
  nms_threshold: 0.1 # Threshold for the non-maximum suppression (float)

open_clip:
  score_threshold: 0.7 # Validity threshold for the OpenCLIP model (float)

bbox_validation:
  iou_threshold: 0.9 # Threshold for the intersection over union (float); if the intersection over union is greater than this threshold, the object is considered to be inside the validation prompt
```
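
Here, `output_dataset_subsample_coefficient` controls how sparsely images are sampled from the input bags. As a rough
illustration (the variable names are hypothetical and the tool's exact sampling rule may differ), a coefficient of 25
amounts to keeping roughly every 25th image:

```python
subsample_coefficient = 25          # dataset.output_dataset_subsample_coefficient
image_messages = list(range(1000))  # stand-in for decoded image messages from the bags

# Keep roughly every 25th image for the dataset.
selected = [img for index, img in enumerate(image_messages)
            if index % subsample_coefficient == 0]
print(len(selected))  # 40
```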

- `config/yolo_train.yaml`

```yaml
dataset:
  input_dataset_yaml: "path/to/input/data.yaml" # Path to the config file of the dataset, which is created in the previous step

yolo:
  epochs: 100 # Number of epochs for the YOLOv8 model (int)
  model: "yolov8x.pt" # Base model for YOLOv8 ('yolov8x.pt', 'yolov8l.pt', 'yolov8m.pt', 'yolov8n.pt')
```
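
For reference, these options map onto a standard Ultralytics YOLOv8 training run. The sketch below is only an illustration
under that assumption; the tool performs the training itself when you run `main.py` with `--yolo_train`:

```python
from ultralytics import YOLO

# Illustrative equivalent of yolo_train.yaml (not the tool's actual code).
model = YOLO("yolov8x.pt")                   # yolo.model: the base model to fine-tune
model.train(data="path/to/input/data.yaml",  # dataset.input_dataset_yaml
            epochs=100)                      # yolo.epochs
```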

- `config/yolo_anonymize.yaml`

```yaml
rosbag:
  input_bag_path: "path/to/input.bag" # Path to the input ROS 2 bag file with 'mcap' or 'sqlite3' extension
  output_bag_path: "path/to/output/folder" # Path to the output ROS 2 bag folder
  output_save_compressed_image: True # Save images as compressed images (True or False)
  output_storage_id: "sqlite3" # Storage id for the output bag file (`sqlite3` or `mcap`)

yolo:
  model: "path/to/yolo/model" # Path to the trained YOLOv8 model file (`.pt` extension) (you can download the pre-trained model from releases)
  config_path: "path/to/input/data.yaml" # Path to the config file of the dataset, which is created in the previous step
  confidence: 0.15 # Confidence threshold for the YOLOv8 model (float)

blur:
  kernel_size: 31 # Kernel size for the Gaussian blur (int)
  sigma_x: 11 # Sigma x for the Gaussian blur (int)
```

## Usage

The tool provides two options to anonymize images in ROS 2 bag files.

### Option 1: Anonymize with Unified Model

Provide a single rosbag, and the tool anonymizes the images in it with a unified model. The model is a combination
of GroundingDINO, OpenCLIP, YOLOv8 and SegmentAnything. If you don't want to use the pre-trained YOLOv8 model, you can
follow the instructions in the second option to train your own YOLOv8 model.

Set your configuration in the `config/anonymize_with_unified_model.yaml` file.

```bash
python3 main.py config/anonymize_with_unified_model.yaml --anonymize_with_unified_model
```

### Option 2: Anonymize Using the YOLOv8 Model Trained on a Dataset Created with the Unified Model

#### Step 1: Create a Dataset

Create an initial dataset with the unified model. You can provide multiple ROS 2 bag files to create a dataset. After
running the following command, the tool will create a dataset in YOLO format.

Set your configuration in the `config/yolo_create_dataset.yaml` file.

```bash
python3 main.py config/yolo_create_dataset.yaml --yolo_create_dataset
```

#### Step 2: Manually Label the Missing Labels

The dataset created in the first step has some missing labels. You should add these labels manually.

#### Step 3: Split the Dataset

Split the dataset into training and validation sets. Provide the path to the dataset folder that was created in the first
step.

```bash
autoware-rosbag2-anonymizer-split-dataset /path/to/dataset/folder
```
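
If you want to picture what the split produces, the sketch below shows one possible train/validation split of a YOLO-style
`images`/`labels` folder pair. The 80/20 ratio, folder layout, and file handling here are assumptions for illustration
only; use the provided `autoware-rosbag2-anonymizer-split-dataset` command for the actual split.

```python
import random
import shutil
from pathlib import Path


def split_dataset(dataset_dir: str, val_ratio: float = 0.2) -> None:
    """Rough sketch of a train/val split for a YOLO-style dataset (assumed layout)."""
    dataset = Path(dataset_dir)
    images = sorted((dataset / "images").glob("*.jpg"))
    random.shuffle(images)
    n_val = int(len(images) * val_ratio)

    for subset, subset_images in (("val", images[:n_val]), ("train", images[n_val:])):
        for image_path in subset_images:
            label_path = dataset / "labels" / (image_path.stem + ".txt")
            for src, kind in ((image_path, "images"), (label_path, "labels")):
                destination = dataset / kind / subset
                destination.mkdir(parents=True, exist_ok=True)
                shutil.move(str(src), str(destination / src.name))


# split_dataset("path/to/dataset/folder")
```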

#### Step 4: Train the YOLOv8 Model

Train the YOLOv8 model with the dataset created in the first step.

Set your configuration in the `config/yolo_train.yaml` file.

```bash
python3 main.py config/yolo_train.yaml --yolo_train
```

#### Step 5: Anonymize Images in ROS 2 Bag Files

Anonymize the images in your ROS 2 bag files with the trained YOLOv8 model. If you want to anonymize a bag file with only the
YOLOv8 model, use the following command. However, we recommend using the unified model for better results: follow
Option 1 and use it with the YOLOv8 model you trained.

Set your configuration in the `config/yolo_anonymize.yaml` file.

```bash
python3 main.py config/yolo_anonymize.yaml --yolo_anonymize
```

## Share Your Anonymized Data

After anonymizing your data, you can share your anonymized data with the Autoware community. If you want to share your
data with the Autoware community, you should create an issue and pull request to
the [Autoware Documentation repository](https://github.com/autowarefoundation/autoware-documentation).