Set up a Docker container with the correct PyTorch environment. This setup allows developers to use their favorite text editor to write code on their host while leveraging the power of PyTorch from inside the container's environment.
Pull down any project on your host machine with vcs, and Docker will take care of binding it to the inside of the container.
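As a rough illustration of what the compose file in this repository does (the image name and container paths below are hypothetical, not taken from this repository), the bind mount is equivalent to something like:

```bash
# Edit code in /path/to/pytorch_ws on the host; run it inside the container.
docker run --rm -it --gpus all \
    -v /path/to/pytorch_ws/src:/home/<user>/workspace/src \
    -v /path/to/pytorch_ws/data:/home/<user>/workspace/data \
    <pytorch-image> /bin/bash
```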
NOTE: Refer to the documentation in the rif-internal-docs repo for instructions on how to train an image segmentation model with detectron2, how to interact with CVAT, how to generate synthetic data with blenderproc, etc.
- Docker: https://docs.docker.com/engine/install/ubuntu/
- docker-compose: https://docs.docker.com/compose/install/
- vcs: http://wiki.ros.org/vcstool
To install Python 3 and pip, I prefer `sudo apt update && sudo apt install python3 -y && sudo apt install python3-pip -y`.
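The Docker and docker-compose links above cover their installation; for vcstool itself, one option (an assumption on my part, the ROS apt repositories also ship a `python3-vcstool` package) is to install it from PyPI:

```bash
# Installs the `vcs` command for the current user (make sure ~/.local/bin is on PATH).
python3 -m pip install --user vcstool
vcs help
```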
The NVIDIA Container Toolkit allows users to build and run GPU-accelerated Docker containers. Although you will not have to install the CUDA Toolkit on your host system, you will need to install the NVIDIA drivers; the instructions can be found in the NVIDIA docs. In short, execute the following:
- Set up the stable repository and the GPG key:

        distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
        curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
        curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
- Install the nvidia-docker2 package:

        sudo apt update
        sudo apt install -y nvidia-docker2
NOTE: The nvidia-docker2 dependency is important if you want to use Kubernetes with Docker 19.03 (and newer), because Kubernetes doesn't support passing GPU information down to Docker through the `--gpus` flag yet.
- Restart the Docker daemon:

        sudo systemctl restart docker
- Test the setup by running a base CUDA container:

        sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
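If that test fails, a quick sanity check (a generic debugging step, not part of the NVIDIA instructions) is to confirm that the drivers work on the host and that the nvidia runtime was registered with Docker:

```bash
nvidia-smi                       # host drivers are installed and see the GPU
docker info | grep -i runtimes   # the list should include "nvidia"
```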
- Set up the workspace and clone this repository:

        mkdir -p /path/to/pytorch_ws/{src,data/repos}
        cd /path/to/pytorch_ws
        git clone git@github.com:RIF-Robotics/pytorch_setup.git
- Clone additional repositories:

        cd /path/to/pytorch_setup
        vcs import ../src < pytorch.repos
NOTE: Regularly execute the following to keep the repositories up to date:

    cd /path/to/pytorch_setup
    vcs pull ../src
- Build Docker image:

        cd /path/to/pytorch_setup
        echo -e "USER_ID=$(id -u ${USER})\nGROUP_ID=$(id -g ${USER})" > .env
        docker compose build
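The `.env` file is read by docker-compose at build time, presumably so the image's user matches your host UID/GID and files created in the bind-mounted workspace stay owned by you. A quick way to confirm it was written correctly (the numeric IDs below are only example values):

```bash
cat .env
# USER_ID=1000
# GROUP_ID=1000
```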
Spin up the container:

    cd /path/to/pytorch_setup
    docker-compose up -d dev-nvidia
Drop inside the container. You can execute this in as many terminals as desired once the container is running; keep in mind that they all drop you into the same container:

    docker exec -it rif_detectron2 /bin/bash
Execute the following on your host to stop the container:

    cd /path/to/pytorch_setup
    docker-compose stop
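`stop` keeps the container around so it can be started again later. If you want to remove it entirely, the standard docker-compose commands apply (nothing specific to this repository):

```bash
cd /path/to/pytorch_setup
docker-compose down   # stop and remove the containers; the built image is kept
docker-compose ps     # verify nothing is left running
```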
- Using CVAT, export the datasets that you want to use (`Actions` → `Export task dataset`). Settings:
  - Export Format: `CVAT for images 1.1`
  - Save Images: `True` (checkbox).
Save the exported zip files to the `pytorch_ws/data` directory. I exported the following datasets:

  - #2: RealSense Images
  - #3: Surgical Instruments with Arm
- Make individual directories (`mkdir`) for each dataset you downloaded and unzip the downloaded datasets into their respective directories (one way to do this is sketched below).
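For example, assuming the two exports above were saved as `task_2.zip` and `task_3.zip` (hypothetical file names) under `pytorch_ws/data`:

```bash
cd /path/to/pytorch_ws/data
# One directory per exported CVAT task (directory names are just examples).
mkdir -p realsense_images surgical_instruments_with_arm
unzip task_2.zip -d realsense_images
unzip task_3.zip -d surgical_instruments_with_arm
```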
- Visualize the dataset with fiftyone in your browser. Inside the Docker container, run the command:

        fiftyone_view_dataset cvat /path/to/data/<cvat-dataset>
- If necessary, combine multiple CVAT datasets into a single CVAT dataset:

        cd ./data
        combine_datasets <output-cvat-dataset> <input-cvat-dataset0> <input-cvat-dataset1>
- Convert the CVAT dataset to a COCO dataset with training, validation, and test splits. This creates three separate COCO datasets under the `<output-coco-dataset>` folder:

        fiftyone_cvat_to_coco <input-cvat-dataset> <output-coco-dataset> --splits 0.7 0.2 0.1
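Assuming the splits map to the `train`/`val`/`test` subfolders expected by the training step below (the exact folder names are an assumption), the output should end up looking roughly like:

```bash
ls <output-coco-dataset>
# test  train  val   (~10%, ~70%, and ~20% of the samples, respectively)
```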
- Visualize the COCO training dataset in fiftyone to make sure it's as expected:

        fiftyone_view_dataset coco <coco-dataset>/train
- Train the model. The `<coco-dataset>` folder should contain `train`, `val`, and `test` subfolders:

        detectron2_model_train <coco-dataset> --output_dir 2022-05-11-trained-model --train
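Checkpoints and logs are written to the `--output_dir`. Assuming the script wraps detectron2's standard trainer (an assumption; check the script if the names differ), you should see files such as:

```bash
ls 2022-05-11-trained-model
# model_final.pth  last_checkpoint  metrics.json  events.out.tfevents.*
```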
- While training, use `tensorboard` to visualize loss and other metrics. Open another terminal in the Docker container and execute:

        tensorboard --logdir /path/to/2022-05-11-trained-model --bind_all
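TensorBoard serves on port 6006 by default, and `--bind_all` makes it listen on all interfaces. Assuming the compose file publishes or shares that port with the host (not verified here), you can open it in a browser on the host:

```bash
xdg-open http://localhost:6006
```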
- Evaluate the model's performance on the test set:

        detectron2_model_train <coco-dataset> --output_dir 2022-05-11-trained-model --evaluate
- Show model predictions on the test set:

        detectron2_model_train <coco-dataset> --output_dir 2022-05-11-trained-model --predict
Leverage the provided Docker environment to run Facebook's detectron2 library.
- Set up the environment by executing the following inside a running container:

        cd ~/workspace/src/detectron2_repo
        wget http://images.cocodataset.org/val2017/000000439715.jpg -O input.jpg
        mkdir -p outputs
- Execute the following to run the demo on a pre-trained COCO model and perform instance segmentation on the previously downloaded image:

        cd ~/workspace/src/detectron2_repo
        python3 demo/demo.py \
            --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
            --input input.jpg --output outputs \
            --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
- Use `feh` to display the result:

        sudo apt-get install feh
        feh ./outputs/input.jpg
- Step inside the running Docker container:

        docker exec -it rif_detectron2 /bin/bash
- Set up BlenderProc in the container with the quickstart script:

        blenderproc quickstart

  View the resulting image:

        blenderproc vis hdf5 output/0.hdf5
- Generate five synthetic images:

        cd ./src/rif-python/scripts/blenderproc/random_placement
        blenderproc run main.py ./config.json \
            ~/workspace/src/surgical-instrument-3D-models/library/models.json \
            ~/workspace/data/blenderproc_output \
            --runs 5
- View a single synthetic data sample:

        blenderproc vis hdf5 ~/workspace/data/blenderproc_output/0.hdf5
- View synthetic data in fiftyone:

        fiftyone_view_dataset coco \
            ~/workspace/data/blenderproc_output/coco_data \
            --images-dir . \
            --labels-file coco_annotations.json
Point your browser at http://localhost:5151