A generalised pose estimation pipeline that can be trained and tested on any data in the given format.
This is the official code for the papers:
- "Satellite Pose Estimation Competition 2021: Results and Analyses" published in the journal Acta Astronautica (Team TangoUnchained in Kelvins@ESA SPEC 2021) - https://www.sciencedirect.com/science/article/abs/pii/S0094576523000048
- "Towards Bridging the Space Domain Gap for Satellite Pose Estimation using Event Sensing" published in the conference proceedings of ICRA 2023 - https://arxiv.org/abs/2209.11945
cd object_detection
Install PyTorch following the official instructions, preferably with pip, since conda can sometimes install the CPU-only build:
conda create -n pose-estimation-pipeline python=3.8
conda activate pose-estimation-pipeline
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
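Since a CPU-only build is the failure mode mentioned above, a quick check after installation can help. The following is a minimal sketch; the helper name describe_torch_install is hypothetical and not part of this repository.

```python
import importlib


def describe_torch_install():
    """Return a one-line summary of the PyTorch install in this environment."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return "torch is not installed in this environment"
    # torch.version.cuda is None for CPU-only builds.
    cuda_build = torch.version.cuda or "cpu-only"
    return (
        f"torch {torch.__version__} (build: {cuda_build}), "
        f"cuda available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_torch_install())
```

If the summary reports a cpu-only build, reinstalling with the pip command above is the likely fix.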
Install detectron2
python -m pip install -e detectron2
conda install protobuf
pip install pandas
First prepare your dataset in the COCO format (see https://cocodataset.org/#format-data for the detailed format).
You can use the frames_to_coco_dicts.py script to convert an existing dataset to COCO format.
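For orientation, here is a minimal sketch of a single-image COCO keypoint dictionary. It follows the public COCO keypoint conventions and is not necessarily the exact output of frames_to_coco_dicts.py; the function name make_coco_dict, the category name "object" and the landmark names are hypothetical.

```python
import json


def make_coco_dict(image_name, width, height, bbox_xywh, keypoints_xy):
    """Build a single-image COCO keypoint-detection dictionary."""
    flat_keypoints = []
    for kx, ky in keypoints_xy:
        # COCO stores each keypoint as (x, y, v); v=2 means labelled and visible.
        flat_keypoints.extend([float(kx), float(ky), 2])
    x, y, w, h = bbox_xywh
    return {
        "images": [
            {"id": 0, "file_name": image_name, "width": width, "height": height}
        ],
        "annotations": [
            {
                "id": 0,
                "image_id": 0,
                "category_id": 1,
                "bbox": [x, y, w, h],  # top-left corner, then width and height
                "area": w * h,
                "iscrowd": 0,
                "keypoints": flat_keypoints,
                "num_keypoints": len(keypoints_xy),
            }
        ],
        "categories": [
            {
                "id": 1,
                "name": "object",  # hypothetical category name
                "keypoints": [f"landmark_{i}" for i in range(len(keypoints_xy))],
                "skeleton": [],
            }
        ],
    }


if __name__ == "__main__":
    example = make_coco_dict(
        "000000.png", 640, 480, [10, 20, 100, 50], [(15, 25), (60, 40)]
    )
    print(json.dumps(example, indent=2))
```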
You will need images with numbered names (e.g. %06d.png) in the following structure:
.
|__ train
|__ validation
|__ test
Ground truth should be in a directory with the following files:
camera_intrinsics_<n>.txt
pose_<n>.json
for n in 0...N-1, where N=<number of images>
e.g. if the first image is 000000.png, then it needs an associated camera_intrinsics_0.txt and pose_0.json
Each camera_intrinsics_<n>.txt file contains a 3x3 numpy matrix with the intrinsics of the camera in that frame, in the OpenCV format
Each pose_<n>.json file contains a json: {"rotation": (3x3 rotation matrix), "translation": (3x1 vector)}
Rotation and translation are World2Camera
--landmarks_file is a csv file whose first row is the header x,y,z and whose remaining rows are the 3D landmark coordinates of the object in the world/object frame
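The ground-truth layout described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's own code: the helper names are hypothetical, the intrinsics are assumed to be whitespace-separated text as written by np.savetxt, and the translation is assumed to be stored as a flat list of three numbers. The last helper shows what a World2Camera pose implies when projecting landmarks (lens distortion ignored).

```python
import csv
import json
import tempfile
from pathlib import Path

import numpy as np


def write_ground_truth(gt_dir, index, camera_matrix, rotation, translation):
    """Write camera_intrinsics_<index>.txt and pose_<index>.json into gt_dir."""
    gt_dir = Path(gt_dir)
    gt_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: intrinsics are stored as whitespace-separated text.
    np.savetxt(
        str(gt_dir / f"camera_intrinsics_{index}.txt"),
        np.asarray(camera_matrix, dtype=float),
    )
    pose = {
        "rotation": np.asarray(rotation, dtype=float).tolist(),
        # Assumption: the 3x1 translation is stored as a flat list.
        "translation": np.asarray(translation, dtype=float).reshape(3).tolist(),
    }
    (gt_dir / f"pose_{index}.json").write_text(json.dumps(pose))


def write_landmarks(csv_path, landmarks_xyz):
    """Write the landmarks csv with an x,y,z header row."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x", "y", "z"])
        writer.writerows(np.asarray(landmarks_xyz, dtype=float).tolist())


def project_landmarks(camera_matrix, rotation, translation, landmarks_xyz):
    """Project object-frame landmarks to pixels using a World2Camera pose."""
    points = np.asarray(landmarks_xyz, dtype=float)  # shape (N, 3)
    rot = np.asarray(rotation, dtype=float)
    trans = np.asarray(translation, dtype=float).reshape(1, 3)
    in_camera = points @ rot.T + trans  # X_cam = R @ X_world + t
    homogeneous = in_camera @ np.asarray(camera_matrix, dtype=float).T
    return homogeneous[:, :2] / homogeneous[:, 2:3]


if __name__ == "__main__":
    K = [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]]
    landmarks = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    with tempfile.TemporaryDirectory() as tmp:
        write_ground_truth(tmp, 0, K, np.eye(3), [0.0, 0.0, 5.0])
        write_landmarks(Path(tmp) / "landmarks.csv", landmarks)
        print(sorted(p.name for p in Path(tmp).iterdir()))
    print(project_landmarks(K, np.eye(3), [0.0, 0.0, 5.0], landmarks))
```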
Then run:
python frames_to_coco_dicts.py --frames_dir <path_to_frames> \
--gt_dir <path_to_ground_truth_files> \
--landmarks_file <path_to_landmarks_csv> \
--output_prefix <output_prefix e.g synthetic> \
--output_dir <output_directory> \
--image_width <frame_width> --image_height <frame_height>
python train_object_detection.py --train_annotations <path to <prefix>_train.json> \
--validation_annotations <path to <prefix>_validation.json> \
--train_images_dir <path to training images directory> \
--validation_images_dir <path to validation images directory> \
--output_dir=<model output path> \
--config config_4 \
--image_width <frame_width> --image_height <frame_height>
python export_object_detection_bounding_boxes.py --frames_dir <path to testing frames> \
--model_file <path to trained detectron model file> \
--validation_annotations <path to <prefix>_validation.json> \
--landmarks_file <path_to_landmarks_csv> \
--output_dir <output directory> \
--config config_4 \
--batch_size 32 \
--image_width <frame_width> --image_height <frame_height>
Code for augmentations is in detectron2/detectron2/data/detection_utils.py -> build_augmentation function
cd ../landmark_regression
- Pre-setup:
cd landmark_regression
Ideally in a conda environment (or your favourite environment manager), install PyTorch >=1.0.0 following the [official instructions](https://pytorch.org/)
- HRNet setup:
pip install -r requirements.txt
cd lib
make
cd ../cocoapi/PythonAPI
python3 setup.py install --user
export COCOAPI=`pwd`
cd ../..
mkdir output
mkdir log
<put pretrained models in the models directory>
python tools/train.py --cfg <path to config file e.g experiments/events/events-config.yaml> \
DATA_DIR <path to frames> \
OUTPUT_DIR <output directory path> \
DATASET.ROOT <path to directory COCO format jsons generated by frames_to_coco_dicts.py> \
DATASET.TEST_SET <name of test COCO json file e.g synthetic_test> \
DATASET.TRAIN_SET <name of train COCO json file e.g synthetic_train> \
DATASET.IMAGE_WIDTH <frame width> DATASET.IMAGE_HEIGHT <frame height> MODEL.NUM_JOINTS <number of 3D landmarks e.g 11>
Note: If you get the following error:
ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found
Execute the following command before running again:
export LD_LIBRARY_PATH=/home/<user-name>/anaconda3/envs/pose-estimation-pipeline/lib/:$LD_LIBRARY_PATH
python tools/test.py --cfg experiments/events/events-config.yaml \
DATA_DIR <path to frames> \
OUTPUT_DIR <output directory path> \
DATASET.ROOT <path to directory testing COCO format jsons generated by export_object_detection_bounding_boxes.py> \
DATASET.TEST_SET <name of test COCO json file e.g real_test> \
DATASET.TRAIN_SET <optional but just give it what you gave it above for training e.g synthetic_train> \
DATASET.IMAGE_WIDTH <frame width> DATASET.IMAGE_HEIGHT <frame height> MODEL.NUM_JOINTS <number of 3D landmarks e.g 11> \
TEST.MODEL_FILE <path to trained hrnet model e.g hubble_without_base_trained/EventsDataset/pose_hrnet/events-config/final_state.pth>
Augmentations are configured in tools/train.py by passing in a numpy_transform
cd ../pose_estimation
pip install kornia
python export_predicted_poses_real.py --frames_dir <path to frames directory> \
--detection_annotations <path to testing object detection annotations e.g test.json> \
--pose_annotations <path to hrnet output mat file ../landmark_regression/hrnet/hubble/EventsDataset/pose_hrnet/events-config/pred.mat> \
--landmarks_file <path to 3D landmarks csv> \
--calibration_file_path <path to calibration.json file> \
--output_dir <path to output directory>
Example calibration.json:
{"intrinsics":{"camera_matrix": [
[
2988.5795163815555,
0,
960
],
[
0,
2988.3401159176124,
600
],
[
0,
0,
1
]
],
"distortion_coefficients": [
-0.22383016606510672,
0.51409797089106379,
-0.00066499611998340662,
-0.00021404771667484594,
-0.13124227429077406
]}
}
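A minimal sketch of reading a calibration.json shaped like the example above into NumPy arrays. The function names parse_calibration and load_calibration are hypothetical and not part of this repository, and treating the five coefficients as OpenCV's (k1, k2, p1, p2, k3) ordering is an assumption.

```python
import json

import numpy as np


def parse_calibration(calibration):
    """Extract the camera matrix and distortion vector from a parsed dict."""
    intrinsics = calibration["intrinsics"]
    camera_matrix = np.asarray(intrinsics["camera_matrix"], dtype=float)
    distortion = np.asarray(intrinsics["distortion_coefficients"], dtype=float)
    if camera_matrix.shape != (3, 3):
        raise ValueError(f"expected a 3x3 camera matrix, got {camera_matrix.shape}")
    return camera_matrix, distortion


def load_calibration(path):
    """Read a calibration.json file from disk and parse it."""
    with open(path) as handle:
        return parse_calibration(json.load(handle))


if __name__ == "__main__":
    example = {
        "intrinsics": {
            "camera_matrix": [
                [2988.5795163815555, 0, 960],
                [0, 2988.3401159176124, 600],
                [0, 0, 1],
            ],
            "distortion_coefficients": [
                -0.22383016606510672,
                0.51409797089106379,
                -0.00066499611998340662,
                -0.00021404771667484594,
                -0.13124227429077406,
            ],
        }
    }
    K, dist = parse_calibration(example)
    # Focal lengths sit on the diagonal, the principal point in the last column.
    print("fx, fy:", K[0, 0], K[1, 1])
    print("cx, cy:", K[0, 2], K[1, 2])
    print("distortion:", dist)
```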
Special case: Training and Testing on SEENIC dataset (https://zenodo.org/record/7214231)
Follow the installation and setup instructions for the phases above
Follow the official V2E instructions (https://github.com/SensorsINI/v2e), which are also included in the v2e directory.
Generate event frames with V2E, e.g.
v2e -i frames --overwrite --input_frame_rate=100 \
--timestamp_resolution=.01 --disable_slomo --auto_timestamp_resolution=False \
--dvs_exposure duration 0.2 --output_folder=output_0.2 --overwrite --pos_thres=.15 --neg_thres=.15 \
--sigma_thres=0.3 --dvs_text events.csv --output_width=640 --output_height=480 --cutoff_hz=30 --avi_frame_rate=10
Split frames into training/testing/validation. This can be done using the split_images.py script provided in the object_detection directory
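To illustrate the idea of the split, here is a minimal sketch of a seeded random partition of frame names. It is not the repository's split_images.py, whose behaviour and options may differ; the function name split_frames and the default fractions are assumptions.

```python
import random


def split_frames(frame_names, train_fraction=0.7, validation_fraction=0.15, seed=0):
    """Partition frame names into train/validation/test lists."""
    names = sorted(frame_names)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(names)
    n_train = round(len(names) * train_fraction)
    n_validation = round(len(names) * validation_fraction)
    return {
        "train": sorted(names[:n_train]),
        "validation": sorted(names[n_train:n_train + n_validation]),
        # Whatever remains after train and validation becomes the test set.
        "test": sorted(names[n_train + n_validation:]),
    }


if __name__ == "__main__":
    frames = [f"{i:06d}.png" for i in range(20)]
    for split_name, members in split_frames(frames).items():
        print(split_name, len(members))
```

Each split's files would then be copied or moved into the train, validation and test directories expected by the conversion script.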
Convert the event frames and the accompanying ground truth data exported from Blender (or from any other source, as long as it is in the format specified above) using the events_to_coco_dicts.py script
The rest of the training steps are the same as above
A shell training script for our paper's methodology is provided as train_pipeline_hubble_dvx.sh
A shell evaluation script for our paper's methodology is provided as evaluate_event_pipeline.sh
:TODO: 26-5-23 Mohsi Jawaid
Add Blender synthetic data generation scripts
Add more detailed instructions on training and testing with event data
Follow installation and setup instructions from the phases above
cd object_detection
Given that your SPEED+ dataset is located at ../datasets/
Run the following to generate COCO format jsons for the synthetic training data:
python speedplus_to_coco_dicts.py
To generate COCO format jsons for the testing data:
Update the --dataset_type arg for speedplus_to_coco_dicts.py to either lightbox or sunlamp
Update the --dataset_split arg to test
Example test json generation command:
python speedplus_to_coco_dicts.py --dataset_type=lightbox --dataset_split=test
python train_object_detection.py --train_annotations speedplus_dicts/synthetic_train.json --validation_annotations speedplus_dicts/synthetic_validation.json --train_images_dir ../datasets/speedplus/synthetic/images/ --validation_images_dir ../datasets/speedplus/synthetic/images/ --output_dir speedplus_model --image_width 1900 --image_height 1200
- Lightbox:
python export_object_detection_bounding_boxes.py \
--frames_dir ../datasets/speedplus/lightbox/images/ \
--model_file speedplus_model/faster_rcnn_X_101_32x8d_FPN_3x_model_final_lightbox_2.pth \
--validation_annotations speedplus_dicts/synthetic_validation.json \
--landmarks_file speed_plus_utils/landmarks.csv \
--output_dir lightbox_test \
--config config_4 \
--batch_size 24 \
--image_width 1900 --image_height 1200
- Sunlamp:
python export_object_detection_bounding_boxes.py \
--frames_dir ../datasets/speedplus/sunlamp/images/ \
--model_file speedplus_model/faster_rcnn_X_101_32x8d_FPN_3x_model_final_sunlamp_1.pth \
--validation_annotations speedplus_dicts/synthetic_validation.json \
--landmarks_file speed_plus_utils/landmarks.csv \
--output_dir sunlamp_test \
--config config_4 \
--batch_size 24 \
--image_width 1900 --image_height 1200
Note: Adjust --batch_size to fit your available GPU memory
cd ../landmark_regression
Train adversarial landmark regression ensemble for one of the real datasets:
- Lightbox
python tools/train_da_ms.py --cfg experiments/lit_hpc_001.yaml \
DATA_DIR ../datasets/speedplus/synthetic/images \
DATA_DIR_ADVERSARIAL ../datasets/speedplus/lightbox/images \
OUTPUT_DIR lightbox_model \
DATASET.ROOT ../object_detection/speedplus_dicts \
DATASET.ROOT_ADVERSARIAL ../object_detection/lightbox_test \
DATASET.TRAIN_SET synthetic_train \
DATASET.TRAIN_SET_ADVERSARIAL real_test \
DATASET.TEST_SET synthetic_validation \
DATASET.IMAGE_WIDTH 1900 DATASET.IMAGE_HEIGHT 1200 MODEL.NUM_JOINTS 11
(repeat for other models in the k-fold cross validation set)
- Sunlamp
python tools/train_da_ms.py --cfg experiments/sun_hpc_001.yaml \
DATA_DIR ../datasets/speedplus/synthetic/images \
DATA_DIR_ADVERSARIAL ../datasets/speedplus/sunlamp/images \
OUTPUT_DIR sunlamp_model \
DATASET.ROOT ../object_detection/speedplus_dicts \
DATASET.ROOT_ADVERSARIAL ../object_detection/sunlamp_test \
DATASET.TRAIN_SET synthetic_train \
DATASET.TRAIN_SET_ADVERSARIAL real_test \
DATASET.TEST_SET synthetic_validation \
DATASET.IMAGE_WIDTH 1900 DATASET.IMAGE_HEIGHT 1200 MODEL.NUM_JOINTS 11
(repeat for other models in the k-fold cross validation set)
Testing is currently configured to support up to 6 trained models. See below for an example of testing with 3 models,
provided that the final model state dict files (.pth) are in the checkpoints directory
- Lightbox
python tools/test_cv_ensemble.py --cfg experiments/lit_hpc_001.yaml \
DATA_DIR ../datasets/speedplus/lightbox/images \
OUTPUT_DIR lightbox_test \
DATASET.ROOT ../object_detection/lightbox_test \
DATASET.TEST_SET real_test \
TEST.MODEL_FILE checkpoints/lightbox_models/lit1.pth \
TEST.MODEL_FILE2 checkpoints/lightbox_models/lit2.pth \
TEST.MODEL_FILE3 checkpoints/lightbox_models/lit3.pth \
DATASET.IMAGE_WIDTH 1900 DATASET.IMAGE_HEIGHT 1200 MODEL.NUM_JOINTS 11
- Sunlamp
python tools/test_cv_ensemble.py --cfg experiments/sun_hpc_001.yaml \
DATA_DIR ../datasets/speedplus/sunlamp/images \
OUTPUT_DIR sunlamp_test \
DATASET.ROOT ../object_detection/sunlamp_test \
DATASET.TEST_SET real_test \
TEST.MODEL_FILE checkpoints/sunlamp_models/sun1.pth \
TEST.MODEL_FILE2 checkpoints/sunlamp_models/sun2.pth \
TEST.MODEL_FILE3 checkpoints/sunlamp_models/sun3.pth \
DATASET.IMAGE_WIDTH 1900 DATASET.IMAGE_HEIGHT 1200 MODEL.NUM_JOINTS 11
cd ../pose_estimation
- Lightbox
python export_predicted_poses_real.py \
--frames_dir ../datasets/speedplus/lightbox/images \
--detection_annotations ../object_detection/lightbox_test/real_test.json \
--pose_annotations ../landmark_regression/lightbox_test/PEdataset/hrnet_cms_384/sun_hpc_001/pred_real.mat \
--landmarks_file ../object_detection/speed_plus_utils/landmarks.csv \
--calibration_file_path ../object_detection/speed_plus_utils/calibration.json \
--output_dir lightbox_poses
- Sunlamp
python export_predicted_poses_real.py \
--frames_dir ../datasets/speedplus/sunlamp/images \
--detection_annotations ../object_detection/sunlamp_test/real_test.json \
--pose_annotations ../landmark_regression/sunlamp_test/PEdataset/hrnet_cms_384/sun_hpc_001/pred_real.mat \
--landmarks_file ../object_detection/speed_plus_utils/landmarks.csv \
--calibration_file_path ../object_detection/speed_plus_utils/calibration.json \
--output_dir sunlamp_poses
You will now have two directories, lightbox_poses and sunlamp_poses, each containing images with the reprojected landmarks and a file opencv_poses.json, which holds a list of poses in the format: { "image_name": "img%06d.jpg", "T": (3x1 translation vector), "rotation_matrix": (3x3 rotation matrix) }
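For downstream use, the pose list can be turned into 4x4 homogeneous transforms. The following is a minimal sketch that assumes only the three keys named above; the function name poses_to_transforms is hypothetical, and the exact nesting of "T" in the real file is not confirmed here, so it is flattened before use.

```python
import json

import numpy as np


def poses_to_transforms(pose_entries):
    """Map each image name to a 4x4 transform built from its pose entry."""
    transforms = {}
    for entry in pose_entries:
        transform = np.eye(4)
        transform[:3, :3] = np.asarray(entry["rotation_matrix"], dtype=float)
        # reshape(3) accepts both a flat list and a nested 3x1 list.
        transform[:3, 3] = np.asarray(entry["T"], dtype=float).reshape(3)
        transforms[entry["image_name"]] = transform
    return transforms


def load_pose_file(path):
    """Read an opencv_poses.json file and convert it to transforms."""
    with open(path) as handle:
        return poses_to_transforms(json.load(handle))


if __name__ == "__main__":
    example = [
        {
            "image_name": "img000001.jpg",
            "T": [[0.1], [-0.2], [5.0]],
            "rotation_matrix": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
        }
    ]
    print(poses_to_transforms(example)["img000001.jpg"])
```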
Please cite the following if you use this code in your work:
@inproceedings{jawaid2023eventspacepose,
title= {Towards Bridging the Space Domain Gap for Satellite Pose Estimation using Event Sensing},
author = {Jawaid, Mohsi and Elms, Ethan and Latif, Yasir and Chin, Tat-Jun},
booktitle= {IEEE International Conference on Robotics and Automation (ICRA)},
year={2023}
}
@article{park2023satellite,
title={Satellite Pose Estimation Competition 2021: Results and analyses},
author={Park, Tae Ha and M{\"a}rtens, Marcus and Jawaid, Mohsi and Wang, Zi and Chen, Bo and Chin, Tat-Jun and Izzo, Dario and D’Amico, Simone},
journal={Acta Astronautica},
year={2023},
publisher={Elsevier}
}
Happy Pose Estimation!