readme_details.md

Contents

List of public implementations of various algorithms

1) Public Datasets and Challenges

Slim and Simple

Mixed, Synthetic and Complicated

2) Pioneers and Experts

👍Alejandro Newell 👍Jia Deng 👍Zhe Cao 👍Tomas Simon 👍tensorboy 👍murdockhou 👍张兆翔

3) Blogs, Videos and Applications

4) Papers and Source Codes

▶ Related Survey

  • ComputingSurveys 2022 Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [paper link][arxiv link][JD + HIT]

▶ Single Person Pose Estimation

  • MoDeep(ACCV2014)(video based) MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation [arxiv link]

  • (NIPS2014)(heatmaps) Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation [arxiv link]

  • PoseMachines(ECCV2014)(regression) Pose Machines: Articulated Pose Estimation via Inference Machines [paper link][project link]

  • DeepPose(CVPR2014)(AlexNet based)(regression) DeepPose: Human Pose Estimation via Deep Neural Networks [arxiv link][Codes|OpenCV(unofficial)]

  • (ICCV2015)(video based) Flowing ConvNets for Human Pose Estimation in Videos [arxiv link]

  • (ECCV2016)(heatmaps) Human Pose Estimation using Deep Consensus Voting [arxiv link]

  • (CVPR2016)(structure information) End-To-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation [paper link]

  • (CVPR2016)(structure information) Structured Feature Learning for Pose Estimation [paper link]

  • IEF(CVPR2016)(GoogLeNet Based)(regression) Human Pose Estimation with Iterative Error Feedback [arxiv link]

  • CPM(CVPR2016)(heatmaps) Convolutional Pose Machines [arxiv link][Codes|Caffe(official)][Codes|Tensorflow(unofficial)]

  • StackedHourglass(ECCV2016)(heatmaps) Stacked Hourglass Networks for Human Pose Estimation [arxiv link][Codes|Torch7(official old)][Codes|PyTorch(official new)][Codes|Tensorflow(unofficial)]

  • HourglassResidualUnits(HRUs)(CVPR2017)(heatmaps) Multi-context Attention for Human Pose Estimation [arxiv link]

  • PyraNet(ICCV2017)(heatmaps) Learning Feature Pyramids for Human Pose Estimation [arxiv link][Codes|Torch(official)]

  • (ICCV2017)(ResNet-50 Based)(regression) Compositional Human Pose Regression [arxiv link]

  • Adversarial-PoseNet(ICCV2017)(GAN) Adversarial PoseNet: A Structure-aware Convolutional Network for Human Pose Estimation [arxiv link][Codes|PyTorch(unofficial)]

  • (ECCV2018)(structure information) Multi-Scale Structure-Aware Network for Human Pose Estimation [arxiv link]

  • (ECCV2018)(structure information) Deeply Learned Compositional Models for Human Pose Estimation [paper link]

  • (CVPR2018)(multi-task/video based)(regression) 2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning [arxiv link]

  • (CVPR2019)(structure information) Does Learning Specific Features for Related Parts Help Human Pose Estimation? [paper link]

  • (arxiv2020)(video based) Key Frame Proposal Network for Efficient Pose Estimation in Videos [arxiv link]

  • UniPose(CVPR2020)(video based) UniPose: Unified Human Pose Estimation in Single Images and Videos [arxiv link][Codes|PyTorch(official)]
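The entries above are tagged either (heatmaps) or (regression). A regression method outputs joint coordinates directly, while a heatmap method predicts one score map per joint and reads the joint location off that map. The following is a minimal sketch of that read-off step, assuming heatmaps are plain nested lists; the names `decode_heatmap` and `decode_pose` are hypothetical and are not taken from any codebase listed here.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # H x W grid of per-pixel confidence scores


def decode_heatmap(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (x, y, score) of the highest-scoring cell in one joint's heatmap."""
    best_x, best_y, best_score = 0, 0, float("-inf")
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_x, best_y, best_score = x, y, score
    return best_x, best_y, best_score


def decode_pose(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Decode one (x, y, score) triple per joint heatmap."""
    return [decode_heatmap(h) for h in heatmaps]


# Toy 3x3 heatmap for a single joint (illustrative values only).
toy_heatmap = [
    [0.0, 0.1, 0.0],
    [0.2, 0.9, 0.1],
    [0.0, 0.3, 0.0],
]
pose = decode_pose([toy_heatmap])
print(pose)
```

Practical implementations usually refine the integer argmax to sub-pixel precision and map it back to input-image coordinates; both steps are omitted here.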

▶ Two-Stage [Top-Down] Multiple Person Pose Estimation

▶ Two-Stage [Bottom-Up] Multiple Person Pose Estimation

▶ Single-Stage Multiple Person Pose Estimation

  • DirectPose(arxiv2019) DirectPose: Direct End-to-End Multi-Person Pose Estimation [arxiv link][DirectPose proposes to directly regress the instance-level keypoints by considering the keypoints as a special bounding-box with more than two corners.]

  • SPM(ICCV2019) Single-Stage Multi-Person Pose Machines [arxiv link][Codes|PyTorch(official, not released)][Codes|Tensorflow(unofficial)][CSDN blog]

  • CenterNet(arxiv2019) Objects as Points [arxiv link]

  • Point-Set Anchors(ECCV2020) Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation [paper link]

  • POET(arxiv2021) End-to-End Trainable Multi-Instance Pose Estimation with Transformers [arxiv link][DETR-based, regression]

  • TFPose(arxiv2021) TFPose: Direct Human Pose Estimation with Transformers [arxiv link][project link][It uses Detection Transformers to treat pose estimation on cropped single-person images as a query-based regression task][end2end top-down]

  • InsPose(ACMMM2021) InsPose: Instance-Aware Networks for Single-Stage Multi-Person Pose Estimation [paper link][code|official][It designs instance-aware dynamic networks to adaptively adjust part of the network parameters for each instance]

  • DeepDarts(CVPRW2021) DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera [paper link]

  • FCPose(CVPR2021) FCPose: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions [paper link][codes|official]

  • PRTR(CVPR2021) Pose Recognition With Cascade Transformers [paper link][codes|official][transformer-based, high input resolution and stacked attention modules, high complexity, requires huge memory during the training phase][end2end top-down]

  • KAPAO(ECCV2022) Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation [arxiv link][codes|(official pytorch using YOLOv5)]

  • YOLO-Pose(CVPRW2022) YOLO-Pose: Enhancing YOLO for Multi Person Pose Estimation Using Object Keypoint Similarity Loss [paper link][codes|official edgeai-yolox][codes|official edgeai-yolov5]

  • AdaptivePose(AAAI2022) AdaptivePose: Human Parts as Adaptive Points [paper link][codes|official PyTorch]

  • AdaptivePose++(TCSVT2022) AdaptivePose++: A Powerful Single-Stage Network for Multi-Person Pose Regression [paper link][codes|official PyTorch]

  • LOGO-CAP(CVPR2022) Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation [paper link][codes|official PyTorch]

  • PETR(CVPR2022) End-to-End Multi-Person Pose Estimation With Transformers [paper link][codes|official PyTorch][transformer-based, high input resolution and stacked attention modules, high complexity, requires huge memory during the training phase][fully end2end]

  • QueryPose(NIPS2022) QueryPose: Sparse Multi-Person Pose Regression via Spatial-Aware Part-Level Query [openreview link][arxiv link][code|official][fully end2end]

  • ED-Pose(ICLR2023) Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation [arxiv link][openreview link][code|official][IDEA-Research][fully end2end]

  • PolarPose(TIP2023) PolarPose: Single-Stage Multi-Person Pose Estimation in Polar Coordinates [paper link]

  • 👍GroupPose(ICCV2023) Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation [paper link][arxiv link][code|official Paddle][code|official PyTorch]

  • 👍RTMO(arxiv2023.12) RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation [arxiv link][code|official][Tsinghua Shenzhen International Graduate School and Shanghai AI Laboratory; the code is released by Open-MMLab]
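The YOLO-Pose entry above refers to an Object Keypoint Similarity (OKS) loss. As a rough illustration, the sketch below computes OKS in the form used by the COCO keypoint evaluation, where each visible keypoint contributes `exp(-d^2 / (2 * area * k^2))` with `k = 2 * sigma`. The function name `oks`, the example coordinates, and the sigma values are assumptions for illustration; this is not the YOLO-Pose implementation.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def oks(pred: Sequence[Point], gt: Sequence[Point],
        visible: Sequence[int], sigmas: Sequence[float],
        area: float) -> float:
    """Object Keypoint Similarity averaged over visible ground-truth keypoints.

    area   : object scale squared (e.g. the ground-truth box or segment area)
    sigmas : per-keypoint falloff constants (dataset specific)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:  # unlabeled keypoints do not contribute
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0


# Illustrative values only: two keypoints on an object of area 100*100.
gt_kpts = [(10.0, 10.0), (20.0, 20.0)]
shifted = [(12.0, 10.0), (20.0, 23.0)]
vis = [2, 2]
sig = [0.026, 0.079]
area = 100.0 * 100.0

perfect_oks = oks(gt_kpts, gt_kpts, vis, sig, area)
shifted_oks = oks(shifted, gt_kpts, vis, sig, area)
print(perfect_oks, shifted_oks)
```

A loss of the form `1 - OKS` is one natural way to turn this similarity into a training objective; the exact weighting used by any listed method should be checked against its paper and code.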

▶ Simultaneous Multiple Person Pose Estimation and Instance Segmentation

  • Mask R-CNN(ICCV2017)(multi-task) Mask R-CNN [paper link]

  • PersonLab(ECCV2018)(multi-task) PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model [arxiv link][Codes|Keras&Tensorflow(unofficial by octiapp)][Codes|Tensorflow(unofficial)]

  • ACPNet(ICME2019) ACPNet: Anchor-Center Based Person Network for Human Pose Estimation and Instance Segmentation [paper link][based on Mask R-CNN]

  • Pose2Seg(CVPR2019) Pose2Seg: Detection Free Human Instance Segmentation [paper link][codes|official]

  • PointSetNet(ECCV2020) Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation [paper link][Not a multi-task end-to-end network; the proposed Point-Set Anchors can be applied to object detection, instance segmentation and human pose estimation tasks separately]

  • MG-HumanParsing(CVPR2021) Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing [paper link][code|official]

  • Multitask-CenterNet(ICCVW2021) MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning Using an Anchor Free Approach [paper link][based on the CenterNet]

  • MDSP(IVS2022 Oral) Multitask Network for Joint Object Detection, Semantic Segmentation and Human Pose Estimation in Vehicle Occupancy Monitoring [paper link]

  • PosePlusSeg(AAAI2022) Joint Human Pose Estimation and Instance Segmentation with PosePlusSeg [paper link][codes|official tensorflow][similar to PersonLab, Niaz Ahmad, suspected of plagiarism]

  • MultiPoseSeg(ICPR2022) MultiPoseSeg: Feedback Knowledge Transfer for Multi-Person Pose Estimation and Instance Segmentation [paper link][code|official][similar to PersonLab, Niaz Ahmad, suspected of plagiarism]

  • HCQNet(Human-Centric Query)(arxiv2023.03) Object-Centric Multi-Task Learning for Human Instances [paper link][based on the Mask2Former (CVPR2022) (Masked-attention mask transformer for universal image segmentation)]

▶ 3D Multiple Person Pose Estimation

▶ Special Multiple Person Pose Estimation

  • PoseTrack(CVPR2017) PoseTrack: Joint Multi-Person Pose Estimation and Tracking [arxiv link][Codes|Matlab&Caffe]

  • Detect-and-Track(CVPR2018) Detect-and-Track: Efficient Pose Estimation in Videos [arxiv link][project link][Codes|Detectron(official)][codes|official]

  • PoseFlow(BMVC2018) Pose Flow: Efficient Online Pose Tracking [arxiv link][Codes|AlphaPose(official)]

  • DensePose(CVPR2018) DensePose: Dense Human Pose Estimation In The Wild [arxiv link][project link][Codes|Caffe2(official)]

  • RF-Pose(CVPR2018)(radio frequency) Through-Wall Human Pose Estimation Using Radio Signals [paper link][project link]

  • 👍LIP_JPPNet(TPAMI2019) Look into Person: Joint Body Parsing & Pose Estimation Network and a New Benchmark [paper link][Lab Homepage][code|official][Joint Body Parsing & Pose Estimation]

  • DoubleFusion(TPAMI2019)(3D single-view real-time depth-sensor) DoubleFusion: Real-time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor [arxiv link]

  • Keypoint-Communities(ICCV2019) Keypoint Communities [paper link][Model all keypoints belonging to a human or an object (the pose) as a graph]

  • BlazePose (CVPRW2020) BlazePose: On-device Real-time Body Pose tracking [paper link][project link]

  • ODKD(arxiv2021) Orderly Dual-Teacher Knowledge Distillation for Lightweight Human Pose Estimation [paper link][Knowledge Distillation of MPPE based on HRNet]

  • DDP(3DV2021) Direct Dense Pose Estimation [paper link][Dense human pose estimation]

  • MEVADA(ICCV2021) Single View Physical Distance Estimation using Human Pose [paper link][project link]

  • UniPose+(TPAMI2022) UniPose+: A Unified Framework for 2D and 3D Human Pose Estimation in Images and Videos [paper link][author given link]

  • 👍HTCorrM(Human Task Correlation Machine)(TPAMI2022) On the Correlation among Edge, Pose and Parsing [paper link][pdf link][Multi-tasks Learning]

  • PoseTrack21(CVPR2022) PoseTrack21: A Dataset for Person Search, Multi-Object Tracking and Multi-Person Pose Tracking [paper link][codes|official][jointly person search, multi-object tracking and multi-person pose tracking]

  • PoseTrans(ECCV2022) PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation [paper link]

  • DeciWatch(ECCV2022) DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation [paper link][code|official][project link][Video based human pose estimation]

  • PPT(ECCV2022) PPT: Token-Pruned Pose Transformer for Monocular and Multi-view Human Pose Estimation [paper link][code|official]

  • QuickPose(SIGGRAPH2022) QuickPose: Real-time Multi-view Multi-person Pose Estimation in Crowded Scenes [paper link][ZJU]

  • TDMI-ST(CVPR2023) Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video [paper link][PoseTrack2017, PoseTrack2018, and PoseTrack21, video-based HPE]

  • MG-HumanParsing(TPAMI2023) Differentiable Multi-Granularity Human Parsing [paper link][code|official][Human Parsing]

  • Obj2Seq(NIPS2022) Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks [openreview link][arxiv link][code|official][ViT-based, Multi-task model]

  • 👍AutoLink(NIPS2022) AutoLink: Self-supervised Learning of Human Skeletons and Object Outlines by Linking Keypoints [arxiv link][openreview link][project link]

▶ Transfer Learning of Multiple Person Pose Estimation

Domain Adaptive / Unsupervised / Self-Supervised / Semi-Supervised / Weakly-Supervised / Generalizable

※ Active Learning for Pose

  • VL4Pose(BMVC2022) VL4Pose: Active Learning Through Out-Of-Distribution Detection For Pose Estimation [arxiv link][code|official][with tasks of single human pose and hand pose]

※ Pose in Real Classroom

  • 👍SynPose(ICASSP2022) Synpose: A Large-Scale and Densely Annotated Synthetic Dataset for Human Pose Estimation in Classroom [paper link][project link][Based on GTA-V, CycleGAN, ST-GCN and DEKR]

  • 👍CC-PoseNet(ICASSP2023) CC-PoseNet: Towards Human Pose Estimation in Crowded Classrooms [paper link]

※ Animal Pose Estimation

  • WS-CDA(ICCV2019) Cross-Domain Adaptation for Animal Pose Estimation [paper link][arxiv link][project link][code|official][Animal Pose Dataset, Leverages human pose data and a partially annotated animal pose dataset to perform semi-supervised domain adaptation]

  • 👍CC-SSL(CVPR2020) Learning From Synthetic Animals [paper link][arxiv link][code|official][Animal Pose][It proposed invariance and equivariance consistency learning with respect to transformations as well as temporal consistency learning with a video; It employs a single end-to-end trained network]

  • 👍MDAM, UDA-Animal-Pose(CVPR2021) From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation [paper link][codes|PyTorch][Animal Pose][ResNet + Hourglass][It proposed a refinement module and a self-feedback loop to obtain reliable pseudo labels; It addresses the teacher-student paradigm alongside a novel pseudo-label strategy]

  • DeepLabCut (Nature Methods 2022) Multi-animal pose estimation, identification and tracking with DeepLabCut [paper link]

  • Social LEAP Estimates Animal Poses (SLEAP) (Nature Methods 2022) SLEAP: A deep learning system for multi-animal pose tracking [paper link]

  • SemiMultiPose(arxiv2022) SemiMultiPose: A Semi-supervised Multi-animal Pose Estimation Framework [paper link][Semi-Supervised Keypoint Localization]

  • AnimalKingdom (CVPR2022) Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding [paper link][project link][arxiv link][code|official]

  • ScarceNet(CVPR2023) ScarceNet: Animal Pose Estimation With Scarce Annotations [paper link][arxiv link][code|official][Animal Pose, Semi-Supervised Keypoint Localization, based on HRNet][small-loss trick for reliability check + agreement check to identify reusable samples + student-teacher network (Mean Teacher) to enforce a consistency constraint]

  • AnimalTrack (IJCV2023) AnimalTrack: A Benchmark for Multi-Animal Tracking in the Wild [arxiv link][project link][download page][Animal dataset]

  • LoTE-Animal (ICCV2023) LoTE-Animal: A Long Time-span Dataset for Endangered Animal Behavior Understanding [paper link][project link][Animal dataset]

  • Animal3D (ICCV2023) Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape [paper link][arxiv link][project link][based on the SMAL model, Animal dataset]

  • Social Behavior Atlas (SBeA) (Nature Machine Intelligence 2024) Multi-animal 3D social pose estimation, identification and behaviour embedding with a few-shot learning framework [paper link]

※ Hand Pose Estimation

  • (CVIU2017) Hand Pose Estimation through Semi-Supervised and Weakly-Supervised Learning [paper link][arxiv link][Universite de Lyon; using the depth input]

  • (ECCV2018) Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images [paper link][No code is available, Nanyang Technological University, a weakly-supervised method with the aid of depth images, 3D Hand Pose Estimation, Keypoints]

  • SO-HandNet(ICCV2019) SO-HandNet: Self-Organizing Network for 3D Hand Pose Estimation With Semi-Supervised Learning [paper link][No code is available, Wuhan University, 3D Hand Pose Estimation, Keypoints, based on SO-Net and 3D point clouds]

  • weak_da_hands(CVPR2020) Weakly-Supervised Domain Adaptation via GAN and Mesh Model for Estimating 3D Hand Poses Interacting Objects [paper link][code|official (not available)]

  • SemiHand(ICCV2021) SemiHand: Semi-Supervised Hand Pose Estimation With Consistency [paper link][No code is available, semi-supervised hand pose estimation]

  • MarsDA(TCSVT2022) Multibranch Adversarial Regression for Domain Adaptative Hand Pose Estimation [paper link][based on RegDA, hand datasets (RHD→H3D), It applies a teacher-student approach to edit RegDA]

  • 👍C-GAC(ECCV2022) Domain Adaptive Hand Keypoint and Pixel Localization in the Wild [paper link][arxiv link][project link][based on Stacked Hourglass; all compared methods are reproduced by the author, no code is available]

  • DM-HPE(CVPR2023) Cross-Domain 3D Hand Pose Estimation With Dual Modalities [paper link][No code is available, cross-domain semi-supervised hand pose estimation, Dual Modalities]

※ Head Pose Estimation / Eye Gaze Estimation

belonging to the Domain Adaptive Regression (DGA) or Semi-Supervised Rotation Regression problem

  • PADACO(ICCV2019) Deep Head Pose Estimation Using Synthetic Images and Partial Adversarial Domain Adaption for Continuous Label Spaces [paper link][code|official][An adversarial training approach based on domain adversarial neural networks is used to force the extraction of domain-invariant features]

  • 👍Gaze360(ICCV2019) Gaze360: Physically Unconstrained Gaze Estimation in the Wild [paper link][arxiv link][project link][dataset Gaze360, Domain Adaptive Gaze Estimation]

  • few_shot_gaze(ICCV2019 oral) Few-Shot Adaptive Gaze Estimation [paper link][arxiv link][code|official][Domain Adaptive Gaze Estimation]

  • DeepDAR(SpringerBook2020) Deep Domain Adaptation for Regression [paper link][pdf link][Domain Adaptive Regression (DGA) theory, Age Estimation and Head Pose Estimation][book title Development and Analysis of Deep Learning Architectures]

  • DAGEN(ACCV2020) Domain Adaptation Gaze Estimation by Embedding with Prediction Consistency [paper link][arxiv link][Eye Gaze Estimation]

  • (FG2021) Relative Pose Consistency for Semi-Supervised Head Pose Estimation [paper link][pdf link][Semi-Supervised]

  • PnP-GA(ICCV2021) Generalizing Gaze Estimation With Outlier-Guided Collaborative Adaptation [paper link][arxiv link][code|official][Domain Adaptive Gaze Estimation]

  • 👍RSD(ICML2021) Representation Subspace Distance for Domain Adaptation Regression [paper link][code|official][Domain Adaptive Regression (DGA) theory, Mingsheng Long, datasets dSprites(a standard 2D synthetic dataset for deep representation learning) and MPI3D(a simulation-to-real dataset of 3D objects)]

  • DINO-INIT & DINO-TRAIN(NIPS2022) Distribution-Informed Neural Networks for Domain Adaptation Regression [paper link][Domain Adaptive Regression (DGA) theory]

  • SynGaze(CVPRW2022) Learning-by-Novel-View-Synthesis for Full-Face Appearance-Based 3D Gaze Estimation [paper link][arxiv link][The University of Tokyo, Eye Gaze Estimation, No code]

  • RUDA(CVPR2022) Generalizing Gaze Estimation With Rotation Consistency [paper link][Eye Gaze Estimation, No code]

  • CRGA(CVPR2022) Contrastive Regression for Domain Adaptation on Gaze Estimation [paper link][SJTU, Eye Gaze Estimation, No code]

  • (TBIOM2023) Domain Adaptation for Head Pose Estimation Using Relative Pose Consistency [paper link]

  • AdaptiveGaze(arxiv2023.05) Domain-Adaptive Full-Face Gaze Estimation via Novel-View-Synthesis and Feature Disentanglement [arxiv link][code|official][The University of Tokyo, Eye Gaze Estimation]

  • 👍DARE-GRAM(CVPR2023) DARE-GRAM: Unsupervised Domain Adaptation Regression by Aligning Inverse Gram Matrices [paper link][code|official][HPE domain transfer test for Male→Female on BIWI dataset]

  • (AAAI2023) Learning a Generalized Gaze Estimator from Gaze-Consistent Feature [paper link]

  • 👍UnReGA(CVPR2023) Source-Free Adaptive Gaze Estimation by Uncertainty Reduction [paper link][paperswithcode link][code|official (not released)]

  • PnP-GA+(TPAMI2023) PnP-GA+: Plug-and-Play Domain Adaptation for Gaze Estimation using Model Variants [paper link][Domain Adaptive Gaze Estimation, extended based on PnP-GA(ICCV2021)]

※ 3D Human Pose Estimation

  • pose-hg-3d(ICCV2017) Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach [paper link][code|official][3D keypoints detection, weakly-supervised domain adaptation with a 3D geometric constraint-induced loss]

  • 3DKeypoints-DA(ECCV2018) Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency [paper link][arxiv link][code|official][It utilizes view-consistency to regularize predictions from unlabeled target domain in 3D keypoints detection, but depth scans and images from different views are required on the target domain]

  • (ACMMM2019) Unsupervised Domain Adaptation for 3D Human Pose Estimation [paper link][3D keypoints detection]

  • (CVPR2020) Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild [paper link][arxiv link][NVIDIA, It focuses on unlabelled multi-view images, Self-supervised learning for 3D human pose estimation]

  • AdaptPose(CVPR2022) AdaptPose: Cross-Dataset Adaptation for 3D Human Pose Estimation by Learnable Motion Generation [paper link][3D keypoints detection]

  • FewShot3DKP(CVPR2023) Few-Shot Geometry-Aware Keypoint Localization [paper link][project link][Few-Shot Learning, 3D Keypoint Localization, human faces, eyes, animals, cars, and never-before-seen mouth interior (teeth) localization tasks]

  • ACSM-Plus(CVPR2023) Learning Articulated Shape With Keypoint Pseudo-Labels From Web Images [paper link][2D Keypoints for downstream application, 3D Reconstruction / Shape Recovery from 2D images]

  • PoseDA (ICCV2023) Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation [paper link][arxiv link][code|official][ZJU]

  • 👍3D-Pose-Transfer (ICCV2023) Weakly-supervised 3D Pose Transfer with Keypoints [paper link][arxiv link][project link][code|official][National University of Singapore]

  • UAO(arxiv2024.02) Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation [arxiv link][Peking University, Shenzhen]

※ 2D Human Pose Estimation (Single and Multiple)

  • DataDistill, Pseudo-Labeling, PL(CVPR2018) Data Distillation: Towards Omni-Supervised Learning [paper link][arxiv link][Omni-Supervised Learning, a special regime of semi-supervised learning, with tasks human keypoint detection and general object detection]

  • MONET(ICCV2019) MONET: Multiview Semi-Supervised Keypoint Detection via Epipolar Divergence [paper link][arxiv link][code|official][University of Minnesota, multi-view inputs]

  • Pose_DomainAdaption(ACMMM2020) Alleviating Human-level Shift: A Robust Domain Adaptation Method for Multi-person Pose Estimation [paper link][Codes|PyTorch (not available)][(TMM2022 extended journal version) Structure-enriched Topology Learning for Cross-domain Multi-person Pose estimation]

  • SSKL(ICLR2021) Semi-supervised Keypoint Localization [openreview link][arxiv link][code|official][author Olga Moskvyak's homepage][single hand datasets, single person datasets, Semi-Supervised Keypoint Localization]

  • Semi_Human_Pose(ICCV2021) An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation [paper link][arxiv link][codes|official PyTorch][Semi-Supervised 2D Human Pose Estimation]

  • 👍❤RegDA(CVPR2021) Regressive Domain Adaptation for Unsupervised Keypoint Detection [paper link][project library][code|official][hand datasets (RHD→H3D), human datasets (SURREAL→Human3.6M, SURREAL→LSP)][ResNet101 + Simple Baseline][based on the DA classification method disparity discrepancy (DD) (ICML2019, authors including Mingsheng Long and Michael Jordan)][It utilizes one shared feature extractor and two separate regressors; It made changes in DD for human and hand pose estimation tasks, which measures discrepancy by estimating false predictions on the target domain]

  • 👍HPE-AdaptOR(arxiv2021.08)(Medical Image Analysis2022) Unsupervised domain adaptation for clinician pose estimation and instance segmentation in the operating room [paper link][arxiv link][code|official]

  • TransPar(TIP2022) Learning Transferable Parameters for Unsupervised Domain Adaptation [paper link][arxiv link][evaluated on image classification and regression (keypoint detection) tasks][hand datasets (RHD→H3D), It emphasizes transferable parameters using a structure similar to RegDA, which has one shared feature extractor and two separate regressors]

  • 👍❤UniFrame, UDA_PoseEstimation(ECCV2022) A Unified Framework for Domain Adaptive Pose Estimation [paper link][arxiv link][code|official][hand datasets (RHD→H3D), human datasets (SURREAL→Human3.6M, SURREAL→LSP), animal datasets (SynAnimal→TigDog, SynAnimal→AnimalPose), based on RegDA][ResNet101 + Simple Baseline][AdaIN (ICCV2017) for image style transfer + Mean Teacher for student model updating; It modifies the classic Mean-Teacher model by combining it with style transfer AdaIN]

  • iart-semi-pose(ACMMM2022) Semi-supervised Human Pose Estimation in Art-historical Images [arxiv link][code|official][Germany, Semi-Supervised 2D Human Pose Estimation]

  • PLACL(ICLR2022) Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization [openreview link][arxiv link][author Sheng Jin's homepage][Semi-Supervised Keypoint Localization, backbone HRNet-w32, Curriculum Learning + Reinforcement Learning, slightly better than SSKL(ICLR2021)][largely based on (Curriculum-Labeling, AAAI2021) Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning]

  • ADHNN(AAAI2022) Adaptive Hypergraph Neural Network for Multi-person Pose Estimation [paper link][Codes|PyTorch (not available)]

  • (WACV2022) Transfer Learning for Pose Estimation of Illustrated Characters [paper link][arxiv link][codes|official PyTorch]

  • CD_HPE(ICASSP2022) Towards Accurate Cross-Domain in-Bed Human Pose Estimation [paper link][arxiv link][code|official]

  • EdgeTrans4Mark(ECCV2022) One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement [paper link][arxiv link][code|official][PKU, Landmark Localization, Medical Image]

  • SSPCM(CVPR2023) Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module [paper link][arxiv link][code|official][Semi-Supervised 2D Human Pose Estimation]

  • SCAI(self-correctable and adaptable inference)(CVPR2023) Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation [paper link][arxiv link][Domain Generalization][It works as a plug-and-play module for top-down human pose estimation methods like SimpleBaseline and HRNet; by the same author as SCIO]

  • Full-DG(full-view data generation)(TNNLS2023) Overcoming Data Deficiency for Multi-Person Pose Estimation [paper link][Full-DG can help improve pose estimators’ robustness and generalizability]

  • MAPS(arxiv2023.02) MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [arxiv link][code|official][hand datasets (RHD→H3D), human datasets (SURREAL→LSP), animal datasets (SynAnimal→TigDog, SynAnimal→AnimalPose), based on RegDA and UniFrame]

  • ImSty(Implicit Stylization)(ICLRW2023) Implicit Stylization for Domain Adaptation [openreview link][pdf link][workshop homepage]

  • SF-DAPE(ICCV2023) Source-free Domain Adaptive Human Pose Estimation [paper link][arxiv link][code|official][Source-free Domain Adaptation, hand datasets (RHD→H3D, RHD→FreiHand), human datasets (SURREAL→Human3.6M, SURREAL→LSP)]

  • POST(ICCV2023) Prior-guided Source-free Domain Adaptation for Human Pose Estimation [paper link][arxiv link][Source-free Domain Adaptation, Self-training, human datasets (SURREAL→Human3.6M, SURREAL→LSP)]

  • Pseudo-Heatmaps(arxiv2023.10) Denoising and Selecting Pseudo-Heatmaps for Semi-Supervised Human Pose Estimation [arxiv link][based on the DualPose (ICCV2021), does not compare with SSPCM(CVPR2023)]

  • MDSs(arxiv2023.10)(under review in ICLR2024) Modeling the Uncertainty with Maximum Discrepant Students for Semi-supervised 2D Pose Estimation [arxiv link][code|official][based on the DualPose (ICCV2021), does not compare with SSPCM(CVPR2023)]
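Several entries in this section (for example UniFrame and ScarceNet) are annotated as building on the Mean Teacher scheme, in which a teacher model's weights track an exponential moving average (EMA) of the student's weights and the teacher supplies pseudo-labels or consistency targets. A minimal sketch of the EMA update alone, assuming parameters are stored as a flat dict of floats; `ema_update` and the decay value are hypothetical and not taken from any listed repository.

```python
from typing import Dict


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               alpha: float = 0.99) -> None:
    """Update teacher parameters in place: t <- alpha * t + (1 - alpha) * s."""
    for name, s_value in student.items():
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * s_value


# Toy parameters (illustrative values only).
teacher_params = {"w": 1.0, "b": 0.0}
student_params = {"w": 0.0, "b": 1.0}

# One training step: the student would be updated by gradient descent first,
# then the teacher is nudged toward it.
ema_update(teacher_params, student_params, alpha=0.9)
print(teacher_params)
```

In a deep learning framework the same update is applied tensor by tensor with gradients disabled, and a larger `alpha` makes the teacher change more slowly.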

▶ Keypoints Meet Large Language Model

Large Language Model / Large Vision Model / Vision-Language Model for Human / Animals / Anything

  • 👍CLAMP(CVPR2023) CLAMP: Prompt-Based Contrastive Learning for Connecting Language and Animal Pose [paper link][arxiv link][code|official][CLIP, Tao Dacheng, trained and tested on dataset AP-10K, also see APT-36K]

  • PoseFix(ICCV2023) PoseFix: Correcting 3D Human Poses with Natural Language [paper link][arxiv link][code|official]

  • UniAP(arxiv2023.08) UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning [arxiv link][CLIP, ZJU, Few-shot Learning, various perception tasks including pose estimation, segmentation, and classification tasks]

  • KDSM(arxiv2023.10) Language-driven Open-Vocabulary Keypoint Detection for Animal Body and Face [arxiv link][CLIP, XJU + Shanghai AI Lab, Open-Vocabulary Keypoint Detection]

  • UniPose(arxiv2023.10)(under review in ICLR2024) UniPose: Detecting Any Keypoints [openreview link][arxiv link][project link][code|official][IDEA-Research, using visual or textual prompts]

  • VLPose(arxiv2024.02) VLPose: Bridging the Domain Gap in Pose Estimation with Language-Vision Tuning [arxiv link][by CUHK, Language-Vision Model, on datasets COCO and HumanArt][VLPose leverages the synergy between language and vision to extend the generalization and robustness of pose estimation models beyond the traditional domains.]

▶ Keypoints for Human Motion Generation

Motion Synthesis / Motion Diffusion Model