classify human actions using pose estimation with tflite/pytorch and scikit-learn

ActionAI 🤸


ActionAI is a Python library for training machine learning models to classify human actions. It is a generalization of our yoga smart personal trainer, which is included in this repo as an example.


Getting Started

These instructions will show how to prepare your image data, train a model, and deploy the model to classify human action from image samples. See deployment for notes on how to deploy the project on a live stream.



We recommend using a virtual environment to avoid any conflicts with your system's global configuration. You can install the required dependencies via pip:

Jetson Nano Installation

We use the trt_pose repo to extract pose estimations. See that repo for instructions on installing its required dependencies.

# Assuming your python path points to python 3.x 
pip install -r requirements.txt

All preprocessing, training, and deployment configuration variables are stored in a configuration file in the config/ directory. You can create your own configuration files and store them in this directory for fast experimentation.

The included configuration file imports a LogisticRegression model as the default classifier.
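For illustration, a configuration module along the following lines would be consistent with the calls `config.classifier()` and `config.csv_path` that appear in the training example in this README. This is only a sketch: the CSV path and the choice of scikit-learn's LogisticRegression are assumptions, not the repository's actual settings.

```python
# Hypothetical config module. The attribute names mirror the calls shown in
# this README, but the values are placeholders, not the project's defaults.
from sklearn.linear_model import LogisticRegression

# A class (not an instance), so that config.classifier() builds a fresh estimator.
classifier = LogisticRegression

# Assumed location of the staged dataset CSV.
csv_path = "data/data.csv"
```

Keeping the classifier as a class rather than an instance means each experiment starts from an unfitted estimator.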


After preprocessing your image data using the script, you can create a model by calling the actionModel() function, which creates a scikit-learn pipeline. Then call the trainModel() function with your data to train it:

# Stage your model
pipeline = actionModel(config.classifier())

# Train your model
model = trainModel(config.csv_path, pipeline)

Data processing

Arrange your image data as a directory of subdirectories, each subdirectory named as a label for the images contained in it. Your directory structure should look like this:

├── images_dir
│   ├── class_1
│   │   ├── sample1.png
│   │   ├── sample2.jpg
│   │   ├── ...
│   ├── class_2
│   │   ├── sample1.png
│   │   ├── sample2.jpg
│   │   ├── ...
.   .
.   .

Samples should be standard image files that the Pillow library can read.

To generate a dataset from your images, run the preprocessing script.


This will stage the labeled image dataset in a CSV file written to the data/ directory.
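As a rough sketch of what such a staging step could look like, the function below walks a directory laid out as described above and writes one row per image. The function name `stage_dataset`, the column names `path` and `label`, and the accepted extensions are all illustrative assumptions; the repository's own script may work differently.

```python
import csv
from pathlib import Path

# Hypothetical set of accepted extensions; the real script may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def stage_dataset(images_dir, csv_path):
    """Write one (path, label) row per image found under images_dir.

    Each immediate subdirectory of images_dir is treated as a class label.
    Returns the number of image rows written.
    """
    images_dir = Path(images_dir)
    csv_path = Path(csv_path)
    csv_path.parent.mkdir(parents=True, exist_ok=True)

    rows = []
    for class_dir in sorted(p for p in images_dir.iterdir() if p.is_dir()):
        for image in sorted(class_dir.iterdir()):
            if image.is_file() and image.suffix.lower() in IMAGE_EXTENSIONS:
                rows.append((str(image), class_dir.name))

    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])  # assumed column names
        writer.writerows(rows)
    return len(rows)
```

A call such as `stage_dataset("images_dir", "data/dataset.csv")` would then produce the CSV; both paths here are placeholders.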


After the CSV file is read into a DataFrame, a custom scikit-learn transformer estimates body keypoints to produce a low-dimensional feature vector for each sample image. This representation is fed into the scikit-learn classifier specified in the config file.
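The general pattern of a custom transformer followed by a classifier can be sketched as below. This is not the library's implementation: `PoseFeatures`, `fake_keypoints`, and `action_model` are hypothetical names, and the pose estimator is replaced by a placeholder that treats each sample as an already-extracted array of keypoint coordinates. The data is synthetic.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def fake_keypoints(sample):
    """Placeholder for a pose estimator returning (n_keypoints, 2) coordinates.

    A real transformer would run a pose model on an image here; in this
    sketch each sample is already a keypoint array and is passed through.
    """
    return np.asarray(sample, dtype=float)


class PoseFeatures(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: one flat keypoint vector per sample."""

    def fit(self, X, y=None):
        return self  # stateless, nothing to learn

    def transform(self, X):
        return np.stack([fake_keypoints(x).ravel() for x in X])


def action_model(classifier):
    """Chain the feature extractor and a classifier into one pipeline."""
    return Pipeline([("pose", PoseFeatures()), ("clf", classifier)])


# Synthetic stand-in data: 3 keypoints per sample, two well-separated classes.
rng = np.random.default_rng(0)
X = [rng.normal(loc=c, scale=0.1, size=(3, 2)) for c in [1.0] * 10 + [-1.0] * 10]
y = ["up"] * 10 + ["down"] * 10

model = action_model(LogisticRegression()).fit(X, y)
print(model.predict([np.full((3, 2), 1.0)]))
```

Wrapping both steps in a single pipeline means the same keypoint extraction is applied at training and inference time, and the whole object can be saved as one unit.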

Run the training script to train and save a classifier.


The pickled model will be saved in the models/ directory.
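Saving and reloading a pickled model can be done with Python's standard `pickle` module, roughly as follows. The helper names `save_model` and `load_model` and the example path are assumptions for illustration; the training script may handle this differently.

```python
import pickle
from pathlib import Path


def save_model(model, path):
    """Pickle any picklable object (for example a fitted pipeline) to path."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(model, f)
    return path


def load_model(path):
    """Load a pickled object back from disk.

    Only unpickle files you trust: loading a pickle can run arbitrary code.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

For example, `save_model(model, "models/classifier.pkl")` followed later by `load_model("models/classifier.pkl")`, where the file name is a placeholder. A model pickled with one scikit-learn version may not load cleanly under another, so it helps to record the version alongside the file.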


We've provided a sample inference script that reads input from a webcam, MP4 file, or RTSP stream, runs inference on each frame, and prints the results.
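A frame-by-frame inference loop of this kind might look like the sketch below. It is not the provided script: `open_capture_frames` and `run_inference` are hypothetical helpers, OpenCV (`opencv-python`) is an assumed dependency that is imported only when a capture is actually opened, and `predict` stands for whatever callable maps a frame to a label.

```python
def open_capture_frames(source=0):
    """Yield frames from a webcam index, a video file path, or an RTSP URL.

    Assumes opencv-python is installed; it is imported lazily so the rest
    of this sketch works without it.
    """
    import cv2

    capture = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of stream or read failure
                break
            yield frame
    finally:
        capture.release()


def run_inference(frames, predict, report=print):
    """Call predict(frame) on every frame and report each result.

    Returns the number of frames processed.
    """
    count = 0
    for index, frame in enumerate(frames):
        label = predict(frame)
        report(f"frame {index}: {label}")
        count += 1
    return count
```

With a trained model, usage could be along the lines of `run_inference(open_capture_frames("video.mp4"), lambda frame: model.predict([frame])[0])`, where the file name and the shape of the `predict` call are placeholders. Separating the frame source from the loop makes it possible to test the loop with any iterable of frames.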


Please read the contributing guidelines for details on our code of conduct and the process for submitting pull requests to us.


This project is licensed under the GNU General Public License v3.0 - see the license file for details.

