
A small library that uses NumPy to extract features from both the time and frequency domains of CSV files. The files must contain inertial data.


LIARALab/Python-FeatureExtractor

 
 


Python FeatureExtractor


Description

This project is a Python 3 application designed to work with CSV files. It extracts features from any dataset recorded with inertial sensors such as accelerometers, gyroscopes or magnetometers. The library adapts to any number of devices with any number of axes (for example, a 2-axis accelerometer with a 3-axis gyroscope, or a single 3-axis accelerometer).

Features

Every FeatureExtractor instance holds a FeatureManagement instance that tracks the enabled features. For each sensor, the following features can be extracted:

AVERAGE                     = 0x01
AVERAGE_TOTAL               = 0x02
STANDARD_DEVIATION          = 0x04
STANDARD_DEVIATION_TOTAL    = 0x08
SKEWNESS                    = 0x10
SKEWNESS_TOTAL              = 0x20
KURTOSIS                    = 0x40
KURTOSIS_TOTAL              = 0x80
ZERO_CROSSING_RATE          = 0x100
ZERO_CROSSING_RATE_TOTAL    = 0x200
CORRELATION                 = 0x400
CORRELATION_TOTAL           = 0x800
DC_COMPONENT                = 0x1000
DC_COMPONENT_TOTAL          = 0x2000
ENERGY                      = 0x4000
ENERGY_TOTAL                = 0x8000
ENTROPY                     = 0x10000
ENTROPY_TOTAL               = 0x20000

The set of activated features is stored as a single number in which each feature occupies one bit, so it is easy to read in binary. Managing features is simple: use the Add or Remove methods. If a feature depends on its base form and that base form is not activated, it will not be computed (for example, ENERGY_TOTAL requires ENERGY; if ENERGY is disabled, ENERGY_TOTAL is disabled too).

The following example shows how features are managed:

from Feature import Feature, ReturnType
from FeatureExtractor import FeatureExtractor

Extractor = FeatureExtractor()

print(Extractor.FeatureManagement.GetAll(return_type=ReturnType.BIN))
# prints 0b111111111111111111

Extractor.FeatureManagement.Remove(Feature.DC_COMPONENT_TOTAL)
Extractor.FeatureManagement.Remove(Feature.ENERGY_TOTAL)
Extractor.FeatureManagement.Remove(Feature.ENTROPY_TOTAL)

print(Extractor.FeatureManagement.GetAll(return_type=ReturnType.BIN))
# prints 0b010101111111111111

Extractor.FeatureManagement.Add(Feature.ENTROPY_TOTAL)

print(Extractor.FeatureManagement.GetAll(return_type=ReturnType.BIN))
# prints 0b110101111111111111
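
The binary values in this example follow directly from the flag values listed earlier. As a rough illustration only, the sketch below rebuilds the same arithmetic with the standard library's `enum.IntFlag`. The helper names `add`, `remove` and `effective` are hypothetical and are not the library's own methods, and the dependency rule in `effective` is one possible reading of the "base form" behaviour described above.

```python
from enum import IntFlag


class Feature(IntFlag):
    """Hypothetical re-creation of the flag values listed in this README."""
    AVERAGE = 0x01
    AVERAGE_TOTAL = 0x02
    STANDARD_DEVIATION = 0x04
    STANDARD_DEVIATION_TOTAL = 0x08
    SKEWNESS = 0x10
    SKEWNESS_TOTAL = 0x20
    KURTOSIS = 0x40
    KURTOSIS_TOTAL = 0x80
    ZERO_CROSSING_RATE = 0x100
    ZERO_CROSSING_RATE_TOTAL = 0x200
    CORRELATION = 0x400
    CORRELATION_TOTAL = 0x800
    DC_COMPONENT = 0x1000
    DC_COMPONENT_TOTAL = 0x2000
    ENERGY = 0x4000
    ENERGY_TOTAL = 0x8000
    ENTROPY = 0x10000
    ENTROPY_TOTAL = 0x20000


ALL = 0x3FFFF        # all 18 feature bits set
BASE_BITS = 0x15555  # base features sit on the even bit positions


def add(mask: int, feature: Feature) -> int:
    """Switch one feature bit on."""
    return mask | int(feature)


def remove(mask: int, feature: Feature) -> int:
    """Switch one feature bit off."""
    return mask & ~int(feature)


def effective(mask: int) -> int:
    """Drop every *_TOTAL bit whose base feature (the bit just below) is off."""
    base = mask & BASE_BITS
    return base | (mask & (base << 1))


mask = ALL
for feature in (Feature.DC_COMPONENT_TOTAL, Feature.ENERGY_TOTAL, Feature.ENTROPY_TOTAL):
    mask = remove(mask, feature)
print(format(mask, "#020b"))  # 0b010101111111111111

mask = add(mask, Feature.ENTROPY_TOTAL)
print(format(mask, "#020b"))  # 0b110101111111111111
```

Because each `*_TOTAL` flag is exactly one bit above its base flag, the dependency check reduces to a single shift and mask.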

Note that three features come from the frequency domain and require FFT data, which increases the amount of computation. To avoid that cost, Remove them as shown in the previous example.
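
To give an idea of why these features cost more, here is a minimal sketch of frequency-domain features computed from a transform of the signal. It assumes the three features in question are the DC component, the energy and the entropy, which this README does not state explicitly. The formulas are common textbook definitions and may differ from the ones the library uses. A naive discrete Fourier transform written with `cmath` stands in for the FFT so that the example needs only the standard library.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def frequency_features(signal):
    """Return assumed definitions of DC component, energy and entropy."""
    n = len(signal)
    magnitudes = [abs(c) for c in dft(signal)]
    power = [m * m for m in magnitudes]
    total_power = sum(power)

    # DC component: magnitude of the zero-frequency bin, scaled by the length.
    dc_component = magnitudes[0] / n
    # Energy: sum of squared magnitudes, normalised by the signal length.
    energy = total_power / n
    # Entropy: Shannon entropy of the normalised power spectrum.
    entropy = 0.0
    if total_power > 0:
        probabilities = [p / total_power for p in power if p > 0]
        entropy = -sum(p * math.log2(p) for p in probabilities)

    return {"dc_component": dc_component, "energy": energy, "entropy": entropy}


features = frequency_features([1.0, 1.0, 1.0, 1.0])
print(features)
```

For a constant signal, all of the power falls in the zero-frequency bin, so the entropy is close to zero and the DC component equals the signal value.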

Installation

Here is the list of dependencies needed to make the library work:

To install this library, clone the Git repository with this command:

git clone https://github.com/kevinchapron/Python-FeatureExtractor.git

Usage

example.py uses the activity dataset created by the LIARA laboratory. The example automatically downloads the dataset from the Git repository.

The FeatureExtractor class provides the following methods:

Extractor.AddDevice({"name":"Accelerometer","tab":["ax","ay","az"]})
  • name is the full name of the sensor.
  • tab is a list of the column names associated with the device.

DATA = {"c1":[0,1,2,3],"c2":[1,2,3,4],"c3":[2,3,4,5]}
Extractor.ExtractFeatures(DATA)

DATA is a dict that maps each column name to the list of values in that column.
The DATA variable above has the structure corresponding to this CSV file:

c1 c2 c3
0 1 2
1 2 3
2 3 4
3 4 5
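
To make the relationship between a device descriptor and the column dict more concrete, the sketch below picks a device's columns out of such a dict and computes a few time-domain features per axis. It is an illustration under assumptions, not the library's implementation: the function names, the output key names and the choice of population standard deviation are all hypothetical.

```python
import statistics

# Column dict in the same shape as DATA, using the accelerometer column names.
DATA = {"ax": [1, -1, 1, -1], "ay": [0, 1, 2, 3], "az": [2, 2, 2, 2]}
device = {"name": "Accelerometer", "tab": ["ax", "ay", "az"]}


def zero_crossing_rate(values):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(values, values[1:]) if (a < 0) != (b < 0))
    return crossings / (len(values) - 1)


def time_features(data, device):
    """Compute per-axis features for the columns listed in device['tab']."""
    features = {}
    for column in device["tab"]:
        values = data[column]
        features[f"{column}_average"] = statistics.fmean(values)
        features[f"{column}_std"] = statistics.pstdev(values)  # population std (assumed)
        features[f"{column}_zcr"] = zero_crossing_rate(values)
    return features


features = time_features(DATA, device)
for name, value in features.items():
    print(name, value)
```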

Extractor.ExtractDataFromFile(FILENAME)

This method returns the CSV data in the format used by this library, so it can be combined with ExtractFeatures.
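
As a rough sketch of what such a conversion involves, the code below reads CSV text into a dict of columns with the standard `csv` module. The function name `load_columns` is hypothetical, and the sketch assumes a comma-separated file with a header row and numeric values. The library's own parsing may differ.

```python
import csv
import io


def load_columns(stream):
    """Read CSV rows from a text stream into a {column name: [values]} dict."""
    reader = csv.DictReader(stream)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


# Same content as the example table, written as comma-separated text.
CSV_TEXT = "c1,c2,c3\n0,1,2\n1,2,3\n2,3,4\n3,4,5\n"
DATA = load_columns(io.StringIO(CSV_TEXT))
print(DATA)
```

To read from disk instead, open the file with `open(FILENAME, newline="")` and pass the handle to the same function.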

Extractor.ExtractFeaturesFromFolder(FOLDER,output_file=None,class_added=None)
  • This method returns a CSV-like dataset containing every requested feature (each file produces one row).
  • FOLDER is the full path to the folder containing the CSV files.
  • output_file is optional and has three possible values:
    • None → No output file is created.
    • "auto" → The output file is created in the "output" folder and named after the input folder.
    • "custom_string" → The output file is created at the specified location.
  • class_added is optional and has two possible values:
    • None → No class is added.
    • "custom_string" → A column named "class" is appended to the resulting data, with this custom string as its value.
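
The one-row-per-file behaviour and the two optional parameters can be sketched as follows. This is a simplified stand-in rather than the library's code: `features_from_folder` and `resolve_output` are hypothetical names, only a per-column average is computed, and the "output" folder is assumed to be relative to the current working directory.

```python
import csv
import statistics
from pathlib import Path


def resolve_output(folder, output_file):
    """Map the three documented output_file cases to a path (or None)."""
    if output_file is None:
        return None
    if output_file == "auto":
        return Path("output") / f"{Path(folder).name}.csv"
    return Path(output_file)


def features_from_folder(folder, output_file=None, class_added=None):
    """Build one feature row per CSV file found in the folder."""
    folder = Path(folder)
    rows = []
    for csv_path in sorted(folder.glob("*.csv")):
        with csv_path.open(newline="") as handle:
            records = list(csv.DictReader(handle))
        if not records:
            continue  # skip empty files
        row = {
            f"{name}_average": statistics.fmean(float(r[name]) for r in records)
            for name in records[0]
        }
        if class_added is not None:
            row["class"] = class_added  # label column goes last
        rows.append(row)

    target = resolve_output(folder, output_file)
    if target is not None and rows:
        target.parent.mkdir(parents=True, exist_ok=True)
        with target.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)
    return rows
```

Calling `features_from_folder("data/walking", output_file="auto", class_added="walking")` on a hypothetical folder would, under these assumptions, write `output/walking.csv` with one labelled row per input file.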

Extractor.MergeFiles(FOLDER,output_file=None)
  • This method returns a CSV-like dataset that merges the files in the specified folder.
  • FOLDER is the full path to the folder containing the CSV files.
  • output_file is optional and has three possible values:
    • None → No output file is created.
    • "auto" → The output file is created in the specified folder, with the name "merging.csv".
    • "custom_string" → The output file is created at the specified location.
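
The README does not define what "merge" means here. The sketch below assumes the simplest reading: all files share one header and their rows are concatenated in file-name order. The function name `merge_files` and the decision to skip an existing "merging.csv" are hypothetical choices made for this illustration.

```python
import csv
from pathlib import Path


def merge_files(folder, output_file=None):
    """Concatenate the rows of every CSV file in a folder (assumed behaviour)."""
    folder = Path(folder)
    header, rows = None, []
    for csv_path in sorted(folder.glob("*.csv")):
        if csv_path.name == "merging.csv":
            continue  # do not merge a previous "auto" output into itself
        with csv_path.open(newline="") as handle:
            reader = csv.reader(handle)
            file_header = next(reader, None)
            if file_header is None:
                continue  # empty file
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"{csv_path.name} has a different header")
            rows.extend(reader)

    if output_file is not None and header is not None:
        if output_file == "auto":
            target = folder / "merging.csv"
        else:
            target = Path(output_file)
        with target.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(rows)
    return header, rows
```

Raising on a header mismatch is a deliberate choice for the sketch: silently mixing files with different columns would produce a dataset that is hard to debug later.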

More information

This project was created to extract features from real-time data at the LIARA laboratory (Laboratoire d'Intelligence Ambiante pour la Reconnaissance d'Activités) of the Université du Québec à Chicoutimi (UQAC).

Author

Kévin CHAPRON - 2018

License

Copyright 2016 Kévin Chapron

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
