This repository presents research in computer vision and information processing applied to the detection of human biosignals. As a first milestone, it presents the proposal and development of a dataset that enables the training of neural networks on signal signatures in time series.
The objective is to extract, analyze, and interpret human biosignals through a technological route that standardizes visual acquisition via video cameras into normalized measures describing behavior signatures as multivariate time series.
In video image processing, the conventional technological route is shown as flow 1 in Figure 1. This approach is the most widely used because it focuses on multilayer detection and on neural networks operating in the image domain. This research instead addresses route 2, which builds computational structures that extract and normalize the information of interest and perform the analysis in the time-series domain.
Figure 1 - Approach 1 uses neural networks in the image domain, while approach 2 performs neural network processing in the domain of multivariate time series.
In general terms, the ideal computational solution in this approach would provide the following functionalities, numbered according to Figure 2:
- Detect and crop the human face in the image, separating it from other elements in the scene.
- Identify the points of interest that describe the face.
- Calibrate, normalize, and standardize observations to generate measurable and comparable criteria.
- Extract measurements from spatial and temporal descriptors and generate multivariate time series.
- Perform analyses, cross-links, and correlations with real-world events.
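Steps 2–4 above can be illustrated with a minimal NumPy sketch. It assumes facial landmarks are already available as (x, y) pixel coordinates per frame; the landmark indices and the synthetic input data below are hypothetical, not the project's actual landmark scheme:

```python
import numpy as np

def normalize_landmarks(landmarks, left_eye_idx=0, right_eye_idx=1):
    """Center landmarks on the inter-ocular midpoint and scale by the
    inter-ocular distance, making frames comparable (step 3)."""
    landmarks = np.asarray(landmarks, dtype=float)
    left, right = landmarks[left_eye_idx], landmarks[right_eye_idx]
    center = (left + right) / 2.0
    scale = np.linalg.norm(right - left)
    return (landmarks - center) / scale

def frames_to_series(frames_landmarks):
    """Stack per-frame normalized landmarks into a multivariate time
    series of shape (n_frames, n_landmarks * 2) (step 4)."""
    rows = [normalize_landmarks(lm).ravel() for lm in frames_landmarks]
    return np.vstack(rows)

# Two synthetic frames with 3 landmarks each (two eyes and nose tip);
# the second frame is the same face at twice the scale.
frames = [
    [(100, 100), (140, 100), (120, 130)],
    [(200, 200), (280, 200), (240, 260)],
]
series = frames_to_series(frames)
# Both rows are identical: the normalization removes position and scale.
```

Because the measures are normalized per frame, the resulting series describes facial behavior independently of where the face appears in the image or how large it is.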
The first requirement for this proposal to become feasible is the availability of time-series data extracted from videos recorded under diverse conditions. It is therefore necessary to have organized datasets representing the various classes of possibilities, enabling the training of the neural networks needed to provide the functionalities presented in Figure 2.
Figure 3 below indicates the main modules comprising the two stages of the project (A and B).
Figure 3 - Dataset stages: A represents the raw-data generation flow and B the labeling and qualification flow.
A more detailed description of each block follows:
If you just want to run the application and see how it works:
If you want to use your own video source:
- Create a better normalization process, so that yaw and pitch head movements are properly normalized.
- Improve the existing neural network for better performance.
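One possible direction for the yaw/pitch normalization mentioned above is to rotate 3D landmarks by the inverse of the estimated head rotation before extracting measures. This is a sketch under assumptions, not the project's method: it presumes 3D landmark coordinates and yaw/pitch angles (in radians) supplied by some external head-pose estimator, and ignores roll and the rotation-order conventions a real estimator would impose:

```python
import numpy as np

def frontalize(landmarks_3d, yaw, pitch):
    """Rotate 3D landmarks by the inverse head rotation so that
    measurements are taken as if the face were frontal.
    yaw/pitch come from a head-pose estimator (assumed input)."""
    cy, sy = np.cos(-yaw), np.sin(-yaw)
    cp, sp = np.cos(-pitch), np.sin(-pitch)
    # Inverse rotation about the vertical axis (yaw) ...
    Ry = np.array([[cy, 0.0, sy],
                   [0.0, 1.0, 0.0],
                   [-sy, 0.0, cy]])
    # ... and about the horizontal axis (pitch).
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, cp, -sp],
                   [0.0, sp, cp]])
    return np.asarray(landmarks_3d, dtype=float) @ (Rx @ Ry).T

# A unit point turned 30 degrees in yaw returns to the frontal z-axis.
yaw = np.deg2rad(30.0)
p = np.array([[np.sin(yaw), 0.0, np.cos(yaw)]])
frontal = frontalize(p, yaw, 0.0)
# frontal is approximately [[0, 0, 1]]
```

In practice the rotation order (yaw before pitch or vice versa) must match the convention of whichever pose estimator is used, which is part of why a proper normalization process is listed as future work.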