FixItFelix

This little helper repairs data collected by our project partner with NI hardware. A bug in the data collection software appends extra values, copied from an earlier slice, to the end of each data chunk before the chunk is written to disk.

Installation

You can use Poetry to install all required packages into a fresh environment. Alternatively, check that the packages listed in pyproject.toml are already available in your environment.

Theory

We are lucky: the bug's behavior is well understood and follows a simple pattern. Moreover, the added values are not random but copies of earlier data, so we can test whether we have found (and deleted) the right values.

The image visualization_of_recurrence_pattern.png in the repository shows an example.

Three variables are needed to describe the pattern:

chunk_size: Length of a chunk of good data; the chunks are written to disk one after another
recurrence_size: Length of a bad data chunk, copied from an earlier position
recurrence_distance: Distance from the bad data to the position it was copied from
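
To make the three variables concrete, here is a minimal sketch that reproduces the bug on a plain Python list. It is not part of fixitfelix: the function name simulate_bug is hypothetical, and we assume that recurrence_distance is counted in the written (corrupted) stream, from the start of the bad block back to the start of the copied slice.

```python
def simulate_bug(clean, chunk_size, recurrence_size, recurrence_distance):
    """Return `clean` with a copied block appended after every full chunk.

    Assumption (not confirmed by the project): the distance is measured in
    the corrupted stream, from the start of the bad block backwards.
    """
    if not 0 < recurrence_size <= recurrence_distance <= chunk_size:
        raise ValueError("expected 0 < recurrence_size <= recurrence_distance <= chunk_size")

    written = []
    for start in range(0, len(clean), chunk_size):
        chunk = clean[start:start + chunk_size]
        written.extend(chunk)  # the good data
        if len(chunk) == chunk_size:
            # The bad block: a copy of values that were written earlier.
            source = len(written) - recurrence_distance
            written.extend(written[source:source + recurrence_size])
    return written


corrupted = simulate_bug(list(range(10)), chunk_size=4,
                         recurrence_size=2, recurrence_distance=3)
print(corrupted)  # [0, 1, 2, 3, 1, 2, 4, 5, 6, 7, 5, 6, 8, 9]
```

In this toy stream the values 1, 2 and 5, 6 appear twice: once as good data and once as the bad block that follows each full chunk.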

Description of our Data Correction Process

The data is stored in a National Instruments TDMS binary file. This file is read with nptdms and converted to NumPy memory maps, i.e. the data is written to disk and accessed through NumPy.

Alternatively, the input data can be stored in a directory. This directory must contain only TDMS binary files. If a directory is chosen as input, all files are corrected at once.

The three variables that describe the error pattern are then used to build a list of index pairs that describe the "good data" chunks. Those chunks are then written to a new, corrected TDMS file. There are two methods of reading out the data before writing to the corrected TDMS file. The default method is to read out each chunk individually. This is a reliable method which works for any data. The second method is to read out a whole channel and slice it into chunks. Use this method if the chunk size is very small, to avoid excessive reads and increase performance. Only use it if each channel fits into memory.
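
The index pairs can be sketched as follows. This is an illustration on a plain Python list, not the code used by fixitfelix: the names good_chunk_bounds and repair are hypothetical, and we assume the file starts with a good chunk and that every good chunk is followed by exactly recurrence_size bad values.

```python
def good_chunk_bounds(total_length, chunk_size, recurrence_size):
    """Return (start, stop) index pairs of the good chunks in a corrupted channel."""
    if chunk_size <= 0 or recurrence_size < 0:
        raise ValueError("chunk_size must be positive, recurrence_size non-negative")
    period = chunk_size + recurrence_size  # one good chunk plus one bad block
    return [
        (start, min(start + chunk_size, total_length))
        for start in range(0, total_length, period)
    ]


def repair(data, chunk_size, recurrence_size):
    """Concatenate the good chunks and drop everything in between."""
    repaired = []
    for start, stop in good_chunk_bounds(len(data), chunk_size, recurrence_size):
        repaired.extend(data[start:stop])
    return repaired


corrupted = [0, 1, 2, 3, 1, 2, 4, 5, 6, 7, 5, 6, 8, 9]
print(good_chunk_bounds(len(corrupted), 4, 2))  # [(0, 4), (6, 10), (12, 14)]
print(repair(corrupted, 4, 2))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Note that recurrence_distance is not needed to cut out the bad blocks. It only matters when checking that the removed values really are copies.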

Because we do not want to rely on the pattern variables being correct, we employ an error monad to run several tests on the data and check whether the recurrences in the data are described correctly. If a directory is given as input, all files are checked first before anything is written to disk. At the moment, you have to find the correct variables on your own. We may build an algorithm to automate the pattern recognition later.
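
The idea of chaining checks that stop at the first failure can be sketched like this. It is a simplified stand-in, not the project's actual error monad: the classes Ok and Err, the check functions and validate are all hypothetical names, and we assume recurrence_distance is measured from the start of a bad block back to the start of its source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ok:
    value: object

    def bind(self, check):
        return check(self.value)  # run the next check


@dataclass(frozen=True)
class Err:
    message: str

    def bind(self, check):
        return self  # skip all later checks


def check_parameters(args):
    _, chunk_size, recurrence_size, recurrence_distance = args
    if chunk_size <= 0 or recurrence_size <= 0:
        return Err("chunk_size and recurrence_size must be positive")
    if recurrence_distance < recurrence_size:
        return Err("recurrence_distance must be at least recurrence_size")
    return Ok(args)


def check_recurrences(args):
    data, chunk_size, recurrence_size, recurrence_distance = args
    period = chunk_size + recurrence_size
    for bad_start in range(chunk_size, len(data), period):
        bad = list(data[bad_start:bad_start + recurrence_size])
        source = bad_start - recurrence_distance
        if source < 0:
            return Err(f"bad block at index {bad_start} has no source data")
        if bad != list(data[source:source + len(bad)]):
            return Err(f"block at index {bad_start} is not a copy of earlier data")
    return Ok(args)


def validate(data, chunk_size, recurrence_size, recurrence_distance):
    start = Ok((data, chunk_size, recurrence_size, recurrence_distance))
    return start.bind(check_parameters).bind(check_recurrences)


corrupted = [0, 1, 2, 3, 1, 2, 4, 5, 6, 7, 5, 6, 8, 9]
print(validate(corrupted, 4, 2, 3))  # an Ok wrapping the inputs
print(validate(corrupted, 4, 2, 2))  # Err(message='block at index 4 is not a copy of earlier data')
```

Only a result of type Ok would justify writing the corrected file. With wrong pattern variables the chain ends in an Err and nothing needs to be written.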

CLI Usage

For easy usage of the underlying algorithms we provide a command line interface. After installation, it can be called by fixit [OPTIONS] FILENAME.

Type fixit --help for additional information.

FILENAME is the path to the file you want to correct. The result is marked with the suffix _corrected and placed in the same folder. The input file is not changed. Instead of a path to a single file, you can give a path to a directory. The resulting directory and all files in it will also carry the _corrected suffix after correction.
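
The naming rule can be illustrated with pathlib. This is only a sketch: the function corrected_path is hypothetical, and we assume that for files the suffix is inserted before the file extension, which the description above does not spell out.

```python
from pathlib import PurePosixPath


def corrected_path(path, is_directory=False):
    """Return the assumed output path next to the input path."""
    p = PurePosixPath(path)
    if is_directory:
        # Directories keep their full name and get the suffix appended.
        return p.with_name(p.name + "_corrected")
    # Assumption: files get the suffix between stem and extension.
    return p.with_name(p.stem + "_corrected" + p.suffix)


print(corrected_path("measurements/run1.tdms"))       # measurements/run1_corrected.tdms
print(corrected_path("measurements/day1", True))      # measurements/day1_corrected
```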

The [OPTIONS] can be provided in the call, but fixitfelix is able to ask for all needed parameters afterwards. If available, previously used parameters are provided as default options.

Always make sure you have enough free disk space for the resulting corrected file.

Smart Erosion

Point 8 is a partner in the research project SmartErosion. This tool was created as part of the research project. The project is supported by funds from the European Regional Development Fund (ERDF) 2014-2020 "Investment for Growth and Jobs".
