
v0.0.1

@tengjuilin tengjuilin released this 20 Aug 05:52
· 120 commits to main since this release

v0.0.1 Release Notes

vampire-analysis is a package based on the vampireanalysis GUI (https://doi.org/10.1038/s41596-020-00432-x). The algorithmic operations are isolated from the GUI component and grouped into modules to encourage reuse and improve reproducibility. With extensive documentation and tutorials, vampire-analysis provides a flexible alternative to the GUI.

Changes

Interface changes from GUI to package

vampire-analysis provides a package API instead of a graphical user interface (GUI).

Model information stored in files

The information used to build and apply a model is now stored in a .csv or .xlsx file or in a DataFrame, instead of being entered manually when prompted by the GUI.

New Features

Option for random state

The random state used for K-means clustering and for plotting representative contours is now exposed to the user, so results can be reproduced in testing.
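
To illustrate why a user-supplied random state matters, here is a minimal sketch using only Python's standard library. It is not the package's code: `pick_initial_centroids` is a hypothetical helper that mimics the random choice of starting centroids in K-means.

```python
import random


def pick_initial_centroids(points, num_clusters, random_state=None):
    """Pick starting centroids at random (hypothetical helper, for illustration)."""
    # A dedicated generator keeps the seed local instead of changing global state.
    rng = random.Random(random_state)
    return rng.sample(points, num_clusters)


points = [(0.0, 0.0), (1.0, 0.5), (5.0, 5.0), (6.0, 5.5), (9.0, 1.0)]

# The same seed yields the same starting centroids, so later steps are repeatable.
first = pick_initial_centroids(points, 2, random_state=1)
second = pick_initial_centroids(points, 2, random_state=1)
print(first == second)
```

With `random_state=None` the generator is seeded from system entropy, so repeated runs may start from different centroids.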

AND filtering of image

Image filenames can be screened using AND filtering, specified through optional columns, when building and applying models. This is more flexible than tags.
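
As a rough illustration of AND filtering, the sketch below keeps a filename only if it contains every filter term. The function name, filenames, and filter terms are hypothetical and do not come from the package.

```python
def matches_all(filename, required_terms):
    """Return True if every non-empty term occurs in the filename (AND logic)."""
    return all(term in filename for term in required_terms if term)


filenames = [
    "ctrl_40x_c1.tif",
    "ctrl_20x_c1.tif",
    "drug_40x_c1.tif",
    "ctrl_40x_c2.tif",
]

# Hypothetical filter terms, e.g. values read from the optional columns of one row.
filters = ["ctrl", "40x", "c1"]

selected = [name for name in filenames if matches_all(name, filters)]
print(selected)  # ['ctrl_40x_c1.tif']
```

Empty terms are skipped, so leaving an optional column blank simply removes that condition.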

New PCA implementation

Principal component analysis (PCA) is widely used in this package. PCA is implemented using either singular value decomposition (SVD) or eigen-decomposition, depending on the input matrix. The implementation is faster than both the previous implementation and sklearn.
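
The two routes give the same principal components up to sign. The following NumPy sketch shows one way to switch between them; the rule used here (eigen-decomposition when there are more samples than features) is an assumption for illustration, not the package's actual criterion, and `pca` is a hypothetical function name.

```python
import numpy as np


def pca(data, use_eig=None):
    """Return (components, explained_variance) for samples stored in rows."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    n_samples, n_features = centered.shape
    if use_eig is None:
        # Assumed heuristic: the covariance matrix is small when features are few.
        use_eig = n_samples > n_features
    if use_eig:
        cov = centered.T @ centered / (n_samples - 1)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1]       # reorder to descending variance
        return eigvecs[:, order].T, eigvals[order]
    # SVD route: rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return vt, singular_values**2 / (n_samples - 1)


rng = np.random.default_rng(0)
samples = rng.normal(size=(50, 4))
_, var_eig = pca(samples, use_eig=True)
_, var_svd = pca(samples, use_eig=False)
print(np.allclose(var_eig, var_svd))
```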

More plotting options

The package can plot the shape mode distribution, dendrogram, and mean shape mode, either as isolated plots or as combined plots.

Improvements

Defaults for model parameters

Parameters such as output_path, model_name, num_points, and num_clusters are given default values. A default is used when the corresponding value is left blank in the .csv/.xlsx file or is None/np.NaN in the DataFrame.
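
The fallback logic can be sketched as follows. The default values shown are placeholders chosen for illustration and may differ from the package's actual defaults; `is_missing` and `fill_defaults` are hypothetical helpers.

```python
import math

# Placeholder defaults for illustration only; not taken from the package.
DEFAULTS = {
    "output_path": ".",
    "model_name": "model",
    "num_points": 50,
    "num_clusters": 5,
}


def is_missing(value):
    """Treat None, an empty string, and NaN as 'left blank'."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def fill_defaults(row, defaults=DEFAULTS):
    """Return a copy of row with missing parameters replaced by defaults."""
    filled = dict(row)
    for key, default in defaults.items():
        if is_missing(filled.get(key)):
            filled[key] = default
    return filled


row = {"output_path": None, "model_name": "", "num_points": float("nan"), "num_clusters": 8}
print(fill_defaults(row))
```

Values the user did supply, such as `num_clusters` above, are left untouched.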

Performance Improvements

For an image set of 221 images that contains 11173 segmented cells, the performance is as follows:

| | Build model [s] | Apply model [s] |
| --- | --- | --- |
| vampireanalysis GUI | 517 | 98 |
| vampire-analysis package | 80 | 26 |
| Improvement | 85% faster | 73% faster |
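
The improvement figures correspond to the reduction in run time relative to the GUI. A short check of that arithmetic, using the timings above (function names are illustrative):

```python
def percent_time_saved(old_seconds, new_seconds):
    """Reduction in run time, as a percentage of the old run time."""
    return 100 * (old_seconds - new_seconds) / old_seconds


def speedup_factor(old_seconds, new_seconds):
    """How many times faster the new run is."""
    return old_seconds / new_seconds


for label, gui, package in [("Build model", 517, 80), ("Apply model", 98, 26)]:
    saved = percent_time_saved(gui, package)
    factor = speedup_factor(gui, package)
    print(f"{label}: {saved:.0f}% less time, {factor:.1f}x speedup")
```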

TODO

The very first release of vampire-analysis aims to reproduce the results of the vampireanalysis GUI. A few improvements can be made in future releases.

Flexible num_pc

Currently, the number of principal components used, num_pc, is hardcoded as 20, as in the GUI implementation. Ideally, the value should change based on the explained variance of the principal components, as described in the paper.

We could also let the user specify num_pc, where an integer in the range (0, 2*num_points] sets the number of components kept, and a float in the range (0, 1) sets the fraction of total variance to capture.
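
One possible way to interpret such an input is sketched below. This is a proposal-level illustration, not existing package behavior; `resolve_num_pc` and its arguments are hypothetical, and the explained variance ratios are assumed to be sorted in descending order.

```python
from itertools import accumulate


def resolve_num_pc(num_pc, explained_variance_ratio, num_points):
    """Turn a user-supplied num_pc into a component count (hypothetical helper)."""
    if isinstance(num_pc, bool):
        raise TypeError("num_pc must be an int or a float")
    if isinstance(num_pc, int):
        # Integer: direct truncation, limited by the 2*num_points coordinates.
        if not 0 < num_pc <= 2 * num_points:
            raise ValueError("integer num_pc must be in (0, 2*num_points]")
        return num_pc
    if isinstance(num_pc, float):
        # Float: smallest count whose cumulative variance reaches the target.
        if not 0 < num_pc < 1:
            raise ValueError("float num_pc must be in (0, 1)")
        running_total = accumulate(explained_variance_ratio)
        for count, cumulative in enumerate(running_total, start=1):
            if cumulative >= num_pc:
                return count
        return len(explained_variance_ratio)
    raise TypeError("num_pc must be an int or a float")


ratios = [0.5, 0.3, 0.15, 0.05]  # made-up explained variance ratios
print(resolve_num_pc(20, ratios, num_points=50))
print(resolve_num_pc(0.9, ratios, num_points=50))
```

Here the integer 20 is returned unchanged, while the float 0.9 resolves to the first three components, whose cumulative share of variance is the first to reach 90%.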

Scree plot for PCA

When using principal component analysis, we usually need a scree plot to see how much variance is captured by the top few principal components. Support for scree plots, and their incorporation into the API, is needed.
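
The values behind a scree plot can be computed directly from the singular values of the centered data. The NumPy sketch below is an assumption about how this might be done, not part of the package; `scree_values` is a hypothetical name and the data are random.

```python
import numpy as np


def scree_values(data):
    """Return per-component and cumulative explained variance ratios."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Singular values come back in descending order.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variance = singular_values**2
    ratio = variance / variance.sum()
    return ratio, np.cumsum(ratio)


rng = np.random.default_rng(0)
samples = rng.normal(size=(100, 6))
ratio, cumulative = scree_values(samples)
for index, (r, c) in enumerate(zip(ratio, cumulative), start=1):
    print(f"PC{index}: {r:.3f} (cumulative {c:.3f})")
```

Plotting `ratio` against the component index with any plotting library gives the scree plot itself.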