giExtract

A universal framework for extracting features from digital H&E images using multiple pretrained CNN models. Combining features from several CNN models captures a wider range of functionally relevant features.

The core of this tool is written in Python 3.8 with the TensorFlow backend and the Keras functional API, while the downstream analysis is implemented in R.

Installation and running the tool

The simplest way to get giExtract along with all its dependencies is to install the release with the Python package installer (pip).

pip install giExtract

This will add two command-line scripts:

Script	Context	Usage
giCube	Create image patches	`giCube -h`
giExtract	Extract features from patches	`giExtract -h`

Utility functions can be imported in the usual Python way, for example `from giExtract.util import generator`.

Input giCube

The main input is the path to the H&E slide images (.jpg or .png), specified with -p, from which the patches are created. All other arguments are optional and have reasonable defaults. Use giCube -h to show the options and their default settings.

Output giCube

Image patches from the H&E slides, saved in a "cubes" directory at the path provided in the input.
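
As a rough illustration of what patch generation involves, the sketch below computes the pixel boxes of non-overlapping square patches for an image of a given size. It is not giCube's implementation: the function name `patch_boxes`, the 224-pixel patch size and the choice to drop partial patches at the edges are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def patch_boxes(width: int, height: int, patch: int = 224) -> List[Box]:
    """Return non-overlapping patch boxes that fit entirely inside the image.

    The 224-pixel default is an assumption for illustration, not giCube's
    documented default. Partial patches at the right and bottom edges are dropped.
    """
    if patch <= 0:
        raise ValueError("patch size must be positive")
    boxes = []
    # Walk the image row by row, then column by column, in steps of one patch.
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            boxes.append((left, top, left + patch, top + patch))
    return boxes


# A 500 x 300 image holds two full 224-pixel patches side by side.
for box in patch_boxes(width=500, height=300, patch=224):
    print(box)
```

Each box uses the (left, upper, right, lower) convention common to image libraries, so it could be passed to a crop call of your choice.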

Input giExtract

The two main inputs are the path to the H&E cubes generated by giCube (.jpg), specified with -p, and the path to the meta (context) file (.csv) used to flow the patches during feature extraction, specified with -c. The meta file must have a column with file names matching the patches in the path. All other arguments are optional and have reasonable defaults. Use giExtract -h to see the options and their default settings.
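
Because the meta file has to list file names that match the patches, it can help to build and check it programmatically. The following is a minimal sketch using only the standard library. The helper names and the column name "Name" are assumptions for illustration; check giExtract -h for what the tool actually expects.

```python
import csv
import tempfile
from pathlib import Path


def write_patch_table(cubes_dir: Path, out_csv: Path, column: str = "Name") -> int:
    """Write one row per .jpg patch in cubes_dir and return the row count.

    The column name "Name" is an assumption, not a documented requirement.
    """
    names = sorted(p.name for p in cubes_dir.glob("*.jpg"))
    with out_csv.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])
        writer.writerows([name] for name in names)
    return len(names)


def missing_patches(cubes_dir: Path, meta_csv: Path, column: str = "Name") -> list:
    """Return file names listed in the CSV that do not exist in cubes_dir."""
    with meta_csv.open(newline="") as handle:
        listed = [row[column] for row in csv.DictReader(handle)]
    return [name for name in listed if not (cubes_dir / name).is_file()]


# Demonstration on a throwaway directory with two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    cubes = Path(tmp) / "cubes"
    cubes.mkdir()
    for name in ("patch_1.jpg", "patch_2.jpg"):
        (cubes / name).touch()
    meta = Path(tmp) / "meta.csv"
    print(write_patch_table(cubes, meta), "patches listed")
    print("missing:", missing_patches(cubes, meta))
```

Running `missing_patches` before feature extraction is a cheap way to catch file names in the meta file that do not correspond to any patch on disk.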

Output giExtract

A table of features extracted by the different CNN models, with patches as rows and features as columns. The columns in the output file are named to indicate the CNN each feature came from, for example "inception_46".

Name	feature 1	feature 2	feature 3
patch 1	0.2	0.1	0.6
patch 2	5.2	0.14	0.6
patch 3	0.6	0.1	0.7
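
Since each column name carries the CNN it came from, the features can be grouped by model before analysis. The sketch below assumes a `<model>_<index>` naming scheme, inferred only from the single example "inception_46". The function name and the "othermodel" column are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_model(columns: Iterable[str], id_column: str = "Name") -> Dict[str, List[str]]:
    """Group feature columns by the text before the final underscore.

    Assumes names look like "<model>_<index>"; this scheme is inferred from
    the "inception_46" example and is not a documented guarantee.
    """
    groups = defaultdict(list)
    for col in columns:
        if col == id_column:
            continue  # the patch-name column is not a feature
        model, sep, index = col.rpartition("_")
        if not sep or not index.isdigit():
            raise ValueError(f"unexpected column name: {col!r}")
        groups[model].append(col)
    return dict(groups)


# "othermodel_0" is a made-up column used only to show a second group.
header = ["Name", "inception_46", "inception_47", "othermodel_0"]
for model, cols in group_by_model(header).items():
    print(model, cols)
```

Splitting on the last underscore keeps model names that themselves contain underscores intact.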

Extras

An R script for analysing the output of giExtract and identifying differential features (see Manuscript) is included in the R/ directory, with a README file on usage. The giFeature.R script requires two mandatory inputs:

Path to a csv file with meta information (must have only three columns: Name, slide and Group).
Path to a csv file with the CNN features to analyse (must be an output of giExtract).

Details about the optional arguments and the requirement for R and the tidyverse package are given in the README file.
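
The three-column requirement for the meta file can be checked before handing it to giFeature.R. This is a small sketch with a hypothetical helper. It accepts the columns in any order, which is an assumption, since the required order is not stated here. The row values are made up.

```python
import csv
import io

REQUIRED = ["Name", "slide", "Group"]


def check_meta_header(handle) -> list:
    """Return the header if it holds exactly Name, slide and Group, else raise.

    Column order is not enforced; that leniency is an assumption.
    """
    header = next(csv.reader(handle), [])
    if sorted(header) != sorted(REQUIRED):
        raise ValueError(f"expected columns {REQUIRED}, found {header}")
    return header


# In-memory stand-in for a meta file; the row values are illustrative only.
good = io.StringIO("Name,slide,Group\npatch_1.jpg,slide_A,control\n")
print(check_meta_header(good))
```

For a real file, open it with `open(path, newline="")` and pass the handle to the same function.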

Manuscript analysis

To reproduce the analysis reported in the manuscript, run the run.sh script inside the manuscript folder. This assumes giExtract has been installed via pip as described above and that R is installed on your system. The run.sh script performs the three core analyses: 1) patch generation, 2) feature extraction and 3) differential feature analysis. To generate the plots and automatically extract images, run the code in downstream.R.
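
For readers who prefer to drive the first two steps from Python, the sketch below assembles the giCube and giExtract command lines using only the -p and -c flags described above. It does not reproduce run.sh, whose contents are not shown here, and it leaves out the giFeature.R step because that script's command-line form is documented in the R/ README rather than in this file. The function name and the example paths are hypothetical.

```python
from pathlib import Path
from typing import List


def pipeline_commands(slides_dir: str, meta_csv: str) -> List[List[str]]:
    """Build argument lists for patch generation and feature extraction.

    Only the -p and -c flags described in this README are used; the "cubes"
    sub-directory follows the giCube output description.
    """
    cubes_dir = Path(slides_dir) / "cubes"
    return [
        ["giCube", "-p", str(slides_dir)],
        ["giExtract", "-p", str(cubes_dir), "-c", str(meta_csv)],
    ]


# Print the commands instead of running them, so the sketch has no side effects.
for cmd in pipeline_commands("slides", "meta.csv"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once giExtract is installed, so that a failing step stops the sequence.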

Example data

Example datasets are provided inside manuscript/data. These show what to expect for the input and output files. Note that only a subset of the data is provided because of size limits and access control. The full dataset used for our computational histology subtype inference analysis can be requested from the corresponding authors.

To clone the source repository

git clone https://github.com/caanene1/giExtract
