This repository contains the source code for the cell image analysis methods presented in the research.
Related work: Link
-----Last Update Time: 2019/11/19 -----
Test Environment : Win 10 Version 1909
GPU: Nvidia GTX-1080 Ti (not necessary but highly recommended)
CPU: Intel i7-8700K
RAM: 32 GB
Environment Tool : Anaconda 3
Python Version : 3.6
Tensorflow Version : 2.0.0 (1.x should be fine)
Keras Version : 2.3.1
Pandas Version : 0.24.2
Open-CV Version : 4.1.0
Scipy Version : 1.2.1
sklearn Version : 0.21.2
Numpy Version : 1.16.4
matplotlib Version : 3.1.2
PIL Version : 6.2.1
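Use the package manager pip to install all the libraries listed above. Note that some pip package names differ from the names used here: Open-CV is installed as opencv-python, sklearn as scikit-learn, and PIL as Pillow.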
pip install library_name
or
pip3 install library_name
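After installation, it can help to compare what is actually installed with the versions listed above. The following is a minimal sketch, not part of the original code: the function name `check_versions` and the `EXPECTED` table are illustrative, and the keys are Python import names (for example `cv2` for OpenCV), not pip package names.

```python
import importlib

# Import names (not pip names) mapped to the versions listed in this README.
EXPECTED = {
    "tensorflow": "2.0.0",
    "keras": "2.3.1",
    "pandas": "0.24.2",
    "cv2": "4.1.0",
    "scipy": "1.2.1",
    "sklearn": "0.21.2",
    "numpy": "1.16.4",
    "matplotlib": "3.1.2",
    "PIL": "6.2.1",
}


def check_versions(expected):
    """Return {import name: (expected version, installed version or None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Treat any import failure as "not usable on this machine".
            report[name] = (wanted, None)
            continue
        # Most libraries expose __version__; fall back to "unknown" otherwise.
        report[name] = (wanted, getattr(module, "__version__", "unknown"))
    return report


def print_report(report):
    """Print one line per library with the expected and installed versions."""
    for name, (wanted, found) in report.items():
        status = "missing" if found is None else found
        print("{:12s} expected {:8s} found {}".format(name, wanted, status))
```

Calling `print_report(check_versions(EXPECTED))` lists each library. A different version is not necessarily a problem; the table above only records the tested environment.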
Step 2. Make sure you have already installed all the libraries listed above and those in the import section of the code.
Step 3. Make sure the code files, the "Euglena" folder, and the "Whitecell" folder are in the same directory (this is the default data path in the code).
It should be like this:
- Source Codes (.py files)
- Euglena
  - N-
    - Image_0.tif
    - Image_1.tif ...
  - N+
    - Image_0.tif
    - Image_1.tif ...
- Whitecell
  - lymphocyte
    - Image_0.tif
    - Image_1.tif ...
  - neutrophyl
    - Image_0.tif
    - Image_1.tif ...
- README.md
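The layout above can be checked before running anything. This is a small sketch under the assumption that the folder names are exactly those shown; `check_layout` and `EXPECTED_LAYOUT` are hypothetical names that do not appear in the original code.

```python
from pathlib import Path

# Folder names taken from the layout shown in this README.
EXPECTED_LAYOUT = {
    "Euglena": ["N-", "N+"],
    "Whitecell": ["lymphocyte", "neutrophyl"],
}


def check_layout(root, layout=EXPECTED_LAYOUT, pattern="*.tif"):
    """Count images per class folder; a missing folder is reported as None."""
    root = Path(root)
    counts = {}
    for dataset, classes in layout.items():
        for cls in classes:
            folder = root / dataset / cls
            key = "{}/{}".format(dataset, cls)
            if folder.is_dir():
                counts[key] = len(list(folder.glob(pattern)))
            else:
                counts[key] = None
    return counts
```

For example, `check_layout(".")` run from the directory that holds the .py files should report a non-zero count for all four class folders.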
For a non-Python computer: 1.5 hours
For a Python-ready computer: less than 15 minutes
You don't need to change any parameters in the code; just make sure the code files, the "Euglena" folder, and the "Whitecell" folder are in the same directory.
Expected run time for all the code in a GPU-enabled environment is less than 30 minutes.
Expected output:
- Fig4_20xx_xx_xx
(t-SNE has a cost function that is not convex, i.e. with different initializations we can get different results.)
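The effect of initialization can be illustrated with scikit-learn's `TSNE`. The sketch below uses synthetic random features as a stand-in for real image features, and it is an assumption that a fixed `random_state` is wanted; whether the original scripts fix a seed is not stated here. `method="exact"` is chosen only to keep this small example simple.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed(features, seed):
    """Project features to 2-D with t-SNE using a fixed random seed."""
    tsne = TSNE(
        n_components=2,
        perplexity=5,        # must be smaller than the number of samples
        init="random",       # random start, controlled by random_state
        method="exact",
        random_state=seed,
    )
    return tsne.fit_transform(features)


# Synthetic stand-in for extracted features: 40 samples, 16 dimensions.
rng = np.random.RandomState(0)
features = rng.rand(40, 16)

same_seed_a = embed(features, seed=0)
same_seed_b = embed(features, seed=0)
other_seed = embed(features, seed=1)

# Same seed: the two embeddings should match. Different seed: typically not.
print("same seed identical:", np.allclose(same_seed_a, same_seed_b))
print("different seed identical:", np.allclose(same_seed_a, other_seed))
```

A figure produced without a fixed seed can therefore look different from the published one while still showing the same cluster structure.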
Expected output:
- Fig4_20xx_xx_xx
- Fig4c_Whitecell_tSNE.png
- Fig4c_Whitecell_VGG16_tSNE_Result.xlsx
- Whitecell_VGG16_training_history.xlsx
- .h5 model file (larger than 600 MB, so it is not uploaded here)
Expected output:
- Fig5_20xx_xx_xx
Expected output:
- Fig5_20xx_xx_xx
Expected output:
- SFig11_20xx_xx_xx
(t-SNE has a cost function that is not convex, i.e. with different initializations we can get different results.)
Expected output:
- SFig12_20xx_xx_xx
- Euglena_tSNE.png
- SFig12_Euglena_VGG16_tSNE_Result.xlsx
- Euglena_VGG16_training_history.xlsx
- .h5 model file (larger than 600 MB, so it is not uploaded here)
The code was written for 2 classes with images in TIFF format. If you want to use your own data, follow the steps below:
It should be like:
- Cell
  - cell_type_a
    - Image_0.png
    - Image_1.png ...
  - cell_type_b
    - Image_0.png
    - Image_1.png ...
from
labels=['neutrophyl','lymphocyte']
base_path = r'.\Whitecell'
to
labels=['cell_type_a','cell_type_b']
base_path = r'G:\Data\Cell'
from
entry1=r'.\Euglena\N-'
entry2=r'.\Euglena\N+'
to
entry1=r'G:\Data\Cell\cell_type_a'
entry2=r'G:\Data\Cell\cell_type_b'
from
if file.endswith(".tif"):
to
if file.endswith(".png"):
from
fnamelist1 = glob.glob(os.path.join(entry1, '*.tif'))
fnamelist2 = glob.glob(os.path.join(entry2, '*.tif'))
to
fnamelist1 = glob.glob(os.path.join(entry1, '*.png'))
fnamelist2 = glob.glob(os.path.join(entry2, '*.png'))
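The path and extension changes above all follow the same pattern, so one way to avoid editing several places is to keep the extension in a single variable. This is only a sketch of that idea; `list_images` and `collect_by_label` are hypothetical helpers, not functions from the original code.

```python
import glob
import os


def list_images(folder, extension=".tif"):
    """Return the sorted image paths in folder that end with extension."""
    pattern = os.path.join(folder, "*" + extension)
    return sorted(glob.glob(pattern))


def collect_by_label(base_path, labels, extension=".tif"):
    """Map each class label to the image paths in its sub-folder of base_path."""
    return {
        label: list_images(os.path.join(base_path, label), extension)
        for label in labels
    }
```

With your own data this could be called as `collect_by_label(r'G:\Data\Cell', ['cell_type_a', 'cell_type_b'], '.png')`. Using `os.path.join` also keeps the paths portable between Windows and other systems.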
For the VGG16 classification code, you can use any type of image data, but the recommended image size is smaller than 250 x 250.
Otherwise, training the model will require a lot of time and RAM.
**Make sure all the images in the folder have the same shape.**
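A quick way to verify the same-shape requirement is to group the image files by shape and see whether more than one group appears. The sketch below is an illustration only: `group_by_shape` and `pil_shape` are hypothetical names, and reading the size through Pillow is an assumption about how you want to inspect the files.

```python
from collections import defaultdict


def pil_shape(path):
    """Return (width, height, number of bands) for an image file via Pillow."""
    from PIL import Image  # imported lazily so the grouping works without Pillow

    with Image.open(path) as img:
        return img.size + (len(img.getbands()),)


def group_by_shape(paths, get_shape=pil_shape):
    """Group image paths by shape; more than one key means mixed shapes."""
    groups = defaultdict(list)
    for path in paths:
        groups[get_shape(path)].append(path)
    return dict(groups)
```

If `group_by_shape(paths)` returns a dictionary with a single key, all images share one shape; otherwise the smaller groups point to the files that need resizing or removal.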
For the other scripts, the main logic uses the first two layers (channels) of the image for image processing.
Normal RGB image:
- Layer 1 - R (red color)
- Layer 2 - G (green color)
- Layer 3 - B (blue color)
Demo images:
Whitecell image (8 bit):
- Layer 1 - cytoplasm
- Layer 2 - nucleus
- Layer 3 - not used (discarded)
Euglena image (8 bit):
- Layer 1 - chlorophyll
- Layer 2 - lipids
- Layer 3 - not used (discarded)
Make sure your information is in the right layer before you start processing.
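To make the layer idea concrete, the sketch below splits the first two layers out of a three-layer array with NumPy. It uses a synthetic array rather than a real image, and `split_layers` is a hypothetical helper. Note that the layer order of a loaded file depends on the loader: OpenCV's `cv2.imread` returns channels in BGR order, while Pillow returns RGB, so check which index holds which signal in your own pipeline.

```python
import numpy as np


def split_layers(image):
    """Return the first two layers of an (height, width, layers) array."""
    if image.ndim != 3 or image.shape[2] < 2:
        raise ValueError("expected an array of shape (height, width, >=2)")
    return image[:, :, 0], image[:, :, 1]


# Synthetic 8-bit image: layer 1 filled with 10, layer 2 with 200, layer 3 unused.
demo = np.zeros((4, 4, 3), dtype=np.uint8)
demo[:, :, 0] = 10
demo[:, :, 1] = 200

layer1, layer2 = split_layers(demo)
print(layer1.shape, int(layer1.max()), int(layer2.max()))
```

For a Whitecell demo image, the two returned arrays would correspond to the cytoplasm and nucleus layers described above, provided the file was loaded in the stored layer order.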
We put the original data (monochromatic TIFF files) on Google Drive; please download them and unzip them in the same folder as mono_to_RGB_generator.py.
See readme.txt for the details.
Download Link:
- Whitecell 385 MB
  - 20,000 TIFF files for each type of white cell (40,000 images in total)
- Euglena 1.67 GB
  - ~25,000 TIFF files for each type of Euglena (~50,000 images in total)
You can change the variables raw_cell and raw_cell_type to generate different types of data.
#%% Set the RAW data type
raw_cell = "Euglena" # "Whitecell" or "Euglena"
raw_cell_type = "N-" # in Whitecell are "lymphocyte" and "neutrophyl", in Euglena are "N+" and "N-"
The code will create a folder named "generated" and all the images will be stored in it.
You don't have to process all the data; you can choose how many images to generate.
#%%Generate and save
image_number = 100 # generate 100 images; use len(file_list_ch1) to generate all
Generating 10,000 images takes less than 5 minutes.
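As a rough illustration of what combining monochromatic channels into a three-layer image can look like, the sketch below stacks two single-channel arrays and fills the third layer with zeros. This is an assumption about the general idea, not a description of what mono_to_RGB_generator.py actually does; `stack_channels` is a hypothetical helper and the arrays are synthetic.

```python
import numpy as np


def stack_channels(ch1, ch2):
    """Stack two same-shaped mono images into (height, width, 3); layer 3 is zeros."""
    if ch1.shape != ch2.shape:
        raise ValueError("both channels must have the same shape")
    unused = np.zeros_like(ch1)  # third layer carries no information
    return np.dstack([ch1, ch2, unused])


# Synthetic 8-bit monochrome channels standing in for two raw TIFF files.
channel_1 = np.full((5, 6), 50, dtype=np.uint8)
channel_2 = np.full((5, 6), 120, dtype=np.uint8)

combined = stack_channels(channel_1, channel_2)
print(combined.shape, combined.dtype)
```

The result follows the layer convention used by the analysis scripts: the two signals occupy the first two layers and the third is left unused.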