swarnimshukla/Supervised-learning-of-Plasmodium-falciparum-life-cycle-stages-using-single-cell-transcriptomes-iden

Supervised learning of Plasmodium falciparum life cycle stages using single-cell transcriptomes identifies crucial proteins


Table of Contents
  1. About The Project
  2. Getting Started
  3. File Details
  4. License
  5. Contact

About The Project

Malaria, spread by the female Anopheles mosquito, is a highly fatal disease that is widespread in many parts of the world and causes 0.4 million deaths globally. The expression of vital genes forms the basis for detecting malaria infection levels. Quantification of malaria-parasite-infected RBCs and classification of the parasite's life cycle stages are done at the macroscopic level by experts to make informed decisions. Of late, multiple computational approaches have been proposed to circumvent the problem of dimensionality, leading to accurate predictions. In this work, a dimensionality reduction technique based on a Genetic Algorithm (GA) is applied to P. falciparum single-cell transcriptomics data to arrive at an optimized subset of features from the larger dataset. Features are chosen based on their class variants, with increased efficiency and accuracy in mind, so that the selected elements can be separately transformed into a lower dimension. For the classification of the life cycle stages of the malaria parasite based on single-cell transcriptome data, a three-pronged approach employing multiclass Support Vector Machine (SVM), Logistic Regression (LR) and Random Forest (RF) techniques is used. The distribution of cells was visualised and mapped using the R-based Seurat package. Further, we constructed protein interaction networks of the genes identified by the feature selection method and elucidated the role of the proteins in the progression of the parasite through its life cycle. Our approach presents a novel protocol for implementing ML techniques on scRNA-seq datasets and subsequently harnessing the extracted information for biomarker/drug target detection.
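To make the idea of GA-based feature selection concrete, here is a minimal, standard-library-only sketch. It is not the code in ga_feature_selection.ipynb, which relies on sklearn-genetic; everything in it is an assumption made for illustration: the synthetic data from `make_toy_data`, the nearest-centroid fitness in `centroid_accuracy` (a stand-in for the SVM/LR/RF classifiers), and the hyperparameters of `ga_select`. Each candidate solution is a bit mask over features, and the population evolves through tournament selection, one-point crossover, bit-flip mutation and elitism.

```python
import random


def make_toy_data(n_per_class=20, n_noise=8, seed=1):
    """Synthetic stand-in for an expression matrix (assumption): 3 classes,
    2 informative features (columns 0 and 1), then n_noise noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            informative = [rng.gauss(6.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


def centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier that only sees the
    features whose mask bit is 1. Returns 0.0 for an empty mask."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    sums, counts = {}, {}
    for row, label in zip(X, y):
        total = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            total[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def nearest(row):
        # Class whose centroid has the smallest squared Euclidean distance.
        return min(
            centroids,
            key=lambda c: sum((row[j] - centroids[c][k]) ** 2
                              for k, j in enumerate(cols)),
        )

    return sum(nearest(row) == label for row, label in zip(X, y)) / len(y)


def ga_select(X, y, pop_size=16, n_generations=10, mutation_rate=0.05,
              size_penalty=0.05, seed=0):
    """Evolve a 0/1 feature mask; hyperparameters are illustrative only."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def fitness(mask):
        # Reward accuracy, lightly penalise large feature subsets.
        return centroid_accuracy(X, y, mask) - size_penalty * sum(mask) / n_features

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(n_generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


X, y = make_toy_data()
best_mask = ga_select(X, y)
selected = [j for j, bit in enumerate(best_mask) if bit]
print("selected feature indices:", selected)
print("training accuracy:", centroid_accuracy(X, y, best_mask))
```

On real scRNA-seq data the fitness would more plausibly be a cross-validated score of one of the classifiers named above, and the mask would span thousands of genes, so a library implementation is likely a better fit than this toy loop.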

(back to top)

Built With

  • Python 3.5
  • sklearn
  • sklearn-genetic

(back to top)

Getting Started

Follow these steps to run the code locally on your PC:

Prerequisites

  • Install all the required libraries with pip.

Installation

  1. Clone the repo
    git clone https://github.com/swarnimshukla/Machine-learning-approaches-for-classification-of-Plasmodium-falciparum-life-cycle.git
  2. Install pip packages
    pip3 install ....

How to run

   Run ga_feature_selection.ipynb in Jupyter Notebook after installing all the libraries.

(back to top)

File Details

  • Data.zip -> input data
  • ExploratoryDataAnalysis.ipynb -> input data analysis
  • ga_feature_selection.ipynb -> main file with feature selection code
  • classification_without_feature_selection.ipynb -> code for classification without feature selection
  • Classification_of_selected_features.ipynb -> code for classification with feature selection
  • random-378-features.ipynb -> classification using 378 randomly selected features
  • MI_bar_graph.ipynb -> bar plot generated in the paper
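
The random-378-features notebook suggests a baseline in which 378 features are drawn at random, presumably for comparison against the GA-selected subset. A minimal sketch of such a draw could look like the following; the function name `sample_random_features`, the placeholder gene names and the pool size of 5000 are assumptions for illustration and are not taken from the notebook.

```python
import random


def sample_random_features(feature_names, k=378, seed=0):
    """Return k distinct feature names drawn uniformly at random.

    A fixed seed keeps the baseline reproducible across runs.
    """
    feature_names = list(feature_names)
    if k > len(feature_names):
        raise ValueError("k exceeds the number of available features")
    rng = random.Random(seed)
    return rng.sample(feature_names, k)


# Placeholder gene identifiers (hypothetical; the real names come from the data).
genes = [f"gene_{i}" for i in range(5000)]
subset = sample_random_features(genes)
print(len(subset), "features sampled, e.g.", subset[:3])
```

Repeating the draw with several seeds and averaging the classifier scores would give a fairer baseline than a single random subset.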

(back to top)

License

Distributed under the MIT License. See LICENSE for more information.

(back to top)

Contact

Swarnim Shukla - swarnim.shukla@research.iiit.ac.in

Project Link: https://github.com/swarnimshukla/Supervised-learning-of-Plasmodium-falciparum-life-cycle-stages-using-single-cell-transcriptomes-iden

(back to top)
