Copy number variations (CNVs) are one class among the wide variety of genetic modifications that occur in humans. It remains difficult for researchers to predict the effect a CNV will have on an individual: CNVs exhibit a spectrum of phenotypic effects, ranging from benign to pathogenic to even beneficial. This project aims to detect pathogenic CNVs while safely discarding CNVs that are confidently predicted to be benign.
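As a rough illustration of the "discard only when confidently benign" idea, the sketch below splits CNV calls by a predicted benign probability. Everything in it is an assumption made for illustration: the `CNVCall` fields, the `triage` function, the 0.95 cutoff, and the example coordinates and probabilities are hypothetical and are not taken from the project's code or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNVCall:
    """A CNV with a model-assigned benign probability (hypothetical fields)."""
    chrom: str
    start: int
    end: int
    p_benign: float


def triage(calls, benign_threshold=0.95):
    """Split calls into (kept, discarded) using an assumed confidence cutoff."""
    kept, discarded = [], []
    for call in calls:
        # Discard only when the benign prediction is confident;
        # everything else is kept for further review.
        if call.p_benign >= benign_threshold:
            discarded.append(call)
        else:
            kept.append(call)
    return kept, discarded


# Made-up example calls; the coordinates and probabilities are not real data.
calls = [
    CNVCall("chr1", 1_000_000, 1_250_000, p_benign=0.99),
    CNVCall("chr7", 5_500_000, 5_900_000, p_benign=0.40),
    CNVCall("chr15", 22_000_000, 23_000_000, p_benign=0.90),
]

kept, discarded = triage(calls)
print("kept:", [c.chrom for c in kept])
print("discarded:", [c.chrom for c in discarded])
```

A high cutoff errs on the side of keeping uncertain calls, which matches the stated goal of discarding benign CNVs safely rather than aggressively.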
This repository contains the code and datasets required to replicate the results of the project. Furthermore, the libraries used for feature extraction can be repurposed for any project involving regions of genetic data aligned to the hg19 reference genome. Every top-level folder contains a descriptive README. The following links are example notebooks from the project.
This project depends on Python 3. The Python 3 libraries needed are listed in requirements.txt.