This project works with a dataset of cancer genes found in different cancer cell lines and tissues. Similar cancer genes are clustered using hierarchical and k-means clustering in R.

Vasi00/Cancer-Genes-Clustering

Cancer-Genes-Clustering

This project works with a dataset of cancer genes found in different cancer cell lines and tissues. I grouped similar cancer genes using hierarchical and k-means clustering.
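As a rough illustration of the two clustering methods named above, here is a minimal sketch in base R. It does not reproduce the analysis in this repository: the matrix `expr` is synthetic, and the choice of Euclidean distance, average linkage, and two clusters are assumptions made only so the example runs on its own.

```r
# Synthetic stand-in for an expression matrix (assumption, not the real data):
# rows are genes, columns are samples, with two clearly separated gene groups.
set.seed(42)
expr <- rbind(
  matrix(rnorm(5 * 6, mean = 0), nrow = 5),
  matrix(rnorm(5 * 6, mean = 10), nrow = 5)
)
rownames(expr) <- paste0("gene", 1:10)
colnames(expr) <- paste0("sample", 1:6)

# Hierarchical clustering: pairwise Euclidean distances between genes,
# merged with average linkage, then cut into two groups.
hc <- hclust(dist(expr, method = "euclidean"), method = "average")
hc_groups <- cutree(hc, k = 2)

# k-means with the same number of clusters; nstart reruns the algorithm
# from several random starts and keeps the best solution.
km <- kmeans(expr, centers = 2, nstart = 10)

# Cross-tabulate the two assignments to see how far the methods agree.
agreement <- table(hierarchical = hc_groups, kmeans = km$cluster)
print(agreement)
```

On real expression data the number of clusters is usually not known in advance, so the value of `k` here would need to be chosen by inspecting the dendrogram from `plot(hc)` or by comparing several values.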

Data source:

Platform file for row names (gene names): https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL14924

Dataset: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE30034

Download the series file(s) from the "Download family" section of the dataset page.

The Cancer data.txt file is extracted from GSE30034-GPL14924_series_matrix.txt.

Gene expression profiling by RT-PCR

Row names and column names are taken from the source files GPL14924-tbl-1.txt and GSE30034-GPL14924_series_matrix.txt respectively, and merged into Cancer data.txt to create a clean and complete copy of the dataset, which was not available from the source. I have uploaded both source files here, with row-names and colnames added before the source file name respectively.
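The labeling step described above could look roughly like the following sketch in base R. The layouts of the real source files are not shown in this README, so the example uses small in-memory placeholders: `values`, `gene_names`, and `sample_names` are hypothetical stand-ins, and the output goes to a temporary file rather than Cancer data.txt.

```r
# Hypothetical placeholders (assumptions, not values from the GEO files).
values <- matrix(c(1.2, 0.8,
                   2.5, 3.1,
                   0.4, 1.9), nrow = 3, byrow = TRUE)  # unlabeled expression values
gene_names   <- c("GENE_A", "GENE_B", "GENE_C")        # would come from the platform table
sample_names <- c("SAMPLE_1", "SAMPLE_2")              # would come from the series matrix header

# Check that the labels match the matrix dimensions before attaching them.
stopifnot(nrow(values) == length(gene_names),
          ncol(values) == length(sample_names))
rownames(values) <- gene_names
colnames(values) <- sample_names

# Write a tab-separated file with both row and column labels.
# col.names = NA leaves an empty header cell above the row-name column.
out_file <- tempfile(fileext = ".txt")
write.table(values, file = out_file, sep = "\t", quote = FALSE, col.names = NA)

# Read it back with the first column as row names to confirm the round trip.
reloaded <- read.delim(out_file, row.names = 1, check.names = FALSE)
print(reloaded)
```

Checking the dimensions before assigning names catches the case where the platform table and the series matrix list a different number of genes or samples.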

I believe that applying data analytics techniques in the medical and healthcare field can help save lives. Because of my interest in healthcare analytics, I did this project to cluster cancer genes. This is an original work, and more research is needed in this area. Connect with me on https://www.linkedin.com/in/vasi-rahman
