Side-by-side Python and R code for general exploratory data analysis. Genetic data from The Cancer Genome Atlas is used as a test case.

carMartinez/eda_python_vs_r

A reference guide for exploratory data analysis (EDA) in Python and R. Genetic data from the breast cancer project of The Cancer Genome Atlas (TCGA) serves as the working example for the routine tasks you face when you first get your hands on a new (semi-clean) dataset. See the blog post for explanations and side-by-side Python vs R comparisons.

To download the data, first get the GDC Data Transfer Tool and the MANIFEST.txt file from this repo, then run

./gdc-client download -m MANIFEST.txt

from a terminal. Any other TCGA dataset could be used in place of the breast cancer data.
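If you would rather start the download from Python, the step above can be sketched as follows. This is a minimal illustration, not part of the repo: the helper names (`parse_manifest`, `summarize`, `build_download_command`, `download`) are hypothetical, and the code assumes that MANIFEST.txt is tab-separated with a header row that includes a `size` column in bytes. Only the `download -m` invocation shown above is used.

```python
import csv
import subprocess


def parse_manifest(lines):
    """Parse tab-separated manifest lines into a list of dicts keyed by the header row."""
    return list(csv.DictReader(lines, delimiter="\t"))


def summarize(rows):
    """Return (file count, total bytes). Assumes a 'size' column holding byte counts."""
    return len(rows), sum(int(row["size"]) for row in rows)


def build_download_command(manifest_path, client="./gdc-client"):
    """Build the same command that the README runs by hand."""
    return [client, "download", "-m", manifest_path]


def download(manifest_path="MANIFEST.txt"):
    """Print a short preview of the manifest, then run the download.

    Raises subprocess.CalledProcessError if the client exits with a non-zero status.
    """
    with open(manifest_path, newline="") as handle:
        n_files, n_bytes = summarize(parse_manifest(handle))
    print(f"{n_files} files, {n_bytes / 1e9:.2f} GB listed in {manifest_path}")
    subprocess.run(build_download_command(manifest_path), check=True)
```

Calling `download()` from the directory that holds both `gdc-client` and MANIFEST.txt should behave like the terminal command, with the preview line printed first so you can check how much data is about to be fetched.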
