Project portfolio for Statistical Learning for Big Data containing the written report and respective scripts.

wrosko/statistical_learning_for_big_data

statistical_learning_for_big_data

Overview

This repository contains my analyses and exploratory methods for the Statistical Learning for Big Data spring 2018 class with Prof. Rebecka Jörnsten at Chalmers University of Technology/Gothenburg University.

The exam was broken into three sections:

  1. MINI review and assignments
  2. TCGA genomics data and mislabeled data
  3. Simulation studies of K clusters and L classes

The assignment questions are available in: Exam2018.pdf

Further information is available at http://www.math.chalmers.se/Stat/Grundutb/GU/MSA220/S18/

MINI (Question 1)

The MINI assignments were set weekly or bi-weekly and aimed to introduce various statistical and machine learning methods. I used a range of existing datasets and also generated some data of my own. The R scripts are in the main directory, with the associated data in the data/ directory.

data/mini3/ contains artificial data and the Python script used to generate it.
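
As a rough illustration of what generating artificial data of this kind can look like, here is a minimal sketch. It is not the script in data/mini3/: the function name make_clusters, the cluster centres, the cluster-to-class mapping and the 2-D Gaussian setup are all assumptions chosen for the example. It draws points around K cluster centres and maps each cluster onto one of L classes.

```python
import random


def make_clusters(n_per_cluster, centers, class_of_cluster, sd=1.0, seed=0):
    """Draw 2-D Gaussian points around each center.

    Returns a list of (x, y, cluster_index, class_label) tuples.
    Hypothetical helper for illustration; not taken from the repository.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    rows = []
    for k, (cx, cy) in enumerate(centers):
        for _ in range(n_per_cluster):
            x = rng.gauss(cx, sd)
            y = rng.gauss(cy, sd)
            rows.append((x, y, k, class_of_cluster[k]))
    return rows


# Assumed setup: K = 4 clusters mapped onto L = 2 classes.
centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (5.0, 5.0)]
class_of_cluster = [0, 0, 1, 1]
rows = make_clusters(50, centers, class_of_cluster)
```

Each row carries both the cluster index and the class label, so the same data can be used for clustering and for classification experiments.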

Note: the MINI3 script is quite messy.

TCGA (Question 2)
