CSX460

This repository contains materials for Practical Machine Learning with R (CSX460) at the University of California, Berkeley. The most recent class was held in Fall 2015.

Course Description

This course provides an introduction to machine learning using R, the open-source statistical programming language. Once a niche set of tools for statisticians, programmers and quants, machine learning (sometimes also called data mining or statistical learning) has spread to a wide variety of applications and disciplines. This course teaches the fundamentals of machine learning without delving into the theory. It focuses on the practical aspects of machine learning so that students can apply what they learn to solve problems in their own fields.

Course Learning Objectives

When students have completed this course, they will know how to:

  • Distinguish fundamental aspects of machine learning algorithms
  • Frame problems to make them suitable for solution via machine learning
  • Train machine learning models
  • Evaluate machine learning models
  • Deploy machine learning models into operations
  • Build models for prediction, categorization and recommendations
  • Collaborate in a group using tools for collaborative/social programming
  • Generate high-quality graphical and textual results

Intended Audience

  • Anyone who wishes to learn the fundamentals of machine learning
  • Anyone who wants to learn about using R to build, evaluate or deploy machine learning models
  • Scientists, engineers, business analysts and researchers who explore and analyze data and wish to present their findings in well-formatted textual and graphical forms
  • Anyone wishing to get hands-on experience building machine learning models

Prerequisites

  • Experience programming in at least one high-level programming language such as BASIC, PASCAL, C, Java, Python, Perl, or Ruby.
  • Familiarity with R such as that gained through the Programming with R course.
  • Basic knowledge of statistics as covered in a first-semester undergraduate statistics course. There will be some coverage of basic statistical techniques as part of covering core elements of machine learning.
  • Personal laptop for completing in-class assignments.

Text/Required Reading

Reading Requirements for the Course

Applied Predictive Modeling  
ISBN-13: 978-1461468486 ISBN-10: 1461468485 
Kuhn, Max and Johnson, Kjell
Springer Science+Business Media
2013 

Google Group

There is a Google Group for this class: CSX460

Session by Session Summary

  1. Introduction to R, setting up the ML developer's environment
     a. Installing R
     b. Installing RStudio
     c. Installing packages from CRAN, Bioconductor and GitHub
     d. Exercises

  2. Fundamentals of Machine Learning
     a. Machine learning overview
     b. Regression and classification
     c. Supervised, unsupervised, and semi-supervised
     d. Algorithm types and requirements
     e. Exercises

  3. Linear Regression (2 sessions)
     a. OLS Regression
     b. Data partitioning
     c. Model evaluation and tuning
     d. Exercises

  4. Logistic Regression
     a. Logistic Regression
     b. Exercises

  5. Advanced Techniques: Partitioning Methods
     a. CART/Regression Trees
     b. Clustering
     c. K Nearest Neighbors
     d. Exercises

  6. Advanced Techniques
     a. Bagging
     b. Bagged Trees / Random Forests
     c. Exercises

  7. Advanced Techniques: Boosting
     a. Boosting
     b. Neural Networks
     c. Support Vector Machines
     d. Exercises

  8. Deployment
     a. Diving into the data lake
     b. Optimization
     c. Delivery and Production

  9. Final Lecture
     a. Exercises
     b. Exam
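
To give a flavor of the workflow listed under Linear Regression (OLS regression, data partitioning, model evaluation), here is a minimal sketch in base R. It is not taken from the course materials: the built-in mtcars data set, the 75/25 split, the seed and the choice of predictors are all assumptions made for illustration.

```r
# Minimal sketch: partition data, fit an OLS model, evaluate on held-out rows.
# Assumptions (not from the course): mtcars data, 75/25 split, mpg ~ wt + hp.
set.seed(460)

# Data partitioning: random 75% of rows for training, the rest for testing
n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.75 * n))
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

# OLS regression: fuel economy as a linear function of weight and horsepower
fit <- lm(mpg ~ wt + hp, data = train)

# Model evaluation: root mean squared error on the held-out rows
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))

print(coef(fit))
cat("Held-out RMSE:", round(rmse, 2), "\n")
```

Evaluating on rows the model never saw during fitting is what makes the RMSE an estimate of predictive performance rather than of goodness of fit.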
