# CS345 Course Introduction

This notebook is part of the course materials for CS 345: Machine Learning Foundations and Practice at Colorado State University.
Original versions were created by Asa Ben-Hur with updates by Ross Beveridge.

Last updated 8/23/2021

*The text is released under the [CC BY-SA license](https://creativecommons.org/licenses/by-sa/4.0/), and code is released under the [MIT license](https://opensource.org/licenses/MIT).*



### Course Basics

Instructor: Ross Beveridge

GTA: Yongxin Liu

Lecture: Tuesday / Thursday 3:30 to 4:45 Engineering 120 and on Zoom

When in doubt, start at the [Public Website](https://www.cs.colostate.edu/~cs345/yr2021fa/).

*We will now briefly review the course syllabus and semester topics, then return here.*

### What is machine learning?

**Machine learning:**  the construction and study of systems that learn from data.


Machine learning is an interdisciplinary field that requires background in multiple areas:

* Linear algebra for working with vectors and matrices
* Statistics and probability for reasoning about uncertainty
* Calculus for optimization
* Programming for efficient implementation of the algorithms


### Eric Grimson's Excellent Introduction

We will not take a lot of time here selling you on the significance of machine learning. It is not overstating the case to claim that the material you will start to learn in this course has come to play a role in almost all significant scientific and engineering endeavors.

It should not come as any surprise that MIT has an excellent introduction to this material, and I draw your attention to [Lecture 11 by Eric Grimson](https://www.youtube.com/watch?v=h0e2HAPTGF4). Put simply, I don't really think I can improve upon this lecture by [Eric Grimson](https://www.csail.mit.edu/person/eric-grimson). I encourage you all to watch it.

### A Classic Supervised Learning Example

Example problem:  handwritten digit recognition.

Some examples from the [MNIST dataset](https://en.wikipedia.org/wiki/MNIST_database):

<img style="padding: 10px; float:center;" alt="MNIST dataset by Josef Steppan CC BY-SA 4.0" src="https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png" width="350">
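
To make "learning from labeled examples" concrete, here is a minimal sketch of supervised learning in plain Python. It is not digit recognition and does not use MNIST: the four labeled points are made up, and the nearest-neighbor rule is just one simple technique chosen for illustration. The names `train` and `predict` are hypothetical.

```python
# A toy supervised-learning sketch: 1-nearest-neighbor classification.
# The data below is invented for illustration; it is NOT MNIST.
import math

# Labeled training data: (feature vector, label) pairs.
train = [
    ((1.0, 1.0), "a"),
    ((1.5, 2.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 5.5), "b"),
]

def predict(x, examples):
    """Return the label of the training example closest to x."""
    _, nearest_label = min(
        examples, key=lambda pair: math.dist(x, pair[0])
    )
    return nearest_label

print(predict((1.2, 1.4), train))  # a
print(predict((5.5, 5.0), train))  # b
```

The same pattern applies to digits in principle: each image becomes a feature vector, and each label is the digit it shows.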

### Course objectives

The machine learning toolbox:

* Formulating a problem as an ML problem
* Understanding a variety of ML algorithms
* Running and interpreting ML experiments
* Understanding what makes ML work – theory and practice


### Python

Why Python?

<img style="float: right;" src="https://www.python.org/static/community_logos/python-logo.png" width="200">

* A concise and intuitive language
* Simple, easy to learn syntax
* Highly readable, compact code
* Supports object oriented and functional programming
* Strong support for integration with other languages (C,C++,Java)
* Cross-platform compatibility
* Free
* Makes programming fun!
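
A few of the points above can be sketched in a handful of lines. The data and the `Gradebook` class are hypothetical, invented only to show the compact syntax, the functional style, and the object-oriented style side by side.

```python
# Hypothetical example data: exam scores for a few students.
scores = {"ana": 91, "ben": 78, "cy": 85}

# Compact, readable syntax: a comprehension filters in a single line.
passed = [name for name, score in scores.items() if score >= 80]

# Functional style: a function (scores.get) is passed as an argument.
top = max(scores, key=scores.get)

# Object-oriented style: a small class bundling data with behavior.
class Gradebook:
    def __init__(self, scores):
        self.scores = dict(scores)

    def mean(self):
        return sum(self.scores.values()) / len(self.scores)

print(passed)  # ['ana', 'cy']
print(top)     # ana
print(Gradebook(scores).mean())
```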

**We assume you already know the basics of Python**. That said, most of us can benefit from a little review, so expect a quick guided walkthrough of some Python basics shortly in [module00_01_python_intro.ipynb](https://drive.google.com/file/d/1jIIntlf2RWcfnRr1kHuIBqfDZbuaBKww/view?usp=sharing). You may also want to take advantage of
[A Whirlwind Tour of the Python Language](https://github.com/jakevdp/WhirlwindTourOfPython).

### Why Python for machine learning?

Different researchers may answer this question differently, and age may matter. One answer takes us back to the earliest days of computer science and the first two high-level programming languages: [Fortran](https://en.wikipedia.org/wiki/Fortran) and [Lisp](https://en.wikipedia.org/wiki/Lisp_(programming_language)). This history is important because it highlights that the needs of AI programmers have historically differed significantly from those in other areas of CS, and Lisp accommodated these differences. The two important features are an interpreted language that is easily extended and elegant handling of collections, e.g. lists.

From the perspective of a researcher who has done significant AI development work in Lisp, Python is the closest we have come to a language with the same flexibility and ease of use.  Indeed, for those not fond of many nested parentheses, Python may be argued to be an improvement. Without question, over the past decade Python has emerged as one of the primary data science / machine learning languages.  In addition to the points mentioned above, here are a few additional aspects of Python that make it great for data science:

* An interpreted language – allows for interactive data analysis
* Libraries for plotting and vector/matrix computation
* Many machine learning packages available:  scikit-learn, TensorFlow, PyTorch
* Language of choice for many ML practitioners (other options: R)

![image](https://scikit-learn.org/stable/_images/sphx_glr_plot_classifier_comparison_001.png)


### The tools we will cover in this course:

* ``NumPy``:  highly efficient manipulation of vectors and matrices
* ``Matplotlib``: data visualization

### Python version and Anaconda

<img style="float: right;" src="https://upload.wikimedia.org/wikipedia/en/c/cd/Anaconda_Logo.png" alt="drawing" width="150"/>

Use version 3.X of Python.

If you are setting up Python on your personal machine, we recommend the [Anaconda](https://www.anaconda.com/distribution/) Python distribution, a data-science oriented distribution that includes all the tools we will use in this course.

A discussion thread for help installing Anaconda on your machine of choice is now up on MS Teams.


### IPython and the Jupyter Notebook

The Jupyter notebook is a browser-based interface to the ``IPython`` Python shell.
In addition to executing Python/IPython statements, the notebook allows the user to include formatted text, static and dynamic visualizations, mathematical equations, and much more. 
**It is the standard way of sharing data science analyses.**

<img style="float: right;" src="https://jupyter.org/assets/main-logo.svg" width="100">


To invoke the Jupyter notebook, use the command:

```bash
jupyter notebook
```

which brings up the Jupyter notebook browser.  To open a specific notebook:

```bash 
jupyter notebook notebook_name.ipynb
```

### Google Colab

<img style="float: right;" src="https://miro.medium.com/max/502/1*sXs3TvhjvXcVCTldKnwMpA.png" alt="drawing" width="250"/>


Please also consider setting yourself up to use [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb). As I am demonstrating during this lecture, if you store your notebooks on your Google Drive (accessible from your personal machine's file system), then the two are entirely compatible with each other, although I don't recommend opening the same notebook at the same time using both tools.

### A Taste of Jupyter

There are two primary types of cells in Jupyter, code cells and markdown cells:

In [1]:
# this is a code cell

This is a **markdown** cell.  You can type *text* and good-looking equations:

$$f(x) = \frac{1}{2\pi} e^{-2 x^2 / \sigma^2}$$


In [3]:
print("Hello Daniela!")
print("Hello Daniela!")
print("Hello Daniela!")
print("Hello Daniela!")

Hello Daniela!
Hello Daniela!
Hello Daniela!
Hello Daniela!


In [3]:
2 + 3

5

You can run shell commands:

In [4]:
!ls

total 20128
drwxr-xr-x@ 40 ross  staff     1280 Aug 24 16:44 .
drwxr-xr-x@ 12 ross  staff      384 Aug 24 09:28 ..
-rw-r--r--@  1 ross  staff     6148 Aug 23 11:47 .DS_Store
drwxr-xr-x@ 45 ross  staff     1440 Aug 24 09:48 .ipynb_checkpoints
-rw-r--r--@  1 ross  staff        0 Aug 24 09:28 Icon?
drwxr-xr-x@  7 ross  staff      224 Aug 24 09:28 datasets
drwxr-xr-x@ 14 ross  staff      448 Aug 24 09:28 figures
-rwxr-xr-x   1 ross  staff    26338 Aug 24 16:44 module00_01_intro.ipynb
-rwxr-xr-x   1 ross  staff    46485 Aug 23 10:52 module00_02_python_intro.ipynb
-rwxr-xr-x   1 ross  staff   264684 May  5 13:53 module01_01_labeled_data.ipynb
-rwxr-xr-x@  1 ross  staff    78199 May  5 13:46 module01_02_numpy.ipynb
-rwxr-xr-x   1 ross  staff   293876 May  5 15:08 module01_03_vectors_dot_products.ipynb
-rwxr-xr-x   1 ross  staff   366708 May  5 16:20 module01_04_matpl

The `%` sign is used for *magic commands*, which are IPython shell commands.  For example, to find what other magic commands there are:

In [5]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%

The `%timeit` magic is one that we will use quite regularly.  Let's learn what it's for:

In [6]:
%timeit?
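
Outside a notebook, the same kind of measurement can be made with Python's standard `timeit` module, which to my knowledge is what the magic builds on. The sketch below is only an illustration: the two functions and the choice of `number=200` are assumptions made for this example, and the timings will differ from machine to machine.

```python
import timeit

def sum_of_squares_loop(n):
    """Sum i*i for i in range(n) with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def sum_of_squares_builtin(n):
    """Same computation using the built-in sum and a generator."""
    return sum(i * i for i in range(n))

# Both versions compute the same value...
assert sum_of_squares_loop(1000) == sum_of_squares_builtin(1000)

# ...so the interesting question is how long each one takes.
# number=200 repeats each call 200 times; the result is total seconds.
loop_seconds = timeit.timeit(lambda: sum_of_squares_loop(1000), number=200)
builtin_seconds = timeit.timeit(lambda: sum_of_squares_builtin(1000), number=200)

print(f"loop:    {loop_seconds:.4f} s for 200 runs")
print(f"builtin: {builtin_seconds:.4f} s for 200 runs")
```

In a notebook cell, `%timeit sum_of_squares_loop(1000)` does a similar job in one line and chooses the number of repetitions for you.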

Now try this:

In [4]:
import antigravity

### Mastering the Jupyter notebook

To be more productive in using notebooks, I highly recommend exploring the notebook keyboard shortcuts.
Here is a useful [blog post](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/) that provides a detailed overview.
You will also need to know the basics of Markdown syntax.

One of the nice features of the Jupyter notebook is that it supports writing mathematical equations using LaTeX.  Here are a couple of examples of what you can do with LaTeX:

$$
\sum_{i=1}^N x_i^2 + \alpha 
$$


And here is the markup that generated this formula
```latex
$$
\sum_{i=1}^N x_i^2 + \alpha
$$
```
All LaTeX commands are preceded by a `\`, and as you can see, it is quite intuitive!
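
As a second sketch, here is a formula that adds a norm and a square root to the same sum. The formula (the Euclidean norm of a vector $x$) is chosen only for illustration; `\|` and `\sqrt` are standard LaTeX commands.

$$
\|x\| = \sqrt{\sum_{i=1}^N x_i^2}
$$

And the markup that generates it:
```latex
$$
\|x\| = \sqrt{\sum_{i=1}^N x_i^2}
$$
```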