# Overview of topics in MSDS593

<img src="https://mlbook.explained.ai/images/tools/lab1.png" align="right" width="150">These notes are implemented as a live text/code document called a notebook. In particular, I am running a [Jupyter Notebook](http://jupyterlab.readthedocs.io/en/latest/getting_started/overview.html) that I started by typing `jupyter lab` at the command prompt in `Terminal`. That command should bring up a browser window that looks something like the image to the right. Clicking the “Python 3” icon under the “Notebook” category creates and opens a new notebook window.

During this course, some or all of these notebooks will refer to data files on the web that you have to download, or to small data files in this course repository at `github.com`, which is where you are viewing this file in your browser. The data sits in the `data` subdirectory of the `notebooks` directory in the main directory of this repository.

## numpy

Pandas uses NumPy in its implementation, so it makes sense to take a look at NumPy first. NumPy is how we will do most of our numerical computing. It provides flexible implementations of vectors and matrices (`ndarray`), along with fast operations and functions for linear algebra, random number generation, and so on.

I always start with the following preamble in my notebooks or Python files:

In [1]:
import numpy as np
import pandas as pd

That lets us refer to elements of the numpy package with the shorthand `np` and elements of the pandas package with the shorthand `pd`.

To give you a taste, here's how we would create two vectors, add them together, and display the result:

In [24]:
a = np.array([1,2,3])
b = np.array([4,5,6])
print(type(a))
print(a.shape)
a + b   # or print(a + b)

<class 'numpy.ndarray'>
(3,)


array([5, 7, 9])
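
Arithmetic operators on `ndarray` objects work element by element, which is why `a + b` above yields `[5, 7, 9]`. The following minimal sketch reuses the same two vectors to show a couple of other elementwise operations plus the dot product. The variable names `product`, `scaled`, and `dot` are my own, chosen only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

product = a * b   # elementwise multiply: 1*4, 2*5, 3*6
scaled = 2 * a    # the scalar is applied to every element
dot = a @ b       # dot product of two vectors: 1*4 + 2*5 + 3*6

print(product)
print(scaled)
print(dot)
```

Note that `*` is not a dot product or matrix multiply; for vectors, `@` (or `np.dot`) gives the dot product.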

Here's a 2 x 3 matrix with random elements:

In [23]:
C = np.random.rand(2,3)
print(type(C))
print(C.shape)
print(C)

<class 'numpy.ndarray'>
(2, 3)
[[0.47412309 0.5095976  0.19151133]
 [0.5659787  0.00395192 0.82814404]]


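The `@` operator performs matrix multiplication, here the matrix-vector product $Ca$. Because `C` is random, your numbers will differ from run to run; the output below came from a different random `C` than the one printed above: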

In [22]:
C @ a

array([2.65771993, 3.47739585])
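
To see what `@` computes, it can help to use a small matrix with fixed values rather than random ones. In this sketch, `M` is a hypothetical 2 x 3 matrix (it does not appear in the notes above), and each entry of `M @ a` is the dot product of one row of `M` with `a`.

```python
import numpy as np

a = np.array([1, 2, 3])
M = np.array([[1, 0, 2],    # hypothetical fixed matrix, for illustration only
              [0, 1, 1]])

result = M @ a                                      # matrix-vector product
by_rows = np.array([np.dot(row, a) for row in M])   # same thing, one row at a time

print(result)
print(np.array_equal(result, by_rows))
```

The result has one entry per row of `M`, so a 2 x 3 matrix times a length-3 vector gives a length-2 vector.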

## pandas

In [30]:
df_cars = pd.read_csv("data/cars.csv")
print(type(df_cars))
df_cars

<class 'pandas.core.frame.DataFrame'>


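| Unnamed: 0 | MPG | CYL | ENG | WGT |
|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 3504 |
| 1 | 15.0 | 8 | 350.0 | 3693 |
| 2 | 18.0 | 8 | 318.0 | 3436 |
| 3 | 16.0 | 8 | 304.0 | 3433 |
| 4 | 17.0 | 8 | 302.0 | 3449 |
| ... | ... | ... | ... | ... |
| 387 | 27.0 | 4 | 140.0 | 2790 |
| 388 | 44.0 | 4 | 97.0 | 2130 |
| 389 | 32.0 | 4 | 135.0 | 2295 |
| 390 | 28.0 | 4 | 120.0 | 2625 |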


In [31]:
mpg = df_cars['MPG']   # Get a column
print(type(mpg))
print(mpg)

<class 'pandas.core.series.Series'>
0      18.0
1      15.0
2      18.0
3      16.0
4      17.0
       ... 
387    27.0
388    44.0
389    32.0
390    28.0
391    31.0
Name: MPG, Length: 392, dtype: float64
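
If you don't have `data/cars.csv` handy, you can try the same operations on a small data frame built in memory. This sketch copies the first five rows of the output shown above into a hypothetical `df_small` (my name, not part of the course files), then pulls out a column and summarizes it.

```python
import pandas as pd

# First five rows of the cars output above, typed in by hand
df_small = pd.DataFrame({
    "MPG": [18.0, 15.0, 18.0, 16.0, 17.0],
    "CYL": [8, 8, 8, 8, 8],
    "ENG": [307.0, 350.0, 318.0, 304.0, 302.0],
    "WGT": [3504, 3693, 3436, 3433, 3449],
})

mpg = df_small["MPG"]             # one column comes back as a Series
print(type(mpg))
print(mpg.mean())                 # average of the five MPG values
print(df_small[["MPG", "WGT"]])   # a list of names selects several columns as a DataFrame
```

Selecting with a single name gives a `Series`, while selecting with a list of names gives a `DataFrame`.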


## matplotlib 

<img src="https://mlbook.explained.ai/images/first-taste/first-taste_go_5.svg" width="200"> <img src="https://mlbook.explained.ai/images/bulldozer-feateng/bulldozer-feateng_eng_36.svg" width="200"> <img src="https://mlbook.explained.ai/images/bulldozer-testing/bulldozer-testing_trend_1.svg" width="350">

<img src="https://mlbook.explained.ai/images/first-taste/first-taste_class_6.svg" width="200"> <img src="https://mlbook.explained.ai/images/first-taste/first-taste_mnist_10.svg" width="200"> <img src="https://mlbook.explained.ai/images/prep/prep_logs_2.svg" width="200"> <img src="https://github.com/parrt/dtreeviz/raw/master/testing/samples/iris-TD-2.svg" width="200">

<img src="https://mlbook.explained.ai/images/bulldozer-intro/bulldozer-intro_sniff_42.svg" width="200"> <img src="https://user-images.githubusercontent.com/178777/49105085-9792c680-f234-11e8-8af5-bc2fde950ab1.png" width="200"> <img src="https://mlbook.explained.ai/images/intro/mindist-decision-lines.svg" width="200">

<img src="https://github.com/parrt/dtreeviz/raw/master/testing/samples/regr-leaf.png" width="150"> <img src="https://explained.ai/decision-tree-viz/images/bubble.png" width="520">

<img src="https://mlbook.explained.ai/images/first-taste/first-taste_mnist_3.svg" width="150"> <img src="https://mlbook.explained.ai/images/first-taste/first-taste_mnist_6.svg" width="200">

In [None]:
import matplotlib.pyplot as plt

In [None]:
fig, ax = plt.subplots()  # make one subplot (ax) on the figure


plt.show()
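
The cell above creates an empty figure. As a minimal sketch of a possible next step, the following draws a scatter plot on `ax` using the weight and MPG values from the first five rows of the cars output shown earlier. The lists `wgt` and `mpg` are hard-coded here so the example does not depend on `data/cars.csv`.

```python
import matplotlib.pyplot as plt

# Values copied by hand from the first five rows of the cars output
wgt = [3504, 3693, 3436, 3433, 3449]
mpg = [18.0, 15.0, 18.0, 16.0, 17.0]

fig, ax = plt.subplots()   # one subplot (ax) on the figure
ax.scatter(wgt, mpg)       # draw on the subplot, not on the figure
ax.set_xlabel("WGT")
ax.set_ylabel("MPG")

plt.show()
```

Drawing and labeling happen through methods on `ax`, and `plt.show()` displays the finished figure.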