In [None]:
# Basic imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import ticker

# Display options
from IPython.display import display
pd.options.display.max_columns = None

np.set_printoptions(threshold=30)

# Plots style
import matplotlib as mpl
from cycler import cycler

mpl.rcParams['lines.linewidth'] = 3
mpl.rcParams['lines.markersize'] = 10

mpl.rcParams['xtick.labelsize'] = 12
mpl.rcParams['xtick.color'] = '#A9A9A9'
mpl.rcParams['ytick.labelsize'] = 12
mpl.rcParams['ytick.color'] = '#A9A9A9'

mpl.rcParams['grid.color'] = '#ffffff'

mpl.rcParams['axes.facecolor'] = '#ffffff'

mpl.rcParams['axes.spines.left'] = False
mpl.rcParams['axes.spines.right'] = False
mpl.rcParams['axes.spines.top'] = False
mpl.rcParams['axes.spines.bottom'] = False

mpl.rcParams['axes.prop_cycle'] = cycler(color=['#2EBCE7', '#84EE29', '#FF8177'])

$$
\def\var{{\text{Var}}} % Variance
\def\corr{{\text{Corr}}} % Correlation
\def\cov{{\text{Cov}}} % Covariance
\def\expval{{}}
\newcommand\norm[1]{\lVert#1\rVert} % norm
\def\setR{{\rm I\!R}} % Sets
\def\rx{{\textrm{X}}} % Scalar random variables
\def\ry{{\textrm{Y}}}
\def\rz{{\textrm{Z}}}
\def\rvx{{\textbf{X}}} % Vector random variables
\def\rvy{{\textbf{Y}}}
\def\rvz{{\textbf{Z}}}
\def\vtheta{{\boldsymbol{\theta}}} % Vectors
\def\va{{\boldsymbol{a}}}
\def\vb{{\boldsymbol{b}}}
\def\vi{{\boldsymbol{i}}}
\def\vj{{\boldsymbol{j}}}
\def\vp{{\boldsymbol{p}}}
\def\vq{{\boldsymbol{q}}}
\def\vu{{\boldsymbol{u}}}
\def\vv{{\boldsymbol{v}}}
\def\vw{{\boldsymbol{w}}}
\def\vx{{\boldsymbol{x}}}
\def\vy{{\boldsymbol{y}}}
\def\vz{{\boldsymbol{z}}}
\def\evu{{u}} % Elements of vectors
\def\evv{{v}}
\def\evw{{w}}
\def\evx{{x}}
\def\evy{{y}}
\def\evz{{z}}
\def\mA{{\boldsymbol{A}}} % Matrices
\def\mB{{\boldsymbol{B}}}
\def\mC{{\boldsymbol{C}}}
\def\mD{{\boldsymbol{D}}}
\def\mI{{\boldsymbol{I}}}
\def\mQ{{\boldsymbol{Q}}}
\def\mS{{\boldsymbol{S}}}
\def\mT{{\boldsymbol{T}}}
\def\mU{{\boldsymbol{U}}}
\def\mV{{\boldsymbol{V}}}
\def\mW{{\boldsymbol{W}}}
\def\mX{{\boldsymbol{X}}}
\def\mLambda{{\boldsymbol{\Lambda}}}
\def\mSigma{{\boldsymbol{\Sigma}}}
\def\emA{{A}} % Elements of matrices
\def\emB{{B}}
\def\emX{{X}}
\def\tT{{T}} % Transformations
$$



Conclusion
==========

Congratulations! You have reached the end of this book, designed to help
you improve your math skills and increase your efficiency in data science
and machine learning. I must admit that these last chapters were not
easy, but they’ll give you solid foundations for data-related fields.

Learning the topics covered here will help you understand what’s
under the hood of tools that you may already use. Take, for
instance, the Support Vector Machine (SVM) algorithm for binary
classification. If you read how it works, you’ll see that the goal is to
find hyperplanes separating the data with the maximum distance between
them (the margin). To follow the description, you’ll need to understand
vector equations and the concept of distance expressed as norms. The
purpose of *Essential Math for Data Science* is to give you all you need
to dig more deeply into the algorithms you’re interested in.
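
As a rough illustration of how vector equations and norms appear in the SVM description, here is a minimal sketch. The hyperplane parameters `w` and `b` and the data points are hypothetical values chosen for the example, not the output of a trained model. The sketch computes the signed distance of each point to the hyperplane, and the width of the margin under the usual convention that the closest points satisfy $|\vw \cdot \vx + b| = 1$.

```python
import numpy as np

# Hypothetical hyperplane w . x + b = 0 in two dimensions
# (illustrative values, not learned from data)
w = np.array([3.0, 4.0])
b = -5.0

# Hypothetical data points, one per row
points = np.array([
    [3.0, 4.0],
    [1.0, 1.0],
    [0.0, 0.0],
])

# Signed distance of each point to the hyperplane: (w . x + b) / ||w||
signed_distances = (points @ w + b) / np.linalg.norm(w)

# The sign tells on which side of the hyperplane each point lies
sides = np.sign(signed_distances)

# Distance between the two margin hyperplanes w . x + b = 1 and
# w . x + b = -1, which is 2 / ||w||
margin_width = 2 / np.linalg.norm(w)

print(signed_distances)
print(sides)
print(margin_width)
```

Since the margin width is $2 / \norm{\vw}$, making the margin as large as possible amounts to making the norm of $\vw$ as small as possible, which is why norms appear in the SVM optimization problem.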

For this reason, this book covers the core math topics that you’ll
need: calculus, statistics and probability theory, and linear
algebra. Note that there is an emphasis on linear algebra: this
content is better suited to people who want to become data scientists or
machine learning scientists than to data analysts (for instance, data
analysts might need more details about inference, samples
vs. populations, etc.).

As a side note, I think that even a brief exposure to the math
concepts, the symbols, and the vocabulary encountered in data science
and machine learning can stimulate your practice in the field and help
you dive into more specialized resources if you need to.

##### Next Steps

Now that you have learned about calculus, statistics, probability, and
linear algebra, you should be ready to dive into more specialized content
like seminal books on machine learning and deep learning (for instance,
Bishop, Christopher M. *Pattern Recognition and Machine Learning*.
Springer, 2006, or Goodfellow, Ian, Yoshua Bengio, and Aaron Courville.
*Deep Learning*. MIT Press, 2016).

However, I recommend keeping a focus on practice while you continue to
learn the theory. I like the vision of Rachel Thomas, co-founder and
researcher at [Fast.ai](https://www.fast.ai/), about the ‘top-down’
approach to learning deep learning (see, e.g.,
https://www.fast.ai/2016/10/08/teaching-philosophy/): you start
experimenting and building, and then you dive into the theory when you
encounter limitations. This contrasts with the more traditional
‘bottom-up’ approach, where you start with the foundations and then move
on to the applications.

This book sits somewhere between these two approaches: we start from the
basics and move to more advanced concepts, but we try to use code and
get our hands dirty from the beginning. The hands-on projects that you
can find at the end of each chapter are also designed to show that you
can use code to gain more insight. The idea is that, when you study a
theoretical concept, you can ask yourself: ‘is there a way to use code
to verify or illustrate this concept?’
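
To make this question concrete, here is a minimal sketch of how you might check a linear algebra identity numerically. The identity $(\mA \mB)^{\text{T}} = \mB^{\text{T}} \mA^{\text{T}}$ and the random matrices are choices made for this illustration only. A numerical check on one example is not a proof, but it is a quick way to catch a misunderstanding.

```python
import numpy as np

# Seeded random generator so that the example is reproducible
rng = np.random.default_rng(seed=0)

# Two hypothetical matrices with compatible shapes
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 2))

# Left-hand side: transpose of the product
left = (A @ B).T

# Right-hand side: product of the transposes in reverse order
right = B.T @ A.T

# Compare with a tolerance to allow for floating-point rounding
identity_holds = np.allclose(left, right)
print(identity_holds)
```

If you swap the order on the right-hand side (`A.T @ B.T`), the shapes no longer match and NumPy raises an error, which is itself a useful way to see why the order of the factors is reversed.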

My opinion is that you should go back and forth between theory (and the
math behind it) and practical applications. Don’t wait until you have
mastered all the math to start building data science pipelines or machine
learning algorithms in your own projects. I think this will help you
stay motivated while you learn and see the big picture more easily.

