### Crash Course in Python and SciPy

You do not need to be a Python developer to get started using the Python ecosystem for machine
learning. As a developer who already knows how to program in one or more programming
languages, you are able to pick up a new language like Python very quickly. You just need to
know a few properties of the language to transfer what you already know to the new language.
After completing this lesson you will know:

1. How to navigate Python language syntax.
2. Enough NumPy, Matplotlib and Pandas to read and write machine learning Python scripts.
3. A foundation from which to build a deeper understanding of machine learning tasks in Python.

If you already know a little Python, this chapter will be a friendly reminder for you. Let's
get started.

#### Python Crash Course

When getting started in Python you need to know a few key details about the language syntax
to be able to read and understand Python code. This includes:

- Assignment.
- Flow Control.
- Data Structures.
- Functions.

We will cover each of these topics in turn with small standalone examples that you can type
and run. Remember, whitespace has meaning in Python.
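
As a minimal sketch of what "whitespace has meaning" implies, the snippet below uses indentation alone to decide which lines belong to a loop. The variable names (`total`, `doubled`) are illustrative only.

```python
# Indentation (four spaces by convention) marks which lines belong to a block.
total = 0
for i in range(3):
    total += i        # indented: part of the loop body, runs three times
    doubled = i * 2   # indented: also part of the loop body
total += 100          # not indented: runs once, after the loop has finished
print(total)
print(doubled)
```

Removing the indentation from a line inside the loop changes when it runs, so consistent indentation is part of the program's logic rather than a matter of style.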

##### Assignment

As a programmer, assignment and types should not be surprising to you.

In [None]:
# Strings
data = 'hello world'
print(data[0])
print(len(data))
print(data)

In [None]:
# Numbers
value = 123.1
print(value)
value = 10
print(value)

In [None]:
# Boolean
a = True
b = False
print(a, b)

In [None]:
# Multiple Assignment
a, b, c = 1, 2, 3
print(a, b, c)

In [None]:
# No value
a = None
print(a)

##### Flow Control

There are three main types of flow control that you need to learn: If-Then-Else conditions,
For-Loops and While-Loops.

In [None]:
# If-Then-Else Conditional
value = 99
if value == 99:
    print('That is fast')
elif value > 200:
    print('That is too fast')
else:
    print('That is safe')

In [None]:
# For-Loop
for i in range(10):
    print(i)

In [None]:
# While-Loop
i = 0
while i < 10:
    print(i)
    i += 1

##### Data Structures

There are three data structures in Python that you will find the most used and useful. They
are tuples, lists and dictionaries.

In [None]:
#Tuple
#Tuples are read-only collections of items.
a = (1, 2, 3)
print(a)
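
The read-only nature of tuples can be checked directly. The sketch below is an illustration with made-up names (`point`, `values`, `message`): it tries to assign to a tuple element, which raises a `TypeError`, and then performs the same assignment on a list.

```python
# Tuples cannot be modified after creation; lists can.
point = (1, 2, 3)
try:
    point[0] = 99              # item assignment on a tuple is not allowed
except TypeError as err:
    message = str(err)
    print("Tuple is read-only:", message)

values = [1, 2, 3]
values[0] = 99                 # the same assignment works on a list
print(values)
```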

In [None]:
#List
# Lists use the square bracket notation and can be indexed using array notation.
mylist = [1, 2, 3]
print("Zeroth Value: %d" % mylist[0])
mylist.append(4)
print("List Length: %d" % len(mylist))
for value in mylist:
    print(value)

In [None]:
#Dictionary
# Dictionaries are mappings of names to values, like key-value pairs.
# Note the use of the curly bracket and colon notations when defining the dictionary.
mydict = {'a': 1, 'b': 2, 'c': 3}
print("A value: %d" % mydict['a'])
mydict['a'] = 11
print("A value: %d" % mydict['a'])
print("Keys: %s" % list(mydict.keys()))
print("Values: %s" % list(mydict.values()))
for key in mydict.keys():
    print(mydict[key])

##### Functions

The biggest gotcha with Python is the whitespace. Ensure that you have an empty new line
after indented code. The example below defines a new function to calculate the sum of two
values and calls the function with two arguments.

In [None]:
# Sum function
def mysum(x, y):
    return x + y

# Test sum function
result = mysum(1, 3)
print(result)

#### NumPy Crash Course

NumPy provides the foundation data structures and operations for SciPy. These are arrays
(ndarrays) that are efficient to define and manipulate.

##### Create Array

In [None]:
# define an array
import numpy
mylist = [1, 2, 3]
myarray = numpy.array(mylist)
print(myarray)
print(myarray.shape)

##### Access Data

Array notation and ranges can be used to efficiently access data in a NumPy array.

In [None]:
# access values
import numpy
mylist = [[1, 2, 3], [3, 4, 5]]
myarray = numpy.array(mylist)
print(myarray)
print(myarray.shape)
print("First row: %s" % myarray[0])
print("Last row: %s" % myarray[-1])
print("Specific row and col: %s" % myarray[0, 2])
print("Whole col: %s" % myarray[:, 2])

##### Arithmetic

NumPy arrays can be used directly in arithmetic.

In [None]:
# arithmetic
import numpy
myarray1 = numpy.array([2, 2, 2])
myarray2 = numpy.array([3, 3, 3])
print("Addition: %s" % (myarray1 + myarray2))
print("Multiplication: %s" % (myarray1 * myarray2))

There is a lot more to NumPy arrays but these examples give you a flavor of the efficiencies
they provide when working with lots of numerical data. See Chapter 24 for resources to learn
more about the NumPy API.
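
To make the point about efficiency a little more concrete, here is a small sketch comparing an explicit Python loop with the equivalent whole-array expression in NumPy. The names (`values`, `squares_loop`, `squares_array`) are illustrative, and the example shows only the difference in style, not a timing benchmark.

```python
import numpy

values = list(range(5))

# Plain Python: visit each element explicitly
squares_loop = [v * v for v in values]

# NumPy: the same operation applied to every element in one expression
array = numpy.array(values)
squares_array = array * array

print(squares_loop)
print(squares_array)
# Aggregations are also available directly on the array
print(array.sum(), array.mean())
```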

#### Matplotlib Crash Course

Matplotlib can be used for creating plots and charts. The library is generally used as follows:

- Call a plotting function with some data (e.g. .plot()).
- Call many functions to set up the properties of the plot (e.g. labels and colors).
- Make the plot visible (e.g. .show()).

##### Line Plot

The example below creates a simple line plot from one dimensional data.

In [None]:
# basic line plot
import matplotlib.pyplot as plt
import numpy
myarray = numpy.array([1, 2, 3])
plt.plot(myarray)
plt.xlabel('some x axis')
plt.ylabel('some y axis')
plt.show()

##### Scatter Plot

Below is a simple example of creating a scatter plot from two dimensional data.

In [None]:
# basic scatter plot
import matplotlib.pyplot as plt
import numpy
x = numpy.array([1, 2, 3])
y = numpy.array([2, 4, 6])
plt.scatter(x,y)
plt.xlabel('some x axis')
plt.ylabel('some y axis')
plt.show()

There are many more plot types and many more properties that can be set on a plot to
configure it. See Chapter 24 for resources to learn more about the Matplotlib API.

#### Pandas Crash Course

Pandas provides data structures and functionality to quickly manipulate and analyze data. The
key to understanding Pandas for machine learning is understanding the Series and DataFrame
data structures.

##### Series

In [None]:
# series
import numpy
import pandas
myarray = numpy.array([1, 2, 3])
rownames = ['a', 'b', 'c']
myseries = pandas.Series(myarray, index=rownames)
print(myseries)

You can access the data in a series by position, like a NumPy array, and by label, like a dictionary, for example:

In [None]:
# .iloc selects by position; square brackets select by index label
print(myseries.iloc[0])
print(myseries['a'])

##### DataFrame

A DataFrame is a two-dimensional array where the rows and the columns can be labeled.

In [None]:
# dataframe
import numpy
import pandas
myarray = numpy.array([[1, 2, 3], [4, 5, 6]])
rownames = ['a', 'b']
colnames = ['one', 'two', 'three']
mydataframe = pandas.DataFrame(myarray, index=rownames, columns=colnames)
print(mydataframe)

Data can be indexed using column names.

In [None]:
print("method 1:")
print("one column: %s" % mydataframe['one'])
print("method 2:")
print("one column: %s" % mydataframe.one)

Pandas is a very powerful tool for slicing and dicing your data. See Chapter 24 for resources
to learn more about the Pandas API.
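
As a brief taste of that slicing and dicing, the sketch below rebuilds the same small DataFrame and selects rows by label, by position, and by a condition. The variable names (`frame`, `row_b`, `first_row`, `cell`, `subset`) are illustrative, and this covers only a few of the selection options Pandas offers.

```python
import numpy
import pandas

frame = pandas.DataFrame(
    numpy.array([[1, 2, 3], [4, 5, 6]]),
    index=['a', 'b'],
    columns=['one', 'two', 'three'],
)

row_b = frame.loc['b']             # row selected by its label
first_row = frame.iloc[0]          # row selected by its position
cell = frame.loc['a', 'three']     # single value by row and column label
subset = frame[frame['one'] > 1]   # rows where column 'one' is greater than 1

print(row_b)
print(first_row)
print(cell)
print(subset)
```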

In [None]:
%reload_ext watermark
%watermark -a "Caique Miranda" -gu "caiquemiranda" -iv

### End.