# Python for Data Science: Part 1

Adapted from `CS231n` Python tutorial by Justin Johnson (http://cs231n.github.io/python-numpy-tutorial/).

## Introduction

Python is a general-purpose programming language that is commonly used for scientific computing. If you have experience with MATLAB, refer to the [NumPy for MATLAB users page](https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html) for additional information.

In this tutorial, we will cover:

* [Basic Python](#Basics-of-Python): Data types, Containers, Functions, Classes

## Basics of Python

Python is a high-level and dynamically typed programming language. Python code is almost like pseudocode, since it allows you to express ideas in few lines of readable code. As an example, here is an implementation of the quicksort algorithm in Python:

In [None]:
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3,6,8,10,1,2,1]))

This section has the following subsections:
* [Python versions](#Python-versions)
* [Data types](#Data-types)
* [Containers](#Containers)
* [Functions](#Functions)
* [Classes](#Classes)

## Python versions

This tutorial was written when two versions of Python were supported, 2.7 and 3.6. The two are not fully compatible, and Python 2.7 reached end of life in January 2020. For this class all code will use Python 3.6. You can check your Python version at the command line by running `python --version`.
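
The version can also be checked from inside Python. The following is a minimal sketch using the standard `sys` module; the variable name `supports_fstrings` is only illustrative:

```python
import sys

# sys.version_info behaves like a tuple: (major, minor, micro, releaselevel, serial)
major, minor = sys.version_info[:2]
print('Running Python %d.%d' % (major, minor))

# f-strings, used later in this tutorial, need Python 3.6 or newer
supports_fstrings = (major, minor) >= (3, 6)
print(supports_fstrings)
```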

## Data types

This subsection covers the following data types:
* [Numbers](#Numbers)
* [Booleans](#Booleans)
* [Strings](#Strings)

### Numbers

Integer and float operations work as they do in most other programming languages. The `print` function displays the values of variables and the results of operations.

In [None]:
# assign a variable
x = 3

# view variable value and type
print(x, type(x))

In [None]:
# perform arithmetic operations on a variable
# print can directly output value of operations

# addition
print(x+1)   

# subtraction
print(x-1)   

# multiplication
print(x*2)

# exponentiation
print(x ** 2)  

In [None]:
# assign output to variable
x = 1

# similar to x=x+1
x+=1
print(x)  

# similar to x=x*2
x *= 2
print(x) 

# Note that Python does not have unary increment (x++) or decrement (x--) operators.

In [None]:
# similar operations can also be performed on float variables
y = 2.5
print(type(y)) 
print(y, y+1, y*2, y**2) 

Python also has a built-in type for complex numbers, and Python 3 integers have arbitrary precision, so there is no separate long integer type. You can find all of the details in the [documentation](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex).
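
As a small illustration of these numeric types, the sketch below shows a large integer, a complex number, and the two division operators. The variable names `big` and `z` are chosen only for this example:

```python
# integers grow as needed, so large results do not overflow
big = 2 ** 100
print(big, type(big))   # 1267650600228229401496703205376 <class 'int'>

# complex literals use the suffix j for the imaginary part
z = 3 + 4j
print(z.real, z.imag)   # 3.0 4.0
print(abs(z))           # 5.0

# / always returns a float; // performs floor division
print(7 / 2, 7 // 2)    # 3.5 3
```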

### Booleans

Python implements all of the usual operators for Boolean logic, but uses English words rather than symbols (`and` instead of `&&`, `or` instead of `||`, etc.):

In [None]:
# create boolean variables
t, f = True, False
print(type(t))

In [None]:
# examples of boolean operations

# logical AND
print(t and f)

# logical OR
print(t or f)

# logical NOT
print(not t)

# logical XOR
print(t != f)

### Strings

In [None]:
# initialize string objects

# string literals can use single quotes
hello = 'hello'   

# or double quotes; it does not matter.
world = "world"   

# output string variables
print(hello, len(hello), type(hello))

# output individual characters in string
print(hello[0], type(hello[0]))

In [None]:
# strings are concatenated with the + operator, as in arithmetic
hw = hello + ' ' + world  

# prints 'hello world'
print(hw)  

In [None]:
# strings can be formatted with f-strings
# f-strings require Python 3.6 or later
hw12 = f'{hello} {world} {12}'

# string formatting can also be done in the sprintf style with the % operator
hw13 = '%s %s %d' % (hello, world, 13)

# or with the str.format method
hw14 = '{} {} {}'.format(hello, world, 14)

# prints all the string outputs
print(hw12, hw13, hw14)  

String objects have several useful methods. For example:

In [None]:
s = "hello"

# capitalize a string; prints "Hello"
print(s.capitalize())

# convert a string to uppercase; prints "HELLO"
print(s.upper())

# right-justify a string, padding with spaces; prints "  hello"
print(s.rjust(7))

# center a string, padding with spaces; prints " hello "
print(s.center(7))

# replace all instances of one substring with another; prints "he(ell)(ell)o"
print(s.replace('l', '(ell)'))

# strip leading and trailing whitespace; prints "world"
print('  world '.strip())

You can find a list of all string methods in the [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods).
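
A few more string methods that come up often when cleaning text data are `split`, `join`, `startswith`, and `find`. This is a short sketch with a made-up example sentence:

```python
sentence = 'the quick brown fox'

# split on whitespace into a list of words
words = sentence.split()
print(words)            # ['the', 'quick', 'brown', 'fox']

# join the words back together with a different separator
joined = '-'.join(words)
print(joined)           # the-quick-brown-fox

# check a prefix and find the position of a substring
print(sentence.startswith('the'))   # True
print(sentence.find('quick'))       # 4
```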

## Containers

Python includes several built-in container types. The following are covered in this subsection: 
* [Lists](#Lists) 
* [Dictionaries](#Dictionaries)

### Lists

A list is the Python equivalent of an array, but is resizeable and can contain elements of different types:

In [None]:
# create a list with numbers
xs = [3, 1, 2]   

# print the contents of the list
print(xs, xs[2])

# negative indices count from the end of the list; prints "2"
print(xs[-1])     

In [None]:
# lists can contain elements of different types
xs[2] = 'foo'    
print(xs)

In [None]:
# add a new element to end of list using append
xs.append('bar') 
print(xs)

In [None]:
# remove and return last element of list using pop
x = xs.pop()     
print(x, xs)

Detailed information about lists is available in the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists).

In addition to accessing list elements one at a time, Python provides concise syntax to access sublists; this is known as **slicing**:

In [None]:
# range(5) yields the integers 0 to 4; in Python 3 it returns a range object,
# so wrap it in list() to get a list
nums = list(range(5))

# prints "[0, 1, 2, 3, 4]"
print(nums)

# get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(nums[2:4])

# get a slice from index 2 to the end; prints "[2, 3, 4]"
print(nums[2:])

# get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print(nums[:2])

# get a slice of the whole list; prints "[0, 1, 2, 3, 4]"
print(nums[:])

# slice indices can be negative; prints "[0, 1, 2, 3]"
print(nums[:-1])

# assign a new sublist to a slice; prints "[0, 1, 8, 9, 4]"
# (this works on a list but not on a range object)
nums[2:4] = [8, 9]
print(nums)
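
Slices also accept an optional third value, the step, and a slice always produces a new list rather than a view of the original. A brief sketch, where `nums_copy` is just an illustrative name:

```python
nums = [0, 1, 2, 3, 4]

# take every second element
print(nums[::2])     # [0, 2, 4]

# a negative step walks the list backwards
print(nums[::-1])    # [4, 3, 2, 1, 0]

# a slice is a new list, so changing it leaves the original untouched
nums_copy = nums[:]
nums_copy[0] = 99
print(nums[0], nums_copy[0])   # 0 99
```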

It is also possible to **loop** over the elements of a list. This is done using the `for` keyword:

In [None]:
# define list of animals
animals = ['cat', 'dog', 'monkey']

# create for loop on elements of list
for animal in animals:
    # for each animal print its name
    print(animal)

In [None]:
# to access index of each element within loop, use the built-in `enumerate` function:
animals = ['cat', 'dog', 'monkey']

# idx is the index of animal in the animals list
for idx, animal in enumerate(animals):
    print(f'#{idx+1}: {animal}')

When programming, frequently we want to transform one type of data into another. As a simple example, consider the following code that computes square numbers:

In [None]:
# this code snippet computes square of every number in a list
nums = [0, 1, 2, 3, 4]

# new list for squares
squares = []

# for loop on nums list
for x in nums:
    squares.append(x ** 2)
print(squares)

You can make this code simpler using a **list comprehension**:

In [None]:
nums = [0, 1, 2, 3, 4]

# this list comprehension replaces the for loop
squares = [x ** 2 for x in nums]
print(squares)

List comprehensions can also contain conditions:

In [None]:
nums = [0, 1, 2, 3, 4]

# only compute the square if the list element is an even number
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares)
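
A condition can appear in two places in a list comprehension, and the two forms behave differently. The sketch below contrasts them; the names `odd_squares` and `labels` are only for illustration:

```python
nums = [0, 1, 2, 3, 4]

# an `if` at the end filters: elements that fail the test are dropped
odd_squares = [x ** 2 for x in nums if x % 2 == 1]
print(odd_squares)   # [1, 9]

# an `if ... else` at the front is a conditional expression:
# every element is kept, but the value depends on the test
labels = ['even' if x % 2 == 0 else 'odd' for x in nums]
print(labels)        # ['even', 'odd', 'even', 'odd', 'even']
```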

### Dictionaries

A dictionary stores (key, value) pairs, similar to a `Map` in Java or an object in JavaScript:

In [None]:
# create a new dictionary with some data
d = {'cat': 'cute', 'dog': 'furry'}  

# get the value for the key "cat" from dictionary; prints "cute"
print(d['cat'])

# check if dictionary has the key "cat"; prints "True"
print('cat' in d)

In [None]:
# set a new entry in the dictionary
d['fish'] = 'wet'    
print(d['fish'])      

In [None]:
# looking up a key that is not in the dictionary raises an error
# KeyError: 'monkey'
print(d['monkey'])

In [None]:
# use the get method to look up a key with a default value
# this will not cause an error; prints "N/A"
print(d.get('monkey', 'N/A'))
# the key "fish" exists, so its value is returned; prints "wet"
print(d.get('fish', 'N/A'))

In [None]:
# the del keyword removes an element from a dictionary
del d['fish']        
print(d.get('fish', 'N/A'))

Additional information about dictionaries is available in the [documentation](https://docs.python.org/3/library/stdtypes.html#dict).
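
One common use of `get` with a default value is counting how often each item occurs. This is a minimal sketch with a made-up word list:

```python
words = ['cat', 'dog', 'cat', 'fish', 'cat']

# start with an empty dictionary and add one to the count for each word;
# get returns 0 the first time a word is seen
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

print(counts['cat'])          # 3
print(counts.get('bird', 0))  # 0
```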

It is easy to iterate over the keys in a dictionary:

In [None]:
# create example dictionary
d = {'person': 2, 'cat': 4, 'spider': 8}

# loop over the keys in dictionary
for animal in d:
    legs = d[animal]
    print('A %s has %d legs' % (animal, legs))

If you want access to keys and their corresponding values, use the `items` method:

In [None]:
# create example dictionary
d = {'person': 2, 'cat': 4, 'spider': 8}

# use items to get key and value
for animal, legs in d.items():
    print('A %s has %d legs' % (animal, legs))

Dictionary comprehensions: These are similar to list comprehensions, but allow you to easily construct dictionaries. For example:

In [None]:
# create list of numbers
nums = [0, 1, 2, 3, 4]

# create dictionary with list using comprehension
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square)
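
Dictionary comprehensions are also handy for building a dictionary from two parallel lists or for swapping keys and values. The sketch below uses the built-in `zip` function; the variable names are chosen only for this example:

```python
animals = ['person', 'cat', 'spider']
legs = [2, 4, 8]

# zip pairs up the elements of the two lists
legs_by_animal = {animal: n for animal, n in zip(animals, legs)}
print(legs_by_animal['spider'])   # 8

# swap keys and values (this assumes the values are unique)
animal_by_legs = {n: animal for animal, n in legs_by_animal.items()}
print(animal_by_legs[4])          # cat
```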

## Functions

Python functions are defined using the `def` keyword. For example:

In [None]:
# function to check the sign of a number
def sign(x):
    # if-elif-else condition
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

# loop over list to apply function
print([sign(x) for x in [-1,0,1]])

We will often define functions to take optional keyword arguments, like this:

In [None]:
# function with an optional argument and default value as False
def hello(name, loud=False):
    if loud:
        print('HELLO, %s' % name.upper())
    else:
        print('Hello, %s!' % name)

# call the function without the optional argument; prints "Hello, Bob!"
hello('Bob')

# call the function with the optional argument; prints "HELLO, FRED"
hello('Fred', loud=True)
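
A function can take several optional keyword arguments, and callers can pass them by name in any order. The function `power` below is a hypothetical example, not part of the original tutorial:

```python
# function with two optional keyword arguments
def power(base, exponent=2, verbose=False):
    result = base ** exponent
    if verbose:
        print('%d ** %d = %d' % (base, exponent, result))
    return result

# defaults are used for any argument that is left out
print(power(3))               # 9

# override one default by name
print(power(3, exponent=3))   # 27

# keyword arguments can be given in any order
value = power(verbose=True, base=2, exponent=5)   # prints "2 ** 5 = 32"
```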

## Classes

The syntax for defining classes in Python is straightforward:

In [None]:
class Greeter:
    # Constructor
    def __init__(self, name):
        self.name = name  # Create an instance variable

    # Instance method
    def greet(self, loud=False):
        if loud:
            print('HELLO, %s!' % self.name.upper())
        else:
            print('Hello, %s' % self.name)

g = Greeter('Fred')  # Construct an instance of the Greeter class
g.greet()            # Call an instance method; prints "Hello, Fred"
g.greet(loud=True)   # Call an instance method; prints "HELLO, FRED!"
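
Classes can also inherit from other classes. The sketch below redefines `Greeter` so that it is self-contained and adds a `message` method that returns the greeting text. Both `message` and the subclass `FormalGreeter` are assumptions made for this example:

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    # build the greeting and return it instead of printing it
    def message(self, loud=False):
        if loud:
            return 'HELLO, %s!' % self.name.upper()
        return 'Hello, %s' % self.name

    def greet(self, loud=False):
        print(self.message(loud=loud))


# the subclass overrides message and inherits __init__ and greet unchanged
class FormalGreeter(Greeter):
    def message(self, loud=False):
        return 'Good day, %s.' % self.name


g = FormalGreeter('Fred')
g.greet()                       # prints "Good day, Fred."
print(isinstance(g, Greeter))   # True
```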