<a id="syntax"></a>

## Python Programming Language

Why Python?

1. It’s easy to learn
    * Now the language of choice for 8 of 10 top US computer science programs (Philip Guo, CACM)
2. Full-featured
    * Not just a statistics language: it has full capabilities for data acquisition, cleaning, databases, high-performance computing, and more
3. Strong data science libraries
    * The SciPy ecosystem

Topics covered in this section:

1. Functions
2. Data types
3. Loops and control structures
4. Reading and Writing CSV files

### Variables

In [None]:
2 + 2

In [None]:
x = 5
y = 2
x * y

In [None]:
x ** y

In [None]:
print('Hello, world!')

In [None]:
print('%s raised to power of %s equals %s' % (x, y, x ** y))
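
The `%` operator used above is one of several ways to format a string. As a hedged alternative, the same message can be built with an f-string (available from Python 3.6 onward). The variable name `message` is introduced here only for illustration.

```python
x = 5
y = 2

# An f-string evaluates the expressions inside the braces and inserts the results
message = f'{x} raised to power of {y} equals {x ** y}'
print(message)
```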

<a id="functions"></a>

### Functions

Functions allow you to carry out the same task multiple times. This reduces the amount of code you write, reduces mistakes, and makes your code easier to read.

In [None]:
def say_hello():
    print('Hello, world!')

In [None]:
say_hello()

In [None]:
def print_a_string(foo):
    print('%s' % foo)

In [None]:
print_a_string('Here is a string.')

<br>
`add_numbers` is a function that takes two numbers and adds them together.

In [None]:
def add_numbers(x,y):
    return x + y

add_numbers(2.5,3)

<br>
`add_numbers` updated to take an optional 3rd parameter. Using `print` allows printing of multiple expressions within a single cell.

In [None]:
def add_numbers(x, y, z=None):
    if z is None:
        return x + y
    else:
        return x + y + z
    
print(add_numbers(12,3))
print(add_numbers(12,3,4))

<br>
`add_numbers` updated to take an optional flag parameter.

In [None]:
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    else:
        return x + y + z

# Pass flag by keyword; a third positional argument would be bound to z instead
print(add_numbers(1, 2, flag=True))

<br>
Assign function `add_numbers` to variable `a`.

In [None]:
def add_numbers(x,y):
    return x+y

a = add_numbers
a(1,2)
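
Because a function is an ordinary object, it can also be passed to another function as an argument. The sketch below illustrates this; `apply_twice` is a hypothetical helper made up for this example, not part of any library.

```python
def add_numbers(x, y):
    return x + y

def apply_twice(func, start, step):
    # Hypothetical helper: calls func two times, feeding the first result back in
    return func(func(start, step), step)

a = add_numbers                  # a and add_numbers refer to the same function object
result = apply_twice(a, 1, 2)    # add_numbers(add_numbers(1, 2), 2)

print(a is add_numbers)          # True
print(result)                    # 5
```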

<a id="types"></a>

### Data Types

#### Booleans

`True` and `False` are Python's built-in Boolean constants.

In [None]:
a = True
b = False

In [None]:
a == True

In [None]:
b == True

In [None]:
a or b

In [None]:
a and b

#### Numbers: integers and floats

Numbers behave much as you would expect. In Python 3, `/` always returns a float, even when both operands are integers.

In [None]:
1 + 2

In [None]:
1.0 + 2.0

In [None]:
1 / 2

In [None]:
1.0 / 2.0

In [None]:
type(1)

In [None]:
type(1/2)
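
The cells above use only `/`. As a brief sketch of the related integer operators in Python 3, the example below contrasts true division with floor division (`//`) and remainder (`%`). The names `quotient` and `remainder` are chosen just for this illustration.

```python
print(7 / 2)          # 3.5  (true division returns a float)
print(7 // 2)         # 3    (floor division)
print(7 % 2)          # 1    (remainder)
print(type(4 / 2))    # <class 'float'>, even though the result is a whole number

# divmod returns the floor quotient and the remainder together
quotient, remainder = divmod(7, 2)
print(quotient, remainder)    # 3 1
```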

#### Strings

The next four data types -- strings, lists, tuples, arrays -- are all sequences.

Strings are sequences of characters.

In [None]:
s = 'Hello, world'

In [None]:
type(s)

In [None]:
s[0:4]

In [None]:
s + '!'

In [None]:
s

In [None]:
s = s + '!'

In [None]:
s
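
The cells above show that `s + '!'` leaves `s` unchanged until the result is assigned back to it. That is because strings are immutable: operations on a string produce a new string. A minimal sketch, with `t` as an illustrative name:

```python
s = 'Hello, world'

try:
    s[0] = 'J'               # item assignment is not supported on a str
except TypeError as err:
    print('TypeError:', err)

t = s + '!'                  # concatenation builds a new string object
print(s)                     # Hello, world
print(t)                     # Hello, world!
```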

#### Lists

Lists are _mutable_ sequences of anything.

In [None]:
l = [0, 1, 1, 2, 3, 5, 8]

In [None]:
m = [5, 2, 'a', 'xxx', True, [0, 1]]

In [None]:
l[0:3]

In [None]:
m[4]

In [None]:
m[4] = False

In [None]:
m[4:]
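
Mutability has a practical consequence: two names can refer to the same list, so a change made through one name is visible through the other. The sketch below illustrates this; `alias` and `shallow_copy` are names introduced only for the example.

```python
l = [0, 1, 1, 2, 3, 5, 8]

alias = l                # a second name for the same list object
shallow_copy = l[:]      # a full slice creates a new list with the same elements

l.append(13)             # modifies the list in place

print(alias)             # [0, 1, 1, 2, 3, 5, 8, 13]
print(shallow_copy)      # [0, 1, 1, 2, 3, 5, 8]
```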

#### Tuples

Tuples are immutable sequences of anything (similar to lists except you can't change them).

In [None]:
n = (3, 5, 6)

In [None]:
n[0]

In [None]:
# n[0] = 2  # uncommenting this line raises a TypeError because tuples are immutable
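
A short sketch of what happens when code tries to modify a tuple, followed by tuple unpacking, a common way to read tuple elements. The names `first`, `second`, and `third` are illustrative.

```python
n = (3, 5, 6)

try:
    n[0] = 2                     # tuples do not support item assignment
except TypeError as err:
    print('TypeError:', err)

# Unpacking assigns each element of the tuple to its own name
first, second, third = n
print(first + second + third)    # 14
```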

#### Arrays (numpy)

In [1]:
# Import modules to use
import math
import numpy as np

?np.linspace

In [4]:
mylist = [0, 2, 4]
np.array(mylist)

array([0, 2, 4])

In [None]:
np.zeros(5)

In [None]:
np.arange(5)

In [5]:
np.arange(4, 10)

array([4, 5, 6, 7, 8, 9])

In [6]:
np.arange(0, 10, 2)

array([0, 2, 4, 6, 8])

In [7]:
np.linspace(0, 10, 5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

In [8]:
np.linspace(0, 10, 11)

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [2]:
np.random.rand()

0.8048628574525497

In [3]:
np.random.rand(5)

array([0.66711027, 0.69504665, 0.48330328, 0.44259834, 0.53596042])

#### Sets

Sets are unordered collections of unique objects.

In [9]:
s1 = {'a', 'b', 'c'}
s2 = {'a', 'd', 'e'}

In [10]:
s1 & s2

{'a'}

In [11]:
s1 | s2

{'a', 'b', 'c', 'd', 'e'}

In [12]:
# l and m are the lists defined in the Lists section; run those cells first
s3 = set(l)
s4 = set(m[0:2])

In [13]:
s3 & s4

In [None]:
s3 | s4

In [None]:
s3 - s4
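
Since sets hold only unique objects, converting a list to a set drops any repeated values. The sketch below rebuilds the list `l` from the Lists section so that it runs on its own; `unique` is a name introduced for the example.

```python
l = [0, 1, 1, 2, 3, 5, 8]

unique = set(l)                  # the repeated 1 is stored only once
print(len(l), len(unique))       # 7 6

# Set difference: elements of unique that are not in {5, 2}
print(unique - {5, 2})

# Membership tests work the same way as for lists
print(3 in unique)               # True
```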

#### Dictionaries

Dictionaries or 'dicts' are hash tables, where a key points to a value.

In [None]:
d = {'name': 'John Doe', 'age': 27, 'dob': '7/20/1989'}

In [None]:
d

In [None]:
d['name']

In [None]:
d['zip'] = 92039
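
A few more common dictionary operations, sketched with the same `d`: testing for a key, looking up a key that may be missing, and looping over key-value pairs. The `'email'` key is hypothetical and deliberately absent.

```python
d = {'name': 'John Doe', 'age': 27, 'dob': '7/20/1989'}
d['zip'] = 92039

print('zip' in d)                    # True (membership tests look at keys)

# get() returns a default instead of raising KeyError for a missing key
print(d.get('email', 'unknown'))     # unknown

for key, value in d.items():
    print(key, '->', value)
```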

<a id="control"></a>

### Loops and Control Structures

#### Boolean and comparison operations

In [None]:
x = 5
(x < 6) and (x > 4)

In [None]:
x != 4

In [None]:
5 in [3, 4, 5]

In [None]:
'ell' in 'Hello'

In [None]:
len('Hello') >= 5

#### if tests

In [14]:
if 'd' in 'abc':
    print('Learn your alphabet.')
elif (2 + 2 == 5):
    print('Sometimes yes.')
else: 
    print('Nothing is true.')    

Nothing is true.


#### while loops

In [None]:
i = 0
while (i < 5):
    print(i)
    i += 1

In [None]:
i

#### for loops

In [None]:
for x in [0, 1, 2, 3, 4]:
    print(x**2)
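
Writing the list `[0, 1, 2, 3, 4]` by hand does not scale to longer loops. As a sketch, `range` produces the same values on demand, and `enumerate` supplies an index alongside each item. The name `squares` is introduced for illustration.

```python
squares = []
for x in range(5):               # range(5) yields 0, 1, 2, 3, 4
    squares.append(x ** 2)
print(squares)                   # [0, 1, 4, 9, 16]

# enumerate pairs each item with its position
for index, letter in enumerate('abc'):
    print(index, letter)
```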

#### Lambda and List Comprehensions
<br>
A lambda function is a small anonymous function.
A lambda function can take any number of arguments, but can only have one expression.

Here's an example of lambda that takes in three parameters and adds the first two.

In [None]:
my_function = lambda a, b, c : a + b

In [None]:
my_function(1, 2, 3)
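
Lambdas are most often used as short, throwaway arguments to other functions. One common case, sketched here with made-up sample data, is the `key` parameter of the built-in `sorted`.

```python
words = ['banana', 'fig', 'apple']    # made-up sample data

# The lambda tells sorted() to compare words by their length
by_length = sorted(words, key=lambda w: len(w))
print(by_length)                      # ['fig', 'apple', 'banana']
```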

<br>
Let's iterate from 0 to 999 and return the even numbers.

In [None]:
my_list = []
for number in range(0, 1000):
    if number % 2 == 0:
        my_list.append(number)
my_list

<br>
Now the same thing, but with a list comprehension.

In [None]:
my_list = [number for number in range(0,1000) if number % 2 == 0]
my_list
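
As a quick sanity check, the comprehension can be compared with a third way of producing the same numbers: `range` with a step of 2. This is only a sketch; `evens_comp` and `evens_step` are illustrative names.

```python
evens_comp = [number for number in range(0, 1000) if number % 2 == 0]
evens_step = list(range(0, 1000, 2))    # start at 0 and count up in steps of 2

print(evens_comp == evens_step)         # True
print(len(evens_comp))                  # 500
print(evens_comp[-1])                   # 998
```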

### Reading and Writing CSV files

<br>
Let's import our datafile mpg.csv, which contains fuel economy data for 234 cars.

* mpg : miles per gallon
* class : car classification
* cty : city mpg
* cyl : # of cylinders
* displ : engine displacement in liters
* drv : f = front-wheel drive, r = rear-wheel drive, 4 = 4wd
* fl : fuel (e = ethanol E85, d = diesel, r = regular, p = premium, c = CNG)
* hwy : highway mpg
* manufacturer : automobile manufacturer
* model : model of car
* trans : type of transmission
* year : model year

<br>
1. Find the average cty and hwy fuel economy across all cars
2. Group the cars by number of cylinders and find the average cty mpg for each group
3. Find the average hwy mpg for each class of vehicle

In [None]:
import csv

%precision 2 
input_file_path = r"C:\Users\Asus\Documents\GitHub\DA-in-Python\data\mpg.csv"

# The csv module documentation recommends opening files with newline=''
with open(input_file_path, newline='') as csvfile:
    mpg = list(csv.DictReader(csvfile))

type(mpg)

<br>
`csv.DictReader` has read in each row of our csv file as a dictionary. `len` shows that our list is made up of 234 dictionaries.

In [None]:
len(mpg)

<br>
`keys` gives us the column names of our csv.

In [None]:
mpg[0].keys()
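
<br>
With the rows loaded as dictionaries, the first two tasks listed above can be approached as in the sketch below. To keep it runnable without the data file, it reads a few made-up rows from an in-memory string; these rows only borrow the column names and are not taken from mpg.csv. The names `sample_csv`, `cty_by_cyl`, and `avg_cty_by_cyl` are illustrative.

```python
import csv
import io
from collections import defaultdict

# Made-up rows that reuse the column names; NOT data from the real mpg.csv
sample_csv = """manufacturer,model,cyl,cty,hwy,class
maker_a,model_1,4,20,30,compact
maker_a,model_2,4,22,32,compact
maker_b,model_3,6,16,24,suv
maker_b,model_4,8,12,18,suv
"""

mpg = list(csv.DictReader(io.StringIO(sample_csv)))

# DictReader returns every value as a string, so convert before doing arithmetic
avg_cty = sum(float(row['cty']) for row in mpg) / len(mpg)
avg_hwy = sum(float(row['hwy']) for row in mpg) / len(mpg)

# Collect the city mpg values for each cylinder count
cty_by_cyl = defaultdict(list)
for row in mpg:
    cty_by_cyl[row['cyl']].append(float(row['cty']))

# Average each group
avg_cty_by_cyl = {cyl: sum(values) / len(values)
                  for cyl, values in cty_by_cyl.items()}

print(avg_cty, avg_hwy)
print(avg_cty_by_cyl)
```

The same grouping pattern, keyed on `row['class']` and averaging `hwy`, would cover the third task.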