## **Reading CSV files with Python**

In [None]:
import csv

with open('../datasets/mpg.csv', newline='') as csvfile:  # newline='' is what the csv docs recommend
  mpg = list(csv.DictReader(csvfile))

mpg[:3] # The first three dictionaries in our list.

**Summing data in a CSV file**

In [3]:
sum(float(d['cty']) for d in mpg)

3945.0
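The same pattern extends to other aggregates such as an average. This is a minimal sketch, assuming each row has a `cty` column as in the cell above. The three rows are made-up sample data held in an in-memory string (not the real `mpg.csv`), so the cell runs without the dataset.

```python
import csv
import io

# Hypothetical sample rows standing in for mpg.csv (not the real data).
sample = io.StringIO("manufacturer,cty,hwy\naudi,18,29\naudi,21,29\nford,15,20\n")
rows = list(csv.DictReader(sample))

# DictReader yields every value as a string, so convert before doing arithmetic.
total_cty = sum(float(r['cty']) for r in rows)
avg_cty = total_cty / len(rows)
print(avg_cty)
```

The `float(...)` conversion matters: without it, `sum` would try to add strings to an integer and raise a `TypeError`.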

## **NumPy**

The NumPy module is mainly used for working with numerical
data. Its core object is the array (`ndarray`).
With arrays, we can apply a mathematical operation to
every value at once, without writing a loop, and we can
combine two arrays of compatible shape element by element,
much like matrix addition.
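To make the "all values at once" idea concrete, here is a small sketch that compares a plain Python list with an array. The variable names and values are illustrative only.

```python
import numpy as np

values = [1.0, 2.0, 3.0]

# With a list, doubling every value needs an explicit loop or comprehension.
doubled_list = [v * 2 for v in values]

# With an array, the operation is applied to every element in one expression.
arr = np.array(values)
doubled_arr = arr * 2

# Two arrays of the same shape are combined element by element.
summed = arr + np.array([10.0, 20.0, 30.0])

print(doubled_list)
print(doubled_arr)
print(summed)
```

Note that `*` between two arrays is also element-wise. For a true matrix product, NumPy uses the `@` operator.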

In [1]:
import numpy as np
import math 

**Array creation**

In [2]:
a = np.array([1,2,3,4])
print(a.ndim)

1


Two-dimensional array

In [3]:
b = np.array([[1,2,3,4],[5,6,7,8]])
print(b.ndim)

2


In [4]:
print(b.shape)
# 2 rows by 4 columns

(2, 4)


Creating arrays filled with default values

In [7]:
c = np.zeros((2,4))
print(c)
d = np.ones((3,2))
print(d)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]
[[1. 1.]
 [1. 1.]
 [1. 1.]]
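The trailing dots in the output above show that `zeros` and `ones` produce floats by default. A short sketch of two related options, `dtype` and `np.full`, neither of which appears in the cells above:

```python
import numpy as np

c = np.zeros((2, 4))
print(c.dtype)                      # floating point unless told otherwise

ints = np.zeros((2, 4), dtype=int)  # request integers instead
sevens = np.full((2, 3), 7)         # fill with an arbitrary constant

print(ints)
print(sevens)
```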


**Array operations**

In [14]:
e = np.arange(0, 10, 2)
f = np.linspace(0, 10, 5)

print(e)
print(f)
print(e+f)
print(e*f)
print(e**2)
print(e-f)
print(e/f)


[0 2 4 6 8]
[ 0.   2.5  5.   7.5 10. ]
[ 0.   4.5  9.  13.5 18. ]
[ 0.  5. 20. 45. 80.]
[ 0  4 16 36 64]
[ 0.  -0.5 -1.  -1.5 -2. ]
[nan 0.8 0.8 0.8 0.8]
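The `nan` in the last line comes from the first pair of elements, where `e[0] / f[0]` is `0 / 0`. NumPy reports this with a `RuntimeWarning` rather than an exception. One possible way to avoid it is sketched below, assuming that a placeholder of `0.0` is acceptable wherever the denominator is zero (an arbitrary choice for illustration).

```python
import numpy as np

e = np.arange(0, 10, 2)
f = np.linspace(0, 10, 5)

# Divide only where the denominator is non-zero. Positions that are skipped
# keep the value already stored in `out`, which is 0.0 here.
safe = np.divide(e, f, out=np.zeros_like(f), where=f != 0)
print(safe)
```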



In [15]:
# Use reshape to change the shape of an array

g = np.array([1,2,3,4])
print(g.reshape(2,2))

[[1 2]
 [3 4]]
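Two details of `reshape` are worth seeing in a cell: it returns a new array object and leaves `g` untouched, and the new shape must hold exactly the same number of elements. A small sketch (the `-1` shorthand is an addition, not used in the cell above):

```python
import numpy as np

g = np.array([1, 2, 3, 4])

h = g.reshape(2, -1)   # -1 asks NumPy to infer the remaining dimension
print(h.shape)
print(g.shape)         # g itself still has its original shape

# A shape that does not fit the number of elements raises ValueError.
try:
    g.reshape(3, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```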


In [16]:
# Multiplying by a scalar scales every element
g = g * 2
print(g)

[2 4 6 8]


## **REGEX**

### **Patterns and character classes**

#### **Set Operators**

In [19]:
# A set such as [AB] matches any single character listed inside the brackets

import re

grades="ACAAAABCBCBAA"
re.findall("[AB]", grades)

['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'A', 'A']

In [24]:
# Match an A immediately followed by a B or a C (B-C is a character range)
re.findall("[A][B-C]", grades)

['AC', 'AB']

In [26]:
# You could use the or operator | to get the same result
re.findall("AB|AC", grades)

['AC', 'AB']

**Caret - ^**

In [29]:
# Check if the first character is A
re.findall("^A", grades)

['A']

In [30]:
# Check if the first character is B
re.findall("^B", grades)

[]

In [None]:
# The first character is not B, so findall returns an empty list

**Dollar Sign - $**

In [32]:
# Check if the last character is A
re.findall("A$", grades)

['A']

In [33]:
# Check if the last character is B
re.findall("B$", grades)

[]

In [None]:
# Since the last character is not B, it will return an empty list

In [35]:
# Inside brackets, ^ negates the set: return every character that is not A
re.findall("[^A]", grades)

['C', 'B', 'C', 'B', 'C', 'B']

In [36]:
# Match a first character that is not A; the string starts with A, so the list is empty
re.findall("^[^A]", grades)

[]
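The caret plays two roles in the cells above: outside brackets it anchors the match to the start of the string, and inside brackets it negates the set. The sketch below puts both side by side and turns the start and end checks into booleans. Using `re.match` and `re.search` here is a choice made for this example, not something the cells above rely on.

```python
import re

grades = "ACAAAABCBCBAA"

# re.match only tries to match at the start of the string, like a leading ^.
starts_with_a = bool(re.match("A", grades))

# re.search scans the whole string; $ pins the match to the end.
ends_with_a = bool(re.search("A$", grades))

# ^ as the first character inside [] means "anything except these characters".
non_a = re.findall("[^A]", grades)

print(starts_with_a, ends_with_a)
print(non_a)
```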

#### **Quantifiers**


In [40]:
# {m,n} matches between m and n repetitions: here, runs of 2 to 10 consecutive As
re.findall("A{2,10}", grades)

['AAAA', 'AA']
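For comparison, here is a short sketch of two other quantifier forms on the same string: `{2}` for exactly two repetitions and `+` for one or more. The "longest streak" step is only an illustration of what the matches can be used for.

```python
import re

grades = "ACAAAABCBCBAA"

# Exactly two As per match, so the run of four As is split into two matches.
pairs = re.findall("A{2}", grades)
print(pairs)

# One or more As: every run of As, including single ones.
streaks = re.findall("A+", grades)
print(streaks)

# The longest uninterrupted run of A grades.
longest = max(streaks, key=len)
print(longest)
```

Quantifiers are greedy by default, which is why `A{2,10}` returned `'AAAA'` as one match rather than two shorter ones.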