### **Indexing, Slice indexing**

We can use slice indexing to pull out sub-regions of ndarrays.

More documentation can be found at https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

In [2]:
#create 1D array
import numpy as np
my_array_1d = np.array([1, 2, 3, 4, 5, 6, 7])   
print (my_array_1d)

[1 2 3 4 5 6 7]


In [3]:
#1D array: we need only one index to access element at any position
#call the value at index 5
print ('element at index 5: ', my_array_1d[5])

#ndarrays are mutable(changeable), here we change an element at index 5 
my_array_1d[5] = 15
print ('element at index 5 after change: ', my_array_1d[5])

element at index 5:  6
element at index 5 after change:  15


In [4]:
print (my_array_1d)

[ 1  2  3  4  5 15  7]


In [5]:
#get the values in the range
print ('elements in the range: ', my_array_1d[2:5]) #a:b - including a, until (and excluding) b

#we can change values in the range as well
my_array_1d[2:5] = 20
print ('elements in the range after change: ', my_array_1d[2:5])

elements in the range:  [3 4 5]
elements in the range after change:  [20 20 20]
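
The `a:b` range notation also accepts omitted bounds, a step, and negative indices. The following is a minimal sketch of those variants on a fresh array; the name `arr` is an assumption used only for this illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:3])     # start omitted: from the beginning up to (excluding) index 3
print(arr[4:])     # stop omitted: from index 4 to the end
print(arr[::2])    # step of 2: every second element
print(arr[-1])     # negative index: counts from the end
print(arr[::-1])   # negative step: the array in reverse order
```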


In [47]:
#2D arrays take two indices: the first for the row and the second for the column

#create 2D array
my_2d_array = np.array([[1, 2, 3, 4, 5], [11, 12, 13, 14, 15], [21, 22, 23, 24, 25]])
print ('original array:', my_2d_array, '\n')

#slicing: generates an array of the same rank
my_array_copy = my_2d_array.copy()
row_slice = my_array_copy[1:3, :] #a:b - including a, until (and excluding) b
print ('sliced array on [1:3, :]: \n', row_slice)

original array: [[ 1  2  3  4  5]
 [11 12 13 14 15]
 [21 22 23 24 25]] 

sliced array on [1:3, :]: 
 [[11 12 13 14 15]
 [21 22 23 24 25]]


In [None]:
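#changing a slice changes the array it was taken from, because a slice is a view;
#row_slice was taken from my_array_copy, so my_2d_array itself stays unchanged
row_slice[1:4, :] = 12 #the stop 4 is past the last row, so it is clipped to the end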

print ('sliced array: \n', row_slice, '\n')
print ('original array: \n', my_2d_array)


sliced array: 
 [[11 12 13 14 15]
 [12 12 12 12 12]] 

original array: 
 [[ 1  2  3  4  5]
 [11 12 13 14 15]
 [21 22 23 24 25]]
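
The difference between a slice (a view) and an explicit copy can be checked directly. The sketch below is one way to do it, assuming a fresh array named `base` (a name introduced here for illustration); `np.shares_memory` reports whether two arrays use the same underlying data.

```python
import numpy as np

base = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]])

# A basic slice is a view: it shares memory with the array it came from.
view = base[1:3, :]
view[0, 0] = 99
print(base[1, 0])                           # the change is visible through base
print(np.shares_memory(base, view))

# An explicit copy owns its data, so edits stay local to the copy.
independent = base[1:3, :].copy()
independent[0, 0] = -1
print(base[1, 0])                           # base is not affected this time
print(np.shares_memory(base, independent))
```

Slicing a copy, as the cells above do with `my_array_copy`, is a common way to experiment without touching the original data.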


In [7]:
#we can do the slicing for columns as well
slice_col = my_2d_array[:, 1:5]
print (slice_col)

[[ 2  3  4  5]
 [12 13 14 15]
 [22 23 24 25]]


In [8]:
#a boolean mask selects only the elements that meet a condition
#select the elements that are greater than 10
slice_col[slice_col>10]

array([12, 13, 14, 15, 22, 23, 24, 25])
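
The filter works because the comparison itself produces a boolean array of the same shape, which is then used as an index. The sketch below shows the mask on its own, a combined condition, and a masked assignment; the names `values`, `mask`, `in_range`, and `clipped` are assumptions for illustration.

```python
import numpy as np

values = np.array([[2, 3, 4, 5], [12, 13, 14, 15], [22, 23, 24, 25]])

mask = values > 10            # boolean array with the same shape as values
print(mask)

# Combine conditions with & (and) or | (or); the parentheses are required.
in_range = (values > 10) & (values < 20)
print(values[in_range])       # 1D array of the matching elements

# A mask can also appear on the left-hand side of an assignment.
clipped = values.copy()
clipped[clipped > 20] = 20
print(clipped)
```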

### **Arithmetic array operations**

More documentation at https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html

In [9]:
#addition can be done in two ways: with the plus sign, with the 'add' numpy function
arr_a = np.array([[10, 20, 30], [40, 50, 60]])
arr_b = np.array([[11, 21, 31], [41, 51, 61]])

print("Array A")
print(arr_a, '\n')

print("Array B")
print(arr_b, '\n')

print("Direct Addition")
print (arr_a + arr_b, '\n')

print("Numpy Addition")
print (np.add(arr_a, arr_b))   

Array A
[[10 20 30]
 [40 50 60]] 

Array B
[[11 21 31]
 [41 51 61]] 

Direct Addition
[[ 21  41  61]
 [ 81 101 121]] 

Numpy Addition
[[ 21  41  61]
 [ 81 101 121]]


In [10]:
#the same applies to subtraction
print("Direct Subtraction")
print (arr_b - arr_a, '\n')

print("Numpy Subtraction")
print (np.subtract(arr_b, arr_a))

Direct Subtraction
[[1 1 1]
 [1 1 1]] 

Numpy Subtraction
[[1 1 1]
 [1 1 1]]


In [11]:
#multiplication
print("Direct Multiplication")
print (arr_a * arr_b, '\n')

print("Numpy Multiplication")
print (np.multiply(arr_a, arr_b))

Direct Multiplication
[[ 110  420  930]
 [1640 2550 3660]] 

Numpy Multiplication
[[ 110  420  930]
 [1640 2550 3660]]


In [12]:
#division
print("Direct Division")
print (arr_b / arr_a, '\n')

print("Numpy Division")
print (np.divide(arr_b, arr_a))

Direct Division
[[1.1        1.05       1.03333333]
 [1.025      1.02       1.01666667]] 

Numpy Division
[[1.1        1.05       1.03333333]
 [1.025      1.02       1.01666667]]


In [13]:
#square root

print (np.sqrt(arr_a))

[[3.16227766 4.47213595 5.47722558]
 [6.32455532 7.07106781 7.74596669]]


In [14]:
#exponent (e**x)

print (np.exp(arr_a))

[[2.20264658e+04 4.85165195e+08 1.06864746e+13]
 [2.35385267e+17 5.18470553e+21 1.14200739e+26]]


In [15]:
#power

print (np.power(arr_a, 2))

[[ 100  400  900]
 [1600 2500 3600]]
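
The operators and the NumPy functions shown above are interchangeable. As a quick check, the sketch below compares each pair with `np.array_equal` and prints the resulting dtypes; the short names `a` and `b` are assumptions standing in for `arr_a` and `arr_b`.

```python
import numpy as np

a = np.array([[10, 20, 30], [40, 50, 60]])
b = np.array([[11, 21, 31], [41, 51, 61]])

# Each operator gives the same result as the matching NumPy function.
print(np.array_equal(a + b, np.add(a, b)))
print(np.array_equal(b - a, np.subtract(b, a)))
print(np.array_equal(a * b, np.multiply(a, b)))
print(np.array_equal(b / a, np.divide(b, a)))
print(np.array_equal(a ** 2, np.power(a, 2)))

# Division and square root return floating-point arrays even for integer
# inputs, while the product of two integer arrays stays an integer array.
print((a * b).dtype, (b / a).dtype, np.sqrt(a).dtype)
```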


**Assignment 1**

1. Create two different arrays (x and y) from lists.
1. Add arrays x and y. Print the result.
1. Subtract array x from y. Print the result.
1. Multiply array x by 5. Print the result.
1. Divide array y by 10. Print the result.

## **Pandas**
Pandas is an open-source library for data manipulation and analysis. It is built on top of NumPy.

Pandas helps you carry out analysis quickly and efficiently, and it can read and write Excel sheets, CSV files, and several other file types.

Documentation - https://pandas.pydata.org/pandas-docs/version/0.25.3/
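
As a minimal sketch of reading CSV data, the example below parses a small table with `pd.read_csv`. The CSV text, its column names, and the variable names `csv_text` and `items` are hypothetical; an in-memory string stands in for a file on disk so the example runs on its own.

```python
import io
import pandas as pd

# Hypothetical CSV content, standing in for a file on disk.
csv_text = """item,quantity,price
pen,3,1.5
book,1,12.0
bag,2,20.0
"""

# read_csv accepts a file path or any file-like object.
items = pd.read_csv(io.StringIO(csv_text))

print(items.head())    # first rows of the table
print(items.dtypes)    # column types inferred while parsing
```

With a real file, the path would be passed instead, for example `pd.read_csv('my_file.csv')`, where the file name is again only a placeholder.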

### **Pandas Series**


In [None]:
#creating a pandas Series from a list
import pandas as pd
lts = ['a', 'b', 'c', 'd', 'e']
ser_1 = pd.Series(lts)

print (ser_1)

0    a
1    b
2    c
3    d
4    e
dtype: object


In [17]:
#creating a pandas Series from an array
import pandas as pd
arr = np.array([10, 20, 30, 40, 50])
ser_2 = pd.Series(arr)

print (ser_2)

0    10
1    20
2    30
3    40
4    50
dtype: int64


In [29]:
#create a Series with custom index labels
ser_3 = pd.Series(arr, index = ['a', 'b', 'c', 'e', 'f'])

print (ser_3)

a    10
b    20
c    30
e    40
f    50
dtype: int64


In [25]:
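#accessing elements by position (.iloc makes the positional lookup explicit;
#plain ser_3[3] on a label index is deprecated in recent pandas versions)
print (ser_3.iloc[3], '\n\n')
print (ser_3.iloc[[0, 2, 4]], '\n\n')
print (ser_3[:3])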

40 


a    10
c    30
f    50
dtype: int64 


a    10
b    20
c    30
dtype: int64


In [33]:
#accessing element using an index label
print (ser_3['a'])
print (ser_3[['b', 'a']])

10
b    20
a    10
dtype: int64
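
Position-based and label-based access can both be spelled out explicitly with `.iloc` and `.loc`. The sketch below rebuilds the same Series under the assumed name `ser` and also shows how the two kinds of slice treat their end point.

```python
import numpy as np
import pandas as pd

ser = pd.Series(np.array([10, 20, 30, 40, 50]), index=['a', 'b', 'c', 'e', 'f'])

print(ser.iloc[3])            # by position
print(ser.loc['e'])           # by label, the same element
print(ser.iloc[[0, 2, 4]])    # several positions
print(ser.loc[['b', 'a']])    # several labels, in the order requested

# Label slices include the end label; positional slices exclude the stop.
print(ser.loc['a':'c'])       # rows a, b and c
print(ser.iloc[0:2])          # rows a and b
```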


### Create a Pandas DataFrame

A pandas DataFrame is a two-dimensional labeled data structure: a table with row labels (the index) and column labels.
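
As a minimal sketch of that structure, the example below builds a small DataFrame from a dictionary of columns and inspects its labels. The data and the names `scores`, `math`, and `art` are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical data: each dictionary key becomes a column label.
scores = pd.DataFrame(
    {'math': [90, 75, 60], 'art': [70, 85, 95]},
    index=['r1', 'r2', 'r3'],   # row labels
)

print(scores.shape)             # (number of rows, number of columns)
print(list(scores.index))       # row labels
print(list(scores.columns))     # column labels
print(scores['math'])           # a single column is a Series
```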

In [35]:
#Creating from a list of lists
pet_info = [['Howard', 10, 'Goat'], ['Lucy', 4 , 'Fish'], ['Roco', 8, 'Lizard']]
pet_info

[['Howard', 10, 'Goat'], ['Lucy', 4, 'Fish'], ['Roco', 8, 'Lizard']]

In [36]:
pet_df = pd.DataFrame(pet_info, columns = ['Name', 'Age', 'Type'],
                      index = ['A', 'B', 'C'])
                     
pet_df

     Name  Age    Type
A  Howard   10    Goat
B    Lucy    4    Fish
C    Roco    8  Lizard


In [37]:
pet_df['Name']

A    Howard
B      Lucy
C      Roco
Name: Name, dtype: object

In [40]:
pet_df[['Name','Age']]

     Name  Age
A  Howard   10
B    Lucy    4
C    Roco    8


### Exercise 2

1. Print out the 'Type' column of pet_df.

### iloc vs. loc

1. iloc - position-based indexing (index always starts with 0)

1. loc - label-based indexing

NB: If you do not specify an index, pandas assigns the default integer labels 0, 1, 2, ..., so iloc[i] and loc[i] select the same row.

In [44]:
pet_df.iloc[1]

Name    Lucy
Age        4
Type    Fish
Name: B, dtype: object

In [45]:
pet_df.loc['B']

Name    Lucy
Age        4
Type    Fish
Name: B, dtype: object
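
The note about unspecified indexes can be checked with a DataFrame built without an `index` argument. The sketch below assumes the name `pets` for such a frame and shows that single-row lookups agree while slices still behave differently.

```python
import pandas as pd

pets = pd.DataFrame(
    [['Howard', 10, 'Goat'], ['Lucy', 4, 'Fish'], ['Roco', 8, 'Lizard']],
    columns=['Name', 'Age', 'Type'],
)  # no index given, so the row labels are 0, 1, 2

# With the default index, a single integer selects the same row either way.
print(pets.iloc[1].equals(pets.loc[1]))

# Slices still differ: iloc excludes the stop, loc includes the end label.
print(len(pets.iloc[0:1]))
print(len(pets.loc[0:1]))

# Both accessors also take a column selector after the comma.
print(pets.iloc[1, 0], pets.loc[1, 'Name'])
```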

**Exercise 3**

1. Download the weight-height.csv dataset.
1. Read the dataset using pandas.
1. Use position-based indexing (iloc) to display the fifth data point.
1. Print a column from the dataset.
