# Lesson 1 Practice: NumPy Part 1
Use this notebook to follow along with the lesson in the corresponding lesson notebook: [L01-Numpy_Part1-Lesson.ipynb](./L01-Numpy_Part1-Lesson.ipynb).  



## Instructions
Follow along with the teaching material in the lesson. Sections labeled "Tasks" are interspersed throughout the tutorial and marked with this icon: ![Task](http://icons.iconarchive.com/icons/sbstnblnd/plateau/16/Apps-gnome-info-icon.png). Perform the instructions in each of these sections in the practice notebook. For each task, use the cell below it to write and test your code; you may add cells for any task as needed. When you complete the tutorial, you can turn in the final practice notebook.

## Task 1a: Setup

In the practice notebook, import the following packages:
+ `numpy` as `np`

In [2]:
import numpy as np


## Task 2a: Creating Arrays

In the practice notebook, perform the following.  
- Create a 1-dimensional numpy array and print it.
- Create a 2-dimensional numpy array and print it.
- Create a 3-dimensional numpy array and print it.

In [3]:
one_d = np.array([1,2,3])
two_d = np.array([[1,2,3],[2,3,4]])
# three levels of nesting give a 3-dimensional array (shape (2, 2, 3))
three_d = np.array([[[1,2,3],[2,3,4]],[[3,4,5],[4,5,6]]])

print("one_d",one_d)
print("two_d",two_d)
print("three_d",three_d)


one_d [1 2 3]
two_d [[1 2 3]
 [2 3 4]]
three_d [[[1 2 3]
  [2 3 4]]

 [[3 4 5]
  [4 5 6]]]


In [4]:
one_d

array([1, 2, 3])

In [5]:
two_d

array([[1, 2, 3],
       [2, 3, 4]])

In [6]:
three_d

array([[[1, 2, 3],
        [2, 3, 4]],

       [[3, 4, 5],
        [4, 5, 6]]])
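
The number of dimensions follows the nesting depth of the input list, not the number of rows, and `ndim` is a quick way to confirm it. The following is a minimal sketch; the names `flat`, `table` and `cube` are illustrative and do not come from the lesson.

```python
import numpy as np

# One level of nesting -> 1-D, two levels -> 2-D, three levels -> 3-D.
flat = np.array([1, 2, 3])
table = np.array([[1, 2, 3], [2, 3, 4]])
cube = np.array([[[1, 2, 3], [2, 3, 4]], [[3, 4, 5], [4, 5, 6]]])

for name, arr in [("flat", flat), ("table", table), ("cube", cube)]:
    print(name, "ndim:", arr.ndim, "shape:", arr.shape)
```

Note that a 3x3 array built from three rows still has `ndim == 2`; only an extra level of brackets adds a dimension.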

## Task 3a: Accessing Array Attributes

In the practice notebook, perform the following.

- Create a NumPy array.
- Write code that prints these attributes (one per line): `ndim`, `shape`, `size`, `dtype`, `itemsize`, `data`, `nbytes`.
- Add a comment line before each line, describing what value the attribute returns.


In [7]:
# creating a NumPy array
my_array   = np.array([1,2])
my_array2 = np.array([[1.2,2.3,3.4],[4.5,5.6,6.7]])
                       

# finding the dimensions
print(my_array.ndim)
print(my_array2.ndim)

#finding the shape
print(my_array.shape)
print(my_array2.shape)

#finding the size - total number of elements
print(my_array.size)
print(my_array2.size)

#finding the dtype - data type
print(my_array.dtype)
print(my_array2.dtype)

#finding the itemsize - length of one array element in bytes
print(my_array.itemsize)
print(my_array2.itemsize)
                       
#finding the data - buffer object pointing to the start of the array's data
print(my_array.data)
print(my_array2.data)

#finding the nbytes - total number of bytes for the array
# basically this is [size*itemsize]
print(my_array.nbytes)
print(my_array2.nbytes)

1
2
(2,)
(2, 3)
2
6
int64
float64
8
8
<memory at 0x7f76b96a8f40>
<memory at 0x7f76f8385040>
16
48
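
The comment above notes that `nbytes` is `size * itemsize`. A small sketch that checks this relation directly; the dtypes are set explicitly here (an assumption added for the example) so the numbers do not depend on the platform's default integer type.

```python
import numpy as np

# dtype is fixed explicitly so itemsize does not depend on the platform default.
ints = np.array([1, 2], dtype=np.int64)
floats = np.array([[1.2, 2.3, 3.4], [4.5, 5.6, 6.7]], dtype=np.float64)

for arr in (ints, floats):
    # nbytes should equal the element count times the bytes per element.
    print(arr.size, arr.itemsize, arr.nbytes, arr.nbytes == arr.size * arr.itemsize)

# A narrower dtype shrinks itemsize, and nbytes with it.
small = floats.astype(np.float32)
print(small.itemsize, small.nbytes)
```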


## Task 4a: Initializing Arrays

In the practice notebook, perform the following.

+ Create an initialized array by using these functions:  `ones`, `zeros`, `empty`, `full`, `arange`, `linspace` and `random.random`. Be sure to follow each array creation with a call to `print()` to display your newly created arrays. 
+ Add a comment above each function call describing what is being done.  

In [8]:
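# ones: 3x4 array filled with 1; dtype is given as int (default dtype = float)
ones = np.ones((3,4),int)
print(ones)

# zeros: 2x3 array filled with 0 (default dtype = float)
zeroes = np.zeros((2,3))
print(zeroes)

# empty: 2x2 array allocated without initializing its values, so the contents are arbitrary (default dtype = float)
empty = np.empty((2,2))
print(empty)

# full: 2x2 array filled with the given value; default dtype = None - determines the dtype from the input
full = np.full((2,2),1)
print(full)
print(full.dtype)

# full with an array-like fill value: [1.0,2,3] is broadcast across the rows of the (2,3) result
full2 = np.full((2,3),[1.0,2,3])
print(full2)
print(full2.dtype)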



[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]
[[0. 0. 0.]
 [0. 0. 0.]]
[[2.43613493e-316 0.00000000e+000]
 [4.74303020e-322 6.92428728e-310]]
[[1 1]
 [1 1]]
int64
[[1. 2. 3.]
 [1. 2. 3.]]
float64
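
The dtype comments above can be checked with a short experiment. This is a sketch under the assumption that the explicit `dtype` keyword is the preferred way to override the defaults; the variable names are illustrative.

```python
import numpy as np

# Default dtype for ones/zeros/empty is float64.
default_ones = np.ones((2, 2))

# An explicit dtype overrides the default.
int_zeros = np.zeros((2, 2), dtype=np.int32)

# full infers the dtype from the fill value unless dtype is given.
inferred = np.full((2, 2), 1.5)

# With an integer dtype the fractional part of the fill value is dropped.
forced = np.full((2, 2), 1.5, dtype=np.int64)

print(default_ones.dtype, int_zeros.dtype, inferred.dtype, forced.dtype)
print(forced)
```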


In [9]:
# numpy.arange([start, ]stop, [step, ]dtype=None, *, like=None)
# default start is 0. default step size is 1. default dtype is None (inferred from the other arguments)
# If step is given as a positional argument, start must also be given.
# Avoid combining dtype = int with a non-integer step; the result is unreliable (see the "error prone" call below)

# Returns array

arange = np.arange(5)
print(arange)

# It does not include the STOP in the output array
arange = np.arange(10,20,2)
print(arange)

arange = np.arange(10,21,2)
print(arange)
arange = np.arange(11,20,3)
print(arange)

# error prone: with dtype = int and a step of 0.5, every value in the output below comes out as 0
arange = np.arange(0,10,0.5, dtype = int)
print(arange)


arange = np.arange(0,10,0.5, dtype = float)
print(arange)
print(arange.shape)
print(arange.ndim)


[0 1 2 3 4]
[10 12 14 16 18]
[10 12 14 16 18 20]
[11 14 17]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0.  0.5 1.  1.5 2.  2.5 3.  3.5 4.  4.5 5.  5.5 6.  6.5 7.  7.5 8.  8.5
 9.  9.5]
(20,)
1
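
When a fractional step is needed, two common alternatives avoid the pitfalls noted above: scaling an integer `arange`, or using `linspace` with an explicit number of points. A minimal sketch; the names `direct`, `scaled` and `spaced` are illustrative.

```python
import numpy as np

# Float step passed directly, as in the cell above.
direct = np.arange(0, 10, 0.5)

# Alternative 1: build integers first, then scale them.
scaled = np.arange(20) * 0.5

# Alternative 2: state the number of points and the last value explicitly.
spaced = np.linspace(0, 9.5, 20)

print(len(direct), len(scaled), len(spaced))
print(np.allclose(direct, scaled), np.allclose(direct, spaced))
```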


In [10]:
# numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
# equally spaced values over a closed interval [start,stop]
# num is the number of samples to generate; the spacing follows from it
# endpoint = False makes the interval half-open [start,stop)
# retstep = True also returns the calculated step size

# Returns an array, or a tuple (array, step) when retstep = True

#linspace = np.linspace(1,3)
#print(linspace)

# includes the stop in the output
linspace = np.linspace(1,3,num = 5,retstep = True)
print(linspace)

# excludes the stop in the output
linspace = np.linspace(1,3,num = 5,endpoint = False, retstep = True)
print(linspace)

# The step sizes differ between the two calls above: 0.5 with the endpoint included, 0.4 without it

# trying the linspace for array input for start and stop
linspace = np.linspace((2,3),(4,4),num = 5,retstep = True)





(array([1. , 1.5, 2. , 2.5, 3. ]), 0.5)
(array([1. , 1.4, 1.8, 2.2, 2.6]), 0.4)
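
The two step sizes in the output above follow from how many gaps the `num` points span. A short sketch of that arithmetic, using the same `start`, `stop` and `num` as the cell above:

```python
import numpy as np

start, stop, num = 1, 3, 5

values_closed, step_closed = np.linspace(start, stop, num=num, retstep=True)
values_open, step_open = np.linspace(start, stop, num=num, endpoint=False, retstep=True)

# Endpoint included: num points span num - 1 gaps.
print(step_closed, (stop - start) / (num - 1))

# Endpoint excluded: num points span num gaps.
print(step_open, (stop - start) / num)
```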


In [11]:
# returns random floats in the half-open interval [0,1): a single float with no argument, an array for a given size or shape

print(np.random.random())
print(np.random.random(2))
print(np.random.random((2,3)))

print(np.random.random((2,3)).shape)

0.6360840693647614
[0.70437364 0.2559916 ]
[[0.77052062 0.80101797 0.52311164]
 [0.75831325 0.36612732 0.93494791]]
(2, 3)
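
The values above change on every run. If repeatable output is wanted, one option is a seeded generator. This sketch assumes NumPy 1.17 or newer, where `np.random.default_rng` is available; it is not part of the lesson's required functions, and the seed `42` is arbitrary.

```python
import numpy as np

# Assumption: NumPy >= 1.17, which provides default_rng.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

# Two generators created with the same seed produce the same samples.
sample_a = rng_a.random((2, 3))
sample_b = rng_b.random((2, 3))

print(np.array_equal(sample_a, sample_b))
# Every value lies in the half-open interval [0, 1).
print(sample_a.min() >= 0.0, sample_a.max() < 1.0)
```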


## Task 5a:  Broadcasting Arrays

In the practice notebook, perform the following.

+ Create two arrays of differing sizes but compatible with broadcasting.
+ Perform addition, multiplication and subtraction.
+ Create two additional arrays of differing size that do not meet the rules for broadcasting and try a mathematical operation.  

In [12]:


array_1 = np.array([[0,0,0,0],[1,1,1,1],[2,2,2,2]])
array_2 = np.random.random((2,4))
array_3 = np.ones((3,2,4))

print(f"array_1 shape: {array_1.shape}")
print(f"array_2 shape: {array_2.shape}")
print(f"array_3 shape: {array_3.shape}")

print(array_1)
print(array_2)
print(array_3)



array_1 shape: (3, 4)
array_2 shape: (2, 4)
array_3 shape: (3, 2, 4)
[[0 0 0 0]
 [1 1 1 1]
 [2 2 2 2]]
[[0.28927564 0.65764496 0.5705427  0.55124274]
 [0.16987286 0.34262658 0.39031002 0.42618431]]
[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]]]


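array_1 and array_2 are NOT broadcastable: comparing shapes from the right, (3, 4) and (2, 4) agree on 4, but 3 and 2 differ and neither is 1.

array_1 and array_3 are NOT broadcastable: (3, 4) and (3, 2, 4) agree on 4, but the next pair, 3 and 2, differ and neither is 1.

array_2 and array_3 are broadcastable: (2, 4) matches the trailing dimensions of (3, 2, 4), so array_2 is repeated along the leading axis of length 3.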
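
The three verdicts above can be reproduced by comparing the shapes from the right, where each pair of dimensions must be equal or contain a 1. The function `is_broadcastable` below is a hypothetical helper written for this illustration, not a NumPy function.

```python
from itertools import zip_longest


def is_broadcastable(shape_a, shape_b):
    """Hypothetical helper: True if two shapes satisfy the broadcasting rule."""
    # Walk both shapes from the last dimension; a missing dimension counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True


print(is_broadcastable((3, 4), (2, 4)))     # array_1 vs array_2
print(is_broadcastable((3, 4), (3, 2, 4)))  # array_1 vs array_3
print(is_broadcastable((2, 4), (3, 2, 4)))  # array_2 vs array_3
```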

In [13]:
result_12 = array_1 + array_2
print(result_12)

ValueError: operands could not be broadcast together with shapes (3,4) (2,4) 

In [14]:
result_13 = array_1 + array_3
print(result_13)

ValueError: operands could not be broadcast together with shapes (3,4) (3,2,4) 

In [15]:
result_23 = array_2 + array_3
print(result_23)



[[[1.28927564 1.65764496 1.5705427  1.55124274]
  [1.16987286 1.34262658 1.39031002 1.42618431]]

 [[1.28927564 1.65764496 1.5705427  1.55124274]
  [1.16987286 1.34262658 1.39031002 1.42618431]]

 [[1.28927564 1.65764496 1.5705427  1.55124274]
  [1.16987286 1.34262658 1.39031002 1.42618431]]]


In [16]:
result_23 = array_2 - array_3
print(result_23)



[[[-0.71072436 -0.34235504 -0.4294573  -0.44875726]
  [-0.83012714 -0.65737342 -0.60968998 -0.57381569]]

 [[-0.71072436 -0.34235504 -0.4294573  -0.44875726]
  [-0.83012714 -0.65737342 -0.60968998 -0.57381569]]

 [[-0.71072436 -0.34235504 -0.4294573  -0.44875726]
  [-0.83012714 -0.65737342 -0.60968998 -0.57381569]]]


In [17]:
result_23 = array_2 * array_3
print(result_23)

[[[0.28927564 0.65764496 0.5705427  0.55124274]
  [0.16987286 0.34262658 0.39031002 0.42618431]]

 [[0.28927564 0.65764496 0.5705427  0.55124274]
  [0.16987286 0.34262658 0.39031002 0.42618431]]

 [[0.28927564 0.65764496 0.5705427  0.55124274]
  [0.16987286 0.34262658 0.39031002 0.42618431]]]
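
Shapes that fail to broadcast can sometimes be made compatible by inserting a length-1 axis. As a sketch, assuming the intent is to pair each row of a (3, 4) array with the matching block of a (3, 2, 4) array, `np.newaxis` reshapes (3, 4) to (3, 1, 4). The arrays are redefined here so the block runs on its own.

```python
import numpy as np

array_1 = np.array([[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2]])  # shape (3, 4)
array_3 = np.ones((3, 2, 4))

# Insert a length-1 axis in the middle: (3, 4) -> (3, 1, 4).
expanded = array_1[:, np.newaxis, :]

# (3, 1, 4) and (3, 2, 4) are compatible: the middle 1 stretches to 2.
result_13 = expanded + array_3
print(expanded.shape, result_13.shape)
print(result_13[2])
```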


## Task 6a: Math/Stats Aggregate Functions

In the practice notebook, perform the following.

+ Create three to five arrays
+ Experiment with each of the aggregation functions: `sum`, `minimum`, `maximum`, `cumsum`, `mean`, `np.corrcoef`, `np.std`, `np.var`. 
+ For each function call, add a comment line above it that describes what it does.  
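
One point worth keeping in mind for the list above: `np.minimum` and `np.maximum` compare two arrays element by element, while `np.min` and `np.max` reduce a single array. A minimal sketch of the difference, with small illustrative arrays:

```python
import numpy as np

a = np.array([[4, 1], [2, 8]])
b = np.array([3, 3])

# Aggregates: reduce one array to fewer values.
overall_min = np.min(a)          # smallest element of the whole array
column_max = np.max(a, axis=0)   # largest element in each column
print(overall_min, column_max)

# Element-wise: compare two arrays position by position (b is broadcast over the rows).
pair_min = np.minimum(a, b)
pair_max = np.maximum(a, b)
print(pair_min)
print(pair_max)
```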
```


In [18]:
## sum
## numpy.sum(a, axis=None, dtype=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
## Returns an array with the same shape as a, with the SPECIFIED AXIS REMOVED. 
## If a is a 0-d array, or if axis is None, a scalar is returned. 
## If an output array is specified, a reference to out is returned.

a = 2
print(np.sum(a))

print(array_1)
print(np.sum(array_1))
print(np.sum(array_1,initial = 10))

print(np.sum(array_1,axis = 0))
print(np.sum(array_1,axis = 1))




2
[[0 0 0 0]
 [1 1 1 1]
 [2 2 2 2]]
12
22
[3 3 3 3]
[0 4 8]


In [19]:
## Cumulative sum
## numpy.cumsum(a, axis=None, dtype=None, out=None)
## A new array holding the result is returned unless out is specified, 
## in which case a reference to out is returned. 
## The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.

## size of cumsum is same as 'a' or 'array_1' in all the outputs
a = np.array([1,2,3,4])

# shape of cumsum is same as array_1 since axis is specified
cumsum = np.cumsum(array_1,axis = 1)
print(cumsum)

cumsum = np.cumsum(array_1,axis = 0)
print(cumsum)

print(f"array_1 shape: {array_1.shape}")
print(f"array_1 size: {array_1.size}")
print(f"cumsum shape: {cumsum.shape}")
print(f"cumsum size: {cumsum.size}")

# shape of cumsum and a are different since axis is not specified
cumsum = np.cumsum(array_1)
print(cumsum)

print(f"array_1 shape: {array_1.shape}")
print(f"array_1 size: {array_1.size}")
print(f"cumsum shape: {cumsum.shape}")
print(f"cumsum size: {cumsum.size}")

# shape of cumsum is the same as 'a' since a is a 1-d array
cumsum = np.cumsum(a)
print(cumsum)
print(f"a shape: {a.shape}")
print(f"a size: {a.size}")
print(f"cumsum shape: {cumsum.shape}")
print(f"cumsum size: {cumsum.size}")


[[0 0 0 0]
 [1 2 3 4]
 [2 4 6 8]]
[[0 0 0 0]
 [1 1 1 1]
 [3 3 3 3]]
array_1 shape: (3, 4)
array_1 size: 12
cumsum shape: (3, 4)
cumsum size: 12
[ 0  0  0  0  1  2  3  4  6  8 10 12]
array_1 shape: (3, 4)
array_1 size: 12
cumsum shape: (12,)
cumsum size: 12
[ 1  3  6 10]
a shape: (4,)
a size: 4
cumsum shape: (4,)
cumsum size: 4


In [20]:
# MINIMUM

# numpy.minimum(x1, x2, out=None)
# element-wise minimum of two arrays; x1 and x2 must have the same shape or be broadcastable

a = np.random.random((2,2))
print(a)

b = np.random.random(2)
print(b)

# named elementwise_min to avoid shadowing Python's built-in min
elementwise_min = np.minimum(a,b)
print(elementwise_min)



[[0.04793667 0.29068603]
 [0.56783471 0.43055982]]
[0.21385907 0.2456882 ]
[[0.04793667 0.2456882 ]
 [0.21385907 0.2456882 ]]


In [21]:
# MAXIMUM

# numpy.maximum(x1, x2, out=None)
# element-wise maximum of two arrays; x1 and x2 must have the same shape or be broadcastable

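# named elementwise_max to avoid shadowing Python's built-in max
elementwise_max = np.maximum(a,b)
print(elementwise_max)

c = np.random.random((3,2))
print(c)
# shapes (2,2) and (3,2) are not broadcastable, so this call raises a ValueError
elementwise_max = np.maximum(a,c)
print(elementwise_max)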

[[0.21385907 0.29068603]
 [0.56783471 0.43055982]]
[[0.23935695 0.17424096]
 [0.75921668 0.76587642]
 [0.34726925 0.39527941]]


ValueError: operands could not be broadcast together with shapes (2,2) (3,2) 

In [22]:
# np.mean()

# numpy.mean(a, axis=None, dtype=None, out=None)
# average taken over the flattened array by default if axis is not specified

print(array_1)

print(np.sum(array_1))
print(np.mean(array_1))  

# mean along an axis
print(np.sum(array_1,axis=0))
print(np.mean(array_1,axis=0))


print(np.sum(array_1,axis=1))
print(np.mean(array_1,axis=1))



[[0 0 0 0]
 [1 1 1 1]
 [2 2 2 2]]
12
1.0
[3 3 3 3]
[1. 1. 1. 1.]
[0 4 8]
[0. 1. 2.]


In [23]:
array_3 = np.array([ [[1,2,3,4], [1,2,3,4]], [[1,2,3,4], [5,6,7,8]], [[1,2,3,4], [9,10,11,12]]])   
print(array_3)
print(array_3.shape)

# print(np.sum(array_3,axis =0))
# print(np.mean(array_3,axis = 0))

# print(np.sum(array_3,axis =1))
# print(np.mean(array_3,axis =1))

# print(np.sum(array_3,axis =2))
# print(np.mean(array_3,axis = 2))

# With keepdims = True, the reduced axis is kept as a dimension of size one
# print(np.mean(array_3,axis = 2,keepdims = True))

# print(np.mean(array_3,axis = (0,1)))
# print(np.mean(array_3,axis = (0,2)))
print(np.mean(array_3,axis = (1,2)))

[[[ 1  2  3  4]
  [ 1  2  3  4]]

 [[ 1  2  3  4]
  [ 5  6  7  8]]

 [[ 1  2  3  4]
  [ 9 10 11 12]]]
(3, 2, 4)
[2.5 4.5 6.5]


In [24]:
# np.median()

# numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)
# overwrite_input = True lets median use the memory of 'a' for its calculations, so 'a' may be modified
# other parameters are same as in np.mean() above

print(array_3)
print(np.median(array_3,axis = 2))

[[[ 1  2  3  4]
  [ 1  2  3  4]]

 [[ 1  2  3  4]
  [ 5  6  7  8]]

 [[ 1  2  3  4]
  [ 9 10 11 12]]]
[[ 2.5  2.5]
 [ 2.5  6.5]
 [ 2.5 10.5]]


In [25]:
# np.corrcoef()

# numpy.corrcoef(x, y=None, rowvar=True, dtype=None)
# When rowvar = True, each row represents a variable while the columns contain observations
# When rowvar = False, columns are considered as variables and the rows contain observations
# 
# A variable contains all values that measure the same underlying attribute (like height, temperature, duration) 
# across units. An observation contains all values measured on the same unit (like a person, or a day,
# or a city) across attributes.

array_4 = np.random.random((3,4))
print(array_4)
print(np.corrcoef(array_4))
#print(np.corrcoef(array_4,rowvar=False))

###############################################

# np.std() and np.var()

# numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)
# numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)
# default is to find the standard deviation or variance of the flattened array

print(array_4)
print(np.std(array_4))
print(np.var(array_4))


[[0.67433263 0.81684289 0.95831519 0.02411802]
 [0.8403148  0.76261433 0.75369882 0.73388704]
 [0.72522509 0.21613262 0.58740716 0.38089145]]
[[1.         0.31797838 0.20423242]
 [0.31797838 1.         0.66950147]
 [0.20423242 0.66950147 1.        ]]
[[0.67433263 0.81684289 0.95831519 0.02411802]
 [0.8403148  0.76261433 0.75369882 0.73388704]
 [0.72522509 0.21613262 0.58740716 0.38089145]]
0.26509055597407044
0.07027300286664179


## Task 6b: Logical Aggregate Functions

In the practice notebook, perform the following.

+ Create two arrays containing boolean values.
+ Experiment with each of the aggregation functions: `logical_and`, `logical_or`, `logical_not`. 
+ For each function call, add a comment line above it that describes what it does.  
```

In [38]:
# np.logical_or(): element-wise OR; True where at least one input is True

log_a =[True, False, True, False]
log_b =[True, True, False, False]



print(np.logical_or(log_a,log_b))

[ True  True  True False]


In [41]:
# np.logical_and(): element-wise AND; True only where both inputs are True
print(np.logical_and(log_a,log_b))



[ True False False False]


In [42]:
# '&' gives the same result as 'np.logical_and' for boolean NumPy arrays, but it is not defined for plain Python lists

print(log_a&log_b)

TypeError: unsupported operand type(s) for &: 'list' and 'list'

In [49]:
a = np.array(log_a)
b = np.array(log_b)

print(a.shape)

print(a&b)

(4,)
[ True False False False]
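
The `&` operator matches `np.logical_and` only for boolean arrays. On integer arrays `&` works bit by bit, while `np.logical_and` treats any nonzero value as True. A small sketch with illustrative values:

```python
import numpy as np

x = np.array([1, 2])
y = np.array([2, 1])

# logical_and treats every nonzero value as True.
logical_result = np.logical_and(x, y)
# & works on the bits: 1 & 2 == 0 and 2 & 1 == 0.
bitwise_result = x & y
print(logical_result, bitwise_result)

# For boolean arrays the two agree.
p = np.array([True, False, True, False])
q = np.array([True, True, False, False])
print(np.array_equal(p & q, np.logical_and(p, q)))
```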


In [58]:
c = np.array([[True,False,True,False],[True,True,True,True]])
print(c)

[[ True False  True False]
 [ True  True  True  True]]


In [61]:
print(a)
print(c)
print(np.logical_and(a,c))
print(np.logical_or(a,c))



[ True False  True False]
[[ True False  True False]
 [ True  True  True  True]]
[[ True False  True False]
 [ True False  True False]]
[[ True False  True False]
 [ True  True  True  True]]


In [28]:
# np.logical_not(): element-wise NOT; flips True to False and False to True
print(np.logical_not(log_a))

[False  True False  True]
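
The functions above work element by element. To collapse a boolean array along an axis, `np.all` and `np.any` are the usual choices, and `np.logical_and.reduce` expresses the same reduction through the ufunc itself. A brief sketch using the 2-D boolean array from this task, redefined so the block runs on its own:

```python
import numpy as np

c = np.array([[True, False, True, False],
              [True, True, True, True]])

# Reduce along each row: is every value True? is any value True?
row_all = np.all(c, axis=1)
row_any = np.any(c, axis=1)
print(row_all, row_any)

# The same "all" reduction expressed through the ufunc's reduce method.
row_and = np.logical_and.reduce(c, axis=1)
print(row_and)
```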
