## NumPy Worksheet

This worksheet covers the following topics:

 - How to import NumPy
 - How to create NumPy arrays
 - Indexing, Fancy Indexing
 - Combining arrays
 - Slicing
 - Functions 
 - Broadcasting
 - Masking, Sorting and Comparison
 - Vectorization


In [2]:
# import numpy
import numpy as np

# check the version of numpy
print('NumPy version:', np.__version__)

NumPy version: 2.1.1


## 2- How to create NumPy arrays

There are many ways to create arrays in NumPy. We will take a look at a few of them here.

In [3]:
# create one dimensional numpy array with [1,2,3]
np.array([1, 2, 3])

array([1, 2, 3])

In [4]:
# create a one-dimensional array of zeros
np.zeros(3)

array([0., 0., 0.])

In [5]:
# create a one-dimensional array of ones
np.ones(3)

array([1., 1., 1.])

In [6]:
# create 1-d array of 3 random integers from 1 to 9 (the upper bound 10 is exclusive)
np.random.randint(1,10, 3)

array([2, 6, 9])

In [7]:
# create array with 5 linearly spaced values between 0 and 10
# hint: linspace
np.linspace(0, 10, 5 )

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

In [8]:
# create 2-Dimensional 3x3 array with values 1 to 9
np.array([[1,2,3],
          [4,5,6],
          [7,8,9]])

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [9]:
#create a 2 by 2 array
np.array([np.arange(2), np.arange(2)])

array([[0, 1],
       [0, 1]])

## Test Yourself

What does arange(5) do?
1. Creates a Python list of 5 elements with the values 1-5.
2. Creates a Python list of 5 elements with the values 0-4.
3. Creates a NumPy array with the values 1-5.
4. Creates a NumPy array with the values 0-4. ✅
5. None of the above.


In [16]:
np.arange(5)

array([0, 1, 2, 3, 4])

## Test Yourself

1. Create a three-dimensional array of dimension 3,4,5 with random integers from 0-9
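
One possible approach to this exercise, sketched with NumPy's `Generator` API. The seed `0` and the name `cube` are arbitrary choices made for this illustration, not part of the worksheet.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed so the result is repeatable
cube = rng.integers(0, 10, size=(3, 4, 5))   # upper bound 10 is exclusive, so values are 0-9

print(cube.shape)  # (3, 4, 5)
print(cube.ndim)   # 3
```

The legacy call `np.random.randint(0, 10, (3, 4, 5))` should produce an array of the same shape and value range.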

In [17]:
# create a 3x4 array of random floats in [0, 1)
np.random.random((3,4))

array([[0.90642542, 0.89800597, 0.71551696, 0.31703287],
       [0.66831033, 0.53845232, 0.20569387, 0.18315715],
       [0.19767461, 0.25539953, 0.22225043, 0.18385977]])

In [18]:
a = np.array([1,2,3])
# Extend a with another value
# hint: append

a = np.append(a, 4)
print(a)
a.dtype

[1 2 3 4]


dtype('int64')

In [19]:
b=np.array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [20]:
c=np.array([[[1,2],[3,4],[9,10]],[[5,6],[7,8],[11,12]]])

# print the shape and dimension of a
# print the shape and dimension of b
# print the shape and dimension of c
print("Shape of a:", np.shape(a))
print("Shape of b:", np.shape(b))
print("Shape of c:", np.shape(c))
print('Dimension of a:', np.ndim(a))
print('Dimension of b:', np.ndim(b))
print('Dimension of c:', np.ndim(c))

Shape of a: (4,)
Shape of b: (3, 3)
Shape of c: (2, 3, 2)
Dimension of a: 1
Dimension of b: 2
Dimension of c: 3


In [21]:
# print the number of elements in a
# print the number of elements in b
# print the number of elements in c
print('Number of elements in a:', np.size(a))
print('Number of elements in b:', np.size(b))
print('Number of elements in c:', np.size(c))

Number of elements in a: 4
Number of elements in b: 9
Number of elements in c: 12


An ellipsis (`...`) stands in for as many full slices (`:`) as are needed to cover the remaining dimensions, so the two expressions below are equivalent:

In [22]:
b[0,...] #equal to b[0,:]

array([1, 2, 3])

## Test Yourself

Create an array:

```python
np.arange(24).reshape(2,3,4)
```
You have created a 2x3x4 three dimensional array. We can visualize this as a two-story building with 12 rooms on each floor laid out in 3 rows and 4 columns.

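reshape() returns the same data arranged in a new shape; the original array is left unchanged, and the total number of elements (here 2 x 3 x 4 = 24) must stay the same.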

1. Select all of the first floor
2. Select first column and row regardless of floor
3. Select second row of the first floor
4. Select all the rooms on the second row, regardless of floor and column
5. Select the rooms in the second column of the first floor

In [27]:
data = np.arange(24).reshape(2,3,4)
data

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [29]:
# 1. Select all of the first floor of data
data[0,...]

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [31]:
# 2. Select first column and row regardless of floor
data[...,0,0]

array([ 0, 12])

In [33]:
# 3. Select second row of the first floor
data[0, 1,...]

array([4, 5, 6, 7])

In [37]:
# 4. Select all the rooms on the second row, regardless of floor and column
data[:, 1, :]

array([[ 4,  5,  6,  7],
       [16, 17, 18, 19]])

In [38]:
# 5. Select the rooms in the second column of the first floor
data[0,:,1]

array([1, 5, 9])

In [None]:
# print the first element of a 
# print the first element of a using a negative index
print(a[0])
print(a[-4])


In [None]:
# get the last element of a 
# print the last element of a using a negative index
print(a[-1]) 
print(a[3])

In [None]:
# print the first row of b, all cols
print(b[0]) 
print(b[0,:])

In [None]:
# print all rows, second column of b
b[:,1]

Fancy indexing allows you to index a NumPy array using any of the following:

* Another NumPy array of integers
* A Python list of integers
* Any other sequence of integers

Fancy indexing is a NumPy feature: instead of a single index, you pass an array or list of integer indices and get back all of the corresponding elements at once. The shape of the result follows the shape of the index array, not the shape of the array being indexed. Selecting elements by a condition (for example, every value greater than some threshold) uses a boolean mask instead, which is covered in the masking section of this worksheet.
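
As a small illustration (the array `arr` is made up for this sketch), the first expression below is fancy indexing with integer positions, and the second is the related but distinct technique of boolean masking:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Fancy indexing: pick elements by position with a list (or array) of integers
picked = arr[[0, 3, 4]]
print(picked)        # [10 40 50]

# Boolean masking: pick elements by condition with an array of True/False values
mask = arr > 25
print(arr[mask])     # [30 40 50]
```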

In [35]:
x = np.array(['a', 'b', 'c'])
y = np.array([['d','e','f'], 
              ['g', 'h', 'k']])

print(x)
print(y)

['a' 'b' 'c']
[['d' 'e' 'f']
 ['g' 'h' 'k']]


In [36]:
# fancy indexing on 1D array
# get the values a and c in array x

x[[0,2]]

array(['a', 'c'], dtype='<U1')

In [37]:
# using a tuple of indices, get the values e and h in array y
# (rows 0 and 1, column 1)
ind2 = ((0, 1), 1)
y[ind2]

array(['e', 'h'], dtype='<U1')

In [45]:
#can use arrays
row = np.array([0,1])
col = np.array([1])
y[row, col]

array(['e', 'h'], dtype='<U1')

In [30]:
rand = np.random.RandomState(42)

Z = rand.randint(100, size=12).reshape(3,4)
Z

array([[51, 92, 14, 71],
       [60, 20, 82, 86],
       [74, 74, 87, 99]])

We can combine fancy and simple indices:

In [33]:
Z[2, [2, 0, 1]]

array([87, 74, 74])

We can also combine fancy indexing with slicing:

In [34]:
Z[1:, [2, 0, 1]]


array([[82, 60, 20],
       [87, 74, 74]])

In [31]:
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
Z[row, col]

array([14, 20, 99])

Fancy indexing can also be used to modify several elements at once:

In [46]:
k = np.arange(10)
i = np.array([2, 1, 8, 4])
k[i] = 99
print(k)

[ 0 99 99  3 99  5  6  7 99  9]


## Test Yourself


Generate a deck of cards:

```python
suites = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
values = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']

deck=[ [s,v] for s in suites for v in values]
```

Select a random subset of the cards, say 5 cards, using fancy indexing. You can use the `np.random.choice` function to pick the indices.
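
A minimal sketch of one way to do this, assuming the deck is first converted to a NumPy array (plain Python lists do not support fancy indexing). The names `idx` and `hand` are illustrative.

```python
import numpy as np

suites = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
values = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']

# convert the list of [suit, value] pairs to an array of shape (52, 2)
deck = np.array([[s, v] for s in suites for v in values])

# draw 5 distinct row indices, then use them as a fancy index
idx = np.random.choice(len(deck), size=5, replace=False)
hand = deck[idx]
print(hand)
```

Passing `replace=False` keeps the same card from being drawn twice.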


## 4- Slicing


Just as we can use square brackets to access individual array elements, we can also use them to access subarrays with the slice notation, marked by the colon (:) character. The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array x, use this:

```x[start:stop:step]```

If any of these are unspecified, they default to start=0, stop=size of dimension, step=1 (when the step is negative, the defaults for start and stop are swapped). We'll take a look at accessing sub-arrays in one dimension and in multiple dimensions.

In [None]:
# create an array X filled with the integers from 1 to 10
# hint: use arange
X = np.arange(1, 11, dtype=int)
X

In [None]:
# get the first two elements of X 
X[:2]

In [None]:
# get the numbers 3, 4 and 5
X[2:5]

In [None]:
# get the odd numbers 
X[::2]

In [None]:
# get the even numbers
X[1::2]

## Test Yourself

1. Reverse the array X. Hint: use a negative step.
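
One way to do it, shown on a freshly created copy of `X` so the snippet runs on its own:

```python
import numpy as np

X = np.arange(1, 11)
X_reversed = X[::-1]   # a step of -1 walks the array from the last element to the first
print(X_reversed)      # [10  9  8  7  6  5  4  3  2  1]
```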

In [47]:
# create 2D array from 1 to 9
# hint: use arange and reshape
Y= np.arange(1,10).reshape(3,3)
Y

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [48]:
# get the first and second row
Y[:2,:]

array([[1, 2, 3],
       [4, 5, 6]])

In [49]:
# get the second and third column
Y[:, 1:]

array([[2, 3],
       [5, 6],
       [8, 9]])

In [50]:
# get the elements 5 and 6
Y[1,1:]

array([5, 6])

In [51]:
Y[::-1, ::-1]

array([[9, 8, 7],
       [6, 5, 4],
       [3, 2, 1]])

In [53]:
print(Y[:, 0])  # first column of Y

[1 4 7]


In [54]:
print(Y[0, :])  # first row of Y

[1 2 3]


In [55]:
# In the case of row access, the empty slice can be omitted for a more compact syntax:

print(Y[0])  # equivalent to Y[0, :]

[1 2 3]


Subarrays obtained by slicing are views of the original data, not copies:

In [57]:
Y_sub = Y[:2, :2]
Y_sub

array([[1, 2],
       [4, 5]])

In [59]:
Y_sub[0, 0] = 99
print(Y_sub)
print(Y) # original array is modified!!!

[[99  2]
 [ 4  5]]
[[99  2  3]
 [ 4  5  6]
 [ 7  8  9]]


To create an independent copy, use the copy() method:

In [63]:
Y_copy = Y[:2, :2].copy()

In [64]:
Y_copy[1, 1] = 99
print(Y_copy)
print(Y) # original array is NOT modified!!!

[[99  2]
 [ 4 99]]
[[99  2  3]
 [ 4  5  6]
 [ 7  8  9]]
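
If it is unclear whether two arrays refer to the same data, `np.shares_memory` can check. The sketch below uses illustrative names (`base`, `view`, `dup`) rather than the arrays from the cells above:

```python
import numpy as np

base = np.arange(1, 10).reshape(3, 3)
view = base[:2, :2]          # basic slicing returns a view onto base
dup = base[:2, :2].copy()    # .copy() allocates new, independent memory

print(np.shares_memory(base, view))  # True
print(np.shares_memory(base, dup))   # False
```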


### Concatenation

NumPy arrays can be combined in several ways.

In [65]:
x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
np.concatenate([x, y])

array([1, 2, 3, 3, 2, 1])

In [66]:
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
# concatenate along the first axis
np.concatenate([grid, grid])

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

In [67]:
# concatenate along the second axis (zero-indexed)
np.concatenate([grid, grid], axis=1)

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])

#### Vertical Stack

In [68]:
x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

# vertically stack the arrays
np.vstack([x, grid])

array([[1, 2, 3],
       [9, 8, 7],
       [6, 5, 4]])

#### Horizontal Stack

In [69]:
# horizontally stack the arrays
y = np.array([[99],
              [99]])
np.hstack([grid, y])

array([[ 9,  8,  7, 99],
       [ 6,  5,  4, 99]])

### Splitting arrays

The opposite of concatenation is splitting, which is implemented by the functions np.split, np.hsplit, and np.vsplit. For each of these, we can pass a list of indices giving the split points; N split points produce N + 1 subarrays.

In [70]:
x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)



[1 2 3] [99 99] [3 2 1]


In [73]:
grid = np.arange(16).reshape((4, 4))
print(grid)

upper, lower = np.vsplit(grid, [2])
print(upper)
print(lower)


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
[[0 1 2 3]
 [4 5 6 7]]
[[ 8  9 10 11]
 [12 13 14 15]]


In [72]:
left, right = np.hsplit(grid, [2])
print(left)
print(right)

[[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]]
[[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]]


## 5- Functions

In [None]:
# recreate the same array we used earlier
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])


In [None]:
#find the maximum element of X
np.max(X)

In [None]:
#mean of values in the X
np.mean(X)

In [None]:
# get the 4th power of each value
np.power(X, 4)

In [None]:
# multiply every element in 2D array by 2
# hint: np.multiply

In [None]:
np.multiply(Y, 2)

## 6- Broadcasting

The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes. See more at https://numpy.org/doc/stable/user/basics.broadcasting.html

In [None]:
# reuse X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

In [None]:
# add 5 to each element
X + 5

In [None]:
# Initialize `k`
k = np.ones((3,4))

# Check shape of `k`
print(k.shape)

# Initialize `l`
l = np.arange(4)

# Check shape of `l`
print(l,l.shape)

# Subtract `l` from `k`
m=k-l
print(m,m.shape)

In [None]:
# Initialize `k` and `l`
k = np.ones((3,4))
l = np.random.random((5,1,4))
# what are the dimensions of k and l
print(k.shape,l.shape)

In [None]:
# add (+) k and l together, is the answer expected?
# what is the resultant dimension?
z=k+l
print(z, z.shape)

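Note: Shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension. Two arrays are broadcast compatible if, in every dimension, they either have the same size or one of them has size 1; missing leading dimensions are treated as size 1.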
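
The rule can be checked without building the arrays by using `np.broadcast_shapes` (available in NumPy 1.20 and later). This is only a sketch using the shapes from the cells above; the name `result_shape` is illustrative.

```python
import numpy as np

# shapes are aligned on the right and compared dimension by dimension:
#   k:      3 x 4
#   l:  5 x 1 x 4
result_shape = np.broadcast_shapes((3, 4), (5, 1, 4))
print(result_shape)   # (5, 3, 4)

# incompatible case: trailing sizes 4 and 3 differ and neither is 1
try:
    np.ones((3, 4)) + np.ones(3)
except ValueError as err:
    print("cannot broadcast:", err)
```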

## 7- Sorting, Comparing and Masking

In [96]:
# create array x of 10 random integers from 1 to 8 (the upper bound 9 is exclusive)
# hint : randint
x = np.random.randint(1,9, 10)
x

array([2, 4, 4, 6, 2, 1, 4, 6, 3, 5])

In [95]:
# create y, a (3,3) array of random integers from 1 to 8
y = np.random.randint(1,9, (3,3))
y


array([[2, 7, 5],
       [6, 2, 7],
       [1, 2, 5]])

In [None]:
# sort elements in array x
# hint: sort
np.sort(x)

In [None]:
# sort the values of y within each column (axis=0 runs down the rows)
# hint: sort with axis
np.sort(y, axis=0)

In [None]:
# sort the values of y within each row (axis=1 runs across the columns)
np.sort(y, axis=1)

#### Boolean operators and masking

```
==	np.equal		!=	np.not_equal
<	np.less		<=	np.less_equal
>	np.greater		>=	np.greater_equal

&	np.bitwise_and		|	np.bitwise_or
^	np.bitwise_xor		~	np.bitwise_not
```

In [86]:
# compare each element of x with 3; the result is a boolean mask
# hint: ==, !=, <, >, >=, <= all work elementwise on arrays
# hint: index x with the mask to get the matching values
x > 3

array([False, False, False,  True, False, False, False, False, False,
        True])

In [76]:
x == 3

array([False, False, False,  True, False, False,  True, False, False,
       False])

In [78]:
(2 * x) == (x ** 2)

array([False, False, False, False,  True,  True, False, False, False,
       False])

In [85]:
# how many values less than 3?
np.count_nonzero(x < 3)

5

In [87]:
np.sum(x < 3)

5

In [92]:
# are there any values greater than 3?
np.any(x > 3)

True

In [90]:
# are all values equal to 2?
np.all(x == 2)

False

### Boolean Mask

In [94]:
x[x>2]

array([3, 3, 4, 3, 4])

### Test Yourself

Get the count and the values of the elements of x that are less than 5 but greater than 1. Hint: use a bitwise boolean operator.
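
A sketch of how two comparisons can be combined, using a fixed array `sample` (chosen for illustration) instead of the random `x` so the output is reproducible:

```python
import numpy as np

sample = np.array([2, 4, 4, 6, 2, 1, 4, 6, 3, 5])   # fixed values for illustration

# parentheses are required: & binds more tightly than < and >
mask = (sample > 1) & (sample < 5)

print(np.count_nonzero(mask))  # 6
print(sample[mask])            # [2 4 4 2 4 3]
```

Python's `and` keyword does not work here, because it tries to reduce each whole array to a single truth value instead of comparing element by element.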

### Test Yourself

Let's generate some data to represent daily rainfall for a year as measured in cm (in integers)

```python
rain = np.random.randint(0,9, 365)
```
From the data calculate:
1. The number of days without rain
1. Days without rain
1. The number of days with rain
1. The number of days with more than 2 cm of rain
1. The number of days with less than 2 cm of rain
1. Days with less than 1 cm of rain
1. The average precipitation for rainy days the entire year. Rainy days are days where rain is greater than zero.
1. The median precipitation for the summer. Summer is defined as day 170 to day 260.
1. The maximum precipitation for the summer
1. The average precipitation on non-summer rainy days.

You can use these functions:

```python
np.sum
np.average
np.median
    
```
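
A partial sketch covering a few of the items above. It assumes a seeded `Generator` in place of `np.random.randint` (so the run is repeatable) and treats "day 170 to day 260" as zero-based indices 170 through 260 inclusive; both choices are assumptions, not part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(42)        # arbitrary seed (assumption)
rain = rng.integers(0, 9, size=365)    # daily rainfall in whole cm, values 0-8

dry = rain == 0                        # mask of days without rain
rainy = rain > 0                       # mask of rainy days
summer = np.zeros(365, dtype=bool)
summer[170:261] = True                 # days 170..260 inclusive (assumed zero-based)

print("days without rain:", np.sum(dry))
print("days with rain:", np.sum(rainy))
print("mean rain on rainy days:", np.mean(rain[rainy]))
print("summer median:", np.median(rain[summer]))
print("summer maximum:", np.max(rain[summer]))
print("mean on non-summer rainy days:", np.mean(rain[rainy & ~summer]))
```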

### Advanced: Vectorizing Loops

Given a list of values, find the index of the value that is closest to search_value. Each value stores a position as an (x, y) coordinate pair.

```python
import random
values = [(random.random(), random.random()) for _ in range(100)]  
# randomly generate a pair of (x,y) values in a list
```

In [97]:
# NON-Vectorized version
# Given a list of values, find the index of the value that is closest to search_value
# Each value stores a position as an (x, y) coordinate pair

def closest_index(search_value, values):
    x0, y0 = search_value
    dbest, ibest = None, None
    for i, (x, y) in enumerate(values):    # loop through the entire list
        d = (x - x0) ** 2 + (y - y0) ** 2  # squared Euclidean distance
        if dbest is None or d < dbest:
            dbest, ibest = d, i            # store the best found closest distance and index so far  
    return ibest                           # return the index of the closest value to search_value




In [103]:
import random
values = [(random.random(), random.random()) for _ in range(100)]

In [99]:
z=closest_index((.51, .34), values) 
# return index position of closest value to (0.51,0.34)

# print the given index value in values, and the actual nearest values
print(z, values[z])

49 (0.4941306917558894, 0.37898187696574026)


## Test Yourself

Now convert the closest_index code above to a vectorized version, that is, one that uses only NumPy operations and no for or while loops.

Hint:
1. Split the values into separate X and Y arrays.
2. Compute the distances from the search_value (x0, y0) to every point in X and Y in a single expression (one line of code) using the Euclidean distance formula.
3. Use the NumPy function np.argmin to find the index of the smallest distance.

Below is an example of how argmin is used.

In [107]:
#Example of argmin
Z=np.array([3,9,0,1,3,5,2,10])
print(np.argmin(Z))

2
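
One possible vectorized solution, offered as a sketch rather than the definitive answer. The function name `closest_index_vectorized` and the seed are assumptions made for this example. It compares squared distances, which give the same ordering as the distances themselves.

```python
import random
import numpy as np

def closest_index_vectorized(search_value, values):
    pts = np.asarray(values)              # shape (n, 2)
    X, Y = pts[:, 0], pts[:, 1]           # separate coordinate arrays
    x0, y0 = search_value
    d2 = (X - x0) ** 2 + (Y - y0) ** 2    # all squared distances in one expression
    return int(np.argmin(d2))             # index of the smallest distance

random.seed(0)                            # arbitrary seed for repeatability
values = [(random.random(), random.random()) for _ in range(100)]

i = closest_index_vectorized((0.51, 0.34), values)
print(i, values[i])
```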
