# The Basics of NumPy Arrays

Attributes of arrays <br>
    Determining the size, shape, memory consumption, and data types of arrays<br><br>
Indexing of arrays<br>
    Getting and setting the value of individual array elements<br><br>
Slicing of arrays<br>
    Getting and setting smaller subarrays within a larger array<br><br>
Reshaping of arrays<br>
    Changing the shape of a given array<br><br>
Joining and splitting of arrays<br>
    Combining multiple arrays into one, and splitting one array into many<br><br>

# NumPy Array Attributes

In [3]:
import numpy as np
np.random.seed(0) # seed for reproducibility

x1 = np.random.randint(10, size=6) # One-dimensional array
x2 = np.random.randint(10, size=(3, 4)) # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5)) # Three-dimensional array

In [8]:
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)

x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60



Question 1 : What does np.random.seed(0) do in the random number generation? <br>
- It seeds NumPy's global random number generator, so rerunning the script produces the same sequence of random numbers <br>

Question 2 : What is the expected output (ndim, shape, size) if size is changed to a 4 dimensional array?<br>
for size=(3, 4, 5, 6) <br>
- ndim = 4, shape = (3, 4, 5, 6), size = 360 <br>

Question 3 : How many elements are present in the 3 arrays?<br>
- x1 has 6 elements, x2 has 12 elements (3 × 4), and x3 has 60 elements (3 × 4 × 5)<br>

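Question 4 : Which parameter(s) allow me to change the range of values?<br>
- the first two parameters of np.random.randint(low, high, ...) set the range: values are drawn from low (inclusive) to high (exclusive). When only one value is given, as in np.random.randint(10, ...), it is used as the exclusive upper bound and the lower bound is 0<br>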

Question 5 : How do we tell the data types for x1, x2 and x3? <br>
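- use the dtype attribute: x1.dtype, x2.dtype, x3.dtype. Note that type(x1) only reports the container type (numpy.ndarray), not the type of the elements<br>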

Question 6 : How do we tell the size of the individual elements and the overall size of all the elements?<br>
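- x1.itemsize gives the size in bytes of each element, and x1.nbytes gives the total size in bytes of the array (x1.size is the number of elements, not a size in bytes)<br>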
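
The answers to Questions 5 and 6 can be checked with a short sketch like the one below. It rebuilds the three arrays with the same seed and prints `dtype`, `itemsize`, and `nbytes` for each. No expected output is shown because the default integer type, and therefore the byte counts, depends on the platform and NumPy version.

```python
import numpy as np

np.random.seed(0)  # same seed as above, so the arrays match
x1 = np.random.randint(10, size=6)
x2 = np.random.randint(10, size=(3, 4))
x3 = np.random.randint(10, size=(3, 4, 5))

for name, arr in [("x1", x1), ("x2", x2), ("x3", x3)]:
    # dtype: element type; itemsize: bytes per element; nbytes: total bytes
    print(name, "dtype:", arr.dtype,
          "itemsize:", arr.itemsize, "bytes,",
          "nbytes:", arr.nbytes, "bytes")
```

In every case `nbytes` equals `itemsize` times `size`.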

# Array Indexing: Accessing Single Elements

In a one-dimensional array, you can access the ith value (counting from
zero) by specifying the desired index in square brackets:

In [5]:
x1[6]

IndexError: index 6 is out of bounds for axis 0 with size 6

Question : What does x1[6] return?<br>
- it raises an IndexError: x1 has 6 elements, so the valid indices are 0 through 5, and index 6 (which would be a seventh element) does not exist<br>

Question : How do I get the last element? <br>
- x1[x1.size-1], or more simply the negative index x1[-1]<br>
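
Negative indices count from the end of the array, which is usually the shortest way to reach the last elements. A minimal sketch, rebuilding `x1` with the same seed (the names `last` and `second_last` are only illustrative):

```python
import numpy as np

np.random.seed(0)
x1 = np.random.randint(10, size=6)

last = x1[-1]         # last element, same as x1[x1.size - 1]
second_last = x1[-2]  # second-to-last element
print(last, second_last)
```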

In [6]:
x2[0,0]

3

In [7]:
x3[0,0,0]

8

Question : Change the value of x2[0,0] to 3.14 and verify the data type.

In [19]:
x2 = x2.astype(np.float32) # convert to float first; an integer array would truncate 3.14 to 3
x2[0,0] = 3.14
print(x2)
type(x2[0,0])

[[3.14 5.   2.   4.  ]
 [7.   6.   8.   8.  ]
 [1.   6.   7.   7.  ]]


numpy.float32
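
The conversion to a floating-point type matters because NumPy arrays have a fixed element type. The sketch below, using small illustrative arrays `a` and `b` that are not part of the notebook, shows what happens when a floating-point value is assigned to an integer array without converting first:

```python
import numpy as np

a = np.array([1, 2, 3])   # integer dtype inferred from the values
a[0] = 3.14159            # the fractional part is silently dropped
print(a)                  # [3 2 3]

b = a.astype(np.float64)  # astype returns a converted copy
b[0] = 3.14159
print(b.dtype, b[0])
```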

# Array Slicing: Accessing Subarrays

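Just as we can use square brackets to access individual array elements, we can also use them to access subarrays with the slice notation, marked by the colon (:) character.<br>
x[start:stop:step]<br>
If any of these are unspecified, they default to start=0, stop=size of the dimension, and step=1.<br>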

In [21]:
x = np.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [9]:
# elements from index 5 onward
x[5:]

array([5, 6, 7, 8, 9])

In [10]:
# middle subarray: indices 4 through 6 (the stop index 7 is excluded)
x[4:7]

array([4, 5, 6])

In [11]:
# every other element
x[::2] 

array([0, 2, 4, 6, 8])

In [12]:
# every other element, starting at index 1 
x[1::2]

array([1, 3, 5, 7, 9])

Question : How do we reverse all the elements? (Clue: use :: and -1.)

In [22]:
x[::-1]

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

Question : What is the output of x[5::-2]?
- starting at index 5, it walks backwards in steps of 2 until it reaches the start of the array

In [25]:
x[5::-2]

array([5, 3, 1])
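
The slicing rules used in the cells above can be summarized in one sketch. It checks the default values of start and stop for positive and negative steps; the slice `x[7:2:-2]` is an extra illustrative case that does not appear in the notebook.

```python
import numpy as np

x = np.arange(10)

# With a positive step, an omitted start defaults to 0.
assert np.array_equal(x[:5], x[0:5:1])  # the first five elements

# With a negative step, the defaults swap: the slice starts at the
# last index and runs down to (and including) index 0.
assert np.array_equal(x[::-1], np.arange(9, -1, -1))
assert np.array_equal(x[5::-2], np.array([5, 3, 1]))

# An explicit stop index is excluded, in either direction.
print(x[7:2:-2])  # [7 5 3]
```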

# Multidimensional subarrays

In [27]:
x2 = x2.astype(np.int64) # convert back to integers (3.14 is truncated to 3)

In [28]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [14]:
# two rows, three columns
x2[:2, :3] 

array([[3, 5, 2],
       [7, 6, 8]])

In [15]:
# all rows, every other column
x2[:3, ::2]

array([[3, 2],
       [7, 8],
       [1, 7]])

In [16]:
# reverse both the rows and the columns
x2[::-1, ::-1]

array([[7, 7, 6, 1],
       [8, 8, 6, 7],
       [4, 2, 5, 3]])

# Accessing array rows and columns

One commonly needed routine is accessing single rows or columns of an array. You can do this by combining indexing and slicing,
using an empty slice marked by a single colon (:):<br>

In [17]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [18]:
print(x2[:, 0]) # first column of x2

[3 7 1]


In [19]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

Question : Print all the elements of row 0. Simplify the syntax.

In [38]:
x2[0] # equivalent to x2[0, :]; the empty slice can be omitted for row access

array([3, 5, 2, 4])
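
Indexing a row and slicing out a one-row subarray return the same values but with different shapes. A minimal sketch, with `x2` written out explicitly and the illustrative names `row_1d` and `row_2d`:

```python
import numpy as np

x2 = np.array([[3, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

row_1d = x2[0]      # shorthand for x2[0, :]
row_2d = x2[:1, :]  # a slice of length one is still two-dimensional

print(row_1d.shape)  # (4,)
print(row_2d.shape)  # (1, 4)
```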

# Subarrays as no-copy views

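NumPy array slices return views rather than copies of the array data, so modifying a subarray obtained by slicing also changes the original array.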

In [20]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [39]:
x2_sub = x2[:2, :2]
print(x2_sub)
x2_sub[0, 0] = 99
print(x2)


[[3 5]
 [7 6]]
[[99  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]


# Creating copies of arrays

To get an independent subarray, copy the data explicitly with the copy() method. If we now modify this copy, the original array is left untouched:

In [40]:
x2_sub_copy = x2[:2, :2].copy()
print(x2_sub_copy)
x2_sub_copy[0, 0] = 42
x2

[[99  5]
 [ 7  6]]


array([[99,  5,  2,  4],
       [ 7,  6,  8,  8],
       [ 1,  6,  7,  7]])

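Question : What are the differences, advantages, and disadvantages of views vs copies?
- A slice is a view: modifying it also modifies the original array, but it is created without copying any data, which saves memory and time on large arrays. A copy made with copy() is independent, so modifying it does not modify the original array, at the cost of the extra memory and time needed to duplicate the data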
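
One way to check whether two arrays refer to the same underlying data is `np.shares_memory`. The sketch below uses a small illustrative array `a` rather than the notebook's `x2`:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

view = a[:2, :2]         # slicing: shares memory with a
copy = a[:2, :2].copy()  # explicit copy: owns its own data

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False

view[0, 0] = 99  # visible through a
copy[0, 1] = -1  # not visible through a
print(a[0, 0], a[0, 1])  # 99 1
```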

# Reshaping of Arrays

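The most flexible way of changing the shape of an array is the reshape method. For example, if you want to put the numbers
1 through 9 in a 3×3 grid, you can do the following. Note that the size of the initial array must match the size of the reshaped array.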

In [23]:
grid = np.arange(1, 10)
print(grid)

[1 2 3 4 5 6 7 8 9]


In [24]:
grid = np.arange(1, 10).reshape((3, 3))
print(grid)

# Where possible, reshape returns a no-copy view of the initial array; with noncontiguous memory buffers it may have to return a copy instead.

[[1 2 3]
 [4 5 6]
 [7 8 9]]
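
Whether reshape returned a view or a copy can be checked with `np.shares_memory`. In this sketch the transposed array serves as an example of a noncontiguous buffer; the names `base` and `flat` are illustrative.

```python
import numpy as np

base = np.arange(1, 10)
grid = base.reshape((3, 3))          # contiguous input: a view
print(np.shares_memory(base, grid))  # True

flat = grid.T.reshape(9)             # transposed (noncontiguous) input
print(np.shares_memory(grid, flat))  # False: NumPy had to copy
print(flat)                          # [1 4 7 2 5 8 3 6 9]
```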


In [25]:
x = np.array([1, 2, 3])
x

array([1, 2, 3])

In [26]:
# row vector via reshape
x.reshape((1, 3))

array([[1, 2, 3]])

In [27]:
# row vector via newaxis
x[np.newaxis, :]

array([[1, 2, 3]])

In [28]:
# column vector via reshape
x.reshape((3, 1))

array([[1],
       [2],
       [3]])

In [29]:
# column vector via newaxis
x[:, np.newaxis]

array([[1],
       [2],
       [3]])

# Array Concatenation and Splitting

# Concatenation of arrays

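Concatenation, or joining of two or more arrays, is primarily accomplished through these routines:<br>
np.concatenate <br> np.vstack <br> np.hstack<br>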

In [42]:
x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
z = [99, 99, 99]
np.concatenate([x, y, z])

array([ 1,  2,  3,  3,  2,  1, 99, 99, 99])

Question : Concatenate one more array, z = [99, 99, 99]. <br>
- np.concatenate([x, y, z]), as in the cell above

Question : Concatenate two 2D arrays, say a and b, into grid.<br>
a = [[1, 2], [3, 4]]<br>
b = [[5, 6], [7, 8]]<br>
grid = np.concatenate([a, b])<br>

In [48]:
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

In [49]:
grid


array([[1, 2, 3],
       [4, 5, 6]])

In [50]:
np.concatenate([grid, grid])

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

In [51]:
# concatenate along the second axis (zero-indexed)
np.concatenate([grid, grid], axis=1)

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])

In [52]:
x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7], [6, 5, 4]])
# vertically stack the arrays
np.vstack([x, grid])

array([[1, 2, 3],
       [9, 8, 7],
       [6, 5, 4]])

In [53]:
#Try this : np.vstack([ grid,x])
np.vstack([grid, x])

array([[9, 8, 7],
       [6, 5, 4],
       [1, 2, 3]])

In [37]:
# horizontally stack the arrays
y = np.array([[99],[99]])
np.hstack([grid, y])

array([[ 9,  8,  7, 99],
       [ 6,  5,  4, 99]])
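
For arrays of mixed dimensions, np.vstack and np.hstack are often clearer than np.concatenate. The sketch below, reusing the arrays from the cells above, shows how the stacking calls relate to np.concatenate with an explicit axis:

```python
import numpy as np

x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

# np.concatenate needs inputs with the same number of dimensions, so
# promote x to a 1x3 row first; np.vstack does this promotion itself.
stacked = np.concatenate([x[np.newaxis, :], grid], axis=0)
print(np.array_equal(stacked, np.vstack([x, grid])))  # True

# For two-dimensional inputs, np.hstack joins along axis 1.
y = np.array([[99], [99]])
print(np.array_equal(np.hstack([grid, y]),
                     np.concatenate([grid, y], axis=1)))  # True
```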

# Splitting of arrays

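The opposite of concatenation is splitting, which is implemented by these functions. Each takes a list of indices giving the split points:<br>
np.split<br> np.hsplit<br> np.vsplit<br>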

In [38]:
x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)

[1 2 3] [99 99] [3 2 1]


Question : Based on the result, describe the values [3, 5]. <br>
- they are the indices at which x is split, giving x[:3], x[3:5], and x[5:]. N split points produce N + 1 subarrays
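
The relationship between split points and slices can be checked directly. This sketch also shows that passing an integer instead of a list asks np.split for that many equal-sized pieces; the names `parts` and `halves` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 99, 99, 3, 2, 1])

parts = np.split(x, [3, 5])
print(len(parts))  # 3

# Each piece matches the corresponding slice of x.
expected = [x[:3], x[3:5], x[5:]]
print(all(np.array_equal(p, e) for p, e in zip(parts, expected)))  # True

# An integer instead of a list gives that many equal-sized pieces.
halves = np.split(x, 2)
print(halves[0], halves[1])
```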

In [39]:
grid = np.arange(16).reshape((4, 4))
grid

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [40]:
upper, lower = np.vsplit(grid, [2])
print(upper)
print(lower)

[[0 1 2 3]
 [4 5 6 7]]
[[ 8  9 10 11]
 [12 13 14 15]]


In [41]:
left, right = np.hsplit(grid, [2])
print(left)
print(right)

[[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]]
[[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]]
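
np.vsplit and np.hsplit behave like np.split applied along axis 0 and axis 1 respectively. A minimal sketch on the same 4×4 grid:

```python
import numpy as np

grid = np.arange(16).reshape((4, 4))

upper, lower = np.split(grid, [2], axis=0)  # same as np.vsplit(grid, [2])
left, right = np.split(grid, [2], axis=1)   # same as np.hsplit(grid, [2])

print(upper.shape, lower.shape)  # (2, 4) (2, 4)
print(left.shape, right.shape)   # (4, 2) (4, 2)
```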


# Computation on NumPy Arrays: Universal Functions

Computation on NumPy arrays can be very fast, or it can be very slow. The key to
making it fast is to use vectorized operations, generally implemented through
NumPy’s universal functions (ufuncs).<br> 

This section motivates the need for NumPy’s ufuncs, which can be used to make repeated calculations on array elements much more efficient.
It then introduces many of the most common and useful arithmetic ufuncs
available in the NumPy package.<br>

# The Slowness of Loops

As an example, consider computing the reciprocal of each element of an array with an explicit Python loop:

In [42]:
import numpy as np
np.random.seed(0)
def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output # return only after every element has been filled in

values = np.random.randint(1, 10, size=5)
compute_reciprocals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

If we measure the execution time of this code for a large input, we see that this operation is very slow. We’ll benchmark this
with IPython’s %timeit magic:

In [43]:
big_array = np.random.randint(1, 100, size=1000000)
%timeit compute_reciprocals(big_array)


It takes several seconds to compute these million operations and to store the result!
When even cell phones have processing speeds measured in Giga-FLOPS (i.e., billions
of numerical operations per second), this seems almost absurdly slow. It turns
out that the bottleneck here is not the operations themselves, but the type-checking
and function dispatches that CPython must do at each cycle of the loop. Each time
the reciprocal is computed, Python first examines the object’s type and does a
dynamic lookup of the correct function to use for that type. If we were working in
compiled code instead, this type specification would be known before the code executes
and the result could be computed much more efficiently.
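
To make the comparison concrete, the sketch below times the loop against the equivalent vectorized expression `1.0 / big_array`, in which the loop runs inside NumPy's compiled code. It uses the standard-library `timeit` module instead of the `%timeit` magic so that it also runs outside IPython, and a smaller array than above to keep the run short. No timings are shown, because they depend on the machine.

```python
import timeit
import numpy as np

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

np.random.seed(0)
big_array = np.random.randint(1, 100, size=100_000)

loop_result = compute_reciprocals(big_array)
vectorized_result = 1.0 / big_array  # one operation on the whole array

print(np.allclose(loop_result, vectorized_result))  # True

# Single-run timings: rough, machine-dependent figures only.
loop_time = timeit.timeit(lambda: compute_reciprocals(big_array), number=1)
vec_time = timeit.timeit(lambda: 1.0 / big_array, number=1)
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.6f} s")
```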

# Lesson 3 <br>
Introducing UFuncs