# Introduction to NumPy

In [1]:
import numpy
numpy.__version__

'1.21.2'

In [2]:
import numpy as np

At the implementation level, the array essentially contains a single pointer to one contiguous block of data. The Python list, on the other hand, contains a pointer to a block of pointers, each of which in turn points to a full Python object like the Python integer we saw earlier. Again, the advantage of the list is flexibility: because each list element is a full structure containing both data and type information, the list can be filled with data of any desired type. Fixed-type NumPy-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.
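To make the storage difference concrete, here is a minimal sketch that compares a rough byte count for a Python list of integers with the buffer size of an equivalent NumPy array. The variable names, the choice of 1,000 elements, and the int64 dtype are illustrative assumptions, and sys.getsizeof only approximates the list's true footprint.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# The list holds pointers; every element is a separate Python int object,
# so we add the size of the pointer block to the size of each object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# The array keeps the raw 8-byte values in one contiguous buffer.
array_bytes = arr.nbytes

print("list  (approx.):", list_bytes, "bytes")
print("array (buffer): ", array_bytes, "bytes")
```

The exact list figure depends on the Python version, but it should come out several times larger than the array buffer.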

In [3]:
import array
L = list(range(10))
A = array.array('i', L)
A



array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Here 'i' is a type code indicating the contents are integers.

In [4]:
# integer array:
np.array([1, 4, 2, 5, 3])

array([1, 4, 2, 5, 3])

Remember that unlike Python lists, NumPy arrays must contain elements of a single type. If the types do not match, NumPy will upcast where possible (here, the integers are upcast to floating point):


In [5]:
np.array([3.14, 4, 2, 3])

array([3.14, 4.  , 2.  , 3.  ])
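As a small sketch of the upcasting rule, we can inspect the dtype that NumPy settles on for mixed inputs. The variable names are illustrative, and np.result_type is used here only to ask NumPy which common type it would pick for two given dtypes.

```python
import numpy as np

mixed_numbers = np.array([3.14, 4, 2, 3])   # floats and integers
mixed_text = np.array([1, 2, "three"])      # integers and a string

print(mixed_numbers.dtype)        # float64: the integers were upcast
print(mixed_text.dtype.kind)      # 'U': everything became a Unicode string

# Ask NumPy directly which common type two dtypes resolve to
common = np.result_type(np.int32, np.float64)
print(common)                     # float64
```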

In [6]:
np.array([1, 2, 3, 4], dtype='float32')

array([1., 2., 3., 4.], dtype=float32)

In [7]:
# nested lists result in multi-dimensional arrays
np.array([range(i, i + 3) for i in [2, 4, 6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

In [12]:
# Create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [11]:
# Create a 3x5 floating-point array filled with ones
np.ones((3, 5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [13]:
# Create a 3x5 array filled with 3.14
np.full((3, 5), 3.14)

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])

In [17]:
# Create an array filled with a linear sequence
# Starting at 0, ending before 20, stepping by 0.2
# (similar to the built-in range() function, but the step may be non-integer)
np.arange(0,20,0.2)

array([ 0. ,  0.2,  0.4,  0.6,  0.8,  1. ,  1.2,  1.4,  1.6,  1.8,  2. ,
        2.2,  2.4,  2.6,  2.8,  3. ,  3.2,  3.4,  3.6,  3.8,  4. ,  4.2,
        4.4,  4.6,  4.8,  5. ,  5.2,  5.4,  5.6,  5.8,  6. ,  6.2,  6.4,
        6.6,  6.8,  7. ,  7.2,  7.4,  7.6,  7.8,  8. ,  8.2,  8.4,  8.6,
        8.8,  9. ,  9.2,  9.4,  9.6,  9.8, 10. , 10.2, 10.4, 10.6, 10.8,
       11. , 11.2, 11.4, 11.6, 11.8, 12. , 12.2, 12.4, 12.6, 12.8, 13. ,
       13.2, 13.4, 13.6, 13.8, 14. , 14.2, 14.4, 14.6, 14.8, 15. , 15.2,
       15.4, 15.6, 15.8, 16. , 16.2, 16.4, 16.6, 16.8, 17. , 17.2, 17.4,
       17.6, 17.8, 18. , 18.2, 18.4, 18.6, 18.8, 19. , 19.2, 19.4, 19.6,
       19.8])

In [18]:
# list(range(0,20,2))

In [19]:
# Create an array of five values evenly spaced between 0 and 1
np.linspace(0,1,5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [20]:
# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3,3))

array([[0.74479855, 0.96323654, 0.25302098],
       [0.78727019, 0.10752331, 0.65483123],
       [0.02079108, 0.28378828, 0.11428322]])

In [22]:
# Create a 3x3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0,1,(3,3))

array([[ 0.66332972,  0.95168287, -0.13595152],
       [-0.96963211, -0.1208814 , -0.02353986],
       [ 0.29339372,  0.18453761, -1.33220852]])

In [23]:
# Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))

array([[7, 1, 3],
       [0, 9, 1],
       [5, 3, 5]])

In [24]:
# Create a 3x3 identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [25]:
# Create an uninitialized array of three values (float64 by default)
# The values will be whatever happens to already exist at that memory location
np.empty(3)

array([1., 1., 1.])

In [None]:
# NumPy Standard Data Types

![NumPy standard data types](attachment:image.png)
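As a brief sketch of how these data types are used in practice, a dtype can be given either as a string or as the corresponding NumPy object. The array names below are illustrative.

```python
import numpy as np

a = np.zeros(10, dtype="int16")    # dtype given as a string
b = np.zeros(10, dtype=np.int16)   # dtype given as a NumPy type object

print(a.dtype == b.dtype)   # True: both spellings name the same type
print(a.itemsize)           # 2 (bytes per 16-bit integer)

# The size in bytes of a few other standard types
print(np.dtype("float32").itemsize)      # 4
print(np.dtype("complex128").itemsize)   # 16
```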

# The Basics of NumPy Arrays

Data manipulation in Python is nearly synonymous with NumPy array manipulation. This section presents several examples of using NumPy array manipulation to access data and subarrays, and to split, reshape, and join arrays. While the operations shown here may seem a bit dry and pedantic, they are the building blocks for more involved array work.

We'll cover a few categories of basic array manipulations here:

* Attributes of arrays: Determining the size, shape, memory consumption, and data types of arrays
* Indexing of arrays: Getting and setting the value of individual array elements
* Slicing of arrays: Getting and setting smaller subarrays within a larger array
* Reshaping of arrays: Changing the shape of a given array
* Joining and splitting of arrays: Combining multiple arrays into one, and splitting one array into many


In [26]:
np.random.seed(0)  # seed for reproducibility

In [27]:
x1 = np.random.randint(10, size=6)  # One-dimensional array
x2 = np.random.randint(10, size=(3, 4))  # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5))  # Three-dimensional array

Each array has attributes ndim (the number of dimensions), shape (the size of each dimension), and size (the total size of the array):

In [28]:
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)



x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60


In [29]:
print("dtype:", x3.dtype)

dtype: int32


Other attributes include itemsize, which lists the size (in bytes) of each array element, and nbytes, which lists the total size (in bytes) of the array:


In [30]:
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")

itemsize: 4 bytes
nbytes: 240 bytes
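The relationship between these attributes can be checked directly: nbytes is simply itemsize multiplied by size. The sketch below fixes the dtype to int32 (an assumption made so the numbers are the same on every platform) and uses an illustrative array name.

```python
import numpy as np

arr = np.zeros((3, 4, 5), dtype=np.int32)

print("size:    ", arr.size)       # 60 elements in total
print("itemsize:", arr.itemsize)   # 4 bytes per int32 element
print("nbytes:  ", arr.nbytes)     # 240 bytes

# nbytes is the product of the other two
print(arr.nbytes == arr.itemsize * arr.size)   # True
```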



# Array Indexing: Accessing Single Elements

If you are familiar with Python's standard list indexing, indexing in NumPy will feel quite familiar. In a one-dimensional array, the $i^{th}$ value (counting from zero) can be accessed by specifying the desired index in square brackets, just as with Python lists:


In [31]:
x1

array([5, 0, 3, 3, 7, 9])

In [32]:
x1[0]

5

In [33]:
x1[4]

7

In [34]:
x1[-1]

9

In [37]:
x1[-3]

3

In [39]:
x2[0]

array([3, 5, 2, 4])

In [43]:
x2[:,-1]

array([4, 8, 7])

In [44]:
x2[2,0]

1

In [45]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [46]:
x2[1,-1]

8

In [48]:
x2[0,0] = 12

In [50]:
x2[0,0]

12
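Because NumPy arrays have a fixed type, assigning a value of a different type does not change the array's dtype. The sketch below rebuilds x1 by hand so that it is self-contained and shows that a floating-point value assigned to an integer array is silently truncated.

```python
import numpy as np

x1 = np.array([5, 0, 3, 3, 7, 9])   # integer array

x1[0] = 3.14159    # the fractional part is dropped without a warning
print(x1)          # [3 0 3 3 7 9]
print(x1.dtype.kind)   # 'i': still an integer array
```

This behavior is easy to miss, so it is worth checking the dtype before writing floating-point results into an existing array.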

# Array Slicing: Accessing Subarrays

Just as we can use square brackets to access individual array elements, we can also use them to access subarrays with the slice notation, marked by the colon (:) character. The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array x, use this:

x[start:stop:step]

If any of these are unspecified, they default to the values start=0, stop=size of dimension, step=1. We'll take a look at accessing sub-arrays in one dimension and in multiple dimensions.
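A short sketch of these defaults follows. It also uses Python's built-in slice object, which is what the colon notation produces behind the scenes, and notes one detail not spelled out above: when step is negative, the defaults for start and stop are swapped. The variable names are illustrative.

```python
import numpy as np

x = np.arange(10)

# Omitting everything is the same as writing out the defaults
explicit = x[0:x.size:1]
print(np.array_equal(x[:], explicit))   # True

# The colon notation is shorthand for a slice object
with_colons = x[1:8:2]
with_slice = x[slice(1, 8, 2)]
print(with_colons, with_slice)          # [1 3 5 7] [1 3 5 7]

# With a negative step the defaults are swapped, so this
# walks from the last element back to the first
reversed_x = x[::-1]
print(reversed_x)                       # [9 8 7 6 5 4 3 2 1 0]
```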


In [51]:
x = np.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [52]:
x[:5] # first five elements

array([0, 1, 2, 3, 4])

In [53]:
x[5:]  # elements from index 5 onward

array([5, 6, 7, 8, 9])

In [54]:
x[4:7]  # middle sub-array



array([4, 5, 6])

In [55]:
x[::2]  # every other element

array([0, 2, 4, 6, 8])

In [56]:
x[1::2]  # every other element, starting at index 1

array([1, 3, 5, 7, 9])

In [57]:
x[::-1]  # all elements, reversed

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

In [58]:
x[5::-2]  # reversed every other from index 5



array([5, 3, 1])

In [59]:
x2

array([[12,  5,  2,  4],
       [ 7,  6,  8,  8],
       [ 1,  6,  7,  7]])

In [62]:
# two rows, three columns
x2[:2,:3]

array([[12,  5,  2],
       [ 7,  6,  8]])

In [63]:
# all rows, every other column
x2[:,0::2]

array([[12,  2],
       [ 7,  8],
       [ 1,  7]])

In [65]:
# Finally, subarray dimensions can even be reversed together:
x2[::-1, ::-1]

array([[ 7,  7,  6,  1],
       [ 8,  8,  6,  7],
       [ 4,  2,  5, 12]])

In [70]:
# first column of x2
print(x2[:,0])

[12  7  1]


In [66]:
x2

array([[12,  5,  2,  4],
       [ 7,  6,  8,  8],
       [ 1,  6,  7,  7]])

In [71]:
# first row of x2
print(x2[0, :])

[12  5  2  4]


In [74]:
# equivalent to x2[0, :]
print(x2[0])

[12  5  2  4]


In [75]:
print(x2)

[[12  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]


In [76]:
x2_sub = x2[:2,:2]
print(x2_sub)

[[12  5]
 [ 7  6]]


In [77]:
x2_sub[0, 0] = 99
print(x2_sub)

[[99  5]
 [ 7  6]]


In [78]:
print(x2)

[[99  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]


In [79]:
l1 = list(range(0,10))

In [80]:
l1

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [81]:
li_s = l1[:5]

In [83]:
li_s[0] = 56

In [84]:
li_s 

[56, 1, 2, 3, 4]

In [85]:
l1

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [86]:
l2 = l1

In [87]:
l2

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [88]:
l1

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [89]:
l2[0] = 52

In [90]:
l2

[52, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [91]:
l1

[52, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [92]:
x2

array([[99,  5,  2,  4],
       [ 7,  6,  8,  8],
       [ 1,  6,  7,  7]])

In [93]:
x2_sub_copy = x2[:2, :2].copy()
print(x2_sub_copy)



[[99  5]
 [ 7  6]]


In [94]:
print(x2)

[[99  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]


In [95]:
x2_sub_copy[0, 0] = 42
print(x2_sub_copy)



[[42  5]
 [ 7  6]]


In [96]:
print(x2)

[[99  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]
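The cells above show that a slice of a NumPy array is a view onto the original data, whereas a slice of a Python list, or an array produced with copy(), is independent. As a sketch, this can be confirmed without modifying anything by using np.shares_memory and the base attribute. The array is rebuilt by hand here, and the names view and copy are illustrative.

```python
import numpy as np

x2 = np.array([[99, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

view = x2[:2, :2]           # slicing returns a view
copy = x2[:2, :2].copy()    # copy() allocates a separate buffer

print(np.shares_memory(x2, view))   # True
print(np.shares_memory(x2, copy))   # False
print(view.base is x2)              # True: the view refers back to x2

view[0, 0] = 0     # writes through to x2
copy[0, 0] = 42    # leaves x2 untouched
print(x2[0, 0])    # 0
```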


# Reshaping of Arrays

Another useful type of operation is reshaping of arrays. The most flexible way of doing this is with the reshape method. For example, if you want to put the numbers 1 through 12 in a $6 \times 2$ grid, you can do the following:


In [105]:
x = np.arange(1,13)

In [106]:
x

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [116]:
x.shape = (1,12)

In [121]:
x = x.reshape(6,2)

In [122]:
x.shape

(6, 2)


Note that for this to work, the size of the initial array must match the size of the reshaped array. Where possible, the reshape method will use a no-copy view of the initial array, but with non-contiguous memory buffers this is not always the case.
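Both points can be sketched in a few lines. The example below assumes a 12-element array, uses np.shares_memory to tell a view from a copy, and also shows -1 as a dimension, a NumPy feature not covered above that lets reshape infer one dimension from the total size.

```python
import numpy as np

x = np.arange(1, 13)

grid = x.reshape(3, 4)
print(np.shares_memory(x, grid))    # True: a no-copy view

# One dimension may be given as -1 and is inferred from the size
inferred_shape = x.reshape(3, -1).shape
print(inferred_shape)               # (3, 4)

# A mismatched size raises ValueError (12 elements cannot fill 5x3)
size_error = False
try:
    x.reshape(5, 3)
except ValueError:
    size_error = True
print(size_error)                   # True

# The transpose is not contiguous in row-major order,
# so flattening it forces reshape to copy the data
transposed = grid.T
flat = transposed.reshape(12)
print(np.shares_memory(transposed, flat))   # False
```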

Another common reshaping pattern is the conversion of a one-dimensional array into a two-dimensional row or column matrix. This can be done with the reshape method, or more easily by using np.newaxis within a slice operation:


In [123]:
x = np.array([1, 2, 3])

# row vector via reshape
x.reshape((1, 3))



array([[1, 2, 3]])

In [124]:
x

array([1, 2, 3])

In [125]:
# row vector via newaxis
x[np.newaxis, :]

array([[1, 2, 3]])

In [126]:
# column vector via reshape
x.reshape((3, 1))

array([[1],
       [2],
       [3]])

In [127]:
# column vector via newaxis
x[:, np.newaxis]

array([[1],
       [2],
       [3]])

In [130]:
x

array([1, 2, 3])
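As a short sketch, the shapes produced by np.newaxis can be compared with the reshape results directly. np.newaxis is an alias for None, so either spelling works inside the brackets. The names row and col are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])

print(np.newaxis is None)   # True

row = x[np.newaxis, :]      # shape (1, 3)
col = x[:, None]            # shape (3, 1); None behaves like np.newaxis
print(row.shape, col.shape)

# Both routes give identical arrays, and x itself is unchanged
print(np.array_equal(row, x.reshape((1, 3))))   # True
print(np.array_equal(col, x.reshape((3, 1))))   # True
print(x.shape)                                  # (3,)
```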


# Concatenation of arrays

Concatenation, or joining of two arrays in NumPy, is primarily accomplished using the routines np.concatenate, np.vstack, and np.hstack. np.concatenate takes a tuple or list of arrays as its first argument, as we can see here:


In [131]:
x = np.array([1, 2, 3])
y = np.array([3, 2, 1])

In [132]:
np.concatenate([x,y])

array([1, 2, 3, 3, 2, 1])

In [133]:
z = [99, 99, 99]

In [134]:
np.concatenate([x,y,z])

array([ 1,  2,  3,  3,  2,  1, 99, 99, 99])

In [135]:
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

In [136]:
np.concatenate([grid,grid])

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

In [137]:
# concatenate along the second axis (zero-indexed)
np.concatenate([grid, grid], axis=1)

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])

In [138]:
x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

In [139]:
np.vstack([x, grid])

array([[1, 2, 3],
       [9, 8, 7],
       [6, 5, 4]])
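np.hstack works the same way along the horizontal direction. In the sketch below, the column y is an illustrative array that is not defined elsewhere in this section; it must have the same number of rows as grid.

```python
import numpy as np

grid = np.array([[9, 8, 7],
                 [6, 5, 4]])
y = np.array([[99],
              [99]])

# horizontally stack the arrays: columns of y are appended to grid
joined = np.hstack([grid, y])
print(joined)
# [[ 9  8  7 99]
#  [ 6  5  4 99]]
```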

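In [141]:
# np.concatenate requires arrays with the same number of dimensions,
# so promote x to a 1x3 row before joining it with grid
np.concatenate([x[np.newaxis, :], grid])

array([[1, 2, 3],
       [9, 8, 7],
       [6, 5, 4]])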