# NumPy Basics

This document highlights the building blocks needed to work with NumPy arrays. A good understanding of the features presented here is crucial for almost everything else you do with them.

## Array Attributes

Every NumPy array is an object, so it has attributes. This section covers some of the most commonly used ones.

In [1]:
import numpy as np
np.random.seed(0)

x = np.random.randint(10, size=(3,3))

In [2]:
# Number of dimensions 
print("ndim:", x.ndim)
# Shape
print("shape:", x.shape)
# Number of elements
print("size:", x.size)
# Data type of the array
print("dtype:", x.dtype)
# Size of the elements (bytes)
print("itemsize:", x.itemsize, "bytes")
# Size of the array (bytes)
print("nbytes:", x.nbytes, "bytes")

ndim: 2
shape: (3, 3)
size: 9
dtype: int64
itemsize: 8 bytes
nbytes: 72 bytes
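
The attributes above are related: the total byte count is the number of elements times the size of one element. The short sketch below checks this. It sets `dtype=np.int64` explicitly, which is an assumption made here so that the numbers do not depend on the platform's default integer type.

```python
import numpy as np

# Explicit dtype so the result does not depend on the platform default
x = np.zeros((3, 3), dtype=np.int64)

# Total bytes = number of elements * bytes per element
total_bytes = x.size * x.itemsize
print(total_bytes == x.nbytes)     # True

# The number of elements is the product of the shape
print(x.size == np.prod(x.shape))  # True
```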


## Array Slicing

This feature uses Python's slice notation to access subarrays. For reference, the syntax is as follows:

```
x[start:stop:step]
```

One important thing to know about NumPy array slices is that they return views of the original array, not copies (unlike slicing a Python list, which produces a copy). This makes slicing fast, and it also means the original data can be modified through a slice. To create an independent copy of the data, use the `copy` method.
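
A minimal sketch of the difference between a view and a copy follows. The variable names `view` and `independent` are chosen here for illustration only.

```python
import numpy as np

x = np.arange(10)

# A slice is a view: writing through it changes the original array
view = x[:3]
view[0] = 99
print(x[0])  # 99

# An explicit copy owns its data, so the original stays untouched
independent = x[:3].copy()
independent[1] = -1
print(x[1])  # 1

# np.shares_memory reports whether two arrays overlap in memory
print(np.shares_memory(x, view))         # True
print(np.shares_memory(x, independent))  # False
```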

In [3]:
x = np.arange(10)
# First five elements
print(x[:5])
# Last five elements
print(x[5:])
# Subarray of length 5, from index 2 up to (but not including) index 7
print(x[2:7])
# Every other element
print(x[::2])
# Every other element starting at index 1
print(x[1::2])
# Reversed array. With a negative step, the defaults for start and stop are swapped
print(x[::-1])

[0 1 2 3 4]
[5 6 7 8 9]
[2 3 4 5 6]
[0 2 4 6 8]
[1 3 5 7 9]
[9 8 7 6 5 4 3 2 1 0]


The same rules apply to multi-dimensional arrays, with one slice per dimension separated by commas:

In [4]:
x = np.random.randint(10, size=(3, 4))
x

array([[4, 7, 6, 8],
       [8, 1, 6, 7],
       [7, 8, 1, 5]])

In [5]:
# First two rows and first three columns
x[:2, :3]

array([[4, 7, 6],
       [8, 1, 6]])

In [6]:
# All rows, every other column
x[:3, ::2]

array([[4, 6],
       [8, 6],
       [7, 1]])

In [7]:
# Reverses both dimensions together
x[::-1, ::-1]

array([[5, 1, 8, 7],
       [7, 6, 1, 8],
       [8, 6, 7, 4]])

To access either a single row or column, an empty slice denoted by `:` can be used:

In [8]:
# Print first column
print(x[:, 0])
# Print first row. Equivalent to x[0]
print(x[0, :])

[4 8 7]
[4 7 6 8]


## Reshaping Arrays

The `reshape` method gives an array a new shape, provided the new shape holds the same number of elements as the original. Otherwise it raises a `ValueError`.

In [9]:
# Creates an array of length 9 with the numbers 1 to 9, then reshapes it into a 3x3 matrix
m = np.arange(1, 10).reshape((3, 3))
m

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
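
To illustrate the failure case, the sketch below tries a shape whose size does not match and catches the resulting error. It also uses `-1` as a dimension, which asks NumPy to infer that dimension from the total size. The names `values` and `inferred` are illustrative.

```python
import numpy as np

values = np.arange(1, 10)  # 9 elements

# 9 elements cannot fill a 2x5 array, so reshape raises ValueError
try:
    values.reshape((2, 5))
except ValueError as err:
    print("reshape failed:", err)

# -1 asks NumPy to infer that dimension from the total size
inferred = values.reshape((3, -1))
print(inferred.shape)  # (3, 3)
```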

A common use of reshaping, either with the `reshape` method or by indexing with `np.newaxis`, is converting a one-dimensional array into a two-dimensional row or column matrix:

In [10]:
x = np.arange(3)
# Creates a row vector from x. Equivalent to x[np.newaxis, :]
x.reshape((1, 3))

array([[0, 1, 2]])

In [11]:
# Creates a column vector from x. Equivalent to x[:, np.newaxis]
x.reshape((3, 1))

array([[0],
       [1],
       [2]])
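
The equivalence mentioned in the comments above can be checked directly. This is a small sketch; `row` and `col` are names introduced here.

```python
import numpy as np

x = np.arange(3)

# Indexing with np.newaxis inserts a new axis of length 1
row = x[np.newaxis, :]
col = x[:, np.newaxis]
print(row.shape, col.shape)  # (1, 3) (3, 1)

# Both forms match the corresponding reshape calls
print(np.array_equal(row, x.reshape((1, 3))))  # True
print(np.array_equal(col, x.reshape((3, 1))))  # True

# np.newaxis is an alias for None
print(np.newaxis is None)  # True
```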

## Concatenation and Splitting

The following covers ways of combining multiple arrays into a single one, or splitting an array into multiple smaller arrays.

In [12]:
# One-dimensional arrays
a = np.arange(4)
b = np.arange(4, 7)
c = np.arange(7, 10)
np.concatenate([a, b, c])

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [13]:
# Two-dimensional arrays
grid = np.array([[1, 2, 3], [4, 5, 6]])
# Concatenates along the first axis (axis=0)
np.concatenate([grid, grid])

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

In [14]:
# Concatenates along the second axis
np.concatenate([grid, grid], axis=1)

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])

NumPy also provides `np.hstack` to stack horizontally, `np.vstack` to stack vertically and `np.dstack` to stack along the third axis:

In [15]:
a = np.arange(1, 4)
b = np.array([np.arange(4, 7), np.arange(7, 10)])
# Stacks a and b vertically
np.vstack([a, b])

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [16]:
c = np.array([10, 10]).reshape((2, 1))
# Stacks b and c horizontally
np.hstack([b, c])

array([[ 4,  5,  6, 10],
       [ 7,  8,  9, 10]])
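
`np.dstack` is mentioned above but not shown. As a rough illustration, stacking two 2x2 arrays along the third axis pairs up the elements at each position. The arrays `a` and `b` here are made-up inputs, not the ones from the cells above.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Stacks a and b along the third axis (depth)
stacked = np.dstack([a, b])
print(stacked.shape)  # (2, 2, 2)

# Each position now holds the matching elements of a and b
print(stacked[0, 0])  # [1 5]
```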

The counterparts of the concatenating and stacking functions are the splitting functions `np.split`, `np.hsplit`, `np.vsplit`, and `np.dsplit`:

In [17]:
x = np.arange(1, 10)
# The second argument is the indices where the splitting takes place
x1, x2, x3 = np.split(x, [3, 6])
print(x1, x2, x3)

[1 2 3] [4 5 6] [7 8 9]


In [18]:
grid = np.arange(16).reshape((4, 4))
upper, lower = np.vsplit(grid, [2])
print(upper)
print(lower)

[[0 1 2 3]
 [4 5 6 7]]
[[ 8  9 10 11]
 [12 13 14 15]]


In [19]:
left, right = np.hsplit(grid, [2])
print(left)
print(right)

[[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]]
[[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]]
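
Splitting and stacking are inverse operations, which the sketch below checks on the same 4x4 grid. It also passes an integer instead of a list of indices as the second argument to `np.split`, which asks for that many equal-sized pieces. The names `top` and `bottom` are illustrative.

```python
import numpy as np

grid = np.arange(16).reshape((4, 4))

# An integer second argument asks for that many equal-sized pieces
top, bottom = np.split(grid, 2, axis=0)
print(top.shape, bottom.shape)  # (2, 4) (2, 4)

# Stacking the pieces back restores the original array
print(np.array_equal(np.vstack([top, bottom]), grid))  # True

left, right = np.hsplit(grid, [2])
print(np.array_equal(np.hstack([left, right]), grid))  # True
```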
