# Arrays
*Evelyn Campbell, Ph.D.*

An *array* is a data structure that consists of a collection of elements organized into a grid-like shape. In Python, arrays can be one-dimensional, akin to a list, or multidimensional (2D, 3D, etc.). However, unlike a list, an array consists of elements that are all of the same data type. This makes arrays convenient for storage and ideal for fast manipulation of data elements. Arrays are offered through the `NumPy` library and are often used in conjunction with other Python libraries, such as `pandas`, `SciPy`, and `Scikit-learn`. We will explore arrays in this section, along with functions commonly used with them.

To make an array, we first need to import `NumPy`. We can then build an array from a list using the `np.array()` function.

In [2]:
import numpy as np

my_list = [30, 50, 70, 90]
my_array = np.array(my_list)
my_array

array([30, 50, 70, 90])
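
Because every element of an array shares one data type, it can help to inspect that type directly. The short sketch below rebuilds `my_array` and then creates a second, hypothetical array named `mixed` (not part of the original example) to show how `NumPy` handles a list that mixes integers and floats.

```python
import numpy as np

# Same values as my_array in the cell above
my_array = np.array([30, 50, 70, 90])
print(my_array.dtype)  # an integer type; the exact width depends on the platform

# Hypothetical example: a list that mixes integers and a float.
# NumPy converts every element to a single common type (float here).
mixed = np.array([30, 50.5, 70])
print(mixed.dtype)  # float64
```

The `dtype` attribute reports the single type shared by all elements, which is the property that distinguishes an array from a list.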

Another way to make an array is with the `np.arange()` function, which builds an array of evenly spaced values from an inclusive start value up to an exclusive stop value, with an optional step size.

In [3]:
my_array2 = np.arange(4, 11, 2)
my_array2

array([ 4,  6,  8, 10])

Above, we made a one-dimensional array with four elements. We started the array at 4 and set the stop value to 11. Because we specified a spacing of 2 and the stop value is exclusive, our array contains the values 4, 6, 8, and 10. In other words, `np.arange()` produces evenly spaced elements that begin at the start value and end before the stop value.
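
To see the exclusive stop value in action, the sketch below compares a few calls to `np.arange()`. The variable names `a`, `b`, and `c` are illustrative only.

```python
import numpy as np

# Stop value 11 is excluded, so the last element is 10
a = np.arange(4, 11, 2)
print(a)  # [ 4  6  8 10]

# Stop value 10 is also excluded, so the array now ends at 8
b = np.arange(4, 10, 2)
print(b)  # [4 6 8]

# With a single argument, the start defaults to 0 and the step to 1
c = np.arange(5)
print(c)  # [0 1 2 3 4]
```

Note that a stop value equal to one of the evenly spaced values, as in `b`, is still left out.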

As previously stated, arrays are a data structure that allows for fast calculations. For example, we can easily multiply our two arrays to obtain a new array of values.

In [4]:
my_array3 = my_array * my_array2
my_array3

array([120, 300, 560, 900])

The resulting array consists of the products of element-by-element multiplication of the first two arrays. Keep in mind that when performing calculations with multiple arrays, the dimensions of the arrays must be compatible. A discussion on array shape compatibility in mathematical operations can be found in the referenced documentation on *Array Broadcasting in Numpy* below.
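
As a rough illustration of shape compatibility, the sketch below multiplies arrays of matching, broadcastable, and mismatched shapes. The arrays `col` and `three` are hypothetical and are not used elsewhere in this section.

```python
import numpy as np

a = np.array([30, 50, 70, 90])  # shape (4,)
b = np.arange(4, 11, 2)         # shape (4,)

# Matching shapes: element-by-element multiplication
print((a * b).shape)  # (4,)

# A (2, 1) array and a (4,) array are compatible:
# NumPy stretches both to shape (2, 4) before multiplying
col = np.array([[1], [2]])
broadcast = col * a
print(broadcast.shape)  # (2, 4)

# A (3,) array and a (4,) array are not compatible
three = np.array([1, 2, 3])
error_message = None
try:
    a * three
except ValueError as err:
    error_message = str(err)
print("Incompatible shapes:", error_message)
```

In the second case, each row of `broadcast` is `a` scaled by the corresponding value in `col`.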

More simply, we can also perform a desired calculation on all elements of an array:

In [5]:
my_array3 / 20 + 7

array([13., 22., 35., 52.])

Arrays can also be reshaped and combined. We can use the `np.reshape()` function to change each of the first two arrays from a one-dimensional array of four elements (shape `(4,)`) to a two-dimensional 2x2 array.

In [6]:
reshape1 = np.reshape(my_array, (2,2))
reshape2 = np.reshape(my_array2, (2,2))

In [7]:
reshape1

array([[30, 50],
       [70, 90]])

In [8]:
reshape2

array([[ 4,  6],
       [ 8, 10]])

When combining arrays that have the same shape, we can use the `np.row_stack()` and `np.column_stack()` functions to concatenate the rows and columns of multiple arrays, respectively. 

In [9]:
np.column_stack((reshape1, reshape2))

array([[30, 50,  4,  6],
       [70, 90,  8, 10]])

In [10]:
np.row_stack((reshape1, reshape2))

array([[30, 50],
       [70, 90],
       [ 4,  6],
       [ 8, 10]])
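
For two-dimensional inputs like these, `np.vstack()` and `np.hstack()` should give the same results as the row and column stacking shown above. To our knowledge, `np.row_stack()` is an alias of `np.vstack()` that recent `NumPy` releases flag as deprecated, so the sketch below may be the safer spelling. Treat that detail as an assumption and check it against your installed version.

```python
import numpy as np

reshape1 = np.array([[30, 50], [70, 90]])
reshape2 = np.array([[4, 6], [8, 10]])

# Stack vertically: the rows of reshape2 are placed below those of reshape1
stacked_rows = np.vstack((reshape1, reshape2))
print(stacked_rows.shape)  # (4, 2)

# Stack horizontally: the columns of reshape2 are placed to the right
stacked_cols = np.hstack((reshape1, reshape2))
print(stacked_cols.shape)  # (2, 4)

# np.concatenate with an explicit axis expresses the same vertical stack
same_as_rows = np.concatenate((reshape1, reshape2), axis=0)
print(np.array_equal(same_as_rows, stacked_rows))  # True
```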

How an array is constructed and shaped matters when you apply aggregate functions to it. Some useful aggregate functions that can be performed on arrays include `np.min()`, `np.max()`, `np.sum()`, and `np.average()`. These functions can be applied to the entire array or along its rows or columns.

In [12]:
np.min(reshape1)

30

To aggregate down each column of the array, use an `axis=0` argument. To aggregate across each row, use an `axis=1` argument. The returned array has one element per column (for `axis=0`) or one element per row (for `axis=1`).

In [11]:
np.sum(reshape1, axis=0)

array([100, 140])

In [12]:
np.average(reshape1, axis=1)

array([40., 80.])
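
The effect of the `axis` argument is easier to see on an array that is not square. The sketch below uses a hypothetical 2x3 array named `grid`, which is not part of the original example.

```python
import numpy as np

# Hypothetical 2x3 array: [[0, 1, 2], [3, 4, 5]]
grid = np.arange(6).reshape(2, 3)

# axis=0 collapses the rows, giving one value per column
col_sums = np.sum(grid, axis=0)
print(col_sums)  # [3 5 7]

# axis=1 collapses the columns, giving one value per row
row_sums = np.sum(grid, axis=1)
print(row_sums)  # [ 3 12]

# With no axis argument, the whole array is reduced to a single value
total = np.sum(grid)
print(total)  # 15
```

Here the result for `axis=0` has three elements (one per column) and the result for `axis=1` has two (one per row).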

A number of other functions can be applied element-wise to transform the elements within an array. These include `np.sqrt()`, `np.square()`, `np.power()`, `np.log()`, and many others.

In [13]:
np.power(reshape2, 3)

array([[  64,  216],
       [ 512, 1000]])
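
Unlike `np.sum()` or `np.min()`, these functions return an array with the same shape as the input, because each element is transformed on its own. A brief sketch, with illustrative variable names:

```python
import numpy as np

reshape2 = np.array([[4, 6], [8, 10]])

# Square every element; the 2x2 shape is preserved
squares = np.square(reshape2)
print(squares.shape)  # (2, 2)

# The square root of the squares recovers the original values (as floats)
roots = np.sqrt(squares)
print(np.allclose(roots, reshape2))  # True

# np.log computes the natural logarithm of every element
logs = np.log(reshape2)
print(logs.shape)  # (2, 2)
```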

Documentation on the mathematical functions offered by `NumPy` is listed below. In the next section, we will learn how to subset and index arrays.

## Resources

- <a target="_blank" href="https://numpy.org/doc/stable/reference/routines.math.html">Mathematical functions in NumPy</a>
- <a target="_blank" href="https://numpy.org/doc/1.20/user/theory.broadcasting.html">Array Broadcasting in Numpy</a>