# NumPy: the absolute basics for beginners

#### NumPy (Numerical Python) is an open source Python library that’s widely used in science and engineering. The NumPy library contains multidimensional array data structures, such as the homogeneous, N-dimensional ndarray, and a large library of functions that operate efficiently on these data structures.

## Why use NumPy?

#### Python lists are excellent, general-purpose containers. They can be “heterogeneous”, meaning that they can contain elements of a variety of types, and they are quite fast when used to perform individual operations on a handful of elements.

#### Depending on the characteristics of the data and the types of operations that need to be performed, other containers may be more appropriate; by exploiting these characteristics, we can improve speed, reduce memory consumption, and offer a high-level syntax for performing a variety of common processing tasks. NumPy shines when there are large quantities of “homogeneous” (same-type) data to be processed on the CPU.

## What is an "Array"?

#### In computer programming, an array is a structure for storing and retrieving data. We often talk about an array as if it were a grid in space, with each cell storing one element of the data. For instance, if each element of the data were a number, we might visualize a “one-dimensional” array like a list, and a two-dimensional array like a table.

#### A three-dimensional array would be like a set of tables, perhaps stacked as though they were printed on separate pages. In NumPy, this idea is generalized to an arbitrary number of dimensions, and so the fundamental array class is called ndarray: it represents an “N-dimensional array”.

#### Most NumPy arrays have some restrictions. For instance:

1. All elements of the array must be of the same type of data.

2. Once created, the total size of the array can’t change.

3. The shape must be “rectangular”, not “jagged”; e.g., each row of a two-dimensional array must have the same number of columns.

#### When these conditions are met, NumPy exploits these characteristics to make the array faster, more memory efficient, and more convenient to use than less restrictive data structures.
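
#### As a rough illustration of the first two restrictions, the sketch below shows NumPy choosing one common type for mixed input, and shows that "growing" an array actually builds a new one. The variable names are illustrative and not part of the tutorial.

```python
import numpy as np

# Mixing integers and a float: NumPy picks one common type for every element.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)             # float64

# np.append does not resize in place; it returns a new, larger array.
base = np.array([1, 2, 3])
grown = np.append(base, 4)
print(base.size, grown.size)   # 3 4
```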

## How to import NumPy

In [3]:
import numpy as np 

## Array fundamentals

#### One way to initialize an array is using a Python sequence, such as a list. For example:

In [4]:
a = np.array([1, 2, 3, 4, 5, 6])
print(a)

[1 2 3 4 5 6]


In [5]:
a[0] = 10 
a

array([10,  2,  3,  4,  5,  6])

In [6]:
a[:3]

array([10,  2,  3])

In [7]:
b = a[:3]
b

array([10,  2,  3])

In [8]:
b[0] = 40
a

array([40,  2,  3,  4,  5,  6])
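
#### The cells above show that changing `b` also changed `a`. Basic slicing of an array returns a view onto the same data rather than a copy, unlike slicing a Python list. A minimal sketch of the difference, using the illustrative names `view` and `independent`:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

view = a[:3]                  # basic slicing returns a view onto a's data
view[0] = 40
print(a)                      # [40  2  3  4  5  6]

independent = a[:3].copy()    # .copy() gives an array with its own data
independent[0] = 99
print(a)                      # [40  2  3  4  5  6]  (unchanged this time)

print(np.shares_memory(a, view), np.shares_memory(a, independent))  # True False
```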

#### Two- and higher-dimensional arrays can be initialized from nested Python sequences:

In [9]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

#### In NumPy, a dimension of an array is sometimes referred to as an “axis”. This terminology may be useful to disambiguate between the dimensionality of an array and the dimensionality of the data represented by the array. For instance, the array a could represent three points, each lying within a four-dimensional space, but a has only two “axes”.

#### Another difference between an array and a list of lists is that an element of the array can be accessed by specifying the index along each axis within a single set of square brackets, separated by commas. For instance, the element 8 is in row 1 and column 3:

In [10]:
print(a[1, 3])


8
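
#### To make the contrast with a list of lists concrete, the sketch below accesses the same element both ways and then selects a whole row and a whole column. The name `nested` is illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
nested = a.tolist()        # the same data as a plain list of lists

print(a[1, 3])             # 8
print(nested[1][3])        # 8; lists need one pair of brackets per level
print(a[1])                # [5 6 7 8], all of row 1
print(a[:, 3])             # [ 4  8 12], all of column 3
```

#### Writing `nested[1, 3]` on the plain list would raise a `TypeError`, because list indices must be integers or slices.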


## Array attributes

*This section covers the* `ndim`, `shape`, `size`, *and* `dtype` *attributes of an array.*


In [11]:
a


array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

The number of dimensions of an array is contained in the `ndim` attribute.

In [12]:
a.ndim

2

#### The shape of an array is a tuple of non-negative integers that specify the number of elements along each dimension.

In [13]:
a.shape

(3, 4)

In [14]:
len(a.shape) == a.ndim

True

#### The fixed, total number of elements in an array is contained in the `size` attribute.

In [15]:
a.size

12

In [16]:
import math
a.size == math.prod(a.shape)  # 3 * 4 == 12

True

#### Arrays are typically “homogeneous”, meaning that they contain elements of only one “data type”. The data type is recorded in the `dtype` attribute.
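
#### A small sketch of inspecting and choosing a `dtype`. The default integer type depends on the platform, so this example requests one explicitly. The variable names are illustrative.

```python
import numpy as np

floats = np.array([1.0, 2.0, 3.0])
print(floats.dtype)        # float64

# The default integer type is platform dependent, so request one explicitly
# when it matters.
ints = np.array([1, 2, 3], dtype=np.int32)
print(ints.dtype)          # int32

# astype returns a converted copy; the original array keeps its dtype.
as_float = ints.astype(np.float64)
print(as_float)            # [1. 2. 3.]
print(ints.dtype)          # int32
```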

## How to create a basic array

#### This section covers `np.zeros()`, `np.ones()`, `np.empty()`, `np.arange()`, `np.linspace()`

#### Besides creating an array from a sequence of elements, you can easily create an array filled with zeros:

In [17]:
np.zeros(2)

array([0., 0.])

#### Or an array filled with ones:

In [18]:
np.ones(2)

array([1., 1.])

#### Or even an empty array! The function `empty` creates an array whose initial content is arbitrary and depends on the state of the memory, so the values you see may differ from the output below. The reason to use `empty` over `zeros` (or something similar) is speed. Just make sure to fill every element afterwards!

In [19]:
# Create an empty array with 2 elements
np.empty(2) 

array([1., 1.])

#### You can create an array with a range of elements:

In [20]:
np.arange(2,10,2)

array([2, 4, 6, 8])

#### You can also use `np.linspace()` to create an array with values that are spaced linearly in a specified interval:

In [21]:
np.linspace(0,10,5)


array([ 0. ,  2.5,  5. ,  7.5, 10. ])
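
#### The creation functions above default to floating-point output for `zeros`, `ones`, and `empty`. The sketch below shows the `dtype` argument, a tuple shape, and how `arange` and `linspace` treat the end of the interval. The variable names are illustrative.

```python
import numpy as np

# zeros/ones/empty default to float64; pass dtype to change that.
int_ones = np.ones(3, dtype=np.int64)
print(int_ones)                    # [1 1 1]

# A tuple gives a multidimensional shape.
grid = np.zeros((2, 3))
print(grid.shape)                  # (2, 3)

# arange excludes the stop value; linspace includes it by default.
print(np.arange(0, 10, 5))         # [0 5]
print(np.linspace(0, 10, 3))       # [ 0.  5. 10.]
```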

## Adding, removing, and sorting elements

#### This section covers `np.sort()`, `np.concatenate()`

#### Sorting an array is simple with `np.sort()`. You can specify the axis, kind, and order when you call the function.

#### If you start with this array:

In [22]:
arr = np.array([2,1,5,3,7,4,6,8])

#### You can quickly sort the numbers in ascending order with:

In [23]:
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])
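
#### `np.sort()` returns a sorted copy and leaves its input alone. The sketch below shows that, along with one way to get descending order and the effect of the `axis` argument on a small 2D array. The array `table` is an illustrative example.

```python
import numpy as np

arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

sorted_copy = np.sort(arr)         # returns a sorted copy
print(arr[0], sorted_copy[0])      # 2 1; arr itself is untouched

descending = np.sort(arr)[::-1]    # reverse the sorted copy
print(descending)                  # [8 7 6 5 4 3 2 1]

table = np.array([[9, 1, 8],
                  [3, 7, 2]])
by_column = np.sort(table, axis=0)   # sorts within each column
by_row = np.sort(table, axis=1)      # sorts within each row
print(by_column)                     # rows become [3 1 2] and [9 7 8]
print(by_row)                        # rows become [1 8 9] and [2 3 7]
```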

#### If you start with these arrays, you can concatenate them with `np.concatenate()`:

In [24]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

In [25]:
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

#### Or, if you start with these arrays:

In [26]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

#### You can concatenate them with:

In [27]:
np.concatenate((x, y), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])

#### Understanding `axis=0` and `axis=1` in NumPy

In [28]:
import numpy as np

# Create a 2D array
arr = np.array([[1, 2, 3], 
                [4, 5, 6]])

# Sum along axis=0 (columns)
col_sum = np.sum(arr, axis=0)  # Result: [5, 7, 9]

# Sum along axis=1 (rows)
row_sum = np.sum(arr, axis=1)  # Result: [6, 15]

# Mean along axis=0 (columns)
col_mean = np.mean(arr, axis=0)  # Result: [2.5, 3.5, 4.5]

# Mean along axis=1 (rows)
row_mean = np.mean(arr, axis=1)  # Result: [2.0, 5.0]
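
#### One way to read the `axis` argument is that a reduction removes the named axis from the shape. The sketch below prints the resulting shapes, and shows that the same axis numbering applies to `np.concatenate`.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])        # shape (2, 3)

# Reducing along an axis removes that axis from the shape.
print(np.sum(arr, axis=0).shape)   # (3,)  one value per column
print(np.sum(arr, axis=1).shape)   # (2,)  one value per row

# With no axis given, every element is reduced to a single value.
print(np.sum(arr))                 # 21

# The same numbering applies to concatenate: axis=1 joins side by side.
wide = np.concatenate((arr, arr), axis=1)
print(wide.shape)                  # (2, 6)
```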

## How do you know the shape and size of an array?

#### This section covers `ndarray.ndim`, `ndarray.size`, `ndarray.shape`

#### `ndarray.ndim` will tell you the number of axes, or dimensions, of the array.

#### `ndarray.size` will tell you the total number of elements of the array. This is the product of the elements of the array’s shape.

#### `ndarray.shape` will display a tuple of integers that indicate the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is `(2, 3)`.

#### For example, if you create this array:

In [29]:
array_example = np.array([[[0, 1, 2, 3],
                           [4, 5, 6, 7]],
                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]],
                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]]])

#### To find the number of dimensions of the array, run:

In [30]:
array_example.ndim

3

#### To find the total number of elements in the array, run:

In [31]:
array_example.size

24

#### And to find the shape of your array, run:

In [32]:
array_example.shape

(3, 2, 4)

## Can you reshape an array?

#### This section covers `arr.reshape()`

#### Yes!

#### Using `arr.reshape()` will give a new shape to an array without changing the data. Just remember that when you use the reshape method, the array you want to produce needs to have the same number of elements as the original array. If you start with an array with 12 elements, you’ll need to make sure that your new array also has a total of 12 elements.

#### If you start with this array:

In [33]:
a = np.arange(6)
print(a)

[0 1 2 3 4 5]


#### You can use `reshape()` to reshape your array. For example, you can reshape this array to an array with two rows and three columns, or to one with three rows and two columns:

In [34]:
A = a.reshape(2, 3)
print(A)
B = a.reshape(3, 2)
print(B)

[[0 1 2]
 [3 4 5]]
[[0 1]
 [2 3]
 [4 5]]


#### With `np.reshape`, you can specify a few optional parameters:

In [35]:
np.reshape(a, (2, 3), order='F') # Fortran-style order

array([[0, 2, 4],
       [1, 3, 5]])
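
#### The sketch below illustrates the element-count rule described above: `-1` lets NumPy infer one dimension, a shape with the wrong total raises `ValueError`, and the `order` argument controls how elements are laid out in the new shape. The names `c_order` and `f_order` are illustrative.

```python
import numpy as np

a = np.arange(6)

# -1 asks NumPy to infer that dimension from the total size.
inferred = a.reshape(3, -1)
print(inferred.shape)              # (3, 2)

# A shape whose product is not 6 is rejected.
try:
    a.reshape(4, 2)
except ValueError as err:
    print("reshape failed:", err)

# order='C' (the default) fills row by row; order='F' fills column by column.
c_order = np.reshape(a, (2, 3), order='C')
f_order = np.reshape(a, (2, 3), order='F')
print(c_order)
print(f_order)
```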

### How to convert a 1D array into a 2D array (how to add a new axis to an array)

#### This section covers `np.newaxis`, `np.expand_dims`

#### You can use `np.newaxis` and `np.expand_dims` to increase the dimensions of your existing array.

#### Using `np.newaxis` will increase the dimensions of your array by one dimension when used once. This means that a 1D array will become a 2D array, a 2D array will become a 3D array, and so on.

#### For example, if you start with this array:

In [36]:
a = np.array([1,2,3,4,5,6])
a.shape

(6,)

#### You can use `np.newaxis` to add a new axis:

In [37]:
a2 = a[np.newaxis,:]
a2.shape

(1, 6)

#### You can explicitly convert a 1D array to either a row vector or a column vector using `np.newaxis`. For example, you can convert a 1D array to a row vector by inserting an axis along the first dimension:

In [38]:
row_vector = a[np.newaxis, :]
row_vector.shape

(1, 6)

#### Or, for a column vector, you can insert an axis along the second dimension:

In [39]:
col_vector = a[:, np.newaxis]
col_vector.shape

(6, 1)

#### You can also expand an array by inserting a new axis at a specified position with `np.expand_dims`.

#### For example, if you start with this array:

In [40]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

#### You can use `np.expand_dims` to add an axis at index position 1 with:

In [41]:
b = np.expand_dims(a, axis=1)
b.shape

(6, 1)

#### You can add an axis at index position 0 with:

In [42]:
c = np.expand_dims(a, axis=0)
c.shape

(1, 6)
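
#### As a quick check, the sketch below builds the same column vector three ways: with `np.newaxis`, with `np.expand_dims`, and with `reshape`. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

col_newaxis = a[:, np.newaxis]
col_expand = np.expand_dims(a, axis=1)
col_reshape = a.reshape(-1, 1)

print(col_newaxis.shape, col_expand.shape, col_reshape.shape)  # (6, 1) (6, 1) (6, 1)
print(np.array_equal(col_newaxis, col_expand))                 # True
print(np.newaxis is None)                                      # True; newaxis is an alias for None
```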

## Indexing and Slicing 

#### You can index and slice NumPy arrays in the same ways you can slice Python lists.

In [43]:
data = np.array([1,2,3])
data[1]
data[0:2]

array([1, 2])

#### You may want to take a section of your array or specific array elements to use in further analysis or additional operations. To do that, you’ll need to subset, slice, and/or index your arrays.

#### If you want to select values from your array that fulfill certain conditions, it’s straightforward with NumPy.

#### For example, if you start with this array:

In [45]:
a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

#### You can easily print all of the values in the array that are less than 5.

In [46]:
print(a[a < 5])

[1 2 3 4]

#### You can also select, for example, numbers that are equal to or greater than 5, and use that condition to index an array.

In [47]:
five_up = (a >= 5)
print(a[five_up])

[ 5  6  7  8  9 10 11 12]


#### You can select elements that are divisible by 2:

In [50]:
divisible_2 = a[a%2 ==0]
print(divisible_2)

[ 2  4  6  8 10 12]


#### Or you can select elements that satisfy two conditions using the `&` and `|` operators:

In [51]:
c = a[(a>2) & (a <11)]
print(c)

[ 3  4  5  6  7  8  9 10]


#### You can also make use of the logical operators `&` and `|` in order to return boolean values that specify whether or not the values in an array fulfill a certain condition. This can be useful with arrays that contain names or other categorical values.

In [52]:
five_up = (a > 5) | (a == 5)
print(five_up)

[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]
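
#### The sketch below applies the same boolean-mask idea to categorical data. The `names` and `scores` arrays are hypothetical values made up for illustration. Note that each comparison needs its own parentheses when combined with `&` or `|`.

```python
import numpy as np

names = np.array(["ana", "bob", "cy", "bob"])    # hypothetical labels
scores = np.array([7, 3, 9, 5])                  # hypothetical values

is_bob = (names == "bob")
print(is_bob)                        # [False  True False  True]
print(scores[is_bob])                # [3 5]

# Combine conditions element-wise; each comparison gets its own parentheses.
picked = scores[(names == "bob") | (scores > 8)]
print(picked)                        # [3 9 5]

# A mask can also appear on the left-hand side of an assignment.
scores[scores < 5] = 0
print(scores)                        # [7 0 9 5]
```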


## How to create an array from existing data

#### This section covers slicing and indexing, `np.vstack()`, `np.hstack()`, `np.hsplit()`, `.view()`, `.copy()`

#### You can easily create a new array from a section of an existing array.

#### Let’s say you have this array:

In [54]:
a = np.array([1,2,3,4,5,6,7,8,9,10])

#### You can create a new array from a section of your array any time by specifying where you want to slice your array.

In [56]:
arr1 = a[3:8]
print(arr1)

[4 5 6 7 8]


#### Here, you grabbed a section of your array from index position 3 through index position 8 but not including position 8 itself.

#### Reminder: Array indexes begin at 0. This means the first element of the array is at index 0, the second element is at index 1, and so on.

#### You can also stack two existing arrays, both vertically and horizontally. Let’s say you have two arrays, `a1` and `a2`:

In [57]:
a1 = np.array([[1,1],
               [2,2]])
a2 = np.array([[3,3],
               [4,4]])

#### You can stack them vertically with `vstack`:

In [59]:
np.vstack((a1,a2))

array([[1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])

#### Or stack them horizontally with `hstack`:

In [60]:
np.hstack((a1,a2))

array([[1, 1, 3, 3],
       [2, 2, 4, 4]])
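
#### For 2D inputs like these, `vstack` and `hstack` give the same result as `np.concatenate` along axis 0 and axis 1 respectively. The sketch below checks this, and also shows that the stacked result is a new array rather than a view of the inputs. The name `stacked` is illustrative.

```python
import numpy as np

a1 = np.array([[1, 1],
               [2, 2]])
a2 = np.array([[3, 3],
               [4, 4]])

# For 2-D inputs, vstack matches concatenate on axis 0, hstack on axis 1.
print(np.array_equal(np.vstack((a1, a2)), np.concatenate((a1, a2), axis=0)))  # True
print(np.array_equal(np.hstack((a1, a2)), np.concatenate((a1, a2), axis=1)))  # True

# Stacking copies the data, so editing the result leaves the inputs alone.
stacked = np.vstack((a1, a2))
stacked[0, 0] = 100
print(a1[0, 0])   # 1
```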

## Basic array operations

## Broadcasting

## More useful array operations

## Creating matrices

## Generating random numbers

## How to get unique items and counts

## Transposing and reshaping a matrix

## How to reverse an array

## Reshaping and flattening multidimensional arrays

## How to access the docstring for more information

## Working with mathematical formulas

## How to save and load NumPy objects

## Importing and exporting a CSV

## Plotting arrays with Matplotlib