
Source: [NumPy: the absolute basics for beginners](https://numpy.org/doc/stable/user/absolute_beginners.html)

# NumPy: the absolute basics for beginners

Welcome to the absolute beginner's guide to NumPy!

NumPy (Numerical Python) is an open source Python library that's widely used in science and engineering. The NumPy library contains multidimensional array data structures, such as the homogeneous, N-dimensional `ndarray`, and a large library of functions that operate efficiently on these data structures.


# How to import NumPy

After installing NumPy, it may be imported into Python code like:

In [1]:
import numpy as np

The widespread convention allows access to NumPy features with a short, recognizable prefix (`np.`) while distinguishing NumPy features from others that have the same name.

# Reading the example code

Throughout the NumPy documentation, you will find blocks that look like:

In [2]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

a.shape

(2, 3)

Text preceded by `>>>` or `...` is input, the code that you would enter in a script or at a Python prompt. Everything else is output, the results of running your code. Note that `>>>` and `...` are not part of the code and may cause an error if entered at a Python prompt.

# Why use NumPy?

Python lists are excellent, general-purpose containers. They can be "heterogeneous", meaning that they can contain elements of a variety of types, and they are quite fast when used to perform individual operations on a handful of elements.

Depending on the characteristics of the data and the types of operations that need to be performed, other containers may be more appropriate; by exploiting these characteristics, we can improve speed, reduce memory consumption, and offer a high-level syntax for performing a variety of common processing tasks. NumPy shines when there are large quantities of "homogeneous" (same-type) data to be processed on the CPU.
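As a rough illustration of the "high-level syntax" mentioned above, the sketch below doubles every element of a list and of an array. The names `values`, `doubled_list`, and `doubled_array` are made up for this example.

```python
import numpy as np

values = [1, 2, 3, 4]

# With a list, an element-wise operation needs an explicit loop or comprehension.
doubled_list = [2 * v for v in values]

# With an array, the operation is written once and applied to every element.
doubled_array = 2 * np.array(values)

print(doubled_list)   # [2, 4, 6, 8]
print(doubled_array)  # [2 4 6 8]
```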

# What is an "array"?

In computer programming, an array is a structure for storing and retrieving data. We often talk about an array as if it were a grid in space, with each cell storing one element of the data. For instance, if each element of the data were a number, we might visualize a "one-dimensional" array like a list:

\
\begin{array}{|c|c|c|c|}
\hline
1 & 5 & 2 & 0 \\
\hline
\end{array}


A two-dimensional array would be like a table:

\
\begin{array}{|c|c|c|c|}
\hline
1 & 5 & 2 & 0 \\
\hline
8 & 3 & 6 & 1 \\
\hline
1 & 7 & 2 & 9 \\
\hline
\end{array}


A three-dimensional array would be like a set of tables, perhaps stacked as though they were printed on separate pages. In NumPy, this idea is generalized to an arbitrary number of dimensions, and so the fundamental array class is called `ndarray`: it represents an "N-dimensional array".

Most NumPy arrays have some restrictions. For instance:

- All elements of the array must be of the same type of data.
- Once created, the total size of the array can't change.
- The shape must be "rectangular", not "jagged"; e.g., each row of a two-dimensional array must have the same number of columns.

When these conditions are met, NumPy exploits these characteristics to make the array faster, more memory efficient, and more convenient to use than less restrictive data structures.
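The first two restrictions can be observed directly. The following is a small sketch with illustrative variable names (`mixed`, `original`, `extended`); it uses `np.append`, which returns a new array rather than growing the existing one.

```python
import numpy as np

# Mixing integers and a float: every element is stored with one common type.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# The size is fixed: np.append builds a new array and leaves the original as is.
original = np.array([1, 2, 3])
extended = np.append(original, 4)
print(original.size, extended.size)  # 3 4
```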

For the remainder of this document, we will use the word "array" to refer to an instance of `ndarray`.


# Array fundamentals

One way to initialize an array is using a Python sequence, such as a list. For example:

In [3]:
a = np.array([1, 2, 3, 4, 5, 6])
a

array([1, 2, 3, 4, 5, 6])

Elements of an array can be accessed in various ways. For instance, we can access an individual element of this array as we would access an element in the original list: using the integer index of the element within square brackets.

In [4]:
a[0]

np.int64(1)

As with built-in Python sequences, NumPy arrays are "0-indexed": the first element of the array is accessed using index `0`, not `1`.



Like the original list, the array is mutable.

In [5]:
a[0] = 10
a

array([10,  2,  3,  4,  5,  6])

Also like the original list, Python slice notation can be used for indexing.

In [6]:
a[:3]

array([10,  2,  3])

One major difference is that slice indexing of a list copies the elements into a new list, but slicing an array returns a *view*: an object that refers to the data in the original array. The original array can be mutated using the view.

In [7]:
b = a[3:]
b

array([4, 5, 6])

In [8]:
b[0] = 40
a

array([10,  2,  3, 40,  5,  6])
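When an independent copy is wanted instead of a view, one option is the array's `copy()` method. The sketch below uses `np.shares_memory` to check whether two arrays refer to the same data; the names `view` and `independent` are chosen only for this example.

```python
import numpy as np

a = np.array([10, 2, 3, 40, 5, 6])

view = a[3:]                 # slicing returns a view of the same data
independent = a[3:].copy()   # copy() allocates separate data

print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False

independent[0] = -1          # changing the copy leaves `a` untouched
print(a)                     # [10  2  3 40  5  6]
```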

Two- and higher-dimensional arrays can be initialized from nested Python sequences:

In [9]:
a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In NumPy, a dimension of an array is sometimes referred to as an "axis". This terminology may be useful to disambiguate between the dimensionality of an array and the dimensionality of the data represented by the array. For instance, the array `a` could represent three points, each lying within a four-dimensional space, but `a` has only two "axes".

Another difference between an array and a list of lists is that an element of the array can be accessed by specifying the index along each axis within a *single* set of square brackets, separated by commas. For instance, the element `8` is in row `1` and column `3`:


In [10]:
a[1, 3]

np.int64(8)

It is familiar practice in mathematics to refer to elements of a matrix by the row index first and the column index second. This happens to be true for two-dimensional arrays, but a better mental model is to think of the column index as coming *last* and the row index as *second to last*. This generalizes to arrays with any number of dimensions.
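To see how this mental model extends beyond two dimensions, here is a small sketch with a three-dimensional array. The array `c` is hypothetical and is built with `np.arange` and `reshape` purely for convenience.

```python
import numpy as np

# Two "pages", each with 3 rows and 4 columns, holding the values 0..23.
c = np.arange(24).reshape(2, 3, 4)

print(c.shape)     # (2, 3, 4)

# Page 1, row 2, column 3: the column index comes last,
# the row index second to last.
print(c[1, 2, 3])  # 23
```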

You might hear of a 0-D (zero-dimensional) array referred to as a "scalar", a 1-D (one-dimensional) array as a "vector", a 2-D (two-dimensional) array as a "matrix", or an N-D (N-dimensional, where "N" is typically an integer greater than 2) array as a "tensor". For clarity, it is best to avoid the mathematical terms when referring to an array because the mathematical objects with these names behave differently than arrays (e.g., "matrix" multiplication is fundamentally different from "array" multiplication), and there are other objects in the scientific Python ecosystem that have these names (e.g., the fundamental data structure of PyTorch is the "tensor").
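The difference between "array" multiplication and "matrix" multiplication can be seen with two small 2-D arrays. In this sketch, `p` and `q` are illustrative names; `*` multiplies element by element, while the `@` operator computes the matrix product.

```python
import numpy as np

p = np.array([[1, 2],
              [3, 4]])
q = np.array([[5, 6],
              [7, 8]])

elementwise = p * q   # [[ 5, 12], [21, 32]]
matrix_prod = p @ q   # [[19, 22], [43, 50]]

print(elementwise)
print(matrix_prod)
```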


# Array attributes

*This section covers the* `ndim`, `shape`, `size`, *and* `dtype` *attributes of an array*.

The number of dimensions of an array is contained in the `ndim` attribute.

In [11]:
a.ndim

2

The shape of an array is a tuple of non-negative integers that specify the number of elements along each dimension.

In [12]:
a.shape

(3, 4)

In [13]:
len(a.shape) == a.ndim

True

The fixed, total number of elements in an array is contained in the `size` attribute.

In [14]:
a.size

12

In [15]:
import math
a.size == math.prod(a.shape)

True

Arrays are typically "homogeneous", meaning that they contain elements of only one "data type". The data type is recorded in the `dtype` attribute.

In [16]:
a.dtype

dtype('int64')

# How to create a basic array?

*This section covers* `np.zeros()`, `np.ones()`, `np.empty()`, `np.arange()`, `np.linspace()`

Besides creating an array from a sequence of elements, you can easily create an array filled with `0`'s:

In [17]:
np.zeros(2)

array([0., 0.])

Or an array filled with `1`'s:

In [18]:
np.ones(2)

array([1., 1.])

Or even an empty array! The function `empty` creates an array whose initial content is arbitrary: it depends on whatever happens to be in memory at the time. The reason to use `empty` over `zeros` (or something similar) is speed. Just make sure to fill every element afterwards!

In [28]:
# Create an empty array with 20 elements
np.empty(20)

array([0. , 0. , 0.3, 1. , 0. , 0. , 1. , 1. , 1. , 1. , 1. , 1. , 1. ,
       0. , 0. , 1. , 0.5, 0. , 0. , 1. ])
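A typical pattern, sketched below under the assumption that every element will be assigned before it is read, is to allocate with `np.empty` and then fill the array. The name `squares` is only illustrative.

```python
import numpy as np

squares = np.empty(5)        # contents are arbitrary at this point

for i in range(squares.size):
    squares[i] = i ** 2      # overwrite every element before reading it

print(squares)  # [ 0.  1.  4.  9. 16.]
```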

You can create an array with a range of elements:

In [29]:
np.arange(4)

array([0, 1, 2, 3])

And even an array that contains a range of evenly spaced values. To do this, specify the first number, the last number (which is excluded from the result), and the step size.

In [30]:
np.arange(2, 9, 2)

array([2, 4, 6, 8])
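Note that `np.arange` stops before the last number, so that value never appears in the result. A short sketch (the variable names are only for illustration):

```python
import numpy as np

up_to_nine = np.arange(2, 9, 2)
up_to_eight = np.arange(2, 8, 2)

print(up_to_nine)   # [2 4 6 8]
print(up_to_eight)  # [2 4 6]  (8 itself is not included)
```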

You can also use `np.linspace()` to create an array with values that are spaced linearly in a specified interval:

In [31]:
np.linspace(0, 10, num=5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

In [32]:
np.linspace(0, 10, num=6)

array([ 0.,  2.,  4.,  6.,  8., 10.])

In [33]:
np.linspace(0, 10, num=7)

array([ 0.        ,  1.66666667,  3.33333333,  5.        ,  6.66666667,
        8.33333333, 10.        ])
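Unlike `np.arange`, `np.linspace` includes the last number by default, so the spacing works out to `(stop - start) / (num - 1)`. As a sketch, the `retstep=True` keyword asks `np.linspace` to return that spacing alongside the array; `values` and `step` are illustrative names.

```python
import numpy as np

values, step = np.linspace(0, 10, num=5, retstep=True)

print(values)  # [ 0.   2.5  5.   7.5 10. ]
print(step)    # 2.5

# The spacing matches (stop - start) / (num - 1).
print(step == (10 - 0) / (5 - 1))  # True
```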

**Specifying your data type**

While the default data type is floating point (`np.float64`), you can explicitly specify which data type you want using the `dtype` keyword.

In [34]:
x = np.ones(2, dtype=np.int64)
x

array([1, 1])
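The data type can also be changed after creation. One way, shown in the sketch below, is the `astype` method, which returns a new array with the requested type and leaves the original unchanged. The name `y` is only for illustration.

```python
import numpy as np

x = np.ones(2, dtype=np.int64)
y = x.astype(np.float64)   # new array; x keeps its integer type

print(x.dtype, y.dtype)    # int64 float64
print(y)                   # [1. 1.]
```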

# Adding, removing, and sorting elements

# How do you know the shape and size of an array?

# Can you reshape an array?

# How to convert a 1D array into a 2D array (how to add a new axis to an array)

# Indexing and slicing

# How to create an array from existing data?

# Basic array operations

# Broadcasting

# More useful array operations

# Creating matrices

# Generating random numbers

# How to get unique items and counts

# Transposing and reshaping a matrix

# How to reverse an array

# Reshaping and flattening multidimensional arrays

# How to access the docstring for more information

# Working with mathematical formulas

# How to save and load NumPy objects

# Importing and exporting a CSV

# Plotting arrays with Matplotlib