# Creating and Indexing Arrays

We will start with the most fundamental task in NumPy: creating and working with an array. An **array** is a vector, matrix, or tensor of numbers. In plain English, it is a grid of numeric values on which we can efficiently perform numeric operations, such as those used in statistics or machine learning.



## Why NumPy and Vectorization? 

In Python, you may be familiar with collections like lists. We could take numbers, such as integers or floating-point values, and put them inside lists like this:

In [2]:
x = [1.2, 5.4, 7.3]
y = [3.2, 5.1, 2.8]

print(f"x = {x}")
print(f"y = {y}")

x = [1.2, 5.4, 7.3]
y = [3.2, 5.1, 2.8]


Now let's say we wanted to add each of the respective elements of $ x $ and $ y $ together. If we were to use the Python `+` operator, it would concatenate the two lists instead. In a linear algebra or numeric computing context, this is not what we want.

In [9]:
x = [1.2, 5.4, 7.3]
y = [3.2, 5.1, 2.8]

x + y 

[1.2, 5.4, 7.3, 3.2, 5.1, 2.8]

To achieve this in plain Python, we would have to use a `for` loop or list comprehension with a `zip()` like below:  

In [10]:
x = [1.2, 5.4, 7.3]
y = [3.2, 5.1, 2.8]

[_x + _y for _x,_y in zip(x,y)]

[4.4, 10.5, 10.1]

On top of that, **plain Python is REALLY slow** at numeric computing in this manner, because every element is a full Python object and the loop runs in the interpreter. Python's computational efficiency comes from low-level libraries like NumPy, which are written largely in C.

So what would this look like with NumPy using its `ndarray` type? Let's take a look. 

In [6]:
import numpy as np 

x = np.array([1.2, 5.4, 7.3])
y = np.array([3.2, 5.1, 2.8])

x + y 

array([ 4.4, 10.5, 10.1])

Note how we bring in `numpy` and alias it as `np`, a common best practice. We then declare two arrays using the `array()` function, passing a list of numeric values to each call. We can then add the two arrays together using the `+` operator.

This **vectorized** approach to mathematical operations is a lot more efficient and convenient: it avoids many `for` loops and leverages NumPy's optimized internals. More specifically, NumPy is optimized to store lists or grids of numbers and to perform mathematical operations between them. When we have lots of data and want to take advantage of the fact that CPUs and GPUs can do math more efficiently on multiple values at once, vectorization becomes a must-have.
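To make the efficiency claim concrete, here is a rough timing sketch that compares the list comprehension approach with the vectorized one. The array size, the repeat count, and the function names are arbitrary choices for illustration and are not from the original notebook. Exact timings will vary by machine, so no expected numbers are shown.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary size chosen for illustration

# The same data, stored as plain Python lists and as NumPy arrays
x_list = [float(i) for i in range(n)]
y_list = [float(i) * 2 for i in range(n)]
x_arr = np.array(x_list)
y_arr = np.array(y_list)


def add_with_loop():
    """Element-wise addition in plain Python."""
    return [a + b for a, b in zip(x_list, y_list)]


def add_vectorized():
    """Element-wise addition with NumPy."""
    return x_arr + y_arr


loop_result = add_with_loop()
vec_result = add_vectorized()

# Time 20 runs of each approach
loop_seconds = timeit.timeit(add_with_loop, number=20)
vec_seconds = timeit.timeit(add_vectorized, number=20)

print(f"list comprehension: {loop_seconds:.4f} s")
print(f"NumPy vectorized:   {vec_seconds:.4f} s")
```

Both approaches produce the same values. On typical hardware the vectorized version should finish noticeably faster, though the gap depends on the data size and the machine.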

You may hear a list of numbers referred to as a **vector**, a two-dimensional grid of numbers referred to as a **matrix**, and a grid of numbers in any number of dimensions (most often three or more) referred to as a **tensor**.
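In NumPy terms, these names correspond to the number of dimensions of the array, which can be inspected with the `ndim` and `shape` attributes. The small sketch below uses made-up values and illustrative variable names.

```python
import numpy as np

vector = np.array([1.2, 5.4, 7.3])            # 1 dimension
matrix = np.array([[6, 1, 17],
                   [3, 0, 3]])                # 2 dimensions: rows and columns
tensor = np.array([[[0, 1], [2, 3]],
                   [[4, 5], [6, 7]]])         # 3 dimensions: a stack of grids

for name, arr in [("vector", vector), ("matrix", matrix), ("tensor", tensor)]:
    print(f"{name}: ndim={arr.ndim}, shape={arr.shape}")

# vector: ndim=1, shape=(3,)
# matrix: ndim=2, shape=(2, 3)
# tensor: ndim=3, shape=(2, 2, 2)
```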

## Declaring an Array 

Let's look more closely at the array, or more specifically the `ndarray`. This is probably the most fundamental data type in NumPy. As we saw earlier, we can declare it by passing a simple numeric list to the `array()` function.

In [21]:
x = np.array([6, 1, 17, 3, 0, 3]) 
x

array([ 6,  1, 17,  3,  0,  3])

In [23]:
type(x)

numpy.ndarray
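Besides its Python type, every `ndarray` carries a `dtype` describing the type of its elements, which NumPy infers from the values passed to `array()`. The following is a minimal sketch of that inference. The variable names are illustrative and not from the original notebook.

```python
import numpy as np

ints = np.array([6, 1, 17, 3, 0, 3])
floats = np.array([1.2, 5.4, 7.3])
mixed = np.array([1, 2.5, 3])  # integers are upcast so all elements share one type

# The exact integer width (for example int64 or int32) depends on the platform
# and NumPy version, so no expected output is shown for this line.
print(ints.dtype)

print(floats.dtype)  # float64
print(mixed.dtype)   # float64
```

Because all elements share a single `dtype`, NumPy can store them compactly and operate on them without checking the type of each value.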

That is a 1-dimensional array. An array can have 0 dimensions, meaning it is just a single scalar value. 

In [24]:
np.array(5)

array(5)
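To confirm that this really is a 0-dimensional array, we can check its `ndim` and `shape` attributes. The sketch below also uses `item()` to pull the value back out as a plain Python number. The name `scalar` is illustrative.

```python
import numpy as np

scalar = np.array(5)

print(scalar.ndim)    # 0
print(scalar.shape)   # ()
print(scalar.item())  # 5, as a plain Python int
```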

We can also make a 2-dimensional array, meaning we have an array consisting of rows and columns. You can do this by nesting lists `[]` inside a list `[[]]`. 

In [26]:
x = np.array([[6, 1, 17], 
              [3, 0, 3]]) 
x

array([[ 6,  1, 17],
       [ 3,  0,  3]])
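The vectorized arithmetic we saw earlier carries over to 2-dimensional arrays as well. As a brief sketch, with `doubled` and `summed` as illustrative names:

```python
import numpy as np

x = np.array([[6, 1, 17],
              [3, 0, 3]])

doubled = x * 2  # every element is multiplied by 2
summed = x + x   # element-wise addition of two arrays of the same shape

print(doubled)
print(summed)
```

Both expressions produce a 2x3 array with each original value doubled, and no explicit loops over rows or columns are needed.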

You can get really crazy, declaring higher-dimensional tensors where stacks and stacks of numeric grids represent image or video data. Below we have a 3x3 pixel image stored as a tensor, where the red, green, and blue channels are each stored as their own 3x3 layer. The point is that you can ingest data and store it in whatever higher-dimensional numeric structure the problem calls for.

In [36]:
my_image = np.array([
    [[0, 1, 3],
     [6, 2, 6], 
     [1, 5, 4]], 
    [[8, 3, 19],
     [33, 34, 11], 
     [13, 14, 89]], 
    [[14, 68, 17],
     [66, 84, 92], 
     [4, 2, 58]]
])

my_image

array([[[ 0,  1,  3],
        [ 6,  2,  6],
        [ 1,  5,  4]],

       [[ 8,  3, 19],
        [33, 34, 11],
        [13, 14, 89]],

       [[14, 68, 17],
        [66, 84, 92],
        [ 4,  2, 58]]])
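To see the layered structure, we can check the shape of the tensor and pull out a single layer by its position. This sketch assumes the layers are ordered red, green, then blue, which the text above implies but does not state explicitly. The names `red` and `green` are illustrative.

```python
import numpy as np

my_image = np.array([
    [[0, 1, 3],
     [6, 2, 6],
     [1, 5, 4]],
    [[8, 3, 19],
     [33, 34, 11],
     [13, 14, 89]],
    [[14, 68, 17],
     [66, 84, 92],
     [4, 2, 58]]
])

print(my_image.shape)  # (3, 3, 3): 3 layers, each with 3 rows and 3 columns

red = my_image[0]    # first layer (assumed to be the red channel)
green = my_image[1]  # second layer (assumed to be the green channel)

print(red)
print(green[2, 2])   # 89, the bottom-right value of the second layer
```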

## Differences from Pandas 

## Honorable Mention to SymPy 

Here is a little-known trick. If you `pip install sympy` and pass an `ndarray` to its `Matrix` class, your Jupyter notebook will format the array quite nicely. SymPy is a great library for doing symbolic math, which is useful for exploration, but it is not as scalable or efficient as NumPy.

In [35]:
from sympy import Matrix
Matrix(x) 

NotImplementedError: SymPy supports just 1D and 2D matrices
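The error message above states that `Matrix` supports only 1D and 2D input, so it is raised when the array passed in has three or more dimensions. As a minimal sketch, assuming SymPy is installed, passing a 2D array should work. The names `grid` and `m` are illustrative.

```python
import numpy as np
from sympy import Matrix

grid = np.array([[6, 1, 17],
                 [3, 0, 3]])

m = Matrix(grid)  # 2D input is accepted
print(m.shape)    # (2, 3)

# As the last expression in a notebook cell, this should render as a typeset matrix
m
```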