# Dimensions & Shapes

First things first. Let us understand what a matrix really is within Python. Is it a list, is it a tuple? Let's find out.

To start, let's create a list.

In [1]:
list1 = [1,2,3,4,5]
print(list1)

[1, 2, 3, 4, 5]


Now that we have a list, let us convert this to an ndarray.

In [2]:
import numpy as np

mat1 = np.array(list1)
print(mat1)

[1 2 3 4 5]


What changed? Nothing, it seems. `mat1` looks exactly the same as `list1`. Well, not really. A lot has changed. They appear almost the same when you print them (only the commas are missing), but they are completely different objects.

`list1` is of type `list`, whereas `mat1` is of type `ndarray`. `ndarray` is a class within the NumPy package.

Let us confirm this by inspecting the two variables. The trailing `?` is IPython/Jupyter syntax that displays information about an object, including its type.

In [3]:
list1?

In [4]:
mat1?
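
The `?` syntax only works inside IPython or Jupyter. As a minimal sketch, the same check can be done in plain Python with the built-in `type()` and `isinstance()` functions. The variables are recreated here so that the snippet runs on its own.

```python
import numpy as np

list1 = [1, 2, 3, 4, 5]
mat1 = np.array(list1)

# type() reports the class of each object
print(type(list1))  # <class 'list'>
print(type(mat1))   # <class 'numpy.ndarray'>

# isinstance() gives a plain True/False answer
print(isinstance(mat1, np.ndarray))   # True
print(isinstance(list1, np.ndarray))  # False
```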

OK, so the list we created becomes a 1D array once it is converted with NumPy. Let us understand more about 1D arrays in Python.

# 1D Array 

So, how do we define a one-dimensional array, or 1D array?

*Informally, an array is said to be 1D (one-dimensional) if either the number of rows or the number of columns is 1.* NumPy is stricter about this, as we will see when we look at shapes.

Yes, our `mat1` is a 1D array, because it has one row and 5 columns.

In linear algebra, a 1D array is simply called a **vector**. We might use the terms `vector` and `1D array` interchangeably, so get used to it. Both mean the same thing.

Now, what if we wanted to create a 1D array that has 1 column and 4 rows? How would we do that? We are going to do that using a `list` first and then convert it to an `ndarray`.


In [5]:
list2 = [[1],[2],[3],[4]]
mat2 = np.array(list2)
print(mat2)

[[1]
 [2]
 [3]
 [4]]


Interesting, huh? What just happened? In a literal sense, the line `list2 = [[1],[2],[3],[4]]` actually created 5 independent lists.

1. [1] -> first list
2. [2] -> second list
3. [3] -> third list
4. [4] -> fourth list
5. [....] -> fifth list (the outer main list)


To create a vertical 1D array, or as you can say a column vector, we basically added lists within the primary list. We created the main list, but instead of adding plain numbers `1`, `2`, `3`, `4`, to the list, we put the numbers in a list of their own, each of size 1, and then added each of the lists `[1]`, `[2]`, `[3]` and `[4]` to the primary list to create a 1D column array. 

Let us spend some more time to understand the datatype of each item here. We have to understand very clearly what is going on. To slice a matrix, find a transpose, or invert a matrix, the fundamental concepts of how the data is arranged in a matrix have to be very clear.

Let us look at the type of `list2`.

In [6]:
list2?

As expected, `list2` is of type `list`.

Now what about the type of the element within the list? 

In [7]:
list2item1 = list2[0]
list2item1?

`list2item1`, which represents the first element of `list2`, is also of type `list`. Just to be sure, how does this compare to `list1item1`?

In [8]:
list1item1 = list1[0]
list1item1?

And there we go. `list1item1` is of type `int` and not `list`. This shows that to create a row array we created a single list, but to create a column array we created 5 lists. Some developers coming from C/C++ might call a vertical 1D matrix a 2D matrix. You would not be incorrect in doing so: as the shape checks later in this lesson show, NumPy itself treats it as 2D.
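
Outside of Jupyter, the same comparison can be sketched with `type()`. The snippet below recreates both lists and also counts the lists involved in `list2`; the variable name `inner_lists` is only illustrative.

```python
list1 = [1, 2, 3, 4, 5]
list2 = [[1], [2], [3], [4]]

# Elements of the flat list are plain integers
print(type(list1[0]))     # <class 'int'>

# Elements of the nested list are themselves lists...
print(type(list2[0]))     # <class 'list'>

# ...and the integers sit one level deeper
print(type(list2[0][0]))  # <class 'int'>

# 4 inner lists plus the 1 outer list gives 5 lists in total
inner_lists = sum(isinstance(item, list) for item in list2)
print(inner_lists + 1)    # 5
```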

Let us see how to create a 2D matrix

# 2D Array 

An array that has **1D arrays as its elements** is called a 2D array. This is the same as the column array that we saw above, but the inner arrays typically have more than one element each. Below is an example.

In [9]:
list2d = [[1,1],[2,2],[3,3],[4,4]]

mat2d = np.array(list2d)

print(mat2d)

[[1 1]
 [2 2]
 [3 3]
 [4 4]]


We first created a 2 dimensional list, and then converted that list into an `ndarray`. As a result, what we have effectively got is an array that has 4 rows, and each row has 2 columns. 

Keep in mind, each row is actually an independent list. 

``` Python
list2d = [[1,1],[2,2],[3,3],[4]]
mat2d = np.array(list2d)
```

Do you think we could use an uneven combination like the one shown above? The last list has just 1 element in it instead of 2. The answer is, you cannot do this. Every inner list must have the same length for `ndarray` to be able to build a matrix from it; otherwise recent versions of NumPy raise an error. Older versions instead produced an array of Python list objects rather than a matrix.
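
One way to guard against this is to check the row lengths before converting. The sketch below uses a hypothetical helper, `is_rectangular`, which is not part of NumPy. It also shows that explicitly requesting `dtype=object` does not produce a matrix either: it simply stores the inner lists as-is.

```python
import numpy as np

def is_rectangular(rows):
    """Hypothetical helper: True if every inner list has the same length."""
    return len({len(row) for row in rows}) == 1

even = [[1, 1], [2, 2], [3, 3], [4, 4]]
uneven = [[1, 1], [2, 2], [3, 3], [4]]

print(is_rectangular(even))    # True
print(is_rectangular(uneven))  # False

# With dtype=object NumPy keeps the 4 inner lists as Python objects.
# The result is a 1D array of lists, not a 4x2 matrix.
ragged = np.array(uneven, dtype=object)
print(ragged.shape)  # (4,)
```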

## Why use 2D arrays?

It is simple, actually. Most of the datasets you use will be in 2D form. If you read data from an Excel file, it will be in 2D. You can easily represent any tabular data (rows and columns) in 2D, and most of the structured data you encounter will be in 2D form. Excel files and CSV files are the most commonly encountered forms of 2D data.

Here are some example datasets:<br>
1) [Titanic dataset](https://www.kaggle.com/c/titanic)<br>
2) [Auto-MPG dataset](https://www.kaggle.com/uciml/autompg-dataset)<br>
3) [Credit Card Fraud Detection dataset](https://www.kaggle.com/mlg-ulb/creditcardfraud)<br>

We can represent all of these datasets as 2D matrices. 
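
As a rough illustration of tabular data becoming a 2D array, the sketch below parses a tiny CSV held in a string. The column names and values are made up for this example and are not taken from any of the datasets listed above.

```python
import io
import numpy as np

# Made-up values purely for illustration
csv_text = """height_cm,weight_kg,age
170,65.5,30
182,80.0,41
158,52.3,25
"""

# skiprows=1 skips the header line; each remaining line becomes one row
table = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

print(table.shape)  # (3, 3)
print(table.ndim)   # 2
```

Each row of the array corresponds to one record, and each column to one field of the table.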

# 3D Array 

An array that has **2D arrays as its elements** is called a 3D array. As you can see in the code block below, there are two 2D arrays stacked on top of each other as elements. The process to create a 3D array is the same as that of creating a 2D array. Let's look at an example.

In [17]:
list2d_1 = [[1,1],[2,2],[3,3],[4,4]]
list2d_2 = [[5,5],[6,6],[7,7],[8,8]]

list3d = [list2d_1, list2d_2]

mat3d = np.array(list3d)
print(mat3d)

[[[1 1]
  [2 2]
  [3 3]
  [4 4]]

 [[5 5]
  [6 6]
  [7 7]
  [8 8]]]


In [18]:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr3d)

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]


We have basically combined 2 different 2D arrays to make a 3D array. Again, all lists must be appropriately sized to make a fully formed 3D matrix; else we will receive an error.
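
If the 2D pieces already exist as arrays rather than lists, `np.stack` is one way to combine them. This is a minimal sketch; the names `mat2d_1` and `mat2d_2` are chosen here to mirror the lists used above.

```python
import numpy as np

mat2d_1 = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])
mat2d_2 = np.array([[5, 5], [6, 6], [7, 7], [8, 8]])

# np.stack joins arrays of identical shape along a new leading axis
mat3d = np.stack([mat2d_1, mat2d_2])
print(mat3d.shape)  # (2, 4, 2)

# Indexing the first axis gives back the original 2D arrays
print(np.array_equal(mat3d[0], mat2d_1))  # True
print(np.array_equal(mat3d[1], mat2d_2))  # True
```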

## Why use 3D arrays?

Well, there is a lot of content that comes in 3D form. We basically have rows and columns of data, but then each cell in this 2D matrix has more than one property associated with it, and that makes up the 3rd dimension.

An example use case of a 3D array is image data. An image consists of a 2 dimensional space of pixels, and then each pixel has an RGB value. The RGB stands for Red, Green and Blue colors. Combinations of the RGB colors make the effective color of the pixel. 

Let's take a look at an image, and how the data corresponding to it would appear.

An **image** is simply a matrix of pixels. (Check out the image below.)
![3D](./images/Numpy_lsn2.jpg)

Each element in the above matrix is a pixel value in that image.

<br> In the shape: <br>1) 500 - Width of the image.<br>2) 800 - Height of the image.<br>3) 3 - Number of color channels in the image.<br>

Each image has color channels (RGB: Red, Green, Blue) associated with it, which carry the colour information of the image. A pixel coordinate combined with a channel identifies a single value, and together these values produce an image with vivid colours. Therefore, images are 3 dimensional matrices.
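
To make this concrete without loading a real file, the sketch below builds a tiny blank image directly. The sizes are hypothetical, and the `(rows, columns, channels)` axis order is an assumption; it is a common layout for image arrays, but the library you use to load images decides the actual order.

```python
import numpy as np

height, width, channels = 4, 6, 3  # hypothetical tiny image

# uint8 is an 8-bit unsigned integer type, often used for pixel values
image = np.zeros((height, width, channels), dtype=np.uint8)

print(image.shape)  # (4, 6, 3)
print(image.ndim)   # 3

# A single pixel is a 1D array holding one value per channel
print(image[0, 0].shape)  # (3,)
```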

## How RGB color codes work

The combination of these 3 colors can supposedly create any color that we see. Computer displays and television sets render all colors as combinations of RGB values. The value of each color is a number from 0 to 255.

So R = 0 would mean the color R is fully absent, and R = 255 would mean the color R is fully present.

(R = 255, G = 255, B = 255) would mean that the colors Red, Green and Blue are all fully present and mixed in equal proportions. The effective color produced by mixing all three equally at full intensity is White.

(R = 0, G = 0, B = 0) would mean no color is mixed. If all colors are absent, then we do not have a color. Effectively, what we get is Black.

(R = 127, G = 127, B = 127): what would this mean? All colors are again mixed equally. So would we get White? The answer is yes and no. Since all are mixed equally, the color has no tint, just like white. But the intensity of each color represents brightness, so this is a white that is only about half as bright as (255,255,255). What does a less bright white make? The answer is grey. We would thereby have a 50% grey shade. (255,255,255) would similarly be 100% grey, but 100% grey is called white.

Let's take a few more examples: 

> (R = 255, G = 0, B = 0) would make solid Red color

> (R = 0, G = 255, B = 0) would make solid Green color

> (R = 0, G = 0, B = 255) would make solid Blue color

> (R = 127, G = 0, B = 0) would make dark Red color

> (R = 50, G = 0, B = 0) would make an even darker shade of Red
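
These color codes map directly onto small arrays. The sketch below is only illustrative: the variable names are made up, and each color is stored as a 1D array of three 8-bit values.

```python
import numpy as np

red = np.array([255, 0, 0], dtype=np.uint8)
green = np.array([0, 255, 0], dtype=np.uint8)
blue = np.array([0, 0, 255], dtype=np.uint8)

# Halving the red value gives the darker red from the list above
dark_red = red // 2
print(dark_red.tolist())  # [127, 0, 0]

# Three pixels side by side form an image 1 pixel high and 3 pixels wide
strip = np.array([[red, green, blue]])
print(strip.shape)  # (1, 3, 3)
```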


## Converting to Greyscale

In many ML projects, you will convert RGB images to greyscale images. Greyscale images are nothing but black and white images. Such images only make use of shades of grey. 0% grey would mean black, and 100% grey would mean white. Between these exist several other shades of grey, with a total of 256 grey shades possible.

Since producing grey always uses equal proportions of Red, Green and Blue, we can represent each pixel of a greyscale image by a single number.

50% grey, which is (127,127,127), could be represented by just the number 127. And this is the standard convention. For greyscale images we specify a single value for the amount of grey used, in the range of 0 to 255, without specifying the individual RGB values.
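
A minimal sketch of such a conversion is to average the three channel values of each pixel. Plain averaging is an assumption made here for simplicity; image libraries commonly use a weighted combination of the channels instead.

```python
import numpy as np

# A hypothetical 2x2 RGB image in which every pixel is (127, 127, 127)
rgb = np.full((2, 2, 3), 127, dtype=np.uint8)

# Average over the channel axis (the last one) to get one value per pixel
grey = rgb.mean(axis=2).astype(np.uint8)

print(rgb.shape)      # (2, 2, 3)
print(grey.shape)     # (2, 2)
print(grey.tolist())  # [[127, 127], [127, 127]]
```

The channel axis disappears, so the 3D array becomes a 2D one.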


## Greyscale images are 2D

Since we don't need RGB values for greyscale images, we can fit the pixel data of a greyscale image in a 2D matrix. The rows and columns represent pixels, and each pixel then holds an integer value in the range of 0 to 255.

```
[[0,0],[0,0]]
```

The above list can represent an image of size 2x2. It is a greyscale image. There are 4 pixels in the image, and all pixels are black. So what we have here is a 2x2 square black image.


```
[[255,255], [255,255]]
```

The above list represents a similar 2x2 greyscale image, but this one is a square white image. 
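
As a quick sketch, the two lists above can be turned into arrays to confirm that a greyscale image is 2D. The names `black` and `white` are chosen here for illustration.

```python
import numpy as np

black = np.array([[0, 0], [0, 0]], dtype=np.uint8)
white = np.array([[255, 255], [255, 255]], dtype=np.uint8)

print(black.shape, black.ndim)  # (2, 2) 2
print(white.shape, white.ndim)  # (2, 2) 2

# The same images can be built without typing every pixel by hand
print(np.array_equal(black, np.zeros((2, 2), dtype=np.uint8)))      # True
print(np.array_equal(white, np.full((2, 2), 255, dtype=np.uint8)))  # True
```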

**A quick note on shapes,** 

1) For a 1 dimensional matrix with 6 elements, the shape **(6,)** denotes that the matrix is one-dimensional and there are 6 elements in it.<br>
2) For a 3 dimensional matrix such as `arr3d` above, the shape **(2,2,3)** denotes that there are 2 matrices of shape (2,3). 


# More than 3 dimensions

As the level of complexity increases with the data, the dimensions keep increasing and algebraically we call it **N-Dimensions**. We can use `ndarray` to make arrays of any number of dimensions. 

To make a 4 dimensional array, we would simply put 3D arrays into a list; and so on and so forth to make any number of dimensions. 
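
The recipe above can be sketched as follows, reusing the values of `arr3d` from earlier. The `ndim` and `shape` attributes report the number of dimensions and the size along each one.

```python
import numpy as np

arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

# Put two 3D arrays into a list and convert: the result has 4 dimensions
arr4d = np.array([arr3d, arr3d])

print(arr3d.ndim, arr3d.shape)  # 3 (2, 2, 3)
print(arr4d.ndim, arr4d.shape)  # 4 (2, 2, 2, 3)
```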

# Determining Shape & Dimensions

The number of dimensions and items in an array is defined by its shape, which is a tuple of N non-negative integers that specify the sizes of each dimension. The type of items in the array is specified by a separate data-type object (dtype), one of which is associated with each n-dimensional array(ndarray).

Let's take a look.

In [19]:
list2d = [[1,1],[2,2],[3,3],[4,4]]
mat2d = np.array(list2d)

print(mat2d.shape)
print(mat2d.ndim)

(4, 2)
2


The shape is returned in the form of a tuple. We use the `shape` attribute to fetch the shape of a matrix. Note that it is an attribute, not a function, so it is written without parentheses. Here 4 is the number of rows, and 2 is the number of columns. We are basically looking at a 4x2 matrix.

We also use the `ndim` attribute to determine the number of dimensions of the matrix. Here it returns `2`, which means this is a 2D matrix.
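
A few related facts can be checked in a short sketch: the number of dimensions equals the length of the shape tuple, and the `size` attribute (the total number of items) equals the product of the entries in the shape. The exact integer `dtype` depends on the platform, so the snippet only checks that it is some integer type.

```python
import numpy as np

mat2d = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])

# ndim is simply the number of entries in the shape tuple
print(len(mat2d.shape) == mat2d.ndim)  # True

# size is the total number of items: rows * columns for a 2D array
rows, cols = mat2d.shape
print(mat2d.size)                 # 8
print(rows * cols == mat2d.size)  # True

# dtype describes the type of the items stored in the array
print(np.issubdtype(mat2d.dtype, np.integer))  # True
```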

Let us try something interesting. Getting the shape of a 1D matrix. 

In [20]:
mat1 = np.array([1,2,3,4])
print(mat1)
print(mat1.shape)
print(mat1.ndim)

[1 2 3 4]
(4,)
1


What we got is very interesting. We would have assumed this matrix to have 1 row and 4 columns. This is not actually true. The shape has a single axis of length 4 and no second dimension, which is represented as `(4,)`. For most practical purposes you can still think of it as a single row of 4 values.

We can confirm that this is a 1D matrix, as the `ndim` attribute returns the value `1`.

Let's try a vertical matrix.

In [21]:
mat2 = np.array([[1],[2],[3],[4]])
print(mat2)
print(mat2.shape)
print(mat2.ndim)

[[1]
 [2]
 [3]
 [4]]
(4, 1)
2


Now we see that NumPy officially considers this a 2D matrix. The shape is `(4,1)`, which means 4 rows and 1 column. The shapes `(4,)` and `(4,1)` describe very different arrangements of data. We need to understand this fully and keep it in mind at all times. The `ndim` attribute also returns the value `2`, thereby confirming that this is indeed a 2D matrix.
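
Because the two shapes behave differently, it helps to know how to move between them. The sketch below shows a few standard NumPy ways to do so; the variable names are only illustrative.

```python
import numpy as np

mat1 = np.array([1, 2, 3, 4])   # shape (4,)

# Two equivalent ways to turn the 1D array into a column
col_a = mat1.reshape(4, 1)
col_b = mat1[:, np.newaxis]
print(col_a.shape, col_b.shape)  # (4, 1) (4, 1)

# ravel() flattens the column back to 1D
print(col_a.ravel().shape)       # (4,)

# Transposing a 1D array changes nothing, but a column becomes a row
print(mat1.T.shape)              # (4,)
print(col_a.T.shape)             # (1, 4)
```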

Let's now try to get the shape of a 3D matrix.

In [15]:
mat3 = np.array([[[1],[2]],[[1],[2]]])
print(mat3)
print(mat3.shape)
print(mat3.ndim)

[[[1]
  [2]]

 [[1]
  [2]]]
(2, 2, 1)
3


The above example clearly shows that this is a 3D matrix, in which there are 2 individual matrices, each of shape `(2,1)`.

So out of curiosity, how would we create a `(2,2,2)` matrix?

In [16]:
mat3 = np.array([[[1,1],[2,2]],[[1,1],[2,2]]])
print(mat3)
print(mat3.shape)

[[[1 1]
  [2 2]]

 [[1 1]
  [2 2]]]
(2, 2, 2)


There you go. We have a (2,2,2) matrix. This is also the equivalent of a Cube. 
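
Typing nested lists by hand gets tedious as dimensions grow. As a minimal sketch, a `(2,2,2)` array can also be produced by reshaping a flat range of 8 values, or by asking NumPy for an array of ones of that shape.

```python
import numpy as np

# 8 values (0 to 7) rearranged into 2 matrices of 2 rows and 2 columns
cube = np.arange(8).reshape(2, 2, 2)
print(cube.shape)     # (2, 2, 2)
print(cube.size)      # 8
print(cube[1, 1, 1])  # 7

# An array of the same shape filled with ones
ones = np.ones((2, 2, 2))
print(ones.shape)     # (2, 2, 2)
```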