## Demonstration of the reshape function

You will often find the reshape function useful when working with data. It allows you to change the shape of an array without changing its data. For example, if you have a one-dimensional array with 12 elements, you can reshape it into a 3x4 array. The only requirement is that the total number of elements stays the same: the product of the new dimensions must equal the size of the original array.
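To make the size requirement concrete, here is a minimal sketch. The names `flat`, `grid`, and `error_message` are illustrative and do not appear elsewhere in this notebook. It shows that NumPy raises a `ValueError` when the requested shape does not hold exactly the same number of elements.

```python
import numpy as np

flat = np.arange(12)          # 12 elements: 0, 1, ..., 11

grid = flat.reshape(3, 4)     # works: 3 * 4 == 12
print(grid.shape)             # (3, 4)

error_message = None
try:
    flat.reshape(5, 2)        # fails: 5 * 2 == 10, not 12
except ValueError as exc:
    error_message = str(exc)  # keep the message so we can inspect it
print(error_message)
```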

In [3]:
import numpy as np
import pandas as pd

## Reshaping a Numpy Array

Let's start by creating a simple 2D array of integers.

In [4]:
arr = np.array([
    [1, 2, 3], 
    [4, 5, 6], 
    [7, 8, 9], 
    [10, 11, 12]
])
arr

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [5]:
arr.shape

(4, 3)

We can use the NumPy reshape method to reshape the array into a 2D array with 12 rows and 1 column. The -1 tells NumPy to infer the number of rows from the array's size.

In [24]:
arr.reshape(-1, 1)

array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12]])

Note that the original array is not changed. The reshape method returns a new array object with the new shape (where possible, it is a view that shares the original's data rather than a copy).

In [26]:
arr

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])
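One caveat, offered as a hedged sketch rather than a complete account of NumPy's memory model: "new array" means a new array object. For a contiguous array such as `arr`, the reshaped result is typically a view that shares the original's data buffer. The names `column`, `shares`, and `independent` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
column = arr.reshape(-1, 1)

# The shape of the original is untouched...
print(arr.shape)                        # (4, 3)

# ...but the two arrays can share one underlying buffer.
shares = np.shares_memory(arr, column)
print(shares)                           # True for this contiguous array

# Take an explicit copy when the result must be independent.
independent = arr.reshape(-1, 1).copy()
independent[0, 0] = 99
print(arr[0, 0])                        # still 1
```

If you plan to modify the reshaped result in place, calling `.copy()` first is the safe choice.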

If I wish to save the results of reshape, I can assign it to a new variable (or overwrite the original variable).

In [30]:
arr12x1 = arr.reshape(-1, 1)
arr12x1

array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12]])

I can also reshape the array into a one-dimensional array (often called a vector).

In [29]:
arr.reshape(12)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

But since the array length could change, it is best not to hard-code it. Instead, we can pass -1 as a dimension. This is not negative indexing: in reshape, -1 means "infer this dimension from the array's size and the other dimensions", and at most one dimension can be -1.

In [7]:
arr.reshape(-1) 

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
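The same -1 placeholder works alongside explicit dimensions. The sketch below uses illustrative names (`three_rows`, `two_cols`, `two_unknowns_ok`) and shows NumPy inferring the missing dimension, and rejecting a request with two unknowns.

```python
import numpy as np

arr = np.arange(1, 13).reshape(4, 3)   # same values as the array above

three_rows = arr.reshape(3, -1)   # NumPy infers 4 columns (12 / 3)
two_cols = arr.reshape(-1, 2)     # NumPy infers 6 rows (12 / 2)
print(three_rows.shape)           # (3, 4)
print(two_cols.shape)             # (6, 2)

# Only one dimension may be left for NumPy to infer.
try:
    arr.reshape(-1, -1)
    two_unknowns_ok = True
except ValueError:
    two_unknowns_ok = False
print(two_unknowns_ok)            # False
```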

As you can imagine, we can reshape an array into any shape we want (as long as it has the same number of elements as the original array).

In [33]:
arr.reshape(2,6)

array([[ 1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12]])

In [34]:
arr.reshape(6,2)

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10],
       [11, 12]])

## Using the ravel function to 'flatten' any array

In [10]:
arr.ravel() # same as reshape(-1), it flattens the array

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
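NumPy arrays also have a `flatten` method, which is not used elsewhere in this notebook. As a rough sketch of the difference: `ravel` returns a view when it can, while `flatten` always returns a copy. The names `raveled` and `flattened` are illustrative.

```python
import numpy as np

arr = np.arange(1, 13).reshape(4, 3)

raveled = arr.ravel()      # a view here, because arr is contiguous
flattened = arr.flatten()  # always a copy

print(np.shares_memory(arr, raveled))    # True
print(np.shares_memory(arr, flattened))  # False
```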

## Reshaping and flattening pandas dataframes

You will often find examples where the code moves data between dataframes and NumPy arrays. This allows programmers to take advantage of the specialized methods of each object type. For example, pandas dataframes offer labeled columns and methods for selecting, grouping, and joining tabular data, while NumPy arrays offer fast element-wise arithmetic and shape manipulation. So, if you need to do some array manipulation, you might want to convert your dataframe to an array, do the manipulation, and then convert it back to a dataframe.
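A minimal sketch of that round trip follows. Centering each column is only an example of an array manipulation, and the names `values`, `centered`, and `centered_df` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x1": [1, 4, 7, 10], "x2": [2, 5, 8, 11], "y": [3, 6, 9, 12]})

values = df.to_numpy()                   # DataFrame -> NumPy array
centered = values - values.mean(axis=0)  # array step: subtract each column's mean

# NumPy array -> DataFrame, reusing the original row and column labels
centered_df = pd.DataFrame(centered, columns=df.columns, index=df.index)
print(centered_df)
```

Passing `columns=df.columns` and `index=df.index` is what restores the labels, since the plain array does not carry them.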

Let's create a simple dataframe with 3 columns and 4 rows.

In [36]:
df = pd.DataFrame(arr, columns=('x1', 'x2', 'y'))
df

   x1  x2   y
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12


We can confirm that the dataframe has 4 rows and 3 columns by checking its shape.

In [37]:
df.shape

(4, 3)

We can translate the dataframe into a numpy array as follows:

In [38]:
df_arr = df.to_numpy()
df_arr

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

Once we have the data transformed into a numpy array, we can reshape it into any shape we want.

In [14]:
df_arr.reshape(3,4)

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

Often, you'll also find that NumPy functions can work directly with dataframes. For example, the ravel function can be used to flatten a dataframe into a 1D array.

In [15]:
np.ravel(df)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

Something you will see more of later is 'chaining' calls together. Here, we pass the dataframe to NumPy's ravel function and then call the reshape method directly on the array that ravel returns.

In [16]:
arr2 = np.ravel(df).reshape(6,2)
arr2

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10],
       [11, 12]])
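To show what the chained call is doing, the sketch below writes the same operation three ways and checks that the results match. The names `chained`, `stepwise`, and `method_chain` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 13).reshape(4, 3), columns=["x1", "x2", "y"])

# Chained: reshape is called on whatever np.ravel returns
chained = np.ravel(df).reshape(6, 2)

# The same result written as separate steps
flat = np.ravel(df)
stepwise = flat.reshape(6, 2)

# The same result using methods only, letting NumPy infer the row count
method_chain = df.to_numpy().reshape(-1, 2)

print(np.array_equal(chained, stepwise))      # True
print(np.array_equal(chained, method_chain))  # True
```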

## Converting a numpy array into a dataframe

In [17]:
arr1 = df.to_numpy()
arr1

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [18]:
df1 = pd.DataFrame(arr1, columns=('x1', 'x2', 'y'))
df1

   x1  x2   y
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12


In [19]:
df2 = pd.DataFrame(arr2, columns=('x1', 'x2'))
df2

   x1  x2
0   1   2
1   3   4
2   5   6
3   7   8
4   9  10
5  11  12


## Conclusions

The code in this notebook provides you with an introduction to some of the common operations you can perform with NumPy. You can learn more about NumPy from the [official documentation](https://numpy.org/doc/stable/user/absolute_beginners.html).
