<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#What-is-the-current-shape-of-my-data?" data-toc-modified-id="What-is-the-current-shape-of-my-data?-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>What is the current <code>shape</code> of my data?</a></span></li><li><span><a href="#Reshape-yourself" data-toc-modified-id="Reshape-yourself-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Reshape yourself</a></span></li><li><span><a href="#Unraveling-an-array" data-toc-modified-id="Unraveling-an-array-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Un<code>ravel</code>ing an array</a></span></li><li><span><a href="#Reshaping-irregular-sized-arrays" data-toc-modified-id="Reshaping-irregular-sized-arrays-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Reshaping irregular sized arrays</a></span></li></ul></div>

>All content is released under Creative Commons Attribution [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) and all source code is released under a [BSD-3 clause license](https://en.wikipedia.org/wiki/BSD_licenses). 
>
>Please reuse, remix, revise, and reshare this content in any way, keeping this notice.
>
><img style="float: right;" width="150px" src="images/jupyter-logo.png">**Are you viewing this on jupyter.org?** Then this notebook will be read-only. <br>
>See how you can interactively run the code in this notebook by visiting our [instruction page about Notebooks](https://yint.org/notebooks). 

# Reshaping your data matrix

On many occasions you might face the need to keep the data that you have, but move it into a different shape.

The most common example is when you read your data and it is one long vector. But you know that a new week starts every 7 entries. So entries 1, 8, 15, etc. should each be the start of a row, and you would like 7 columns. Once you have it in this form you can calculate weekly averages (across the rows), or averages for each day of the week (down each of the 7 columns).

Another example is data that is measured every hour, and you want to reshape the vector into a matrix with 24 columns.

Let's take a look at this, and we will break it down:
* what is the current ``shape`` of my array?
* how can I ``reshape`` it?
* how can I un``ravel`` the data into a long vector again?
* what happens with irregular shapes?

## What is the current ``shape`` of my data?

Every array in NumPy has a ``.shape`` attribute. For example: ``my_data.shape`` will return a tuple with the array's shape. You can also use a NumPy function: ``np.shape(...)``. We show both below.

Another useful attribute is ``my_data.ndim``, which shows how many dimensions your data has. The number of dimensions will **always** match the number of values in the ``.shape`` tuple.

Let's try it (we actually saw this command already in the [prior notebook](./)):

In [1]:
import numpy as np
A = np.array([[8, 7, 3], [6, 5, 4], [4, 3, 5], [2, 1, 6]])
print('Matrix A has this shape: {}'.format(A.shape))
print('Once it is transposed, it has this shape: {}'.format(A.T.shape))
print('This is another way to get the shape: {}'.format(np.shape(A.T)))

print(('The number of dimensions of this data is: {}, which '
       'is the same as the number of integers in this tuple: {}').format(A.ndim, A.shape))

Matrix A has this shape: (4, 3)
Once it is transposed, it has this shape: (3, 4)
This is another way to get the shape: (3, 4)
The number of dimensions of this data is: 2, which is the same as the number of integers in this tuple: (4, 3)
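To see that the link between ``.ndim`` and ``.shape`` holds for more than just matrices, here is a small sketch. The arrays ``vector``, ``matrix`` and ``cube`` are made-up examples (filled with zeros) and are not used elsewhere in this notebook.

```python
import numpy as np

# Hypothetical example arrays with 1, 2 and 3 dimensions
vector = np.zeros(5)
matrix = np.zeros((4, 3))
cube = np.zeros((2, 4, 3))

for name, arr in [('vector', vector), ('matrix', matrix), ('cube', cube)]:
    # len(arr.shape) counts the entries in the shape tuple
    print('{}: shape = {}, ndim = {}, len(shape) = {}'.format(
        name, arr.shape, arr.ndim, len(arr.shape)))
```

In each case the value of ``ndim`` should equal the length of the ``shape`` tuple.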


## Reshape yourself

Let's try it with an example where we have daily data, over 21 days. We would like to reshape the vector that has 21 entries to have 3 rows and 7 columns.

To make it simple, let our vector simply be the numbers 1 to 21, but in reverse order.

In [2]:
daily = np.linspace(21, 1, 21)
print('The data before reshaping: {}'.format(daily))
print('which has a shape of {}'.format(daily.shape))
print('which has a dimension of {}'.format(daily.ndim))


matrix = daily.reshape((3, 7))
print('When reshaping it into a matrix:\n{}'.format(matrix))
print('which has a new shape of {}'.format(matrix.shape))

The data before reshaping: [21. 20. 19. 18. 17. 16. 15. 14. 13. 12. 11. 10.  9.  8.  7.  6.  5.  4.
  3.  2.  1.]
which has a shape of (21,)
which has a dimension of 1
When reshaping it into a matrix:
[[21. 20. 19. 18. 17. 16. 15.]
 [14. 13. 12. 11. 10.  9.  8.]
 [ 7.  6.  5.  4.  3.  2.  1.]]
which has a new shape of (3, 7)
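Once the data are in this 3 by 7 form, the averages mentioned in the introduction are one line each. The sketch below assumes each row is one week and each column is one day of the week; the names ``weekly_avg`` and ``weekday_avg`` are chosen here only for illustration.

```python
import numpy as np

daily = np.linspace(21, 1, 21)
matrix = daily.reshape((3, 7))

# axis=1 averages across each row: one value per week (3 values)
weekly_avg = matrix.mean(axis=1)

# axis=0 averages down each column: one value per day of the week (7 values)
weekday_avg = matrix.mean(axis=0)

print('Average per week: {}'.format(weekly_avg))
print('Average per day of the week: {}'.format(weekday_avg))
```

The ``axis`` argument names the dimension that is collapsed: ``axis=1`` collapses the columns and leaves one number per row, while ``axis=0`` collapses the rows and leaves one number per column.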


## Un``ravel``ing an array

To *unravel* something means to take something that is knotted up or tangled, and undo it or straighten it out.

Before we show how, did you notice the shape of the ``daily`` variable above? It was printed as ``(21,)``. That is not the same as ``(21, 1)`` which means 21 rows and 1 column, and would have 2 dimensions. It is also not simply ``21``. A shape of ``(21,)`` indicates explicitly that the array has 1 dimension.

We mention this because when you un``ravel`` an array, the result always has 1 dimension, so its shape will look like ``(n,)``. Let's try it out:

In [3]:
my_data = np.linspace(15, 1, 15).reshape((3, 5))
print('The "my_data" matrix has {} dimensions and a shape of {}.'.format(my_data.ndim, my_data.shape))

unraveled = my_data.ravel()
print('The unraveled data has {} dimension and a shape of {}.'.format(unraveled.ndim, unraveled.shape))
print('Printed out, it looks like this:\n {}'.format(unraveled))

# Advanced/enrichment:
# We want to reshape the unraveled array back into a matrix
print('Folded back up into a matrix with 3 rows, the array is:\n{}'.format(unraveled.reshape(3, 5)))

# But you can also do this: say you know that you want 5 rows, you can let
# NumPy automatically figure out how many columns. Notice the "-1"?
print('Folded back up into a matrix with 5 rows, the array is:\n{}'.format(unraveled.reshape(5, -1)))


The "my_data" matrix has 2 dimensions and a shape of (3, 5).
The unraveled data has 1 dimension and a shape of (15,).
Printed out, it looks like this:
 [15. 14. 13. 12. 11. 10.  9.  8.  7.  6.  5.  4.  3.  2.  1.]
Folded back up into a matrix with 3 rows, the array is:
[[15. 14. 13. 12. 11.]
 [10.  9.  8.  7.  6.]
 [ 5.  4.  3.  2.  1.]]
Folded back up into a matrix with 5 rows, the array is:
[[15. 14. 13.]
 [12. 11. 10.]
 [ 9.  8.  7.]
 [ 6.  5.  4.]
 [ 3.  2.  1.]]
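The difference between a shape of ``(15,)`` and a shape such as ``(15, 1)`` can be checked directly. This is a minimal sketch; the names ``flat``, ``column`` and ``row`` are introduced here only for illustration.

```python
import numpy as np

flat = np.linspace(15, 1, 15)     # 1 dimension: shape (15,)
column = flat.reshape((15, 1))    # 2 dimensions: 15 rows, 1 column
row = flat.reshape((1, -1))       # 2 dimensions: 1 row, NumPy infers 15 columns

for name, arr in [('flat', flat), ('column', column), ('row', row)]:
    print('{}: shape = {}, ndim = {}'.format(name, arr.shape, arr.ndim))

# Unraveling either 2-dimensional version gives back a 1-dimensional array
print('Unraveled column has shape {}'.format(column.ravel().shape))
```

All three arrays hold the same 15 values; only the way they are arranged differs.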


## Reshaping irregular sized arrays

It is an error to reshape an array into another without preserving the size exactly. 

NumPy will not drop elements, nor will it fill entries with a missing value indicator (such as ``NaN``). The number of entries before and after reshaping ***must match exactly***.

Try it out:

In [4]:
my_data = np.linspace(15, 1, 15).reshape((3, 5))
print('The "my_data" matrix has {} dimensions and a shape of {}.'.format(my_data.ndim, my_data.shape))

# Trying to reshape a matrix with 3x5 = 15 entries into a 2x8 matrix (16 entries):
print('Reshaping it into a 2 by 8 matrix: \n{}'.format(my_data.reshape(2, 8)))



The "my_data" matrix has 2 dimensions and a shape of (3, 5).


ValueError: cannot reshape array of size 15 into shape (2,8)
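Since NumPy will not pad or truncate for you, you have to do it yourself before reshaping. The sketch below shows two possible approaches, neither of which is the only way: padding the end with ``NaN`` values, or dropping the entries that do not fill a complete row. The names ``values``, ``n_cols``, ``padded`` and ``truncated`` are hypothetical and chosen just for this example.

```python
import numpy as np

values = np.linspace(15, 1, 15)
n_cols = 8

# The direct reshape fails, so catch the error instead of stopping
try:
    values.reshape((2, n_cols))
except ValueError as err:
    print('Reshape failed: {}'.format(err))

# Option 1: pad the end with NaN so the size becomes a multiple of n_cols
n_rows = int(np.ceil(values.size / n_cols))
padded = np.full(n_rows * n_cols, np.nan)
padded[:values.size] = values
padded = padded.reshape((n_rows, n_cols))
print('Padded with NaN:\n{}'.format(padded))

# Option 2: keep only the entries that fill complete rows
n_full = (values.size // n_cols) * n_cols
truncated = values[:n_full].reshape((-1, n_cols))
print('Truncated to complete rows:\n{}'.format(truncated))
```

If you pad with ``NaN``, remember that ordinary averages of a row containing ``NaN`` will also be ``NaN``; functions such as ``np.nanmean`` skip those entries.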