In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

<img alt="NumPy" src="https://cdn.rawgit.com/numpy/numpy/master/branding/icons/numpylogo.svg" height="60">
# NumPy is the fundamental package needed for scientific computing with Python.


It provides:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

# The ndarray data structure
The core functionality of NumPy is its "ndarray", for n-dimensional array, data structure. These arrays are strided views on memory. In contrast to Python's built-in list data structure (which, despite the name, is a dynamic array), these arrays are homogeneously typed: all elements of a single array must be of the same type.

Such arrays can also be views into memory buffers allocated by C/C++, Cython, and Fortran extensions to the CPython interpreter without the need to copy data around, giving a degree of compatibility with existing numerical libraries. This functionality is exploited by the SciPy package, which wraps a number of such libraries (notably BLAS and LAPACK). NumPy has built-in support for memory-mapped ndarrays.

However, you should know that, on a structural level, an ndarray is basically nothing but pointers. It’s a combination of a memory address, a data type, a shape and strides:

* The ***data*** pointer indicates the memory address of the first byte in the array,
* The data type or ***dtype*** pointer describes the kind of elements that are contained within the array,
* The ***shape*** indicates the shape of the array, and
* The ***strides*** are the number of bytes that should be skipped in memory to go to the next element along each dimension. If your strides are (10,1), you need to proceed one byte to get to the next column and 10 bytes to locate the next row.

Or, in other words, an ndarray contains information about the raw data, how to locate an element and how to interpret an element.

In [2]:
import numpy as np

# Init a numpy array
array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Print out the array itself
print(array)

# Print out the memory buffer that holds the array's data
print(array.data)

# Print out the shape of `array`
print(array.shape)

# Print out the data type of `array`
print(array.dtype)

# Print out the stride of `array`
print(array.strides)

[[1 2 3 4]
 [5 6 7 8]]
<memory at 0x104d73480>
(2, 4)
int64
(32, 8)
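To make the role of the strides concrete, here is a minimal sketch that locates one element by hand. It assumes a small C-contiguous `int64` array (the variable names are illustrative, not part of the cell above) and reads the value back from the raw bytes with `np.frombuffer()`.

```python
import numpy as np

# A small C-contiguous example array (illustrative)
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.int64)

row, col = 1, 2

# Byte offset of element (row, col): one stride per axis, scaled by the index
offset = row * a.strides[0] + col * a.strides[1]

# Interpret the 8 bytes at that offset as a single int64
raw = a.tobytes()
value = np.frombuffer(raw, dtype=a.dtype, count=1, offset=offset)[0]

print(a.strides)  # (32, 8)
print(offset)     # 48
print(value)      # 7
```

The value found at the computed offset is the same one that `a[1, 2]` returns, which is all that indexing has to do under the hood for a simple array like this.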


The **ndarray** object has many more attributes; the figure below is a quick summary of some of them. You can look them up in the [NumPy Documentation](/numpybook.pdf) for more details.
![Attributes of the ndarray](http://imageshack.com/a/img921/7005/weC72W.png)

### What advantages do NumPy arrays offer over (nested) Python lists?

Python’s lists are efficient general-purpose containers. They support (fairly) efficient insertion, deletion, appending, and concatenation, and Python’s list comprehensions make them easy to construct and manipulate. However, they have certain limitations: they don’t support “vectorized” operations like elementwise addition and multiplication, and the fact that they can contain objects of differing types means that Python must store type information for every element, and must execute type dispatching code when operating on each element. This also means that very few list operations can be carried out by efficient C loops – each iteration would require type checks and other Python API bookkeeping. NumPy arrays avoid this overhead: because all elements share a single type, operations on them can run in compiled loops.
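As a small illustration of the difference, the sketch below adds two sequences element by element, once with plain lists and once with arrays. The values are made up for the example.

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [10, 20, 30]

# With lists, `+` concatenates; elementwise addition needs an explicit loop
concatenated = list_a + list_b
list_sum = [p + q for p, q in zip(list_a, list_b)]

# With arrays, `+` and `*` are applied element by element
arr_a = np.array(list_a)
arr_b = np.array(list_b)
arr_sum = arr_a + arr_b
arr_prod = arr_a * arr_b

print(concatenated)  # [1, 2, 3, 10, 20, 30]
print(list_sum)      # [11, 22, 33]
print(arr_sum)       # [11 22 33]
print(arr_prod)      # [10 40 90]
```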

### NumPy Arrays (ndarray) Can Have Multiple Dimensions

![Array](http://community.datacamp.com.s3.amazonaws.com/community/production/ckeditor_assets/pictures/332/content_arrays-axes.png)

The array that you see above is, as its name already suggests, a 2-dimensional array: you have rows and columns. The rows are indexed along “axis 0”, while the columns are indexed along “axis 1”. The number of axes goes up with the number of dimensions: in 3-D arrays, you’ll have an additional “axis 2”. Note that a 1-D array has only a single axis, “axis 0”, so choosing an axis mostly matters for arrays that have at least 2 dimensions.

These axes will come in handy later when you’re manipulating the shape of your NumPy arrays.
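A quick way to get a feel for the axes is to sum along each of them. This is a minimal sketch with an illustrative 2-D array:

```python
import numpy as np

m = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Summing along axis 0 collapses the rows: one result per column
col_sums = m.sum(axis=0)

# Summing along axis 1 collapses the columns: one result per row
row_sums = m.sum(axis=1)

print(col_sums)  # [ 6  8 10 12]
print(row_sums)  # [10 26]
```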

# How To Make NumPy Arrays

To make a NumPy array, you can just use the **np.array()** function. All you need to do is pass a list to it; optionally, you can also specify the data type of the data. Here is a quick reference to the **data types (dtype)** that NumPy arrays support:
![Data Types](http://imageshack.com/a/img923/8066/tL9lVj.png)

There’s no need to go and memorize these NumPy data types if you’re a new user, but you do have to know and care what data you’re dealing with. The data types are there when you need more control over how your data is stored in memory and on disk. Especially in cases where you’re working with large data, it’s good to know how to control the storage type.
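To see how the data type affects storage, the sketch below builds the same four values with three different dtypes and compares their memory footprint. The choice of dtypes is only an example.

```python
import numpy as np

values = [1, 2, 3, 4]

as_int64 = np.array(values, dtype=np.int64)
as_int8 = np.array(values, dtype=np.int8)
as_float32 = np.array(values, dtype=np.float32)

# `itemsize` is the number of bytes per element, `nbytes` the total
print(as_int64.itemsize, as_int64.nbytes)      # 8 32
print(as_int8.itemsize, as_int8.nbytes)        # 1 4
print(as_float32.itemsize, as_float32.nbytes)  # 4 16
```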

In [3]:
array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.int64)
print(array)

[[1 2 3 4]
 [5 6 7 8]]


### How To Make An “Empty” NumPy Array
What people often mean when they say that they are creating “empty” arrays is that they want to make use of initial placeholders, which you can fill up afterwards. You can initialize arrays with ones or zeros, but you can also make arrays that get filled up with evenly spaced values, constant or random values.

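However, you can still make a totally “empty” array, too: **np.empty()** allocates the memory without initializing it, so the array contains whatever values happen to be in that memory.

Luckily for us, there are quite a lot of functions to make these arrays: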

In [4]:
# Create an array of ones
print("Array of ones:")
print(np.ones((3,4)))

# Create an array of zeros
print("\nArray of zeros:")
print(np.zeros((2,3,4),dtype=np.int16))

# Create an array with random values
print("\nArray with random values:")
print(np.random.random((2,2)))

# Create an empty array
print("\nEmpty Array:")
print(np.empty((3,2)))

# Create a full array
print("\nFull array of 7:")
print(np.full((2,2),7))

# Create an array of evenly-spaced values (start, stop, step)
print("\nArray of evenly-spaced values:")
print(np.arange(10,25,5))

# Create an array of evenly-spaced values (start, stop, number of samples)
print("\nArray of evenly-spaced values:")
print(np.linspace(0,2,9))

# Create an identity matrix
print("\nIdentity matrix:")
print(np.eye(2))

# Create an upper-triangular and a lower-triangular matrix from `array`
print("\nUpper-triangular matrix:")
print(np.triu(array, 0))
print("\nLower-triangular matrix:")
print(np.tril(array, 0))

# Create a diagonal matrix
print("\nDiagonal matrix:")
print(np.diag([1,4,5,7], 0))

Array of ones:
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

Array of zeros:
[[[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]]

Array with random values:
[[0.5104085  0.42014661]
 [0.42528554 0.16823465]]

Empty Array:
[[0. 0.]
 [0. 0.]
 [0. 0.]]

Full array of 7:
[[7 7]
 [7 7]]

Array of evenly-spaced values:
[10 15 20]

Array of evenly-spaced values:
[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]

Identity matrix:
[[1. 0.]
 [0. 1.]]

Upper-triangular matrix:
[[1 2 3 4]
 [0 6 7 8]]

Lower-triangular matrix:
[[1 0 0 0]
 [5 6 0 0]]

Diagonal matrix:
[[1 0 0 0]
 [0 4 0 0]
 [0 0 5 0]
 [0 0 0 7]]


* For some, such as **np.ones()**, **np.random.random()**, **np.empty()** or **np.zeros()**, the only thing that you need to do is pass the shape of the array that you want to make. As an option to **np.ones()** and **np.zeros()**, you can also specify the data type. In the case of **np.full()**, you also have to specify the constant value that you want to insert into the array.
* With **np.linspace()** and **np.arange()** you can make arrays of evenly spaced values. The difference between these two functions is that the last value of the three that are passed in the code chunk above designates either the number of samples for **np.linspace()** or the step value for **np.arange()**. What happens in the first is that you want, for example, an array of 9 values that lie between 0 and 2. For the latter, you specify that you want an array to start at 10 and, in steps of 5, generate values up to (but not including) 25.
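The difference between the two functions is easiest to see side by side. This is a minimal sketch using the same arguments as the code chunk above:

```python
import numpy as np

# linspace: start, stop (included by default), number of samples
samples = np.linspace(0, 2, 9)

# arange: start, stop (excluded), step
steps = np.arange(10, 25, 5)

# The spacing that linspace chose can be recovered from the result
spacing = samples[1] - samples[0]

print(len(samples), samples[-1])  # 9 2.0
print(steps)                      # [10 15 20]
print(spacing)                    # 0.25
```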

NumPy also allows you to build **triangular** and **diagonal matrices** with **np.triu()**, **np.tril()** and **np.diag()**: the first two keep the upper or lower triangle of an existing array and set the other elements to zero, while **np.diag()** places a 1-D sequence on the diagonal. You can also create an identity array or matrix with **np.eye()** and **np.identity()**. An **identity matrix** is a **square matrix** of which **all elements in the principal diagonal are ones and all other elements are zeros**.

When you multiply a matrix by an identity matrix, the resulting product is the same matrix again, by the standard conventions of matrix multiplication. Identity matrices are useful when you’re starting to do matrix calculations: they play the role that the number 1 plays in ordinary multiplication, which can simplify mathematical equations.

### How To Load NumPy Arrays From Text

Creating arrays with the help of initial placeholders or with some example data is a great way of getting started with numpy. But when you want to get started with data analysis, you’ll need to load data from text files.

What you have seen up until now won’t get you very far there. Instead, make use of functions that are designed to load data from your files, such as **loadtxt()** or **genfromtxt()**.

Let’s say you have the following text file with data:

In [5]:
# This is your data in the text file
# Value1  Value2  Value3
# 0.2536  0.1008  0.3857
# 0.4839  0.4536  0.3561
# 0.1292  0.6875  0.5929
# 0.1781  0.3049  0.8928
# 0.6253  0.3486  0.8791

# Import your data
x, y, z = np.loadtxt('data.txt', skiprows=1, unpack=True)

In the code above, you use **loadtxt()** to load the data into your environment. The first argument is the text file **data.txt**. Next come two keyword arguments: **skiprows=1** skips the first (header) row, and **unpack=True** returns the columns as separate arrays. This means that the values in column Value1 will be put in x, those in Value2 in y, and those in Value3 in z.

Note that, in case you have comma-delimited data or if you want to specify the data type, there are also the arguments delimiter and dtype that you can add to the **loadtxt()** arguments.
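As a sketch of those two extra arguments, the example below reads comma-delimited integers. To keep it self-contained it uses an in-memory `io.StringIO` object as a stand-in for a file on disk; the contents are made up for illustration.

```python
import io
import numpy as np

# Stand-in for a comma-delimited file; the contents are made up for illustration
csv_text = "Value1,Value2,Value3\n1,2,3\n4,5,6\n"

x, y, z = np.loadtxt(
    io.StringIO(csv_text),
    delimiter=",",   # values are separated by commas
    skiprows=1,      # skip the header row
    dtype=np.int64,  # read the values as integers instead of floats
    unpack=True,     # return one array per column
)

print(x)  # [1 4]
print(y)  # [2 5]
print(z)  # [3 6]
```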

That’s easy and straightforward, right?

Let’s take a look at your second file with data:

In [6]:
# Your data in the text file
# Value1  Value2  Value3
# 0.4839  0.4536  0.3561
# 0.1292  0.6875  MISSING
# 0.1781  0.3049  0.8928
# MISSING 0.5801  0.2038
# 0.5993  0.4357  0.7410

my_array = np.genfromtxt('data2.txt', skip_header=1, filling_values=-999)

You see that here, you resort to **genfromtxt()** to load the data. In this case, you have to handle some missing values that are indicated by the **'MISSING'** strings. Since the **genfromtxt()** function converts character strings in numeric columns to nan, you can convert these values to other ones by specifying the **filling_values** argument. In this case, you choose to set the value of these missing values to **-999**.

If, by any chance, you have values that don’t get converted to nan by **genfromtxt()**, there’s always the **missing_values** argument that allows you to specify what the missing values of your data exactly are.
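The following sketch shows **missing_values** and **filling_values** used together. It assumes the file contents shown above and feeds them to **genfromtxt()** through an in-memory `io.StringIO` object instead of a real `data2.txt`.

```python
import io
import numpy as np

# Stand-in for `data2.txt`, using the values shown above
text = """Value1  Value2  Value3
0.4839  0.4536  0.3561
0.1292  0.6875  MISSING
0.1781  0.3049  0.8928
MISSING 0.5801  0.2038
0.5993  0.4357  0.7410
"""

my_array = np.genfromtxt(
    io.StringIO(text),
    skip_header=1,
    missing_values="MISSING",  # the marker used for missing entries
    filling_values=-999,       # the value to substitute for them
)

print(my_array.shape)  # (5, 3)
print(my_array[1])
```

Naming the marker explicitly makes the intent clear to the next reader, even in cases where the default conversion would already produce the same result.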

But this is not all. You can check out [this page](https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html#numpy.genfromtxt) to see what other arguments you can add to import your data successfully.

You now might wonder what the difference between these two functions really is.

The examples may have indicated this implicitly, but, in general, **genfromtxt()** gives you a little bit more flexibility, and it’s also more robust than **loadtxt()**.

Let’s make this difference a little bit more practical: the latter, **loadtxt()**, only works when each row in the text file has the same number of values, so when you want to handle missing values easily, you’ll typically find it easier to use **genfromtxt()**.

But this is definitely not the only reason. A brief look at the arguments that **genfromtxt()** has to offer will show you that there are a lot more things that you can specify in your import, such as the maximum number of rows to read or the option to automatically strip white space from variables.

# How To Save NumPy Arrays
Once you have done everything that you need to do with your arrays, you can also save them to a file. If you want to save the array to a text file, you can use the **savetxt()** function to do this:

In [7]:
import numpy as np
x = np.arange(0.0,5.0,1.0)
np.savetxt('test.out', x, delimiter=',')

There are, of course, other ways to save your NumPy arrays. Check out the functions in the table below if you want to get your data to binary files or archives:

![table](http://imageshack.com/a/img921/1218/EskRQ6.png)

For more information or examples of how you can use the above functions to save your data, go [here](https://docs.scipy.org/doc/numpy/reference/routines.io.html) or make use of one of the help functions that NumPy has to offer to get to know more instantly!
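As a minimal sketch of a save-and-load round trip, the example below writes the same array once as text with **savetxt()** and once in NumPy's binary `.npy` format with **np.save()**, then reads both back. The file names and the temporary directory are choices made for this example only.

```python
import os
import tempfile
import numpy as np

x = np.arange(0.0, 5.0, 1.0)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Text round trip: savetxt writes one value per line for a 1-D array
    txt_path = os.path.join(tmp_dir, "test.out")
    np.savetxt(txt_path, x, delimiter=",")
    from_text = np.loadtxt(txt_path, delimiter=",")

    # Binary round trip: save writes NumPy's own .npy format
    npy_path = os.path.join(tmp_dir, "test.npy")
    np.save(npy_path, x)
    from_binary = np.load(npy_path)

print(np.array_equal(x, from_text))    # True
print(np.array_equal(x, from_binary))  # True
```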

### It’s time to look more closely into the second key element that really defines the NumPy library: scientific computing.

# How NumPy Broadcasting Works

Before you go deeper into scientific computing, it might be a good idea to first go over what broadcasting exactly is: it’s a mechanism that allows NumPy to work with arrays of different shapes when you’re performing arithmetic operations.

To put it in a more practical context, you often have an array that’s somewhat larger and another one that’s somewhat smaller. Ideally, you want to use the smaller array multiple times to perform an operation (such as a sum, multiplication, etc.) on the larger array.

To do this, you use the broadcasting mechanism.

However, there are some rules if you want to use it. Don’t sigh just yet: you’ll see that these “rules” are very simple and straightforward!

* First off, to make sure that the broadcasting is successful, the dimensions of your arrays need to be compatible. Two dimensions are compatible when they are equal. Consider the following example:

In [8]:
# Initialize `x`
x = np.ones((3,4))

# Check shape of `x`
print(x.shape)

# Initialize `y`
y = np.random.random((3,4))

# Check shape of `y`
print(y.shape)

# Add `x` and `y`
x + y

(3, 4)
(3, 4)


array([[1.04625024, 1.56403667, 1.74277323, 1.74909045],
       [1.85012545, 1.72090598, 1.41006665, 1.90795935],
       [1.81231969, 1.05317978, 1.68153039, 1.38134115]])

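* Two dimensions are also compatible when one of them is 1. In the example below, **y** has shape **(4,)**, which NumPy treats as **(1,4)** when it is combined with the **(3,4)** array **x**: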

In [9]:
# Import `numpy` as `np`
import numpy as np

# Initialize `x`
x = np.ones((3,4))

# Check shape of `x`
print(x.shape)

# Initialize `y`
y = np.arange(4)

# Check shape of `y`
print(y.shape)

# Subtract `y` from `x`
x - y 

(3, 4)
(4,)


array([[ 1.,  0., -1., -2.],
       [ 1.,  0., -1., -2.],
       [ 1.,  0., -1., -2.]])

Note that if the dimensions are not compatible, you will get a **ValueError**.

Tip: also check what the shape of the resulting array is after you have done the computations! You’ll see that it is actually the maximum size along each dimension of the input arrays.

In other words, you see that the result of **x-y** gives an array with shape **(3,4)**: **y** had a shape of **(4,)** and **x** had a shape of **(3,4)**. The maximum size along each dimension of **x** and **y** is taken to make up the shape of the new, resulting array.
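The sketch below checks the resulting shape for the arrays used above and also shows what an incompatible pair looks like. The `(3,)` array is an extra example added for illustration.

```python
import numpy as np

x = np.ones((3, 4))
y = np.arange(4)

# The broadcast shape can be inspected without doing any arithmetic
print(np.broadcast(x, y).shape)  # (3, 4)
print((x - y).shape)             # (3, 4)

# Shapes (3, 4) and (3,) are not compatible: the trailing sizes 4 and 3 differ
try:
    x + np.arange(3)
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```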

* Lastly, the arrays can only be broadcast together if they are compatible in all dimensions. Consider the following example:

In [10]:
# Import `numpy` as `np`
import numpy as np

# Initialize `x` and `y`
x = np.ones((3,4))
y = np.random.random((5,1,4))

# Add `x` and `y`
x + y

array([[[1.39475736, 1.09134579, 1.78151785, 1.96330821],
        [1.39475736, 1.09134579, 1.78151785, 1.96330821],
        [1.39475736, 1.09134579, 1.78151785, 1.96330821]],

       [[1.41621246, 1.7971975 , 1.6402449 , 1.78055495],
        [1.41621246, 1.7971975 , 1.6402449 , 1.78055495],
        [1.41621246, 1.7971975 , 1.6402449 , 1.78055495]],

       [[1.96325937, 1.31064251, 1.31885178, 1.01555847],
        [1.96325937, 1.31064251, 1.31885178, 1.01555847],
        [1.96325937, 1.31064251, 1.31885178, 1.01555847]],

       [[1.30380013, 1.9921333 , 1.03822396, 1.36507611],
        [1.30380013, 1.9921333 , 1.03822396, 1.36507611],
        [1.30380013, 1.9921333 , 1.03822396, 1.36507611]],

       [[1.50431361, 1.6396145 , 1.80952121, 1.33598401],
        [1.50431361, 1.6396145 , 1.80952121, 1.33598401],
        [1.50431361, 1.6396145 , 1.80952121, 1.33598401]]])

You see that, even though **x** and **y** seem to have somewhat different dimensions, the two can be added together.

That is because they are compatible in all dimensions:

* Array **x** has dimensions 3 X 4,
* Array **y** has dimensions 5 X 1 X 4

Since you have seen above that dimensions are also compatible if one of them is equal to 1, you see that these two arrays are indeed a good candidate for broadcasting!

What you will notice is that in the dimension where **y** has size 1 and **x** has a size greater than 1 (that is, 3), **y** behaves as if it were copied along that dimension. Likewise, **x** has no counterpart for the first dimension of **y**, so **x** behaves as if it were copied 5 times along it.

Note that the shape of the resulting array will again be the maximum size along each dimension of **x** and **y**: the shape of the result will be **(5,3,4)**.

Here is a beautiful summary of the broadcasting mechanism in NumPy:

![broadcasting](http://imageshack.com/a/img922/9647/RQWV3r.png)

In short, if you want to make use of broadcasting, you will rely a lot on the shape and dimensions of the arrays with which you’re working.

***But what if the dimensions are not compatible?***

***What if they are not equal and neither of them is equal to 1?***

You’ll have to fix this by manipulating your array! You’ll see how to do this in one of the next sections.
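As a small preview, one common fix is to add an axis of size 1 with `np.newaxis` so that the shapes line up. This is a minimal sketch with illustrative arrays, not the only way to do it:

```python
import numpy as np

x = np.ones((3, 4))
per_row = np.arange(3)          # shape (3,): one value per row of `x`

# (3, 4) and (3,) do not broadcast, but (3, 4) and (3, 1) do
as_column = per_row[:, np.newaxis]
result = x + as_column

print(as_column.shape)  # (3, 1)
print(result.shape)     # (3, 4)
print(result[:, 0])     # [1. 2. 3.]
```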

# How Do Array Mathematics Work?

You’ve seen that broadcasting is handy when you’re doing arithmetic operations. In this section, you’ll discover some of the functions that you can use to do mathematics with arrays.

As such, it probably won’t surprise you that you can just use **+, -, *, / or %** to **add, subtract, multiply, divide or calculate the remainder** of two (or more) arrays. However, a big part of why NumPy is so handy, is because it also has functions to do this. The equivalent functions of the operations that you have seen just now are, respectively, **np.add(), np.subtract(), np.multiply(), np.divide()** and **np.remainder().**

You can also easily compute the exponential and the square root of your arrays with **np.exp()** and **np.sqrt()**, or calculate the sines or cosines of your array with **np.sin()** and **np.cos()**. Lastly, it’s also useful to mention that you can calculate the natural logarithm with **np.log()** or the dot product of two arrays with **np.dot()**.
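Here is a minimal sketch of the dot product and one elementwise function. The two small matrices are made up so that their shapes line up for **np.dot()**.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
b = np.array([[1, 0], [0, 1], [1, 1]])  # shape (3, 2)

# The inner sizes must match: (2, 3) dot (3, 2) gives (2, 2)
product = np.dot(a, b)

# Elementwise functions return an array of the same shape as their input
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))

print(product)  # [[ 4  5]
                #  [10 11]]
print(roots)    # [1. 2. 3.]
```

For 2-D arrays like these, the `@` operator gives the same result as **np.dot()**.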

In [11]:
# Add `x` and `y`
np.add(x,y)

# Subtract `y` from `x`
np.subtract(x,y)

# Multiply `x` and `y`
np.multiply(x,y)

# Divide `x` by `y`
np.divide(x,y)

# Calculate the remainder of dividing `x` by `y`
np.remainder(x,y)

# Calculate dot product of `x` and `y`
# (left commented out: shapes (3,4) and (5,1,4) are not aligned for a dot product)
# print(np.dot(x,y))

array([[[0.21048527, 0.08654206, 0.21848215, 0.03669179],
        [0.21048527, 0.08654206, 0.21848215, 0.03669179],
        [0.21048527, 0.08654206, 0.21848215, 0.03669179]],

       [[0.16757508, 0.2028025 , 0.3597551 , 0.21944505],
        [0.16757508, 0.2028025 , 0.3597551 , 0.21944505],
        [0.16757508, 0.2028025 , 0.3597551 , 0.21944505]],

       [[0.03674063, 0.06807248, 0.04344466, 0.00425788],
        [0.03674063, 0.06807248, 0.04344466, 0.00425788],
        [0.03674063, 0.06807248, 0.04344466, 0.00425788]],

       [[0.0885996 , 0.0078667 , 0.00617714, 0.26984777],
        [0.0885996 , 0.0078667 , 0.00617714, 0.26984777],
        [0.0885996 , 0.0078667 , 0.00617714, 0.26984777]],

       [[0.49568639, 0.3603855 , 0.19047879, 0.32803197],
        [0.49568639, 0.3603855 , 0.19047879, 0.32803197],
        [0.49568639, 0.3603855 , 0.19047879, 0.32803197]]])

But there is more. Check out this small list of aggregate functions:

![function list](http://imageshack.com/a/img921/6279/4Msd1s.png)
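A few aggregate functions in action, as a minimal sketch. The selection below (sum, minimum, maximum, mean and cumulative sum) is an example and not necessarily the exact set shown in the figure.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(a.sum())           # 36
print(a.min())           # 1
print(a.max(axis=0))     # [5 6 7 8]
print(a.mean())          # 4.5
print(a.cumsum(axis=1))  # [[ 1  3  6 10]
                         #  [ 5 11 18 26]]
```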

Besides all of these functions, you might also find it useful to know that there are mechanisms that allow you to compare array elements. For example, if you want to check whether the elements of two arrays are the same, you might use the **==** operator. To check whether the array elements are smaller or bigger, you use the **<** or **>** operators.

This all seems quite straightforward, yes?

However, you can also compare entire arrays with each other! In this case, you use the **np.array_equal()** function. Just pass in the two arrays that you want to compare with each other and you’re done.
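The difference between elementwise comparison and comparing whole arrays can be sketched as follows, using two small illustrative arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 5, 3])

# Elementwise comparisons return boolean arrays
print(a == b)  # [ True False  True]
print(a < b)   # [False  True False]

# np.array_equal() returns a single boolean for the whole arrays
print(np.array_equal(a, b))         # False
print(np.array_equal(a, a.copy()))  # True
```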

Note that, besides comparing, you can also perform logical operations on your arrays. You can start with **np.logical_or(), np.logical_not()** and **np.logical_and()**. These work like your typical OR, NOT and AND logical operations.

In the simplest example, OR returns **True** when at least one of the two array elements is true; if both of them are false, it returns **False**. AND returns **True** only when both elements are true, and NOT inverts each element of a single array.

In [12]:
a = np.array([True, True, False, False])
b = np.array([False, False, True, True])

# `a` AND `b` 
print(np.logical_and(a, b))

# `a` OR `b`
print(np.logical_or(a, b))

# NOT `a` (a second positional argument would be used as the output array)
print(np.logical_not(a))

[False False False False]
[ True  True  True  True]
[False False  True  True]


# How To Subset, Slice, And Index Arrays

Besides mathematical operations, you might also consider taking just a part of the original array (or the resulting array) or just some array elements to use in further analysis or other operations. In such case, you will need to subset, slice and/or index your arrays.

These operations are very similar to when you perform them on Python lists.

In [13]:
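# Example arrays for this section (assumed here; they are not defined earlier in the notebook)
my_array = np.array([1, 2, 3, 4])
my_2d_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
my_3d_array = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]], [[1, 2, 3, 4], [9, 10, 11, 12]]])

# Select the element at the 1st index
print(my_array[1])

# Select the element at row 1 column 2
print(my_2d_array[1][2])

# Select the element at row 1 column 2
print(my_2d_array[1,2])

# Select the element at index 1 along axis 0, row 1, column 2
print(my_3d_array[1,1,2])

2
7
7
11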

Something a little bit more advanced than subsetting, if you will, is slicing. Here, you consider not just particular values of your arrays, but you go to the level of rows and columns.

In [None]:
# Select items at index 0 and 1
print(my_array[0:2])

# Select items at row 0 and 1, column 1
print(my_2d_array[0:2,1])

# Select items at row 1
# This is the same as saying `my_3d_array[1,:,:]`
print(my_3d_array[1,...])

You’ll see that, in essence, the following holds:

In [None]:
a[start:end] # items start through the end (but the end is not included!)
a[start:]    # items start through the rest of the array
a[:end]      # items from the beginning through the end (but the end is not included!)

Here is a beautiful summary of slicing in NumPy:

![slicing](https://i.imgur.com/ZiLIrsM.png)
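One difference from Python lists is worth a short sketch: a slice of an array is a view on the same memory, so writing to the slice changes the original. The array below is illustrative.

```python
import numpy as np

a = np.arange(10)

print(a[2:5])  # [2 3 4]
print(a[::2])  # [0 2 4 6 8]

# Unlike a list slice, an array slice is a view on the same memory
window = a[2:5]
window[0] = 99
print(a[2])    # 99

# Use .copy() when an independent array is needed
independent = a[2:5].copy()
independent[0] = -1
print(a[2])    # 99
```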

## Common Two-Dimensional Slicing for Machine Learning

### Split Input and Output Features

It is common to split your loaded data into input variables **x** and the output variable **y**.

We can do this by slicing all rows and all columns up to, but not including, the last column, then separately indexing the last column.

For the input features, we can select all rows and all columns except the last one by specifying **‘:’** in the rows index, and **:-1** in the columns index.

For the output column, we can select all rows again using **‘:’** and index just the last column by specifying the **-1** index.

Putting all of this together, we can separate a 3-column 2D dataset into input and output data as follows:

In [None]:
# split input and output
from numpy import array
# define array
data = array([[11, 22, 33],
              [44, 55, 66],
              [77, 88, 99]])
# separate data
x, y = data[:, :-1], data[:, -1]
print("Input Features:\n",x)
print("Output:",y)

### Split Train and Test Rows
It is common to split a loaded dataset into separate train and test sets.

This is a splitting of rows where some portion will be used to train the model and the remaining portion will be used to estimate the skill of the trained model.

This would involve slicing all columns by specifying **‘:’** in the second dimension index. The training dataset would be all rows from the beginning to the split point.

The test dataset would be all rows starting from the split point to the end of the dimension.

Putting all of this together, we can split the dataset at the contrived split point of 2.

In [None]:
# split train and test
from numpy import array
# define array
data = array([[11, 22, 33],
              [44, 55, 66],
              [77, 88, 99]])
# separate data
split = 2
train,test = data[:split,:],data[split:,:]
print("Train data:\n", train)
print("Test data:\n", test)

# How To Manipulate Arrays

Performing mathematical operations on your arrays is one of the things that you’ll be doing, but to make this and broadcasting work, it’s probably most important to know how to manipulate your arrays.

Below are some of the most common manipulations that you’ll be doing.

### How To Transpose Your Arrays

What transposing your arrays actually does is permute their dimensions. Or, in other words, you switch around the shape of the array: a **(2,4)** array becomes a **(4,2)** array.

In [None]:
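# Example 2-D array (the name `array` was rebound by `from numpy import array` above)
my_2d_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Print `my_2d_array`
print(my_2d_array)
print("\n")
# Transpose `my_2d_array`
print(np.transpose(my_2d_array))
print("\n")
# Or use `T` to transpose `my_2d_array`
print(my_2d_array.T)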

### Reshaping Versus Resizing Your Arrays
You might have read in the broadcasting section that the dimensions of your arrays need to be compatible if you want them to be good candidates for arithmetic operations. Well, this is where you get the answer: you can resize your array. If you pass your original array together with the new shape to the **np.resize()** function, it returns a new array that has that shape. If that new array is larger than the one that you originally had, it will be filled with copies of the original array that are repeated as many times as is needed.

However, if you call the **resize()** method on the array itself and pass the new shape to it, the array is changed in place and the extra entries are filled with zeros.

In [None]:
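# Start from a fresh (3,4) array
x = np.ones((3,4))

# Print the shape of `x`
print(x.shape)

# Resize `x` to (6,4): `np.resize()` returns a new array filled with repeated copies
print(np.resize(x, (6,4)))

# Try out this as well: the method resizes `x` in place and pads with zeros
# (`refcheck=False` skips the reference check, which can raise in interactive sessions)
x.resize((6,4), refcheck=False)

# Print out `x`
print(x)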

Besides resizing, you can also reshape your array. This means that you give a new shape to an array without changing its data. The key to reshaping is to make sure that the total size of the new array is unchanged. If you take the example of array **x** that was used above, which has a size of 3 X 4 or 12, you have to make sure that the new array also has a size of 12.

Psst… If you want to calculate the size of an array with code, make sure to use the **size** attribute: **x.size** or **x.reshape((2,6)).size**:

In [None]:
# Start from a fresh (3,4) array
x = np.ones((3,4))

# Print the size of `x` to see what's possible
print(x.size)

# Reshape `x` to (2,6)
print(x.reshape((2,6)))

# Flatten `x`
z = x.ravel()

# Print `z`
print(z)

Here is also a beautiful summary of the reshape method in NumPy:

![reshape](https://i.imgur.com/ZrEEtZe.png)

If all else fails, you can also append an array to your original one or insert or delete array elements to make sure that your dimensions fit with the other array that you want to use for your computations.

Another operation that you might keep handy when you’re changing the shape of arrays is **ravel()**. This function allows you to flatten your arrays. This means that if you ever have 2D, 3D or n-D arrays, you can just use this function to flatten it all out to a 1-D array.

## Reshape Array for Machine Learning

For example, some libraries, such as scikit-learn, may require that a one-dimensional array of output variables (y) be shaped as a two-dimensional array with one column and one outcome in each row.

Some algorithms, like the Long Short-Term Memory recurrent neural network in Keras, require input to be specified as a three-dimensional array comprised of samples, timesteps, and features.

### Reshape 1D to 2D Array

It is common to need to reshape a one-dimensional array into a two-dimensional array with one column and multiple rows.

In [None]:
# reshape 1D array
from numpy import array
from numpy import reshape
# define array
data = array([11, 22, 33, 44, 55])
print(data.shape)
# reshape
data = data.reshape((data.shape[0], 1))
print(data.shape)
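A common shorthand for the same reshape is to pass `-1` for one dimension and let NumPy infer it from the total size. This is a minimal sketch using the same example values:

```python
import numpy as np

data = np.array([11, 22, 33, 44, 55])

# -1 asks NumPy to infer that dimension from the total size
column = data.reshape((-1, 1))

print(column.shape)  # (5, 1)
print(np.array_equal(column, data.reshape((data.shape[0], 1))))  # True
```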

### Reshape 2D to 3D Array

It is common to need to reshape two-dimensional data where each row represents a sequence into a three-dimensional array for algorithms that expect multiple samples of one or more time steps and one or more features.

A good example is the **LSTM recurrent neural network** model in the Keras deep learning library.

The reshape function can be used directly, specifying the new dimensionality. This is clear with an example where each sequence has multiple time steps with one observation (feature) at each time step.

We can use the sizes in the shape attribute on the array to specify the number of samples (rows) and columns (time steps) and fix the number of features at 1.

In [None]:
# reshape 2D array
from numpy import array
# list of data
data = [[11, 22],
        [33, 44],
        [55, 66]]
# array of data
data = array(data)
print(data.shape)
# reshape
data = data.reshape((data.shape[0], data.shape[1], 1))
print(data.shape)

## Additional Shape Methods


### How To Append Arrays
When you append arrays to your original array, they are “glued” to the end of that original array. If you want to make sure that what you append does not come at the end of the array, you might consider inserting it. Go to the next section if you want to know more.

Appending is a pretty easy thing to do thanks to the NumPy library: you can just make use of the **np.append()** function.

In [None]:
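# Example arrays (assumed here so that the cell runs on its own)
my_array = np.array([1, 2, 3, 4])
my_2d_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Append a 1D array to your `my_array`
new_array = np.append(my_array, [7, 8, 9, 10])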

# Print `new_array`
print(new_array)

# Append an extra column to your `my_2d_array`
new_2d_array = np.append(my_2d_array, [[7], [8]], axis=1)

# Print `new_2d_array`
print(new_2d_array)

Note how, when you append an extra column to **my_2d_array**, the **axis** is specified. Remember that axis 1 indicates the columns, while axis 0 indicates the rows in 2-D arrays.

### How To Insert And Delete Array Elements
Next to appending, you can also insert and delete array elements. As you might have guessed by now, the functions that will allow you to do these operations are **np.insert()** and **np.delete()**:

In [None]:
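# Example array (assumed here so that the cell runs on its own)
my_array = np.array([1, 2, 3, 4])

# Insert `5` at index 1 (returns a new array)
print(np.insert(my_array, 1, 5))

# Delete the value at index 1 (returns a new array)
print(np.delete(my_array, [1]))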

### How To Join And Split Arrays
You can also ‘merge’ or join your arrays. There are a bunch of functions that you can use for that purpose and most of them are listed below.

Try them out, but also make sure to check what the shape of the resulting arrays is. The example arrays used below are **x, my_array, my_resized_array** and **my_2d_array**.

In [None]:
# Example arrays (assumed here so that the cell runs on its own)
x = np.ones(4)
my_array = np.array([1, 2, 3, 4])
my_2d_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
my_resized_array = np.resize(my_array, (2, 4))

# Concatenate `my_array` and `x`
print(np.concatenate((my_array,x)))

# Stack arrays row-wise
print(np.vstack((my_array, my_2d_array)))

# Stack arrays row-wise
print(np.r_[my_resized_array, my_2d_array])

# Stack arrays horizontally
print(np.hstack((my_resized_array, my_2d_array)))

# Stack arrays column-wise
print(np.column_stack((my_resized_array, my_2d_array)))

# Stack arrays column-wise
print(np.c_[my_resized_array, my_2d_array])

You’ll note a few things as you go through the functions:
* The number of dimensions needs to be the same if you want to concatenate two arrays with **np.concatenate()**. As such, if you want to concatenate an array with **my_array**, which is 1-D, you’ll need to make sure that the second array that you have, is also 1-D.
* With **np.vstack()**, you effortlessly combine **my_array** with **my_2d_array**. You just have to make sure that, as you’re stacking the arrays row-wise, the number of columns in both arrays is the same. As such, you could also add an array with shape **(2,4)** or **(3,4)** to **my_2d_array**, as long as the number of columns matches. Stated differently, the arrays must have the same shape along all but the first axis. The same also holds when you want to use **np.r_[]**.
* For **np.hstack()**, you have to make sure that the number of dimensions is the same and that the number of rows in both arrays is the same. That means that you could stack arrays such as **(2,3)** or **(2,4)** to **my_2d_array**, which itself has a shape of **(2,4)**. Anything is possible as long as you make sure that the number of rows matches. **np.concatenate()** and **np.stack()** offer more general alternatives.
* With **np.column_stack()**, you have to make sure that the arrays that you input have the same first dimension. In this case, both shapes are the same, but if my_resized_array were to be **(2,1)** or **(2,)**, the arrays still would have been stacked.
* **np.c_[]** is another way to concatenate. Here also, the first dimension of both arrays needs to match.

When you have joined arrays, you might also want to split them at some point. Just like you can stack them horizontally or vertically, you can also split them horizontally or vertically. You use **np.hsplit()** and **np.vsplit()**, respectively:

In [None]:
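# Example array with shape (2,8) (assumed here so that the cell runs on its own)
my_stacked_array = np.arange(16).reshape((2, 8))

# Split `my_stacked_array` horizontally into 2 equal parts
print(np.hsplit(my_stacked_array, 2))

# Split `my_stacked_array` vertically into 2 equal parts
print(np.vsplit(my_stacked_array, 2))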

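What you need to keep in mind when you’re using both of these split functions is the shape of your array. Let’s take the above case as an example: **my_stacked_array** has a shape of **(2,8)**. An integer argument is the number of equal parts to split into, so it has to divide the size of the axis evenly: 8 columns can be split into 2 parts of 4 columns each, and 2 rows into 2 parts of 1 row each. If you want to select the indices at which the split occurs, pass a list of indices instead.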
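The sketch below contrasts the two ways of calling **np.hsplit()**. The array `stacked` is an illustrative stand-in with the same **(2,8)** shape.

```python
import numpy as np

# Example array with shape (2, 8); the values are illustrative
stacked = np.arange(16).reshape((2, 8))

# An integer splits into that many equal parts
halves = np.hsplit(stacked, 2)
print([part.shape for part in halves])  # [(2, 4), (2, 4)]

# A list of indices splits at those column positions instead
pieces = np.hsplit(stacked, [2, 5])
print([part.shape for part in pieces])  # [(2, 2), (2, 3), (2, 3)]

# An integer that does not divide the axis evenly raises a ValueError
try:
    np.hsplit(stacked, 3)
    failed = False
except ValueError:
    failed = True
```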

### You've reached the end of this tutorial. 

Congratulations, you have reached the end of this NumPy tutorial! You have covered a lot of ground, so now you have to make sure to retain the knowledge that you have gained. Don’t forget to try and implement what you have learnt by yourself!

***Thanks for reading! ^^***