# NumPy

This notebook shows examples of common operations with the NumPy library. Part of the material has been taken from the book *Python Data Science Handbook* (2017) ([online version](https://jakevdp.github.io/PythonDataScienceHandbook/)).

*The text is released under the CC-BY-NC-ND license, and code is released under the MIT license. If you find this content useful, please consider supporting the work by [buying the book!](http://shop.oreilly.com/product/0636920034919.do).*

Table of contents:

* [Creating Arrays from Scratch](#Creating-Arrays-from-Scratch)
* [Data Manipulation using NumPy](#Data-Manipulation-using-NumPy)
  * [Attributes](#Attributes)
  * [Array Indexing](#Array-Indexing.-Accesing-to-single-elements)


## Creating Arrays from Scratch

Especially for larger arrays, it is **more efficient** to create arrays from scratch **using routines** built into NumPy. Here are several examples. We'll start with the standard NumPy import, under the alias `np`:

In [1]:
import numpy as np

In [2]:
# Create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [3]:
# Create a 3x5 floating-point array filled with ones
np.ones((3, 5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [4]:
# Create a 3x5 array filled with 3.14
np.full((3, 5), 3.14)

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])

In [5]:
# Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0, 20, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [9]:
# Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
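The two sequence routines above treat the end value differently: `np.arange` excludes it, while `np.linspace` includes it by default. A minimal sketch of that difference; the variable names are illustrative only:

```python
import numpy as np

# arange takes a step size; the stop value (20) is excluded
stepped = np.arange(0, 20, 2)
print(stepped[-1])       # 18

# linspace takes a number of points; the stop value (20) is included by default
spaced = np.linspace(0, 20, 11)
print(spaced[-1])        # 20.0
print(len(spaced))       # 11
```

`np.arange` is the natural choice when the step matters, and `np.linspace` when the number of points matters.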

In [2]:
# Create a 2x2 floating-point array from nested Python lists
a = np.array([[1., 2.], [4., 5.]])
print(a)

[[1. 2.]
 [4. 5.]]


In [12]:
# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))

array([[0.74778554, 0.97596455, 0.25233894],
       [0.75400273, 0.59962465, 0.35108106],
       [0.13527468, 0.98132142, 0.04422794]])

In [13]:
# Create a 3x3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0, 1, (3, 3))

array([[-0.47480562, -0.89955885,  1.29303459],
       [ 1.04271041,  0.46552169, -0.60269015],
       [ 0.20269149, -0.46546784, -0.19825129]])

In [14]:
# Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))

array([[5, 0, 6],
       [2, 3, 7],
       [6, 5, 1]])
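The random arrays above change on every run. A minimal sketch of one way to make them reproducible, assuming NumPy 1.17 or later (where `np.random.default_rng` is available); the seed value 42 and the variable names are arbitrary choices for illustration:

```python
import numpy as np

# Two generators created with the same (arbitrary) seed
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

uniform_a = rng_a.random((3, 3))         # uniform values in [0, 1)
uniform_b = rng_b.random((3, 3))

normal_a = rng_a.normal(0, 1, (3, 3))    # mean 0, standard deviation 1
ints_a = rng_a.integers(0, 10, (3, 3))   # integers in the interval [0, 10)

# Same seed, same sequence of values
print(np.array_equal(uniform_a, uniform_b))   # True
```

The legacy `np.random.seed` call serves the same purpose for the `np.random.*` functions used in this notebook.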

In [15]:
# Create a 3x3 identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [16]:
# Create an uninitialized array of three floats (the default dtype)
# The values will be whatever happens to already exist at that memory location
np.empty(3)

array([1., 1., 1.])

## Data Manipulation using NumPy

Basic array manipulation falls into the following categories:

- **Attributes of arrays**: Determining the size, shape, memory consumption, and data types of arrays
- **Indexing of arrays**: Getting and setting the value of individual array elements
- **Slicing of arrays**: Getting and setting smaller subarrays within a larger array
- **Reshaping of arrays**: Changing the shape of a given array
- **Joining and splitting of arrays**: Combining multiple arrays into one, and splitting one array into many
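As a quick preview of the last two categories, here is a minimal sketch of reshaping, joining, and splitting. The array names and values are illustrative only:

```python
import numpy as np

flat = np.arange(1, 10)          # [1 2 3 4 5 6 7 8 9]

# Reshaping: the new shape must hold the same number of elements
grid = flat.reshape((3, 3))
print(grid.shape)                # (3, 3)

# Joining: concatenate stacks along the first axis by default
joined = np.concatenate([grid, grid])
print(joined.shape)              # (6, 3)

# Splitting: the list gives the indices at which to split
left, right = np.split(flat, [3])
print(left)                      # [1 2 3]
print(right)                     # [4 5 6 7 8 9]
```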

### Attributes

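Each array has the attributes `ndim` (the number of dimensions), `shape` (the size of each dimension), and `size` (the total number of elements). Another useful attribute is `dtype`, the data type of the array. Finally, `itemsize` gives the size in bytes of each element and `nbytes` the total size in bytes of the array.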

In [21]:
import numpy as np
np.random.seed(0)  # seed for reproducibility

x3 = np.random.randint(10, size=(3, 4, 5))  # Three-dimensional array

print("x3 ndim :", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size :", x3.size)
print("x3 dtype:", x3.dtype)
print("itemsize:", x3.itemsize, "bytes")
print("nbytes  :", x3.nbytes, "bytes")

x3 ndim : 3
x3 shape: (3, 4, 5)
x3 size : 60
x3 dtype: int64
itemsize: 8 bytes
nbytes  : 480 bytes


In general, `nbytes` equals `itemsize` times `size`; here, 60 elements × 8 bytes per element = 480 bytes.
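The relationship between these attributes can be checked directly. A minimal sketch that rebuilds the same array; the `int8` conversion and the name `x3_small` are additions for illustration, not part of the original example:

```python
import numpy as np

np.random.seed(0)
x3 = np.random.randint(10, size=(3, 4, 5))

# size is the product of the entries of shape
print(x3.size == np.prod(x3.shape))          # True

# nbytes is itemsize times size, whatever the platform's default integer is
print(x3.nbytes == x3.itemsize * x3.size)    # True

# A smaller dtype shrinks the memory footprint (values 0-9 fit in int8)
x3_small = x3.astype(np.int8)
print(x3_small.itemsize, x3_small.nbytes)    # 1 60
```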

### Array Indexing. Accesing to single elements.

In [23]:
x1 = np.random.randint(10, size=6)  # One-dimensional array
print(x1)
print(x1[0])
print(x1[-1])

[8 1 1 7 9 9]
8
9


In [24]:
x2 = np.random.randint(10, size=(3, 4))  # Two-dimensional array
print(x2)
print(x2[0, 0])
print(x2[2, -1])

[[3 6 7 2]
 [0 3 5 9]
 [4 4 6 4]]
3
4


Values can also be modified using any of the index notations above. Keep in mind that, unlike Python lists, NumPy arrays have a **fixed type**. This means, for example, that if you attempt to insert a floating-point value into an integer array, the value will be silently **truncated**. Don't be caught unaware by this behavior!

In [28]:
x2[0, 0] = 12
x2

array([[12,  6,  7,  2],
       [ 0,  3,  5,  9],
       [ 4,  4,  6,  4]])

In [29]:
x2[0, 0] = 3.14159  # this will be truncated!
x2

array([[3, 6, 7, 2],
       [0, 3, 5, 9],
       [4, 4, 6, 4]])
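If the fractional part matters, one option is to convert the array to a floating-point type before assigning. A minimal sketch, using a small hand-written array (the names `ints` and `floats` are illustrative):

```python
import numpy as np

ints = np.array([[3, 6, 7, 2],
                 [0, 3, 5, 9]])

ints[0, 0] = 3.14159            # stored as 3; the array stays integer
print(ints.dtype.kind, ints[0, 0])    # i 3

floats = ints.astype(float)     # astype returns a converted copy
floats[0, 0] = 3.14159          # now the full value is kept
print(floats.dtype, floats[0, 0])     # float64 3.14159
```

Because `astype` returns a copy, the original integer array is left unchanged.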

### Array Slicing: Accessing Subarrays.
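Slices use the `start:stop:step` notation familiar from Python lists, with one slice per dimension separated by commas. A minimal sketch; the array names and values are illustrative only:

```python
import numpy as np

x = np.arange(10)
print(x[:5])         # [0 1 2 3 4]
print(x[::2])        # [0 2 4 6 8]
print(x[::-1])       # [9 8 7 6 5 4 3 2 1 0]

grid = np.arange(12).reshape(3, 4)
top_left = grid[:2, :3]     # first two rows, first three columns
first_col = grid[:, 0]      # every row, first column
print(first_col)     # [0 4 8]

# Unlike list slices, array slices are views: writing to one changes the original
top_left[0, 0] = 99
print(grid[0, 0])    # 99

# Use copy() when an independent subarray is needed
independent = grid[:2, :3].copy()
independent[0, 0] = -1
print(grid[0, 0])    # 99
```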

In [3]:
# Mean along axis 1: one value per row of the 2x2 array `a` created earlier
np.mean(a, axis=1)

array([1.5, 4.5])
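The `axis` argument names the dimension that is collapsed. A minimal sketch comparing the possible choices on the same 2x2 array; the result names are illustrative:

```python
import numpy as np

a = np.array([[1., 2.], [4., 5.]])

col_means = np.mean(a, axis=0)   # collapse the rows: one value per column
row_means = np.mean(a, axis=1)   # collapse the columns: one value per row
overall = np.mean(a)             # no axis: mean of all elements

print(col_means)   # [2.5 3.5]
print(row_means)   # [1.5 4.5]
print(overall)     # 3.0
```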

Notes:

To install the Table of Contents extension, run the following commands:
```bash
jupyter nbextension install --user https://rawgithub.com/minrk/ipython_extensions/master/nbextensions/toc.js
curl -L https://rawgithub.com/minrk/ipython_extensions/master/nbextensions/toc.css > $(jupyter --data-dir)/nbextensions/toc.css
jupyter nbextension enable toc
```