# NumPy

## Understanding Data Types in Python

* Effective data-driven science requires understanding how data is stored and manipulated
* This section covers how arrays of data are handled in the Python language itself, and how NumPy improves on this

* A __statically-typed__ language like C or Java requires each variable to be explicitly declared with its type
* A __dynamically-typed__ language like Python skips this specification

```C
/* C code */
int result = 0;
for(int i=0; i<100; i++){
    result += i;
}
```

* The equivalent loop in Python needs no type declarations:

```python
# Python code
result = 0
for i in range(100):
    result += i
```

* In Python, types are dynamically inferred, so any kind of data can be assigned to any variable:

```python
# Python code
x = 4
x = "four"
```

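* We've switched the contents of ``x`` from an integer to a string. The same thing in C leads, depending on compiler settings, to a compilation error or other unintended consequences: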

```C
/* C code */
int x = 4;
x = "four";  // FAILS
```

* Python variables are more than just their value; they also contain extra information about the type of the value
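
As a minimal sketch of that last point, the built-in `type` function shows that the type travels with the object a name refers to, not with the name itself. The helper names `type_before` and `type_after` are only for illustration:

```python
# The name x is rebound to objects of different types;
# each object carries its own type information.
x = 4
type_before = type(x)   # the int class

x = "four"
type_after = type(x)    # the str class

print(type_before.__name__)  # int
print(type_after.__name__)   # str
```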

## The Basics of NumPy Arrays

We'll cover:

- **Attributes of arrays**: Determining the size, shape, memory consumption, and data types of arrays
- **Indexing of arrays**: Getting and setting the value of individual array elements
- **Slicing of arrays**: Getting and setting smaller subarrays within a larger array
- **Reshaping of arrays**: Changing the shape of a given array
- **Joining and splitting of arrays**: Combining multiple arrays into one, and splitting one array into many

### NumPy Array Attributes

We'll start by defining three random arrays: a one-dimensional, a two-dimensional, and a three-dimensional array.

In [1]:
import numpy as np
np.random.seed(0)  # seed for reproducibility

x1 = np.random.randint(10, size=6)  # One-dimensional array
x2 = np.random.randint(10, size=(3, 4))  # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5))  # Three-dimensional array

Each array has the attributes ``ndim`` (the number of dimensions), ``shape`` (the size of each dimension), and ``size`` (the total number of elements in the array):

In [2]:
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)

x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60
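
These three attributes are tied together: ``ndim`` is the length of ``shape``, and ``size`` is the product of the entries in ``shape``. A small sketch, using a hypothetical zero-filled array `a` with the same shape as `x3`:

```python
import numpy as np

# Hypothetical array used only for illustration
a = np.zeros((3, 4, 5))

n_dims = a.ndim                      # number of axes
n_elements = int(np.prod(a.shape))   # product of the axis lengths

print(n_dims == len(a.shape))  # True
print(n_elements == a.size)    # True
```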


Another useful attribute is ``dtype``, the data type of the array's elements:

In [3]:
print("dtype:", x3.dtype)

dtype: int64


Other attributes include ``itemsize``, which gives the size (in bytes) of each array element, and ``nbytes``, which gives the total size (in bytes) of the array:

In [4]:
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")

itemsize: 8 bytes
nbytes: 480 bytes


In general, we expect that ``nbytes`` is equal to ``itemsize`` times ``size``.
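
One way to check this relationship is to fix the ``dtype`` explicitly, so the element size does not depend on the platform default. The arrays `a64` and `a32` below are illustrative and not part of the example above:

```python
import numpy as np

# Same shape, different element sizes
a64 = np.zeros((3, 4, 5), dtype=np.int64)
a32 = np.zeros((3, 4, 5), dtype=np.int32)

print(a64.itemsize, a64.nbytes)  # 8 480
print(a32.itemsize, a32.nbytes)  # 4 240

# nbytes is itemsize times size in both cases
print(a64.nbytes == a64.itemsize * a64.size)  # True
print(a32.nbytes == a32.itemsize * a32.size)  # True
```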

### Array Indexing: Accessing Single Elements

In a one-dimensional array, the $i^{th}$ value (counting from zero) can be accessed by specifying the desired index in square brackets, just as with Python lists:

In [5]:
x1

array([5, 0, 3, 3, 7, 9])

In [6]:
x1[0]

5

In [7]:
x1[4]

7

To index from the end of the array, you can use negative indices:

In [8]:
x1[-1]

9

In [9]:
x1[-2]

7
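
Put differently, for an array of length ``n`` the index ``-k`` refers to the same element as ``n - k``. A short sketch that checks this on a copy of the values shown above (the name `x` is illustrative):

```python
import numpy as np

x = np.array([5, 0, 3, 3, 7, 9])
n = len(x)

# -k and n - k address the same element for k = 1 .. n
matches = [bool(x[-k] == x[n - k]) for k in range(1, n + 1)]
print(all(matches))  # True

# Indices outside the range -n .. n-1 raise IndexError
try:
    x[-(n + 1)]
    out_of_range_raised = False
except IndexError:
    out_of_range_raised = True
print(out_of_range_raised)  # True
```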

In a multi-dimensional array, items can be accessed using a comma-separated tuple of indices:

In [10]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [11]:
x2[0, 0]

3

In [15]:
x2[2, 0]

1

In [14]:
x2[2, -1]

7
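
For single elements, the comma-separated form gives the same result as chained brackets, while a single index on a two-dimensional array selects a whole row. A minimal sketch, using a hypothetical array `grid` that holds the same values as `x2`:

```python
import numpy as np

grid = np.array([[3, 5, 2, 4],
                 [7, 6, 8, 8],
                 [1, 6, 7, 7]])

row = grid[2]            # one index selects the whole third row
print(row.shape)         # (4,)

via_tuple = grid[2, -1]  # row 2, last column, in one indexing step
via_chain = grid[2][-1]  # same element, via an intermediate row
print(via_tuple, via_chain)  # 7 7
```

The tuple form is usually preferable because it indexes in a single step rather than creating an intermediate row first.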

Values can also be modified using any of the index notations above:

In [16]:
x2[0, 0] = 12
x2

array([[12,  5,  2,  4],
       [ 7,  6,  8,  8],
       [ 1,  6,  7,  7]])

Keep in mind that, unlike Python lists, NumPy arrays have a fixed type.
This means that if you attempt to insert a floating-point value into an integer array, **the value will be silently truncated**:

In [17]:
x1[0] = 3.14159  # this will be truncated!
x1

array([3, 0, 3, 3, 7, 9])
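
Truncation here means the fractional part is dropped, which is not the same as rounding to the nearest integer. If fractional values need to be kept, one option is to convert the array to a floating-point type first. A sketch with hypothetical arrays `ints` and `floats`:

```python
import numpy as np

ints = np.array([5, 0, 3])

ints[0] = 3.14159   # stored as 3
ints[1] = -2.7      # stored as -2: truncated toward zero, not rounded to -3
print(ints[0], ints[1])  # 3 -2

# astype returns a converted copy; the original integer array is unchanged
floats = ints.astype(float)
floats[0] = 3.14159
print(floats[0])    # 3.14159
print(ints[0])      # 3
```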