# NumPy

* Jake VanderPlas. 2016. *Python Data Science Handbook: Essential Tools for Working with Data*. O'Reilly Media, Inc.
* Chapter 2 - Introduction to NumPy
* https://github.com/jakevdp/PythonDataScienceHandbook

In [2]:
import numpy as np
np.__version__

'1.26.4'

In [11]:
# Type `np.` and press TAB to browse the (huge) NumPy namespace
#np.

## Data Types

NumPy provides a dedicated array type for numerical data, which makes data-driven computation considerably faster than working with Python's built-in lists.


### Python Integers vs C Integers

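* Python `int`s are *complex* objects (implemented as C structures)
   * Python is a dynamically typed language, so every value carries its own type information
   * Arbitrary-precision integer arithmetic (limited only by available memory)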
```c
struct _longobject {
    long ob_refcnt;         /* reference count */
    PyTypeObject *ob_type;  /* type of the variable */
    size_t ob_size;         /* size of the following data members */
    long ob_digit[1];       /* integer value encoded into a long array */
};
```
* C language integers (`char`, `short`, `int`, `long`, `long long`, ...) are simply labels for a position in memory whose bytes encode an integer value.
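
The overhead described above can be observed from Python itself. The following is a minimal sketch, assuming CPython: the exact byte counts reported by `sys.getsizeof` depend on the platform and Python version, so they are printed rather than stated.

```python
import sys
import numpy as np

# A Python int is a full object: header fields plus the digits of the value.
small = 1
big = 10**100  # far beyond 64 bits, yet still represented exactly

small_size = sys.getsizeof(small)  # bytes for the whole object (platform dependent)
big_size = sys.getsizeof(big)      # grows with the number of digits stored

# A NumPy int64 element is just 8 raw bytes inside the array's data buffer.
arr = np.array([1], dtype=np.int64)
element_size = arr.itemsize

print(small_size, big_size, element_size)
```

The fixed-size element is what lets NumPy pack values side by side in memory, at the cost of a limited value range.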

### Python Lists vs NumPy Arrays

* Python `list`s are *complex* objects (much more than `int`s)
   * `list`s are heterogeneous
   * Objects of different types have different sizes
   * A `list` holds a pointer to a block of pointers (<u>references</u>), each of which points to a full Python object
* *Standard* NumPy arrays are homogeneous.
   * They contain a single pointer to one contiguous block of data.

<center><img src="img/array_vs_list.png" alt="NumPy array vs python list" style="width: 100%;"/></center>
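
The difference between the two containers can be sketched in a few lines. The variable names (`mixed`, `numbers`) are illustrative and not part of the original notebook.

```python
import numpy as np

# A list may hold any mix of objects; each slot is a reference to a Python object.
mixed = [True, "2", 3.0, 4]
kinds = [type(item).__name__ for item in mixed]

# A standard NumPy array stores every element with the same dtype
# in one block of memory.
numbers = np.array([1, 2, 3, 4], dtype=np.int64)

print(kinds)                            # ['bool', 'str', 'float', 'int']
print(numbers.dtype)                    # int64
print(numbers.nbytes)                   # 32 (4 elements x 8 bytes)
print(numbers.flags["C_CONTIGUOUS"])    # True
```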

In [15]:
#help(np.array)

## NumPy Arrays

### Creating Arrays from Python Lists

* `np.array(some_list)` &rarr; create a (homogeneous) array
* `np.array(some_list, dtype=<data type>)` &rarr; create an array of a given type

<br>

In [28]:
np.array([1, 4, 2, 5, 3])

array([1, 4, 2, 5, 3])

In [31]:

np.array([1, 4, 2, 5, 3], dtype='float32')

array([1., 4., 2., 5., 3.], dtype=float32)

<br>
If the element types do not match, NumPy will upcast to a common type if possible:

In [17]:
np.array([1, 4, 2, 5.9, 3])

array([1. , 4. , 2. , 5.9, 3. ])
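
The upcasting rule can be checked by inspecting the resulting `dtype`. This is a small sketch; `np.result_type` is used here only to show which common type NumPy would pick for two given dtypes.

```python
import numpy as np

# Mixing integers and a float yields a float array.
int_and_float = np.array([1, 4, 2, 5.9, 3])

# Mixing integers and a complex number yields a complex array.
int_and_complex = np.array([1, 2 + 0j])

# Ask NumPy directly which dtype two dtypes promote to.
common = np.result_type(np.int32, np.float32)

print(int_and_float.dtype)    # float64
print(int_and_complex.dtype)  # complex128
print(common)                 # float64
```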

<br>
Nested lists result in multi-dimensional arrays:

In [32]:
np.array([[ 0,  1,  2,  3], [10, 11, 12, 13], [20, 21, 22, 23]])

array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23]])
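
To confirm how the nesting maps onto dimensions, the array's `ndim`, `shape`, and `size` attributes can be inspected. The name `grid` is illustrative.

```python
import numpy as np

# Three inner lists of four items each become a 3 x 4 array.
grid = np.array([[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]])

print(grid.ndim)   # 2       (number of dimensions)
print(grid.shape)  # (3, 4)  (rows, columns)
print(grid.size)   # 12      (total number of elements)
```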

### NumPy Standard Data Types

When constructing an array, the type (`dtype`) argument can be specified using:
 * string &rarr; `dtype='float32'`
 * NumPy object &rarr; `dtype=np.float32`

In [40]:

np.array([1, 4, 2, 5, 3], dtype='float32')

array([1., 4., 2., 5., 3.], dtype=float32)

In [41]:

np.array([1, 4, 2, 5, 3], dtype=np.float32)

array([1., 4., 2., 5., 3.], dtype=float32)

|  Data type | Description                                                                   |
|:----------:|-------------------------------------------------------------------------------|
| bool_      | Boolean (True or False) stored as a byte                                      |
| int_       | Default integer type (same as C long; normally either int64 or int32)         |
| intc       | Identical to C int (normally int32 or int64)                                  |
| intp       | Integer used for indexing (same as C ssize_t; normally either int32 or int64) |
| int8       | Byte (-128 to 127)                                                            |
| int16      | Integer (-32768 to 32767)                                                     |
| int32      | Integer (-2147483648 to 2147483647)                                           |
| int64      | Integer (-9223372036854775808 to 9223372036854775807)                         |
| uint8      | Unsigned integer (0 to 255)                                                   |
| uint16     | Unsigned integer (0 to 65535)                                                 |
| uint32     | Unsigned integer (0 to 4294967295)                                            |
| uint64     | Unsigned integer (0 to 18446744073709551615)                                  |
| float_     | Shorthand for float64.                                                        |
| float16    | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa             |
| float32    | Single precision float: sign bit, 8 bits exponent, 23 bits mantissa           |
| float64    | Double precision float: sign bit, 11 bits exponent, 52 bits mantissa          |
| complex_   | Shorthand for complex128.                                                     |
| complex64  | Complex number, represented by two 32-bit floats                              |
| complex128 | Complex number, represented by two 64-bit floats                              |
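
The ranges and bit layouts in the table can be queried at runtime. The sketch below uses `np.iinfo` and `np.finfo` for this; it also checks that the string and object spellings of a `dtype` are equivalent.

```python
import numpy as np

# The string form and the NumPy object form describe the same dtype.
same = np.dtype("float32") == np.dtype(np.float32)

# Integer limits for a given type.
int8_range = (np.iinfo(np.int8).min, np.iinfo(np.int8).max)
uint16_max = np.iinfo(np.uint16).max

# Floating-point layout: total bits and mantissa bits.
f32 = np.finfo(np.float32)

print(same)                 # True
print(int8_range)           # (-128, 127)
print(uint16_max)           # 65535
print(f32.bits, f32.nmant)  # 32 23
```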

### Creating Arrays from Scratch

<br>

`np.zeros(10, dtype='int64')` &rarr; a length-10 integer array filled with zeros 

In [42]:
np.zeros(10, dtype='int64')

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

<br>

`np.ones((3,5), dtype='float32')` &rarr; a 3&times;5 floating-point array filled with ones

In [46]:
np.ones((3,5), dtype='float32')

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]], dtype=float32)
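
<br>

A few other NumPy creation routines follow the same pattern. The selection below (`np.full`, `np.arange`, `np.linspace`) is an addition to the two functions shown above, offered as a minimal sketch rather than a complete list.

```python
import numpy as np

# A 3x5 array filled with a chosen constant.
filled = np.full((3, 5), 3.14)

# Evenly stepped integers: start 0, stop before 20, step 2.
stepped = np.arange(0, 20, 2)

# Five evenly spaced values from 0 to 1, both ends included.
spaced = np.linspace(0, 1, 5)

print(filled.shape)      # (3, 5)
print(stepped.tolist())  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
print(spaced.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
```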

## UFuncs - Universal Functions

## Aggregations Functions

## Broadcasting

## Boolean Manipulation

## Fancy Indexing