# Introduction to NumPy Library

- NumPy provides the numerical backend for nearly every scientific or technical library for Python. It is the foundation of scientific computing in Python because it supplies data structures and high-performance functions that the Python standard library lacks. Knowing the library well matters for numerical work, since using it correctly can greatly affect the performance of your computations.

- NumPy provides the following additional features:
  - `ndarray`: A multidimensional array that is much faster and more efficient than the containers provided by the Python standard library. The core of NumPy is implemented in C and provides efficient functions for manipulating and processing arrays.

  - `Element-wise computation`: A set of functions for performing this type of calculation with arrays and mathematical operations between arrays.
  
  - `Integration with other languages such as C, C++, and FORTRAN`: A set of tools to integrate code developed with these programming languages.
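
As a small illustration of the element-wise computation mentioned above, here is a minimal sketch. The array names (`prices`, `quantities`) are made up for this example. It compares an element-wise product of two arrays with the equivalent explicit loop over plain Python lists.

```python
import numpy as np

# Hypothetical example data; the names are illustrative only.
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

# Arithmetic operators act element by element on whole arrays.
totals = prices * quantities

# NumPy's mathematical functions are applied to every element as well.
roots = np.sqrt(prices)

# The same product written as an explicit loop over Python lists.
list_totals = [p * q for p, q in zip(prices.tolist(), quantities.tolist())]

print(totals)
print(list_totals)
```

Both approaches give the same numbers. The array version expresses the computation without a Python-level loop.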

- At first glance, NumPy arrays resemble Python’s list data structure. An important difference is that Python lists are generic containers of objects, whereas:
  - NumPy arrays are homogeneous, typed arrays of fixed size.
  - Homogeneous means that all elements in the array have the same data type.
  - Fixed size means that an array cannot be resized (without creating a new array).
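
The two properties above can be observed directly. The following is a minimal sketch with illustrative variable names. It shows that `np.array` coerces mixed numeric input to one common data type, that assigning a float into an integer array truncates the value, and that `np.append` returns a new array instead of growing the existing one.

```python
import numpy as np

# A Python list can hold objects of different types.
mixed_list = [1, 2.5, "three"]

# np.array converts integer and float input to a single common dtype.
int_and_float = np.array([1, 2.5, 3])
print(int_and_float.dtype)  # float64

# An integer array keeps its dtype: the assigned float is truncated.
arr = np.array([1, 2, 3])
arr[0] = 7.9
print(arr[0])  # 7

# "Appending" creates a new array; the original keeps its size.
bigger = np.append(arr, 4)
print(arr.shape, bigger.shape)  # (3,) (4,)
```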

## Importing the modules

To use the NumPy library, we need to import it in our program. By convention,
the NumPy module is imported under the alias `np`, like so:

In [2]:
import numpy as np

After this, we can access functions and classes in the numpy module using the np
namespace. Throughout this notebook, we assume that the NumPy module is imported in
this way.

## The NumPy Array Object

- The core of the NumPy library is one main object: `ndarray` (which stands for N-dimensional array).
- This object is a multidimensional, homogeneous array with a predetermined number of items.
- In addition to the data stored in the array, this data structure also contains important metadata about the array, such as its shape, size, data type, and other attributes. 


**Basic Attributes of the ndarray Class**

| Attribute | Description                                                                                              |
|-----------|----------------------------------------------------------------------------------------------------------|
| shape     | A tuple that contains the number of elements (i.e., the length)  for each dimension (axis) of the array. |
| size      | The total number of elements in the array.                                                               |
| ndim      | Number of dimensions (axes).                                                                             |
| nbytes    | Number of bytes used to store the data.                                                                  |
| dtype     | The data type of the elements in the array.                                                              |
| itemsize  | The size in bytes of each item in the array.                                                             |
| data      | A buffer containing the actual elements of the array.                                                    |

In [3]:
data = np.array([[10, 2], [5, 8], [1, 1]])
data

array([[10,  2],
       [ 5,  8],
       [ 1,  1]])

In [4]:
type(data)

numpy.ndarray

In [5]:
data.ndim

2

In [6]:
data.shape

(3, 2)

In [7]:
data.size

6

In [8]:
data.dtype

dtype('int64')

In [9]:
data.nbytes

48

In [26]:
data.itemsize

8

In [29]:
data.data

<memory at 0x1186dadc8>

Here the ndarray instance data is created from a nested Python list using the
function `np.array`. More ways to create ndarray instances from data and from rules of
various kinds are introduced later in this tutorial. 

### Data types

- The `dtype` attribute of an `ndarray` describes the data type of each element in the array.
- Since NumPy arrays are homogeneous, all elements have the same data type. 

**Basic Numerical Data Types Available in NumPy**

| dtype   | Variants                            | Description                           |
|---------|-------------------------------------|---------------------------------------|
| int     | int8, int16, int32, int64           | Integers                              |
| uint    | uint8, uint16, uint32, uint64       | Unsigned (non-negative) integers      |
| bool    | bool_                               | Boolean (True or False)               |
| float   | float16, float32, float64, float128 | Floating-point numbers                |
| complex | complex64, complex128, complex256   | Complex-valued floating-point numbers |

Once a NumPy array is created, its `dtype` cannot be changed, other than by creating a new copy with type-cast array values.

In [11]:
data = np.array([5, 9, 87], dtype=np.float32)
data

array([ 5.,  9., 87.], dtype=float32)

In [12]:
data = np.array(data, dtype=np.int32) # use np.array function for type-casting
data

array([ 5,  9, 87], dtype=int32)

In [13]:
data = np.array([5, 9, 87], dtype=np.float32)
data

array([ 5.,  9., 87.], dtype=float32)

In [14]:
data = data.astype(np.int32) # Use astype method of the ndarray class for type-casting
data

array([ 5,  9, 87], dtype=int32)

**Data Type Promotion**

When working with NumPy arrays, the data type may be promoted from one type to another if the operation requires it.
For instance, adding a float-valued array and an integer-valued array produces a float-valued array:

In [16]:
arr1 = np.array([0, 2, 3], dtype=float)
arr1

array([0., 2., 3.])

In [19]:
arr2 = np.array([10, 20, 30], dtype=int)
arr2

array([10, 20, 30])

In [21]:
res = arr1 + arr2
res

array([10., 22., 33.])

In [22]:
res.dtype

dtype('float64')
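
To find out which type a promotion will produce without performing the operation, `np.result_type` can be used. The following is a minimal sketch with illustrative array names. It assumes 32-bit inputs to show that the promoted type can be wider than either operand.

```python
import numpy as np

# Illustrative inputs: a 32-bit integer array and a 32-bit float array.
ints = np.array([10, 20, 30], dtype=np.int32)
floats = np.array([0.5, 1.5, 2.5], dtype=np.float32)

# Ask NumPy which dtype the combination of the two arrays promotes to.
promoted = np.result_type(ints, floats)
print(promoted)  # float64

# The actual operation yields an array of that promoted dtype.
total = ints + floats
print(total.dtype)  # float64
```

The result is `float64` rather than `float32` because a 32-bit float cannot represent every 32-bit integer exactly.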

**NOTE:**
    
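In some cases, depending on the application and its requirements, it is essential to create arrays with the appropriate data type. Array-creation functions such as `np.zeros` default to `float`, and `np.array` infers the type from its input, so real-valued input never produces a complex-valued array unless you request it explicitly: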

In [24]:
np.sqrt(np.array([0, -1, 2]))

array([0.        ,        nan, 1.41421356])

In [25]:
np.sqrt(np.array([0, -1, 2], dtype=complex))

array([0.        +0.j, 0.        +1.j, 1.41421356+0.j])

Here, using the `np.sqrt` function to compute the square root of each element in
an array gives different results depending on the data type of the array. For a real-valued array, the square root of `-1` is `nan` (and NumPy emits a `RuntimeWarning`). Only when the data type of the array is complex does the square root of `-1` yield the imaginary unit (denoted as `1j` in Python).

### Memory layout of multi-dimensional arrays

Multidimensional arrays are stored as contiguous data in memory. There's freedom of choice in how to arrange the array elements in this memory segment. Consider the case of a two-dimensional array, containing rows and columns:

- One possible way to store this array as a consecutive sequence of values is to store the rows after each other, and another equally valid approach is to store the columns one after another. 

- The former is called **row-major** format and the latter is **column-major** format. 


<div class="row" style="display:flex;">
  <div class="column" style="flex:50%; padding:5px;">
    <img src="https://i.imgur.com/XPKvEXc.jpg" alt="row-major" style="width:100%">
      <figcaption>Row-major format</figcaption>
  </div>
  <div class="column" style="flex:50%; padding:5px;">
    <img src="https://i.imgur.com/xXItpc4.jpg" alt="column-major" style="width:100%">
      <figcaption>Column-major format</figcaption>
  </div>
</div>

- Whether to use row-major or column-major is a matter of convention. Row-major format is used, for example, in the C programming language, whereas Fortran uses column-major format.

- A NumPy array can be specified to be stored in row-major format, using the keyword argument `order='C'`, and the column-major format, using the keyword argument `order='F'`, when the array is created or reshaped. 

- The default format is row-major. 

- The `'C'` or `'F'` ordering of a NumPy array is particularly relevant when the array is passed to software written in C or Fortran, which is often required in numerical computing with Python.

- Row-major and column-major ordering are special cases of strategies for mapping the index used to address an element to the offset of that element in the array’s memory segment.

- In general, the NumPy array attribute `ndarray.strides` defines exactly how this mapping is done. 

- The strides attribute is a tuple of the same length as the number of axes (dimensions) of the array. Each value in strides is the factor by which the index for the corresponding axis is multiplied when calculating the memory offset (in bytes) for a given index expression.


Let's see how this looks:

In [43]:
arrc = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]], dtype='uint8', order='C')
arrf = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]], dtype='uint8', order='F')

In [49]:
arrc

array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23]], dtype=uint8)

In [54]:
arrc.itemsize # Each item uses 1 byte because the data type is uint8

1

The strides attribute of this array is therefore `(1x3, 1x1) = (3, 1)`, because each increment of `m` in `A[n, m]` increases the memory offset by one item, or 1 byte. Likewise, each increment of `n` increases the memory offset by three items, or 3 bytes (because the second dimension of the array has length 3).

In [55]:
arrc.strides

(3, 1)

In [56]:
'  '.join(str(x) for x in np.nditer(arrc))

'1  2  3  11  12  13  21  22  23'

In `"C"` order, elements of rows are contiguous, as expected. Let's try Fortran layout now:

In [57]:
arrf

array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23]], dtype=uint8)

In [58]:
arrf.strides

(1, 3)

In [59]:
'  '.join(str(x) for x in np.nditer(arrf))

'1  11  21  2  12  22  3  13  23'
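
The index-to-offset mapping can also be checked by hand. The following is a minimal sketch, not part of NumPy's API: `byte_offset` is a hypothetical helper that multiplies each index by the matching stride, and `ravel(order='K')` is used to read the elements in the order they sit in memory. Because the items are `uint8` (1 byte each), the byte offset doubles as a position in that flat sequence.

```python
import numpy as np

values = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
arr_c = np.array(values, dtype="uint8", order="C")
arr_f = np.array(values, dtype="uint8", order="F")


def byte_offset(arr, index):
    """Hypothetical helper: memory offset in bytes of arr[index]."""
    return sum(i * stride for i, stride in zip(index, arr.strides))


# Elements in the order in which they are laid out in memory.
memory_c = arr_c.ravel(order="K")
memory_f = arr_f.ravel(order="K")

index = (1, 2)  # second row, third column: the value 13 in both arrays
offset_c = byte_offset(arr_c, index)  # 1*3 + 2*1 = 5
offset_f = byte_offset(arr_f, index)  # 1*1 + 2*3 = 7

print(offset_c, memory_c[offset_c])  # 5 13
print(offset_f, memory_f[offset_f])  # 7 13
```

The same index expression lands on the same value in both layouts. Only the offset at which that value is stored differs.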

- Using strides to describe the mapping from array index to memory offset is clever because it can describe different mapping strategies. Many common operations on arrays, such as the transpose, can be implemented by simply changing the strides attribute, which eliminates the need to move data around in memory.

- Operations that only require changing the strides attribute result in new ndarray objects that refer to the same data as the original array. Such arrays are called views. 

- For efficiency, NumPy strives to create views rather than copies when applying operations on arrays. This is generally a good thing, but it is important to be aware that some array operations result in views rather than new independent arrays, because modifying their data also modifies the data of the original array.
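
The behavior of views can be seen with the transpose. The following is a minimal sketch with illustrative names and an explicit `int64` data type chosen so that the strides are predictable. It shows that the transpose only swaps the strides, shares memory with the original array, and therefore propagates modifications, whereas an explicit `copy()` does not.

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

# The transpose is a view: same data, swapped strides.
transposed = original.T
print(original.strides)    # (24, 8)
print(transposed.strides)  # (8, 24)
print(np.shares_memory(original, transposed))  # True

# Writing through the view changes the original array too.
transposed[0, 1] = 99
print(original[1, 0])  # 99

# An explicit copy owns its data, so the original is unaffected.
independent = original.T.copy()
independent[0, 0] = -1
print(original[0, 0])  # 1
```

When an independent array is needed, calling `copy()` on the result makes that intent explicit.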

## Creating Arrays

### Arrays Created from Lists and Other Array-like Objects

### Arrays Filled with Constant Values

### Arrays with Incremental Sequences

### Arrays Filled with Logarithmic Sequences

### Meshgrid Arrays

### Creating Uninitialized Arrays

### Creating Arrays with Properties of Other Arrays

### Creating Matrix Arrays

## Indexing and Slicing

### One-Dimensional Arrays

### Multidimensional Arrays

### Copies or Views of Objects

### Broadcasting

### Fancy Indexing and Boolean-Valued Index

## Reshaping and Resizing

## Vectorized Expressions

### Arithmetic Operations

### Elementwise Functions

### Aggregate Functions

### Boolean Arrays and Conditional Expressions

### Set Operations

### Operations on Arrays

## Structured Arrays

## Matrix