$\LARGE\text{Data analysis with numpy and pandas - part 1}$

$\small\text{Ralph Tambala}$

## The <code>numpy</code> Basics

NumPy's main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of
the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes.

For example, the coordinates of a point in 3D space, $[1, 2, 1]$, form an array with one axis. That axis has 3 elements
in it, so we say it has a length of 3. The array in the example below has 2 axes. The first axis has a length of 2, the
second axis has a length of 3.

In [None]:
import numpy as np
arr = np.array([[ 1, 0, 0], [ 0, 1, 2]])
arr

NumPy's array class is called **ndarray**. It is also known by the alias array.

In [None]:
type(arr)

The more important properties of an ndarray object are: <code>ndim, shape, size, dtype</code>


- **ndarray.ndim**: the number of axes (dimensions) of the array.

In [None]:
arr.ndim

- **ndarray.shape**: the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.

In [None]:
arr.shape

- **ndarray.size**: the total number of elements of the array. This is equal to the product of the elements of shape.

In [None]:
arr.size

- **ndarray.dtype**: an object describing the type of the elements in the array. One can create or specify dtypes using standard Python types. Additionally, NumPy provides types of its own; numpy.int32, numpy.int16, and numpy.float64 are some examples.

In [None]:
arr.dtype
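
As a minimal sketch of the point about specifying dtypes, the cell below builds the same values three times with different element types. The variable names are made up for this illustration, and the dtype inferred in the first case depends on the platform, so no output is shown for it.

```python
import numpy as np

# The same values stored with three different element types.
as_default = np.array([1, 2, 3])                # integer dtype inferred from the values
as_float = np.array([1, 2, 3], dtype=float)     # the Python float type maps to numpy.float64
as_int16 = np.array([1, 2, 3], dtype=np.int16)  # one of NumPy's own types

print(as_default.dtype, as_float.dtype, as_int16.dtype)
```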

## Array Creation

There are several mechanisms for creating arrays. In this session, we are going to cover the 3 below:

>1. Conversion from other Python structures (e.g., lists, tuples)
>2. Intrinsic numpy array creation functions (e.g., arange, ones, zeros, etc.)
>3. Use of special library functions (e.g., random)

### Creating numpy arrays from lists and tuples

In general, numerical data arranged in an array-like structure in Python can be converted to arrays through the use of
the array() function. The most obvious examples are lists and tuples.

Examples:

In [None]:
ex1 = np.array([1, 2, 3, 4, 5.0])
ex1.dtype

In [None]:
ex2 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
np.array(ex2)

In [None]:
np.array(ex2).dtype

In [None]:
ex3 = ((1, 1), (1, 1))
np.array(ex3)
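
The conversion also decides the element type. A small sketch (with illustrative variable names) of how a single float in the input makes the whole array floating point, and how a tuple converts exactly like a list:

```python
import numpy as np

ints_only = np.array([1, 2, 3])     # all integers, so an integer dtype is inferred
mixed = np.array([1, 2, 3.0])       # one float is enough to make every element a float
from_tuple = np.array((1, 2, 3))    # tuples are converted the same way as lists

print(mixed.dtype)   # float64
print(mixed)         # [1. 2. 3.]
```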

### Intrinsic numpy array creation
NumPy has built-in functions for creating arrays from scratch:

#### np.zeros(shape) 
**np.zeros(shape)** will create an array filled with 0 values with the specified shape. The default dtype is float64.
> NB: **shape** is a tuple of dimensions of the array

In [None]:
ex5 = np.zeros((2, 4))
ex5

In [None]:
ex5.dtype

#### np.ones(shape) 
**np.ones(shape)** will create an array filled with 1 values. It is identical to zeros in all other respects.

In [None]:
np.ones((2, 4))

In [None]:
np.ones(10, dtype=int) # pass dtype to get integers instead of the default floats

#### np.arange()
**np.arange()** will create arrays with regularly incrementing values.

In [None]:
np.arange(10) # stop (10) is excluded; default start is 0, default step is 1

In [None]:
np.arange(1, 11) # one can change default start value

In [None]:
np.arange(1, 10, 2) # start, stop (excluded), step
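
To summarise the three call forms, here is a short sketch; the variable names are only for illustration. The stop value is never part of the result, and a negative step counts downwards.

```python
import numpy as np

up_to_five = np.arange(5)          # start defaults to 0, stop (5) is excluded
two_to_five = np.arange(2, 5)      # explicit start, default step of 1
counting_down = np.arange(10, 0, -3)  # a negative step counts down towards stop

print(up_to_five.tolist())     # [0, 1, 2, 3, 4]
print(two_to_five.tolist())    # [2, 3, 4]
print(counting_down.tolist())  # [10, 7, 4, 1]
```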

#### np.linspace()
**np.linspace()** will create arrays with a specified number of elements, and spaced equally between the specified beginning
and end values.

In [None]:
np.linspace(1, 10, 25)
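
Unlike arange, linspace includes the end value by default. The sketch below uses the `retstep` and `endpoint` parameters of `np.linspace` to make the spacing visible; the variable names are illustrative.

```python
import numpy as np

# Five points from 0 to 1, both ends included; retstep=True also returns the spacing.
points, step = np.linspace(0, 1, 5, retstep=True)
print(points.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(step)              # 0.25

# endpoint=False leaves the end value out, as arange does.
open_ended = np.linspace(0, 1, 4, endpoint=False)
print(open_ended.tolist())   # [0.0, 0.25, 0.5, 0.75]
```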

#### np.eye()
**np.eye()** creates an identity matrix: ones on the main diagonal and zeros everywhere else.

In [None]:
np.eye(4)

### Create numpy arrays using special libraries

In [None]:
np.random.choice((1,2,3,4,5,6), size=3, replace=True, p=None)

In [None]:
np.random.binomial(5, 0.5, size=10)
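
The two calls above give different results on every run. One way to make them repeatable, sketched here as an alternative and not as the approach used in this notebook, is a seeded generator from `np.random.default_rng`. The seed 42 and the variable names are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)   # arbitrary seed; the same seed gives the same draws

# Three rolls of a fair die: sample with replacement from the six faces.
rolls = rng.choice([1, 2, 3, 4, 5, 6], size=3, replace=True)

# Ten experiments, each counting the successes in 5 trials with probability 0.5.
heads = rng.binomial(5, 0.5, size=10)

print(rolls)
print(heads)
```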

## Shape Manipulation

### ndarray.ravel() 
**ndarray.ravel()** returns a flattened (1-D) version of the array; the original array keeps its shape.

In [None]:
arr = np.random.random_sample((3,3))
arr

In [None]:
arr.shape

In [None]:
arr.ravel()

In [None]:
arr.shape # still maintains original shape
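
One detail worth knowing: for an array laid out contiguously in memory, ravel() returns a view on the same data, whereas flatten() always returns a copy. A small sketch with illustrative names:

```python
import numpy as np

grid = np.arange(6).reshape((2, 3))
flat_view = grid.ravel()     # a view here, because grid is contiguous in memory
flat_copy = grid.flatten()   # always an independent copy

flat_view[0] = 99            # writes through to grid
flat_copy[1] = -1            # does not touch grid

print(grid.tolist())         # [[99, 1, 2], [3, 4, 5]]
print(grid.shape)            # (2, 3)
```

If the flattened result is going to be modified, flatten() or an explicit copy() is the safer choice.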

### ndarray.reshape(shape)
**ndarray.reshape(shape)** returns an array with the same data arranged in the new shape; the original array is left unchanged:

In [None]:
arr = np.ones((4, 4), dtype='int32')
arr

In [None]:
arr.reshape((2, 8))

In [None]:
arr.shape
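
The new shape must hold exactly the same number of elements. As a short sketch (names are illustrative), one dimension can be given as -1 so that NumPy works it out from the array size, and an incompatible shape raises a ValueError:

```python
import numpy as np

values = np.arange(12)

two_rows = values.reshape((2, -1))     # -1 is inferred as 6
three_cols = values.reshape((-1, 3))   # -1 is inferred as 4
print(two_rows.shape, three_cols.shape)   # (2, 6) (4, 3)
print(values.shape)                       # (12,) the original is unchanged

try:
    values.reshape((5, 3))             # 15 slots cannot hold 12 elements
except ValueError as err:
    print("reshape failed:", err)
```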

### ndarray.resize(shape)
**ndarray.resize(shape)** method modifies the array itself:

In [None]:
arr = np.zeros((2, 4), dtype='int32')
arr

In [None]:
arr.resize((4,2))

In [None]:
arr

## Matrix Operations

### scalar arithmetic
Any arithmetic operation between a scalar and a numpy array is applied element by element and results in an array, much like multiplying a matrix by a scalar in maths.

In [None]:
arr = np.eye(3, dtype=int)
arr

In [None]:
arr = arr * 2
arr

In [None]:
arr = arr ** 3
arr

In [None]:
arr + arr / 4 + 4
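
The element type of the result depends on the operation. A minimal sketch, using an illustrative array `m`: true division always produces floats, even when the array holds integers, while multiplication and floor division keep an integer type.

```python
import numpy as np

m = np.eye(2, dtype=int)

doubled = m * 2      # integer array times integer scalar stays integer
quartered = m / 4    # true division gives float64
floored = m // 4     # floor division keeps an integer type
shifted = m + 10     # the scalar is added to every element

print(doubled.dtype, quartered.dtype, floored.dtype)
print(shifted.tolist())   # [[11, 10], [10, 11]]
```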

### ndarray.T or ndarray.transpose()
**ndarray.T** returns the transposed array. Calling **ndarray.transpose()** without arguments gives the same result.

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
arr

In [None]:
arr.T

In [None]:
arr.transpose()

### dot product vs cross product vs element-wise product


In [None]:
# dot product
p = np.array([1, 2, 3])
q = np.array([4, 5, 6])
np.dot(p, q)

In [None]:
p.dot(q)

In [None]:
p @ q

In [None]:
# cross product
np.cross(p, q)

In [None]:
# element-wise product
p * q
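
For 2-D arrays the difference between `*` and `@` matters even more: `*` multiplies matching elements, while `@` is the matrix product. The sketch below repeats the vectors p and q from above and adds two small matrices, A and B, introduced only for this illustration.

```python
import numpy as np

p = np.array([1, 2, 3])
q = np.array([4, 5, 6])
print(np.dot(p, q))             # 32, i.e. 1*4 + 2*5 + 3*6
print(np.cross(p, q).tolist())  # [-3, 6, -3]
print((p * q).tolist())         # [4, 10, 18]

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
print((A * B).tolist())   # element-wise product: [[0, 2], [3, 0]]
print((A @ B).tolist())   # matrix product: [[2, 1], [4, 3]]
```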

## Indexing with Boolean arrays and other tricks

In [None]:
a = np.arange(12)**2 # the first 12 square numbers
i = np.array( [1,1,3,8,5] ) # an array of indices
a[i]

In [None]:
j = np.array( [[3, 4], [9, 1]] ) # a bidimensional array of indices
a[j] # the same shape as j

In [None]:
a = np.arange(3, 100, 3).reshape((3,11)) # numbers divisible by 3 below 100
a

In [None]:
a[:, 10]

In [None]:
a[1:, 1:5]

In [None]:
a > 50

In [None]:
a[a > 65]
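
Boolean arrays can be combined before they are used as an index. A short sketch on the same array: `&` means "and", `|` means "or", and each comparison needs its own parentheses because these operators bind more tightly than `>` or `==`. The name `mask` is illustrative.

```python
import numpy as np

a = np.arange(3, 100, 3).reshape((3, 11))

# Even multiples of 3 that are greater than 50.
mask = (a > 50) & (a % 2 == 0)

print(a[mask].tolist())   # [54, 60, 66, 72, 78, 84, 90, 96]
print(mask.sum())         # 8, the number of True entries
```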

In [None]:
a = np.ones((5,5), int)
a

In [None]:
a[:, 4] = 8 # change only last column to 8s
a

In [None]:
a[:2, :3] = 0 
a[3:, :3] = 0
a
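
Assignment works through a Boolean index too, which is a compact way to change only the elements that meet a condition. A minimal sketch with made-up data:

```python
import numpy as np

scores = np.array([12, -3, 7, -8, 5])

clipped = scores.copy()       # work on a copy so scores stays intact
clipped[clipped < 0] = 0      # replace every negative value with 0

print(clipped.tolist())   # [12, 0, 7, 0, 5]
print(scores.tolist())    # [12, -3, 7, -8, 5]
```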

## Practice Work

1. Import the numpy package under the alias np.

2. Write code to create arrays matching each of the matrices below:

> a. $$\begin{pmatrix}1 & 2 & 3 \end{pmatrix}$$

In [None]:
# a

> b. $$\begin{pmatrix}1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$$

In [None]:
# b

> c. $$\begin{pmatrix}1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$

In [None]:
# c

> d. $$\begin{pmatrix}1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\  0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$

In [None]:
# d

> e. $$\begin{pmatrix} 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2\\ 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 \end{pmatrix}$$

In [None]:
# e

3. Create an identity matrix of dimension 100 by 100.

4. Create a 1-D array of the even values from 2 to 2000000.

5. Create a 1-D array of 200 values between 1 and 10.

6. Create a 10 by 1000 matrix of the counting values 1-10000. The first row should have the values 1, 2, ..., 1000, the second should have the values 1001, 1002, ..., 2000, and so on up to the 10th row, which should have the values 9001, 9002, ..., 10000. Below is a visual representation of the aforementioned matrix.

$$\begin{pmatrix}
1 & 2 & 3 & & ... & & 999 & 1000 \\ 
1001 & 1002 & 1003 & & ... & & 1999 & 2000 \\ 
2001 & 2002 & 2003 & & ... & & 2999 & 3000 \\
... & ... & ... & ... & ... & ... & ... & ... \\ 
8001 & 8002 & 8003 & & ... & & 8999 & 9000 \\
9001 & 9002 & 9003 & & ... & & 9999 & 10000 \\
\end{pmatrix}$$

7. Flatten the matrix in **question 6**.

8. Reshape the flattened matrix from **question 7** to a 100 by 100 matrix.

9. Extract all odd numbers from <code>arr</code>

In [None]:
arr = np.array([12, 11, 7, 4, 19, 62, 71, 10, 81, 100])

# code here

Output: <code>array([11, 7, 19, 71, 81])</code>

10. Find out how each of the following numpy functions is used and give an example illustrating its usage:

> a. np.full()

> b. np.tile()

> c. np.vstack()

> d. np.hstack()