# Python Basics: Introduction to Numpy and Datetime

## Numpy

Numpy (short for Numerical Python) offers a wide range of numerical operations. Its main object is the numpy array.


In [None]:
import numpy as np

### Numpy arrays: Basics

Numpy arrays can be interpreted as vectors, and multidimensional arrays then correspond to matrices. Several
numpy operations also allow for "reshaping" an array into a different shape.

In [None]:
# One dimensional numpy array

data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)

arr1

Note that a one-dimensional numpy array has shape (n,) and is NOT an n x 1 vector. This can be quite confusing
when doing vector arithmetic. Making a vector explicitly n x 1 in the mathematical sense requires
us to add a dimension.

In [None]:
data1 = [[6], [7.5], [8], [0], [1]]
arr1 = np.array(data1)

arr1

or with another method

In [None]:
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)[:, np.newaxis]

arr1

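`np.newaxis` is not a function but an alias for `None`. Used inside an index expression, it inserts a new axis of length 1 and thereby increases the dimension of an array by 1.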
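To make the difference between a flat array and a column vector concrete, the following sketch compares their shapes and shows how they behave under transposition and subtraction. The names `flat`, `col` and `diff` are illustrative only.

```python
import numpy as np

flat = np.array([6, 7.5, 8, 0, 1])   # shape (5,): a flat array
col = flat[:, np.newaxis]            # shape (5, 1): an explicit column vector

print(flat.shape, col.shape)

# Transposing a flat array changes nothing, while a column becomes a row.
print(flat.T.shape, col.T.shape)

# Broadcasting pitfall: column minus flat array yields a 5x5 matrix of
# pairwise differences, not an element-wise difference.
diff = col - flat
print(diff.shape)
```

Checking `.shape` before doing vector arithmetic is a cheap way to catch this kind of silent broadcasting.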

In [None]:
# Multi dimension
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)

In [None]:
# data generation
arr_random = np.random.randn(5, 4)      # generating random numbers
arr_ones = np.ones((2, 4))              # generate array with ones
arr_zeros = np.zeros((3, 4))            # generate array with zeros

Generating zeros, or simply random elements as above, is a way of pre-allocating memory. This is particularly
important when working with large datasets.
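As a minimal sketch of what pre-allocation looks like in practice, the array below is created once and then filled in place. The number of steps and the update rule are placeholders chosen purely for illustration.

```python
import numpy as np

n_steps = 5                       # hypothetical number of periods
path = np.zeros(n_steps)          # pre-allocate the result array once

# Fill the pre-allocated array in place instead of growing a list step by step.
for t in range(1, n_steps):
    path[t] = path[t - 1] + 1.0   # placeholder update rule

print(path)
```

If every element is overwritten anyway, `np.empty` can be used instead of `np.zeros`; it allocates the memory without initializing the values.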

Vectors can also be reshaped into a matrix, provided the number of elements matches. This is mostly used in multivariate time series analysis and
high-dimensional data analysis.


In [None]:
vector = np.arange(0,10)
vector

In [None]:
matrix = np.reshape(vector, (2,5))
matrix
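A short sketch of two details that often matter when reshaping: letting numpy infer one dimension with `-1`, and what happens if the requested shape does not fit the number of elements. The name `flat_again` is illustrative.

```python
import numpy as np

vector = np.arange(0, 10)            # 10 elements: 0, 1, ..., 9

# -1 lets numpy infer the missing dimension (here: 5 columns).
matrix = vector.reshape(2, -1)
print(matrix.shape)

# Reshaping only works if the total number of elements matches.
try:
    vector.reshape(3, 4)             # 12 slots for 10 elements
except ValueError as err:
    print("reshape failed:", err)

# ravel() flattens the matrix back into a one-dimensional array.
flat_again = matrix.ravel()
print(flat_again.shape)
```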


### Arithmetics with Arrays

#### Basic functions

In [None]:
# A 2x3 array; the operations below combine arrays of equal shape
arr = np.array([[1., 2., 3.], [4., 5., 6.]])

In [None]:
# Basic operations are applied component-wise
arr * arr

In [None]:
arr + arr

In [None]:
1/arr

In [None]:
arr**2

In [None]:
# Comparison
arr2 = np.array([[1., 33., 1.], [4., 321., 6.]])
arr < arr2
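The comparison above returns a boolean array of the same shape. As a small sketch, such a result can be summarised by counting or by asking whether any or all entries are true. The names `mask` and `n_smaller` are illustrative.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr2 = np.array([[1., 33., 1.], [4., 321., 6.]])

mask = arr < arr2               # boolean array with the same shape as arr
print(mask)

n_smaller = mask.sum()          # True counts as 1, so this counts the matches
print(n_smaller)                # 2
print(mask.any(), mask.all())   # True False
```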

#### Some Statistical Methods

In [None]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])

In [None]:
# Basic sum function
arr.sum()

In [None]:
arr.sum(axis=0)

In [None]:
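# Mean, Variance, Standard Deviation
# (a notebook cell only displays its last expression, so print each result)
print(arr.mean(axis=1))   # mean of each row
print(arr.std())          # standard deviation over all elements
print(arr.var())          # variance over all elements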

In [None]:
# Cumulative sum
arr.cumsum()
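The `axis` argument is easy to mix up, so here is a small sketch of what it does on the 2x3 array used above: `axis=0` collapses the rows, `axis=1` collapses the columns. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)

col_sums = arr.sum(axis=0)     # collapse rows    -> one value per column
row_means = arr.mean(axis=1)   # collapse columns -> one value per row
running = arr.cumsum(axis=1)   # running total along each row

print(col_sums)    # [5. 7. 9.]
print(row_means)   # [2. 5.]
print(running)

# std() and var() default to the population version (ddof=0);
# ddof=1 gives the sample version.
sample_var = arr.var(ddof=1)
print(sample_var)
```

Without an `axis` argument, these methods operate on the flattened array.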

#### Some Linear Algebra

In [None]:
from numpy.linalg import eig, inv, solve

x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[6., 23.], [-1, 7], [8, 9]])
A = np.array([[1., 2., 3.], [4., 5., 6.], [7., 3., 4.]])
b = np.array([1, 3, 4])

In [None]:
# Matrix multiplication
x.dot(y)

In [None]:
np.dot(x, y)
# for chains of several matrix products, see np.linalg.multi_dot

In [None]:
# Or use the @ operator

x @ y
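The three spellings of the matrix product are interchangeable for 2-D arrays. The sketch below checks this and also shows `numpy.linalg.multi_dot` for a chain of products; the name `chained` and the choice of chain are illustrative.

```python
import numpy as np
from numpy.linalg import multi_dot

x = np.array([[1., 2., 3.], [4., 5., 6.]])        # shape (2, 3)
y = np.array([[6., 23.], [-1., 7.], [8., 9.]])    # shape (3, 2)

# The three spellings of the matrix product agree.
product = x @ y
assert np.allclose(x.dot(y), product)
assert np.allclose(np.dot(x, y), product)

# multi_dot chains several products and chooses an efficient evaluation order.
chained = multi_dot([x, y, x])                    # (2, 3) @ (3, 2) @ (2, 3)
assert np.allclose(chained, x @ y @ x)
print(chained.shape)
```

Note that `*` remains component-wise; only `@`, `dot` and `multi_dot` perform matrix multiplication.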

In [None]:
# Invert a matrix
inv(A)

In [None]:
# solve linear system Ax=b
solve(A,b)

In [None]:
# Eigenvalue decomposition
eig_val, eig_vec = eig(A)
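A quick way to build trust in these routines is to plug the results back into their defining equations. The sketch below does this for `solve`, `inv` and `eig`; the name `x_sol` is illustrative.

```python
import numpy as np
from numpy.linalg import eig, inv, solve

A = np.array([[1., 2., 3.], [4., 5., 6.], [7., 3., 4.]])
b = np.array([1., 3., 4.])

# The solution of Ax = b must reproduce b when multiplied by A.
x_sol = solve(A, b)
assert np.allclose(A @ x_sol, b)

# Multiplying by the inverse gives the same solution, but solve() is
# generally preferred over forming the inverse explicitly.
assert np.allclose(inv(A) @ b, x_sol)

# Each column of eig_vec is an eigenvector: A v = lambda v.
eig_val, eig_vec = eig(A)
for i in range(len(eig_val)):
    v = eig_vec[:, i]
    assert np.allclose(A @ v, eig_val[i] * v)
```

The eigenvectors are stored column-wise, so `eig_vec[:, i]` belongs to `eig_val[i]`.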


## Datetime

Finally, the datetime module. Since financial data is inevitably tied to a time component, understanding
the mechanics of datetime is crucial. The datetime module allows for consistent arithmetic with
time data.

In [None]:
import datetime

### Datetime Objects

Datetime objects basically consist of two parts, i.e. date and time. Date covers the year, month
and day, whereas time covers hours, minutes, seconds and microseconds. Hence, which components are used
depends on the frequency of the data (quarterly, high-frequency, etc.).

#### Date

Date consists of year, month and day.

In [None]:
# instantiate a date object

bday = datetime.date(2021, 12, 14)

In [None]:
# Access the components of a date object through its attributes

In [None]:
bday.year

In [None]:
bday.month

In [None]:
bday.day

In [None]:
# return the weekday
# (a notebook cell only displays its last expression, so print both results)
tday = datetime.date.today()
print(tday.weekday())          # standard format monday=0, sunday=6
print(tday.isoweekday())       # ISO format monday=1, sunday=7
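Because `today()` changes from run to run, a fixed date makes the two weekday conventions easier to compare. The sketch below uses the date from the example above; the `is_weekend` flag is an illustrative addition.

```python
import datetime

bday = datetime.date(2021, 12, 14)    # a Tuesday

print(bday.weekday())       # 1  (Monday=0 convention)
print(bday.isoweekday())    # 2  (Monday=1 convention)

# With the Monday=0 convention, Saturday and Sunday are 5 and 6.
is_weekend = bday.weekday() >= 5
print(is_weekend)           # False
```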

#### Time

Precision up to microseconds.

In [None]:
t = datetime.time(3, 59, 30, 1234)

In [None]:
t.hour

In [None]:
t.minute

In [None]:
t.second

In [None]:
t.microsecond

#### Datetime

A datetime object combines date and time.

In [None]:
dt = datetime.datetime(2020, 11, 4, 14, 10, 30)
print(dt)

dt.year
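To illustrate how the three object types relate, this sketch builds a datetime from a separate date and time with `datetime.datetime.combine` and then splits it again. The names `d` and `t` are illustrative.

```python
import datetime

d = datetime.date(2020, 11, 4)
t = datetime.time(14, 10, 30)

# Combine a date and a time into one datetime object.
dt = datetime.datetime.combine(d, t)
print(dt)

# date() and time() recover the two parts.
assert dt.date() == d
assert dt.time() == t
```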

#### Timedeltas

With datetime objects, it is now easy to compute the distance between two points in time.

In [None]:
# date + timedelta = date
timedelta = datetime.timedelta(days=7)

In [None]:
tday + timedelta

In [None]:
tday - timedelta

In [None]:
# date - date = timedelta
timedelta = bday - tday

In [None]:
timedelta.total_seconds()       # timedelta in seconds

In [None]:
timedelta.days                  # timedelta in days
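Since the result above depends on today's date, here is a sketch with two fixed dates. It also shows a common pitfall: `.days` and `.seconds` are separate components of a timedelta, whereas `total_seconds()` covers the whole span. The names `start`, `end`, `delta` and `dt_delta` are illustrative.

```python
import datetime

start = datetime.date(2020, 11, 4)
end = datetime.date(2021, 12, 14)

delta = end - start
print(delta.days)               # 405
print(delta.total_seconds())    # 34992000.0

# For datetime differences, .days holds only whole days and .seconds
# the remainder within the last day.
dt_delta = (datetime.datetime(2020, 11, 4, 14, 10, 30)
            - datetime.datetime(2020, 11, 4, 12, 0, 0))
print(dt_delta.days, dt_delta.seconds)   # 0 7830
```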


#### Datetime conversions

In [None]:
# String to datetime

dt_string = 'November 04, 2020'
dt = datetime.datetime.strptime(dt_string, '%B %d, %Y')

# for formatting codes, see: https://docs.python.org/3/library/datetime.html
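The reverse direction, datetime to string, uses `strftime` with the same formatting codes. The sketch below does a round trip; the target format `'%Y-%m-%d'` is just one possible choice. Note that `%B` matches month names in the current locale, so the example assumes an English locale.

```python
import datetime

dt_string = 'November 04, 2020'
dt = datetime.datetime.strptime(dt_string, '%B %d, %Y')

# Datetime to string
iso_string = dt.strftime('%Y-%m-%d')
print(iso_string)   # 2020-11-04

# Parsing the new string with the matching format returns the same datetime.
back = datetime.datetime.strptime(iso_string, '%Y-%m-%d')
assert back == dt

# A format that does not match the string raises a ValueError.
try:
    datetime.datetime.strptime(dt_string, '%Y-%m-%d')
except ValueError as err:
    print("parsing failed:", err)
```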

In [None]:
import pandas as pd

# three hourly timestamps; pandas >= 2.2 prefers the lowercase alias "h" over "H"
dti = pd.date_range("2018-01-01", periods=3, freq="H")