# Python Semantics:
A collection of lesser-known Python features.

### List Slicing:
Useful for extracting sublists from a list.

In [None]:
a[start:stop]  # [start, stop)  -  inclusive start index, exclusive stop index
a[start:]      # [start, ...]   -  inclusive start index to the end of the list
a[:stop]       # [..., stop)    -  from the start of the list to the exclusive stop index
a[:]           # a shallow copy of the whole list

a[start:stop:step]  # Optionally specify a step between elements to slice out
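
To make the slice notation above concrete, here is a minimal sketch on a made-up list. The list contents and the name `copy_of_a` are assumptions chosen for illustration; each value equals its own index so the results are easy to check.

```python
# Hypothetical sample list, chosen so each value equals its index.
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(a[2:5])    # [2, 3, 4]
print(a[7:])     # [7, 8, 9]
print(a[:3])     # [0, 1, 2]
print(a[1:8:3])  # [1, 4, 7]

# a[:] builds a new list object, so changing the copy leaves a untouched.
copy_of_a = a[:]
copy_of_a[0] = 99
print(a[0])      # 0
```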

#### Negative Indices:

In [None]:
a[-1]    # Last item
a[-2:]   # Last two items 
a[:-2]   # Everything but the last two items

a[::-1]  # Reversed copy of the list (the reverse() method instead reverses the list in place)
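
The negative-index forms can be tried on a small made-up list. This sketch also contrasts the reversing slice with `list.reverse()`; the list values and the names `reversed_copy` and `result` are assumptions for illustration.

```python
a = [10, 20, 30, 40, 50]

print(a[-1])    # 50
print(a[-2:])   # [40, 50]
print(a[:-2])   # [10, 20, 30]

# Slicing with a step of -1 returns a new reversed list; a is unchanged.
reversed_copy = a[::-1]
print(reversed_copy)  # [50, 40, 30, 20, 10]
print(a)              # [10, 20, 30, 40, 50]

# list.reverse() reverses in place and returns None.
result = a.reverse()
print(result)  # None
print(a)       # [50, 40, 30, 20, 10]
```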



#### Extracting a column (not vanilla Python):

In [None]:
import numpy as np

# Suppose we have this numpy array:
m = np.array([[11, 12, 13, 14],
              [21, 22, 23, 24],
              [31, 32, 33, 34],
              [41, 42, 43, 44],
              [51, 52, 53, 54],
              [61, 62, 63, 64],
              [71, 72, 73, 74],
              [81, 82, 83, 84],
              [91, 92, 93, 94]])

m[:,0]    # Gets us column 0: array([11, 21, 31, 41, 51, 61, 71, 81, 91])


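In general, $\texttt{m[:,n]}$ slices out the $n^{\text{th}}$ column vector of the matrix $m$, counting from 0. Note: the comma itself is valid Python syntax. Python packs the comma-separated items inside the brackets $\texttt{[ ]}$ into a tuple and passes that tuple to $\text{__}\texttt{getitem}\text{__}$. Built-in lists reject tuple keys with a $\texttt{TypeError}$, whereas packages like numpy or pytorch implement $\text{__}\texttt{getitem}\text{__}$ so that each item of the tuple indexes one dimension.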
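
To see what the brackets actually receive, the sketch below uses a hypothetical helper class `ShowKey` (not part of numpy or the standard library) whose `__getitem__` simply returns its key. It then applies the same kind of indexing to a small numpy array and to a built-in list.

```python
import numpy as np

class ShowKey:
    """Hypothetical helper that returns whatever key the brackets receive."""
    def __getitem__(self, key):
        return key

key = ShowKey()[:, 0]
print(key)  # (slice(None, None, None), 0)

# numpy interprets the tuple as one index per dimension.
m = np.arange(12).reshape(3, 4)
column = m[:, 1]
print(column)  # [1 5 9]

# A built-in list rejects the same kind of tuple key.
rows = [[1, 2], [3, 4]]
try:
    rows[:, 0]
except TypeError as err:
    print("TypeError:", err)
```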

## Vectorisation:

Vectorisation is the process of rewriting an algorithm that operates on a single value at a time so that it operates on many values at once.
- <a href="https://en.wikipedia.org/wiki/Array_programming">Array programming</a> &mdash; the application of operations on an entire set of values at once 
- CPUs and GPUs have parallelisation instructions &mdash; also called SIMD instructions (single instruction, multiple data).

In [78]:
import numpy as np
import time

a = np.random.rand(10000000)
b = np.random.rand(10000000)

# ===== Vectorised dot product =====
start = time.time()
weighted_sum = np.dot(a, b)
stop = time.time()

print("Vectorised time: {} milliseconds".format(round((stop - start) * 1000, 3)))
print("\tAnswer: {}".format(round(weighted_sum, 2)))

# ===== Non-vectorised dot product =====
start = time.time()
weighted_sum = 0
for i in range(10000000):
    weighted_sum += a[i] * b[i]
stop = time.time()

print("Non-vectorised time: {} milliseconds".format(round((stop - start) * 1000, 3)))
print("\tAnswer: {}".format(round(weighted_sum, 2)))


Vectorised time: 9.043 milliseconds
	Answer: 2500545.36
Non-vectorised time: 6140.246 milliseconds
	Answer: 2500545.36


<a href="https://numpy.org/doc/1.19/reference/routines.math.html">Numpy mathematical functions</a>
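
As a small illustration of the functions in that reference, the sketch below applies a few of them to a whole array without an explicit loop. The sample values and the name `roots` are assumptions chosen so the results are easy to verify by hand.

```python
import numpy as np

x = np.array([0.0, 1.0, 4.0, 9.0])

roots = np.sqrt(x)         # elementwise square root
print(roots)               # [0. 1. 2. 3.]
print(np.sum(x))           # 14.0
print(np.maximum(x, 2.0))  # [2. 2. 4. 9.]
```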


### Vector Operations:

#### Elementwise operations:
Any two vectors of the same length can be added, subtracted, multiplied, or divided elementwise with +, -, \*, /.

In [94]:
a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 3, 3, 3, 3])
print("a:     {}".format(a))
print("b:     {}".format(b))
print("a + b: {}".format(a + b))
print("a - b: {}".format(a - b))
print("a * b: {}".format(a * b))
print("a / b: {}".format(a / b))

a:     [1 2 3 4 5]
b:     [3 3 3 3 3]
a + b: [4 5 6 7 8]
a - b: [-2 -1  0  1  2]
a * b: [ 3  6  9 12 15]
a / b: [0.33333333 0.66666667 1.         1.33333333 1.66666667]


### Broadcasting:

#### Scalar expansion:
When a scalar is used in arithmetic operations with a vector, it is implicitly 'expanded' into a vector of the appropriate size. 

In [85]:
a = np.array([1, 2, 3, 4, 5]) + 1
print(a)

[2 3 4 5 6]
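
One way to picture scalar expansion is to build the expanded vector by hand and compare the results. This is only a conceptual sketch: the name `expanded` is an assumption, and numpy does not need to allocate such an array when it broadcasts a scalar.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# What a + 1 behaves like: the scalar stretched to the shape of a.
expanded = np.full(a.shape, 1)
print(expanded)      # [1 1 1 1 1]
print(a + expanded)  # [2 3 4 5 6]
print(np.array_equal(a + 1, a + expanded))  # True
```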


#### Dimension expansion:

When adding a vector to a matrix, numpy will attempt to broadcast them together. In the following example, the $1 \times 3$ vector will be expanded to a $3 \times 3$ matrix by duplicating the rows so that it can be directly added elementwise with the other $3 \times 3$ matrix.

In [101]:
a = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

b = np.array([
    [100, 200, 300]
])

print(a + b)

[[101 202 303]
 [104 205 306]
 [107 208 309]]
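
Broadcasting only succeeds when the shapes are compatible: comparing dimensions from the last one backwards, each pair must either be equal or contain a 1. The sketch below, using made-up arrays of ones, shows a pair of shapes that fails, (4,3) and (2,3), and one that works.

```python
import numpy as np

a = np.ones((4, 3))
b = np.ones((2, 3))

# The first dimensions are 4 and 2: not equal, and neither is 1.
try:
    a + b
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

# A (1, 3) array is compatible with (4, 3), since its first dimension is 1.
c = np.ones((1, 3))
print((a + c).shape)  # (4, 3)
```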

### Matrix Operations:

In [95]:
A = np.array([
    [56,  0,   4.4, 68 ],
    [1.2, 104, 52,  8  ],
    [1.8, 135, 99,  0.9]
])

# Specifying axis=0 sums down the rows, giving one sum per column
# Having axis=1 sums across the columns, giving one sum per row instead
column_sums = np.sum(A, axis=0)
print(column_sums)

# column_sums has shape (4,), so it is broadcast across the 3 rows of A
print(A / column_sums)

[ 59.  239.  155.4  76.9]
[[0.94915254 0.         0.02831403 0.88426528]
 [0.02033898 0.43514644 0.33462033 0.10403121]
 [0.03050847 0.56485356 0.63706564 0.01170351]]
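
Dividing by row sums instead takes one extra step, because a plain `np.sum(A, axis=1)` has shape (3,), which does not broadcast against the (3, 4) matrix. A minimal sketch, assuming we want each row to sum to 1; the names `row_sums` and `row_fractions` are illustrative.

```python
import numpy as np

A = np.array([
    [56,  0,   4.4, 68 ],
    [1.2, 104, 52,  8  ],
    [1.8, 135, 99,  0.9]
])

# keepdims=True keeps the result as a (3, 1) column, which broadcasts
# across the 4 columns of A. Without it the shape would be (3,).
row_sums = np.sum(A, axis=1, keepdims=True)
print(row_sums.shape)  # (3, 1)

row_fractions = A / row_sums
print(np.allclose(row_fractions.sum(axis=1), 1.0))  # True
```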
