# Introduction to Python Container Types

Python has several **container** data types:
* String - **`str`**
* List and Tuple - **`list`**, **`tuple`**
* Dictionary - **`dict`**
* Set - **`set`**

In [2]:
s = 'String'
L = [1, 2, 3, 4, 5, 'Belagavi', [10, 20]] # List is a collection of elements of any assorted type
T = (1, 2, 3, 4, 5) # Tuple is immutable
d = {'city': 'Belagavi', 'PIN': 590018} # key - value pairs. Keys can be any hashable object, most often a string or an integer
x = {1, 2, 3, 1} # Set contains unique elements
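
The comments in the cell above mention that a tuple is immutable and that a dictionary maps keys to values. The following is a minimal sketch of both points; the small containers used here are illustrative and separate from the variables defined above.

```python
L = [1, 2, 3]
T = (1, 2, 3)

L[0] = 100                      # lists are mutable: item assignment works
print(L)

try:
    T[0] = 100                  # tuples are immutable: item assignment fails
except TypeError as err:
    tuple_error = str(err)
    print('TypeError:', tuple_error)

# Any hashable object can serve as a dict key, including a tuple ...
d = {'city': 'Belagavi', 590018: 'PIN', (1, 2): 'tuple key'}
print(d[(1, 2)])

try:
    bad = {[1, 2]: 'list key'}  # ... but a mutable list is unhashable
except TypeError as err:
    list_key_error = str(err)
    print('TypeError:', list_key_error)
```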

In [3]:
print(type(s), s)
print(type(L), L)
print(type(T), T)
print(type(d), d)
print(type(x), x)

<class 'str'> String
<class 'list'> [1, 2, 3, 4, 5, 'Belagavi', [10, 20]]
<class 'tuple'> (1, 2, 3, 4, 5)
<class 'dict'> {'city': 'Belagavi', 'PIN': 590018}
<class 'set'> {1, 2, 3}


Containers have zero or more elements; **`len()`** returns the number of elements.

In [4]:
print(len(s), s)
print(len(L), L)
print(len(T), T)
print(len(d), d)
print(len(x), x)

6 String
7 [1, 2, 3, 4, 5, 'Belagavi', [10, 20]]
5 (1, 2, 3, 4, 5)
2 {'city': 'Belagavi', 'PIN': 590018}
3 {1, 2, 3}


Individual elements of every container except **`dict`** and **`set`** can be accessed by index. The index of the initial element is **`0`**. Elements of a **`dict`** are accessed by **key**. A **`set`** is not **subscriptable**, so it is printed whole below.

In [5]:
print(s[0])
print(L[0])
print(T[0])
print(d['city'])
print(x)

S
1
1
Belagavi
{1, 2, 3}
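
Since a set has no positions, its elements are reached by membership tests or iteration rather than by index. The sketch below illustrates this with a set equal to the one defined earlier; the variable name `indexing_failed` is introduced here only for illustration.

```python
x = {1, 2, 3, 1}

indexing_failed = False
try:
    x[0]                      # sets have no positions, so indexing fails
except TypeError as err:
    indexing_failed = True
    print('TypeError:', err)

print(2 in x)                 # membership test needs no index
print(sorted(x))              # build a sorted list when an order is needed
for item in x:                # iteration visits each unique element once
    print(item)
```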


Indexing also works backward from the end. The index of the last element in a subscriptable container is **`-1`**, and that of the last but one is **`-2`**.

In [6]:
print(s[-1])
print(L[-1])
print(T[-1])

g
[10, 20]
5


Multiple elements of a subscriptable container can be accessed by **slicing**, in both the forward and backward directions. A slice is written with two indices and runs from the start index up to, but not including, the stop index.

In [7]:
print(s[0:3]) # start index 0, up to but not including index 3 = 0, 1, 2
print(L[0:3])
print(T[0:3])

Str
[1, 2, 3]
(1, 2, 3)


In [8]:
print(s[0:-1]) # index 0 up to but not including the last element = 0 to 4
print(s[0:7])  # stop index past the end is clamped to len(s) = 6. Whole string

Strin
String
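
The stop index `7` above lies beyond the end of the string, yet no error is raised. A short sketch of the difference between slicing and indexing with out-of-range values follows; the name `index_error_raised` is used here only for illustration.

```python
s = 'String'

whole = s[0:100]      # a stop index beyond the end is clamped to len(s)
empty = s[10:20]      # a slice entirely outside the string is simply empty
print(repr(whole), repr(empty))

index_error_raised = False
try:
    s[10]             # a single out-of-range index is an error
except IndexError as err:
    index_error_raised = True
    print('IndexError:', err)
```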


A slice can take a third value, the step. With a step other than 1, elements are skipped.

In [9]:
print(s, s[0:7:2])
print(s, s[0:7:3])

String Srn
String Si


When the start index is left out, it defaults to **`0`**. When the stop index is left out, it defaults to the index of the last element + 1, that is, the length of the container. These defaults apply when the step is positive.

In [10]:
print(s[0:])
print(s[:7])
print(s[:])

String
String
String


Negative indices and negative steps can be combined in a slice, for example to take the last few elements or to reverse a sequence.

In [11]:
print(s[-3:])  # last 3 elements
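print(s[::-1]) # reversed: with step -1 the slice starts at the last element and runs past the initial one
print(s[-1:-7:-1]) # same result with explicit start index -1 and stop index -7
print(s[::-2]) # every second element, starting from the last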

ing
gnirtS
gnirtS
git
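
One way to see which indices a slice actually visits is the built-in `slice` object and its `indices()` method, which resolves omitted and negative values for a container of a given length. The sketch below is an illustration using the same string; the helper `visited` is a hypothetical name introduced here.

```python
s = 'String'
n = len(s)

def visited(sl):
    """Return the list of indices that the slice object sl visits in s."""
    return list(range(*sl.indices(n)))

print(slice(None, None, -1).indices(n))   # resolved (start, stop, step)
print(visited(slice(-3, None)))           # s[-3:]
print(visited(slice(None, None, -1)))     # s[::-1]
print(visited(slice(-1, -7, -1)))         # s[-1:-7:-1]
print(visited(slice(None, None, -2)))     # s[::-2]
```

The resolved stop value of `-1` for a negative step means "one position before index 0", which is why the initial element is still included.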


All the above indexing and slicing techniques also work with lists and tuples.

In [12]:
print(L[::-1])

[[10, 20], 'Belagavi', 5, 4, 3, 2, 1]


Python's core container types do not include an **array**. (The standard library's `array` module provides only a basic one-dimensional typed array.) An array holds elements that are **all of the same type**, stored in contiguous memory. For this reason, the address of any element can be computed from the address of the initial element and the size of one element. An array is therefore more efficient than a list for storing many elements of the same type.
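
The address arithmetic described above can be sketched with the standard library's `array` module, before NumPy is introduced. This is only an illustration: reading memory through `ctypes` as done here relies on CPython behaviour, and the variable names are chosen for this example.

```python
import ctypes
from array import array

arr = array('d', [1.5, 2.5, 3.5, 4.5])     # 'd' = C double, all elements the same type
base_address, count = arr.buffer_info()     # address of the initial element, number of elements
item_size = arr.itemsize                    # size of one element in bytes

index = 2
element_address = base_address + index * item_size    # address of element 2

# Read the double stored at the computed address and compare it with arr[2]
value = ctypes.c_double.from_address(element_address).value
print(value, arr[index])

# Contiguous storage: the total size is the element count times the element size
print(len(arr.tobytes()) == count * item_size)
```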

# NumPy and the `ndarray` Type

**NumPy** adds the **n-dimensional array** data type, **`ndarray`**, to Python. To use **NumPy**, one must **`import`** the **`numpy`** package; it is customary to give it the short alias **`np`**. The basic container operations, such as **`len()`**, indexing and slicing, also apply to an **`ndarray`**. An array is created with the **`numpy.array()`** function, which takes a valid container as input and creates an array from its elements.

In [13]:
import numpy as np

a = np.array([1, 2, 3, 4, 5])
print(type(a)) # type numpy.ndarray
print(len(a))  # number of elements
print(a[0])    # initial element
print(a[-1])   # last element
print(a[::2])  # every second element, starting from the initial element
print(a[::-1]) # Reverse the array

<class 'numpy.ndarray'>
5
1
5
[1 3 5]
[5 4 3 2 1]


## Properties of `ndarray`

In addition to length, a NumPy array has other properties.

In [14]:
print('ndim', a.ndim)    # number of dimensions
print('shape', a.shape) # tuple with size along each dimension
print('size', a.size)   # total number of elements in the array
print('dtype', a.dtype) # data type of each element, inferred automatically (default integer width depends on the platform)

ndim 1
shape (5,)
size 5
dtype int32
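
The properties above are related: the memory used by the elements is the number of elements times the size of one element. The sketch below shows this with the `itemsize` and `nbytes` attributes. The dtype is fixed to `np.int64` here as an assumption, so that the numbers do not depend on the platform's default integer type.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=np.int64)   # explicit dtype, independent of platform

print('itemsize', a.itemsize)   # bytes per element
print('nbytes', a.nbytes)       # bytes used by all elements
print(a.nbytes == a.size * a.itemsize)
```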


In [15]:
b = np.array([1, 2, 3, 4, 5.])
print('dtype', b.dtype)

dtype float64


In [16]:
c = np.array([1, 2, 3, 4, 5], dtype=float)
print('dtype', c.dtype)

dtype float64


## Two-Dimensional Arrays

Let us first create a two-dimensional list with 2 rows and 5 columns, then convert it into a two-dimensional array.

In [17]:
L = [ [1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
print(len(L), L)
print('Index 0:', L[0])
print('Index 1:', L[1])
print(len(L[0]))
print(len(L[1]))

2 [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
Index 0: [1, 2, 3, 4, 5]
Index 1: [6, 7, 8, 9, 10]
5
5


In [18]:
print(L[0][0])

1


In [19]:
x = np.array(L)
print(len(x))
print(x.shape)
print(x.size)
print(x.ndim)
print(x.dtype)

2
(2, 5)
10
2
int32


In [20]:
y = np.array([[1, 2, 3, 4], [5, 6, 7]], dtype=object) # rows of unequal length: an array of lists, not of numbers; recent NumPy requires dtype=object here

In [21]:
print(type(y))
print(y.shape)
print(y.dtype)
print(type(y[0]))

<class 'numpy.ndarray'>
(2,)
object
<class 'list'>


In [22]:
z = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=float)
print(z.dtype)

float64
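
Rows of unequal length, as in the array `y` above, do not produce a numeric two-dimensional array. One way to catch this before conversion is to check that all rows have the same length. The helper `is_rectangular` below is a hypothetical name introduced for this sketch, not a NumPy function.

```python
import numpy as np

def is_rectangular(rows):
    """Return True if every row in a list of lists has the same length."""
    return len({len(row) for row in rows}) <= 1

ragged = [[1, 2, 3, 4], [5, 6, 7]]
regular = [[1, 2, 3, 4], [5, 6, 7, 8]]

print(is_rectangular(ragged))
print(is_rectangular(regular))

if is_rectangular(regular):
    z = np.array(regular, dtype=float)   # safe to convert: a true 2 x 4 array
    print(z.shape, z.dtype)
```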


## Array Indexing

In [23]:
x = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])

def array_prop(a):
    print(type(a))
    print(a.ndim)
    print(a.shape)
    print(a.size)
    print(a.dtype)

array_prop(x)

<class 'numpy.ndarray'>
2
(3, 5)
15
int32


In [24]:
print(x[0]) # Row 0
print(x[1]) # Row 1
print(x[2]) # Row 2

[1 2 3 4 5]
[ 6  7  8  9 10]
[11 12 13 14 15]


In [25]:
print(x[0,0])
print(x[0,1])
print(x[2,4])
print(x[-1,-1])

1
2
15
15


## Array Slicing

In [26]:
print(x[0:2, 0:3])

[[1 2 3]
 [6 7 8]]


In [27]:
print(x)

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]]
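
A few more slicing patterns may be useful: selecting a whole column, using a step along each axis, and the fact that a basic slice is a view that shares memory with the original array. The sketch below rebuilds the same 3 x 5 array so that it can be run on its own; the names `block` and `independent` are chosen for this example.

```python
import numpy as np

x = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])

print(x[:, 1])        # column 1, all rows
print(x[1, :])        # row 1, all columns
print(x[::2, ::2])    # every second row and every second column

block = x[0:2, 0:3]   # a basic slice is a view, not a copy
block[0, 0] = 100     # writing through the view changes x as well
print(x[0, 0])

independent = x[0:2, 0:3].copy()   # copy() gives separate storage
independent[0, 0] = -1             # x is not affected
print(x[0, 0])
```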


## Operations on Arrays

Arrays support the transpose, addition, subtraction, multiplication by a scalar, elementwise multiplication (**`*`**), and matrix multiplication (**`@`** or **`np.dot()`**).

In [28]:
print(x.T)

[[ 1  6 11]
 [ 2  7 12]
 [ 3  8 13]
 [ 4  9 14]
 [ 5 10 15]]


In [29]:
a = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
b = np.array([[11, 12, 13, 14, 15], [16, 17, 18, 19, 20]])
print(a)

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]


In [30]:
print(b)

[[11 12 13 14 15]
 [16 17 18 19 20]]


In [31]:
print(a + b)

[[12 14 16 18 20]
 [22 24 26 28 30]]


In [32]:
print(b - a)

[[10 10 10 10 10]
 [10 10 10 10 10]]


In [33]:
print(2 * a)

[[ 2  4  6  8 10]
 [12 14 16 18 20]]


In [34]:
print(a * b)

[[ 11  24  39  56  75]
 [ 96 119 144 171 200]]


In [35]:
print(a @ b.T)

[[205 280]
 [530 730]]


In [36]:
print(np.dot(a, b.T))

[[205 280]
 [530 730]]
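
The transpose in `a @ b.T` is needed because matrix multiplication requires the number of columns of the left operand to equal the number of rows of the right operand, whereas elementwise operations require matching shapes. A short sketch with the same two arrays follows; the flag `mismatch_raised` is introduced only for this example.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])            # shape (2, 5)
b = np.array([[11, 12, 13, 14, 15], [16, 17, 18, 19, 20]])   # shape (2, 5)

print((a * b).shape)      # elementwise: (2, 5) with (2, 5) -> (2, 5)
print((a @ b.T).shape)    # matrix product: (2, 5) @ (5, 2) -> (2, 2)
print((a.T @ b).shape)    # matrix product: (5, 2) @ (2, 5) -> (5, 5)

mismatch_raised = False
try:
    a @ b                 # (2, 5) @ (2, 5): inner dimensions 5 and 2 differ
except ValueError as err:
    mismatch_raised = True
    print('ValueError:', err)
```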


In [41]:
a = np.arange(1, 13) # arange() already returns an ndarray
print(a)
a = a.reshape((3, 4))
print(a)
a = a.reshape((6, 2))
print(a)

[ 1  2  3  4  5  6  7  8  9 10 11 12]
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
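
Reshaping keeps the total number of elements unchanged, so the product of the new dimensions must equal `a.size`. One dimension may be given as `-1`, in which case NumPy infers it. A minimal sketch follows; the flag `bad_shape_raised` is a name used only here.

```python
import numpy as np

a = np.arange(1, 13)              # 12 elements

print(a.reshape(3, -1).shape)     # -1 is inferred as 4
print(a.reshape(-1, 6).shape)     # -1 is inferred as 2

bad_shape_raised = False
try:
    a.reshape(5, 3)               # 5 * 3 = 15 does not match 12 elements
except ValueError as err:
    bad_shape_raised = True
    print('ValueError:', err)
```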


## Creating Common Matrices Quickly

Some matrices are needed often, such as a matrix of all zeros or all ones, or an identity matrix. NumPy provides functions to create these quickly, along with **`arange()`** for evenly spaced sequences.

In [37]:
a = np.arange(4)
print(type(a), a)

<class 'numpy.ndarray'> [0 1 2 3]


In [38]:
b = np.arange(1, 6)
print(b)

[1 2 3 4 5]


In [39]:
c = np.arange(1, 11, 2)
print(c)
d = np.arange(10, 0, -2)
print(d)

[1 3 5 7 9]
[10  8  6  4  2]


In [40]:
a = np.zeros((3,5), dtype=float)
print(a)

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]


In [41]:
b = np.ones((4,5), dtype=int)
print(b)

[[1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]]


In [42]:
c = np.eye(4, dtype=float)
print(c)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


In [43]:
d = np.diag([1, 2, 3, 4.0])
print(d)

[[1. 0. 0. 0.]
 [0. 2. 0. 0.]
 [0. 0. 3. 0.]
 [0. 0. 0. 4.]]


In [44]:
x = np.diag([1, 2, 3], 1)
print(x)

[[0 1 0 0]
 [0 0 2 0]
 [0 0 0 3]
 [0 0 0 0]]


In [45]:
y = np.eye(4) + np.diag([1, 1, 1], 1) + np.diag([1, 1, 1], -1)
print(y)

[[1. 1. 0. 0.]
 [1. 1. 1. 0.]
 [0. 1. 1. 1.]
 [0. 0. 1. 1.]]


## NumPy Sub-packages

In addition to the **`ndarray`** data type, NumPy also provides several useful sub-packages, including the following:
* **`random`** - Random number tools
* **`linalg`** - Linear algebra functions
* **`fft`** - Fast Fourier Transform (FFT) functions
* **`polynomial`** - Polynomial operations

and a few utilities.

### Random Numbers

In [46]:
x = np.random.rand(3, 4) # uniform distribution on [0, 1)
print(x)

[[0.02797458 0.12077824 0.84780474 0.16462124]
 [0.18204782 0.79512987 0.26202031 0.83225063]
 [0.16812299 0.24643745 0.92575764 0.52471936]]


In [47]:
print(x * 100)

[[ 2.79745845 12.07782355 84.78047369 16.46212373]
 [18.20478216 79.51298715 26.20203061 83.22506266]
 [16.81229859 24.64374503 92.57576436 52.47193592]]


In [48]:
y = np.random.randn(3, 4) # standard normal distribution (mean 0, standard deviation 1)
print(y)

[[ 0.04315548  0.09685165 -2.08139938 -0.48477488]
 [-0.13302113 -0.90243101 -0.25537939  0.18989469]
 [ 0.75357576 -1.02389049  1.33712237  0.87444531]]


In [49]:
a = np.arange(1, 11)
print(a)
np.random.shuffle(a)
print(a)

[ 1  2  3  4  5  6  7  8  9 10]
[ 6  2  7  4  8 10  3  9  1  5]


In [50]:
b = np.arange(1, 11)
print(b)
c = np.random.choice(b, 4, replace=True)
print(c)

[ 1  2  3  4  5  6  7  8  9 10]
[2 2 4 2]
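
The random values above differ on every run. When repeatable results are wanted, a seeded generator can be used. The sketch below uses `np.random.default_rng()`, NumPy's newer generator interface, as an alternative to the functions used above; the seed value `42` is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)       # seeded generator: repeatable sequence

u = rng.random((3, 4))                     # uniform on [0, 1)
n = rng.standard_normal((3, 4))            # standard normal
p = rng.permutation(np.arange(1, 11))      # shuffled copy, the input is left unchanged
c = rng.choice(np.arange(1, 11), size=4, replace=False)   # 4 distinct values

# A second generator with the same seed reproduces the same numbers
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(u, rng_again.random((3, 4))))
```

With `replace=False` the four chosen values are distinct, unlike the `replace=True` call above, where values may repeat.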


### Linear Algebra

In [51]:
a = [ [10.0, 3, -2],
      [2, -8, 5],
      [-1, 3, 7]]
b = [100, -25, 75]
x = np.linalg.solve(a, b)
print(a)
print(b)
print(x)

[[10.0, 3, -2], [2, -8, 5], [-1, 3, 7]]
[100, -25, 75]
[ 8.51900393 10.02621232  7.63433814]
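
A computed solution can be checked by substituting it back into the system: the product of the coefficient matrix and `x` should reproduce `b` up to rounding error. A minimal sketch with the same system follows; `residual` is a name chosen for this example.

```python
import numpy as np

a = np.array([[10.0, 3, -2],
              [2, -8, 5],
              [-1, 3, 7]])
b = np.array([100.0, -25, 75])

x = np.linalg.solve(a, b)
residual = a @ x - b              # should be close to zero in every component

print(residual)
print(np.allclose(a @ x, b))      # comparison that tolerates rounding error
```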


In [52]:
ainv = np.linalg.inv(a)
print(ainv)
print(ainv @ b - x)

[[ 0.09305374  0.03538663  0.00131062]
 [ 0.0249017  -0.08912189  0.07077326]
 [ 0.00262123  0.04325033  0.11271298]]
[1.77635684e-15 0.00000000e+00 0.00000000e+00]


In [53]:
print(np.linalg.det(a))

-762.9999999999999


In [54]:
v, x = np.linalg.eig(a)
print('Eigenvalues:', v)
print('Eigenvectors:\n', x)

Eigenvalues: [-9.32341951 10.610847    7.71257251]
Eigenvectors:
 [[ 0.169663   -0.97120631  0.34495882]
 [-0.96736865 -0.04133496  0.32420881]
 [ 0.18818171  0.23462678  0.88084735]]
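
In the result of `np.linalg.eig()`, the eigenvectors are the columns of the second array, and column `i` belongs to eigenvalue `i`. The sketch below checks this for the same matrix, and also compares the sum and product of the eigenvalues with the trace and determinant; the names `vals` and `vecs` are chosen for this example.

```python
import numpy as np

a = np.array([[10.0, 3, -2],
              [2, -8, 5],
              [-1, 3, 7]])

vals, vecs = np.linalg.eig(a)

for i, val in enumerate(vals):
    vec = vecs[:, i]                                   # eigenvector i is column i
    print(val, np.allclose(a @ vec, val * vec))        # A v = lambda v

print(np.isclose(vals.sum(), np.trace(a)))             # sum of eigenvalues = trace
print(np.isclose(vals.prod(), np.linalg.det(a)))       # product of eigenvalues = determinant
```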


In [55]:
from numpy.linalg import cholesky

a = [ [10.0, 3, -2],
      [3, 8, 5],
      [-2, 5, 7]]
b = [100, -25, 75]
c = cholesky(a)
print(c)
print(c @ c.T)

[[ 3.16227766  0.          0.        ]
 [ 0.9486833   2.66458252  0.        ]
 [-0.63245553  2.10164255  1.47753125]]
[[10.  3. -2.]
 [ 3.  8.  5.]
 [-2.  5.  7.]]
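
The Cholesky factorisation applies to symmetric positive-definite matrices, which is why a symmetric matrix is used above. The sketch below, under that assumption, checks that the factor is lower triangular and shows the error raised for a matrix that is symmetric but not positive definite; the names `spd`, `not_pd` and `failed` are chosen for this example.

```python
import numpy as np
from numpy.linalg import cholesky, LinAlgError

spd = np.array([[10.0, 3, -2],
                [3, 8, 5],
                [-2, 5, 7]])
not_pd = np.array([[1.0, 2.0],
                   [2.0, 1.0]])          # symmetric, but its eigenvalues are 3 and -1

print(np.all(np.linalg.eigvalsh(spd) > 0))     # all eigenvalues positive

lower = cholesky(spd)
print(np.allclose(lower, np.tril(lower)))      # the factor is lower triangular
print(np.allclose(lower @ lower.T, spd))       # and reproduces the matrix

failed = False
try:
    cholesky(not_pd)
except LinAlgError as err:
    failed = True
    print('LinAlgError:', err)
```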


In [56]:
import numpy.linalg as la

print(la.cond(a))
print(la.norm(a))
print(la.cond.__doc__)

11.862312981776856
17.0

    Compute the condition number of a matrix.

    This function is capable of returning the condition number using
    one of seven different norms, depending on the value of `p` (see
    Parameters below).

    Parameters
    ----------
    x : (..., M, N) array_like
        The matrix whose condition number is sought.
    p : {None, 1, -1, 2, -2, inf, -inf, 'fro'}, optional
        Order of the norm:

        p      norm for matrices
        None   2-norm, computed directly using the ``SVD``
        'fro'  Frobenius norm
        inf    max(sum(abs(x), axis=1))
        -inf   min(sum(abs(x), axis=1))
        1      max(sum(abs(x), axis=0))
        -1     min(sum(abs(x), axis=0))
        2      2-norm (largest sing. value)
        -2     smallest singular value

        inf means the numpy.inf object, and the Frobenius norm is
        the root-of-sum-of-squares norm.

    Returns
    -------
    c : {float, inf}
        The condition number of the matrix. May 
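
As the docstring states, the default condition number is based on the 2-norm and computed using the SVD: it is the ratio of the largest to the smallest singular value. The default of `la.norm()` for a matrix is the Frobenius norm. The sketch below reproduces both values by hand for the same matrix; the names `sing`, `cond_by_hand` and `fro_by_hand` are chosen for this example.

```python
import numpy as np
import numpy.linalg as la

a = np.array([[10.0, 3, -2],
              [3, 8, 5],
              [-2, 5, 7]])

sing = la.svd(a, compute_uv=False)         # singular values only
cond_by_hand = sing.max() / sing.min()     # 2-norm condition number
fro_by_hand = np.sqrt(np.sum(a ** 2))      # root of the sum of squares

print(cond_by_hand, la.cond(a))
print(fro_by_hand, la.norm(a))
```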

## Statistics

NumPy can calculate simple statistics such as the sum, mean and standard deviation, either over the whole array or along one axis.

In [57]:
a = np.random.randn(10, 3) * 10
print(a)

[[  7.44640745   0.10100092  -3.96500844]
 [ -6.00831314 -14.24438215   0.82565843]
 [  3.39570667   4.44061107  -1.30390896]
 [ -4.27913083   1.69602986  19.32241972]
 [ -2.73944071  -9.45596009  -2.91295835]
 [ -0.78515802   3.2053325  -23.01305992]
 [ 17.91820824  -8.5935065  -15.74516848]
 [ 15.81416943 -11.23875178 -19.53873444]
 [  8.92640417   0.74334787  -2.91949804]
 [  6.59893155  -6.43074994   9.37646758]]


In [43]:
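m = np.arange(1, 13).reshape((6, 2)) # small integer array, so that the sums are easy to verify
print(m)
print(np.sum(m))         # sum of all elements
print(np.sum(m, axis=0)) # sum down each column
print(np.sum(m, axis=1)) # sum across each row
# Equivalent method form
print(m.sum(axis=1))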

[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
78
[36 42]
[ 3  7 11 15 19 23]
[ 3  7 11 15 19 23]


In [59]:
print(np.mean(a))
print(np.mean(a, axis=0))
print(np.mean(a, axis=1))

-1.1121011454585215
[ 4.62877848 -3.97770283 -3.98737909]
[ 1.19413331 -6.47567896  2.17746959  5.57977291 -5.03611972 -6.86429515
 -2.14015558 -4.98777227  2.25008467  3.18154973]


In [60]:
print(np.std(a))
print(np.std(a, axis=0))
print(np.std(a, axis=1))

10.016165605418157
[ 7.80789621  6.39563378 12.2338363 ]
[ 4.72237983  6.16118786  2.49839332 10.01901167  3.12610181 11.53452662
 14.48079272 15.09443632  4.95203942  6.89085894]
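
Two details of these functions are easy to miss. The `axis` argument names the axis that is collapsed, so `axis=0` gives one value per column. Also, `np.std()` by default divides by the number of values `n` (population standard deviation); passing `ddof=1` divides by `n - 1` (sample standard deviation). The sketch below uses a small integer array so the results can be checked by hand; the variable names are chosen for this example.

```python
import numpy as np

m = np.arange(1, 13).reshape((6, 2))    # 6 rows, 2 columns

col_means = m.mean(axis=0)              # collapses the 6 rows: one mean per column
row_means = m.mean(axis=1)              # collapses the 2 columns: one mean per row
print(col_means, row_means.shape)

pop_std = m.std(axis=0)                 # divides by n
sample_std = m.std(axis=0, ddof=1)      # divides by n - 1
n = m.shape[0]
print(np.allclose(sample_std, pop_std * np.sqrt(n / (n - 1))))
```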


# References

* [NumPy home page](https://numpy.org/#)
* [NumPy tutorial for absolute beginners](https://numpy.org/devdocs/user/absolute_beginners.html)
* [Learning Scientific Programming with Python by Christian Hill](https://scipython.com/)