# NumPy Arrays

- NumPy is one of the core Python packages for numerical and scientific computing.

<h3> Python objects </h3>

- high-level number objects: integers, floating-point numbers
- containers: lists (cheap insertion and append), dictionaries (fast lookup)

<h3> NumPy provides </h3>

- extension package to Python for multi-dimensional arrays.
- closer to hardware (efficiency)
- designed for Scientific computation (convenience)
- this style of programming is also known as array-oriented computing

In [1]:
# 2 ways of creating an array in NumPy
# 1st way
import numpy as np
a = np.array([0, 1, 2, 3])
print(a)

# 2nd way
print(np.arange(10))

[0 1 2 3]
[0 1 2 3 4 5 6 7 8 9]


<b>Why it is useful</b>: NumPy arrays are memory-efficient containers that provide fast numerical operations.

In [4]:
# python lists

L = range(1000)
%timeit [i**2 for i in L]

700 µs ± 30 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [5]:
a = np.arange(1000)
%timeit a**2

3.3 µs ± 70.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
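
The same comparison can be reproduced outside IPython. This is a minimal sketch using the standard-library `timeit` module. The variable names and the repeat count are arbitrary choices, and the absolute timings depend on the machine.

```python
import timeit
import numpy as np

L = range(1000)
a = np.arange(1000)

# Both approaches compute the same squares
squares_list = [i**2 for i in L]
squares_array = a**2

# Total time in seconds for 1000 runs of each approach
t_list = timeit.timeit(lambda: [i**2 for i in L], number=1000)
t_array = timeit.timeit(lambda: a**2, number=1000)
print(f"list comprehension: {t_list:.4f} s, NumPy: {t_array:.4f} s")
```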


<h3> 1. Creating arrays </h3>

<h4>1.1. Manual Construction of Arrays </h4>

- A 1-D array is known as a vector
- A 2-D array is known as a matrix
- A 3-D (or higher-dimensional) array is often called a tensor

In [6]:
# 1-D

a = np.array([0, 1, 2, 3])
a

array([0, 1, 2, 3])

In [7]:
# print dimensions

a.ndim

1

In [8]:
# shape

a.shape

(4,)

In [10]:
# length of an array
len(a)

4

In [12]:
# 2-D, 3-D, ...

b = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
b

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [13]:
# print dimension

b.ndim

2

In [18]:
# shape  - 2 rows, 4 columns

b.shape

(2, 4)

In [19]:
len(b)   # returns the size of the first dimension  - no. of rows

2

In [22]:
c = np.array([[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]])
c

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])

In [23]:
# print dimension
c.ndim

3

In [24]:
# shape
c.shape  # 2 blocks, each with 2 rows and 3 columns

(2, 2, 3)

In [25]:
len(c)

2
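
The attributes used above are related to each other. The sketch below rebuilds the same 3-D array with `reshape` and also prints `size`. Both are standard NumPy features that the examples above do not use.

```python
import numpy as np

c = np.arange(12).reshape(2, 2, 3)  # same values as c above

print(c.ndim)    # number of axes
print(c.shape)   # length of each axis
print(c.size)    # total number of elements (product of the shape)
print(len(c))    # length of the first axis only
```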

<h4> 1.2. Functions for creating arrays </h4>

In [26]:
# arange is an array-valued version of the built-in Python range function

a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [28]:
b = np.arange(1, 10, 2)   # start, end(exclusive), step
b

array([1, 3, 5, 7, 9])

In [39]:
# linspace returns evenly spaced points over an interval

a = np.linspace(0, 1, 6)  # start, end (inclusive), number of points
a

array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
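
`linspace` includes the end point by default. As a small sketch, the optional `endpoint` and `retstep` arguments change that behaviour and report the spacing. The values below are chosen only for illustration.

```python
import numpy as np

# Exclude the end point: 5 points in [0, 1)
a = np.linspace(0, 1, 5, endpoint=False)

# Also return the spacing between consecutive points
b, step = np.linspace(0, 1, 6, retstep=True)
print(a, step)
```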

In [40]:
# common arrays

a = np.ones((3, 3))
a

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [43]:
b = np.zeros((3, 3))
b

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [46]:
c = np.eye(3)  # returns a 2-D array with 1's on the diagonal - Identity Matrix
c

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [48]:
d = np.eye(3, 2)  # number of rows = 3, number of columns = 2; diagonal index k defaults to 0 (main diagonal)
d

array([[1., 0.],
       [0., 1.],
       [0., 0.]])

In [56]:
# create an array using diag function
e = np.diag([1, 2, 3, 4])
e


array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

In [58]:
np.diag(e)   # Extract diagonal

array([1, 2, 3, 4])
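
Both `np.eye` and `np.diag` accept an optional `k` argument that selects a diagonal other than the main one. A short sketch, with the names `upper` and `lower` chosen only for illustration:

```python
import numpy as np

upper = np.eye(3, k=1)          # ones on the first diagonal above the main one
lower = np.diag([1, 2], k=-1)   # values on the first diagonal below the main one
print(upper)
print(lower)
```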

In [59]:
# create arrays of random numbers
# np - package, random - module, rand() - function
# rand() creates an array of the given shape and fills it with random samples

# 2 common kinds of random numbers:
# 1. Uniformly distributed random numbers in [0, 1) - np.random.rand
# 2. Gaussian distributed random numbers (standard normal distribution) - np.random.randn
z = np.random.rand(4)
z

array([0.51808847, 0.13426908, 0.6944096 , 0.22180684])

In [62]:
# Return samples from the standard normal distribution (mean 0, variance 1)

y = np.random.randn(4) 
y

array([-0.01433715,  1.30050469, -1.39584676, -0.49504879])
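
The random values above change on every run. When reproducible numbers are needed, one option is NumPy's `Generator` interface, sketched below. It is an alternative to `np.random.rand` and `np.random.randn`, not what the cells above use, and the seed value is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed value is arbitrary
u = rng.random(4)                      # uniform samples in [0, 1)
g = rng.standard_normal(4)             # standard normal samples

# Re-creating the generator with the same seed reproduces the samples
rng2 = np.random.default_rng(seed=42)
print(np.array_equal(u, rng2.random(4)))
```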

<h3> 2. Basic Datatypes </h3>

- Floating-point elements are displayed with a trailing dot (for example 2.), while integer elements are not. The difference comes from the data type (dtype) of the array:

In [65]:
m = np.arange(10)
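m.dtype  # the default integer type is platform dependent (int32 here, int64 on most 64-bit Linux and macOS systems)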

dtype('int32')

In [66]:
# mention the data type explicitly
m = np.arange(10, dtype='float64')
m

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [73]:
# the default data type of the ones() and zeros() functions is float64

a = np.ones((3, 3))
print(a)
a.dtype

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


dtype('float64')

In [74]:
b = np.zeros((2, 4))
print(b)
b.dtype

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]


dtype('float64')
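
The dtype also controls what happens when values are converted or assigned. As a minimal sketch with illustrative values, converting floats to integers drops the fractional part, and so does assigning a float into an integer array:

```python
import numpy as np

f = np.array([1.7, 2.2, 3.9])
i = f.astype(np.int64)      # explicit conversion truncates toward zero

m = np.arange(3)            # integer array
m[0] = 1.9                  # the float is truncated to fit the integer dtype
print(i, m)
```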

<h4> Other datatypes </h4>

In [78]:
d = np.array([1+2j, 2+4j])  # Complex datatype
d.dtype

dtype('complex128')

In [80]:
b = np.array([True, False, True, False]) # Boolean Datatype
b.dtype

dtype('bool')

In [84]:
q = np.array(["Ram", "Rahim", "Robert"])  # Unicode string datatype; '<U6' means at most 6 characters
q.dtype

dtype('<U6')

<h4> Each built-in data type has a character code that uniquely identifies it. </h4>

- 'b' - boolean
- 'i' - (signed) integer
- 'u' - (unsigned) integer
- 'f' - floating-point
- 'c' - complex floating-point
- 'm' - timedelta
- 'M' - datetime
- 'O' - Python objects
- 'S', 'a' - (byte-)string ('a' is a legacy alias that is deprecated in recent NumPy versions)
- 'U' - Unicode
- 'V' - raw data (void)
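
These codes can be read from any array through `dtype.kind`. A short sketch; the dictionary keys are labels chosen only for this example:

```python
import numpy as np

arrays = {
    "int": np.arange(3),
    "float": np.ones(3),
    "complex": np.array([1 + 2j]),
    "bool": np.array([True, False]),
    "str": np.array(["Ram", "Rahim"]),
}

# Map each label to the one-character kind code of its dtype
kinds = {name: arr.dtype.kind for name, arr in arrays.items()}
print(kinds)
```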

<h3> 3. Indexing & Slicing </h3>

<h4> 3.1. Indexing </h4>

- The items of an array can be accessed and assigned in the same way as those of other Python sequences (e.g. lists)

In [86]:
a = np.arange(10)
a[5]  # indices begin at 0

5

In [88]:
# For multi-dimensional arrays, indexes are tuples of integers

a = np.diag([1, 2, 3])
a[1, 1]

2

In [89]:
a[2, 1] = 5 # assigning value
a

array([[1, 0, 0],
       [0, 2, 0],
       [0, 5, 3]])
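
Two more indexing forms follow the same rules as Python lists. A small sketch with illustrative variable names: negative indices count from the end, and a single index on a 2-D array selects a whole row.

```python
import numpy as np

a = np.arange(10)
last = a[-1]        # negative indices count from the end

d = np.diag([1, 2, 3])
row = d[1]          # a single index on a 2-D array selects a whole row
print(last, row)
```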

<h4> 3.2. Slicing </h4>

In [90]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [91]:
a[1:8:2]  # [startIndex: endIndex(exclusive): step]

array([1, 3, 5, 7])

In [92]:
# combine assignment and slicing

a = np.arange(10)
a[5:] = 10
a

array([ 0,  1,  2,  3,  4, 10, 10, 10, 10, 10])

In [95]:
b = np.arange(5)
a[5:] = b[::-1]  # assign b in reverse order (a step of -1 reverses the array)
a

array([0, 1, 2, 3, 4, 4, 3, 2, 1, 0])
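
Slicing extends to multi-dimensional arrays with one slice per axis, separated by commas. A minimal sketch on a 3x4 array built with `reshape`, which the cells above do not use:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

column = b[:, 1]        # all rows, second column
block = b[1:, ::2]      # rows from index 1 on, every other column
print(column)
print(block)
```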

<h3>4. Copies & Views </h3>

- A slicing operation creates a view of the original array, which is just another way of accessing the same data.
- So the original array is not copied in memory.
- Use np.shares_memory(a, b) to check whether two arrays share the same memory block.
- <b> When modifying the view, the original array is modified as well. </b>

In [96]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [97]:
b = a[::2]
b

array([0, 2, 4, 6, 8])

In [98]:
np.shares_memory(a, b)

True

In [99]:
b[0] = 10
b

array([10,  2,  4,  6,  8])

In [101]:
a   # a and b share the same memory.

array([10,  1,  2,  3,  4,  5,  6,  7,  8,  9])

In [107]:
a = np.arange(10)

c = a[::2].copy()    # force a copy
c

array([0, 2, 4, 6, 8])

In [103]:
np.shares_memory(a, c)

False

In [105]:
c[0] = 10
c

array([10,  2,  4,  6,  8])

In [106]:
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
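
Another way to tell a view from a copy is the `base` attribute, sketched below as a complement to `np.shares_memory`. For a view, `base` refers to the array that owns the data; for an array that owns its own data, it is `None`.

```python
import numpy as np

a = np.arange(10)
b = a[::2]           # view of a
c = a[::2].copy()    # independent copy

print(b.base is a)     # the view points back to a
print(c.base is None)  # the copy owns its own data
```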

<h3> 5. Fancy Indexing </h3>

- NumPy arrays can be indexed with slices, but also with boolean arrays (masks) or integer arrays. This is known as fancy indexing.
- Fancy indexing creates copies, not views.

<h4> 5.1. Using Boolean Mask </h4>

In [109]:
a = np.random.randint(0, 20, 15)
a

array([ 8,  2, 12, 11, 17, 12, 14, 19, 12,  7, 10,  0,  3, 12, 14])

In [110]:
mask = (a % 2 == 0)

In [111]:
extract_from_a = a[mask]

extract_from_a

array([ 8,  2, 12, 12, 14, 12, 10,  0, 12, 14])

<h4> Indexing with a mask can be very useful to assign a new value to a sub-array </h4>

In [112]:
a[mask] = -1
a

array([-1, -1, -1, 11, 17, -1, -1, 19, -1,  7, -1, -1,  3, -1, -1])
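
Masks can be combined with `&` (and) and `|` (or), with each condition wrapped in parentheses. The sketch below uses a small fixed array for illustration. It also shows `np.where`, which returns a new array instead of modifying the original in place.

```python
import numpy as np

a = np.array([8, 2, 12, 11, 17])

# Even values greater than 5
selected = a[(a > 5) & (a % 2 == 0)]

# Replace even values with -1 in a new array; a itself is unchanged
replaced = np.where(a % 2 == 0, -1, a)
print(selected, replaced)
```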

<h4> 5.2. Using an Array of Integers </h4>

In [113]:
a = np.arange(0, 100, 10)
a

array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [116]:
# Indexing can be done with an array of integers, and the same index may be repeated.

a[[2, 3, 2, 4, 2]]

array([20, 30, 20, 40, 20])

In [115]:
# New values can be assigned

a[[9, 7]] = -200

a

array([   0,   10,   20,   30,   40,   50,   60, -200,   80, -200])
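
Since fancy indexing returns a copy, changing the result leaves the original array untouched. A minimal sketch with illustrative names; it also shows that on a 2-D array, two index lists are paired element by element.

```python
import numpy as np

a = np.arange(0, 100, 10)
picked = a[[2, 3, 4]]      # fancy indexing returns a copy
picked[0] = -1             # does not affect a
print(np.shares_memory(a, picked))
print(a[2])

m = np.arange(12).reshape(3, 4)
pairs = m[[0, 2], [1, 3]]  # elements at (0, 1) and (2, 3)
print(pairs)
```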