### The NumPy array object

### NumPy array

#### Python objects:

1. high-level number objects: integers, floating point
2. containers: lists (costless insertion and append), dictionaries (fast lookup)

#### NumPy provides:

1. an extension package to Python for multi-dimensional arrays
2. closer to hardware (efficiency)
3. designed for scientific computation (convenience)
4. also known as array-oriented computing

In [1]:
# to install numpy
import sys
!{sys.executable} -m pip install numpy

Collecting numpy
  Downloading https://files.pythonhosted.org/packages/35/d5/4f8410ac303e690144f0a0603c4b8fd3b986feb2749c435f7cdbb288f17e/numpy-1.16.2-cp36-cp36m-manylinux1_x86_64.whl (17.3MB)
    100% |████████████████████████████████| 17.3MB 909kB/s eta 0:00:01
Installing collected packages: numpy
Successfully installed numpy-1.16.2


In [2]:
import numpy as np
a = np.array([0,1,2,3,4,5])
print(a)

print(np.arange(10))

[0 1 2 3 4 5]
[0 1 2 3 4 5 6 7 8 9]


Why is it useful?

NumPy arrays are memory-efficient containers that provide fast numerical operations.

In [3]:
# pure Python: square 1000 numbers with a list comprehension
L = range(1000)
%timeit [i ** 2 for i in L]

268 µs ± 34.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [4]:
# NumPy array: square all 1000 elements in one vectorized operation
nA = np.arange(1000)
%timeit nA ** 2

713 ns ± 8.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
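
Outside IPython, where `%timeit` is not available, the same comparison can be sketched with the standard-library `timeit` module. The names `values`, `arr`, `t_list` and `t_numpy` are illustrative, and absolute timings depend on the machine, so none are quoted here.

```python
import timeit

import numpy as np

n = 1000
values = list(range(n))   # plain Python list
arr = np.arange(n)        # NumPy array holding the same numbers

# Both approaches compute the same squares.
py_squares = [i ** 2 for i in values]
np_squares = arr ** 2

# Time 1000 repetitions of each approach.
t_list = timeit.timeit(lambda: [i ** 2 for i in values], number=1000)
t_numpy = timeit.timeit(lambda: arr ** 2, number=1000)
print(f"list comprehension: {t_list:.4f} s, NumPy: {t_numpy:.4f} s")
```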


### 1. Creating arrays

#### 1.1. Manual Construction of arrays

In [5]:
# 1-D array

a = np.array([0,1,2,3,4])

a

array([0, 1, 2, 3, 4])

In [6]:
# print dimension

a.ndim

1

In [7]:
# shape

a.shape

(5,)

In [8]:
len(a)

5

In [9]:
# 2-D, 3-D array...

b = np.array([[0,1,2],[3,4,5]])
b

array([[0, 1, 2],
       [3, 4, 5]])

In [10]:
b.ndim

2

In [11]:
b.shape

(2, 3)

In [12]:
len(b)

2

In [13]:
c = np.array([[[1,2,3],[4,5,6]],[[1,2,3],[4,5,6]]])
c

array([[[1, 2, 3],
        [4, 5, 6]],

       [[1, 2, 3],
        [4, 5, 6]]])

In [14]:
c.ndim

3

In [15]:
c.shape

(2, 2, 3)
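
The three attributes used above are related to each other. A small sketch of how `ndim`, `shape`, `len()` and `size` fit together for the 3-D array:

```python
import numpy as np

c = np.array([[[1, 2, 3], [4, 5, 6]],
              [[1, 2, 3], [4, 5, 6]]])

print(c.ndim == len(c.shape))   # True: ndim is the length of the shape tuple
print(len(c) == c.shape[0])     # True: len() reports the first axis only
print(c.size)                   # 12 = 2 * 2 * 3, the total number of elements
```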

#### 1.2 Functions for creating arrays

In [16]:
# using arange function
# arange is an array-valued version of the built-in Python range function

a = np.arange(10)
print(a)

[0 1 2 3 4 5 6 7 8 9]


In [17]:
# start, end (exclusive), step

a = np.arange(1,20,2)
print(a)

[ 1  3  5  7  9 11 13 15 17 19]


In [18]:
# using linspace
# start, end (inclusive), number of points

a = np.linspace(1,8,4)
print(a)

[1.         3.33333333 5.66666667 8.        ]
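
The difference between `arange` and `linspace` comes down to how the end value is treated. A minimal sketch; the variable names are illustrative, and `endpoint` and `retstep` are optional keyword arguments of `np.linspace`:

```python
import numpy as np

# arange: the stop value is excluded
evens = np.arange(0, 10, 2)     # 0, 2, 4, 6, 8

# linspace: both endpoints are included by default
pts = np.linspace(0, 1, 5)      # 0, 0.25, 0.5, 0.75, 1

# endpoint=False drops the stop value; retstep=True also returns the spacing
half_open, step = np.linspace(0, 1, 5, endpoint=False, retstep=True)
print(evens, pts, half_open, step)
```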


In [19]:
# common arrays

a = np.ones((3,3))
print(a)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


In [20]:
b = np.zeros((3,3))
print(b)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [21]:
c = np.eye(3)
print(c)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [22]:
d = np.eye(3,2)
print(d)

[[1. 0.]
 [0. 1.]
 [0. 0.]]


In [23]:
# create array using diag function
a = np.diag([1,2,3,4])
print(a)

[[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]]


In [24]:
np.diag(a)

array([1, 2, 3, 4])
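
As the two cells above suggest, `np.diag` works in both directions: a 1-D input builds a diagonal matrix, and a 2-D input extracts the diagonal. A short sketch, which also uses the optional offset argument `k`:

```python
import numpy as np

m = np.diag([1, 2, 3])         # 1-D input: build a 3x3 diagonal matrix
d = np.diag(m)                 # 2-D input: extract the main diagonal
upper = np.diag([1, 2], k=1)   # k=1 puts the values one step above the main diagonal
print(m, d, upper, sep="\n")
```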

In [25]:
# create array using random

# create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1)

a = np.random.rand(4)
print(a)

[0.99784049 0.3642509  0.99598928 0.72989047]


In [26]:
# returns one or more samples from the "standard normal" (Gaussian) distribution

a = np.random.randn(4)

print(a)

[2.69669581 0.77390827 0.77551798 0.6181791 ]


**Note:**

    For random samples from N(\mu, \sigma^2), use:

    sigma * np.random.randn(...) + mu
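
The scaling rule in the note can be checked empirically. In this sketch the values of `mu` and `sigma` and the seed are arbitrary choices for illustration; with many samples, the sample mean and standard deviation should land close to them.

```python
import numpy as np

np.random.seed(0)        # fixed seed, only so the sketch is repeatable
mu, sigma = 5.0, 2.0     # illustrative values, not from the text above

# Scale and shift standard normal samples to get N(mu, sigma^2).
samples = sigma * np.random.randn(100_000) + mu
print(samples.mean(), samples.std())
```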

### 2. Basic DataTypes

You may have noticed that, in some instances, array elements are displayed with a trailing dot (e.g. 2. vs 2). This is due to a difference in the data-type used:

In [28]:
a = np.arange(10)
print(a)
a.dtype

[0 1 2 3 4 5 6 7 8 9]


dtype('int64')

In [29]:
# You can explicitly specify which datatype you want

a = np.arange(10, dtype="float64")
print(a)
a.dtype

[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]


dtype('float64')

In [30]:
# zeros and ones use float64 as their default data type

a = np.zeros((2,2))
print(a)
a.dtype

[[0. 0.]
 [0. 0.]]


dtype('float64')
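
The float default can be overridden with the `dtype` argument, and an existing array can be converted with `astype`. A minimal sketch with illustrative variable names:

```python
import numpy as np

z = np.zeros((2, 2), dtype=int)   # integer zeros instead of the float default

f = np.array([1.7, -1.7, 2.0])
i = f.astype(int)                 # returns a new array; values are truncated toward zero
print(z.dtype, f.dtype, i)
```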

#### Other data types

In [31]:
d = np.array([1+2j, 2+4j])
print(d)
print(d.ndim)
print(d.shape)
print(d.dtype)

[1.+2.j 2.+4.j]
1
(2,)
complex128


In [32]:
b = np.array([True,False,False,True])
print(b.dtype)

bool


In [33]:
s = np.array(["Ram","Robert","Sham"])
print(s.dtype)

<U6


#### Each built-in data type has a character code that uniquely identifies it.

    'b' − boolean

    'i' − (signed) integer

    'u' − unsigned integer

    'f' − floating-point

    'c' − complex-floating point

    'm' − timedelta

    'M' − datetime

    'O' − (Python) objects

    'S', 'a' − (byte-)string

    'U' − Unicode

    'V' − raw data (void)

    For more details, see:

    https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html
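
The character codes listed above can be read back from any array through `dtype.kind`. A small sketch; the dictionary `examples` is just an illustrative container:

```python
import numpy as np

examples = {
    "bool": np.array([True, False]),
    "int": np.array([1, 2]),
    "float": np.array([1.0, 2.0]),
    "complex": np.array([1 + 2j]),
    "str": np.array(["Ram", "Robert"]),
}

# Print each array's full dtype next to its one-character kind code.
for name, arr in examples.items():
    print(name, arr.dtype, arr.dtype.kind)

kinds = {name: arr.dtype.kind for name, arr in examples.items()}
```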

### 3. Indexing and Slicing

#### 3.1 Indexing

The items of an array can be accessed and assigned to the same way as other Python sequences (e.g. lists):

In [34]:
a = np.arange(5)
print(a[3])  # indices begin at 0, like other Python sequences (and C/C++)

3


In [35]:
# For multidimensional arrays, indexes are tuples of integers:

a = np.array([[1,2],[1,2]])
print(a[1,1])

b = np.diag([1,2,3])
print(b[2,2])

2
3


In [36]:
b[2,1] = 5
print(b)

[[1 0 0]
 [0 2 0]
 [0 5 3]]


#### 3.2 Slicing

In [37]:
a = np.arange(10)
print(a)

[0 1 2 3 4 5 6 7 8 9]


In [38]:
b = a[2:8:2]  # [start:stop:step], where stop is exclusive
print(b)

[2 4 6]


In [39]:
# we can also combine assignment and slicing

a = np.arange(10)
a[5:] = 10
a

array([ 0,  1,  2,  3,  4, 10, 10, 10, 10, 10])

In [40]:
b = np.arange(5)
a[5:] = b[::-1]
print(a)

[0 1 2 3 4 4 3 2 1 0]
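
The same `start:stop:step` syntax applies per axis in multidimensional arrays, with the axes separated by commas. A minimal sketch, using `reshape` to build a small 2-D array for illustration:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns, values 0..11

row = m[1]            # second row: [4 5 6 7]
col = m[:, 1]         # second column: [1 5 9]
block = m[:2, ::2]    # first two rows, every other column
print(row, col, block, sep="\n")
```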


### 4. Copies and Views

A slicing operation creates a view on the original array, which is just a way of accessing the array's data.

Thus the original array is not copied in memory. To check whether two arrays share the same memory block, you can use np.may_share_memory(), a fast bounds check that can report false positives, or np.shares_memory(), which is exact.

#### When modifying the view, the original array is modified as well:

In [41]:
a = np.arange(10)
print(a)

[0 1 2 3 4 5 6 7 8 9]


In [42]:
b = a[::2]
print(b)

[0 2 4 6 8]


In [43]:
b[1] = 3
print(b)

[0 3 4 6 8]


In [46]:
print(a)  # modifying b also updated a, because both share the same memory

[0 1 3 3 4 5 6 7 8 9]


In [45]:
# check whether the two arrays share the same memory

np.shares_memory(a,b)


True

In [47]:
# force a copy
c = a[::2].copy()
print(c)

[0 3 4 6 8]


In [48]:
c[1] = 10
print(c)

[ 0 10  4  6  8]


In [49]:
print(a)

[0 1 3 3 4 5 6 7 8 9]


In [50]:
# as seen above, a is unchanged because c is an explicit copy

# the copy lives in its own memory block

np.shares_memory(a,c)

False
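
Another way to tell a view from a copy is the `base` attribute: a view keeps a reference to the array it was taken from, while a copy owns its data. A short sketch with illustrative names:

```python
import numpy as np

a = np.arange(10)
view = a[::2]           # slicing returns a view
copy = a[::2].copy()    # .copy() allocates new memory

print(view.base is a)                  # True: the view is backed by a
print(copy.base is None)               # True: the copy owns its data
print(np.may_share_memory(a, view))    # True
print(np.may_share_memory(a, copy))    # False
```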

### 5. Fancy Indexing

NumPy arrays can be indexed with slices, but also with boolean or integer arrays (masks). This method is called fancy indexing. It creates copies, not views.
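
The copy-versus-view difference can be seen by writing to the result of each kind of indexing. In this sketch, the names `picked`, `sliced` and `after_copy_write` are illustrative:

```python
import numpy as np

a = np.arange(0, 100, 10)

picked = a[[1, 2, 3]]      # integer-array index: returns a copy
picked[0] = -1             # changes only the copy
after_copy_write = a[1]    # still 10, the original is untouched

sliced = a[1:4]            # slice: returns a view
sliced[0] = -1             # writes through to a
print(after_copy_write, a[1])
```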

#### Using Boolean Mask

In [52]:
a = np.random.randint(0,20,15) # 15 random integers from 0 (inclusive) to 20 (exclusive)
print(a)

[ 7  1  3  6 13 17  8  7  5 15  8 10 11  6 19]


In [53]:
mask = (a % 2 == 0)

In [54]:
extract_from_a = a[mask]
print(extract_from_a)

[ 6  8  8 10  6]


#### Indexing with a mask can be very useful to assign a new value to a sub-array:

In [55]:
a[mask] = -1
print(a)

[ 7  1  3 -1 13 17 -1  7  5 15 -1 -1 11 -1 19]
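
Masks can be combined with `&` (and) and `|` (or), and `np.where` offers a way to build the replaced array without modifying the original. A sketch that reuses the random values printed above as fixed input:

```python
import numpy as np

a = np.array([7, 1, 3, 6, 13, 17, 8, 7, 5, 15, 8, 10, 11, 6, 19])

# Parentheses are required because & binds tighter than the comparisons.
mask = (a % 2 == 0) & (a > 6)    # even AND greater than 6
print(a[mask])                   # [ 8  8 10]

# np.where picks -1 where the condition holds and the original value elsewhere.
replaced = np.where(a % 2 == 0, -1, a)
print(replaced)
```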


#### Indexing with an array of integers

In [56]:
a = np.arange(0,100,10)
print(a)

[ 0 10 20 30 40 50 60 70 80 90]


In [57]:
# Indexing can be done with an array of integers, where the same index is repeated several times:

print(a[[1,2,3,1,2,3]])

[10 20 30 10 20 30]


In [60]:
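# new values can be assigned
# a single value is broadcast to every selected index
a[[9,7]] = -200
print(a)

# multiple values: one value per index
a[[9,7]] = (1,2)
print(a)

# three indices but only two values: raises ValueError
a[[2,3,4]] = (1,2)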

[   0   10   20   30   40   50   60 -200   80 -200]
[ 0 10 20 30 40 50 60  2 80  1]


ValueError: shape mismatch: value array of shape (2,) could not be broadcast to indexing result of shape (3,)
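
The error above occurs because the number of assigned values must either match the number of indices or be a single value that can be broadcast. A minimal sketch of the working cases, with the failing case caught so the script keeps running (`error_message` is an illustrative name):

```python
import numpy as np

a = np.arange(0, 100, 10)

a[[2, 3, 4]] = (1, 2, 3)    # three indices, three values
a[[5, 6]] = 0               # a scalar is broadcast to every index

error_message = None
try:
    a[[7, 8, 9]] = (1, 2)   # three indices, two values
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

print(a)
```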