# NumPy Arrays

**NumPy provides:**

1. an extension package to Python for multi-dimensional arrays
2. storage and operations that sit closer to the hardware (efficiency)
3. a design aimed at scientific computation (convenience)
4. a style of programming also known as array-oriented computing

**Python objects:**

1. high-level number objects: integers, floating point
2. containers: lists (cheap append), dictionaries (fast lookup), both built into the language

In [154]:
import numpy as np
a = np.array([0, 1, 2, 3])
print(a)

print(np.arange(10))

[0 1 2 3]
[0 1 2 3 4 5 6 7 8 9]


In [156]:
l = range(1000)
%timeit [i**2 for i in l]

693 µs ± 16.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [157]:
a = np.arange(1000)    # Creating a numpy array using arange from 0 to 999
%timeit a**2

3.22 µs ± 169 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
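The `%timeit` magic only exists inside IPython. As a rough sketch of the same comparison in a plain script, the standard-library `timeit` module can be used. The repetition count `number=1000` is an arbitrary choice here, and absolute timings depend on the machine.

```python
import timeit
import numpy as np

values = range(1000)
arr = np.arange(1000)

# Time 1000 repetitions of each approach (the count is arbitrary).
t_list = timeit.timeit(lambda: [i**2 for i in values], number=1000)
t_numpy = timeit.timeit(lambda: arr**2, number=1000)

print(f"list comprehension: {t_list:.4f} s")
print(f"numpy array:        {t_numpy:.4f} s")

# Both approaches compute the same squares.
squares_list = [i**2 for i in values]
squares_numpy = arr**2
```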


## 1. Creating Arrays

1D array: Vectors

2D array: Matrix

nD array: Tensors

In [158]:
# 1.1 Manual Creation

#1-D

a = np.array([0, 1, 2, 3])

a

array([0, 1, 2, 3])

In [159]:
#print dimensions

a.ndim

1

In [160]:
#shape

a.shape

(4,)

In [161]:
# 2-D, 3-D....

b = np.array([[0, 1, 2], [3, 4, 5]])

b

array([[0, 1, 2],
       [3, 4, 5]])

In [162]:
b.ndim

2

In [163]:
b.shape

(2, 3)

In [164]:
len(b)    # return size of first dimension

2

In [165]:
c = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

c

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [166]:
c.ndim

3

In [167]:
c.shape    # return a tuple

(2, 2, 2)

In [168]:
# 1.2 Functions for creating arrays

# using the arange function
# arange is an array-valued version of the built-in Python range function

a = np.arange(10) # 0.... n-1
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [169]:
b = np.arange(1, 10, 2) #start, end (exclusive), step

b

array([1, 3, 5, 7, 9])

In [170]:
#using linspace (linear space)

a = np.linspace(0, 1, 6) #start, end, number of points

a

array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
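Unlike `arange`, `linspace` includes the end point by default. A small sketch of two optional parameters, `endpoint` and `retstep`, that control and report the spacing:

```python
import numpy as np

# endpoint=False leaves out the stop value, so the spacing is (stop - start) / num.
half_open = np.linspace(0, 1, 5, endpoint=False)

# retstep=True also returns the spacing between consecutive points.
points, step = np.linspace(0, 1, 6, retstep=True)

print(half_open)
print(step)
```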

In [171]:
#common arrays

a = np.ones((3, 3))

a

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [172]:
b = np.zeros((3, 3))

b

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [173]:
c = np.eye(3)  #Return a 2-D array with ones on the diagonal and zeros elsewhere.

c

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [174]:
d = np.eye(3, 2) # 3 rows, 2 columns; the diagonal index k defaults to 0 (the main diagonal)

d

array([[1., 0.],
       [0., 1.],
       [0., 0.]])

In [175]:
#create array using diag function

a = np.diag([1, 2, 3, 4]) #construct a diagonal array.

a

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

In [176]:
np.diag(a)   #Extract diagonal

array([1, 2, 3, 4])

In [177]:
#create array using random
#Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).

a = np.random.rand(4) 

a

array([0.0902639 , 0.84893138, 0.89033801, 0.59579203])

In [178]:
a = np.random.randn(4)
# Return a sample (or samples) from the "standard normal" (Gaussian) distribution.
# The various types of random numbers are covered later.

a

array([ 2.73685811, -0.95867218, -1.64617343, -0.59402775])
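The random values above change on every run. For reproducible results, one option (assuming NumPy 1.17 or later) is a seeded `Generator` created with `np.random.default_rng`. The seed `42` below is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)      # 42 is an arbitrary seed
uniform = rng.random(4)              # uniform samples over [0, 1)
normal = rng.standard_normal(4)      # standard normal (Gaussian) samples

# Re-creating the generator with the same seed reproduces the same numbers.
rng_again = np.random.default_rng(42)
uniform_again = rng_again.random(4)

print(np.array_equal(uniform, uniform_again))
```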

## 2. Basic data types

In [179]:
a = np.arange(10)

a.dtype

dtype('int32')

In [180]:
#You can explicitly specify which data-type you want:

a = np.arange(10, dtype='float64')
a

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [181]:
#The default data type is float for zeros and ones function

a = np.zeros((3, 3))

print(a)

a.dtype

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


dtype('float64')

In [182]:
d = np.array([1+2j, 2+4j])   #Complex datatype

print(d.dtype)

complex128


In [183]:
b = np.array([True, False, True, False])  #Boolean datatype

print(b.dtype)

bool


In [184]:
s = np.array(['Jashan', 'Jimmy', 'Mansa'])

s.dtype

dtype('<U6')

**Each built-in data type has a character code that uniquely identifies it.**

'b' − boolean

'i' − (signed) integer

'u' − unsigned integer

'f' − floating-point

'c' − complex-floating point

'm' − timedelta

'M' − datetime

'O' − (Python) objects

'S', 'a' − (byte-)string

'U' − Unicode

'V' − raw data (void)

**https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html**
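The character code of an array's data type is available as `dtype.kind`. A short sketch that reads it back for a few of the arrays created earlier, and builds concrete types from a code plus an item size in bytes:

```python
import numpy as np

examples = [
    np.array([True, False]),
    np.array([1, 2, 3]),
    np.array([1.5, 2.5]),
    np.array([1 + 2j]),
    np.array(['Jashan', 'Jimmy', 'Mansa']),
]
kinds = [arr.dtype.kind for arr in examples]
print(kinds)

# A character code plus an item size in bytes names a concrete type.
f4 = np.dtype('f4')   # 4-byte float
u1 = np.dtype('u1')   # 1-byte unsigned integer
```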

## 3. Indexing and Slicing


In [185]:
a = np.arange(10)

print(a[5])

5


In [186]:
a = np.diag([1, 2, 3])

print(a[2, 2])

3


In [187]:
a[2,1] = 5
print(a)

[[1 0 0]
 [0 2 0]
 [0 5 3]]


In [189]:
a = np.arange(10)

a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [190]:
a[1:8:2] 

array([1, 3, 5, 7])

In [191]:
# Important: we can also combine assignment and slicing:

a = np.arange(10)
a[5:] = 10
a

array([ 0,  1,  2,  3,  4, 10, 10, 10, 10, 10])

In [192]:
b = np.arange(5)
a[5:] = b[::-1]

a

array([0, 1, 2, 3, 4, 4, 3, 2, 1, 0])
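The same `start:stop:step` syntax applies to each axis of a multi-dimensional array, with the axes separated by commas. A minimal sketch on a hypothetical 3x4 array named `grid`:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

second_row = grid[1, :]        # one full row
last_column = grid[:, -1]      # one full column
corner = grid[:2, 1:3]         # rows 0-1, columns 1-2
```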

## 4. Copies and Views

In [193]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [195]:
b = a[::2]
b

array([0, 2, 4, 6, 8])

In [199]:
np.shares_memory(a,b)
# By creating views we save the space

True

In [197]:
b[0] = 10
b

array([10,  2,  4,  6,  8])

In [198]:
a

array([10,  1,  2,  3,  4,  5,  6,  7,  8,  9])

In [200]:
a = np.arange(10)

c = a[::2].copy()     #force a copy
c

array([0, 2, 4, 6, 8])

In [201]:
c[0] = 10

print(np.shares_memory(a,c))

False
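Besides `np.shares_memory`, the `base` attribute gives another way to tell a view from a copy: a view refers back to the array it was taken from, while an array that owns its data has `base` set to `None`. A short sketch:

```python
import numpy as np

a = np.arange(10)
view = a[::2]
copy = a[::2].copy()

print(view.base is a)      # the view refers back to a
print(copy.base is None)   # the copy owns its data
```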


## 5. Fancy Indexing

NumPy arrays can be indexed with slices, but also with boolean arrays **(masks)** or integer arrays. This method is called **fancy indexing**. Unlike slicing, it creates copies, not views.

In [202]:
a = np.random.randint(0, 20, 15)
a

array([ 0, 16,  6, 15, 18,  5,  3, 12, 12, 19, 15, 10,  6, 17,  9])

In [203]:
mask = (a % 2 == 0)

In [204]:
extract_from_a = a[mask]
extract_from_a

array([ 0, 16,  6, 18, 12, 12, 10,  6])

In [205]:
# Indexing with a mask can be very useful to assign a new value to a sub-array:

a[mask] = -1
a

array([-1, -1, -1, 15, -1,  5,  3, -1, -1, 19, 15, -1, -1, 17,  9])

In [206]:
a = np.arange(0, 100, 10)
#Indexing can be done with an array of integers, where the same index is repeated several times:
a[[2, 3, 2, 4, 2]]

array([20, 30, 20, 40, 20])

In [207]:
a[[9, 7]] = -200
a

array([   0,   10,   20,   30,   40,   50,   60, -200,   80, -200])
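To illustrate the copy-versus-view difference, the sketch below reads a sub-array once with fancy indexing and once with a slice. Changing the fancy-indexed result leaves the original untouched.

```python
import numpy as np

a = np.arange(0, 100, 10)
picked = a[[2, 3, 4]]       # fancy indexing returns a new array
picked[0] = -1              # so this leaves a untouched

sliced = a[2:5]             # slicing returns a view

print(a[2])
print(np.shares_memory(a, picked), np.shares_memory(a, sliced))
```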

# Elementwise operations

## 1. Basic Operations

In [208]:
# With Scalars

a = np.array([1,2,3,4])
a + 1

array([2, 3, 4, 5])

In [209]:
a ** 2

array([ 1,  4,  9, 16], dtype=int32)

In [210]:
# All arithmetic operations are elementwise

b = np.ones(4) + 1   # [2,2,2,2]

a-b

array([-1.,  0.,  1.,  2.])

In [211]:
a * b

array([2., 4., 6., 8.])

In [214]:
# Matrix Multiplications

c = np.diag([1,2,3,4])
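print(c * c)            # Elementwise product: shapes must match (or be broadcastable)
print("-----------------------------")
print(c.dot(c))         # Matrix product: columns of the first must equal rows of the second
# For a diagonal matrix the two results happen to coincide.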

[[ 1  0  0  0]
 [ 0  4  0  0]
 [ 0  0  9  0]
 [ 0  0  0 16]]
-----------------------------
[[ 1  0  0  0]
 [ 0  4  0  0]
 [ 0  0  9  0]
 [ 0  0  0 16]]
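Because `c` is diagonal, the example above cannot distinguish the two products. A small sketch with non-square arrays (the names `m` and `n` are illustrative) shows the shape rule for the matrix product, using the `@` operator as an equivalent of `dot` for 2-D arrays:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # shape (2, 3)
n = np.arange(6).reshape(3, 2)   # shape (3, 2)

product = m @ n                  # (2, 3) @ (3, 2) -> (2, 2)
print(product)
```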


In [215]:
# Comparisons

a = np.array([1, 2, 3, 4])
b = np.array([5, 2, 2, 4])
a == b

array([False,  True, False,  True])

In [216]:
a > b

array([False, False,  True, False])

In [217]:
#array-wise comparisons
a = np.array([1, 2, 3, 4])
b = np.array([5, 2, 2, 4])
c = np.array([1, 2, 3, 4])

np.array_equal(a, b)

False

In [218]:
np.array_equal(a,c)

True

In [219]:
# Logical Operations
a = np.array([1, 1, 0, 0], dtype=bool)
b = np.array([1, 0, 1, 0], dtype=bool)

np.logical_or(a, b)

array([ True,  True,  True, False])

In [220]:
# Transcendental functions

a = np.arange(5)

np.sin(a)  

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ])

In [221]:
np.log(a)

RuntimeWarning: divide by zero encountered in log


array([      -inf, 0.        , 0.69314718, 1.09861229, 1.38629436])
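`np.log(0)` evaluates to `-inf`, and NumPy reports a `RuntimeWarning` when that happens. If the `-inf` is expected, one way to silence the warning for just that computation is the `np.errstate` context manager, sketched here:

```python
import numpy as np

a = np.arange(5)
with np.errstate(divide='ignore'):
    logs = np.log(a)        # log(0) still evaluates to -inf, without the warning

print(logs)
```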

In [223]:
np.exp(a)    # e raised to the power x, elementwise

array([ 1.        ,  2.71828183,  7.3890561 , 20.08553692, 54.59815003])

In [224]:
# Shape mismatch

a = np.arange(4)

a + np.array([1,2])

ValueError: operands could not be broadcast together with shapes (4,) (2,) 

## 2. Basic Reductions



In [225]:
# Computing Sums

x = np.array([1,2,3,4])
np.sum(x)


10

In [227]:
x = np.array([[1,1], [2,2]])
x

array([[1, 1],
       [2, 2]])

In [229]:
x.sum(axis = 0)             #sum by columns

array([3, 3])

In [230]:
x.sum(axis = 1)             # sum by rows

array([2, 4])
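A reduction normally drops the axis it sums over. Passing `keepdims=True` keeps it as a length-1 axis, which makes the result line up with the original array. The sketch below uses that to divide each row by its own total.

```python
import numpy as np

x = np.array([[1, 1], [2, 2]])
row_totals = x.sum(axis=1, keepdims=True)   # shape (2, 1) instead of (2,)
fractions = x / row_totals                  # each row now sums to 1

print(row_totals.shape)
print(fractions)
```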

In [231]:
# Other reductions

x = np.array([1,2,3])
x.min()

1

In [232]:
x.max()

3

In [233]:
x.argmin()             # index of the minimum element

0

In [234]:
x.argmax()           

2

In [235]:
# Logical operations

np.all([True,1,False])

False

In [236]:
np.any([True, False, False])

True

In [237]:
# Note (important): can be used for array comparisons
a = np.zeros((50, 50))
np.any(a != 0)

False

In [238]:
np.all(a == a)

True

In [239]:
a = np.array([1,2,3,2])
b = np.array([2,2,3,2])
c = np.array([6,4,4,5])
((a <= b) & (b <= c)).all()      # if all are true then .all() produces the result as true

True

In [240]:
# Statistics

x = np.array([1,2,3,1])
y = np.array([[1,2,3], [5,6,1]])
x.mean()

1.75

In [241]:
np.median(x)

1.5

In [242]:
np.median(y, axis = -1) # axis=-1 is the last axis; for a 2-D array that is axis=1, so this is the median of each row

array([2., 5.])

In [243]:
x.std()               # full population standard deviation


0.82915619758885
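By default `std` divides by the number of elements N (the population formula). For the sample standard deviation, which divides by N - 1, pass `ddof=1`:

```python
import numpy as np

x = np.array([1, 2, 3, 1])
population_std = x.std()          # divides by N (ddof=0, the default)
sample_std = x.std(ddof=1)        # divides by N - 1

print(population_std, sample_std)
```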

In [None]:
#load data into numpy array object
data = np.loadtxt('fileDataInColumns.txt')      # Placeholder file name; the file does not exist here.

In [None]:
# Take transpose
data.T

In [None]:
# Columns to variables

col1 , col2 , col3, col4 = data.T
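Since the file above does not exist, the cells cannot be run as written. A runnable sketch of the same steps can use an in-memory text buffer in place of a file. The contents below are made up purely for illustration.

```python
import io
import numpy as np

# Hypothetical file contents: two rows, four whitespace-separated columns.
text = io.StringIO("1 2 3 4\n5 6 7 8\n")

data = np.loadtxt(text)
col1, col2, col3, col4 = data.T   # each variable holds one column

print(data.shape)
print(col1)
```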

# Broadcasting

Basic operations on NumPy arrays (addition, etc.) are elementwise.

This works on arrays of the same size. Nevertheless, it is also possible to do operations on arrays of different sizes if NumPy can transform these arrays so that they all have the same size: this conversion is called broadcasting.

It does this by conceptually replicating the smaller array along rows or columns.

Remember all three tricks.

In [244]:
# Tile this array, and replicate this 3 times along the row.
a = np.tile(np.arange(0, 40, 10), (3,1))
print(a)

print("___________________________________________________")
a=a.T
print(a)

[[ 0 10 20 30]
 [ 0 10 20 30]
 [ 0 10 20 30]]
___________________________________________________
[[ 0  0  0]
 [10 10 10]
 [20 20 20]
 [30 30 30]]


In [245]:
# Tile this array, and replicate this 3 times along the row and two times along the column
a = np.tile(np.arange(0, 40, 10), (3,2))
print(a)

print("___________________________________________________")
a=a.T
print(a)

[[ 0 10 20 30  0 10 20 30]
 [ 0 10 20 30  0 10 20 30]
 [ 0 10 20 30  0 10 20 30]]
___________________________________________________
[[ 0  0  0]
 [10 10 10]
 [20 20 20]
 [30 30 30]
 [ 0  0  0]
 [10 10 10]
 [20 20 20]
 [30 30 30]]


In [254]:
a = np.tile(np.arange(0, 40, 10), (3,1))

a = a.T
a

array([[ 0,  0,  0],
       [10, 10, 10],
       [20, 20, 20],
       [30, 30, 30]])

In [252]:
b = np.array([0,1,2])
b

array([0, 1, 2])

In [253]:
a + b

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])

In [255]:
a = np.arange(0, 40, 10)
a.shape


(4,)

In [259]:
a = a[:, np.newaxis]      # adds a new axis -> 2D array (convert it from 1D array to 2D)
a.shape

(4, 1)

In [257]:
a

array([[ 0],
       [10],
       [20],
       [30]])

In [258]:
a + b

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])
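Broadcasting compares shapes starting from the last dimension: two sizes are compatible when they are equal or when one of them is 1. As a sketch, `np.broadcast` reports the resulting shape without computing the sum, and an incompatible pair raises `ValueError`:

```python
import numpy as np

a = np.arange(0, 40, 10)[:, np.newaxis]   # shape (4, 1)
b = np.array([0, 1, 2])                   # shape (3,)

result_shape = np.broadcast(a, b).shape   # shape the operation would produce

try:
    np.arange(4) + np.array([1, 2])       # trailing sizes 4 and 2 are incompatible
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(result_shape, mismatch_raised)
```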

## Flattening

In [260]:
a = np.array([[1,2,3], [4,5,6]])
a.ravel()          # Return a contiguous flattened 1-D array

array([1, 2, 3, 4, 5, 6])

In [261]:
a.T

array([[1, 4],
       [2, 5],
       [3, 6]])

In [262]:
a.T.ravel()

array([1, 4, 2, 5, 3, 6])

## Reshaping


In [263]:
print(a.shape)
print(a)

(2, 3)
[[1 2 3]
 [4 5 6]]


In [264]:
b = a.ravel()
print(b)

[1 2 3 4 5 6]


In [266]:
b = b.reshape((2,3))
b

array([[1, 2, 3],
       [4, 5, 6]])

In [267]:
b[0, 0] = 100            # Important: b is a view of a (same underlying memory), so a changes too
a

array([[100,   2,   3],
       [  4,   5,   6]])

In [268]:
# Note: reshape may also return a copy
a = np.zeros((3,2))
b = a.T.reshape(3*2)
b[0] = 50
a

array([[0., 0.],
       [0., 0.],
       [0., 0.]])
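Whether `reshape` returned a view or a copy can be checked with `np.shares_memory`. The sketch below also uses `-1` as a dimension, which asks NumPy to infer that length from the total number of elements.

```python
import numpy as np

a = np.zeros((3, 2))
flat_view = a.reshape(-1)         # -1 lets NumPy infer the length (6 here)
flat_copy = a.T.reshape(-1)       # the transpose is not contiguous, so this copies

print(np.shares_memory(a, flat_view))
print(np.shares_memory(a, flat_copy))
```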

## Adding a Dimension

Indexing with the np.newaxis object allows us to add an axis to an array.

Each use of newaxis increases the number of dimensions of the array by one. Thus,

1D array will become 2D array

2D array will become 3D array

3D array will become 4D array and so on

In [269]:
z = np.array([1, 2, 3])
z

array([1, 2, 3])

In [270]:
z[:, np.newaxis]

array([[1],
       [2],
       [3]])
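The position of `np.newaxis` decides where the new axis goes. As a sketch, the same vector can become a row or a column, and multiplying the two broadcasts into a full table of pairwise products:

```python
import numpy as np

z = np.array([1, 2, 3])
row = z[np.newaxis, :]            # shape (1, 3)
column = z[:, np.newaxis]         # shape (3, 1)

table = column * row              # broadcasts to shape (3, 3)
print(table)
```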

## Dimension shuffling


In [272]:
a = np.arange(4*3*2)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23])

In [274]:
a = np.arange(4*3*2).reshape(4, 3, 2)
a.shape

(4, 3, 2)

In [275]:
a

array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]]])
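The axes of this array can be reordered with `transpose`, which takes the new order of the old axes. A minimal sketch, in which the result is a view of the same data:

```python
import numpy as np

a = np.arange(4 * 3 * 2).reshape(4, 3, 2)
b = a.transpose(1, 2, 0)          # new axes 0, 1, 2 are old axes 1, 2, 0

print(b.shape)
print(b[2, 1, 0] == a[0, 2, 1])   # same element, indices reordered
print(np.shares_memory(a, b))
```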

In [276]:
# Resizing

a = np.arange(4)
a.resize((8,))
a

array([0, 1, 2, 3, 0, 0, 0, 0])

In [278]:
b = a      # Important: b is a second reference to the same array, so the in-place resize is refused
a.resize((4,)) 

ValueError: cannot resize an array that references or is referenced
by another array in this way.  Use the resize function
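As the error message suggests, the `np.resize` function works even when other references exist, because it returns a new array instead of changing the original. Note that it fills extra space by repeating the data, whereas the `resize` method pads with zeros. A short sketch:

```python
import numpy as np

a = np.arange(4)
b = a                              # a second reference to the same array
shrunk = np.resize(a, (2,))        # new array; a and b are unchanged
grown = np.resize(a, (8,))         # np.resize repeats the data instead of zero-filling

print(shrunk)
print(grown)
```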

In [279]:
#Sorting along an axis:
a = np.array([[5, 4, 6], [2, 3, 2]])
b = np.sort(a, axis=1)
b

array([[4, 5, 6],
       [2, 2, 3]])

In [281]:
# In place sort
a.sort(axis = 1)
a

array([[4, 5, 6],
       [2, 2, 3]])

In [282]:
#sorting with fancy indexing
a = np.array([4, 3, 1, 2])
j = np.argsort(a)
j

array([2, 3, 1, 0], dtype=int64)

In [283]:
a[j]

array([1, 2, 3, 4])
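The index array returned by `argsort` can reorder any array of the same length, which keeps related arrays aligned. Reversing it gives descending order. The `scores` and `labels` arrays below are made up for illustration.

```python
import numpy as np

scores = np.array([4, 3, 1, 2])                 # hypothetical data
labels = np.array(['d', 'c', 'a', 'b'])         # hypothetical data

order = np.argsort(scores)
labels_by_score = labels[order]                 # labels sorted by ascending score
descending = scores[order[::-1]]                # scores from largest to smallest

print(labels_by_score)
print(descending)
```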