## The NumPy Array Object

## NumPy Arrays

In [1]:
import numpy as np 
a = np.array([0,1,2,3])
print(a)

[0 1 2 3]


In [5]:
print(np.arange(10))

[0 1 2 3 4 5 6 7 8 9]


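Why it is useful: a NumPy array is a memory-efficient container that provides fast numerical operations. The timings below compare squaring 1000 numbers with a Python list comprehension and with a NumPy array.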

In [6]:
# python lists
l = range(1000)
%timeit [i**2 for i in l]

513 µs ± 15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [7]:
a = np.arange(1000)
%timeit a**2

3.26 µs ± 170 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
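The `%timeit` magic only works inside IPython or Jupyter, and the exact timings depend on the machine. Outside a notebook, the same comparison can be sketched with the standard-library `timeit` module. The repetition count of 1000 is an arbitrary choice for illustration.

```python
import timeit

import numpy as np

n = 1000
py_range = range(n)
arr = np.arange(n)

# Both approaches compute the same squares
squares_list = [i**2 for i in py_range]
squares_arr = arr**2

# Total time in seconds for 1000 repetitions of each approach
t_list = timeit.timeit(lambda: [i**2 for i in py_range], number=1000)
t_arr = timeit.timeit(lambda: arr**2, number=1000)
print(f"list comprehension: {t_list:.4f} s, NumPy: {t_arr:.4f} s")
```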


## 1. Creating Arrays

### 1.1 Manual Construction of Arrays

In [8]:
# 1 - D
a = np.array([0,1,2,3])
a

array([0, 1, 2, 3])

In [9]:
# print dimensions
a.ndim

1

In [10]:
# shape
a.shape

(4,)

In [11]:
len(a)

4

In [12]:
# 2-D, 3-D 
b = np.array([[0,1,2], [3,4,5]])
b

array([[0, 1, 2],
       [3, 4, 5]])

In [13]:
b.ndim

2

In [14]:
b.shape

(2, 3)

In [16]:
len(b) # returns the size of the first dimension

2

In [17]:
c = np.array([[[0,1],[2,3]],[[4,5],[6,7]]])
print(c)

[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]


In [18]:
c.ndim

3

In [19]:
c.shape

(2, 2, 2)
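The attributes used above are related: `ndim` is the length of `shape`, and `len()` returns the first entry of `shape`. A brief check, reusing the 3-D array from above (`size` is a standard `ndarray` attribute that this section does not otherwise use):

```python
import numpy as np

c = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

print(c.ndim == len(c.shape))   # number of axes equals the length of the shape tuple
print(len(c) == c.shape[0])     # len() reports the size of the first axis only
print(c.size)                   # total number of elements: 2 * 2 * 2
```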

### 1.2 Functions for Creating Arrays

In [22]:
# using arange function

a = np.arange(10)  # 0 to n-1
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]:
b = np.arange(1,10,2)
b

array([1, 3, 5, 7, 9])

In [24]:
# using linspace
a = np.linspace(0,1,6)  # start , end , number of points

a

array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
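Note the difference in end-point handling: `np.arange` excludes the stop value, while `np.linspace` includes it by default. The sketch below shows the `endpoint=False` option of `np.linspace`, which drops the end point. The variable names are illustrative.

```python
import numpy as np

# linspace includes the end point by default: 6 points from 0 to 1
closed = np.linspace(0, 1, 6)

# endpoint=False gives evenly spaced points in the half-open interval [0, 1)
half_open = np.linspace(0, 1, 5, endpoint=False)

print(closed)
print(half_open)
```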

In [25]:
# common arrays
a = np.ones((3,3))
a

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [27]:
b = np.zeros((3,3))
b

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [28]:
c = np.eye(3)  #Return a 2-D array with ones on the diagonal and zeros elsewhere.

c

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [29]:
d = np.eye(3,2)

d

array([[1., 0.],
       [0., 1.],
       [0., 0.]])

In [31]:
# create array using diag function
a = np.diag([1,2,3,4])
a

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

In [32]:
np.diag(a)   # extract diagonal

array([1, 2, 3, 4])

In [33]:
# create array using random
a = np.random.rand(4)
a

array([0.03920795, 0.91502481, 0.46874961, 0.54378996])

In [38]:
a = np.random.randn(4)  # sample(s) from the standard normal (Gaussian) distribution
a

array([ 1.19698164,  0.26645583, -0.09179414,  0.44389339])

Note:

For random samples from $N(\mu, \sigma^2)$, use:

sigma * np.random.randn(...) + mu
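As a rough check of the scaling formula, the sketch below draws a large sample and compares its mean and standard deviation with the requested values. The values of `mu` and `sigma`, the seed, and the sample size are assumptions chosen only for illustration.

```python
import numpy as np

np.random.seed(0)       # fixed seed so the sketch is repeatable
mu, sigma = 5.0, 2.0    # illustrative mean and standard deviation

# Scale and shift standard-normal samples to get samples from N(mu, sigma^2)
samples = sigma * np.random.randn(100_000) + mu

print(samples.mean(), samples.std())  # should be close to mu and sigma
```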

## 2. Basic Data Types

In [39]:
a = np.arange(10)

a.dtype

dtype('int32')

In [41]:
# you can explicitly specify
a = np.arange(10,dtype='float64')
a

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [43]:
# zeros and ones return float arrays by default
a = np.zeros((3,3))

print(a)

a.dtype

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


dtype('float64')
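The default integer type is platform-dependent: the `int32` shown earlier appears on some systems, while many others default to `int64`. An existing array can be converted with the `astype` method, which returns a new array. A minimal sketch, with illustrative variable names:

```python
import numpy as np

a = np.arange(10)
print(a.dtype)  # platform-dependent default integer type

# astype returns a new array with the requested dtype
as_float = a.astype(np.float64)

# Converting floats to integers truncates toward zero
as_int = np.array([1.7, -1.7]).astype(np.int64)

print(as_float.dtype, as_int)
```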

### Other data types

In [44]:
d = np.array([1+2j , 2+4j])

print(d.dtype)

complex128


In [45]:
b = np.array([True , False, True, False]) # boolean datatype

print(b.dtype)

bool


In [46]:
s = np.array(['Pramit', 'Rajesh', 'Shirsendu'])

s.dtype

dtype('<U9')

Each built-in data type has a one-character kind code that identifies its general category:

'b' − boolean

'i' − (signed) integer

'u' − unsigned integer

'f' − floating-point

'c' − complex-floating point

'm' − timedelta

'M' − datetime

'O' − (Python) objects

'S' − (byte-)string ('a' is a legacy alias that newer NumPy releases no longer accept)

'U' − Unicode

'V' − raw data (void)
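The kind code of any array can be read from `dtype.kind`. A short sketch (the dictionary keys are only labels for this example):

```python
import numpy as np

examples = {
    "bool": np.array([True, False]),
    "int": np.array([1, 2]),
    "unsigned": np.array([1, 2], dtype=np.uint8),
    "float": np.array([1.0]),
    "complex": np.array([1 + 2j]),
    "unicode": np.array(["abc"]),
    "bytes": np.array([b"abc"]),
}

# Map each label to the one-character kind code of its dtype
kinds = {name: arr.dtype.kind for name, arr in examples.items()}
print(kinds)
```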

## 3. Indexing and Slicing

### 3.1 Indexing

In [48]:
a = np.arange(10)
print(a[5])

5


In [49]:
# For multidimensional arrays, indexes are tuples of integers:
a = np.diag([1,2,3])
print(a[2,2])

3


In [50]:
a[2,1] = 5 

a

array([[1, 0, 0],
       [0, 2, 0],
       [0, 5, 3]])
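Indexing a 2-D array with a single integer selects a whole row, and negative indices count from the end, as with Python lists. A small sketch using the same kind of diagonal array (variable names are illustrative):

```python
import numpy as np

a = np.diag([1, 2, 3])

row = a[1]            # a single index selects the second row
corner = a[-1, -1]    # negative indices count from the end of each axis
same = a[1][1]        # chained indexing works, but a[1, 1] is the idiomatic form

print(row, corner, same)
```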

### 3.2 Slicing

In [51]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [52]:
a[1:8:2]

array([1, 3, 5, 7])

In [53]:
# we can also combine assignment and slicing

a = np.arange(10)
a[5:] = 10
a

array([ 0,  1,  2,  3,  4, 10, 10, 10, 10, 10])

In [54]:
b = np.arange(5)
a[5:] = b[::-1]  # assign b, reversed, to the second half of a
a

array([0, 1, 2, 3, 4, 4, 3, 2, 1, 0])
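Slices follow the `start:stop:step` pattern, and any of the three parts can be omitted. The sketch below shows the defaults and extends the idea to a 2-D array with one slice per axis. It uses `reshape`, which this section does not otherwise cover, only to build a small example matrix.

```python
import numpy as np

a = np.arange(10)
first_three = a[:3]     # start defaults to 0
from_seven = a[7:]      # stop defaults to the end of the array
every_third = a[::3]    # step of 3 over the whole array
reversed_a = a[::-1]    # a negative step walks the array backwards

m = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns
column = m[:, 1]        # all rows, second column
block = m[:2, 2:]       # first two rows, last two columns

print(column)
print(block)
```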

## 4. Copies and Views

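A slicing operation creates a view on the original array, which is just a way of accessing array data. Thus the original array is not copied in memory. You can use np.shares_memory() to check whether two arrays share memory. np.may_share_memory() is a cheaper, conservative check that only compares memory bounds, so it can return True even when the arrays do not actually overlap.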

In [55]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [56]:
b = a[::2]
b

array([0, 2, 4, 6, 8])

In [57]:
np.shares_memory(a,b)

True

In [59]:
b[0] = 10
b

array([10,  2,  4,  6,  8])

In [60]:
a  # even though we modified b, 'a' changed too, because both share the same memory

array([10,  1,  2,  3,  4,  5,  6,  7,  8,  9])

In [61]:
a = np.arange(10)

c = a[::2].copy()  # force a copy
c

array([0, 2, 4, 6, 8])

In [62]:
np.shares_memory(a,c)

False

In [63]:
c[0] = 10

a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
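Two further checks can help when reasoning about views. The `base` attribute of a view refers to the array that owns the data, and `np.may_share_memory()` compares only memory bounds, so it can report `True` for arrays that `np.shares_memory()` shows to be disjoint. A minimal sketch (the names `evens` and `odds` are illustrative):

```python
import numpy as np

a = np.arange(10)
evens = a[::2]          # view on a
odds = a[1::2]          # another view, interleaved with evens
c = a[::2].copy()       # independent copy

print(evens.base is a)                    # the view points back at a
print(c.base is None)                     # the copy owns its own data
print(np.may_share_memory(evens, odds))   # memory bounds overlap
print(np.shares_memory(evens, odds))      # but no element is actually shared
```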

## 5. Fancy Indexing

NumPy arrays can be indexed with slices, but also with boolean arrays (masks) or integer arrays. This method is called fancy indexing. It creates copies, not views.

#### Using Boolean Mask

In [65]:
a = np.random.randint(0,20,15)
a

array([ 0, 10,  9,  3,  2, 10,  0, 17, 10,  4,  6,  0,  6,  6, 15])

In [66]:
mask = (a % 2 == 0)

In [67]:
extracted_from_a = a[mask]
extracted_from_a

array([ 0, 10,  2, 10,  0, 10,  4,  6,  0,  6,  6])

Indexing with a mask can be very useful for assigning a new value to a sub-array:

In [68]:
a[mask] = -1
a

array([-1, -1,  9,  3, -1, -1, -1, 17, -1, -1, -1, -1, -1, -1, 15])

#### Indexing with an array of integers

In [69]:
a = np.arange(0,100,10)
a

array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [70]:
# Indexing can be done with an array of integers, where the same index may be repeated several times:
a[[2,3,2,4,2]]

array([20, 30, 20, 40, 20])

In [71]:
# new values can be assigned

a[[9,7]] = -200
a

array([   0,   10,   20,   30,   40,   50,   60, -200,   80, -200])
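That fancy indexing returns copies while slicing returns views can be checked directly. A short sketch, with illustrative variable names:

```python
import numpy as np

a = np.arange(10)
sliced = a[2:5]           # basic slice: a view
picked = a[[2, 3, 4]]     # integer-array index: a copy
masked = a[a > 6]         # boolean mask: a copy

picked[0] = -1            # does not touch a
masked[0] = -1            # does not touch a
sliced[0] = 99            # writes through to a[2]

print(a)
print(np.shares_memory(a, sliced), np.shares_memory(a, picked))
```

Assignments such as `a[mask] = -1` still modify `a` in place, because indexing on the left-hand side of an assignment writes into the original array rather than into a temporary copy.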