# The NumPy array object

# NumPy Arrays

**Python objects:**

1. high-level number objects: integers, floating point
2. containers: lists (cheap append), dictionaries (fast lookup)

**NumPy provides:**

1. extension package to Python for multi-dimensional arrays
2. closer to hardware (efficiency)
3. designed for scientific computation (convenience)
4. also known as array-oriented computing

In [1]:
import numpy as np
a = np.array([0, 1, 2, 3])
print(a)

print(np.arange(10))

[0 1 2 3]
[0 1 2 3 4 5 6 7 8 9]


**Why it is useful:** Memory-efficient container that provides fast numerical operations.
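
To make the memory claim concrete, here is a minimal sketch that compares the rough footprint of a Python list of integers with that of an equivalent array. It assumes CPython's `sys.getsizeof` accounting, so the exact byte counts depend on the platform and NumPy version; the variable names are illustrative only.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n)

# A list stores pointers to separate int objects, so count both
# the list itself and the objects it points to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores its elements in one contiguous buffer.
array_bytes = arr.nbytes  # equals arr.size * arr.itemsize

print(list_bytes, array_bytes)
```

On a typical 64-bit build the list total should come out several times larger than the array buffer.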

In [4]:
#python lists
L = range(1000)
%timeit [i**2 for i in L]

1000 loops, best of 3: 253 µs per loop


In [5]:
a = np.arange(1000)
%timeit a**2

The slowest run took 24.11 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.38 µs per loop
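
The `%timeit` magic used above only works inside IPython or Jupyter. As a rough equivalent for a plain Python script, the comparison could be sketched with the standard-library `timeit` module. The loop count of 1000 is an arbitrary choice, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

values = range(1000)
arr = np.arange(1000)

# Time 1000 runs of each approach; absolute numbers vary by machine.
runs = 1000
t_list = timeit.timeit(lambda: [i**2 for i in values], number=runs)
t_array = timeit.timeit(lambda: arr**2, number=runs)

print(f"list comprehension: {t_list / runs * 1e6:.2f} µs per loop")
print(f"NumPy array:        {t_array / runs * 1e6:.2f} µs per loop")
```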


# 1. Creating arrays

**1.1 Manual construction of arrays**

In [8]:
#1-D

a = np.array([0, 1, 2, 3])

a
print(a)

[0 1 2 3]


In [9]:
#print dimensions

a.ndim

1

In [0]:
#shape

a.shape

(4,)

In [0]:
len(a)

4

In [10]:
# 2-D, 3-D....

b = np.array([[0, 1, 2], [3, 4, 5]])

b

array([[0, 1, 2],
       [3, 4, 5]])

In [0]:
b.ndim

2

In [11]:
b.shape

(2, 3)

In [12]:
len(b)  # returns the size of the first dimension

2

In [13]:
c = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

c

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [14]:
c.ndim

3

In [15]:
c.shape

(2, 2, 2)
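
The cells above query `ndim`, `shape` and `len` separately. The short sketch below shows how they relate to each other, and adds the `size` attribute (not used in the original cells) for the total element count.

```python
import numpy as np

c = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

# ndim is the number of axes, i.e. the length of the shape tuple.
print(c.ndim == len(c.shape))

# len() reports only the size of the first axis.
print(len(c) == c.shape[0])

# size is the total number of elements: the product of the shape entries.
print(c.size == np.prod(c.shape))
```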

**1.2 Functions for creating arrays**

In [16]:
# using the arange function

# arange is an array-valued version of the built-in Python range function

a = np.arange(10) # 0.... n-1
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [17]:
b = np.arange(1, 10, 2) #start, end (exclusive), step

b

array([1, 3, 5, 7, 9])

In [19]:
#using linspace

a = np.linspace(0, 1, 100) #start, end, number of points

a

array([0.        , 0.01010101, 0.02020202, 0.03030303, 0.04040404,
       0.05050505, 0.06060606, 0.07070707, 0.08080808, 0.09090909,
       0.1010101 , 0.11111111, 0.12121212, 0.13131313, 0.14141414,
       0.15151515, 0.16161616, 0.17171717, 0.18181818, 0.19191919,
       0.2020202 , 0.21212121, 0.22222222, 0.23232323, 0.24242424,
       0.25252525, 0.26262626, 0.27272727, 0.28282828, 0.29292929,
       0.3030303 , 0.31313131, 0.32323232, 0.33333333, 0.34343434,
       0.35353535, 0.36363636, 0.37373737, 0.38383838, 0.39393939,
       0.4040404 , 0.41414141, 0.42424242, 0.43434343, 0.44444444,
       0.45454545, 0.46464646, 0.47474747, 0.48484848, 0.49494949,
       0.50505051, 0.51515152, 0.52525253, 0.53535354, 0.54545455,
       0.55555556, 0.56565657, 0.57575758, 0.58585859, 0.5959596 ,
       0.60606061, 0.61616162, 0.62626263, 0.63636364, 0.64646465,
       0.65656566, 0.66666667, 0.67676768, 0.68686869, 0.6969697 ,
       0.70707071, 0.71717172, 0.72727273, 0.73737374, 0.74747
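
The practical difference between `arange` and `linspace` is which quantity you fix: the step or the number of points. The small sketch below uses only five points so the values are easy to read; the `endpoint` keyword is a standard `linspace` parameter that the cell above does not use.

```python
import numpy as np

# linspace: fix the number of points; the end value is included by default.
inclusive = np.linspace(0, 1, 5)                  # 0, 0.25, 0.5, 0.75, 1

# endpoint=False leaves the end value out, like arange does.
exclusive = np.linspace(0, 1, 5, endpoint=False)  # 0, 0.2, 0.4, 0.6, 0.8

# arange: fix the step; the end value is excluded.
stepped = np.arange(0, 1, 0.25)                   # 0, 0.25, 0.5, 0.75

print(inclusive)
print(exclusive)
print(stepped)
```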

In [21]:
#common arrays

a = np.ones((10, 15))

a

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

In [0]:
b = np.zeros((3, 3))

b

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

In [0]:
c = np.eye(3)  #Return a 2-D array with ones on the diagonal and zeros elsewhere.

c

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [22]:
d = np.eye(3, 2)  # 3 rows, 2 columns; the diagonal index k defaults to 0 (the main diagonal)

d

array([[1., 0.],
       [0., 1.],
       [0., 0.]])

In [24]:
#create array using diag function

a = np.diag([1, 2, 3, 4,5,6,7,8,9]) #construct a diagonal array.

a

array([[1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 2, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 3, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 4, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 5, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 6, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 7, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 8, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 9]])

In [0]:
np.diag(a)   #Extract diagonal

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [25]:
#create array using random

#Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).
a = np.random.rand(50) 

a

array([0.30902742, 0.49589391, 0.04653959, 0.41870783, 0.94802201,
       0.27670661, 0.56367269, 0.87231532, 0.44772363, 0.40854744,
       0.2858315 , 0.14047583, 0.30433994, 0.76467723, 0.43515226,
       0.27670773, 0.36955055, 0.56759075, 0.68709222, 0.97085625,
       0.46336081, 0.18192748, 0.96666559, 0.28337202, 0.46626248,
       0.56202184, 0.05799102, 0.33769519, 0.19824424, 0.38278153,
       0.8758828 , 0.61358904, 0.1798454 , 0.75739343, 0.57534423,
       0.53316734, 0.57856226, 0.83022448, 0.39080613, 0.93237519,
       0.37660244, 0.98397289, 0.47199791, 0.91510715, 0.82755851,
       0.29234382, 0.33371726, 0.33650656, 0.38187891, 0.51055268])

In [0]:
a = np.random.randn(4)  # Return a sample (or samples) from the "standard normal" (Gaussian) distribution.

a

array([  1.99407539e+00,  -1.33836224e+00,   3.07395038e-04,
         4.73482900e-01])

**Note:**
    
For random samples from $N(\mu, \sigma^2)$, use:

sigma * np.random.randn(...) + mu
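
As a quick sanity check of the formula in this note, the sketch below draws many samples and compares the empirical mean and standard deviation with the chosen parameters. The values of `mu` and `sigma`, the sample count and the seed are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

mu, sigma = 5.0, 2.0       # illustrative values
np.random.seed(0)          # fixed seed so the run is repeatable

# Scale standard-normal samples by sigma, then shift them by mu.
samples = sigma * np.random.randn(100_000) + mu

# With many samples, the estimates should land close to mu and sigma.
print(samples.mean(), samples.std())
```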



# 2. Basic DataTypes

You may have noticed that, in some instances, array elements are displayed with a **trailing dot (e.g. 2. vs 2)**. This is due to a difference in the **data-type** used:

In [26]:
a = np.arange(10)

a.dtype

dtype('int64')

In [27]:
#You can explicitly specify which data-type you want:

a = np.arange(10, dtype='float64')
a

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [29]:
# The default data type is float64 for the zeros and ones functions

a = np.zeros((10, 10))

print(a)

a.dtype

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]


dtype('float64')
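
Because every array has a single data type, NumPy has to pick one when it builds the array and has to convert values that are assigned later. The sketch below illustrates both effects. The variable names are illustrative, and `astype` is a standard array method that is not otherwise covered in this section.

```python
import numpy as np

ints = np.array([1, 2, 3])       # all integers -> an integer dtype
floats = np.array([1., 2, 3])    # one float is enough -> float64

print(ints.dtype.kind, floats.dtype)

# Assigning a float into an integer array drops the fractional part.
ints[0] = 1.9
print(ints[0])                   # stored as the integer 1

# astype returns a converted copy and leaves the original untouched.
as_float = ints.astype('float64')
print(as_float.dtype, ints.dtype.kind)
```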

**other datatypes**

In [31]:
d = np.array([1+2j, 2+4j]) 
d  #Complex datatype

print(d.dtype)

complex128


In [32]:
b = np.array([True, False, True, False])  #Boolean datatype

print(b.dtype)

bool


In [33]:
s = np.array(['Ram', 'Robert', 'Rahim'])

s.dtype

dtype('<U6')

**Each built-in data type has a character code that uniquely identifies it.**

'b' − boolean

'i' − (signed) integer

'u' − unsigned integer

'f' − floating-point

'c' − complex-floating point

'm' − timedelta

'M' − datetime

'O' − (Python) objects

'S', 'a' − (byte-)string

'U' − Unicode

'V' − raw data (void)
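
These character codes can be read back from any dtype through its `kind` attribute. The sketch below builds one dtype per code in the list above; the particular dtype strings (for example `'S3'` or `'datetime64[D]'`) are just convenient examples.

```python
import numpy as np

examples = [bool, 'int64', 'uint8', 'float64', 'complex128',
            'timedelta64[s]', 'datetime64[D]', object, 'S3', 'U6', 'V4']

# Print each dtype's name next to its one-character kind code.
for spec in examples:
    dt = np.dtype(spec)
    print(f"{dt.name:>16}  kind={dt.kind!r}")

kinds = [np.dtype(spec).kind for spec in examples]
```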

**For more details**

**https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html**

# 3. Indexing and Slicing

**3.1 Indexing**

The items of an array can be accessed and assigned to the same way as other **Python sequences (e.g. lists)**:

In [0]:
a = np.arange(10)

print(a[5])  #indices begin at 0, like other Python sequences (and C/C++)

5


In [34]:
# For multidimensional arrays, indexes are tuples of integers:

a = np.diag([1, 2, 3])

print(a[2, 2])

3


In [35]:
a[2, 1] = 5 #assigning value

a

array([[1, 0, 0],
       [0, 2, 0],
       [0, 5, 3]])

**3.2 Slicing**

In [36]:
a = np.arange(10)

a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [0]:
a[1:8:2] # [startindex: endindex(exclusive) : step]

array([1, 3, 5, 7])

In [37]:
#we can also combine assignment and slicing:

a = np.arange(10)
a[5:] = 10
a

array([ 0,  1,  2,  3,  4, 10, 10, 10, 10, 10])

In [38]:
b = np.arange(5)
a[5:] = b[::-1]  #assigning

a

array([0, 1, 2, 3, 4, 4, 3, 2, 1, 0])
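
The same `start:end:step` notation applies to each axis of a multidimensional array, with the axes separated by commas. The sketch below uses a small 3x4 array; `reshape` is used here only as a convenient way to build it and is not introduced in this section.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)   # rows 0..2, columns 0..3

row = m[1]             # second row: 4, 5, 6, 7
col = m[:, 0]          # first column: 0, 4, 8
block = m[:2, 1:3]     # rows 0-1, columns 1-2: [[1, 2], [5, 6]]
flipped = m[::2, ::-1] # every other row, columns reversed

print(row, col, block, flipped, sep="\n")
```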

# 4. Copies and Views

A slicing operation creates a view on the original array, which is just a way of accessing array data. Thus the original array is not copied in memory. You can use **np.shares_memory()** to check exactly whether two arrays share memory, or **np.may_share_memory()** for a faster, conservative check that can return false positives.

**When modifying the view, the original array is modified as well:**

In [39]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [40]:
b = a[::2]
b

array([0, 2, 4, 6, 8])

In [41]:
np.shares_memory(a, b)

True

In [42]:
b[0] = 10
b

array([10,  2,  4,  6,  8])

In [43]:
a  # even though we modified b, 'a' was updated too, because both share the same memory

array([10,  1,  2,  3,  4,  5,  6,  7,  8,  9])

In [44]:


a = np.arange(10)

c = a[::2].copy()     #force a copy
c

array([0, 2, 4, 6, 8])

In [45]:
np.shares_memory(a, c)

False

In [46]:
c[0] = 10

a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
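
Two further ways to inspect the view/copy distinction are sketched below. The `base` attribute records which array owns the data, and the two memory-check functions can disagree because `np.may_share_memory()` only compares memory bounds by default. The variable names are illustrative.

```python
import numpy as np

a = np.arange(10)
view = a[::2]
copy = a[::2].copy()

# A view records the array that owns its data in .base; a copy owns its own.
print(view.base is a)      # True
print(copy.base is None)   # True

# Interleaved slices overlap in address range but share no element.
evens, odds = a[::2], a[1::2]
print(np.may_share_memory(evens, odds))  # True: bounds overlap
print(np.shares_memory(evens, odds))     # False: exact check
```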

# 5. Fancy Indexing

NumPy arrays can be indexed with slices, but also with boolean or integer arrays **(masks)**. This method is called **fancy indexing**. It creates copies, not views.

**Using Boolean Mask**

In [47]:
a = np.random.randint(0, 20, 15)
a

array([17, 10,  1, 10, 13, 11,  3, 14, 12, 18, 10, 19, 13,  6, 10])

In [0]:
mask = (a % 2 == 0)

In [50]:
extract_from_a = a[mask]

extract_from_a

array([10, 10, 14, 12, 18, 10,  6, 10])

**Indexing with a mask can be very useful to assign a new value to a sub-array:**

In [51]:
a[mask] = -1
a

array([17, -1,  1, -1, 13, 11,  3, -1, -1, -1, -1, 19, 13, -1, -1])

**Indexing with an array of integers**

In [52]:
a = np.arange(0, 100, 10)

a

array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [53]:
# Indexing can be done with an array of integers, where the same index may be repeated several times:

a[[2, 3, 2, 4, 2]]

array([20, 30, 20, 40, 20])
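
When the index is itself a NumPy array, the result takes the shape of the index array rather than the shape of the array being indexed. A minimal sketch, with an arbitrarily chosen 2x2 index array:

```python
import numpy as np

a = np.arange(0, 100, 10)
idx = np.array([[3, 4], [9, 7]])

picked = a[idx]
print(picked.shape)  # same shape as idx: (2, 2)
print(picked)        # values 30, 40 in the first row and 90, 70 in the second
```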

In [54]:
# New values can be assigned 

a[[9, 7]] = -200

a

array([   0,   10,   20,   30,   40,   50,   60, -200,   80, -200])
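
The statement at the start of this section that fancy indexing creates copies applies when the indexed expression is read. When it appears on the left-hand side of an assignment, as in the cells above, the values are written into the original array. The sketch below contrasts the two cases; the variable names are illustrative.

```python
import numpy as np

a = np.arange(10)
mask = (a % 2 == 0)

# Reading with a mask produces a new array, independent of a.
selected = a[mask]
selected[:] = -1
after_read = a.copy()                     # snapshot: a is still 0..9
print(np.shares_memory(a, selected))      # False

# Assigning through a mask writes directly into a.
a[mask] = -1
print(after_read)
print(a)
```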