# Numpy: Numeric computing library

NumPy (Numerical Python) is one of the core packages for numerical computing in Python. Pandas, Matplotlib, Statsmodels and many other scientific libraries rely on NumPy.

In [35]:
import numpy as np

## Basic Numpy Arrays
One dimensional arrays

How can you create arrays?

In [12]:
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [4]:
a= np.array([1, 2, 3, 4])

In [5]:
b = np.array([0, .5, 1, 1.5, 2])

How can you access individual elements of a NumPy array?

- Extract the first and second elements of the array a.

In [6]:
a[0], a[1]

(1, 2)

Slicing works the same way as for a list.

### Slicing arrays:
Slicing in Python means taking the elements from one given index up to, but not including, another given index.

We pass a slice instead of an index, like this: [start:end].

We can also define the step, like this: [start:end:step].

If we don't pass start, it's considered 0.

If we don't pass end, it's considered the length of the array in that dimension.

If we don't pass step, it's considered 1.
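
The defaults listed above can be checked with a short sketch. The variable names are only illustrative and are not part of the original notebook.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# Each slice leaves out one or more of start, end and step,
# so the defaults described above are used instead.
full = a[:]            # start=0, end=len(a), step=1
head = a[:2]           # start defaults to 0
tail = a[2:]           # end defaults to len(a)
every_other = a[::2]   # only the step is given
backwards = a[::-1]    # a negative step walks the array in reverse

print(full, head, tail, every_other, backwards)
```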


In [7]:
a[0:]

array([1, 2, 3, 4])

- Extract the elements from the second element to the third one 

In [14]:
a[1:3]

array([2, 3])

- Extract the elements from the second element to the third one (using negative indexing)

In [9]:
a[1:-1]

array([2, 3])

In [10]:
a[::2]

array([1, 3])

Multi indexing

In [11]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [12]:
b[0], b[2], b[-1]

(0.0, 1.0, 2.0)

Indexing with a list of positions (multi indexing) creates another array:

In [13]:
b[[0, 2, -1]]

array([0., 1., 2.])
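
One practical consequence, sketched below with illustrative variable names: indexing with a list returns an independent copy, while a plain slice returns a view that shares memory with the original array.

```python
import numpy as np

b = np.array([0, .5, 1, 1.5, 2])

picked = b[[0, 2, -1]]   # list of indices: a new, independent array
window = b[1:3]          # slice: a view onto b's data

picked[0] = 99           # changes only the copy
window[0] = -1           # writes through to b[1]

print(b)
print(np.shares_memory(picked, b), np.shares_memory(window, b))
```

If you need an independent array from a slice, calling .copy() on it is the usual way to get one.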

## Array Types





In [20]:
a



array([1, 2, 3, 4])

In [21]:


a.dtype



dtype('int32')

In [22]:


b

array([0. , 0.5, 1. , 1.5, 2. ])

In [23]:

b.dtype



dtype('float64')

You can always set the type of a NumPy array explicitly when you create it

In [19]:


np.array([1, 2, 3, 4], dtype=float)



array([1., 2., 3., 4.])

You can also choose a smaller type to use fewer bytes per element, which saves memory

In [20]:


np.array([1, 2, 3, 4], dtype=np.int8)



array([1, 2, 3, 4], dtype=int8)
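
To change the type of an array that already exists, astype returns a converted copy. The sketch below also shows how the element size affects memory use. The starting dtype is set explicitly to int64 here as an assumption, so the numbers do not depend on the platform.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int64)

small = a.astype(np.int8)        # converted copy, 1 byte per element
as_float = a.astype(np.float64)  # converted copy, 8 bytes per element

print(a.itemsize, a.nbytes)          # bytes per element, total bytes
print(small.itemsize, small.nbytes)

# A smaller type also has a smaller range; np.iinfo reports it.
info = np.iinfo(np.int8)
print(info.min, info.max)
```

Values outside the range of the target type do not fit, so a small dtype is only safe when the data is known to stay within it.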

NumPy arrays are designed to store numbers, dates and booleans rather than arbitrary Python objects. Strings are supported through a dedicated fixed-width Unicode type (shown below as '<U2', meaning Unicode strings of at most 2 characters). Still, NumPy is mostly used for numeric processing.

In [25]:

c = np.array(['av', 'bv', 'cv'])

c.dtype



dtype('<U2')




## Dimensions and shapes
A core idea of NumPy is that we can create multi-dimensional arrays, and it provides many attributes and functions to work with them.

In [27]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])



In [28]:


A.shape



(2, 3)

- How many dimensions does it have?

In [29]:

A.ndim



2

In [30]:

A.size



6

Let's create a three-dimensional array, which you can picture as a stack of matrices

In [37]:

B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])



In [38]:
B.dtype

dtype('int32')

In [33]:


B.shape



(2, 2, 3)

In [34]:
B.ndim

3

In [35]:
B.size

12
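
The three attributes are related: ndim is the length of shape, and size is the product of its entries. A short sketch on the same 3D array, which also shows indexing with one index per dimension:

```python
import numpy as np

B = np.array([
    [[12, 11, 10], [9, 8, 7]],
    [[6, 5, 4], [3, 2, 1]],
])

ndim_matches = B.ndim == len(B.shape)        # 3 axes
size_matches = B.size == np.prod(B.shape)    # 2 * 2 * 3

element = B[1, 0, 2]      # second block, first row, third column
first_block = B[0]        # a 2 x 3 matrix
first_column = B[:, :, 0] # first column of every block

print(element, first_block.shape, first_column)
```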

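You always have to be careful when creating multi-dimensional arrays: the nested lists must have matching dimensions.

If the shape isn't consistent, NumPy cannot build a regular numeric array. Older versions silently fell back to an array of regular Python objects; since NumPy 1.24 this raises a ValueError unless you request that fallback explicitly with dtype=object:



In [26]:


C = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]
    ]
], dtype=object)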



In [27]:

C.dtype



dtype('O')

In [28]:


C.shape




(2,)

In [29]:
C.size

2

## Indexing and Slicing of Matrices

In [42]:
A = np.array([
#    0  1  2   (column indices)
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])

How can you get the second element (row)?

In [43]:
A[1]

array([4, 5, 6])

How can you get the first element of the second row?

In [44]:
A[1][0]

4

There is a better way: NumPy's multi-dimensional selection, [row, column], or more generally [dim1, dim2, dim3, ...].

In [45]:
A[1, 0]

4

- Extract the first two rows

In [46]:
A[0:2]

array([[1, 2, 3],
       [4, 5, 6]])

- Extract the first two elements of all rows 

In [47]:
A[:, :2]

array([[1, 2],
       [4, 5],
       [7, 8]])

- Extract the first two elements of the first two rows

In [48]:
A[:2, :2]

array([[1, 2],
       [4, 5]])

- Extract the last element of the first two rows

In [49]:
A[:2, 2:]

array([[3],
       [6]])

Next, modifying the contents of an array

In [50]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Assign a new array to an entire row

In [51]:
A[1] = np.array([10, 10, 10])

In [52]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

Or we can assign a single value, which NumPy expands (broadcasts) across the whole row

In [53]:
A[2] = 99

In [54]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])
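
The same kinds of assignment work for columns and sub-blocks, not only for rows. A minimal sketch on a fresh copy of the matrix (the values assigned are arbitrary):

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# A single value is expanded over the whole first column.
A[:, 0] = 0

# A block is replaced by an array of the same shape (2 x 2).
A[:2, 1:] = [[20, 30], [50, 60]]

print(A)
```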

## Summary statistics
A big advantage of NumPy is the large number of operations you can perform on arrays. The first group is the basic summary statistics, many of which are built in as array methods.

In [55]:
a = np.array([1, 2, 3, 4])

- How to get the sum of the elements of an array?

In [56]:
a.sum()

10

- How to get the average of the elements of an array?

In [57]:


a.mean()



2.5

- How to get the standard deviation?

In [58]:
a.std()

1.118033988749895

- How to get the variance?

In [59]:
a.var()

1.25

We can apply these functions on matrices.

In [60]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [61]:
A.sum()

45

In [62]:
A.mean()

5.0

In [63]:


A.std()



2.581988897471611

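We can also apply these methods along a single axis: axis=0 aggregates down the rows (one result per column) and axis=1 aggregates across the columns (one result per row)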

In [64]:
A.sum(axis=0)

array([12, 15, 18])

In [65]:
A.sum(axis=1)

array([ 6, 15, 24])

In [66]:
A.mean(axis=0)

array([4., 5., 6.])

In [67]:
A.mean(axis=1)

array([2., 5., 8.])

In [68]:
A.std(axis=0)

array([2.44948974, 2.44948974, 2.44948974])

In [69]:
A.std(axis=1)

array([0.81649658, 0.81649658, 0.81649658])
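
Two details are easy to miss here, sketched below: keepdims=True keeps the reduced axis with length 1, and std and var compute the population statistic by default (ddof=0), while ddof=1 gives the sample version.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

col_sums = A.sum(axis=0)                   # one value per column
row_sums = A.sum(axis=1)                   # one value per row
row_means = A.mean(axis=1, keepdims=True)  # shape (3, 1) instead of (3,)

population_std = A.std(axis=0)             # divides by n
sample_std = A.std(axis=0, ddof=1)         # divides by n - 1

print(col_sums, row_sums, row_means.shape)
print(population_std, sample_std)
```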

## Broadcasting and Vectorized operations


In [70]:
a = np.arange(4)

In [71]:
a

array([0, 1, 2, 3])

In [72]:
a+10

array([10, 11, 12, 13])

NumPy arrays themselves are mutable, but arithmetic operators such as + and * do not modify the array: they return a new array. Augmented assignments such as += do modify the array in place.

In [73]:
a

array([0, 1, 2, 3])

In [74]:
a * 10

array([ 0, 10, 20, 30])

In [75]:


a += 100



In [76]:
a

array([100, 101, 102, 103])
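
The difference between an operator that returns a new array and an in-place update can be made visible with a second name bound to the same array. This is a small sketch; the name alias is only illustrative.

```python
import numpy as np

a = np.arange(4)
alias = a          # a second name for the same array object

b = a + 10         # builds a new array; a is unchanged
unchanged = a.tolist()

a += 100           # updates the existing array in place

print(b)
print(alias)       # the in-place update is visible through alias
```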

In [42]:
l = [0, 1, 2, 3]

In [43]:
[i * 10 for i in l]

[0, 10, 20, 30]

In [80]:
a = np.arange(4)

In [82]:
a

array([0, 1, 2, 3])

In [83]:
b = np.array([10, 10, 10, 10])

In [84]:
b

array([10, 10, 10, 10])

In [86]:
a + b

array([10, 11, 12, 13])

In [87]:
a*b

array([ 0, 10, 20, 30])
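
The examples so far combine an array with a scalar or with an array of the same shape. Broadcasting also covers arrays of different but compatible shapes, as this sketch with illustrative names shows: dimensions of length 1 (or missing leading dimensions) are stretched to match.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # shape (2, 3)
row = np.array([10, 20, 30])      # shape (3,)
col = np.array([[100], [200]])    # shape (2, 1)

plus_row = A + row    # row is added to every row of A
plus_col = A + col    # col is added to every column of A
grid = row + col      # (3,) and (2, 1) broadcast to (2, 3)

print(plus_row)
print(plus_col)
print(grid)
```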

## Boolean arrays

What happens when you apply boolean operations?

In [89]:
a = np.arange(4)

In [90]:
a

array([0, 1, 2, 3])

In [91]:
a[0], a[-1]

(0, 3)

The cell above is the first way to pick the first and last elements: one index at a time. Second way: multi-index selection

In [92]:
a[[0, -1]]

array([0, 3])

Third way: with boolean arrays

In [94]:
a[[True, False, False, True]]

array([0, 3])

In [95]:
a

array([0, 1, 2, 3])

In [96]:
a >= 2

array([False, False,  True,  True])

To select the numbers greater than or equal to 2:

In [97]:
a[a >= 2]

array([2, 3])

In [98]:
a.mean()

1.5

In [99]:
a[a > a.mean()]

array([2, 3])

- To select the elements that are not greater than the average:

In [101]:
a[~(a > a.mean())]

array([0, 1])

- We can combine conditions with the boolean operators | (or) and & (and); each condition needs its own parentheses

In [103]:
a[(a == 0) | (a == 1)]

array([0, 1])

In [104]:
a[(a <= 2) & (a % 2 == 0)]

array([0, 2])

In [105]:
A = np.random.randint(100, size=(3, 3))

In [107]:
A

array([[38, 59,  1],
       [65, 39, 29],
       [22, 74, 41]])

- Select some elements using boolean arrays

In [108]:
A[np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]

array([38,  1, 39, 22, 41])

In [109]:
A > 30

array([[ True,  True, False],
       [ True,  True, False],
       [False,  True,  True]])

In [111]:
A[A > 30]

array([38, 59, 65, 39, 74, 41])
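
Boolean masks are useful for more than selection. The sketch below reuses the values printed above as a fixed array (so it does not depend on the random draw) and shows counting, masked assignment and np.where; the threshold of 30 is the one from the example.

```python
import numpy as np

A = np.array([
    [38, 59, 1],
    [65, 39, 29],
    [22, 74, 41]
])

mask = A > 30
count = int(mask.sum())          # True counts as 1

capped = A.copy()
capped[mask] = 30                # overwrite only the selected elements

kept = np.where(A > 30, A, 0)    # keep matching values, 0 elsewhere

print(count)
print(capped)
print(kept)
```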

## Linear Algebra

In [112]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [113]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

Dot product of two arrays

In [114]:
A.dot(B)

array([[20, 14],
       [56, 41],
       [92, 68]])

In [115]:


A @ B



array([[20, 14],
       [56, 41],
       [92, 68]])

To get the transposed array:

In [116]:
B.T

array([[6, 4, 2],
       [5, 3, 1]])

In [117]:
B.T @ A

array([[36, 48, 60],
       [24, 33, 42]])
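
The products above work because the inner dimensions agree: an (m, n) matrix times an (n, p) matrix gives an (m, p) matrix. A small sketch of that rule, including what happens when the shapes do not line up (the flag names are illustrative):

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

C = A @ B                      # (3, 3) @ (3, 2) -> (3, 2)

# The transpose of a product is the product of the transposes, reversed.
transpose_rule_holds = np.array_equal((A @ B).T, B.T @ A.T)

# (3, 2) @ (3, 3) has mismatched inner dimensions (2 vs 3).
try:
    B @ A
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(C.shape, transpose_rule_holds, mismatch_raised)
```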

## Size of objects in Memory


Ints and floats

In [47]:
import sys

# A small integer in CPython takes 28 bytes (on a 64-bit build)
sys.getsizeof(1)

28

In [119]:
# Very large integers take even more space
sys.getsizeof(10**100)

72

Lists are even larger

In [124]:
# A one-element list
sys.getsizeof([1])

64

In [125]:
# An array of one element in numpy
np.array([1]).nbytes

4

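Performance matters as well. In the timings below, the vectorized NumPy version runs roughly 50 times faster than the list comprehension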

In [126]:
l = list(range(100000))

In [127]:
a = np.arange(100000)

In [128]:
%time np.sum(a ** 2)

Wall time: 942 µs


216474736

In [129]:
%time sum([x ** 2 for x in l])

Wall time: 46.3 ms


333328333350000
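
Note that the two results above differ. The array in that run has dtype int32 (as the earlier dtype outputs show), and squaring values up to 99999 overflows a 32-bit integer, so the NumPy total wraps around: 216474736 is exactly 333328333350000 modulo 2**32. A sketch of one way to avoid this, assuming a 64-bit integer type is acceptable for the data:

```python
import numpy as np

n = 100000
l = list(range(n))

# Request 64-bit integers explicitly so the result does not depend
# on the platform's default integer size.
a = np.arange(n, dtype=np.int64)

numpy_total = int(np.sum(a ** 2))
python_total = sum(x ** 2 for x in l)

# The wrapped value from the 32-bit run is the true total modulo 2**32.
wrapped = python_total % 2**32

print(numpy_total, python_total, wrapped)
```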

## Useful Numpy functions

### random

In [130]:
np.random.random(size=2)

array([0.94187431, 0.98689537])

In [131]:
np.random.normal(size=2)

array([ 0.09946516, -0.00122053])

In [132]:
np.random.rand(2, 4)

array([[0.42080495, 0.27145559, 0.31558637, 0.12054533],
       [0.56327072, 0.16642928, 0.83017258, 0.28719899]])
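
The calls above give different numbers on every run. When reproducible results are needed, one option is NumPy's Generator interface with a fixed seed. This is a sketch of an alternative to the np.random functions used above, and the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random(size=2)               # floats in [0, 1)
normal = rng.normal(size=2)                # standard normal draws
ints = rng.integers(0, 100, size=(3, 3))   # integers in [0, 100)

# A second generator with the same seed repeats the same sequence.
repeat = np.random.default_rng(seed=42).random(size=2)

print(uniform, normal)
print(ints)
print(np.array_equal(uniform, repeat))
```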

### arange





In [133]:


np.arange(10)




array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [134]:

np.arange(5, 10)


array([5, 6, 7, 8, 9])

In [135]:

np.arange(0, 1, .1)


array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

### reshape

np.arange(10).reshape(2, 5)

np.arange(10).reshape(5, 2)
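
A short sketch of what these calls produce. It also shows that one dimension can be given as -1, in which case NumPy infers it from the total number of elements.

```python
import numpy as np

r = np.arange(10)

two_by_five = r.reshape(2, 5)
five_by_two = r.reshape(5, 2)
inferred = r.reshape(-1, 2)    # -1 is worked out as 10 / 2 = 5

print(two_by_five)
print(five_by_two.shape, inferred.shape)
```

The new shape must account for exactly the same number of elements; otherwise reshape raises a ValueError.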



### linspace

np.linspace(0, 1, 5)

np.linspace(0, 1, 20)

np.linspace(0, 1, 20, endpoint=False)
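
linspace takes the number of points rather than a step size, and by default includes the end value. A small sketch with 5 points, chosen to keep the output short:

```python
import numpy as np

closed = np.linspace(0, 1, 5)                     # end value included
half_open = np.linspace(0, 1, 5, endpoint=False)  # end value excluded

print(closed)
print(half_open)
```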



### zeros, ones, empty

np.zeros(5)

np.zeros((3, 3))

np.zeros((3, 3), dtype=int)

np.ones(5)

np.ones((3, 3))

np.empty(5)

np.empty((2, 2))
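
One difference between these constructors is worth spelling out: zeros and ones initialize every element, while empty only allocates memory, so its contents are arbitrary until you fill them. A minimal sketch (np.full is shown as a related convenience and is not used elsewhere in this notebook):

```python
import numpy as np

z = np.zeros((2, 3))             # float64 by default
o = np.ones((2, 3), dtype=int)   # integer ones

e = np.empty((2, 3))             # contents are arbitrary at this point
e[:] = 7                         # fill before reading any value

f = np.full((2, 3), 7.0)         # allocate and fill in one call

print(z.dtype, o.dtype)
print(np.array_equal(e, f))
```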



### identity and eye

np.identity(3)

np.eye(3, 3)

np.eye(8, 4)

np.eye(8, 4, k=1)

np.eye(8, 4, k=-3)
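
A short sketch of how these differ: identity always builds a square matrix, while eye accepts a rectangular shape and a diagonal offset k (positive for diagonals above the main one, negative for those below). Smaller shapes than above are used to keep the output readable.

```python
import numpy as np

square = np.identity(3)
same_as_eye = np.array_equal(square, np.eye(3))

upper = np.eye(3, 4, k=1)    # ones on the first diagonal above the main one
lower = np.eye(4, 3, k=-1)   # ones on the first diagonal below the main one

print(same_as_eye)
print(upper)
print(lower)
```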



