# Why NumPy?
In data science and machine learning, large arrays often need to be operated on. Plain Python lists are slow for this: every element is a full Python object, which lets a list hold mixed types, but every operation then pays for type checks and pointer lookups.
NumPy is designed specifically for this task. Its arrays store elements of a single type in one block of memory, and its core routines are written in C, which contributes to their speed.<br>
A CPU's GHz rating is the frequency at which its internal clock ticks; a few GHz, i.e. on the order of 10^9 ticks per second, is normal nowadays. Element-wise operations on Python lists, however, run 10-100 times (or more) slower than that rate would suggest.


# List operations in Python
Before starting with NumPy, here is a summary of the different ways of operating on lists.

In [1]:
N = 100000000;


In [2]:
%%time
list1 = list(range(N));

CPU times: user 1.41 s, sys: 1.77 s, total: 3.19 s
Wall time: 3.17 s


In [4]:
%%time
for i in range(N):
    list1[i] = list1[i] ** 2;

# This is slow: at ~10^9 clock ticks per second, 10^8 operations would ideally take on the order of 0.1 s, but the loop took ~39 s, i.e. a few hundred times slower.

CPU times: user 39 s, sys: 61.8 ms, total: 39 s
Wall time: 39.1 s


In [5]:
%%time
list2 = [x*x for x in list1];
# Better than the loop above (about 1/3 of the time), but still slow: the time is still on the order of 10 s.

CPU times: user 9.37 s, sys: 3.55 s, total: 12.9 s
Wall time: 13 s


In [6]:
%%time
list1 = map(lambda x: x * x, list1);
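# This only looks much faster: in Python 3, map() returns a lazy iterator, so no squaring has happened yet.
# The work is done only when the iterator is consumed, e.g. with list(map(...)). Note that list1 is now a map object, not a list.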

CPU times: user 6 µs, sys: 1 µs, total: 7 µs
Wall time: 10.7 µs
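
The microsecond timing above measures only the creation of the `map` object. A minimal sketch of where the time actually goes, assuming a much smaller `n` than `N` so that it runs quickly (the names `n`, `values`, `lazy` and `squares` are illustrative and not part of the original notebook):

```python
import time

n = 1_000_000  # assumption: far smaller than N above, to keep the run short

values = list(range(n))

start = time.perf_counter()
lazy = map(lambda x: x * x, values)   # builds an iterator; nothing is squared yet
create_seconds = time.perf_counter() - start

start = time.perf_counter()
squares = list(lazy)                  # the squaring happens here, element by element
consume_seconds = time.perf_counter() - start

print(type(lazy).__name__)
print(f"creating the map object:   {create_seconds:.6f} s")
print(f"consuming it with list():  {consume_seconds:.6f} s")
```

Timed this way, `map` with a `lambda` usually lands in the same range as the list comprehension, since both run the squaring in the Python interpreter.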


In [4]:
map?

# NumPy

In [2]:
import numpy as np

In [7]:
%%time
arr = np.arange(N); # Similar to list(range(N))
# list(range(N)) took ~3 s; this is about 10 times faster

CPU times: user 78.7 ms, sys: 243 ms, total: 321 ms
Wall time: 322 ms


In [8]:
%%time
arr = arr * arr;
# Much faster than the list-based approaches above; even including the array creation time, it is the fastest

CPU times: user 123 ms, sys: 150 ms, total: 273 ms
Wall time: 278 ms
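
The three approaches can also be compared side by side with `timeit`. This is a rough sketch rather than a rigorous benchmark: it assumes a smaller size `n` than `N`, forces the `map` result with `list()` so that all variants do the same work, and uses illustrative names (`values`, `arr_small`) that do not appear in the notebook.

```python
import timeit
import numpy as np

n = 200_000      # assumption: small enough for each run to finish quickly
repeats = 5

values = list(range(n))
arr_small = np.arange(n, dtype=np.int64)  # explicit int64 so the squares cannot overflow

# Average seconds per run for each way of squaring every element
comprehension_s = timeit.timeit(lambda: [x * x for x in values], number=repeats) / repeats
map_s = timeit.timeit(lambda: list(map(lambda x: x * x, values)), number=repeats) / repeats
numpy_s = timeit.timeit(lambda: arr_small * arr_small, number=repeats) / repeats

print(f"list comprehension: {comprehension_s:.5f} s")
print(f"list(map(...)):     {map_s:.5f} s")
print(f"numpy arr * arr:    {numpy_s:.5f} s")
```

The exact numbers depend on the machine, but the NumPy line is typically the smallest by a wide margin.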


In [9]:
arr = np.sum(arr);
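
One caveat about this sum: NumPy integers have a fixed width. The squares below 10^8 add up to roughly 3.3 x 10^23, which is more than a 64-bit integer can hold (about 9.2 x 10^18), so the result silently wraps around. The sketch below shows the same effect on a tiny array; it assumes `int64` and uses illustrative names (`big`, `wrapped`, `exact`).

```python
import numpy as np

# Two values whose true sum, 2**63, is one more than the largest int64
big = np.array([2**62, 2**62], dtype=np.int64)

wrapped = np.sum(big)       # accumulated in int64, so the result wraps around
exact = sum(big.tolist())   # Python ints have arbitrary precision

print("np.sum result:  ", wrapped)
print("exact result:   ", exact)
print("results agree?  ", int(wrapped) == exact)
```

When an exact result matters, summing in floating point (for example `np.sum(big, dtype=np.float64)`) avoids the wraparound at the cost of rounding.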

# High Dimensional Arrays

In [4]:
arr = np.arange(5);

In [5]:
print(arr, type(arr))

[0 1 2 3 4] <class 'numpy.ndarray'>


In [6]:
# Converting a normal Python list into a NumPy array
arr = np.array([0, 1, 'a']);

In [8]:
print(arr) # 0 and 1 were automatically converted to strings: a NumPy array stores a single data type, so mixed inputs are promoted to a common one

['0' '1' 'a']
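
The chosen common type can be inspected through `dtype`. A small sketch, with illustrative variable names, of how mixed inputs are promoted and how `dtype=object` keeps the original Python objects (at the cost of most of NumPy's speed):

```python
import numpy as np

mixed = np.array([0, 1, 'a'])        # ints and a string: promoted to a string dtype
numbers = np.array([1, 2.5])         # int and float: promoted to float64
as_objects = np.array([0, 1, 'a'], dtype=object)  # keeps the original Python objects

print(mixed.dtype.kind)              # 'U' means Unicode string
print(numbers.dtype)
print([type(x).__name__ for x in as_objects])
```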


In [3]:
arr3d = np.array([
                  [
                   [1, 2, 3],
                   [4, 5, 6]
                  ],
                  [
                   [7, 8, 9],
                   [10, 11, 12]
                  ]
])

In [4]:
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [5]:
arr3d.shape # No. of elements in each dimension

(2, 2, 3)

In [6]:
arr3d.ndim # Total no. of dimensions

3

In [7]:
arr3d.size # Total no. of elements, or, 2 * 2 * 3 (= product of all values from arr3d.shape)

12
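
The three attributes are tied together: `ndim` is the length of `shape`, and `size` is the product of its entries. A quick check, using a stand-in array `cube` built with `reshape` (a shortcut assumed here for brevity; it holds the same values as `arr3d`):

```python
import math
import numpy as np

# Same values and shape as arr3d, built from a flat range
cube = np.arange(1, 13).reshape(2, 2, 3)

print(cube.shape, cube.ndim, cube.size)
print(cube.ndim == len(cube.shape))          # number of dimensions = length of shape
print(cube.size == math.prod(cube.shape))    # total elements = product of shape entries
print(len(cube))                             # len() gives only the first dimension
```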

In [14]:
# Arrays returned by np.ones have a float data type (float64) by default.
onesArray = np.ones((2,3)) # Shape should be in the form of a tuple

In [15]:
print(onesArray, onesArray.dtype)

[[1. 1. 1.]
 [1. 1. 1.]] float64


In [19]:
print(1769 * onesArray, 2 + onesArray) # The scalar is applied to every element, as in MATLAB; NumPy calls this broadcasting

[[1769. 1769. 1769.]
 [1769. 1769. 1769.]] [[3. 3. 3.]
 [3. 3. 3.]]
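
Broadcasting is not limited to scalars. As a brief sketch (the arrays `grid`, `row` and `column` are illustrative), a row or a column is stretched across a 2 x 3 array when the shapes are compatible, and an incompatible shape raises an error:

```python
import numpy as np

grid = np.ones((2, 3))
row = np.array([10.0, 20.0, 30.0])       # shape (3,): repeated for every row
column = np.array([[100.0], [200.0]])    # shape (2, 1): repeated for every column

print(grid + row)      # [[11. 21. 31.] [11. 21. 31.]]
print(grid + column)   # [[101. 101. 101.] [201. 201. 201.]]

try:
    grid + np.array([1.0, 2.0])          # shape (2,) does not match the last axis (3)
    compatible = True
except ValueError:
    compatible = False
print("shape (2,) broadcasts against (2, 3)?", compatible)
```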


In [20]:
zerosArray = np.zeros((2, 3, 4))

In [21]:
print(zerosArray)

[[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]


In [22]:
print((zerosArray + 4) * 3)

[[[12. 12. 12. 12.]
  [12. 12. 12. 12.]
  [12. 12. 12. 12.]]

 [[12. 12. 12. 12.]
  [12. 12. 12. 12.]
  [12. 12. 12. 12.]]]


In [24]:
np.random.randn(2, 3) # Draws random values from a normal distribution with mean 0 and variance 1, in a matrix of the given shape (2 x 3)

array([[-0.74318517,  0.05331505,  0.6257689 ],
       [ 1.36346866, -1.252186  , -0.52303754]])

In [25]:
np.random.rand(2, 3) # A 2 x 3 matrix of random values from a uniform distribution over [0, 1)

array([[0.38877393, 0.29499114, 0.61545922],
       [0.49292666, 0.58084515, 0.823579  ]])

In [26]:
np.random.randint(0, 100, (2, 3)) # Random ints in shape 2 x 3, from 0 (included) to 100 (excluded)

array([[91, 22, 43],
       [81, 29, 21]])
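
The ranges of these three generators can be checked empirically by drawing many samples. This is a sketch under assumptions: the seed is fixed only to make the run repeatable, and the sample count and names (`normal`, `uniform`, `ints`) are arbitrary.

```python
import numpy as np

np.random.seed(0)   # assumption: fixed seed, only for repeatability
count = 100_000

normal = np.random.randn(count)             # standard normal samples
uniform = np.random.rand(count)             # uniform samples in [0, 1)
ints = np.random.randint(0, 100, count)     # integers from 0 up to, but not including, 100

print("randn mean and std:", normal.mean(), normal.std())
print("rand min and max:  ", uniform.min(), uniform.max())
print("randint min and max:", ints.min(), ints.max())
```

The mean and standard deviation of the `randn` samples should come out close to 0 and 1, and the `randint` maximum never reaches 100.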

In [30]:
np.arange(7, 71, 5) # From 7 to 71(excluded), array of numbers at intervals of 5(stepsize)

array([ 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67])

In [33]:
np.linspace(1, 100, 11) # Between 1 and 100(included), array of 11 linearly spaced numbers

array([  1. ,  10.9,  20.8,  30.7,  40.6,  50.5,  60.4,  70.3,  80.2,
        90.1, 100. ])
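
The two functions differ in what is fixed: `np.arange` fixes the step and excludes the stop value, while `np.linspace` fixes the number of points and includes the stop value by default. A short sketch with illustrative names:

```python
import numpy as np

stepped = np.arange(7, 71, 5)        # fixed step of 5, stop value excluded
spaced = np.linspace(1, 100, 11)     # 11 points, stop value included

print(len(stepped), stepped[-1])     # 13 67
print(np.diff(spaced))               # every gap is (100 - 1) / (11 - 1) = 9.9

# endpoint=False drops the stop value, so the gap becomes (100 - 1) / 11 = 9
open_end = np.linspace(1, 100, 11, endpoint=False)
print(open_end)
```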

In [34]:
# NumPy arrays can hold any data type, but only one type per array
np.array([True, False]);
np.array(['Hi', 'ray', 'hi']);

In [35]:
arr = np.array(['1.1', '2.3', '3.4']);

In [40]:
np.array(arr, dtype = 'float') # Since a NumPy array has a single data type, passing dtype creates a converted copy; this is one way to typecast an array

array([1.1, 2.3, 3.4])
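
The same conversion is often written with the array's `astype` method. A minimal sketch, with illustrative names, that also shows two things to watch for: strings that are not numbers raise an error, and converting floats to integers truncates toward zero rather than rounding.

```python
import numpy as np

text = np.array(['1.1', '2.3', '3.4'])
numbers = text.astype(float)             # returns a new float64 array
print(numbers, numbers.dtype)

try:
    np.array(['1.1', 'abc']).astype(float)   # 'abc' cannot be parsed as a number
    converted = True
except ValueError as err:
    converted = False
    print("conversion failed:", err)

truncated = np.array([1.9, -1.9]).astype(int)  # fractional part is dropped
print(truncated)                               # [ 1 -1]
```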

# Indexing