# Introduction to NumPy

In [Tutorial 1](https://github.com/krittikaiitb/tutorials/tree/master/Tutorial_1), you saw lists, tuples and dictionaries (and we hinted at sets). However, these are not particularly useful when you want to do numerical calculations or matrix operations. You could write a `for` loop that iterates through each element, but Python is very slow at operations like that.

A better (more Pythonic) way is to use the package known as `numpy`. Numpy arrays are similar to lists and tuples in that they hold multiple values. However, all elements in a numpy array must be of the same data type.

Disregarding that small setback, we now have a lot more functionality in numpy arrays. It is convenient to think of a one-dimensional numpy array as behaving like a row vector.
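
To make the same-data-type rule concrete, here is a minimal sketch (the variable names are only illustrative). When a list mixes integers and a float, NumPy converts every element to a common type rather than keeping them separate:

```python
import numpy as np

# Mixing integers and a float: every element is stored as a float.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)   # float64

# An all-integer list keeps an integer dtype
# (the exact width, e.g. int64, depends on the platform).
ints = np.array([1, 2, 3])
print(ints.dtype)
```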

In [1]:
import numpy as np # This is how you can import packages in python. 

The above line imports the package numpy (which includes all the functions we will be using). The part `as np` renames it to `np` (just for this program, don't worry), which is slightly more convenient. 

Let us assign a numpy array to a variable x. 

In [2]:
x=np.array([1,2,3,4])
#x=[1,2,3,4]
y=2*x
print("{}\n{}".format(x,y))

[1 2 3 4]
[2 4 6 8]


Try the above code with a normal list. Now, hopefully it is clear why `numpy` is powerful. But that is just the beginning!
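
For comparison, this short sketch shows what the list version does. Multiplying a list by 2 repeats the list, so doubling each element needs an explicit loop or comprehension, while the array handles it in one expression:

```python
import numpy as np

values = [1, 2, 3, 4]

# Multiplying a list by an integer repeats the list.
print(2 * values)              # [1, 2, 3, 4, 1, 2, 3, 4]

# Doubling each element of a list needs a loop or comprehension.
doubled_list = [2 * v for v in values]
print(doubled_list)            # [2, 4, 6, 8]

# With an array, the arithmetic applies to every element at once.
doubled_array = 2 * np.array(values)
print(doubled_array)           # [2 4 6 8]
```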

In [3]:
x = np.arange(0,100, 1) # This is similar to the range function from earlier, except that it returns a numpy array
print(x[[4,5,6,7,8,9,10]])

[ 4  5  6  7  8  9 10]


Trying a similar trick with a list instead:

In [4]:
x = [1,2,3,4,5,6,7,8,9,10]
print(x[[0,1,2]])

TypeError: list indices must be integers or slices, not list

So `numpy` arrays can be indexed with a list of positions, not only with integers or slices, whereas plain lists cannot.
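
A small sketch of what list indexing allows: the positions may come in any order and may repeat. For a plain list, a comprehension is one way to get the equivalent lookup (the names `items` and `picked` are only illustrative):

```python
import numpy as np

x = np.arange(0, 100, 1)

# Index positions may come in any order and may repeat.
print(x[[10, 4, 4]])           # [10  4  4]

# A slice covers the same consecutive range as the list of indices 4..10.
print(x[4:11])                 # [ 4  5  6  7  8  9 10]

# For a plain list, a comprehension does the equivalent lookup.
items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
picked = [items[i] for i in [0, 1, 2]]
print(picked)                  # [1, 2, 3]
```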

Similar to the `len` function, numpy also has several functions to describe an array. The same information is also available as attributes of the array, which can be used as follows:

In [5]:
x=np.array([[1,2,3],[4,5,6],[7,8,9]])
print(x)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [6]:
print(x.ndim) # number of dimensions. This is a 2D array
print(x.shape) # Shape of array
print(x.size) # Total number of elements in array (product of elements in shape)

2
(3, 3)
9
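
The function forms give the same answers as the attributes. A quick sketch, which also shows that the built-in `len` only counts along the first axis:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(np.ndim(x))    # 2
print(np.shape(x))   # (3, 3)
print(np.size(x))    # 9

# len() counts only along the first axis, i.e. the number of rows here.
print(len(x))        # 3
```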


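Multiplication, like addition, is element-wise. If matrix multiplication is required, we can either convert the array into a numpy matrix or use `np.dot()`. (The NumPy documentation no longer recommends the `np.matrix` class for new code, so `np.dot()` is the safer habit.)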

In [7]:
x*x

array([[ 1,  4,  9],
       [16, 25, 36],
       [49, 64, 81]])

In [8]:
np.matrix(x)*np.matrix(x)

matrix([[ 30,  36,  42],
        [ 66,  81,  96],
        [102, 126, 150]])

In [9]:
np.dot(x,x)

array([[ 30,  36,  42],
       [ 66,  81,  96],
       [102, 126, 150]])
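
Another option for 2D arrays is the `@` operator, which performs matrix multiplication directly on ordinary arrays. A minimal sketch:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Matrix product; for 2D arrays this matches np.dot(x, x).
product = x @ x
print(product)
print(np.array_equal(product, np.dot(x, x)))   # True
```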

Comparisons on numpy arrays are also element-wise, and return an array of booleans:

In [10]:
x>5

array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]])
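
Boolean arrays like this can be summarised or combined. The sketch below counts matches and joins two conditions with `&` (the parentheses around each condition are required):

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
mask = x > 5

print(mask.sum())    # 4  (True counts as 1)
print(mask.any())    # True
print(mask.all())    # False

# Combine conditions element-wise with & (and) or | (or).
in_range = (x > 2) & (x < 7)
print(in_range.sum())   # 4  (the values 3, 4, 5, 6)
```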

We can also use conditional expressions as indices:

In [11]:
y=np.copy(x) # Simply typing y=x does not make a new array; it just adds a reference to the old one.
             # Changes in y would then be reflected in x.
y[y>5]=0
y

array([[1, 2, 3],
       [4, 5, 0],
       [0, 0, 0]])
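
The difference between a reference and a copy can be checked directly. A minimal sketch, with illustrative names `alias` and `independent`:

```python
import numpy as np

x = np.array([1, 2, 3])

alias = x                    # no new array: both names point at the same data
alias[0] = 100
print(x)                     # [100   2   3]

independent = np.copy(x)     # a separate array with its own data
independent[1] = -1
print(x)                     # [100   2   3]  (unchanged by the line above)

print(np.shares_memory(x, alias))         # True
print(np.shares_memory(x, independent))   # False
```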

If we just want the indices where a certain condition is satisfied, we can use `np.where()`:

In [12]:
x=np.array([0,1,2,3,4,5,6,7,0.5,0.6])
indices=np.where(x<5)
x[indices]

array([0. , 1. , 2. , 3. , 4. , 0.5, 0.6])
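
Two details of `np.where` are worth seeing in a sketch. With one argument it returns a tuple of index arrays (one per dimension), and with three arguments it picks values element by element:

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 0.5, 0.6])

indices = np.where(x < 5)
print(type(indices))   # <class 'tuple'>
print(indices[0])      # [0 1 2 3 4 8 9]

# Three-argument form: keep x where the condition holds, otherwise use 5.
clipped = np.where(x < 5, x, 5)
print(clipped)
```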

We can also delete elements using `numpy.delete`, which returns a new array and leaves the original untouched:

In [13]:
x=np.array([[1,2,3],[4,5,6],[7,8,9]])
index=np.where(x[0,:]>2)[0] # columns whose first-row entry exceeds 2, to match axis=1 below
y=np.delete(x,index,axis=1)
print(y)

[[1 2]
 [4 5]
 [7 8]]
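
The `axis` argument decides whether rows or columns are removed, so the indices passed in should refer to the same axis. A sketch of the row version, assuming we want to drop every row whose first entry exceeds 5:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Row positions whose first entry exceeds 5; [0] unpacks the tuple from np.where.
rows = np.where(x[:, 0] > 5)[0]

# axis=0 removes rows; np.delete returns a new array.
without_rows = np.delete(x, rows, axis=0)
print(without_rows)
print(x.shape)       # (3, 3): the original array is unchanged
```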


Other numpy functions such as `numpy.arange` and `numpy.linspace` often come in handy. We have already seen `arange`:

In [14]:
np.arange(0,1,0.1) # start(inclusive):stop(exclusive):step

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

In [15]:
np.linspace(0,1,5) #start:stop(inclusive):number of elements

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
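
The practical difference is that `arange` is driven by a step size while `linspace` is driven by a number of points. A short sketch, which also uses the `endpoint` parameter of `linspace` to leave out the stop value:

```python
import numpy as np

# arange: fixed step, stop value excluded.
by_step = np.arange(0, 1, 0.1)
print(len(by_step))          # 10

# linspace: fixed number of points, stop value included by default.
by_count = np.linspace(0, 1, 5)
print(by_count[-1])          # 1.0

# endpoint=False excludes the stop value, like arange does.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)
```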

And that is not all. Several mathematical functions are also available:

In [16]:
x = np.arange(-np.pi, np.pi, 0.1)
y = np.sin(x)

In [17]:
y2 = np.exp(x)
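
These functions act on every element, so the result has the same shape as the input. One way to sanity-check them, sketched here, is to test a known identity with `np.allclose`, which compares floating-point arrays within a small tolerance:

```python
import numpy as np

x = np.arange(-np.pi, np.pi, 0.1)

# sin^2 + cos^2 should equal 1 at every point.
identity = np.sin(x) ** 2 + np.cos(x) ** 2
print(identity.shape == x.shape)              # True
print(np.allclose(identity, 1.0))             # True

# log undoes exp, up to floating-point rounding.
print(np.allclose(np.log(np.exp(x)), x))      # True
```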

If you thought `numpy` has revealed all its cards, it hasn't yet! <br>
You can also find the mean, standard deviation, maximum, minimum of an array, sort an array, do matrix computations and so much more. 
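
A minimal sketch of a few of these, using a small made-up array:

```python
import numpy as np

data = np.array([3, 1, 4, 1, 5, 9, 2, 6])

print(data.mean())             # 3.875
print(data.std())              # standard deviation (population form by default)
print(data.max(), data.min())  # 9 1

# np.sort returns a sorted copy; data itself keeps its order.
print(np.sort(data))           # [1 1 2 3 4 5 6 9]
```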

If you feel right now that NumPy is vast, remember we have barely scratched the surface. This module and the next are where your Googling skills will be put to the test.

If you want to see the official docstring for any function, an easy way to get it in Jupyter is to append `?` to its name. For example:

In [18]:
# Note that this is a very useful function, since we can use it to filter out data. 
np.where?
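
The `?` syntax only works in Jupyter and IPython. In a plain Python script, the same text is stored in the function's `__doc__` attribute, and the built-in `help()` prints it. A small sketch:

```python
import numpy as np

# The docstring is an ordinary string attached to the function.
doc = np.where.__doc__
print(doc.strip().splitlines()[0])   # first line of the docstring

# help(np.where) would print the full docstring.
```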

### Your assignment ...
... should you choose to accept it, will be the following:
1. Use the data from _<data_file> to find the distance to the Beehive Cluster.
(Magnitude Systems)
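
As a possible starting point, and assuming the assignment is meant to use the distance modulus $m - M = 5\log_{10}(d / 10\,\mathrm{pc})$, the sketch below turns apparent and absolute magnitudes into a distance. The function name `distance_parsec` and the magnitudes are made up for illustration; they are not taken from the data file, and effects such as extinction are ignored.

```python
import numpy as np

def distance_parsec(apparent_mag, absolute_mag):
    """Distance in parsecs from m - M = 5 * log10(d / 10 pc)."""
    modulus = np.asarray(apparent_mag) - np.asarray(absolute_mag)
    return 10.0 ** (modulus / 5.0 + 1.0)

# Made-up magnitudes purely for illustration, not values from the data file.
example_m = np.array([10.0, 7.5])
example_M = np.array([5.0, 2.5])

# A modulus of 5 corresponds to 100 pc for both entries.
distances = distance_parsec(example_m, example_M)
print(distances)
```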