# Exploring Data Science tools in Python
> NumPy, SciPy, Matplotlib, and Pandas are a few of the powerful libraries that make your life easier.
- toc: true
- branch: master
- badges: true
- author: Mayukh Datta
- comments: true
- image: https://raw.githubusercontent.com/thecoducer/blog/master/images/social-media-data-science-memes.png
- categories: [tools, libraries]
- hide: false

![](https://raw.githubusercontent.com/thecoducer/blog/master/images/social-media-data-science-memes.png)

# NumPy

NumPy stands for Numerical Python. It is a library for Python that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on these arrays.

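## Why is NumPy faster than lists?

- Every element of a NumPy array has the same fixed data type, while a list can hold objects of any type.
- NumPy stores its elements in one contiguous block of memory.
- No type checking is needed for each element when reading values.
- Fewer bytes have to be read per element, since a list stores a pointer to a full Python object for every item.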
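
One rough way to see the difference is to time the same sum on a list and on an array. This is only a sketch: the size `n` and the repeat count are arbitrary choices of mine, and the timings depend on your machine, so no numbers are shown here.

```python
import timeit

import numpy as np

n = 100_000                               # arbitrary size, chosen for illustration
py_list = list(range(n))
np_array = np.arange(n, dtype="int64")

# Both approaches compute the same total
list_total = sum(py_list)
array_total = int(np_array.sum())
print(list_total == array_total)

# Time each approach; the results vary from machine to machine
list_time = timeit.timeit(lambda: sum(py_list), number=20)
array_time = timeit.timeit(lambda: np_array.sum(), number=20)
print(f"list: {list_time:.4f}s, array: {array_time:.4f}s")
```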

## Applications of NumPy:

- Mathematics (MATLAB replacement)
- Plotting (using Matplotlib)
- Backend (Pandas, Connect 4, Digital photography)
- Machine learning (NumPy helps to give a foundation to tensors)

In [68]:
import numpy as np

## One-dimensional array

In [69]:
one = np.array([45, 58, 96], dtype="int16")
print(one)

[45 58 96]


- `dtype` explicitly sets the data type for all elements in a NumPy array. Read more: https://www.geeksforgeeks.org/data-type-object-dtype-numpy-python/
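
As a small sketch of what `dtype` controls, the snippet below converts the same values to another type with `astype` and checks the largest value an `int16` can hold. The variable names `small` and `as_float` are just for illustration.

```python
import numpy as np

small = np.array([45, 58, 96], dtype="int16")
as_float = small.astype("float32")    # astype returns a converted copy

print(small.dtype, small.itemsize)        # int16 2
print(as_float.dtype, as_float.itemsize)  # float32 4
print(np.iinfo(np.int16).max)             # 32767, the largest int16 value
```

Picking a smaller `dtype` saves memory, but the values have to fit in its range.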

## Two-dimensional array

In [70]:
two = np.array([[4, 5, 8], [2, 7, 6]]) # a list of lists becomes a 2D array
print(two)

[[4 5 8]
 [2 7 6]]


## Three-dimensional array

In [71]:
three = np.array([[[1, 2], [3, 4], [1, 9], [6, 0]], 
                  [[5, 6], [3, 4], [7, 8], [7, 4]],
                  [[5, 2], [9, 8], [6, 0], [5, 6]]])
print(three)

[[[1 2]
  [3 4]
  [1 9]
  [6 0]]

 [[5 6]
  [3 4]
  [7 8]
  [7 4]]

 [[5 2]
  [9 8]
  [6 0]
  [5 6]]]


## Dimension?

In [72]:
print(one.ndim)
print(two.ndim)
print(three.ndim)

1
2
3


## Shape?

In [73]:
print(one.shape)
print(two.shape) # (rows, columns)
print(three.shape) # (blocks, rows, columns): 3 blocks of 4 rows and 2 columns

(3,)
(2, 3)
(3, 4, 2)
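
To read a shape tuple, it helps to match each entry with one level of nesting in the original list. A minimal sketch, reusing the `three` array from above (the names `blocks`, `rows` and `cols` are my own labels):

```python
import numpy as np

three = np.array([[[1, 2], [3, 4], [1, 9], [6, 0]],
                  [[5, 6], [3, 4], [7, 8], [7, 4]],
                  [[5, 2], [9, 8], [6, 0], [5, 6]]])

blocks, rows, cols = three.shape
print(blocks, rows, cols)                            # 3 4 2

# Each entry of the shape is the length of one nesting level
print(len(three), len(three[0]), len(three[0][0]))   # 3 4 2

# Multiplying the entries gives the total number of elements
print(blocks * rows * cols == three.size)            # True
```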


## Datatype?

In [74]:
print(one.dtype)
print(three.dtype)

int16
int64


## Bytes of memory each dtype consumes

In [88]:
print(one.itemsize)
print(three.itemsize)
# 2 bytes for int16
# 4 bytes for int32
# 8 bytes for int64 and float64

2
8
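
You can also ask a dtype for its size directly, without building an array first. A quick sketch using `np.dtype`; the list of type names is just a sample I picked:

```python
import numpy as np

# Map each dtype name to the number of bytes one element takes
sizes = {name: np.dtype(name).itemsize
         for name in ["int8", "int16", "int32", "int64", "float32", "float64"]}

for name, nbytes in sizes.items():
    print(name, nbytes)
```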


## Size?

In [89]:
# Get the total number of elements in an np array
print(one.size)
print(two.size)
print(three.size)

3
6
24
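
Note that `size` is not the same thing as Python's `len()`. A short sketch of the difference, reusing the `two` array from above:

```python
import numpy as np

two = np.array([[4, 5, 8], [2, 7, 6]])

print(len(two))    # 2, the length of the first axis (number of rows)
print(two.size)    # 6, the total number of elements
```

For a one-dimensional array the two values are equal.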


## Total bytes consumed by an array

In [94]:
print(three.size, "*", three.itemsize, "=", three.size * three.itemsize)

# another way to do this
print(three.nbytes)

24 * 8 = 192
192
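
Since the total is just the element count times the bytes per element, a smaller dtype shrinks the array. A sketch, assuming the values fit in the smaller type (they do here); `dtype="int64"` is set explicitly so the result does not depend on the platform default:

```python
import numpy as np

three = np.array([[[1, 2], [3, 4], [1, 9], [6, 0]],
                  [[5, 6], [3, 4], [7, 8], [7, 4]],
                  [[5, 2], [9, 8], [6, 0], [5, 6]]], dtype="int64")
three_small = three.astype("int16")   # same values, 2 bytes per element

print(three.nbytes)         # 192 (24 elements * 8 bytes)
print(three_small.nbytes)   # 48  (24 elements * 2 bytes)
```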


## Accessing/Changing specific elements, rows, columns, etc.

In [96]:
d = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14]])
print(d)

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14]]


In [79]:
# Get a specific element - [r, c]
d[0, 3]

4

In [80]:
# Get a specific row
print(d[0, :]) # first row
print(d[0, 1:6])

[1 2 3 4 5 6 7]
[2 3 4 5 6]


In [81]:
# Get a specific column
print(d[:, 2]) # third column

[ 3 10]


In [82]:
# Getting a little more fancy: [startindex:endindex:stepsize]
print(d[0, 1:-1:2])

[2 4 6]
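
A few more slicing patterns built from the same `[startindex:endindex:stepsize]` form, shown here as a sketch on a fresh copy of `d`:

```python
import numpy as np

d = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14]])

print(d[0, ::-1])   # first row reversed: [7 6 5 4 3 2 1]
print(d[:, ::3])    # every third column of both rows
print(d[1, -3:])    # last three elements of the second row: [12 13 14]
```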


In [83]:
# Replace elements
d[1, 4] = 55
print(d)

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 55 13 14]]


In [84]:
d[:, 2] = [117, 17]
print(d)

[[  1   2 117   4   5   6   7]
 [  8   9  17  11  55  13  14]]
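
Two things are worth knowing when assigning to slices: a single value is spread across the whole slice, and a slice is a view of the original array, not a copy. A minimal sketch on a fresh `d` (the names `view` and `safe` are just for illustration):

```python
import numpy as np

d = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14]])

d[0, :] = 0                 # one value fills the whole first row
print(d[0])                 # [0 0 0 0 0 0 0]

view = d[1, :3]             # a slice is a view into d
view[0] = -1                # so this also changes d
print(d[1, 0])              # -1

safe = d[1, :3].copy()      # copy() gives independent data
safe[1] = 100
print(d[1, 1])              # still 9
```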


In [85]:
# Just another array, this time a 3D one
f = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(f)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [86]:
print(f[0, 1, 0]) # accessing an element is easy if we follow the path from the outermost level inwards

3


In [87]:
print(f[:, ])

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
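
The same colon syntax works on every axis of a 3D array. A sketch of a few slices of `f`, with the results written out in the comments:

```python
import numpy as np

f = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(f[1])          # second block: [[5 6], [7 8]]
print(f[:, 1, :])    # second row of every block: [[3 4], [7 8]]
print(f[:, :, 0])    # first column of every block: [[1 3], [5 7]]
```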


## Transpose 

How do you transpose a row vector? Will it change into a column vector? Read here: https://stackoverflow.com/a/5954747/7936081
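
In short, `.T` does nothing to a one-dimensional array, because there is only one axis to swap. The sketch below shows that, along with two common ways to get a column; the variable names are my own.

```python
import numpy as np

row = np.array([1, 2, 3])          # 1-D array, shape (3,)
print(row.T.shape)                 # (3,)  .T has no effect on a 1-D array

col = row[:, np.newaxis]           # add a second axis to get a column
print(col.shape)                   # (3, 1)

row_2d = np.array([[1, 2, 3]])     # 2-D row vector, shape (1, 3)
print(row_2d.T.shape)              # (3, 1)

two = np.array([[4, 5, 8], [2, 7, 6]])
print(two.T)                       # rows and columns swapped, shape (3, 2)
```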