numpy scipy pandas matplotlib scikit-learn

# NumPy: Numerical Python

NumPy provides an efficient way to store and manipulate multi-dimensional dense arrays in Python. The important features of NumPy are:

- It provides an ndarray structure, which allows efficient storage and manipulation of vectors, matrices, and higher-dimensional datasets.
- It provides a readable and efficient syntax for operating on this data, from simple element-wise arithmetic to more complicated linear algebraic operations.


In [1]:
import numpy as np
x = np.arange(1, 10)
x

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [2]:
x ** 2

array([ 1,  4,  9, 16, 25, 36, 49, 64, 81])
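As a rough illustration of what "element-wise" means here, the sketch below compares the array expression with an equivalent list comprehension over a plain Python list. The names `values`, `squares_list`, `squares_array`, and `combined` are illustrative and do not appear in the cells above.

```python
import numpy as np

x = np.arange(1, 10)

# Plain Python: visit each element explicitly (here with a list comprehension)
values = list(range(1, 10))
squares_list = [v ** 2 for v in values]

# NumPy: the same operation is applied to every element of the array at once
squares_array = x ** 2

# Arithmetic between arrays of the same shape is also element-wise
combined = 2 * x + squares_array

print(squares_list)
print(squares_array)
print(combined)
```

Both approaches give the same squares; the array form avoids writing the loop by hand.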

Unlike Python lists, which are one-dimensional (higher dimensions have to be emulated by nesting lists), NumPy arrays can be multi-dimensional. For example, here we reshape our x array into a 3x3 array:


In [3]:
M = x.reshape((3, 3))
M



array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
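To make the effect of the reshape concrete, this short sketch inspects the `ndim` and `shape` attributes before and after, then pulls out a single element, a row, and a column. The names `center`, `first_row`, and `last_col` are illustrative.

```python
import numpy as np

x = np.arange(1, 10)
M = x.reshape((3, 3))

print(x.ndim, x.shape)   # one dimension, nine elements
print(M.ndim, M.shape)   # two dimensions, three rows by three columns

# Indexing takes one index per dimension: [row, column]
center = M[1, 1]         # element in the second row, second column
first_row = M[0]         # the whole first row
last_col = M[:, -1]      # the last column, across all rows

print(center, first_row, last_col)
```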

A two-dimensional array is one representation of a matrix, and NumPy knows how to efficiently do typical matrix operations. For example, you can compute the transpose using .T:

In [4]:
M.T

array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
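Beyond the transpose, matrix products are the other everyday operation. The sketch below multiplies the same matrix by a vector and by its own transpose; the vector `v` is an arbitrary illustrative choice, not something from the cells above.

```python
import numpy as np

M = np.arange(1, 10).reshape((3, 3))
v = np.array([5, 6, 7])   # illustrative vector

# Matrix-vector product; for these inputs M @ v is equivalent to np.dot(M, v)
product = M @ v
print(product)            # 38, 92, 146

# Multiplying a matrix by its transpose always gives a symmetric matrix
gram = M @ M.T
print(gram)
print(np.array_equal(gram, gram.T))
```

Note that `*` between two arrays is element-wise multiplication, so the matrix product needs `@` or `np.dot`.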

# Pandas: Labeled Column-oriented Data

Pandas is a much newer package than NumPy, and is in fact built on top of it. What Pandas provides is a labeled interface to multi-dimensional data, in the form of a DataFrame object that will feel very familiar to users of R and related languages. DataFrames in Pandas look something like this:


In [7]:
import pandas as pd
df = pd.DataFrame({'label': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'value': [1, 2, 3, 4, 5, 6]})
df

  label  value
0     A      1
1     B      2
2     C      3
3     A      4
4     B      5
5     C      6
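Since Pandas is built on top of NumPy, it can help to see where the array sits inside the DataFrame. This is a minimal sketch, assuming a reasonably recent Pandas version that provides `to_numpy()`; the name `values` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'label': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'value': [1, 2, 3, 4, 5, 6]})

# Each column is a labeled Series; to_numpy() exposes its data as an array
values = df['value'].to_numpy()
print(type(values))        # a numpy.ndarray
print(values ** 2)         # NumPy arithmetic works on it directly

# The row labels (the index) and the column names are kept alongside the data
print(list(df.index))
print(list(df.columns))
```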


Columns can be selected by name:

In [9]:
df['label']

0    A
1    B
2    C
3    A
4    B
5    C
Name: label, dtype: object
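Selection by name extends naturally to filtering rows and transforming text columns. The sketch below is one hedged way to do both on the same DataFrame; the names `only_a` and `lower` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'label': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'value': [1, 2, 3, 4, 5, 6]})

# A boolean mask keeps only the rows where the condition is True
only_a = df[df['label'] == 'A']
print(only_a)

# String methods under .str are applied to every entry of a text column
lower = df['label'].str.lower()
print(lower)
```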

Aggregates can be applied across numerical entries:

In [11]:
df['value'].sum()

21

Rows can also be grouped by a column before aggregating, much like a database-style GROUP BY:

In [14]:
df.groupby('label').sum()

       value
label
A          5
B          7
C          9
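To show what the grouping is doing, the sketch below computes several aggregates per label with `agg` and then reproduces the sums with a plain loop. The names `summary` and `manual` are illustrative, and the loop is only a teaching aid, not how Pandas implements grouping.

```python
import pandas as pd

df = pd.DataFrame({'label': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'value': [1, 2, 3, 4, 5, 6]})

# Split the rows by label, then apply several aggregates to each group
summary = df.groupby('label')['value'].agg(['sum', 'mean', 'count'])
print(summary)

# The same sums accumulated by hand, one row at a time
manual = {}
for label, value in zip(df['label'], df['value']):
    manual[label] = manual.get(label, 0) + value
print(manual)
```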


Bookmark: Matplotlib to be started: http://nbviewer.jupyter.org/github/jakevdp/WhirlwindTourOfPython/blob/master/15-Preview-of-Data-Science-Tools.ipynb
