## Lesson 2 - Functions, Pandas and Matplotlib

Here we will introduce functions, Pandas, and Matplotlib. Pandas provides DataFrames (tables, much like R data frames) and Series (the columns of a DataFrame), with powerful SQL-like queries. Matplotlib is a plotting package that uses a MATLAB-style syntax.



### Table of Contents

* [Functions](#functions)
* [Pandas](#pandas)
* [Matplotlib](#matplotlib)

<a id="functions"></a>

### Functions

A function is a block of code which only runs when it is called.

You can pass data, known as arguments, into a function; inside the function they are received as parameters.

A function can return data as a result.

In [11]:
def call_me(s):
    print(s)

call_me("Yo Man!")

Yo Man!


In [13]:
# call function multiple times
call_me("David")
call_me("Hippo")
call_me("Emily")

David
Hippo
Emily


#### Return Values

To let a function return a value, use the `return` statement:

In [16]:
def get_ntd_dollar(usd):
    # convert US dollars to New Taiwan dollars at a fixed rate of 32 NTD per USD
    return 32 * usd

print(get_ntd_dollar(100))

3200


In [2]:
def sum(a, b):
    # note: this name shadows Python's built-in sum()
    s = a + b
    return s

sum(1, 2)

3

In [7]:
# once a function is defined, we can reuse it anywhere in the code that follows
sum(1.3, 2.9)

4.2

Additionally, you can define functions that take `*x` and `**y` parameters. These let a function accept any number of positional and/or keyword arguments beyond those named in the declaration: the extra positional arguments are collected into a tuple, and the extra keyword arguments into a dictionary.

Example with `*` (positional arguments):

In [8]:
def sum(*values):
    s = 0
    for v in values:
        s = s + v
    return s

sum(1, 2, 3, 4, 5)

15
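The `**` form works the same way for named arguments: inside the function, the extra keyword arguments arrive as a dictionary. The following is a minimal sketch; the function name `describe_order` and its parameters are made up for illustration.

```python
def describe_order(item, **options):
    # options is a dict holding every keyword argument
    # that is not named in the function signature
    parts = [item]
    for name, value in options.items():
        parts.append(f"{name}={value}")
    return ", ".join(parts)

print(describe_order("tea"))
print(describe_order("tea", size="large", sugar=0))
```

The first call passes no extra keyword arguments, so `options` is an empty dictionary and only the item name is returned.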

<a id="pandas"></a>

### Pandas

#### What is Pandas?

Pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. 

#### Library features

* DataFrame object for data manipulation with integrated indexing
* Tools for reading and writing data between in-memory data structures and different file formats
* Data alignment and integrated handling of missing data
* Reshaping and pivoting of data sets
* Label-based slicing, fancy indexing, and subsetting of large data sets
* Data structure column insertion and deletion
* Group-by engine allowing split-apply-combine operations on data sets
* Data set merging and joining
* Hierarchical axis indexing to work with high-dimensional data in a lower-dimensional data structure
* Time series functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging

The library is highly optimized for performance, with critical code paths written in Cython or C.
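A few of the features listed above can be seen in a small sketch. The table here is invented purely for illustration (the column names `city`, `year`, and `sales` are assumptions, not data from this lesson), and the code assumes pandas is already installed.

```python
import pandas as pd

# A tiny made-up table; None marks a missing value
df = pd.DataFrame({
    "city": ["Taipei", "Taipei", "Tainan", "Tainan"],
    "year": [2019, 2020, 2019, 2020],
    "sales": [10.0, None, 7.0, 9.0],
})

# Handling of missing data: replace the missing sales figure with 0
filled = df.fillna({"sales": 0.0})

# Subsetting: keep only the rows for 2020
recent = filled[filled["year"] == 2020]

# Split-apply-combine: total sales per city
totals = filled.groupby("city")["sales"].sum()

# Column insertion: add a derived column
filled["sales_k"] = filled["sales"] * 1000

print(recent)
print(totals)
```

Each step returns a new DataFrame or Series, so the original `df` still contains the missing value after this code runs.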

#### Install packages

Install pandas and matplotlib with conda if you haven't already. If you're not sure whether they are installed, type `conda list` at a terminal prompt to see the installed packages.

```
conda install pandas
conda install matplotlib
```

#### Import modules

In [17]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

#### Read data from CSV

In [None]:
# to be continued
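A minimal sketch of loading a CSV with `pd.read_csv` might look like the following. The lesson's own data file is not shown here, so the CSV text is invented and read from an in-memory buffer via `io.StringIO`; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Made-up CSV content standing in for a real file
csv_text = """name,age,city
David,31,Taipei
Emily,27,Tainan
"""

# read_csv accepts a file path or any file-like object
people = pd.read_csv(io.StringIO(csv_text))

print(people.shape)          # (number of rows, number of columns)
print(people["age"].mean())  # average of the age column
```

The first line of the CSV is used as the column names by default, and the `age` column is parsed as integers.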