# Plotting and Programming in Python
## Reading Tabular Data into DataFrames
Questions
* How can I read tabular data?

Objectives
* Import the Pandas library.
* Use Pandas to load a simple CSV data set.
* Get some basic information about a Pandas DataFrame.

## Statistics on tabular data with the Pandas library
* Pandas borrows many features from R’s dataframes:
  * A dataframe is a 2-dimensional table whose columns have names and may have different data types.

In [None]:
import pandas

data = pandas.read_csv('../data/gapminder_gdp_oceania.csv')
print(data)

### Use `index_col` to specify that a column’s values should be used as row headings

In [None]:
data = pandas.read_csv('../data/gapminder_gdp_oceania.csv', index_col='country')
print(data)
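To see the effect of `index_col` without depending on the data file, the following minimal sketch reads a small inline CSV. The country names, column names, and values are made up for illustration and do not come from the gapminder files.

```python
import io
import pandas

# Made-up sample data for illustration only.
csv_text = """country,gdp_1952,gdp_1957
Aland,100.0,110.0
Borduria,200.0,220.0
"""

# Without index_col, rows are labelled 0, 1, ... and 'country' is an ordinary column.
default_index = pandas.read_csv(io.StringIO(csv_text))

# With index_col='country', the country names become the row labels.
by_country = pandas.read_csv(io.StringIO(csv_text), index_col='country')

print(list(default_index.columns))
print(list(by_country.index))

# Row labels can now be used for lookups.
print(by_country.loc['Borduria', 'gdp_1957'])
```

Using a meaningful column as the index makes label-based lookups with `.loc` possible, which is usually easier to read than positional access.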

### Use `DataFrame.info` to find out more about a dataframe

In [None]:
data.info()

### The `DataFrame.columns` attribute stores information about the dataframe’s columns

In [None]:
print(data.columns)

### Use `DataFrame.T` to transpose a dataframe

In [None]:
print(data.T)

### Use `DataFrame.describe` to get summary statistics about data

In [None]:
print(data.describe())
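The result of `describe` is itself a dataframe, so individual statistics can be looked up by label. The sketch below uses a small made-up dataframe (the column names and values are assumptions for illustration) rather than the gapminder data.

```python
import pandas

# Made-up values for illustration only.
df = pandas.DataFrame(
    {'gdp_1952': [100.0, 200.0, 300.0], 'gdp_1957': [110.0, 220.0, 330.0]},
    index=['A', 'B', 'C'],
)

stats = df.describe()

# Rows of the summary are labelled by statistic, columns by the original column names.
print(list(stats.index))
print(stats.loc['mean', 'gdp_1952'])
print(stats.loc['max', 'gdp_1957'])
```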

### Exercise - Reading Other Data
Read the data in `gapminder_gdp_americas.csv` (which should be in the same directory as `gapminder_gdp_oceania.csv`) into a variable called `americas` and display its summary statistics.

In [None]:
americas = pandas.read_csv('../data/gapminder_gdp_americas.csv', index_col='country')
print(americas.describe())

### Exercise - Inspecting Data
1. What method call will display the first three rows of this data?
1. What method call will display the last three columns of this data?

Hint: `help(americas.head)` and `help(americas.tail)`

In [None]:
help(americas.head)

In [None]:
americas.head(n=3)

In [None]:
americas.T.tail(n=3).T
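Transposing twice works, but positional indexing with `DataFrame.iloc` is an alternative way to select the last three columns. The sketch below compares the two on a small made-up dataframe; the names `via_transpose` and `via_iloc` are illustrative only.

```python
import pandas

# Made-up data for illustration only.
df = pandas.DataFrame(
    {'a': [1, 2], 'b': [3, 4], 'c': [5, 6], 'd': [7, 8]},
    index=['row1', 'row2'],
)

# Approach used above: transpose, take the last three rows, transpose back.
via_transpose = df.T.tail(n=3).T

# Alternative: select all rows and the last three columns by position.
via_iloc = df.iloc[:, -3:]

print(via_iloc)
```

One reason to prefer `iloc` is that transposing a dataframe with mixed column types converts the values to a common type, whereas `iloc` leaves each column's type untouched.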

### Exercise - Writing Data
Pandas dataframes provide a `to_csv` method for writing their contents to a file. Write the dataframe `americas` to a file called `processed.csv`. You can use `help` to find out how to use `to_csv`.

In [None]:
help(americas.to_csv)

In [None]:
americas.to_csv('processed.csv')
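A quick way to check a written file is to read it back in. The sketch below does this round trip in a temporary directory with a small made-up dataframe, so it does not touch the lesson's data files. The variable names are illustrative.

```python
import os
import tempfile
import pandas

# Made-up data for illustration only; the index is named 'country'.
df = pandas.DataFrame(
    {'gdp_1952': [100.0, 200.0]},
    index=pandas.Index(['A', 'B'], name='country'),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'processed.csv')
    # By default, to_csv writes the index as the first column of the file.
    df.to_csv(path)
    # Reading it back with index_col restores the row labels.
    restored = pandas.read_csv(path, index_col='country')

print(restored)
```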

## About the NumPy library
* Optimized for matrix operations

In [None]:
import numpy
numpy.identity(3)

* Scalar multiplication and per-element multiplication

In [None]:
ones3x3 = numpy.ones([3,3])
print(ones3x3)

In [None]:
print(42 * ones3x3)
print(ones3x3 * ones3x3)

* Matrix multiplication with `numpy.dot` (unlike `*`, which multiplies element by element)

In [None]:
numpy.dot(ones3x3, ones3x3)
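Because every entry of `ones3x3` is 1, the difference between the two kinds of multiplication is easy to miss. The following sketch uses two small made-up matrices where the results differ visibly, and also shows the `@` operator, which performs matrix multiplication on 2D arrays.

```python
import numpy

a = numpy.array([[1, 2], [3, 4]])
b = numpy.array([[0, 1], [1, 0]])

# Element-wise product: each entry of a times the matching entry of b.
elementwise = a * b

# Matrix product: rows of a combined with columns of b.
matrix_product = numpy.dot(a, b)

# The @ operator gives the same result as numpy.dot for 2D arrays.
same_with_operator = a @ b

print(elementwise)
print(matrix_product)
```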

* To convert a dataframe into a 2D NumPy array

In [None]:
# Take all rows, skip the first column, and convert the remaining values to floats
my_array = americas.values[:,1:].astype(float)
print(my_array.shape)
my_array[:3]
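Skipping columns by position assumes you know which ones are non-numeric. As an alternative sketch, `DataFrame.select_dtypes` can pick out the numeric columns by type before converting with `DataFrame.to_numpy`. The dataframe below is made up for illustration.

```python
import pandas

# Made-up data for illustration only: one text column and two numeric columns.
df = pandas.DataFrame(
    {
        'continent': ['X', 'X'],
        'gdp_1952': [100.0, 200.0],
        'gdp_1957': [110.0, 220.0],
    },
    index=['A', 'B'],
)

# Keep only the numeric columns, regardless of where they appear.
numeric = df.select_dtypes(include='number')

# Convert to a 2D NumPy array of floats.
arr = numeric.to_numpy(dtype=float)

print(arr.shape)
print(arr)
```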

### NumPy Linear Algebra Example
Singular Value Decomposition:

In [None]:
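# numpy.linalg.svd returns u, the singular values s, and the conjugate transpose of v (named vh here)
u, s, vh = numpy.linalg.svd(my_array)
u.shape, s.shape, vh.shape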
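One way to check a decomposition is to multiply the factors back together and compare with the original matrix. The sketch below does this for a small made-up matrix, using `full_matrices=False` so that the shapes line up for direct multiplication.

```python
import numpy

# Made-up 3x2 matrix for illustration only.
m = numpy.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# With full_matrices=False, u is 3x2, s has 2 singular values, and vh is 2x2.
u, s, vh = numpy.linalg.svd(m, full_matrices=False)

# Rebuild the matrix: u times diag(s) times vh.
reconstructed = u @ numpy.diag(s) @ vh

print(u.shape, s.shape, vh.shape)
print(numpy.allclose(reconstructed, m))
```

The third return value is already transposed, which is why it can be used in the product without a further `.T`.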