## What is pandas?
### Working with pandas
*Curtis Miller*

This notebook previews the two core pandas objects, `Series` and `DataFrame`.

We load the same data set from disk twice, once with NumPy and once with pandas, and compare the resulting objects.

In [None]:
import numpy as np
import pandas as pd

In [None]:
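# A structured dtype is needed because the CSV mixes numeric and text columns.
# float64 is used so that values such as 5.1 are not visibly rounded, as they would be with float16.
schema = np.dtype([('sepal_length', np.float64),
                   ('sepal_width',  np.float64),
                   ('petal_length', np.float64),
                   ('petal_width',  np.float64),
                   ('species',      '<U16')])    # Unicode string of up to 16 characters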

In [None]:
np_data = np.loadtxt("iris.csv", skiprows=1, dtype=schema, delimiter=',')

In [None]:
np_data

In [None]:
type(np_data)

In [None]:
np_data[:5]    # Slicing operations

In [None]:
np_data[:5]['sepal_length']

In [None]:
np_data[:5][['petal_length', 'species']]
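
For readers without `iris.csv` at hand, the NumPy steps above can be sketched with a small in-memory file. The three rows below are illustrative stand-ins and the two-column schema is a shortened assumption, not the notebook's actual data.

```python
import io
import numpy as np

# Illustrative stand-in for iris.csv (assumed header and made-up subset of columns)
csv_text = """sepal_length,species
5.1,setosa
7.0,versicolor
6.3,virginica
"""

# Structured dtype: one named field per column, each with its own type
mini_schema = np.dtype([('sepal_length', np.float64),
                        ('species',      '<U16')])

# np.loadtxt accepts file-like objects, so StringIO stands in for the file on disk
mini_data = np.loadtxt(io.StringIO(csv_text), skiprows=1,
                       dtype=mini_schema, delimiter=',')

print(mini_data.dtype.names)        # field names come from the schema, not the CSV header
print(mini_data[:2]['species'])     # slice rows first, then pick a field by name
```

Note that the header row is skipped rather than read: the column names have to be repeated by hand in the dtype.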

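pandas offers a more convenient way: `read_csv` takes the column names from the header row and infers each column's type, so no schema has to be written by hand.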

In [None]:
pd_data = pd.read_csv("iris.csv")

In [None]:
pd_data

In [None]:
type(pd_data)

In [None]:
pd_data.head()

In [None]:
pd_data.head().sepal_length

In [None]:
pd_data.head().loc[:, ['petal_length', 'species']]

In [None]:
type(pd_data.sepal_length)
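
The same pattern can be tried without the file. The following is a minimal sketch that assumes the column names used above and a few illustrative rows, read from an in-memory string instead of `iris.csv`.

```python
import io
import pandas as pd

# Illustrative stand-in for iris.csv (assumed header, a few sample rows)
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

sample = pd.read_csv(io.StringIO(csv_text))    # header and column types are inferred

print(sample.dtypes)                            # one dtype per column

col = sample['sepal_length']                    # a single column is a Series
sub = sample[['petal_length', 'species']]       # a list of columns gives a DataFrame

print(type(col).__name__, type(sub).__name__)
```

Bracket access (`sample['sepal_length']`) is used here in place of attribute access (`sample.sepal_length`); both return the same `Series`, but brackets also work for column names that are not valid Python identifiers.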

At its core, though, pandas is built on top of NumPy.

In [None]:
pd_data.to_numpy()    # Same data as pd_data.values; to_numpy() is the spelling the pandas docs recommend

In [None]:
np_pd_data = pd.DataFrame(np_data)    # Converting to a DataFrame
np_pd_data
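
The round trip between the two libraries can be sketched on a tiny structured array. The two records and the shortened schema below are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical two-field schema and made-up records
mini_schema = np.dtype([('sepal_length', np.float64),
                        ('species',      '<U16')])
records = np.array([(5.1, 'setosa'), (7.0, 'versicolor')], dtype=mini_schema)

# NumPy -> pandas: the field names of the structured array become column names
frame = pd.DataFrame(records)
print(list(frame.columns))
print(frame.dtypes)

# pandas -> NumPy: the result is a plain 2-D array, not a structured one,
# so mixed column types are typically folded into a single common dtype
back = frame.to_numpy()
print(back.shape, back.dtype)
```

The conversion is therefore not symmetric: going into pandas keeps per-column types, while going back out to a single array gives them up.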