# Chapter 3: Data Manipulation with Pandas

Pandas provides the `Series` and `DataFrame` objects for working with compound, heterogeneous data.

In [1]:
import pandas
pandas.__version__

'0.23.0'

In [5]:
# start by introducing Series, DataFrame, and Index objects
import numpy as np
import pandas as pd

## The Pandas Series Object

The Pandas `Series` object is a 1D array of indexed data. It can be created from a list or array:

In [6]:
data = pd.Series([0.25, 0.5, 0.75, 1.0])
data

0    0.25
1    0.50
2    0.75
3    1.00
dtype: float64

A `Series` wraps both a sequence of values and a sequence of indices, which we can access with the `values` and `index` attributes. The `values` are simply a familiar NumPy array:

In [7]:
data.values

array([0.25, 0.5 , 0.75, 1.  ])

The index is an array-like object of type `pd.Index`, which we will discuss in more detail momentarily.

In [8]:
data.index

RangeIndex(start=0, stop=4, step=1)
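As a quick check of the two attributes described above, the following sketch inspects their types. The variable names are illustrative only, and the results assume the default float data and default integer index used in this section.

```python
import numpy as np
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0])

# For this float64 Series, the values are a plain NumPy array
values_are_ndarray = isinstance(data.values, np.ndarray)

# RangeIndex is a subclass of pd.Index, so this check also passes
index_is_pd_index = isinstance(data.index, pd.Index)

# Converting the index to a list shows the labels it stands for
index_labels = data.index.tolist()

print(values_are_ndarray, index_is_pd_index, index_labels)
```

The `RangeIndex` shown in the output above is a compact way of storing the default labels 0 through 3 without materializing them.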

As with a NumPy array, data can be accessed by the associated index using the familiar square-bracket notation:

In [9]:
data[1]

0.5

In [10]:
data[1:3]

1    0.50
2    0.75
dtype: float64

### Series as generalized NumPy array

A `Series` seems very similar to a one-dimensional NumPy array, but the essential difference is the presence of the index. A NumPy array has an *implicitly defined* integer index used to access its values, whereas a `Series` has an *explicitly defined* index associated with the values.
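One way to see the difference between an implicit and an explicit index is to filter both kinds of object. This is a minimal sketch, not part of the original notebook; the names `arr`, `ser`, and the 0.4 threshold are chosen purely for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([0.25, 0.5, 0.75, 1.0])
ser = pd.Series(arr)

# Filtering a NumPy array renumbers the surviving values from position 0
arr_kept = arr[arr > 0.4]

# Filtering a Series keeps each surviving value's original label
ser_kept = ser[ser > 0.4]
kept_labels = ser_kept.index.tolist()

print(arr_kept[0])   # 0.5 now sits at position 0
print(kept_labels)   # [1, 2, 3]
```

The array forgets where its values came from, while the `Series` carries the labels 1, 2, and 3 along with the values.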

The index doesn't need to consist of integers; it can hold values of any desired type. For example, we can use strings as an index:

In [13]:
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=["a", "b", "c", "d"])
data

a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64

In [14]:
# item access works as expected
data["a"]

0.25

In [15]:
# we can even use noncontiguous or nonsequential indices:
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=[2, 5, 3, 7])
data

2    0.25
5    0.50
3    0.75
7    1.00
dtype: float64

In [16]:
data[5]

0.5
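Note that `data[5]` above looks up the *label* 5, not the sixth position (the Series has only four elements). The sketch below makes the distinction explicit using the `.loc` (label-based) and `.iloc` (position-based) indexers, which are standard pandas attributes that this section has not introduced yet; the variable names are illustrative.

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0], index=[2, 5, 3, 7])

# Label-based lookup: the value stored under label 5
by_label = data.loc[5]

# Position-based lookup: the second element, regardless of its label
by_position = data.iloc[1]

# The first element by position carries the label 2
first_by_position = data.iloc[0]

print(by_label, by_position, first_by_position)
```

With an integer index like this one, spelling out `.loc` or `.iloc` avoids any ambiguity about whether a number refers to a label or a position.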

### Series as a specialized dictionary

