# 1-1 Create a Pandas Series Object

How do we create a series object in Pandas?

The Pandas Series data structure is a one-dimensional labeled array that behaves like both a dict and an array.

* data can be of any type (integers, floats, strings, Python objects)
* data is homogeneous: all elements of one Series share a single dtype

Syntax to create a Series:

```
s = pandas.Series(data, index=index)
```
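As a minimal sketch of the `data` and `index` arguments, a Series can be built from a list, a NumPy array, or a scalar. The variable names below are illustrative and do not come from the notebook.

```python
import numpy
import pandas

# from a Python list; the index defaults to 0..n-1
from_list = pandas.Series([10, 20, 30])

# from a NumPy array with explicit labels
from_array = pandas.Series(numpy.array([1.5, 2.5]), index=['a', 'b'])

# from a scalar; the value is repeated for every index label
from_scalar = pandas.Series(7, index=['x', 'y', 'z'])

print(from_list.tolist())
print(from_array['b'])
print(from_scalar.tolist())
```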



## Import Pandas Library

In [71]:
import pandas
import numpy

## Example 1 - Create a Series Object

In [72]:
s1 = pandas.Series( [33, 19, 15, 89, 11, -5, 9] )

# the default index is a series of integers
s1

0    33
1    19
2    15
3    89
4    11
5    -5
6     9
dtype: int64

In [3]:
# series type is a pandas series
type(s1)

pandas.core.series.Series

In [6]:
# access series values with the 'values' attribute
s1.values

array([33, 19, 15, 89, 11, -5,  9])

In [7]:
# the values array can be indexed by position
s1.values[0]

33

In [8]:
# type of data WITHIN the series object is a NumPy ndarray
type(s1.values)

numpy.ndarray

In [9]:
# get the index labels with the 'index' attribute
s1.index

Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64')

In [10]:
# any series is a mapping from index to values
s1

0    33
1    19
2    15
3    89
4    11
5    -5
6     9
dtype: int64

## Example 2 - Create a Series Object with an Index

In [13]:
data1 = [33, 19, 15, 89, 11, -5,  9]
index1 = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

s2 = pandas.Series(data=data1, index=index1)

s2

Mon    33
Tue    19
Wed    15
Thu    89
Fri    11
Sat    -5
Sun     9
dtype: int64

In [14]:
# verify the index
s2.index

Index([u'Mon', u'Tue', u'Wed', u'Thu', u'Fri', u'Sat', u'Sun'], dtype='object')

In [21]:
# the series can function much like a data frame
# add 'column' labels to the series and index

s2.name = 'Daily Temperatures'
s2.index.name = 'Weekday'

s2

Weekday
Mon        33
Tue        19
Wed        15
Thu        89
Fri        11
Sat        -5
Sun         9
Name: Daily Temperatures, dtype: int64
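To make the data-frame analogy concrete, here is a small sketch that assumes a shortened copy of the temperature data (the name `temps` is illustrative). `to_frame()` converts a named Series into a one-column DataFrame, and the Series name becomes the column label.

```python
import pandas

temps = pandas.Series([33, 19, 15],
                      index=['Mon', 'Tue', 'Wed'],
                      name='Daily Temperatures')
temps.index.name = 'Weekday'

# to_frame() turns the Series into a one-column DataFrame
frame = temps.to_frame()

print(list(frame.columns))
print(frame.index.name)
```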

## Example 3 - Series Data is Homogeneous


In [26]:
# all the data elements are upcast to a common type that can hold every value

# element [1] (19.3) is a float

data2 = [33, 19.3, 15, 89, 11, -5, 9]

s3 = pandas.Series(data=data2, index=index1)

# verify the series data type - all elements are now floats
s3

Mon    33.0
Tue    19.3
Wed    15.0
Thu    89.0
Fri    11.0
Sat    -5.0
Sun     9.0
dtype: float64
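The same rule applies when the values have no common numeric type. The sketch below (illustrative names, not from the notebook) shows that mixing numbers and strings falls back to the generic `object` dtype.

```python
import pandas

ints = pandas.Series([1, 2, 3])
mixed_numeric = pandas.Series([1, 2.5, 3])      # ints are upcast to float
mixed_object = pandas.Series([1, 'two', 3.0])   # no common numeric type

print(ints.dtype)
print(mixed_numeric.dtype)
print(mixed_object.dtype)
```

An `object` Series still works, but most vectorized numeric operations are slower or unavailable on it, so it is usually worth keeping numeric data in a numeric dtype.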

## Example 4 - Create a Series from a Dict

In [28]:
dict1 = {'Mon':33, 'Tue':19, 'Wed':15, 'Thu':89, 'Fri':11, 'Sat':-5, 'Sun':9}

# don't need to specify an index
s4 = pandas.Series(data = dict1)

# in this (older) pandas version the keys come back sorted alphabetically;
# newer versions keep the dict's insertion order
s4

Fri    11
Mon    33
Sat    -5
Sun     9
Thu    89
Tue    19
Wed    15
dtype: int64
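An index can still be passed together with a dict. In that case, as sketched below with an illustrative three-day dict, the index selects and orders the keys, and any label missing from the dict gets `NaN`.

```python
import pandas

temps = {'Mon': 33, 'Tue': 19, 'Wed': 15}

# 'Thu' is not a key in the dict, so its value is NaN
picked = pandas.Series(temps, index=['Wed', 'Mon', 'Thu'])

print(picked)
```

Because `NaN` is a float, the resulting Series is upcast to `float64`.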

# Discussion

A Series can be thought of as an ordered associative array (k => v).

* order is represented by index offset
* k=>v maps from index/label to a data value
* use both position/offset and key/label indices
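The two kinds of index can be addressed explicitly with `.iloc` (position) and `.loc` (label). This is a minimal sketch on a shortened, illustrative Series.

```python
import pandas

s = pandas.Series([11, 33, -5], index=['Fri', 'Mon', 'Sat'])

by_position = s.iloc[1]    # second element, by offset
by_label = s.loc['Mon']    # the same element, by label

print(by_position, by_label)
```

Using `.iloc` and `.loc` avoids any ambiguity about whether a key such as `1` is meant as an offset or as a label.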

## Similarity to NumPy ndarray

### Vectorized Operations

In [36]:
# operations are vectorized
print(s4)

print(s4*2)

Fri    11
Mon    33
Sat    -5
Sun     9
Thu    89
Tue    19
Wed    15
dtype: int64


Fri     22
Mon     66
Sat    -10
Sun     18
Thu    178
Tue     38
Wed     30
dtype: int64


In [37]:
# still vectorized
numpy.log(s4)

Fri    2.397895
Mon    3.496508
Sat         NaN
Sun    2.197225
Thu    4.488636
Tue    2.944439
Wed    2.708050
dtype: float64
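Arithmetic between two Series is vectorized as well, but values are matched by index label rather than by position. A small sketch with made-up data:

```python
import pandas

a = pandas.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pandas.Series([10, 20], index=['y', 'z'])

# labels are aligned first; 'x' has no partner in b, so the result is NaN
total = a + b

print(total)
```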

### Getting and Setting Values

In [38]:
# access like an array and
# slice series using index labels (both endpoints are included)
s4['Thu':'Wed']

Thu    89
Tue    19
Wed    15
dtype: int64

In [47]:
# slice using position

print( s4 )
print("\n")

# [1:3] means positions 1,2 (up to, but not including 3)
print( s4[1:3] )

Fri    11
Mon    33
Sat    -5
Sun     9
Thu    89
Tue    19
Wed    15
dtype: int64


Mon    33
Sat    -5
dtype: int64
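Label slices and position slices treat the end point differently. The sketch below uses a shortened, illustrative Series to show the same two rows selected both ways.

```python
import pandas

s = pandas.Series([11, 33, -5, 9], index=['Fri', 'Mon', 'Sat', 'Sun'])

by_label = s.loc['Mon':'Sat']   # includes both 'Mon' and 'Sat'
by_position = s.iloc[1:3]       # positions 1 and 2; position 3 is excluded

print(by_label.equals(by_position))
```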


In [48]:
# retrieve value using offset (iloc is explicit about position)
s4.iloc[1]

33

In [50]:
# set value using offset (iloc is explicit about position)
s4.iloc[1] = 199

s4

Fri     11
Mon    199
Sat     -5
Sun      9
Thu     89
Tue     19
Wed     15
dtype: int64

### Passing Series as Argument

In [55]:
# Series is array-like (it wraps an ndarray), thus is a valid arg to most NumPy functions
s4

Fri     11
Mon    199
Sat     -5
Sun      9
Thu     89
Tue     19
Wed     15
dtype: int64

In [52]:
# median
s4.median()

15.0

In [53]:
# maximum
s4.max()

199

In [61]:
# cumulative sum
s4.cumsum()

Fri     11
Mon    210
Sat    205
Sun    214
Thu    303
Tue    322
Wed    337
dtype: int64
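The cells above call Series methods. A Series can also be handed directly to NumPy functions, as in this sketch with illustrative data:

```python
import numpy
import pandas

s = pandas.Series([1, 4, 9], index=['a', 'b', 'c'])

roots = numpy.sqrt(s)   # element-wise function: returns a Series, index preserved
total = numpy.sum(s)    # reduction: returns a scalar

print(roots)
print(total)
```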

### Looping over Collections and Indices

In [70]:
# loop via index offset ( use enumerate() )
for i, v in enumerate(s4):
    print(i, v)

0 11
1 199
2 -5
3 9
4 89
5 19
6 15


In [65]:
# create new list via list comprehension
new_list = [x**2 for x in s4]

new_list

[121, 39601, 25, 81, 7921, 361, 225]
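A list comprehension drops the index labels. If the labels should be kept, a vectorized expression is an alternative. A sketch on illustrative data:

```python
import pandas

s = pandas.Series([11, -5, 9], index=['Fri', 'Sat', 'Sun'])

squares_list = [x**2 for x in s]   # plain Python list, labels are lost
squares_series = s ** 2            # Series, labels are kept

print(squares_list)
print(squares_series)
```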

## Dict-Like Behavior of Series Object

### Access by key

In [66]:
# does a key exist?
'Sun' in s4

True

In [67]:
# get value by key
s4['Tue']

19

In [68]:
# assign value using key
s4['Tue'] = 200

s4

Fri     11
Mon    199
Sat     -5
Sun      9
Thu     89
Tue    200
Wed     15
dtype: int64
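Two more dict-like details are worth a quick sketch (illustrative data): `in` tests the index labels, not the values, and `get()` returns a default instead of raising when a label is missing.

```python
import pandas

s = pandas.Series([11, 33], index=['Fri', 'Mon'])

has_label = 'Fri' in s       # True: 'Fri' is a label
has_value = 11 in s          # False: 11 is a value, not a label
fallback = s.get('Tue', 0)   # 'Tue' is missing, so the default 0 is returned

print(has_label, has_value, fallback)
```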

### Loop over Dict Keys and Values

In [69]:
# loop by keys ( use items(); iteritems() was removed in pandas 2.0 )
for k, v in s4.items():
    print(k, v)

Fri 11
Mon 199
Sat -5
Sun 9
Thu 89
Tue 200
Wed 15
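When the label/value pairs are needed as plain Python objects rather than in a loop, `items()` and `to_dict()` can be used directly. A short sketch on illustrative data:

```python
import pandas

s = pandas.Series([11, 33], index=['Fri', 'Mon'])

pairs = list(s.items())   # list of (label, value) tuples
as_dict = s.to_dict()     # ordinary dict keyed by label

print(pairs)
print(as_dict)
```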


In [20]:
help(pandas.Series)

Help on class Series in module pandas.core.series:

class Series(pandas.core.base.IndexOpsMixin, pandas.core.generic.NDFrame)
 |  One-dimensional ndarray with axis labels (including time series).
 |  
 |  Labels need not be unique but must be any hashable type. The object
 |  supports both integer- and label-based indexing and provides a host of
 |  methods for performing operations involving the index. Statistical
 |  methods from ndarray have been overridden to automatically exclude
 |  missing data (currently represented as NaN)
 |  
 |  Operations between Series (+, -, /, *, **) align values based on their
 |  associated index values-- they need not be the same length. The result
 |  index will be the sorted union of the two indexes.
 |  
 |  Parameters
 |  ----------
 |  data : array-like, dict, or scalar value
 |      Contains data stored in Series
 |  index : array-like or Index (1d)
 |      Values must be unique and hashable, same length as data. Index
 |      object (or other 