# Pandas Series

- The Python data science stack: matplotlib, numpy, **pandas**, seaborn, sklearn, scipy, ...


## Some comparisons

______


- Numpy: Arrays
- Pandas: Series and Dataframes

___________ 

- Series: like a numpy array, but with some additional functionality. 


- Series: Imagine a single column of a table.  

- Dataframes: Imagine the entire table. 

______

How is a series different from a list? 

- A series has an index, which can be thought of as a row name (often just a row number) and gives a way to reference items. The index is stored alongside other meta-information (information about the series).

- The elements share a specific data type. The data type is inferred from the values, but can be specified manually.
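
The two differences above can be sketched in a few lines. This is a minimal illustration, not part of the original notebook: the row labels `"a"`, `"b"`, `"c"` and the variable name `labeled` are hypothetical, chosen only to show a non-default index and a manually specified data type.

```python
import pandas as pd

# a plain Python list: items are referenced by position only
my_list = [2, 3, 5]

# a series built from the same values, with hypothetical row labels
# and a manually specified data type instead of the inferred int64
labeled = pd.Series(my_list, index=["a", "b", "c"], dtype="float64")

print(labeled.index.tolist())  # ['a', 'b', 'c']
print(labeled.dtype)           # float64
print(labeled["b"])            # 3.0
```

The index lets us look up an item by its label (`labeled["b"]`) rather than by its position.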

_____ 



## Import Pandas

`import pandas as pd`

In [14]:
import pandas as pd
import numpy as np
from pydataset import data

## Create a Series

1. from a list
2. from a numpy array
3. from a dataframe

**From a List**

In [7]:
my_list = [2, 3, 5]

# create series from list
my_series = pd.Series(my_list)

# what kind of python object is it? 
print(type(my_series))

<class 'pandas.core.series.Series'>


In [8]:
# what's inside the series? 

my_series

0    2
1    3
2    5
dtype: int64

- 3 rows, with the row indices (or row names) as [0, 1, 2]
- the values are [2, 3, 5]
- the datatype is int64, a 64-bit integer, so it can store very large whole numbers


**From an array**

In [11]:
my_array = np.array([8.0, 13.0, 21.0])

# create series from array
my_series = pd.Series(my_array)

type(my_series)

pandas.core.series.Series

In [12]:
my_series

0     8.0
1    13.0
2    21.0
dtype: float64

- 3 rows, with the row indices as [0, 1, 2]
- the values are [8.0, 13.0, 21.0]
- the datatype is float64

**From a dataframe**

In [19]:
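my_df = data('titanic')

# attribute access returns a series
my_series = my_df.survived
print(type(my_series))

# single brackets with a column name also return a series
my_series = my_df['survived']
print(type(my_series))

# double brackets (a list of column names) return a dataframe
my_not_quite_series = my_df[['survived']]
print(type(my_not_quite_series))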


<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.frame.DataFrame'>


In [21]:
my_series.head()

1    yes
2    yes
3    yes
4    yes
5    yes
Name: survived, dtype: object

In [22]:
my_not_quite_series.head()

  survived
1      yes
2      yes
3      yes
4      yes
5      yes


## Pandas data types

Data types you will see in series and dataframes: 

- int: integer, whole number values  
- float: decimal numbers  
- bool: true or false values  
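- object: strings (also used for mixed Python objects)  
- category: a fixed set of string values  

A series can also have a name: an optional, human-friendly label for the series. The name is an attribute of the series, not a data type.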


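A series gets its data type in one of two ways:

1. inferring: pandas picks the data type from the values
2. using `astype()`: we convert to a data type we choose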

In [25]:
# inferring
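
The cell above is empty in the source, so the following is only a sketch of what inference might look like. The variable names and sample values are made up for illustration.

```python
import pandas as pd

# pandas picks a data type from the values it is given
ints = pd.Series([2, 3, 5])
floats = pd.Series([2, 3.5, 5])         # one decimal value makes every element a float
bools = pd.Series([True, False, True])
strings = pd.Series(["a", "b", "c"])

print(ints.dtype)     # int64
print(floats.dtype)   # float64
print(bools.dtype)    # bool
print(strings.dtype)  # object in the pandas versions this notebook appears to use;
                      # newer releases may report a dedicated string type instead
```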

In [24]:
# `astype()`
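
This cell is also empty in the source. As a hedged sketch, `astype()` returns a new series converted to the requested data type and leaves the original unchanged. The variable names and the `"yes"`/`"no"` sample values are illustrative assumptions.

```python
import pandas as pd

my_series = pd.Series([2, 3, 5])

# convert whole numbers to decimal numbers
as_float = my_series.astype("float64")

# convert numbers to strings
as_text = my_series.astype(str)

# convert repeated strings to a category
as_category = pd.Series(["yes", "no", "yes"]).astype("category")

print(as_float.tolist())                    # [2.0, 3.0, 5.0]
print(as_text.tolist())                     # ['2', '3', '5']
print(as_category.cat.categories.tolist())  # ['no', 'yes']
print(my_series.dtype)                      # int64, the original is unchanged
```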

## Vectorized Operations

- Like numpy arrays, pandas series support vectorized operations: an operator applied to a series is applied to every element. For example, the basic arithmetic operators manipulate every element in the series at once, with no explicit loop.

1. arithmetic operations
2. comparison operations
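
Both kinds of operation can be sketched on the small series used earlier. The result variable names below are hypothetical, and the comments show the values each result holds.

```python
import pandas as pd

my_series = pd.Series([2, 3, 5])

# arithmetic operations apply to every element and return a new series
plus_one = my_series + 1    # 3, 4, 6
doubled = my_series * 2     # 4, 6, 10
halved = my_series / 2      # 1.0, 1.5, 2.5

# comparison operations return a series of bools, one per element
is_big = my_series > 2      # False, True, True
is_three = my_series == 3   # False, True, False

print(plus_one.tolist())
print(is_big.tolist())
```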

## Series Methods

- `.any()`
- `.all()`
- `.head()`
- `.tail()`
- `.value_counts()`
- `.isin()`
- Descriptive stats
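
A short sketch of the methods listed above, using a made-up series `s`. For the descriptive stats, `mean`, `median`, `min`, `max`, and `describe` are shown as likely examples; the source does not say which ones it has in mind.

```python
import pandas as pd

s = pd.Series([2, 3, 5, 3, 8])

# .any() and .all() are usually called on a series of bools
any_over_7 = bool((s > 7).any())   # True: at least one element is over 7
all_over_1 = bool((s > 1).all())   # True: every element is over 1

# .head() and .tail() return the first or last n rows (5 by default)
first_two = s.head(2)              # 2, 3
last_two = s.tail(2)               # 3, 8

# .value_counts() counts how often each distinct value appears
counts = s.value_counts()          # the value 3 appears twice

# .isin() checks each element for membership in a collection
in_set = s.isin([3, 8])            # False, True, False, True, True

# descriptive stats
average = s.mean()
middle = s.median()                # 3.0
smallest, largest = s.min(), s.max()  # 2 and 8
summary = s.describe()             # count, mean, std, min, quartiles, max

print(counts)
print(summary)
```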