# Series

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

### Create a `Series` object from a Python list
Each value in the `Series` is assigned an index label. By default, the index is numeric and starts at 0.

In [2]:
ice_cream = ['Strawberry', 'Vanilla', 'Chocolate', 'Rum Raisin']
pd.Series(ice_cream)

0    Strawberry
1       Vanilla
2     Chocolate
3    Rum Raisin
dtype: object

In [3]:
lottery = [10, 20, 12, 23, 43]
pd.Series(lottery)

0    10
1    20
2    12
3    23
4    43
dtype: int64

### Create a `Series` object from a Python dictionary
The keys of the dictionary become index names for the Series object.
The values of the dictionary become values for the Series object.

In [4]:
webster = {'Banana':'A Fruit', 'Aardvark' : 'An Animal', 'Cyan':'A Color'}
pd.Series(webster)

Banana        A Fruit
Aardvark    An Animal
Cyan          A Color
dtype: object
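As a minimal sketch of what the key-to-index mapping allows, the example below looks up a value by its label. The `stock` dictionary and its contents are hypothetical and are not part of the notebook's data.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
stock = {'apples': 3, 'pears': 7, 'kiwis': 0}
inventory = pd.Series(stock)

# Dictionary keys became index labels, so a value can be fetched by label.
pears = inventory['pears']

# The index keeps the insertion order of the dictionary.
labels = list(inventory.index)

print(pears)
print(labels)
```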

### Intro to `Series` Attributes

In [5]:
about_me = ['smart','handsome','charming','brilliant','humble']
s = pd.Series(about_me)
s.values

array(['smart', 'handsome', 'charming', 'brilliant', 'humble'],
      dtype=object)

In [6]:
s.index

RangeIndex(start=0, stop=5, step=1)
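The short sketch below checks the types behind these two attributes: `.values` is a NumPy array and the default index is a `RangeIndex`. The two-element list is an arbitrary stand-in for `about_me`.

```python
import numpy as np
import pandas as pd

s = pd.Series(['smart', 'humble'])

# .values exposes the underlying data as a NumPy array.
values_are_ndarray = isinstance(s.values, np.ndarray)

# With no explicit index, pandas builds a RangeIndex starting at 0.
index_is_range = isinstance(s.index, pd.RangeIndex)
positions = list(s.index)

print(values_are_ndarray, index_is_range, positions)
```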

### Intro to `Series` Methods

In [7]:
prices = [2.99, 3.45, 1.99]
s = pd.Series(prices)
s

0    2.99
1    3.45
2    1.99
dtype: float64

In [8]:
s.sum()

8.43

In [9]:
s.product()

20.527845000000003

In [10]:
s.mean()

2.81

Index labels in a pandas `Series` need **not** be unique. However, duplicate labels limit some operations; for example, looking up a duplicated label returns every matching row rather than a single value.

In [11]:
fruits = ['Apple', 'Banana', 'Watermelon', 'Plum', 'Tomato', 'Banana']
weekdays = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday']
pd.Series(data=weekdays, index=fruits)

Apple            Monday
Banana          Tuesday
Watermelon    Wednesday
Plum           Thursday
Tomato           Friday
Banana         Saturday
dtype: object
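One consequence of duplicate labels can be sketched with the same data: a lookup on a unique label returns a single value, while a lookup on a repeated label returns a `Series`. The variable names here are illustrative only.

```python
import pandas as pd

fruits = ['Apple', 'Banana', 'Watermelon', 'Plum', 'Tomato', 'Banana']
weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
s = pd.Series(data=weekdays, index=fruits)

# A unique label yields a single value.
unique_lookup = s['Apple']

# A duplicated label yields a Series containing every match.
dup_lookup = s['Banana']

# The index itself reports whether its labels are unique.
has_unique_index = s.index.is_unique

print(unique_lookup)
print(dup_lookup)
print(has_unique_index)
```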

### Importing `Series` from a CSV file
- Use the `usecols` parameter to specify the column(s) that you need. 
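- `pd.read_csv(..)` returns a `DataFrame` object by default. To obtain a pandas `Series`, call `.squeeze("columns")` on a single-column result. Older pandas versions also accept the `squeeze=True` parameter of `read_csv`, but it was deprecated in pandas 1.4 and removed in 2.0.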

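**Note:** A long `Series` is truncated when printed. How many rows appear depends on the `display.max_rows` and `display.min_rows` options; older pandas versions show the top 30 and bottom 30 values.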

In [12]:
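# .squeeze("columns") converts the single-column DataFrame into a Series;
# the squeeze=True parameter of read_csv was removed in pandas 2.0.
pokemon = pd.read_csv("pandas/pokemon.csv", usecols=['Pokemon']).squeeze("columns")
google = pd.read_csv("pandas/google_stock_price.csv").squeeze("columns")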
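The following sketch reproduces the column selection and squeeze step without needing the CSV files. It assumes a small in-memory CSV (`csv_text`, hypothetical) and uses `DataFrame.squeeze("columns")`, which works on both older and current pandas versions.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for a file on disk.
csv_text = "Pokemon,Type\nBulbasaur,Grass\nCharmander,Fire\nSquirtle,Water\n"

# usecols keeps only the requested column; the result is still a DataFrame.
frame = pd.read_csv(io.StringIO(csv_text), usecols=['Pokemon'])

# Squeezing the single column produces a Series named after that column.
names = frame.squeeze("columns")

print(type(frame).__name__, type(names).__name__, names.name)
```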

### The `.head(..)` and `.tail(..)` methods
Both methods return a new `Series` object.

In [13]:
pokemon.head(5) # default -> 5

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Pokemon, dtype: object

In [14]:
google.tail(4)

3008    771.07
3009    773.18
3010    771.61
3011    782.22
Name: Stock Price, dtype: float64
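A small sketch of both methods on made-up data: `.head()` with no argument returns the first five rows, and `.tail(n)` keeps the original index labels of the last `n` rows. The numbers below are arbitrary.

```python
import pandas as pd

s = pd.Series(range(10, 20))

first = s.head()        # no argument: first 5 rows
last_two = s.tail(2)    # last 2 rows, original labels retained

print(list(first))
print(list(last_two.index), list(last_two))
print(len(s))           # the original Series still has all 10 rows
```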

**Built-in Python functions work well with `Series` objects**

In [15]:
len(pokemon)

721

In [16]:
type(google)

pandas.core.series.Series

In [17]:
sorted(pokemon)

['Abomasnow',
 'Abra',
 'Absol',
 'Accelgor',
 'Aegislash',
 'Aerodactyl',
 'Aggron',
 'Aipom',
 'Alakazam',
 'Alomomola',
 'Altaria',
 'Amaura',
 'Ambipom',
 'Amoonguss',
 'Ampharos',
 'Anorith',
 'Arbok',
 'Arcanine',
 'Arceus',
 'Archen',
 'Archeops',
 'Ariados',
 'Armaldo',
 'Aromatisse',
 'Aron',
 'Articuno',
 'Audino',
 'Aurorus',
 'Avalugg',
 'Axew',
 'Azelf',
 'Azumarill',
 'Azurill',
 'Bagon',
 'Baltoy',
 'Banette',
 'Barbaracle',
 'Barboach',
 'Basculin',
 'Bastiodon',
 'Bayleef',
 'Beartic',
 'Beautifly',
 'Beedrill',
 'Beheeyem',
 'Beldum',
 'Bellossom',
 'Bellsprout',
 'Bergmite',
 'Bibarel',
 'Bidoof',
 'Binacle',
 'Bisharp',
 'Blastoise',
 'Blaziken',
 'Blissey',
 'Blitzle',
 'Boldore',
 'Bonsly',
 'Bouffalant',
 'Braixen',
 'Braviary',
 'Breloom',
 'Bronzong',
 'Bronzor',
 'Budew',
 'Buizel',
 'Bulbasaur',
 'Buneary',
 'Bunnelby',
 'Burmy',
 'Butterfree',
 'Cacnea',
 'Cacturne',
 'Camerupt',
 'Carbink',
 'Carnivine',
 'Carracosta',
 'Carvanha',
 'Cascoon',
 'Castform',


In [18]:
list(pokemon.head(5))

['Bulbasaur', 'Ivysaur', 'Venusaur', 'Charmander', 'Charmeleon']

In [19]:
dict(google.head(5))

{0: 50.12, 1: 54.1, 2: 54.65, 3: 52.38, 4: 52.95}

In [20]:
max(pokemon)

'Zygarde'

In [21]:
min(google)

49.95
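Built-in functions treat a `Series` as a plain iterable of values, so their results lose the index. As a hedged comparison, the sketch below puts `sorted()` and `max()` next to the equivalent `Series` methods, using the `prices` list from earlier in this notebook.

```python
import pandas as pd

prices = pd.Series([2.99, 3.45, 1.99])

# sorted() returns a plain Python list of values.
as_list = sorted(prices)

# sort_values() returns a Series and keeps each value's index label.
as_series = prices.sort_values()

# max() and the .max() method agree on the result.
same_max = max(prices) == prices.max()

print(as_list)
print(list(as_series.index))
print(same_max)
```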

## More `Series` attributes

In [22]:
# read_csv's squeeze=True was removed in pandas 2.0; squeeze the DataFrame instead.
pokemon = pd.read_csv("pandas/pokemon.csv", usecols=['Pokemon']).squeeze("columns")
google = pd.read_csv("pandas/google_stock_price.csv").squeeze("columns")

**`.values` provides an array of values of the series <br>
`.index` provides the indexes of the series <br>
`.dtype` provides the datatype of the series**

In [23]:
google.values

array([ 50.12,  54.1 ,  54.65, ..., 773.18, 771.61, 782.22])

In [24]:
pokemon.index

RangeIndex(start=0, stop=721, step=1)

In [25]:
pokemon.dtype

dtype('O')

**`.is_unique` returns `True` if the values in the `Series` are unique.**

In [26]:
print(pokemon.is_unique)
print(google.is_unique)

True
False
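To see why a `Series` reports `False`, it can help to inspect the repeats directly. This sketch uses small made-up data along with `.nunique()` and `.duplicated()`, two `Series` methods that are not otherwise covered in this notebook.

```python
import pandas as pd

names = pd.Series(['Abra', 'Absol', 'Aipom'])
closes = pd.Series([50.12, 54.10, 50.12])

names_unique = names.is_unique      # every value occurs once
closes_unique = closes.is_unique    # 50.12 occurs twice

# Number of distinct values, and a flag for each repeated occurrence.
distinct = closes.nunique()
repeat_flags = list(closes.duplicated())

print(names_unique, closes_unique, distinct, repeat_flags)
```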


**`.ndim` returns the number of dimensions of the object** <br>
Since a `Series` is a one-dimensional object, 1 is returned.

**`.shape` returns a tuple that denotes the shape of the object** <br>

In [27]:
print('pokemon.ndim:', pokemon.ndim)
print('google.ndim:', google.ndim)
print('pokemon.shape:', pokemon.shape)
print('google.shape:', google.shape)

pokemon.ndim: 1
google.ndim: 1
pokemon.shape: (721,)
google.shape: (3012,)


In [28]:
pokemon.shape

(721,)

**`.size` gives the total number of values.** <br/>
NOTE: Null values are included in the count.

In [29]:
print('pokemon.size:', pokemon.size)
print('google.size:', google.size)

pokemon.size: 721
google.size: 3012
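The effect of null values on `.size` can be sketched with a small `Series` that contains missing entries. The data is made up, and `.count()` is used here as the method that excludes nulls.

```python
import numpy as np
import pandas as pd

# Two real values and two missing ones (None is stored as NaN).
s = pd.Series([1.5, np.nan, 3.0, None])

total = s.size        # counts every entry, nulls included
non_null = s.count()  # counts only the non-null entries

print(total, non_null)
```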


**`.name` holds the name of the `Series`** <br/>
The name of the Series can be reassigned.

In [30]:
pokemon.name

'Pokemon'

In [31]:
pokemon.name = 'Pocket Monsters'

In [32]:
pokemon.head(3)

0    Bulbasaur
1      Ivysaur
2     Venusaur
Name: Pocket Monsters, dtype: object
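One place the name matters is when a `Series` is converted to a `DataFrame`, where it becomes the column label. The sketch below also shows `.rename()`, which returns a renamed copy instead of modifying the original. The two-row `Series` is a stand-in for the full `pokemon` data.

```python
import pandas as pd

s = pd.Series(['Bulbasaur', 'Ivysaur'], name='Pokemon')

# Assigning to .name changes the Series in place.
s.name = 'Pocket Monsters'

# The name becomes the column label when converting to a DataFrame.
frame = s.to_frame()
columns = list(frame.columns)

# rename() with a scalar returns a new Series; s keeps its name.
renamed = s.rename('Monsters')

print(columns, renamed.name, s.name)
```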