# Series
* A Series is a pandas data structure that holds an array of data together with a named (labeled) index.
* The named index differentiates it from a plain NumPy array.
* <b>Formal Definition</b>: One-dimensional ndarray with axis labels.
<h5> Numpy array has numeric index </h5>

<table class="center">
<tr>
<th>Index</th>
<th>Data</th>
</tr>

<tr>
<th>0</th>
<th>1776</th>
</tr>

<tr>
<th>1</th>
<th>1867</th>
</tr>

<tr>
<th>2</th>
<th>1821</th>
</tr>
</table>

<h5> Pandas Series adds on a labeled index </h5>

<table class="center">
<tr>
<th>Labeled Index</th>
<th>Data</th>
</tr>

<tr>
<th>USA</th>
<th>1776</th>
</tr>

<tr>
<th>CANADA</th>
<th>1867</th>
</tr>

<tr>
<th>MEXICO</th>
<th>1821</th>
</tr>
</table>

<h5> Data is still numerically organized </h5>

<table class="center">
<tr>
<th>Numeric Index</th>
<th>Labeled Index</th>
<th>Data</th>
</tr>

<tr>
<th>0</th>
<th>USA</th>
<th>1776</th>
</tr>

<tr>
<th>1</th>
<th>CANADA</th>
<th>1867</th>
</tr>

<tr>
<th>2</th>
<th>MEXICO</th>
<th>1821</th>
</tr>
</table>
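
The three tables above can be reproduced in a few lines. This is only a minimal sketch; the names `years`, `arr`, and `ser` are illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np
import pandas as pd

years = [1776, 1867, 1821]

# A NumPy array only knows positions 0, 1, 2
arr = np.array(years)

# A Series wraps the same values and attaches a label to each position
ser = pd.Series(data=years, index=['USA', 'CANADA', 'MEXICO'])

print(arr[1])           # value at position 1
print(ser.index)        # the labels
print(ser.to_numpy())   # the underlying values, still in positional order
```

The values keep their original order inside the Series; the labels are an extra layer on top of the positions, not a replacement for them.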

In [34]:
import numpy as np

In [35]:
import pandas as pd

In [36]:
# help(pd.Series)

In [37]:
# the labeled index
myindex = ['USA', 'Canada', 'Mexico']

In [38]:
# the values
mydata = [1776,1867,1821]

In [39]:
# if index is not passed, pandas falls back to a default integer index (0, 1, 2, ...)
myser = pd.Series(data=mydata, index=myindex)

In [40]:
myser

USA       1776
Canada    1867
Mexico    1821
dtype: int64
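
When `index` is omitted, pandas assigns a default integer index. A small sketch of that case, where the name `default_ser` is made up for illustration:

```python
import pandas as pd

mydata = [1776, 1867, 1821]

# No index argument: pandas assigns the default integer index 0..n-1
default_ser = pd.Series(data=mydata)

print(default_ser.index)   # RangeIndex(start=0, stop=3, step=1)
print(default_ser[0])      # here 0 is a label, and it maps to 1776
```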

In [41]:
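# the Series keeps its positional order alongside the labels, so both lookups work
# .iloc[0] is the explicit positional form; plain myser[0] relies on a fallback that newer pandas versions deprecate
myser.iloc[0]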

1776

In [42]:
myser['USA']

1776
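
Label-based and position-based lookups can be spelled out with `.loc` and `.iloc`, which avoids any ambiguity about what an integer key means. The sketch below rebuilds `myser` so that it runs on its own; `first_two_by_pos` and `first_two_by_label` are illustrative names.

```python
import pandas as pd

myser = pd.Series(data=[1776, 1867, 1821], index=['USA', 'Canada', 'Mexico'])

print(myser.iloc[0])      # first element, whatever its label is
print(myser.loc['USA'])   # element whose label is 'USA'

# Slices differ: positional slices exclude the end, label slices include it
first_two_by_pos = myser.iloc[0:2]
first_two_by_label = myser.loc['USA':'Canada']
print(first_two_by_pos.equals(first_two_by_label))
```

Both slices select the USA and Canada rows, even though one is written with an end position of 2 and the other with the end label itself.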

In [43]:
# Create a Series from a dictionary: the keys become the index labels
ages = {'Sam': 5, 'Frank': 10, 'Spike': 7}


In [44]:
pd.Series(ages)

Sam       5
Frank    10
Spike     7
dtype: int64

In [45]:
# Imaginary sales data for the 1st and 2nd quarters of a global company
q1 = {'Japan': 80, 'China': 450, 'India': 200, 'USA': 250}
q2 = {'Brazil': 100, 'China': 500, 'India': 210, 'USA': 260}

In [46]:
sales_q1 = pd.Series(q1)

In [47]:
sales_q2 = pd.Series(q2)

In [48]:
sales_q1

Japan     80
China    450
India    200
USA      250
dtype: int64

In [49]:
sales_q2

Brazil    100
China     500
India     210
USA       260
dtype: int64

In [50]:
# the label must match the key exactly (labels are case-sensitive)
sales_q1['Japan']

80
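
A label that does not match exactly raises a `KeyError`. One way to guard against that, sketched here with the same Q1 data, is to test membership first or to use `Series.get` with a default value (the default of `0` is an arbitrary choice for this example):

```python
import pandas as pd

sales_q1 = pd.Series({'Japan': 80, 'China': 450, 'India': 200, 'USA': 250})

# The `in` operator checks the index labels, not the values
print('Japan' in sales_q1)   # True
print('japan' in sales_q1)   # False: labels are case-sensitive

# .get returns the default instead of raising KeyError for a missing label
japan_sales = sales_q1.get('Japan', 0)
missing_sales = sales_q1.get('japan', 0)
print(japan_sales, missing_sales)
```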

In [51]:
# positional lookup; older pandas also accepted sales_q1[0] here
sales_q1.iloc[0]

80

In [52]:
# keys() returns the index labels (the same object as sales_q1.index)
sales_q1.keys()

Index(['Japan', 'China', 'India', 'USA'], dtype='object')

In [53]:
# multiplying a Python list repeats the list
[1,2] * 2

[1, 2, 1, 2]

In [54]:
# multiplying a NumPy array multiplies each element (broadcasting)
np.array([1,2]) * 2

array([2, 4])

In [55]:
# A Series is built on top of NumPy, so it behaves like a NumPy array and supports broadcast operations
sales_q1 / 100


Japan    0.8
China    4.5
India    2.0
USA      2.5
dtype: float64

In [56]:
sales_q1

Japan     80
China    450
India    200
USA      250
dtype: int64

In [57]:
sales_q2

Brazil    100
China     500
India     210
USA       260
dtype: int64

In [58]:
# total sales across both quarters
# values are matched by label; a label that is missing from either Series gives NaN (not a number)
# here Japan appears only in Q1 and Brazil only in Q2
sales_q1 + sales_q2

Brazil      NaN
China     950.0
India     410.0
Japan       NaN
USA       510.0
dtype: float64

In [59]:
# fill_value: used in place of existing missing (NaN) values and of any label that one Series lacks, so the alignment succeeds
first_half = sales_q1.add(sales_q2, fill_value=0)
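
The cell above stores the result without displaying it. A self-contained sketch that prints it and compares it with the plain `+` version (the name `with_gaps` is illustrative):

```python
import pandas as pd

sales_q1 = pd.Series({'Japan': 80, 'China': 450, 'India': 200, 'USA': 250})
sales_q2 = pd.Series({'Brazil': 100, 'China': 500, 'India': 210, 'USA': 260})

# Plain addition leaves NaN wherever a label exists on one side only
with_gaps = sales_q1 + sales_q2
print(with_gaps.isna().sum())   # number of labels without a total

# With fill_value=0 the missing side counts as 0, so every label gets a total
first_half = sales_q1.add(sales_q2, fill_value=0)
print(first_half)
```

Japan and Brazil now keep their single-quarter figures instead of turning into NaN.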

When a computation has to align two indexes and fill in labels that one side lacks,
pandas converts the integer values to <b>floating point numbers</b>, because a missing value (NaN) can only be stored as a float.

<p>Brazil    100 -> 100.0</p>
<p>China     500 -> 500.0</p>
<p>India     210 -> 210.0</p>
<p>USA       260 -> 260.0</p>

In [60]:
sales_q1.dtype

dtype('int64')

In [61]:
first_half.dtype

dtype('float64')
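
The switch to `float64` is tied to missing labels rather than to arithmetic in general. A minimal sketch with two small made-up Series pairs (all names and numbers here are hypothetical) shows the difference:

```python
import pandas as pd

# Same labels on both sides: nothing is missing, integers stay integers
same_labels = pd.Series({'China': 1, 'India': 2}) + pd.Series({'China': 10, 'India': 20})
print(same_labels.dtype)

# Different labels: alignment introduces NaN, which forces a float dtype
different_labels = pd.Series({'China': 1, 'Japan': 2}) + pd.Series({'China': 10, 'Brazil': 20})
print(different_labels.dtype)
```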

In [62]:
# convert the floats back to integers by applying np.int64 to each element
first_half.apply(np.int64)

Brazil    100
China     950
India     410
Japan      80
USA       510
dtype: int64

In [63]:
# astype(int) uses the platform's default integer size (int32 in this run)
first_half.astype(int)

Brazil    100
China     950
India     410
Japan      80
USA       510
dtype: int32
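
If the integer width matters, it can be requested explicitly instead of relying on the platform default. The sketch below rebuilds `first_half` from the values shown above so that it runs on its own; `as_int64` and `truncated` are illustrative names.

```python
import pandas as pd

first_half = pd.Series(
    {'Brazil': 100.0, 'China': 950.0, 'India': 410.0, 'Japan': 80.0, 'USA': 510.0}
)

# Naming the dtype gives the same width on every platform
as_int64 = first_half.astype('int64')
print(as_int64.dtype)

# Caution: converting floats to integers drops the fractional part
truncated = pd.Series([1.9, -1.9]).astype('int64')
print(truncated.tolist())   # [1, -1]
```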

In [65]:
# total sales for the first half; gives the same result as first_half.sum()
first_half.aggregate(func=np.sum, axis=0)

2050.0
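
The same total can be obtained in a few equivalent ways. This sketch again rebuilds `first_half` from the values shown earlier; `total`, `total_agg`, and `summary` are illustrative names.

```python
import pandas as pd

first_half = pd.Series(
    {'Brazil': 100.0, 'China': 950.0, 'India': 410.0, 'Japan': 80.0, 'USA': 510.0}
)

total = first_half.sum()             # direct method
total_agg = first_half.agg('sum')    # aggregate by function name

# A list of names returns one result per function, labeled by name
summary = first_half.agg(['sum', 'mean', 'max'])
print(total, total_agg)
print(summary)
```

Passing function names as strings is a common choice because it lets pandas pick its own implementation of each reduction.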