## Pandas Operations

### Series

#### Loading packages and initializations

In [1]:
import numpy as np

In [2]:
import pandas as pd

In [3]:
labels = ['a', 'b', 'c']
my_data = [10, 20, 30]
arr = np.array(my_data)
dic = {'a':10, 'b': 20, 'c':30}

print("Labels: ", labels)
print("My Data: ", my_data)
print("My Np Array: ", arr)
print("Dictionary: ", dic)

Labels:  ['a', 'b', 'c']
My Data:  [10, 20, 30]
My Np Array:  [10 20 30]
Dictionary:  {'a': 10, 'b': 20, 'c': 30}


### Creating a Series (Pandas class)
* From a list of values only (pandas assigns a default integer index)
* From a list of values and a corresponding index (row labels)
* From a NumPy array as the source of the values
* From a pre-defined dictionary (keys become the index)

In [4]:
pd.Series(my_data) # Looks like a NumPy array, plus a default integer index (0, 1, 2)

0    10
1    20
2    30
dtype: int64

In [5]:
pd.Series(my_data, index=labels) # Note the extra information about index

a    10
b    20
c    30
dtype: int64

In [6]:
pd.Series(my_data, labels)

a    10
b    20
c    30
dtype: int64

In [7]:
# Arguments are passed positionally (data first, then index); a NumPy array supplies the data
pd.Series(arr, labels)

a    10
b    20
c    30
dtype: int32
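
The `int32` above, compared with `int64` for the list-based Series, comes from the NumPy array: a Series built from an array keeps the array's dtype, and NumPy's default integer dtype depends on the platform and NumPy version. A minimal sketch of this behavior, with the array dtype set explicitly here for illustration so the result is the same everywhere:

```python
import numpy as np
import pandas as pd

labels = ['a', 'b', 'c']

# A Series built from a NumPy array keeps the array's dtype.
arr32 = np.array([10, 20, 30], dtype=np.int32)
ser_from_arr = pd.Series(arr32, index=labels)

# A Series built from a plain list of Python ints is inferred as int64.
ser_from_list = pd.Series([10, 20, 30], index=labels)

# Passing dtype explicitly removes the platform dependence.
ser_explicit = pd.Series(arr32, index=labels, dtype='int64')

print(ser_from_arr.dtype, ser_from_list.dtype, ser_explicit.dtype)
```

If the integer width matters downstream (for example when writing binary files), passing `dtype` is likely the safer habit.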

In [8]:
pd.Series(dic)

a    10
b    20
c    30
dtype: int64
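
When a dictionary is combined with an explicit `index`, pandas looks up each requested label among the dictionary keys rather than pairing values by position. A small sketch of that lookup (the label `'z'` is made up for illustration and is deliberately not a key of `dic`):

```python
import pandas as pd

dic = {'a': 10, 'b': 20, 'c': 30}

# Labels are matched against the dictionary keys: 'c' and 'a' are found,
# 'z' is not a key, so it gets NaN. 'b' is not requested and is dropped.
ser = pd.Series(dic, index=['c', 'a', 'z'])
print(ser)
```

Because NaN is a floating-point value, the result is upcast to `float64` even though the dictionary holds integers.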

### What type of values can a Pandas Series hold?

In [9]:
print("\nHolding numerical data\n", '-'*25, sep='')
print(pd.Series(arr))
print("\nHolding text labels\n", '-'*20, sep='')
print(pd.Series(labels))
print("\nHolding functions\n", '-'*20, sep='')
print(pd.Series(data=[sum, print, len]))
print("\nHolding objects from a dictionary\n", '-'*40, sep='')
print(pd.Series(data=[dic.keys, dic.items, dic.values]))


Holding numerical data
-------------------------
0    10
1    20
2    30
dtype: int32

Holding text labels
--------------------
0    a
1    b
2    c
dtype: object

Holding functions
--------------------
0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

Holding objects from a dictionary
----------------------------------------
0    <built-in method keys of dict object at 0x0000...
1    <built-in method items of dict object at 0x000...
2    <built-in method values of dict object at 0x00...
dtype: object
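
The function and method examples above report `dtype: object`, meaning the Series stores references to arbitrary Python objects instead of a packed numeric array. A brief sketch of how the dtype follows from the values (the sample values are illustrative, not from the cells above):

```python
import pandas as pd

ints = pd.Series([1, 2, 3])            # all integers
mixed_num = pd.Series([1, 2.5])        # ints and floats are upcast to float64
mixed_any = pd.Series([1, 'a', len])   # unrelated types fall back to object

print(ints.dtype, mixed_num.dtype, mixed_any.dtype)

# Elements of an object Series keep their original Python types,
# so a stored function can still be called after retrieval.
stored_len = mixed_any.iloc[2]
print(stored_len('ab'))
```

Object Series are flexible, but vectorized numeric operations generally do not apply to them, so they are best reserved for genuinely heterogeneous data.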


### Indexing and slicing

In [10]:
ser1 = pd.Series([1,2,3,4],['CA', 'OR', 'CO', 'AZ'])
ser2 = pd.Series([1,2,5,4],['CA', 'OR', 'NV', 'AZ'])

print(ser1,ser2,sep="\n")

print("\nIndexing by name of the item/object (string identifier)\n", '-'*56, sep='')
print("Value for CA in ser1:", ser1['CA'])
print("Value for AZ in ser1:", ser1['AZ'])
print("Value for NV in ser2:", ser2['NV'])

print("\nIndexing by number (positional value in the series)\n", '-'*52, sep='')
# Use .iloc for positional access: plain ser1[0] on a label-indexed Series is deprecated in recent pandas
print("Value for CA in ser1:", ser1.iloc[0])
print("Value for AZ in ser1:", ser1.iloc[3])
print("Value for NV in ser2:", ser2.iloc[2])

print("\nIndexing by a range\n", '-'*25, sep='')
print("Value for OR, CO, and AZ in ser1:\n", ser1.iloc[1:4], sep='')

CA    1
OR    2
CO    3
AZ    4
dtype: int64
CA    1
OR    2
NV    5
AZ    4
dtype: int64

Indexing by name of the item/object (string identifier)
--------------------------------------------------------
Value for CA in ser1: 1
Value for AZ in ser1: 4
Value for NV in ser2: 5

Indexing by number (positional value in the series)
----------------------------------------------------
Value for CA in ser1: 1
Value for AZ in ser1: 4
Value for NV in ser2: 5

Indexing by a range
-------------------------
Value for OR, CO, and AZ in ser1:
OR    2
CO    3
AZ    4
dtype: int64
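
Label-based and position-based access can be made explicit with `.loc` and `.iloc`, which avoids ambiguity about what a key means. A minimal sketch using the same `ser1` as above, showing that the two accessors also differ in how they treat the end of a slice:

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['CA', 'OR', 'CO', 'AZ'])

by_label = ser1.loc['CO']           # label-based lookup
by_position = ser1.iloc[2]          # position-based lookup (third element)

# Positional slices exclude the stop position, like Python lists.
pos_slice = ser1.iloc[1:3]          # OR, CO

# Label slices include BOTH endpoints.
label_slice = ser1.loc['OR':'CO']   # OR, CO

print(by_label, by_position)
print(pos_slice.equals(label_slice))
```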


### Adding two Series that share some index labels

In [11]:
ser1 = pd.Series([1,2,3,4],['CA', 'OR', 'CO', 'AZ'])
ser2 = pd.Series([1,2,5,4],['CA', 'OR', 'NV', 'AZ'])
ser3 = ser1+ser2

print("\nAfter adding the two series, the result looks like this...\n", '-'*59, sep='')
print(ser3)
print("\npandas aligns the two Series on their index labels: it adds values where a label is present in both and puts NaN where a label is missing from either\n")

print("\nThe idea works even for multiplication...\n", '-'*43, sep='')
print(ser1*ser2)

print("\nOr even for combination of mathematical operations!\n", '-'*53, sep='')
print(np.exp(ser1)+np.log10(ser2))


After adding the two series, the result looks like this...
-----------------------------------------------------------
AZ    8.0
CA    2.0
CO    NaN
NV    NaN
OR    4.0
dtype: float64

pandas aligns the two Series on their index labels: it adds values where a label is present in both and puts NaN where a label is missing from either


The idea works even for multiplication...
-------------------------------------------
AZ    16.0
CA     1.0
CO     NaN
NV     NaN
OR     4.0
dtype: float64

Or even for combination of mathematical operations!
-----------------------------------------------------
AZ    55.200210
CA     2.718282
CO          NaN
NV          NaN
OR     7.690086
dtype: float64
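
The NaN entries for `CO` and `NV` arise because each label exists in only one of the two Series. If NaN is not the desired outcome, one option is the `add` method with `fill_value`, and another is to drop the unmatched labels afterwards. A short sketch of both, assuming that treating a missing value as 0 makes sense for the data:

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], ['CA', 'OR', 'CO', 'AZ'])
ser2 = pd.Series([1, 2, 5, 4], ['CA', 'OR', 'NV', 'AZ'])

# Treat a label missing from one Series as 0 instead of producing NaN.
filled = ser1.add(ser2, fill_value=0)
print(filled)

# Alternatively, keep only the labels present in both Series.
common = (ser1 + ser2).dropna()
print(common)
```

Which option is appropriate depends on whether a missing label means "zero" or "unknown" in the data at hand.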
