<a id="top"></a>

### Contents
- [01 - Pandas Package](#section-1) 
- [02 - Series](#section-2) 
- [03 - DataFrame](#section-3) 
- [04 - Index Objects](#section-4) 
- [05 - Essential Functionality](#section-5) 
- [06 - Summarizing and Computing Descriptive Statistics](#section-6) 

<a id="section-1"></a>
<details open> 
<summary> 01 - Pandas Package </summary>
</details>

In [1]:
try:
    import pandas as pd
except ImportError:
    # Install pandas on first use, then import it
    !pip install pandas
    import pandas as pd

import numpy as np


[Back to Contents](#top)

<a id="section-2"></a>
<details open> 
<summary> 02 - Series </summary> <br>
    <li> Series - values and index
    <li> Series - custom index
    <li> Using labels to access the series
    <li> Assigning value using label
    <li> Applying numpy like operations
    <li> Checking if an index is present in the series
    <li> Creating a series from a dictionary
    <li> Usage of isnull() and notnull()
    <li> Setting up index.name and object.name
    <li> Modifying an index in place
</details>

In [2]:
# Series - values and index

obj = pd.Series([1, 2, 3])

print(obj)
print('Values in the Series are: ' + str(obj.values))
print('Index in the Series are: ' + str(obj.index))

0    1
1    2
2    3
dtype: int64
Values in the Series are: [1 2 3]
Index in the Series are: RangeIndex(start=0, stop=3, step=1)


In [3]:
# Series - custom index

obj = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

print(obj)
print('Values in the Series are: ' + str(obj.values))
print('Index in the Series are: ' + str(obj.index))

a    1
b    2
c    3
dtype: int64
Values in the Series are: [1 2 3]
Index in the Series are: Index(['a', 'b', 'c'], dtype='object')


In [4]:
# Using labels to access the series

print('Data available at index "a": ' + str(obj['a']))
print('\nData available at index "a" and "b": \n' + str(obj[['a', 'b']])) # ['a','b'] is a list of indices

# Assigning value using label

obj['d'] = 4
print('\nData available at index "d": ' + str(obj['d']))

Data available at index "a": 1

Data available at index "a" and "b": 
a    1
b    2
dtype: int64

Data available at index "d": 4


In [5]:
# Applying numpy like operations

obj[obj>1]

b    2
c    3
d    4
dtype: int64

In [6]:
# Checking if an index is present in the series

'b' in obj

True

In [7]:
# Creating a series from a dictionary

dict_name = {'a': 100, 'b': 200, 'c': 300}

obj = pd.Series(dict_name)
print(obj)

a    100
b    200
c    300
dtype: int64


In [8]:
# Usage of isnull() and notnull()

obj = pd.Series(dict_name,index = ['a', 'b', 'c', 'd', 'e'])
print(obj)
print('\nChecking for null values: \n' + str(obj.isnull()))
print('\nChecking for not null values: \n' + str(obj.notnull()))

a    100.0
b    200.0
c    300.0
d      NaN
e      NaN
dtype: float64

Checking for null values: 
a    False
b    False
c    False
d     True
e     True
dtype: bool

Checking for not null values: 
a     True
b     True
c     True
d    False
e    False
dtype: bool


In [9]:
# Setting up index.name and object.name

obj = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

obj.name = 'ex_name'
obj.index.name = 'index_name'
obj

index_name
a    1
b    2
c    3
Name: ex_name, dtype: int64

In [10]:
# Modifying an index in place

obj.index = ['d','e','f']
obj

d    1
e    2
f    3
Name: ex_name, dtype: int64

[Back to Contents](#top)

<a id="section-3"></a>
<details open> 
<summary> 03 - DataFrame </summary> <br>
    <li> Dataframe - creating a dataframe from dictionary
    <li> Getting a sample from the dataframe using head()
    <li> Accessing the dataframe
    <li> Assigning the values to a dataframe
    <li> Delete a column from a dataframe
    <li> Transpose of a dataframe
    <li> Values in a dataframe
</details>

In [11]:
# Dataframe - creating a dataframe from dictionary

data = {'state': ['A', 'B', 'C', 'D', 'E', 'F'],
        'year': [2000, 2001, 2002, 2001, 2002, 2003],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}

frame = pd.DataFrame(data)
frame

  state  year  pop
0     A  2000  1.5
1     B  2001  1.7
2     C  2002  3.6
3     D  2001  2.4
4     E  2002  2.9
5     F  2003  3.2


In [12]:
# Getting a sample from the dataframe using head()

frame.head(2)

  state  year  pop
0     A  2000  1.5
1     B  2001  1.7


In [13]:
# Accessing the dataframe

print("The output of frame['state'] is :")
print(frame['state'])
print("\nThe output of frame.year is :")
print(frame.year)

The output of frame['state'] is :
0    A
1    B
2    C
3    D
4    E
5    F
Name: state, dtype: object

The output of frame.year is :
0    2000
1    2001
2    2002
3    2001
4    2002
5    2003
Name: year, dtype: int64


In [14]:
# Assigning the values to a dataframe

print("Assigning the values to a dataframe\n")

print("Output of frame['pop']=1.5 is :")
frame['pop']=1.5
print(frame)

print("\nOutput of frame['pop']=np.arange(6) is :")
frame['pop']=np.arange(6)
print(frame)

sample = pd.Series([1.5, 1.7, 3.6, 2.4])
frame['pop']=sample
print("\nOutput of frame['pop']=sample is :")
print(frame)


Assigning the values to a dataframe

Output of frame['pop']=1.5 is :
  state  year  pop
0     A  2000  1.5
1     B  2001  1.5
2     C  2002  1.5
3     D  2001  1.5
4     E  2002  1.5
5     F  2003  1.5

Output of frame['pop']=np.arange(6) is :
  state  year  pop
0     A  2000    0
1     B  2001    1
2     C  2002    2
3     D  2001    3
4     E  2002    4
5     F  2003    5

Output of frame['pop']=sample is :
  state  year  pop
0     A  2000  1.5
1     B  2001  1.7
2     C  2002  3.6
3     D  2001  2.4
4     E  2002  NaN
5     F  2003  NaN
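
The `NaN` values in the last two rows come from index alignment: when a Series is assigned to a column, pandas matches the Series labels against the DataFrame index instead of copying values by position. The sketch below illustrates this with a smaller, hypothetical frame and a Series whose labels are deliberately out of order.

```python
import pandas as pd

# Small illustrative frame (not the one used in the cells above)
frame = pd.DataFrame({'state': ['A', 'B', 'C'], 'year': [2000, 2001, 2002]})

# The Series carries labels 2, 0 and 5; only 2 and 0 exist in frame.index
sample = pd.Series([3.6, 1.5, 9.9], index=[2, 0, 5])

# Values are matched by label: row 1 has no match and becomes NaN,
# and label 5 is silently dropped because the frame has no such row
frame['pop'] = sample
print(frame)
```

Rows are filled by label, so the order of values in `sample` does not matter.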


In [15]:
# Delete a column from a dataframe

print("Deleting a column from a dataframe using 'del frame['pop']'\n")
del frame['pop']
print(frame)

Deleting a column from a dataframe using 'del frame['pop']'

  state  year
0     A  2000
1     B  2001
2     C  2002
3     D  2001
4     E  2002
5     F  2003


In [16]:
# Transpose of a dataframe

print("Transposing a dataframe using 'frame.T'\n")
print(frame.T)

Transposing a dataframe using 'frame.T'

          0     1     2     3     4     5
state     A     B     C     D     E     F
year   2000  2001  2002  2001  2002  2003


In [17]:
# Values in a dataframe

print("Values in a dataframe using 'frame.values'\n")
print(frame.values)

Values in a dataframe using 'frame.values'

[['A' 2000]
 ['B' 2001]
 ['C' 2002]
 ['D' 2001]
 ['E' 2002]
 ['F' 2003]]


[Back to Contents](#top)

<a id="section-4"></a>
<details open> 
<summary> 04 - Index Objects</summary> <br>
    <li> Index object is immutable
</details>

In [18]:
# Index object is immutable

obj = pd.Series(range(3), index=['a', 'b', 'c'])

i = obj.index
print(i)

print("\nTrying to modify a value in an index")
i[0]='d'

Index(['a', 'b', 'c'], dtype='object')

Trying to modify a value in an index


TypeError: Index does not support mutable operations
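
Because an Index rejects item assignment, labels are usually changed by building a new index. A minimal sketch, assuming the goal is to relabel `'a'` as `'d'`, uses `Series.rename`, which returns a new object and leaves the original untouched.

```python
import pandas as pd

obj = pd.Series(range(3), index=['a', 'b', 'c'])

# Direct item assignment on the Index raises TypeError
try:
    obj.index[0] = 'd'
except TypeError as exc:
    print('Index rejected the assignment:', exc)

# rename() builds a new Series with a new Index; obj itself is unchanged
renamed = obj.rename(index={'a': 'd'})
print(renamed)
```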

[Back to Contents](#top)

<a id="section-5"></a>
<details open> 
<summary> 05 - Essential Functionality</summary> <br>
    <li> Reindexing
    <li> Dropping a column/row using drop() method
    <li> Indexing, selection and filtering
    <li> Selection with loc and iloc
    <li> Arithmetic operations
    <li> Function Application and Mapping
    <li> Sorting and Ranking
</details>

In [19]:
# Reindexing

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])

obj2

a   -5.3
b    7.2
c    3.6
d    4.5
e    NaN
dtype: float64
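
Reindexing introduces `NaN` for labels that were not in the original index. As a sketch of two common alternatives, `fill_value` supplies a constant for the new labels and `method='ffill'` carries the last known value forward. The `colors` Series is a made-up example.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])

# Use 0.0 instead of NaN for the new label 'e'
filled = obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0.0)
print(filled)

# Forward-fill gaps; this requires a sorted (monotonic) original index
colors = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
padded = colors.reindex(range(6), method='ffill')
print(padded)
```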

In [20]:
# Dropping a column/row using drop() method

print("\nDropping a row using 'frame.drop(0,axis=0)'")
frame.drop(0,axis=0, inplace=True)
print(frame)

print("\nDropping a column using 'frame.drop(['state'],axis=1)'")
frame.drop(['state'],axis=1, inplace=True)
print(frame)



Dropping a row using 'frame.drop(0,axis=0)'
  state  year
1     B  2001
2     C  2002
3     D  2001
4     E  2002
5     F  2003

Dropping a column using 'frame.drop(['state'],axis=1)'
   year
1  2001
2  2002
3  2001
4  2002
5  2003
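
The cell above uses `inplace=True`, which modifies `frame` itself. Without it, `drop()` returns a new object and leaves the original unchanged. The sketch below shows this on a small, hypothetical frame, using the `index=` and `columns=` keywords as an alternative to `axis`.

```python
import pandas as pd

frame = pd.DataFrame({'state': ['A', 'B', 'C'], 'year': [2000, 2001, 2002]})

# Each call returns a new DataFrame; frame keeps all rows and columns
without_row = frame.drop(index=0)
without_col = frame.drop(columns=['state'])

print(without_row)
print(without_col)
print(frame.shape)
```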


In [21]:
# Indexing, selection and filtering

obj = pd.Series(np.arange(4.), index=['a', 'b', 'c', 'd'])
print('obj is:')
print(obj)

print("\nOutput of obj.iloc[1] is :")
print(obj.iloc[1])  # positional access; obj[1] on a labelled index is deprecated

print("\nOutput of obj[1:3] is :")
print(obj[1:3])

print("\nOutput of obj[['a','b']] is :")
print(obj[['a','b']])

print("\nOutput of obj[obj>2] is :")
print(obj[obj>2])

print("\nOutput of obj[:3] is :")
print(obj[:3])

obj is:
a    0.0
b    1.0
c    2.0
d    3.0
dtype: float64

Output of obj.iloc[1] is :
1.0

Output of obj[1:3] is :
b    1.0
c    2.0
dtype: float64

Output of obj[['a','b']] is :
a    0.0
b    1.0
dtype: float64

Output of obj[obj>2] is :
d    3.0
dtype: float64

Output of obj[:3] is :
a    0.0
b    1.0
c    2.0
dtype: float64



In [22]:
# Selection with loc and iloc

frame.index=['a','b','c','d','e']

print("\nOutput of frame.loc['a'] is :")
print(frame.loc['a'])

print("\nOutput of frame.iloc[1:3] is :")
print(frame.iloc[1:3])



Output of frame.loc['a'] is :
year    2001
Name: a, dtype: int64

Output of frame.iloc[1:3] is :
   year
b  2002
c  2001
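
One difference between the two accessors is easy to miss: a label slice with `loc` includes its end point, while a positional slice with `iloc` excludes it. A short sketch, rebuilding a frame like the one above:

```python
import pandas as pd

frame = pd.DataFrame({'year': [2001, 2002, 2001, 2002, 2003]},
                     index=['a', 'b', 'c', 'd', 'e'])

by_label = frame.loc['b':'d']      # label slice: 'd' is included
by_position = frame.iloc[1:3]      # positional slice: position 3 is excluded
single = frame.loc['c', 'year']    # row label and column label together

print(by_label)
print(by_position)
print(single)
```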


In [23]:
# Arithmetic operations

s1 = pd.Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
s2 = pd.Series([1.3, -1.5, 1.4, 0.5], index=['a', 'c', 'd', 'e'])

print(s1)
print('\n')
print(s2)

print('\nOutput of s1+s2 is:')
print(s1+s2)

a    7.3
c   -2.5
d    3.4
e    1.5
dtype: float64


a    1.3
c   -1.5
d    1.4
e    0.5
dtype: float64

Output of s1+s2 is:
a    8.6
c   -4.0
d    4.8
e    2.0
dtype: float64
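
In the cell above both Series share the same index, so the label alignment is not visible. The sketch below uses two hypothetical Series with partly different labels: the result covers the union of both indexes, and labels present on only one side give `NaN` unless `add()` is called with a `fill_value`.

```python
import pandas as pd

s1 = pd.Series([7.3, -2.5, 3.4], index=['a', 'c', 'd'])
s2 = pd.Series([1.0, 2.0, 3.0], index=['a', 'c', 'e'])

# 'd' and 'e' each exist in only one Series, so their sums are NaN
plain = s1 + s2
print(plain)

# fill_value=0 treats a missing operand as 0 before adding
filled = s1.add(s2, fill_value=0)
print(filled)
```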


In [24]:
# Function Application and Mapping

frame = pd.DataFrame(np.random.randn(4, 3), columns=list('123'),index=['a', 'b', 'c', 'd'])
print(frame)

f = lambda x: x.max() - x.min()
print('\nOutput of frame.apply(f) is:')
print(frame.apply(f))


          1         2         3
a  0.424404  0.651753  0.755897
b  0.803377 -0.836715 -1.472840
c -0.125690  0.218469 -1.401232
d -1.503746  0.067083 -1.397226

Output of frame.apply(f) is:
1    2.307122
2    1.488467
3    2.228737
dtype: float64
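
By default `apply()` calls the function once per column. As a sketch with fixed, made-up values (so the results are reproducible), passing `axis='columns'` applies it once per row instead, and `Series.map` applies a function to each element.

```python
import pandas as pd

frame = pd.DataFrame({'1': [0.5, -1.0, 2.0], '2': [1.5, 0.0, -2.0]},
                     index=['a', 'b', 'c'])

def spread(x):
    """Difference between the largest and smallest value."""
    return x.max() - x.min()

per_column = frame.apply(spread)                  # one result per column
per_row = frame.apply(spread, axis='columns')     # one result per row
formatted = frame['1'].map(lambda v: f'{v:.2f}')  # element-wise on a Series

print(per_column)
print(per_row)
print(formatted)
```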


In [25]:
# Sorting and Ranking

obj = pd.Series(range(4), index=['d', 'a', 'b', 'c'])
print('Sorting the Series by index:')
print(obj.sort_index())

print('\nSorting the Series by values:')
print(obj.sort_values())

print('\nRanking the Series:')
print(obj.rank())

Sorting the Series by index:
a    1
b    2
c    3
d    0
dtype: int64

Sorting the Series by values:
d    0
a    1
b    2
c    3
dtype: int64

Ranking the Series:
d    1.0
a    2.0
b    3.0
c    4.0
dtype: float64
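
The Series above has no repeated values, so every rank is a whole number. With ties, `rank()` assigns the average rank by default. The sketch below, on a hypothetical Series with a tie, also shows `method='first'` (ties broken by order of appearance) and a descending sort.

```python
import pandas as pd

obj = pd.Series([7, -5, 7, 4], index=['a', 'b', 'c', 'd'])

average_rank = obj.rank()               # tied values share the mean rank
first_rank = obj.rank(method='first')   # ties ranked in order of appearance
descending = obj.sort_values(ascending=False)

print(average_rank)
print(first_rank)
print(descending)
```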


[Back to Contents](#top)

<a id="section-6"></a>
<details open> 
<summary> 06 - Summarizing and Computing Descriptive Statistics</summary> <br>
    <li> Sum of columns using sum()
    <li> Correlation and Covariance
    <li> Unique Values, Value Counts, and Membership
</details>

In [26]:
# Sum of columns using sum()

df = pd.DataFrame([[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
                  index=['a', 'b', 'c', 'd'],
                  columns=['one', 'two'])

print(df)

print('\nOutput of df.sum() is:')
print(df.sum())

print('\nOutput of df.mean() is:')
print(df.mean())

    one  two
a  1.40  NaN
b  7.10 -4.5
c   NaN  NaN
d  0.75 -1.3

Output of df.sum() is:
one    9.25
two   -5.80
dtype: float64

Output of df.mean() is:
one    3.083333
two   -2.900000
dtype: float64
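
Both reductions above skip `NaN` values and work down each column. As a sketch on the same data, `axis='columns'` sums across each row instead, and `skipna=False` makes any missing value propagate into the result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
                  index=['a', 'b', 'c', 'd'],
                  columns=['one', 'two'])

# Sum across each row; the all-NaN row 'c' sums to 0.0
row_sums = df.sum(axis='columns')
print(row_sums)

# With skipna=False, a single NaN makes the whole column sum NaN
strict = df.sum(skipna=False)
print(strict)
```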


In [27]:
# Correlation and Covariance

print('\nOutput of df.corr() is:')
print(df.corr())

print('\nOutput of df.cov() is:')
print(df.cov())


Output of df.corr() is:
     one  two
one  1.0 -1.0
two -1.0  1.0

Output of df.cov() is:
           one    two
one  12.205833 -10.16
two -10.160000   5.12


In [28]:
# Unique Values, Value Counts, and Membership

obj = pd.Series(['c', 'a', 'd', 'a', 'a', 'b', 'b', 'c', 'c'])
print('\n Unique values in the series are:')
print(obj.unique())

print('\n Value counts in the series are:')
print(obj.value_counts())       

mask = ['a','c']
print('\nOutput of obj.isin(mask) is:')
print(obj.isin(mask))



 Unique values in the series are:
['c' 'a' 'd' 'b']

 Value counts in the series are:
c    3
a    3
b    2
d    1
Name: count, dtype: int64

Output of obj.isin(mask) is:
0     True
1     True
2    False
3     True
4     True
5    False
6    False
7     True
8     True
dtype: bool
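
The boolean Series returned by `isin()` is most often used as a filter. A minimal sketch, reusing the same values, keeps only the entries equal to `'a'` or `'c'`:

```python
import pandas as pd

obj = pd.Series(['c', 'a', 'd', 'a', 'a', 'b', 'b', 'c', 'c'])

# Boolean indexing keeps the rows where isin() is True
selected = obj[obj.isin(['a', 'c'])]
print(selected)
```

The original integer labels are preserved in `selected`, which makes it easy to trace each kept value back to its position in `obj`.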


[Back to Contents](#top)