# Working with pandas

**First, import the pandas library.**

In [None]:
import pandas as pd

**[Pandas Documentation](https://pandas.pydata.org/docs/)**

**Pandas Series**

In [None]:
# creating a series from a list, with explicit index labels
a = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
print(f'The pandas series with index a,b,c is \n{a}')
print(f'\nJust for the peace of mind, type of the variable storing the series is:\n{type(a)}')

The pandas series with index a,b,c is 
a    1
b    2
c    3
dtype: int64

Just for the peace of mind, type of the variable storing the series is:
<class 'pandas.core.series.Series'>


In [None]:
# creating a series from a dictionary
a = pd.Series({'a': 1, 'b': 2, 'c':3})
print(f'The pandas series with index a,b,c is \n{a}')
print(f'\nJust for the peace of mind, type of the variable storing the series is:\n{type(a)}')

The pandas series with index a,b,c is 
a    1
b    2
c    3
dtype: int64

Just for the peace of mind, type of the variable storing the series is:
<class 'pandas.core.series.Series'>
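
When a dictionary is passed together with an explicit `index`, pandas looks each label up in the dictionary instead of keeping the dictionary's own order. The following is a minimal sketch of that behaviour; the names `data` and `reordered` and the label `'z'` are made up for illustration.

```python
import pandas as pd

data = {'a': 1, 'b': 2, 'c': 3}

# The index selects and orders the dictionary keys.
# 'z' is not a key in the dictionary, so its value becomes NaN,
# which also turns the dtype from int64 into float64.
reordered = pd.Series(data, index=['c', 'a', 'z'])
print(reordered)
```

Labels missing from the dictionary are kept in the result rather than raising an error, so it is worth checking for NaN values after building a series this way.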


In [None]:
# accessing the values in a series by their index labels
print(f'Value of 1st index = {a["a"]}')
print(f'Value of 2nd index = {a["b"]}')
print(f'Value of 3rd index = {a["c"]}')

Value of 1st index = 1
Value of 2nd index = 2
Value of 3rd index = 3
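
The cell above looks values up by label. A series can also be indexed by position. The sketch below rebuilds the same series under the illustrative name `s` and contrasts `.loc` (label-based) with `.iloc` (position-based, counting from 0).

```python
import pandas as pd

# Same data as the series created above
s = pd.Series({'a': 1, 'b': 2, 'c': 3})

by_label = s.loc['b']        # look up by index label
by_position = s.iloc[1]      # look up by position (0-based), same element here
subset = s.loc[['a', 'c']]   # a list of labels returns a smaller Series

print(by_label, by_position)
print(subset)
```

Using `.loc` and `.iloc` explicitly makes it clear whether a label or a position is meant, which matters once the index labels are themselves integers.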


**Pandas DataFrame**

In [None]:
# creating a dataframe
dictionary = {
    'column1': [1, 2, 3],
    'column2': [4, 5, 6],
    'column3': [7, 8, 9],
    'column4': [10, 11, 12]
}
our_df = pd.DataFrame(dictionary, index=["index1", "index2", "index3"])
print(f'The data frame we created :\n{our_df}')

The data frame we created :
        column1  column2  column3  column4
index1        1        4        7       10
index2        2        5        8       11
index3        3        6        9       12


In [None]:
# accessing individual columns of the dataframe created above
print(f'The first column: \n{our_df.column1}')
print(f'\nThe second column: \n{our_df.column2}')
print(f'\nThe third column: \n{our_df.column3}')

The first column: 
index1    1
index2    2
index3    3
Name: column1, dtype: int64

The second column: 
index1    4
index2    5
index3    6
Name: column2, dtype: int64

The third column: 
index1    7
index2    8
index3    9
Name: column3, dtype: int64
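
Attribute access such as `our_df.column1` only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method. Bracket notation has no such restriction. The sketch below rebuilds the same dataframe and shows bracket-based column selection together with `.loc` for rows and single cells; the variable names `col`, `two_cols`, `row`, and `cell` are illustrative.

```python
import pandas as pd

# Same dataframe as above
our_df = pd.DataFrame(
    {
        'column1': [1, 2, 3],
        'column2': [4, 5, 6],
        'column3': [7, 8, 9],
        'column4': [10, 11, 12],
    },
    index=['index1', 'index2', 'index3'],
)

col = our_df['column1']                    # one column -> Series
two_cols = our_df[['column1', 'column4']]  # list of columns -> DataFrame
row = our_df.loc['index2']                 # one row by label -> Series
cell = our_df.loc['index2', 'column3']     # single value by row and column label

print(two_cols)
print(cell)
```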


In [None]:
# inspect the dimensions of the dataframe
print(f'The number of rows and columns (shape of the df) in this dataframe = {our_df.shape}')
print(f'The number of rows in this dataframe = {our_df.shape[0]}')
print(f'The number of columns in this dataframe = {our_df.shape[1]}')
print(f'The number of elements in this dataframe = {our_df.size}')

The number of rows and columns (shape of the df) in this dataframe = (3, 4)
The number of rows in this dataframe = 3
The number of columns in this dataframe = 4
The number of elements in this dataframe = 12
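
The values above are related: `size` is the number of rows multiplied by the number of columns. As a small sketch on the same data (variable names are illustrative), the shape tuple can be unpacked directly, and `len()` and `.columns` give the row count and the column labels.

```python
import pandas as pd

our_df = pd.DataFrame(
    {
        'column1': [1, 2, 3],
        'column2': [4, 5, 6],
        'column3': [7, 8, 9],
        'column4': [10, 11, 12],
    },
    index=['index1', 'index2', 'index3'],
)

n_rows, n_cols = our_df.shape          # unpack (rows, columns)
column_names = list(our_df.columns)    # column labels as a plain list

print(n_rows, n_cols, our_df.size)
print(len(our_df))                     # len() of a dataframe is its row count
print(column_names)
```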


**There are more exciting features to explore. For that, we would need a larger dataset.**<br>
**Let us use this dataset available at** [https://api.covid19india.org/csv/latest/state_wise.csv](https://api.covid19india.org/csv/latest/state_wise.csv)

In [None]:
# load the CSV file from the URL above into a dataframe
df = pd.read_csv("https://api.covid19india.org/csv/latest/state_wise.csv")
df

Unnamed: 0,State,Confirmed,Recovered,Deaths,Active,Last_Updated_Time,Migrated_Other,State_code,Delta_Confirmed,Delta_Recovered,Delta_Deaths,State_Notes
0,Total,1558447,999318,34485,524220,29/07/2020 18:24:28,424,TT,26313,10536,257,
1,Maharashtra,391440,232277,14165,144694,28/07/2020 20:50:34,304,MH,0,0,0,304 cases are marked as non-covid deaths in MH...
2,Tamil Nadu,227688,166956,3659,57073,28/07/2020 19:30:28,0,TN,0,0,0,[July 22]: 444 backdated deceased entries adde...
3,Delhi,133310,118633,3907,10770,29/07/2020 18:24:32,0,DL,1035,1126,26,[July 14]: Value for the total tests conducted...
4,Karnataka,112504,42901,2147,67447,29/07/2020 18:24:33,9,KA,5503,2397,90,9 cases are classified as non-covid related de...
5,Andhra Pradesh,120390,55406,1213,63771,29/07/2020 18:24:35,0,AP,10093,2784,65,Total includes patients from other states and ...
6,Uttar Pradesh,77334,45807,1530,29997,29/07/2020 16:56:30,0,UP,3383,1287,33,
7,Gujarat,57982,42514,2368,13100,28/07/2020 21:07:27,0,GJ,0,0,0,
8,West Bengal,62964,42022,1449,19493,28/07/2020 21:07:29,0,WB,0,0,0,
9,Telangana,58906,43751,492,14663,29/07/2020 10:17:27,0,TG,1764,842,12,[July 27]\nTelangana bulletin for the previous...
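
`pd.read_csv` accepts file-like objects as well as URLs and paths, which is handy for trying things out when the URL is unreachable (for example, when working offline). The sketch below is an assumption-laden stand-in, not the real dataset: it feeds `read_csv` a few rows and a subset of columns copied from the output above, through `io.StringIO`. The names `csv_text` and `sample_df` are illustrative.

```python
import io

import pandas as pd

# A few rows and columns copied from the output above, as in-memory CSV text
csv_text = """State,Confirmed,Recovered,Deaths,Active
Total,1558447,999318,34485,524220
Maharashtra,391440,232277,14165,144694
Tamil Nadu,227688,166956,3659,57073
"""

# StringIO wraps the text in a file-like object that read_csv can consume
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df)
print(sample_df.shape)
```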


In [None]:
# inspect the dimensions of the loaded dataframe
print(f'The number of rows and columns (shape of the df) in this dataframe = {df.shape}')
print(f'The number of rows in this dataframe = {df.shape[0]}')
print(f'The number of columns in this dataframe = {df.shape[1]}')
print(f'The number of elements in this dataframe = {df.size}')

The number of rows and columns (shape of the df) in this dataframe = (38, 12)
The number of rows in this dataframe = 38
The number of columns in this dataframe = 12
The number of elements in this dataframe = 456


In [None]:
# first five rows of the dataframe
df.head()  # by default it returns the first five rows
# pass the number of rows you want in the parentheses, e.g. df.head(2)

Unnamed: 0,State,Confirmed,Recovered,Deaths,Active,Last_Updated_Time,Migrated_Other,State_code,Delta_Confirmed,Delta_Recovered,Delta_Deaths,State_Notes
0,Total,1558447,999318,34485,524220,29/07/2020 18:24:28,424,TT,26313,10536,257,
1,Maharashtra,391440,232277,14165,144694,28/07/2020 20:50:34,304,MH,0,0,0,304 cases are marked as non-covid deaths in MH...
2,Tamil Nadu,227688,166956,3659,57073,28/07/2020 19:30:28,0,TN,0,0,0,[July 22]: 444 backdated deceased entries adde...
3,Delhi,133310,118633,3907,10770,29/07/2020 18:24:32,0,DL,1035,1126,26,[July 14]: Value for the total tests conducted...
4,Karnataka,112504,42901,2147,67447,29/07/2020 18:24:33,9,KA,5503,2397,90,9 cases are classified as non-covid related de...


In [None]:
# last five rows of the dataframe
df.tail()  # by default it returns the last five rows
# pass the number of rows you want in the parentheses, e.g. df.tail(2)

Unnamed: 0,State,Confirmed,Recovered,Deaths,Active,Last_Updated_Time,Migrated_Other,State_code,Delta_Confirmed,Delta_Recovered,Delta_Deaths,State_Notes
33,Sikkim,592,186,1,392,29/07/2020 15:45:34,13,SK,0,0,0,13 Migrated cases reduced from Active count
34,Mizoram,395,215,0,180,29/07/2020 17:39:35,0,MZ,11,17,0,
35,Andaman and Nicobar Islands,390,196,1,192,29/07/2020 09:10:30,1,AN,27,0,0,
36,State Unassigned,0,0,0,0,19/07/2020 09:40:01,0,UN,0,0,0,MoHFW website reports that these are the 'case...
37,Lakshadweep,0,0,0,0,26/03/2020 07:19:29,0,LD,0,0,0,
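
Both `head` and `tail` take an optional row count. The sketch below uses a small made-up dataframe (`demo`, with a single `value` column) rather than the COVID dataset, so that it runs without network access.

```python
import pandas as pd

# Hypothetical ten-row dataframe, used only to illustrate head/tail
demo = pd.DataFrame({'value': range(10)})

first_two = demo.head(2)       # first 2 rows
last_three = demo.tail(3)      # last 3 rows
all_but_last_two = demo.head(-2)  # negative n: everything except the last 2 rows

print(first_two)
print(last_three)
print(len(all_but_last_two))
```

Both methods return a new dataframe and keep the original index labels, so the rows from `tail(3)` are still labelled 7, 8, and 9.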
