```[pandas] is derived from the term "panel data", an econometrics term for data sets that include observations over multiple time periods for the same individuals. — Wikipedia```

## What can pandas do?
- Calculate statistics and answer questions about the data, such as:
    - What is the average, median, max, or min of each column?
    - Does column A correlate with column B?
    - What does the distribution of data in column C look like?
- Clean the data, for example by removing missing values and filtering rows or columns by some criteria.
- Visualize the data with help from Matplotlib: plot bars, lines, histograms, bubbles, and more.
- Store the cleaned, transformed data back into a CSV file, another file format, or a database.
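
The tasks listed above can be sketched in a few lines. This is a minimal illustration on made-up data, not a full workflow: the column names `A`, `B`, and `C` and all values are assumptions chosen to mirror the questions in the list, and plotting is left out so that the example needs only pandas.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names mirror the questions above.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, None],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0],
    "C": [5, 3, 8, 1, 7],
})

# Summary statistics per column (count, mean, std, min, quartiles, max).
print(df.describe())

# Pearson correlation between A and B; rows with a missing value are skipped.
print(df["A"].corr(df["B"]))

# Clean: drop rows with missing values, then keep only rows where C > 2.
cleaned = df.dropna()
cleaned = cleaned[cleaned["C"] > 2]

# Store the cleaned data as CSV text (pass a file path to write to disk instead).
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
print(buffer.getvalue())
```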

## How to Install pandas

pip install --upgrade pandas

## Start working

In [1]:
import pandas as pd
import sys

In [2]:
print('Python: ' + sys.version.split('|')[0])
print('Pandas: ' + pd.__version__)

Python: 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)]
Pandas: 0.25.3


In [3]:
my_list = [99, 88, 77, 66, 44, 22]
print(type(my_list), my_list)

<class 'list'> [99, 88, 77, 66, 44, 22]


In [4]:
my_df = pd.DataFrame(my_list)
print(type(my_df), my_df)

<class 'pandas.core.frame.DataFrame'>     0
0  99
1  88
2  77
3  66
4  44
5  22


In [5]:
my_df

,0
0,99
1,88
2,77
3,66
4,44
5,22


In [6]:
my_df = pd.DataFrame(data=my_list)
my_df

,0
0,99
1,88
2,77
3,66
4,44
5,22


In [7]:
my_df = pd.DataFrame(data=my_list, columns=('ages',))
my_df

,ages
0,99
1,88
2,77
3,66
4,44
5,22


In [8]:
my_df = pd.DataFrame({'ages': [99, 88, 66, 22]})
my_df

,ages
0,99
1,88
2,66
3,22


In [9]:
my_df = pd.DataFrame({'ages': [99, 88, 66, 22],
                      'names': ('Ramesh', 'suresh', 'Ganesh', 'Mahesh')
                     })  # all values must have the same length
my_df

,ages,names
0,99,Ramesh
1,88,suresh
2,66,Ganesh
3,22,Mahesh
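
The same-length requirement noted in the cell above can be checked directly. The sketch below uses made-up values: plain lists of different lengths raise a `ValueError`, while wrapping each column in a `pd.Series` is one possible workaround, because Series are aligned on their index and the gaps are filled with `NaN`.

```python
import pandas as pd

error_message = None
try:
    # Three ages but only two names: pandas refuses to build the frame.
    pd.DataFrame({"ages": [99, 88, 66], "names": ["Ramesh", "suresh"]})
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

# Series of different lengths are aligned on their index;
# the missing name in row 2 becomes NaN.
padded = pd.DataFrame({
    "ages": pd.Series([99, 88, 66]),
    "names": pd.Series(["Ramesh", "suresh"]),
})
print(padded)
```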


In [10]:
my_df.to_json('persons.json')

In [11]:
import os
os.listdir()

['.ipynb_checkpoints',
 '01_pandas_csv.py',
 '02_pandas_csv.py',
 'additional_references.txt',
 'Pandas DataFrame Notes.pdf',
 'PandasPythonForDataScience.pdf',
 'Pandas_Cheat_Sheet.pdf',
 'pandas_ex1.py',
 'pandas_ex2.py',
 'pandas_material.ipynb',
 'persons.csv',
 'persons.json',
 'persons1.csv',
 'persons2.csv',
 'Python_Pandas_Cheat_Sheet_2.pdf',
 'Scikit_Learn_Cheat_Sheet_Python.pdf',
 'TODO']

In [12]:
! type persons.json

{"ages":{"0":99,"1":88,"2":66,"3":22},"names":{"0":"Ramesh","1":"suresh","2":"Ganesh","3":"Mahesh"}}
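
The file above uses the default column-oriented layout. As a hedged sketch of an alternative, `to_json` also accepts `orient="records"`, which yields one JSON object per row. When no path is given, the JSON text is returned instead of being written to a file. The frame is rebuilt here so that the example is self-contained.

```python
import io
import json

import pandas as pd

my_df = pd.DataFrame({"ages": [99, 88, 66, 22],
                      "names": ["Ramesh", "suresh", "Ganesh", "Mahesh"]})

# With no path argument, to_json returns the JSON text.
records_json = my_df.to_json(orient="records")
print(records_json)

# The text is ordinary JSON, so the standard library can parse it too.
first_row = json.loads(records_json)[0]
print(first_row)

# Round trip: parse the text back into a DataFrame.
restored = pd.read_json(io.StringIO(records_json), orient="records")
print(restored)
```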


In [13]:
my_df.to_csv('persons.csv')

In [14]:
! type persons.csv

,ages,names
0,99,Ramesh
1,88,suresh
2,66,Ganesh
3,22,Mahesh


In [15]:
my_df.to_csv('persons1.csv', index=False)

In [16]:
! type persons1.csv

ages,names
99,Ramesh
88,suresh
66,Ganesh
22,Mahesh


In [17]:
my_df.to_csv('persons2.csv', index=False, header=False)

In [18]:
! type persons2.csv

99,Ramesh
88,suresh
66,Ganesh
22,Mahesh
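
The three `to_csv` variants above can also be compared without touching the file system, since `to_csv` returns the CSV text when no path is passed. The sketch below additionally uses the `index_label` parameter to give the index column a name; the label `id` is an arbitrary choice for illustration.

```python
import pandas as pd

my_df = pd.DataFrame({"ages": [99, 88, 66, 22],
                      "names": ["Ramesh", "suresh", "Ganesh", "Mahesh"]})

# Default: the index is written as an unnamed first column.
with_index = my_df.to_csv().splitlines()

# index=False drops the index column; header=False drops the header row.
no_index = my_df.to_csv(index=False).splitlines()
data_only = my_df.to_csv(index=False, header=False).splitlines()

# index_label names the index column so it is not blank in the header.
labelled = my_df.to_csv(index_label="id").splitlines()

for name, lines in [("with_index", with_index), ("no_index", no_index),
                    ("data_only", data_only), ("labelled", labelled)]:
    print(name, "->", lines[0])
```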


### Reading Data

In [19]:
new_df = pd.read_csv('persons.csv')
new_df

,Unnamed: 0,ages,names
0,0,99,Ramesh
1,1,88,suresh
2,2,66,Ganesh
3,3,22,Mahesh
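
The extra `Unnamed: 0` column appears because `persons.csv` was written with its index. One way to avoid it, sketched below, is to tell `read_csv` that the first column is the index via `index_col=0`. The CSV text is inlined through `io.StringIO` so the example runs without the file; its content copies the `persons.csv` listing shown earlier.

```python
import io

import pandas as pd

# Same content as persons.csv, inlined so the example is self-contained.
csv_text = ",ages,names\n0,99,Ramesh\n1,88,suresh\n2,66,Ganesh\n3,22,Mahesh\n"

# Default read: the saved index comes back as a regular column.
with_extra = pd.read_csv(io.StringIO(csv_text))
print(list(with_extra.columns))

# index_col=0 uses the first column as the row index instead.
with_index = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(with_index.columns))
print(with_index)
```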


In [20]:
new_df = pd.read_csv('persons.csv', header=None)
new_df

,0,1,2
0,,ages,names
1,0.0,99,Ramesh
2,1.0,88,suresh
3,2.0,66,Ganesh
4,3.0,22,Mahesh


In [21]:
# Caution: with names= and no header=0, the file's own header row is read as the first data row.
new_df = pd.read_csv('persons.csv', names=['index', 'age', 'name'])
new_df

,index,age,name
0,,ages,names
1,0.0,99,Ramesh
2,1.0,88,suresh
3,2.0,66,Ganesh
4,3.0,22,Mahesh


In [22]:
new_df.dtypes

index    float64
age       object
name      object
dtype: object

In [23]:
new_df.age.dtypes

dtype('O')

In [24]:
new_df.name.dtypes

dtype('O')
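
The `object` dtype of `age` is a consequence of the header row `,ages,names` having been read as data. A minimal sketch of one fix, assuming the goal is to rename the columns while keeping the values numeric: pass `header=0` together with `names`, so the original header row is replaced rather than kept. The CSV text is inlined to keep the example self-contained.

```python
import io

import pandas as pd

# Same content as persons.csv, inlined so the example is self-contained.
csv_text = ",ages,names\n0,99,Ramesh\n1,88,suresh\n2,66,Ganesh\n3,22,Mahesh\n"

# header=0 marks the first line as the header to be replaced by `names`.
fixed_df = pd.read_csv(io.StringIO(csv_text), header=0,
                       names=["index", "age", "name"])

print(fixed_df)
print(fixed_df.dtypes)
```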

#### Analyzing Data

In [25]:
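# Method 1: sort in descending order and take the first row.
# The 'age' column holds strings here, so the header text 'ages' sorts first.
sorted_df = new_df.sort_values(['age'], ascending=False)
sorted_df.head(1)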

,index,age,name
0,,ages,names


In [27]:
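# Method 2: take the column maximum directly.
# Strings compare lexicographically, so the result is 'ages' rather than 99.
new_df['age'].max()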

'ages'
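
Both methods return the header text because the `age` column contains strings. If the frame has already been loaded this way, one possible repair (a sketch, not the only option) is to convert the column with `pd.to_numeric(errors="coerce")`, which turns non-numeric entries into `NaN`, and then drop those rows before looking for the maximum.

```python
import io

import pandas as pd

# Same content as persons.csv, inlined so the example is self-contained.
csv_text = ",ages,names\n0,99,Ramesh\n1,88,suresh\n2,66,Ganesh\n3,22,Mahesh\n"

# Reproduce the problem: the header row ends up as the first data row.
raw_df = pd.read_csv(io.StringIO(csv_text), names=["index", "age", "name"])

# Convert to numbers; the text 'ages' cannot be parsed and becomes NaN.
raw_df["age"] = pd.to_numeric(raw_df["age"], errors="coerce")

# Drop the rows whose age could not be parsed.
clean_df = raw_df.dropna(subset=["age"])

# idxmax gives the row label of the largest age; loc fetches that row.
oldest = clean_df.loc[clean_df["age"].idxmax()]
print(oldest["name"], oldest["age"])
```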

In [28]:
matches_df = pd.read_csv('matches.csv')

In [29]:
matches_df

,id,season,city,date,team1,team2,toss_winner,toss_decision,result,dl_applied,winner,win_by_runs,win_by_wickets,player_of_match,venue,umpire1,umpire2,umpire3
0,1,2017,Hyderabad,05-04-17,Sunrisers Hyderabad,Royal Challengers Bangalore,Royal Challengers Bangalore,field,normal,0,Sunrisers Hyderabad,35,0,Yuvraj Singh,"Rajiv Gandhi International Stadium, Uppal",AY Dandekar,NJ Llong,
1,2,2017,Pune,06-04-17,Mumbai Indians,Rising Pune Supergiant,Rising Pune Supergiant,field,normal,0,Rising Pune Supergiant,0,7,SPD Smith,Maharashtra Cricket Association Stadium,A Nand Kishore,S Ravi,
2,3,2017,Rajkot,07-04-17,Gujarat Lions,Kolkata Knight Riders,Kolkata Knight Riders,field,normal,0,Kolkata Knight Riders,0,10,CA Lynn,Saurashtra Cricket Association Stadium,Nitin Menon,CK Nandan,
3,4,2017,Indore,08-04-17,Rising Pune Supergiant,Kings XI Punjab,Kings XI Punjab,field,normal,0,Kings XI Punjab,0,6,GJ Maxwell,Holkar Cricket Stadium,AK Chaudhary,C Shamshuddin,
4,5,2017,Bangalore,08-04-17,Royal Challengers Bangalore,Delhi Daredevils,Royal Challengers Bangalore,bat,normal,0,Royal Challengers Bangalore,15,0,KM Jadhav,M Chinnaswamy Stadium,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
631,632,2016,Raipur,22-05-16,Delhi Daredevils,Royal Challengers Bangalore,Royal Challengers Bangalore,field,normal,0,Royal Challengers Bangalore,0,6,V Kohli,Shaheed Veer Narayan Singh International Stadium,A Nand Kishore,BNJ Oxenford,
632,633,2016,Bangalore,24-05-16,Gujarat Lions,Royal Challengers Bangalore,Royal Challengers Bangalore,field,normal,0,Royal Challengers Bangalore,0,4,AB de Villiers,M Chinnaswamy Stadium,AK Chaudhary,HDPK Dharmasena,
633,634,2016,Delhi,25-05-16,Sunrisers Hyderabad,Kolkata Knight Riders,Kolkata Knight Riders,field,normal,0,Sunrisers Hyderabad,22,0,MC Henriques,Feroz Shah Kotla,M Erasmus,C Shamshuddin,
634,635,2016,Delhi,27-05-16,Gujarat Lions,Sunrisers Hyderabad,Sunrisers Hyderabad,field,normal,0,Sunrisers Hyderabad,0,4,DA Warner,Feroz Shah Kotla,M Erasmus,CK Nandan,


In [34]:
matches_df.groupby('season').count()

season,id,city,date,team1,team2,toss_winner,toss_decision,result,dl_applied,winner,win_by_runs,win_by_wickets,player_of_match,venue,umpire1,umpire2,umpire3
2008,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,0
2009,57,57,57,57,57,57,57,57,57,57,57,57,57,57,57,57,0
2010,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,0
2011,73,73,73,73,73,73,73,73,73,72,73,73,72,73,73,73,0
2012,74,74,74,74,74,74,74,74,74,74,74,74,74,74,74,74,0
2013,76,76,76,76,76,76,76,76,76,76,76,76,76,76,76,76,0
2014,60,53,60,60,60,60,60,60,60,60,60,60,60,60,60,60,0
2015,59,59,59,59,59,59,59,59,59,57,59,59,57,59,59,59,0
2016,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,60,0
2017,59,59,59,59,59,59,59,59,59,59,59,59,59,59,58,58,0
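
In the table above, the counts differ between columns within a season (for example `city` in 2014 and `umpire3` everywhere) because `count()` counts only non-missing values per column. When the number of rows per group is what matters, `size()` is an alternative. The sketch below uses a small made-up frame, not the real `matches.csv`; its values are placeholders.

```python
import pandas as pd

# Hypothetical miniature data set with a few missing values.
mini = pd.DataFrame({
    "season": [2014, 2014, 2014, 2015, 2015],
    "city": ["city_a", None, "city_b", "city_c", "city_d"],
    "winner": ["team_a", "team_b", "team_c", "team_d", None],
})

# count(): non-missing values per column and group.
per_column = mini.groupby("season").count()
print(per_column)

# size(): number of rows per group, regardless of missing values.
per_group = mini.groupby("season").size()
print(per_group)
```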
