> [pandas] is derived from the term "panel data", an econometrics term for data sets that include observations over multiple time periods for the same individuals. (Wikipedia)

## What can pandas do?
- Calculate statistics and answer questions about the data, such as:
    - What's the average, median, max, or min of each column?
    - Does column A correlate with column B?
    - What does the distribution of data in column C look like?
- Clean the data, for example by removing missing values and filtering rows or columns by some criteria.
- Visualize the data with help from Matplotlib: plot bars, lines, histograms, bubbles, and more.
- Store the cleaned, transformed data back into a CSV file, another file format, or a database.
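
The capabilities listed above can be tried on a tiny frame. This is only a minimal sketch: the column names `a`, `b`, `c` and their values are invented for illustration and do not come from any dataset used in this notebook.

```python
import pandas as pd

# Hypothetical toy data; names and values are illustrative only.
df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [5.0, None, 7.0, 9.0]}
)

print(df.describe())                   # count, mean, std, min, quartiles, max per column
correlation = df["a"].corr(df["b"])    # Pearson correlation between two columns
print(correlation)

cleaned = df.dropna()                  # drop rows that contain a missing value
filtered = cleaned[cleaned["a"] > 1]   # keep only rows matching a condition

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = filtered.to_csv(index=False)
print(csv_text)
```

Plotting is left out here because it needs Matplotlib installed; `df.plot()` is the usual entry point once it is.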

## How to Install pandas

In [6]:
!pip install --upgrade pandas



## Start working

In [7]:
import pandas as pd
import sys

In [8]:
print("Python: " + sys.version.split("|")[0])
print("Pandas: " + pd.__version__)

Python: 3.11.2 (tags/v3.11.2:878ead1, Feb  7 2023, 16:38:35) [MSC v.1934 64 bit (AMD64)]
Pandas: 2.0.0


In [9]:
my_list = [99, 88, 77, 66, 44, 22]
print(type(my_list), my_list)

<class 'list'> [99, 88, 77, 66, 44, 22]


In [10]:
my_df = pd.DataFrame(my_list)
print(type(my_df))
print(my_df)

<class 'pandas.core.frame.DataFrame'>
    0
0  99
1  88
2  77
3  66
4  44
5  22


In [11]:
my_df

Unnamed: 0,0
0,99
1,88
2,77
3,66
4,44
5,22


In [12]:
my_df = pd.DataFrame(data=my_list)
my_df

Unnamed: 0,0
0,99
1,88
2,77
3,66
4,44
5,22


In [13]:
my_df = pd.DataFrame(data=my_list, columns=("ages",))
my_df

Unnamed: 0,ages
0,99
1,88
2,77
3,66
4,44
5,22


In [14]:
my_df = pd.DataFrame({"ages": [99, 88, 77, 66, 44, 22]})
my_df

Unnamed: 0,ages
0,99
1,88
2,77
3,66
4,44
5,22


In [15]:
my_df = pd.DataFrame(
    {"ages": [99, 88, 66, 22], "names": ("Ramesh", "suresh", "Ganesh", "Mahesh")}
)  # all values must have the same length
my_df

Unnamed: 0,ages,names
0,99,Ramesh
1,88,suresh
2,66,Ganesh
3,22,Mahesh
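
The same-length requirement noted in the cell above can be checked directly. A small sketch, using deliberately mismatched lists (three ages, two names) that are made up for this illustration:

```python
import pandas as pd

# Three ages but only two names: the lengths do not match on purpose.
try:
    pd.DataFrame({"ages": [99, 88, 66], "names": ("Ramesh", "suresh")})
except ValueError as ex:
    error_message = str(ex)
    print("Could not build the DataFrame:", error_message)
```

pandas raises a `ValueError` here rather than padding the shorter column, so missing values have to be supplied explicitly (for example as `None`).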


In [16]:
my_df.to_json("persons.json")

In [17]:
import os

os.listdir()

['0-todo',
 '000_pandas_material.ipynb',
 '00_pandas.ipynb',
 '01_Loading_data_from_different_data_sources.ipynb',
 '01_pandas_csv.py',
 '02_Overview_of_data.ipynb',
 '02_pandas_csv.py',
 '03a_flatten_json.ipynb',
 '03_Selecting_in_data.ipynb',
 '04_Sorting_data.ipynb',
 '05_Filtering_data.ipynb',
 '06_Aggregation_Of_Data.ipynb',
 '07_Groupby.ipynb',
 '08_Apply.ipynb',
 '09_Merge_dataframes.ipynb',
 '10_Visualization_Of_Data.ipynb',
 '11_Saving_data.ipynb',
 '12_Cleanup_Data.ipynb',
 '13_Time series solution.ipynb',
 '14_Project 1 - google apps store.ipynb',
 '15_Project 2 - ted talk dataset.ipynb',
 '16_Project 3 - fifa19.ipynb',
 '17_Project 4 - Ted talk dataset.ipynb',
 'additional_references.txt',
 'add_sno_column.py',
 'Christmas Tree Animation.ipynb',
 'datasets',
 'groupby-aggregate.ipynb',
 'Lucida Calligraphy Italic.ttf',
 'Nightingale.ipynb',
 'output_datasets',
 'pandas_ex1.py',
 'pandas_ex2.py',
 'pandas_material.ipynb',
 'pandas_vs_SQL.ipynb',
 'persons.json',
 'Practical 

In [18]:
! type persons.json

{"ages":{"0":99,"1":88,"2":66,"3":22},"names":{"0":"Ramesh","1":"suresh","2":"Ganesh","3":"Mahesh"}}
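
The default JSON layout shown above is column-oriented. If one object per row is easier to consume, `to_json` also accepts `orient="records"`. A minimal sketch that rebuilds the same frame and returns the JSON as a string (no file is written):

```python
import json

import pandas as pd

my_df = pd.DataFrame(
    {"ages": [99, 88, 66, 22], "names": ("Ramesh", "suresh", "Ganesh", "Mahesh")}
)

# With no path argument, to_json returns the JSON text.
records_json = my_df.to_json(orient="records")

# Parse it back with the standard library to inspect the structure.
records = json.loads(records_json)
print(records[0])
```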


In [19]:
my_df.to_csv("persons.csv")

In [20]:
! type persons.csv

,ages,names
0,99,Ramesh
1,88,suresh
2,66,Ganesh
3,22,Mahesh


In [21]:
my_df.to_csv("persons1.csv", index=False)

In [22]:
! type persons1.csv

ages,names
99,Ramesh
88,suresh
66,Ganesh
22,Mahesh


In [23]:
my_df.to_csv("persons2.csv", index=False, header=False)

In [24]:
! type persons2.csv

99,Ramesh
88,suresh
66,Ganesh
22,Mahesh
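
The `! type` shell command used above is specific to Windows (`cat` is the usual equivalent on Linux and macOS). A portable alternative is to read the file from Python. The sketch below assumes a throwaway temporary directory and a shortened two-row frame so that it does not touch the files created earlier:

```python
import tempfile
from pathlib import Path

import pandas as pd

my_df = pd.DataFrame({"ages": [99, 88], "names": ["Ramesh", "suresh"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "persons2.csv"
    my_df.to_csv(csv_path, index=False, header=False)
    contents = csv_path.read_text()  # works the same on any operating system

print(contents)
```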


### Reading Data

In [25]:
new_df = pd.read_csv("persons.csv")
new_df

Unnamed: 0.1,Unnamed: 0,ages,names
0,0,99,Ramesh
1,1,88,suresh
2,2,66,Ganesh
3,3,22,Mahesh
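
The extra `Unnamed: 0` column appears because `persons.csv` was written with its index, and `read_csv` treats that first column as ordinary data. One way to avoid it is `index_col=0`. A minimal sketch, using an in-memory copy of the file contents shown earlier so it runs on its own:

```python
import io

import pandas as pd

# Same text as persons.csv above, held in memory for a self-contained example.
csv_text = ",ages,names\n0,99,Ramesh\n1,88,suresh\n2,66,Ganesh\n3,22,Mahesh\n"

# index_col=0 tells read_csv to use the first column as the row index.
new_df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(new_df.columns.tolist())
print(new_df)
```

Writing the file with `index=False`, as done for `persons1.csv`, avoids the problem at the source.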


In [26]:
new_df = pd.read_csv("persons.csv", header=None)
new_df

Unnamed: 0,0,1,2
0,,ages,names
1,0.0,99,Ramesh
2,1.0,88,suresh
3,2.0,66,Ganesh
4,3.0,22,Mahesh


In [27]:
new_df = pd.read_csv("persons.csv", names=["index", "age", "name"])
new_df

Unnamed: 0,index,age,name
0,,ages,names
1,0.0,99,Ramesh
2,1.0,88,suresh
3,2.0,66,Ganesh
4,3.0,22,Mahesh


In [28]:
new_df.dtypes

index    float64
age       object
name      object
dtype: object

In [29]:
new_df.age.dtypes

dtype('O')

In [30]:
new_df.name.dtypes

dtype('O')

#### Analyzing Data

In [31]:
# Method 1: sort by age in descending order and take the first row
sorted_df = new_df.sort_values(["age"], ascending=False)
sorted_df.head(1)

Unnamed: 0,index,age,name
0,,ages,names


In [32]:
# Method 2: take the maximum of the column directly
new_df["age"].max()

'ages'
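
Both methods return the header text rather than an age. The reason is visible in the earlier cells: passing `names=` on its own makes `read_csv` treat the file's header line as a data row, so the `age` column is read as strings (`object` dtype). A sketch of one possible fix, assuming the same file contents held in memory, is to add `header=0` so the original header row is replaced instead of kept:

```python
import io

import pandas as pd

# Same text as persons.csv above, held in memory for a self-contained example.
csv_text = ",ages,names\n0,99,Ramesh\n1,88,suresh\n2,66,Ganesh\n3,22,Mahesh\n"

# header=0 marks the first line as the header; names= then replaces it.
new_df = pd.read_csv(
    io.StringIO(csv_text), header=0, names=["index", "age", "name"]
)
print(new_df.dtypes)

# Method 1: sort and take the top row.
oldest = new_df.sort_values("age", ascending=False).head(1)
print(oldest)

# Method 2: maximum of the numeric column.
max_age = new_df["age"].max()
print(max_age)
```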

In [34]:
try:
    matches_df = pd.read_csv("matches.csv")
except FileNotFoundError as ex:
    print(repr(ex))

FileNotFoundError(2, 'No such file or directory')


In [35]:
try:
    matches_df = pd.read_csv("matches.csv")
except OSError as ex:
    print(repr(ex))

FileNotFoundError(2, 'No such file or directory')
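
Both cells print the same exception because `FileNotFoundError` is a subclass of `OSError`, so the broader `except OSError` clause catches it too. The sketch below checks that relationship and wraps the pattern in a helper; `load_csv_or_none` and the file name are hypothetical, introduced only for this illustration:

```python
import pandas as pd

# FileNotFoundError sits below OSError in the built-in exception hierarchy.
is_subclass = issubclass(FileNotFoundError, OSError)
print(is_subclass)


def load_csv_or_none(path):
    """Hypothetical helper: return a DataFrame, or None if the file is missing."""
    try:
        return pd.read_csv(path)
    except FileNotFoundError as ex:
        print(repr(ex))
        return None


# Assumed not to exist in the working directory.
result = load_csv_or_none("definitely_missing_file.csv")
```

Catching the narrower `FileNotFoundError` lets other I/O problems, such as permission errors, surface instead of being silently handled.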
