<h1 style="color:Orange; font-weight:900; text-align:center">Pandas Intro</h1>

Pandas is a ***`Python library`*** used for **`data manipulation`** and **`analysis`**. It provides data structures like **DataFrames** and **Series** for handling and analyzing structured data efficiently. In data science, it is crucial for:

- ***Data Cleaning***: Handling missing data, filtering, and transforming datasets.
- ***Data Exploration***: Summarizing, visualizing, and understanding data patterns.
- ***Data Transformation***: Merging, reshaping, and aggregating datasets.

Pandas simplifies working with tabular datasets that fit in memory and is widely used to prepare data for machine learning models.
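
As a rough illustration of the three tasks listed above, the sketch below runs one step of each on a small made-up table. The `sales` frame, its column names, and its values are assumptions invented for this example; they are not data from this notebook.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "units": [10, None, 7, 5],
})

# Data cleaning: drop the row whose units value is missing.
clean = sales.dropna(subset=["units"])

# Data exploration: summary statistics for the numeric column.
summary = clean["units"].describe()

# Data transformation: aggregate units per store.
per_store = clean.groupby("store")["units"].sum()

print(summary)
print(per_store)
```

Each step returns a new object, so the original `sales` frame is left unchanged.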
<br/>
<br>
<img src="pandas_uses.png" alt="pandas_uses" style="display: block; margin-left: auto; margin-right: auto; width: 100%; max-width: 400px;">
<br/>

In [1]:
import pandas as pd

### ***1. Creating a simple DataFrame from Python Data Structures***

In [2]:
dict1 = {
    "name":["Akash","Shruti","Akhi","Puchku","Golu"],
    "marks":[99,80,98,99,97],
    "city":["Kasba","NewYork","Hridaypur","Satragachi","Barasat"],
}

In [3]:
df = pd.DataFrame(dict1)
df

,name,marks,city
0,Akash,99,Kasba
1,Shruti,80,NewYork
2,Akhi,98,Hridaypur
3,Puchku,99,Satragachi
4,Golu,97,Barasat
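
A dictionary of lists is only one possible starting point. The sketch below builds a frame from a list of row dictionaries and a single column as a `Series`, reusing the first two records from the cell above. The names `rows`, `df_rows`, and `marks` are chosen here for illustration.

```python
import pandas as pd

# The first two records from above, written as one dictionary per row.
rows = [
    {"name": "Akash", "marks": 99, "city": "Kasba"},
    {"name": "Shruti", "marks": 80, "city": "NewYork"},
]
df_rows = pd.DataFrame(rows)

# A single column of a DataFrame is a Series; one can also be built directly.
marks = pd.Series([99, 80], name="marks")

print(df_rows)
print(marks.tolist() == df_rows["marks"].tolist())
```

With a list of dictionaries the keys become column names, which can be convenient when the records arrive one at a time.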


In [4]:
df.to_csv('friends.csv', index=False)
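
To see what `index=False` changes, the sketch below writes a small frame to CSV text in memory instead of to a file. `df_demo` and the other variable names are assumptions made for this example.

```python
import io
import pandas as pd

df_demo = pd.DataFrame({"name": ["Akash", "Shruti"], "marks": [99, 80]})

# With index=False only the columns are written.
without_index = df_demo.to_csv(index=False)

# With the default index=True the row labels become an extra, unnamed first column.
with_index = df_demo.to_csv()

# Reading the second form back turns that unnamed column into "Unnamed: 0".
reloaded = pd.read_csv(io.StringIO(with_index))
print(without_index.splitlines()[0])
print(reloaded.columns.tolist())
```

Passing `index=False` is usually what you want when the index is just the default 0, 1, 2, ... row numbering.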

In [5]:
df.head(3)

,name,marks,city
0,Akash,99,Kasba
1,Shruti,80,NewYork
2,Akhi,98,Hridaypur


In [6]:
df.tail(2)

,name,marks,city
3,Puchku,99,Satragachi
4,Golu,97,Barasat


In [7]:
df.describe()

,marks
count,5.0
mean,94.6
std,8.203658
min,80.0
25%,97.0
50%,98.0
75%,99.0
max,99.0
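
The rows of `describe()` can be reproduced one at a time, which helps when reading the table. This sketch recomputes two of them for the `marks` values used above.

```python
import pandas as pd

marks = pd.Series([99, 80, 98, 99, 97], name="marks")

# describe() reports the sample standard deviation (ddof=1), the same as Series.std().
print(marks.std())

# The 25%, 50% and 75% rows are quantiles of the column.
print(marks.quantile([0.25, 0.5, 0.75]))
```

Only numeric columns appear by default, which is why `name` and `city` are absent from the summary.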


### ***2. Reading a CSV file and performing Data Manipulation***

In [8]:
df1 = pd.read_csv("nyc_weather.csv")
df1.head(10)

,EST,Temperature,DewPoint,Humidity,Sea Level PressureIn,VisibilityMiles,WindSpeedMPH,PrecipitationIn,CloudCover,Events,WindDirDegrees
0,1/1/2016,38,23,52,30.03,10,8.0,0,5,,281
1,1/2/2016,36,18,46,30.02,10,7.0,0,3,,275
2,1/3/2016,40,21,47,29.86,10,8.0,0,1,,277
3,1/4/2016,25,9,44,30.05,10,9.0,0,3,,345
4,1/5/2016,20,-3,41,30.57,10,5.0,0,0,,333
5,1/6/2016,33,4,35,30.5,10,4.0,0,0,,259
6,1/7/2016,39,11,33,30.28,10,2.0,0,3,,293
7,1/8/2016,39,29,64,30.2,10,4.0,0,8,,79
8,1/9/2016,44,38,77,30.16,9,8.0,T,8,Rain,76
9,1/10/2016,50,46,71,29.59,4,,1.8,7,Rain,109


In [9]:
print(df1['Temperature'].max())

50
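
`max()` gives the value but not the day it occurred on. A possible way to get the whole row is `idxmax()`, sketched here on three rows copied from the output above; `sample` and `hottest` are names chosen for this example.

```python
import pandas as pd

# First three rows of the weather output above (two columns only).
sample = pd.DataFrame({
    "EST": ["1/1/2016", "1/2/2016", "1/3/2016"],
    "Temperature": [38, 36, 40],
})

# idxmax() returns the index label of the row holding the maximum.
hottest = sample.loc[sample["Temperature"].idxmax()]
print(hottest["EST"], hottest["Temperature"])
```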


In [10]:
df1.loc[df1['Events'] == 'Rain', 'EST']

8      1/9/2016
9     1/10/2016
15    1/16/2016
26    1/27/2016
Name: EST, dtype: object
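
The filter above relies on a boolean mask. The sketch below spells out the steps on three rows copied from the `head(10)` output; the full file is not reproduced, and the names `sample`, `is_rain`, and `warm_rain` are assumptions for illustration.

```python
import pandas as pd

# Rows 7 to 9 of the weather output above (three columns only).
sample = pd.DataFrame({
    "EST": ["1/8/2016", "1/9/2016", "1/10/2016"],
    "Temperature": [39, 44, 50],
    "Events": [None, "Rain", "Rain"],
})

# Boolean mask: True where the event is Rain (missing values compare as False).
is_rain = sample["Events"] == "Rain"

# .loc selects rows by the mask and the column by label in one step.
rain_days = sample.loc[is_rain, "EST"]
print(rain_days.tolist())

# Masks combine with & (and) and | (or); each comparison needs its own parentheses.
warm_rain = sample.loc[is_rain & (sample["Temperature"] >= 50), "EST"]
print(warm_rain.tolist())
```

Selecting with a single `.loc[rows, column]` call avoids the chained `df[col][mask]` form, which can trigger warnings when used for assignment.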

In [11]:
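# Replace every missing value with 0 (this also fills the text column Events with 0).
df1.fillna(0, inplace=True)
# Missing wind speeds now count as 0, so this mean is lower than the NaN-skipping default.
print(df1["WindSpeedMPH"].mean())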

6.225806451612903
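
Filling with 0 before averaging changes the result, because `mean()` skips missing values by default. The sketch below shows the difference on a made-up series; the values in `wind` are hypothetical, not taken from `nyc_weather.csv`.

```python
import pandas as pd

# Hypothetical wind speeds with one missing reading.
wind = pd.Series([8.0, 7.0, None, 5.0])

# Default: mean() skips missing values, so it averages the three known readings.
mean_skipping_nan = wind.mean()

# After fillna(0) the missing reading counts as a zero and pulls the mean down.
mean_after_fill = wind.fillna(0).mean()

print(mean_skipping_nan, mean_after_fill)
```

Which of the two is appropriate depends on whether a missing reading really means "no wind" or simply "not recorded".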
