<div align="center"> <h1> <font color=#594D5B> Pandas </font> </h1> </div>

In [1]:
import pandas as pd

In [36]:
# reading from a csv file
df = pd.read_csv('./Data/weather.csv')

In [3]:
df

Unnamed: 0,date,maximum temperature,minimum temperature,average temperature,precipitation,snow fall,snow depth
0,1-1-2016,42,34,38.0,0.00,0.0,0
1,2-1-2016,40,32,36.0,0.00,0.0,0
2,3-1-2016,45,35,40.0,0.00,0.0,0
3,4-1-2016,36,14,25.0,0.00,0.0,0
4,5-1-2016,29,11,20.0,0.00,0.0,0
...,...,...,...,...,...,...,...
361,27-12-2016,60,40,50.0,0,0,0
362,28-12-2016,40,34,37.0,0,0,0
363,29-12-2016,46,33,39.5,0.39,0,0
364,30-12-2016,40,33,36.5,0.01,T,0
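
The `Unnamed: 0` column in the output above usually appears when a CSV was saved together with its index. Below is a minimal sketch of how `index_col=0` avoids it. The CSV text is a made-up stand-in for `weather.csv`, and the names `csv_text`, `default`, and `indexed` are only for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV text: the first header field is empty, as it would be
# if the file had been written with its index included
csv_text = ",date,maximum temperature\n0,1-1-2016,42\n1,2-1-2016,40\n"

# Default read: the unnamed first column becomes a regular column
default = pd.read_csv(io.StringIO(csv_text))
print(default.columns.tolist())

# index_col=0 uses the first column as the row index instead
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(indexed.columns.tolist())
```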


- The dataset records daily New York weather for the year 2016. It has the following columns.

- **Date** : The date of the weather record, in D-M-YYYY (day-month-year) format, e.g. `27-12-2016`.
- **Maximum Temperature** : The maximum temperature recorded that day ( `Fahrenheit` ).
- **Minimum Temperature** : The minimum temperature recorded that day.
- **Average Temperature** : The average temperature recorded that day.
- **Precipitation** : The amount of rain or snow precipitation recorded that day (`T` marks a trace amount).
- **Snow fall** : The amount of snowfall recorded that day (`T` marks a trace amount).
- **Snow depth** : The depth of snow recorded that day ( `inches` ). 
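
Since the dates are day-first strings and some columns mix numbers with the marker `T`, a cleaning step can help before doing arithmetic. The sketch below uses a few rows copied from the table above. Treating `T` as `0` is an assumption made here for illustration, not something the dataset prescribes.

```python
import pandas as pd

# A few rows copied from the table above, as strings
sample = pd.DataFrame({
    "date": ["1-1-2016", "29-12-2016", "30-12-2016"],
    "precipitation": ["0.00", "0.39", "0.01"],
    "snow fall": ["0.0", "0", "T"],
})

# Parse the day-first dates with an explicit format
sample["date"] = pd.to_datetime(sample["date"], format="%d-%m-%Y")

# Assumption: treat the trace marker 'T' as 0 before converting to numbers
sample["snow fall"] = pd.to_numeric(sample["snow fall"].replace("T", "0"))
sample["precipitation"] = pd.to_numeric(sample["precipitation"])

print(sample.dtypes)
```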

In [37]:
# helper: convert a Fahrenheit temperature to Celsius (rounded to 2 decimals)

def to_celsius(x: int) -> float:
    return round((((x-32) * 5)/9), 2)
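
The same conversion can also be applied to a whole column at once, because arithmetic on a pandas `Series` is element-wise. A minimal sketch follows; `to_celsius_series` is a hypothetical name, and the sample values are the extreme temperatures from this dataset plus the freezing point.

```python
import pandas as pd

def to_celsius_series(fahrenheit: pd.Series) -> pd.Series:
    """Convert a Series of Fahrenheit readings to Celsius, rounded to 2 decimals."""
    return ((fahrenheit - 32) * 5 / 9).round(2)

temps_f = pd.Series([96, -1, 32], name="temperature")
temps_c = to_celsius_series(temps_f)
print(temps_c.tolist())
```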


In [44]:
# let us see some handy functions by exploring the data

# maximum temperature ever recorded
print("Maximum temperature ever recorded (celsius) :", to_celsius(df['maximum temperature'].max()))

# the row with max temperature
print("\n\nrow :\n", df[df['maximum temperature'] == df['maximum temperature'].max()])

# min temperature ever recorded
print("\n\nMinimum temperature ever recorded (celsius) :", to_celsius(df['minimum temperature'].min()))

# the row(s) with the minimum temperature
print("\n\nrow : ", df[df['minimum temperature'] == df['minimum temperature'].min()])

# the date of the coldest day
print("\n\nDay : ", df.date[df['minimum temperature'] == df['minimum temperature'].min()])

# days whose precipitation is recorded as 'T' (a trace amount)
print(f"\n\nDays which rained:\n{df['date'][df['precipitation'] == 'T']}")

Maximum temperature ever recorded (celsius) : 35.56


row :
           date  maximum temperature  minimum temperature  average temperature  \
204  23-7-2016                   96                   80                 88.0   
225  13-8-2016                   96                   81                 88.5   

    precipitation snow fall snow depth  
204             0         0          0  
225             0         0          0  


Minimum temperature ever recorded (celsius) : -18.33


row :           date  maximum temperature  minimum temperature  average temperature  \
44  14-2-2016                   15                   -1                  7.0   

   precipitation snow fall snow depth  
44          0.00       0.0          0  


Day :  44    14-2-2016
Name: date, dtype: object


Days which rained:
8        9-1-2016
14      15-1-2016
17      18-1-2016
23      24-1-2016
26      27-1-2016
34       4-2-2016
41      11-2-2016
72      13-3-2016
76      17-3-2016
112     22-4-2016
138     18-5-20
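
The lookups above compare a column against its own `max()` or `min()`. An alternative is `idxmax()` / `idxmin()`, which return the index label of the first extreme value only. The sketch below uses three rows copied from the output above to show the difference when two days tie.

```python
import pandas as pd

# Rows copied from the output above (two days tie at 96 °F)
sample = pd.DataFrame(
    {
        "date": ["23-7-2016", "13-8-2016", "14-2-2016"],
        "maximum temperature": [96, 96, 15],
        "minimum temperature": [80, 81, -1],
    },
    index=[204, 225, 44],
)

# idxmax / idxmin give the label of the first row holding the extreme value
hottest_label = sample["maximum temperature"].idxmax()
coldest_label = sample["minimum temperature"].idxmin()
print(sample.loc[hottest_label, "date"])
print(sample.loc[coldest_label, "date"])

# A boolean mask keeps every row that ties for the maximum
mask = sample["maximum temperature"] == sample["maximum temperature"].max()
tied_dates = sample.loc[mask, "date"].tolist()
print(tied_dates)
```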

In [20]:
# creating a DataFrame from a list of tuples (one tuple per row)
weather_data = [
    ('1/1/2023', 38, 10, 'sunny'),
    ('2/1/2023', 30, 12, 'windy'),
    ('3/1/2023', 20, 10, 'rain'),
    ('4/1/2023', 1, 5, 'snow'),
    ('5/1/2023', 15, 9, 'snow'),
    ('6/1/2023', 25, 10, 'rain'),
]

df_2 = pd.DataFrame(weather_data, columns=['day', 'temperature', 'wind_speed', 'event'])
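
A DataFrame can also be built column by column from a dictionary of lists. The sketch below rebuilds the first three rows both ways and checks that the results match. The names `rows`, `df_from_rows`, and `df_from_dict` are only for illustration.

```python
import pandas as pd

# Row-wise: a list of tuples plus explicit column names
rows = [
    ('1/1/2023', 38, 10, 'sunny'),
    ('2/1/2023', 30, 12, 'windy'),
    ('3/1/2023', 20, 10, 'rain'),
]
df_from_rows = pd.DataFrame(rows, columns=['day', 'temperature', 'wind_speed', 'event'])

# Column-wise: a dict mapping each column name to its list of values
df_from_dict = pd.DataFrame({
    'day': ['1/1/2023', '2/1/2023', '3/1/2023'],
    'temperature': [38, 30, 20],
    'wind_speed': [10, 12, 10],
    'event': ['sunny', 'windy', 'rain'],
})

print(df_from_rows.equals(df_from_dict))
```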

In [30]:
print("Shape of data :", df_2.shape)

print(df_2, end='\n\n')

# slicing rows by position: the start is inclusive, the stop is exclusive
print("Data [1:3] :\n", df_2[1:3])

# columns
print("\n\nColumns :", df_2.columns)

Shape of data : (6, 4)
        day  temperature  wind_speed  event
0  1/1/2023           38          10  sunny
1  2/1/2023           30          12  windy
2  3/1/2023           20          10   rain
3  4/1/2023            1           5   snow
4  5/1/2023           15           9   snow
5  6/1/2023           25          10   rain

Data [1:3] :
         day  temperature  wind_speed  event
1  2/1/2023           30          12  windy
2  3/1/2023           20          10   rain


Columns : Index(['day', 'temperature', 'wind_speed', 'event'], dtype='object')
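
Plain `[]` slicing on a DataFrame selects rows by position. The sketch below contrasts the explicit accessors `iloc` (position-based, stop excluded) and `loc` (label-based, stop included) on the same data. The variable names `by_position` and `by_label` are only for illustration.

```python
import pandas as pd

df_2 = pd.DataFrame(
    [
        ('1/1/2023', 38, 10, 'sunny'),
        ('2/1/2023', 30, 12, 'windy'),
        ('3/1/2023', 20, 10, 'rain'),
        ('4/1/2023', 1, 5, 'snow'),
        ('5/1/2023', 15, 9, 'snow'),
        ('6/1/2023', 25, 10, 'rain'),
    ],
    columns=['day', 'temperature', 'wind_speed', 'event'],
)

# iloc slices by position: rows at positions 1 and 2
by_position = df_2.iloc[1:3]

# loc slices by label and includes the end label: rows labelled 1, 2 and 3
by_label = df_2.loc[1:3]

print(len(by_position), len(by_label))
```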


In [34]:
# accessing a single column (attribute syntax and bracket syntax)
print(df_2.day)
print(df_2['event'])

0    1/1/2023
1    2/1/2023
2    3/1/2023
3    4/1/2023
4    5/1/2023
5    6/1/2023
Name: day, dtype: object
0    sunny
1    windy
2     rain
3     snow
4     snow
5     rain
Name: event, dtype: object
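
Attribute access and bracket access return the same `Series`. Passing a list of names returns a DataFrame with just those columns. A minimal sketch on a three-row version of the data:

```python
import pandas as pd

df_2 = pd.DataFrame(
    [
        ('1/1/2023', 38, 10, 'sunny'),
        ('2/1/2023', 30, 12, 'windy'),
        ('3/1/2023', 20, 10, 'rain'),
    ],
    columns=['day', 'temperature', 'wind_speed', 'event'],
)

# Both forms give the same Series; brackets also work for names
# containing spaces, such as 'maximum temperature'
same = df_2.day.equals(df_2['day'])
print(same)

# A list of column names selects several columns as a DataFrame
subset = df_2[['day', 'event']]
print(subset)
```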


In [35]:
# summary statistics for a single column
df_2['temperature'].describe()

count     6.000000
mean     21.500000
std      12.817956
min       1.000000
25%      16.250000
50%      22.500000
75%      28.750000
max      38.000000
Name: temperature, dtype: float64
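
`describe()` also works on the whole DataFrame, where by default it summarises only the numeric columns. For a text column such as `event`, `value_counts()` is one way to get a comparable overview. A minimal sketch with the same six rows:

```python
import pandas as pd

df_2 = pd.DataFrame(
    [
        ('1/1/2023', 38, 10, 'sunny'),
        ('2/1/2023', 30, 12, 'windy'),
        ('3/1/2023', 20, 10, 'rain'),
        ('4/1/2023', 1, 5, 'snow'),
        ('5/1/2023', 15, 9, 'snow'),
        ('6/1/2023', 25, 10, 'rain'),
    ],
    columns=['day', 'temperature', 'wind_speed', 'event'],
)

# Summary of every numeric column at once
summary = df_2.describe()
print(summary)

# Frequency of each category in a text column
event_counts = df_2['event'].value_counts()
print(event_counts)
```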