# **The weather dataset**

The weather dataset is a time series with hourly records of the weather conditions at a particular location. Each record holds the temperature, dew point temperature, relative humidity, wind speed, visibility, pressure and weather condition. The data is available as a CSV file, which we load and analyze below.

In [None]:
import pandas as pd

In [None]:
data = pd.read_csv(r"../input/weatherdataanalysis/Weather Data.csv")

In [None]:
data

# **Analyzing data frames**

# **.head()**

It shows the first N rows of the data frame (N=5 by default).

In [None]:
data.head()

# **.shape**

It shows the total number of rows and columns of the dataframe

In [None]:
data.shape

# **.index**

This attribute returns the index (the row labels) of the data frame.

In [None]:
data.index

# **.columns**

It shows the name of each column

In [None]:
data.columns

# **.dtypes**

It shows the datatype of each column

In [None]:
data.dtypes

# **.unique()**

It shows all the unique values in a column. It can be applied to a single column only, not to the whole data frame.

In [None]:
data['Weather'].unique()

# **.nunique()**

It shows the total number of unique values in each column. It can be applied to a single column as well as to the whole data frame, as in the cell below.

In [None]:
data.nunique()
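
As a small illustration of the two ways to call `.nunique()`, the sketch below uses a toy frame with made-up values; only the column names 'Weather' and 'Wind Speed_km/h' come from the data set.

```python
import pandas as pd

# Toy frame with made-up values (not taken from the real CSV)
toy = pd.DataFrame({
    "Weather": ["Clear", "Fog", "Clear", "Snow"],
    "Wind Speed_km/h": [4, 7, 4, 24],
})

per_column = toy.nunique()          # Series: one count per column
single = toy["Weather"].nunique()   # plain integer for one column

print(per_column)
print(single)
```

On a data frame the result is a Series indexed by column name; on a single column it is one number.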

# **.count()**

It shows the total number of non-null values in each column. It can be applied to a single column as well as to the whole data frame.

In [None]:
data.count()

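# **.value_counts()**

It shows all the unique values in a column together with their counts, most frequent first. It is typically applied to a single column; recent pandas versions also provide `DataFrame.value_counts()`, which counts unique rows instead.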

In [None]:
data['Weather'].value_counts()
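
If relative frequencies are more useful than raw counts, `value_counts` accepts `normalize=True`. A minimal sketch on a made-up 'Weather' column (the values are illustrative only):

```python
import pandas as pd

# Made-up weather labels, standing in for the real 'Weather' column
toy = pd.Series(["Clear", "Fog", "Clear", "Snow"], name="Weather")

counts = toy.value_counts()                 # absolute counts, most frequent first
shares = toy.value_counts(normalize=True)   # fractions that sum to 1

print(counts)
print(shares)
```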

# **.info()**

Provides basic information about the data frame

In [None]:
data.info()

# The unique "wind speed" values in the data

In [None]:
data['Wind Speed_km/h'].unique()

# The number of times when the weather was exactly clear

In [None]:
data[data.Weather == 'Clear']
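
The cell above displays the matching rows. To get the count itself as a number, one option is to sum the boolean mask or take the length of the filtered frame. The sketch below uses a toy frame with made-up labels to show that `==` matches the whole string only:

```python
import pandas as pd

# Made-up labels; 'Mainly Clear' is included to show it does not match 'Clear'
toy = pd.DataFrame({"Weather": ["Clear", "Mainly Clear", "Clear", "Fog"]})

mask = toy["Weather"] == "Clear"   # True only for an exact match
n_clear = int(mask.sum())          # True counts as 1, False as 0
n_clear_alt = len(toy[mask])       # same number via the filtered frame

print(n_clear, n_clear_alt)
```

The same pattern applies to any other equality filter, such as a specific wind speed.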

# The number of times when the wind speed was exactly 4 km/h

In [None]:
data[data['Wind Speed_km/h'] == 4]

# Counting the null and non-null values in each column

In [None]:
data.isnull().sum()

In [None]:
data.notnull().sum()
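
The two counts are complementary: for every column they add up to the number of rows. The sketch below checks this on a toy frame with made-up values and also derives the share of missing values with `.mean()`:

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values and a few deliberate gaps
toy = pd.DataFrame({
    "Visibility_km": [25.0, np.nan, 48.3, 8.0],
    "Weather": ["Fog", "Clear", None, "Snow"],
})

missing_counts = toy.isnull().sum()    # missing values per column
present_counts = toy.notnull().sum()   # non-missing values per column
missing_share = toy.isnull().mean()    # fraction missing per column

print(missing_counts)
print(missing_share)
```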

# Renaming the column name 'Weather' of the data frame to 'Weather Condition'

In [None]:
data.rename(columns = {'Weather' : 'Weather Condition'}, inplace = True)

In [None]:
data.head()
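
With `inplace = True` the original frame is modified, so from this point on the column must be addressed as 'Weather Condition'. An alternative, sketched here on a toy frame with made-up values, is to leave out `inplace` and work with the returned copy:

```python
import pandas as pd

# Toy frame with made-up values
toy = pd.DataFrame({"Weather": ["Clear", "Fog"]})

# Without inplace=True, rename returns a new frame and leaves `toy` untouched
renamed = toy.rename(columns={"Weather": "Weather Condition"})

print(list(toy.columns))
print(list(renamed.columns))
```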

# Finding the mean 'Visibility'

In [None]:
data.Visibility_km.mean()

# Finding the standard deviation of 'Pressure' in the data set

In [None]:
data.Press_kPa.std()

# Finding the variance of 'Relative Humidity' in this data

In [None]:
data['Rel Hum_%'].var()
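
The variance and the standard deviation are linked: the variance is the square of the standard deviation. pandas computes the sample versions by default (`ddof=1`, dividing by n - 1). A short worked example on made-up humidity values:

```python
import pandas as pd

# Made-up humidity readings; mean is 75, squared deviations sum to 500
hum = pd.Series([60, 70, 80, 90], name="Rel Hum_%")

sample_var = hum.var()        # default ddof=1: 500 / 3
sample_std = hum.std()        # square root of the sample variance
pop_var = hum.var(ddof=0)     # population variance: 500 / 4

print(sample_var, sample_std, pop_var)
```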

# Find all the instances when 'Snow' was recorded

In [None]:
data[data['Weather Condition'] == 'Snow']

# Find all the instances when Wind Speed is above 24 and Visibility is 25

In [None]:
data[(data['Wind Speed_km/h'] > 24) & (data['Visibility_km'] == 25)]
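
The parentheses around each condition are required because `&` binds more tightly than `>` and `==` in Python. One way to keep such filters readable is to name each mask first, as in this sketch on a toy frame with made-up values:

```python
import pandas as pd

# Toy frame with made-up values
toy = pd.DataFrame({
    "Wind Speed_km/h": [30, 10, 26, 35],
    "Visibility_km": [25.0, 25.0, 48.3, 25.0],
})

windy = toy["Wind Speed_km/h"] > 24    # boolean Series
vis_25 = toy["Visibility_km"] == 25    # boolean Series
both = toy[windy & vis_25]             # rows where both conditions hold

print(both)
```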

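# The mean value of each numeric column for each 'Weather Condition'

In [None]:
# numeric_only=True restricts the mean to numeric columns; text columns cannot be averaged
data.groupby('Weather Condition').mean(numeric_only=True)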
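
A grouped mean is only defined for numeric columns, and pandas 2.0 and later raise an error if a text column is included. The sketch below shows `numeric_only=True` on a toy frame with made-up values; the 'Timestamp' column is a hypothetical stand-in for any text column in the data:

```python
import pandas as pd

# Toy frame with made-up values; 'Timestamp' is a hypothetical text column
toy = pd.DataFrame({
    "Timestamp": ["t0", "t1", "t2", "t3"],
    "Weather Condition": ["Clear", "Fog", "Clear", "Fog"],
    "Visibility_km": [25.0, 1.0, 35.0, 3.0],
})

# Only numeric columns are averaged; 'Timestamp' is left out of the result
means = toy.groupby("Weather Condition").mean(numeric_only=True)

print(means)
```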

# The minimum and maximum value of each column for each 'Weather Condition'

Minimum Values

In [None]:
data.groupby('Weather Condition').min()

Maximum Values

In [None]:
data.groupby('Weather Condition').max()
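
Both summaries can also be produced in one call with `.agg`. A minimal sketch for a single column, using a toy frame with made-up values:

```python
import pandas as pd

# Toy frame with made-up values
toy = pd.DataFrame({
    "Weather Condition": ["Clear", "Fog", "Clear", "Fog"],
    "Visibility_km": [25.0, 1.0, 35.0, 3.0],
})

# One row per condition, one column per aggregate
summary = toy.groupby("Weather Condition")["Visibility_km"].agg(["min", "max"])

print(summary)
```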

# Show all the records where the weather condition is fog

In [None]:
data[data['Weather Condition'] == 'Fog']

# All the instances when the weather is clear or visibility is above 40 (last 50 rows shown)

In [None]:
data[(data['Weather Condition'] == 'Clear') | (data['Visibility_km'] > 40)].tail(50)