# head() and tail()

In pandas, the head() and tail() methods return a specified number of rows from the beginning or the end of a DataFrame. They are particularly useful with a large dataset, when you want a quick overview of its structure or need to examine a small portion of the data.


head():
---

The head() function returns the first n rows of a DataFrame.

Syntax: DataFrame.head(n=5)

By default, it returns the first 5 rows, but you can specify the number of rows you want by passing the n parameter.


tail():
---

The tail() function returns the last n rows of a DataFrame.

Syntax: DataFrame.tail(n=5)

By default, it returns the last 5 rows, but you can specify the number of rows you want by passing the n parameter.
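
As a minimal sketch of both methods, the snippet below builds a small DataFrame by hand (the `sample` frame and its values are hypothetical, used only for illustration) and shows the default-style calls plus a negative n, which returns everything except the last n rows.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
sample = pd.DataFrame({
    "day": range(1, 11),
    "temp_c": [31, 32, 32, 32, 32, 32, 30, 28, 27, 29],
})

first_three = sample.head(3)         # rows with index 0, 1, 2
last_two = sample.tail(2)            # rows with index 8, 9
all_but_last_two = sample.head(-2)   # negative n: every row except the last 2

print(first_three)
print(last_two)
print(len(all_but_last_two))         # 8
```

Neither call modifies `sample`; each returns a new DataFrame.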

In [9]:
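import pandas as pd

url = "https://www.timeanddate.com/weather/india/pune/ext"

# read_html returns a list with one DataFrame per table found on the page
tables = pd.read_html(url)

# The forecast is the first table on the page
df = tables[0]

df.head(2)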

Unnamed: 0_level_0,Unnamed: 0_level_0,Conditions,Conditions,Conditions,Comfort,Comfort,Comfort,Comfort,Precipitation,Precipitation,Sun,Sun,Sun
Unnamed: 0_level_1,Day,Unnamed: 1_level_1,Temperature,Weather,Feels Like,Wind,Unnamed: 6_level_1,Humidity,Chance,Amount,UV,Sunrise,Sunset
0,गुरु 11 जनवरी,,31 / 19 °C,Partly cloudy.,31 °C,10 km/h,↑,45%,0%,-,5 (Moderate),7.09,18.15
1,शुक्र 12 जनवरी,,32 / 17 °C,Mostly sunny.,31 °C,10 km/h,↑,39%,0%,-,3 (Moderate),7.09,18.15


# df.columns

The columns attribute retrieves the column labels (column names) of a DataFrame. It returns an Index object containing the column names, or a MultiIndex when the header has more than one level, as it does for the table scraped above. This attribute is useful when you want to get a list of column names or perform operations based on the column names.

In [10]:
column_name = df.columns

print(column_name)

MultiIndex([('Unnamed: 0_level_0',                'Day'),
            (        'Conditions', 'Unnamed: 1_level_1'),
            (        'Conditions',        'Temperature'),
            (        'Conditions',            'Weather'),
            (           'Comfort',         'Feels Like'),
            (           'Comfort',               'Wind'),
            (           'Comfort', 'Unnamed: 6_level_1'),
            (           'Comfort',           'Humidity'),
            (     'Precipitation',             'Chance'),
            (     'Precipitation',             'Amount'),
            (               'Sun',                 'UV'),
            (               'Sun',            'Sunrise'),
            (               'Sun',             'Sunset')],
           )
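
Because the scraped table has a two-row header, df.columns above is a MultiIndex rather than a flat Index. The sketch below rebuilds a small piece of that header by hand (the `demo` frame is hypothetical and not the scraped data) and shows one possible way to inspect it and keep only the second level.

```python
import pandas as pd

# Hypothetical two-level header, mimicking part of the scraped table
cols = pd.MultiIndex.from_tuples([
    ("Conditions", "Temperature"),
    ("Conditions", "Weather"),
    ("Comfort", "Humidity"),
])
demo = pd.DataFrame([["31 / 19 °C", "Partly cloudy.", "45%"]], columns=cols)

print(type(demo.columns).__name__)   # MultiIndex
print(demo.columns.nlevels)          # 2

# Keep only the second header level to get flat column names
demo.columns = demo.columns.get_level_values(1)
print(demo.columns.tolist())         # ['Temperature', 'Weather', 'Humidity']
```

Assigning a plain list of names to df.columns, as the notebook does next, is another way to flatten the header.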


In [11]:
df.head(1)

Unnamed: 0_level_0,Unnamed: 0_level_0,Conditions,Conditions,Conditions,Comfort,Comfort,Comfort,Comfort,Precipitation,Precipitation,Sun,Sun,Sun
Unnamed: 0_level_1,Day,Unnamed: 1_level_1,Temperature,Weather,Feels Like,Wind,Unnamed: 6_level_1,Humidity,Chance,Amount,UV,Sunrise,Sunset
0,गुरु 11 जनवरी,,31 / 19 °C,Partly cloudy.,31 °C,10 km/h,↑,45%,0%,-,5 (Moderate),7.09,18.15


In [13]:
# Change the column names: replace the two-level header with 13 flat names (one per column, in order)

df.columns = ["Day", "NotReQ", "Temp", "Weather", "Feels Like", "Wind", "NotReQ", "Humidity", "chance", "Amount", "UV", "Sunrise", "Sunset"]

In [14]:
df.head(1)

Unnamed: 0,Day,NotReQ,Temp,Weather,Feels Like,Wind,NotReQ.1,Humidity,chance,Amount,UV,Sunrise,Sunset
0,गुरु 11 जनवरी,,31 / 19 °C,Partly cloudy.,31 °C,10 km/h,↑,45%,0%,-,5 (Moderate),7.09,18.15


In [17]:
# To view specific columns from a DataFrame

df[['Sunrise','UV']]

Unnamed: 0,Sunrise,UV
0,07.09,5 (Moderate)
1,07.09,3 (Moderate)
2,07.09,5 (Moderate)
3,07.09,5 (Moderate)
4,07.09,5 (Moderate)
5,07.09,3 (Moderate)
6,07.09,3 (Moderate)
7,07.09,3 (Moderate)
8,07.09,5 (Moderate)
9,07.09,5 (Moderate)
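
The double brackets in the cell above matter. As a small sketch on a hypothetical `demo` frame (not the scraped data), a single label returns a Series, while a list of labels returns a DataFrame with the columns in the order listed.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
demo = pd.DataFrame({
    "UV": ["5", "3"],
    "Sunrise": ["07.09", "07.09"],
    "Sunset": ["18.15", "18.15"],
})

one_col = demo["UV"]                   # single label -> Series
sub_frame = demo[["Sunrise", "UV"]]    # list of labels -> DataFrame

print(type(one_col).__name__)          # Series
print(type(sub_frame).__name__)        # DataFrame
print(sub_frame.columns.tolist())      # ['Sunrise', 'UV']
```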


# df.dropna()

In pandas, the dropna() method removes rows or columns that contain missing (NaN, Not a Number) values from a DataFrame. Missing values are a common issue in datasets, and dropna() provides a convenient way to handle them.

In [20]:
# axis=0 drops rows, axis=1 drops columns

# Drop every column that contains at least one missing value
df = df.dropna(axis=1)

df.head(2)

Unnamed: 0,Day,Temp,Weather,Feels Like,Wind,NotReQ,Humidity,chance,Amount,UV,Sunrise,Sunset
0,गुरु 11 जनवरी,31 / 19 °C,Partly cloudy.,31 °C,10 km/h,↑,45%,0%,-,5 (Moderate),7.09,18.15
1,शुक्र 12 जनवरी,32 / 17 °C,Mostly sunny.,31 °C,10 km/h,↑,39%,0%,-,3 (Moderate),7.09,18.15
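
The call above uses the default how="any", so a column is dropped as soon as it holds a single NaN. The sketch below, on a hypothetical `demo` frame, contrasts that default with how="all" and with the subset parameter, which may be closer to what you want when only some columns matter.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration
demo = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],
    "c": ["x", "y", None],
})

# Default: axis=0, how="any". Every row has a NaN in "b", so nothing is left.
drop_any_row = demo.dropna()

# Drop only the columns that are entirely missing ("b" here)
drop_empty_cols = demo.dropna(axis=1, how="all")

# Drop rows only when "a" is missing, ignoring NaN elsewhere
drop_by_subset = demo.dropna(subset=["a"])

print(len(drop_any_row))                  # 0
print(drop_empty_cols.columns.tolist())   # ['a', 'c']
print(drop_by_subset.index.tolist())      # [0, 2]
```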


# df.drop()

The drop() method removes specified rows or columns from a DataFrame.

Dropping Rows:
----
To drop specific rows, you can use the index parameter. You can specify either a single index or a list of indices to remove.

Dropping Columns:
---
To drop specific columns, you can use the columns parameter.

Inplace Parameter:
---
By default, the drop() function returns a new DataFrame with the specified rows or columns removed. If you want to modify the original DataFrame in place, you can use the inplace=True parameter.
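
The three points above can be sketched on a small hypothetical `demo` frame (its values are made up for illustration). The sketch also shows the errors="ignore" argument, which skips labels that are not present instead of raising a KeyError.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
demo = pd.DataFrame({
    "Day": ["Thu", "Fri", "Sat"],
    "Amount": ["-", "-", "-"],
    "UV": [5, 3, 5],
})

no_amount = demo.drop(columns=["Amount"])    # drop a column by name
no_first_row = demo.drop(index=[0])          # drop a row by its index label

# "NotThere" does not exist; errors="ignore" drops only the labels that do
safe = demo.drop(columns=["Amount", "NotThere"], errors="ignore")

print(no_amount.columns.tolist())     # ['Day', 'UV']
print(no_first_row.index.tolist())    # [1, 2]
print(safe.columns.tolist())          # ['Day', 'UV']
print(demo.columns.tolist())          # ['Day', 'Amount', 'UV'] (original unchanged)
```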

In [23]:
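# Drop a single column
# df = df.drop("NotReQ", axis=1)

# Drop multiple columns at once (columns= already implies axis=1)
# Note: the KeyError shown below is what pandas raises when the named columns
# are no longer present, e.g. when this cell is re-run after they were dropped.
# Passing errors="ignore" would suppress it.
df = df.drop(columns=["NotReQ", "Amount"])




KeyError: "['NotReQ', 'Amount'] not found in axis"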

In [26]:
# Drop a row by its index label
df = df.drop(15, axis=0)

df

Unnamed: 0,Day,Temp,Weather,Feels Like,Wind,Humidity,chance,UV,Sunrise,Sunset
0,गुरु 11 जनवरी,31 / 19 °C,Partly cloudy.,31 °C,10 km/h,45%,0%,5 (Moderate),7.09,18.15
1,शुक्र 12 जनवरी,32 / 17 °C,Mostly sunny.,31 °C,10 km/h,39%,0%,3 (Moderate),7.09,18.15
2,शनि 13 जनवरी,32 / 16 °C,Sunny.,31 °C,7 km/h,35%,0%,5 (Moderate),7.09,18.16
3,रवि 14 जनवरी,32 / 15 °C,Scattered clouds.,31 °C,5 km/h,32%,0%,5 (Moderate),7.09,18.17
4,सोम 15 जनवरी,32 / 15 °C,Mostly sunny.,30 °C,14 km/h,30%,0%,5 (Moderate),7.09,18.17
5,मंगल 16 जनवरी,32 / 14 °C,Partly cloudy.,30 °C,16 km/h,20%,0%,3 (Moderate),7.09,18.18
6,बुध 17 जनवरी,30 / 14 °C,Mostly sunny.,28 °C,19 km/h,22%,0%,3 (Moderate),7.09,18.19
7,गुरु 18 जनवरी,28 / 13 °C,Cloudy.,27 °C,6 km/h,27%,4%,3 (Moderate),7.09,18.19
8,शुक्र 19 जनवरी,27 / 14 °C,Morning clouds.,26 °C,8 km/h,22%,3%,5 (Moderate),7.09,18.2
9,शनि 20 जनवरी,29 / 15 °C,Scattered clouds.,27 °C,10 km/h,19%,2%,5 (Moderate),7.09,18.2


# df.replace()


In pandas, the replace() function is used to replace values in a DataFrame. This function allows you to replace specified values with new values, providing flexibility in data cleaning and transformation.
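
One detail worth keeping in mind is that DataFrame.replace() matches whole cell values by default, while Series.str.replace() works on substrings. The sketch below illustrates the difference on a hypothetical `demo` frame whose values only imitate the scraped table.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
demo = pd.DataFrame({
    "UV": ["5 (Moderate)", "3 (Moderate)"],
    "Amount": ["-", "-"],
})

# Whole-value match: every cell equal to "-" becomes "0"
whole = demo.replace("-", "0")

# "(Moderate)" is only part of a cell, so a whole-value replace changes nothing
unchanged = demo.replace("(Moderate)", "")

# Substring replace; regex=False treats the parentheses as literal characters
uv_clean = demo["UV"].str.replace("(Moderate)", "", regex=False).str.strip()

print(whole["Amount"].tolist())   # ['0', '0']
print(unchanged.equals(demo))     # True
print(uv_clean.tolist())          # ['5', '3']
```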

In [29]:
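# Replace a substring within a single column

# regex=False treats the parentheses literally; strip() removes the leftover space
df['UV'] = df['UV'].str.replace("(Moderate)", "", regex=False).str.strip()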

df.head()

Unnamed: 0,Day,Temp,Weather,Feels Like,Wind,Humidity,chance,UV,Sunrise,Sunset
0,गुरु 11 जनवरी,31 / 19 °C,Partly cloudy.,31 °C,10 km/h,45%,0%,5,7.09,18.15
1,शुक्र 12 जनवरी,32 / 17 °C,Mostly sunny.,31 °C,10 km/h,39%,0%,3,7.09,18.15
2,शनि 13 जनवरी,32 / 16 °C,Sunny.,31 °C,7 km/h,35%,0%,5,7.09,18.16
3,रवि 14 जनवरी,32 / 15 °C,Scattered clouds.,31 °C,5 km/h,32%,0%,5,7.09,18.17
4,सोम 15 जनवरी,32 / 15 °C,Mostly sunny.,30 °C,14 km/h,30%,0%,5,7.09,18.17


In [31]:
# Replace several values at once

# regex=True lets each key match as a substring of the cell
df['Day'] = df['Day'].replace({
    "गुरु": "Thu",
    "शुक्र": "Fri",
    "शनि": "Sat",
    "रवि": "Sun",
    "सोम": "Mon",
    "मंगल": "Tue",
    "बुध": "Wed",
    "जनवरी": "Jan"
}, regex=True)

df

Unnamed: 0,Day,Temp,Weather,Feels Like,Wind,Humidity,chance,UV,Sunrise,Sunset
0,Thu 11 Jan,31 / 19 °C,Partly cloudy.,31 °C,10 km/h,45%,0%,5,7.09,18.15
1,Fri 12 Jan,32 / 17 °C,Mostly sunny.,31 °C,10 km/h,39%,0%,3,7.09,18.15
2,Sat 13 Jan,32 / 16 °C,Sunny.,31 °C,7 km/h,35%,0%,5,7.09,18.16
3,Sun 14 Jan,32 / 15 °C,Scattered clouds.,31 °C,5 km/h,32%,0%,5,7.09,18.17
4,Mon 15 Jan,32 / 15 °C,Mostly sunny.,30 °C,14 km/h,30%,0%,5,7.09,18.17
5,Tue 16 Jan,32 / 14 °C,Partly cloudy.,30 °C,16 km/h,20%,0%,3,7.09,18.18
6,Wed 17 Jan,30 / 14 °C,Mostly sunny.,28 °C,19 km/h,22%,0%,3,7.09,18.19
7,Thu 18 Jan,28 / 13 °C,Cloudy.,27 °C,6 km/h,27%,4%,3,7.09,18.19
8,Fri 19 Jan,27 / 14 °C,Morning clouds.,26 °C,8 km/h,22%,3%,5,7.09,18.2
9,Sat 20 Jan,29 / 15 °C,Scattered clouds.,27 °C,10 km/h,19%,2%,5,7.09,18.2


# df.fillna()

In pandas, the fillna() method fills missing (NaN) values in a DataFrame or Series. It provides a convenient way to handle missing data, either by replacing NaN values with a constant or by propagating neighboring values with a forward or backward fill.

In [32]:
df1 = pd.read_excel("C:/Users/Vishal/Desktop/WatchData.xlsx")

df1

Unnamed: 0,Date,Brand,Units_Sold,Revenue (in lakhs)
0,2023-01-01,Titan,100.0,150.0
1,2023-01-02,Fastrack,120.0,180.0
2,2023-01-03,Casio,,
3,2023-01-04,Fossil,80.0,120.0
4,2023-01-05,Titan,,
5,2023-01-06,Casio,90.0,135.0
6,2023-01-07,Fastrack,110.0,165.0
7,2023-01-08,Fossil,,
8,2023-01-09,Titan,95.0,142.5
9,2023-01-10,Casio,100.0,150.0


In [33]:
constantValue = df1.fillna(0)
constantValue

Unnamed: 0,Date,Brand,Units_Sold,Revenue (in lakhs)
0,2023-01-01,Titan,100.0,150.0
1,2023-01-02,Fastrack,120.0,180.0
2,2023-01-03,Casio,0.0,0.0
3,2023-01-04,Fossil,80.0,120.0
4,2023-01-05,Titan,0.0,0.0
5,2023-01-06,Casio,90.0,135.0
6,2023-01-07,Fastrack,110.0,165.0
7,2023-01-08,Fossil,0.0,0.0
8,2023-01-09,Titan,95.0,142.5
9,2023-01-10,Casio,100.0,150.0


In [34]:
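# Backward fill: each NaN takes the next valid value below it.
# bfill() is the current spelling of fillna(method="bfill"), which recent pandas versions deprecate.
backfill = df1.bfill()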

backfill

Unnamed: 0,Date,Brand,Units_Sold,Revenue (in lakhs)
0,2023-01-01,Titan,100.0,150.0
1,2023-01-02,Fastrack,120.0,180.0
2,2023-01-03,Casio,80.0,120.0
3,2023-01-04,Fossil,80.0,120.0
4,2023-01-05,Titan,90.0,135.0
5,2023-01-06,Casio,90.0,135.0
6,2023-01-07,Fastrack,110.0,165.0
7,2023-01-08,Fossil,95.0,142.5
8,2023-01-09,Titan,95.0,142.5
9,2023-01-10,Casio,100.0,150.0


In [35]:
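# Forward fill: each NaN takes the last valid value above it.
# ffill() is the current spelling of fillna(method="ffill"), which recent pandas versions deprecate.
forwardfill = df1.ffill()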

forwardfill

Unnamed: 0,Date,Brand,Units_Sold,Revenue (in lakhs)
0,2023-01-01,Titan,100.0,150.0
1,2023-01-02,Fastrack,120.0,180.0
2,2023-01-03,Casio,120.0,180.0
3,2023-01-04,Fossil,80.0,120.0
4,2023-01-05,Titan,80.0,120.0
5,2023-01-06,Casio,90.0,135.0
6,2023-01-07,Fastrack,110.0,165.0
7,2023-01-08,Fossil,110.0,165.0
8,2023-01-09,Titan,95.0,142.5
9,2023-01-10,Casio,100.0,150.0
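
Since WatchData.xlsx is a local file, the sketch below uses a small hypothetical `sales` frame instead (its values are made up and are not the spreadsheet's contents). It shows two further options, a per-column fill value and a fill with the column mean, and illustrates that a backward fill leaves a trailing NaN untouched because there is no later value to copy.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration
sales = pd.DataFrame({
    "Brand": ["Titan", "Casio", "Fossil", "Titan"],
    "Units_Sold": [100.0, np.nan, 80.0, np.nan],
})

# A dict maps column names to fill values, so each column can get its own
per_column = sales.fillna({"Units_Sold": 0})

# Fill with the column mean (NaN is skipped when computing the mean)
with_mean = sales["Units_Sold"].fillna(sales["Units_Sold"].mean())

forward = sales["Units_Sold"].ffill()    # copies the previous valid value
backward = sales["Units_Sold"].bfill()   # copies the next valid value

print(per_column["Units_Sold"].tolist())   # [100.0, 0.0, 80.0, 0.0]
print(with_mean.tolist())                  # [100.0, 90.0, 80.0, 90.0]
print(forward.tolist())                    # [100.0, 100.0, 80.0, 80.0]
print(backward.isna().tolist())            # [False, False, False, True]
```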
