# Part 2: Cleaning the Scraped Data

### Kiran TIRUMALE LAKSHMANA RAO


### Importing the Necessary Libraries and loading the scraped data

In [48]:
#Importing the Libraries
import pandas as pd
import numpy as np

#Reading the csv file:
raw_df = pd.read_csv(r"C:\Users\ktirumalelakshmana\Desktop\hotels.csv", header=None)

In [49]:
#Viewing the head of the dataset
raw_df.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11
0,Entire apartment in VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,Rating 4.71 out of 5; 18 reviews,"Previous price:$4,087; Discounted price:$2,134",2 guests,1 bedroom,1 bed,1 bath. Washer,Kitchen,Wifi,Pets allowed.
1,Entire apartment in Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,Rating 4.15 out of 5; 102 reviews,"Previous price:$2,804; Discounted price:$1,314",2 guests,Studio,1 bed,1 bath. Kitchen,Wifi.,,
2,Entire apartment in Châtelet - Les Halles - Be...,Cosy and new apartment center of Paris,4.63,Rating 4.63 out of 5; 46 reviews,"Previous price:$9,566; Discounted price:$2,745",4 guests,1 bedroom,2 beds,1 bath. Washer,Kitchen,Elevator,Wifi.
3,Entire apartment in II Arrondissement,Charming apartment Grands Boulevards,No Ratings Available,No Ratings Available,"Price:$2,207",6 guests,2 bedrooms,4 beds,1.5 baths. Hosted by a business,Washer,Kitchen,Wifi.
4,Entire apartment in X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,Rating 4.47 out of 5; 49 reviews,"Previous price:$3,484; Discounted price:$2,423",4 guests,1 bedroom,2 beds,1.5 baths. Kitchen,Wifi,Dishwasher,Air conditioning.


From `raw_df` above, I take each column, split and clean it, and add the result to the main dataframe, `new_df`. The steps below describe how each column is cleaned and moved from `raw_df` to `new_df`.

In [50]:
#Creating an empty DataFrame
new_df = pd.DataFrame()

### Step 1: Cleaning first column & splitting into 2 columns: `Hotel_Type` & `Location_in_Paris`

In [51]:
#Taking the first column from raw_df and appending it to the new_df
new_df['Hotel_Type_with_Location'] = raw_df[0]

In [52]:
#Splitting the first column at the first ' in ' to get the hotel type and the location
#(n and expand are passed by keyword; recent pandas versions no longer accept n positionally)
new_df[['Hote_Type', 'Location_in_Paris']] = new_df['Hotel_Type_with_Location'].str.split(' in ', n=1, expand=True)

In [53]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement
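
The split in Step 1 relies on `n=1`, which cuts each string only at the first `' in '`. The sketch below shows that behaviour on two sample strings. The first is copied from the data above; the second is invented to show what happens when `' in '` appears twice, and `str.rsplit` is offered only as a possible alternative, not as what this notebook does.

```python
import pandas as pd

# First value is from raw_df[0]; the second is an invented edge case
sample = pd.Series([
    "Entire apartment in VIII Arrondissement",
    "Private room in bed and breakfast in Le Marais",
])

# n=1 splits only at the first " in ", so there are always two columns
left = sample.str.split(" in ", n=1, expand=True)

# rsplit with n=1 splits at the last " in " instead
right = sample.str.rsplit(" in ", n=1, expand=True)

print(left)
print(right)
```

For the invented row, the left split keeps "bed and breakfast in Le Marais" as the location, while the right split keeps only "Le Marais". Which one is preferable depends on how the listing types are worded in the scraped data.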


### Step 2: Adding the Hotel Name to the main dataframe `new_df`

In [54]:
#Adding the description of the hotel from the raw_df to the new_df
new_df['Hotel_Name'] = raw_df[1]

In [55]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!"
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République


### Step 3: Adding the `Ratings` column to the main dataframe `new_df`

In [56]:
#Adding the ratings of the hotel from the raw_df to the new_df
new_df['Ratings'] = raw_df[2]
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,No Ratings Available
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47


#### Cleaning the Ratings column (No Ratings Available = 0, and changing type from string to float)

Note: I am changing every 'No Ratings Available' to 0, treating a listing with no ratings as having a rating of 0.

In [57]:
#Confirming that the Ratings column has no null values
new_df.Ratings.isna().sum()

0

In [58]:
#Replacing all the "No Ratings Available" to "0", and then converting column to float
new_df['Ratings'] = new_df['Ratings'].str.replace('No Ratings Available', '0').astype(float)
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47


In [59]:
#Confirming that the Ratings column is now of float type:
new_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 280 entries, 0 to 279
Data columns (total 5 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Hotel_Type_with_Location  280 non-null    object 
 1   Hote_Type                 280 non-null    object 
 2   Location_in_Paris         280 non-null    object 
 3   Hotel_Name                280 non-null    object 
 4   Ratings                   280 non-null    float64
dtypes: float64(1), object(4)
memory usage: 11.1+ KB
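
Coding unrated listings as 0 affects any average taken over the `Ratings` column. The sketch below compares the approach used above with one possible alternative, `pd.to_numeric(..., errors="coerce")`, which turns non-numeric text into `NaN`. The sample values are the five ratings shown in the head of the dataframe; the variable names are illustrative.

```python
import pandas as pd

# The five ratings shown in the head of the dataframe
ratings = pd.Series(["4.71", "4.15", "4.63", "No Ratings Available", "4.47"])

# Approach used in this notebook: unrated listings become 0.0
as_zero = ratings.replace("No Ratings Available", "0").astype(float)

# Alternative: non-numeric text becomes NaN, which mean() skips by default
as_nan = pd.to_numeric(ratings, errors="coerce")

print(round(as_zero.mean(), 3))  # average including the 0.0 entry
print(round(as_nan.mean(), 3))   # average over rated listings only
```

On these five rows the first average is 3.592 and the second is 4.49, so the choice matters whenever ratings are aggregated later.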


### Step 4: Adding and cleaning the `No_of_Reviews` column to the main dataframe `new_df`

##### Splitting the Reviews column and taking only the number of reviews into the `No_of_Reviews` column 

In [60]:
#Splitting column 3 at the first ';' into its rating part and its review part
new_df[['Dummy_Rating', 'Dummy_Reviews']] = raw_df[3].str.split(';', n=1, expand=True)

#Extracting the number from the Dummy_Reviews column (raw string avoids an invalid-escape warning)
new_df['No_of_Reviews'] = new_df.Dummy_Reviews.str.extract(r'(\d+)')

#Dropping the dummy columns Dummy_Rating & Dummy_Reviews
new_df = new_df.drop(columns=['Dummy_Rating', 'Dummy_Reviews'])

In [61]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18.0
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102.0
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46.0
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49.0


##### Cleaning the `No_of_Reviews` column by replacing null values with 0

In [62]:
# Checking the total number of null values in this column
new_df['No_of_Reviews'].isna().sum()

55

In [63]:
# Replacing the null values with '0' (the conversion to integer type follows below)
new_df['No_of_Reviews'] = new_df['No_of_Reviews'].fillna('0')

In [64]:
# Again checking the number of null values in this column
new_df['No_of_Reviews'].isna().sum()

0

In [65]:
# Changing the type of the column to integer type
new_df['No_of_Reviews'] = new_df['No_of_Reviews'].astype(int)

In [66]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,0
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49
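
The split, extract, fill and convert sequence of Step 4 can also be written as a single chain. This is a minimal sketch on sample strings copied from column 3 of `raw_df`; the regular expression `;\s*(\d+)` and the variable names are my own choices for illustration.

```python
import pandas as pd

# Sample values in the shape of raw_df[3]
reviews_raw = pd.Series([
    "Rating 4.71 out of 5; 18 reviews",
    "Rating 4.15 out of 5; 102 reviews",
    "No Ratings Available",
])

# Capture the digits that follow the semicolon; rows without a match give NaN
counts = reviews_raw.str.extract(r";\s*(\d+)", expand=False)

# Fill the missing counts with "0" and convert the strings to integers
no_of_reviews = counts.fillna("0").astype(int)
print(no_of_reviews.tolist())
```

Anchoring the pattern on the semicolon avoids picking up the digits of the rating itself, which is the reason the original cell splits the string first.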


### Step 5: Cleaning the Previous Price & Discounted Price

For this step, I created a temporary dataframe called `dummy_df`, cleaned the price columns in it, and then copied the cleaned columns to the main dataframe `new_df`.

In [67]:
#Creating a dummy_df dataframe
dummy_df = pd.DataFrame()
#Splitting column 4 of raw_df (previous price & discounted price) at ';' into two columns of dummy_df
dummy_df[['Prev_Price', 'Disc_Price']] = raw_df[4].str.split(';', n=1, expand=True)
dummy_df.head()

Unnamed: 0,Prev_Price,Disc_Price
0,"Previous price:$4,087","Discounted price:$2,134"
1,"Previous price:$2,804","Discounted price:$1,314"
2,"Previous price:$9,566","Discounted price:$2,745"
3,"Price:$2,207",
4,"Previous price:$3,484","Discounted price:$2,423"


In [68]:
#First splitting the Prev_Price column at the '$' sign
dummy_df[['Dummy_Previous', 'Previous_Price']] = dummy_df['Prev_Price'].str.split('$', n=1, expand=True)

In [69]:
#Now splitting the Disc_Price column at the '$' sign
dummy_df[['Dummy_Disc', 'Discounted_Price']] = dummy_df['Disc_Price'].str.split('$', n=1, expand=True)

#Dropping unwanted columns:
dummy_df = dummy_df.drop(columns=['Prev_Price', 'Disc_Price', 'Dummy_Previous', 'Dummy_Disc'])
dummy_df.head()

Unnamed: 0,Previous_Price,Discounted_Price
0,4087,2134.0
1,2804,1314.0
2,9566,2745.0
3,2207,
4,3484,2423.0


#### Converting the Previous_Price column to int

In [70]:
#Checking for null values for the Previous_Price column
dummy_df['Previous_Price'].isna().sum()

0

In [71]:
#As there are no null values, we replace the comma ',' with blank '', then convert it to an integer
dummy_df['Previous_Price'] = dummy_df['Previous_Price'].str.replace(',', '').astype(int)

#### Converting the Discounted_Price column to int

In [72]:
#Checking for null values in the Discounted_Price column
dummy_df['Discounted_Price'].isna().sum()

41

In [73]:
#First replacing all NA values with '0' for the Discounted_Price column
dummy_df['Discounted_Price'] = dummy_df['Discounted_Price'].fillna('0')

In [74]:
#Confirming that all na values are replaced
dummy_df['Discounted_Price'].isna().sum()

0

In [75]:
#Now, after replacing all null values, we replace the comma ',' with blank '', then convert it to an integer
dummy_df['Discounted_Price'] = dummy_df['Discounted_Price'].str.replace(',', '').astype(int)

In [76]:
#Verifying that both columns Previous_Price and Discounted_Price are integer columns
dummy_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 280 entries, 0 to 279
Data columns (total 2 columns):
 #   Column            Non-Null Count  Dtype
---  ------            --------------  -----
 0   Previous_Price    280 non-null    int32
 1   Discounted_Price  280 non-null    int32
dtypes: int32(2)
memory usage: 2.3 KB
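
The split-and-drop sequence used for the prices can also be expressed with two regular-expression extractions. The sketch below is one possible variant, not the method used in the cells above: the helper `to_int` is hypothetical, and it keeps a missing discounted price as `<NA>` (pandas' nullable `Int64` type) rather than filling it with 0.

```python
import pandas as pd

# Sample values in the shape of raw_df[4]
prices_raw = pd.Series([
    "Previous price:$4,087; Discounted price:$2,134",
    "Price:$2,207",
])

def to_int(amounts: pd.Series) -> pd.Series:
    """Strip thousands separators and convert to nullable integers."""
    cleaned = amounts.str.replace(",", "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce").astype("Int64")

# The first dollar amount in every row is the listed (previous) price
previous = to_int(prices_raw.str.extract(r"\$([\d,]+)", expand=False))

# The discounted amount is present only when the listing is on offer
discounted = to_int(
    prices_raw.str.extract(r"Discounted price:\$([\d,]+)", expand=False)
)

prices = pd.DataFrame({"Previous_Price": previous, "Discounted_Price": discounted})
print(prices)
```

Keeping the gap as `<NA>` makes it explicit which listings had no discount; a later `fillna` with the previous price would then give the same result as the correction applied further down.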


#### Copying these two columns to the main `new_df` dataframe

In [77]:
new_df[['Previous_Price', 'Discounted_Price']] = dummy_df[['Previous_Price', 'Discounted_Price']]
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews,Previous_Price,Discounted_Price
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18,4087,2134
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102,2804,1314
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46,9566,2745
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,0,2207,0
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49,3484,2423


#### Correcting the `Discounted_Price` from `0` to the value of `Previous_Price`

In [78]:
#Where no discount was listed (value 0), the discounted price equals the previous price
no_discount = new_df['Discounted_Price'] == 0
new_df.loc[no_discount, 'Discounted_Price'] = new_df.loc[no_discount, 'Previous_Price']

#### Creating a calculated_field column `Discount` = `Previous_Price` - `Discounted_Price`

In [79]:
new_df['Discount'] = new_df['Previous_Price'] - new_df['Discounted_Price'] 

In [80]:
#Making sure the Discount column is of integer type
new_df['Discount'] = new_df['Discount'].astype(int)

In [81]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews,Previous_Price,Discounted_Price,Discount
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18,4087,2134,1953
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102,2804,1314,1490
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46,9566,2745,6821
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,0,2207,2207,0
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49,3484,2423,1061


### Step 6: Cleaning the `No_of_Guests` Column

In [82]:
#Extracting the number of guests from column 5 of the raw_df dataframe
new_df['No_of_Guests'] = raw_df[5].str.extract(r'(\d+)')

#Changing the type of this No_of_Guests column to integer type
new_df['No_of_Guests'] = new_df['No_of_Guests'].astype(int)

In [83]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews,Previous_Price,Discounted_Price,Discount,No_of_Guests
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18,4087,2134,1953,2
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102,2804,1314,1490,2
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46,9566,2745,6821,4
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,0,2207,2207,0,6
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49,3484,2423,1061,4
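
The `.astype(int)` call in Step 6 works because every row of column 5 contains a number; it would raise an error if a single row had none. The sketch below shows a more tolerant variant, assuming a hypothetical helper `first_int` and pandas' nullable `Int64` type. The third sample value is invented to show the missing case.

```python
import pandas as pd

def first_int(text: pd.Series) -> pd.Series:
    """Return the first run of digits in each string as a nullable integer."""
    digits = text.str.extract(r"(\d+)", expand=False)
    return pd.to_numeric(digits, errors="coerce").astype("Int64")

# First two values are in the shape of raw_df[5]; the last one is invented
guests_raw = pd.Series(["2 guests", "6 guests", "guests not listed"])

guests = first_int(guests_raw)
print(guests)
```

The same helper would apply unchanged to any other column of the form "<number> <label>".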


### Step 7: Cleaning the `Bedroom_or_Studio` Column

In [84]:
#Splitting the column 6 from the raw dataframe and copying to new_df
new_df[['No_of_Bedrooms', 'Type_Bed_Studio']] = raw_df[6].str.split(' ', n=1, expand=True)

In [85]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews,Previous_Price,Discounted_Price,Discount,No_of_Guests,No_of_Bedrooms,Type_Bed_Studio
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18,4087,2134,1953,2,1,bedroom
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102,2804,1314,1490,2,Studio,
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46,9566,2745,6821,4,1,bedroom
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,0,2207,2207,0,6,2,bedrooms
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49,3484,2423,1061,4,1,bedroom


In [86]:
# Replacing the values of 'Studio' with 0 and converting the column type to int
new_df['No_of_Bedrooms'] = new_df['No_of_Bedrooms'].str.replace('Studio', '0').astype(int)

In [87]:
# Replacing bedrooms in the Type_Bed_Studio column with Bedroom (singular, capitalized)
new_df['Type_Bed_Studio'] = new_df['Type_Bed_Studio'].str.replace('bedrooms', 'Bedroom')

# Replacing bedroom in the Type_Bed_Studio column with Bedroom (capitalizing first B)
new_df['Type_Bed_Studio'] = new_df['Type_Bed_Studio'].str.replace('bedroom', 'Bedroom')

# Replacing None in the Type_Bed_Studio column with Studio
new_df['Type_Bed_Studio'] = new_df['Type_Bed_Studio'].fillna('Studio')

In [88]:
new_df.head()

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews,Previous_Price,Discounted_Price,Discount,No_of_Guests,No_of_Bedrooms,Type_Bed_Studio
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18,4087,2134,1953,2,1,Bedroom
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102,2804,1314,1490,2,0,Studio
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46,9566,2745,6821,4,1,Bedroom
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,0,2207,2207,0,6,2,Bedroom
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49,3484,2423,1061,4,1,Bedroom
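
The chain of replacements in Step 7 can also be derived directly from the raw text. This is a minimal sketch on sample values copied from column 6 of `raw_df`; the variable names are illustrative, and it assumes every row is either "Studio" or "<number> bedroom(s)", as in the rows shown above.

```python
import pandas as pd

# Sample values in the shape of raw_df[6]
bedrooms_raw = pd.Series(["1 bedroom", "Studio", "2 bedrooms"])

# Pull the leading number; "Studio" has no digits, so it is filled with "0"
no_of_bedrooms = (
    bedrooms_raw.str.extract(r"(\d+)", expand=False).fillna("0").astype(int)
)

# Label each row from the raw text instead of chaining replacements
is_studio = bedrooms_raw.str.contains("Studio", regex=False)
type_bed_studio = is_studio.map({True: "Studio", False: "Bedroom"})

print(no_of_bedrooms.tolist())
print(type_bed_studio.tolist())
```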


### Step 8: Cleaning the `No_of_Beds` Column

In [89]:
#Extracting the number of beds from column 7 of the raw_df dataframe
new_df['No_of_Beds'] = raw_df[7].str.extract(r'(\d+)')

#Changing the type of this No_of_Beds column to integer type
new_df['No_of_Beds'] = new_df['No_of_Beds'].astype(int)

## FINAL DATAFRAME

In [90]:
new_df.head(10)

Unnamed: 0,Hotel_Type_with_Location,Hote_Type,Location_in_Paris,Hotel_Name,Ratings,No_of_Reviews,Previous_Price,Discounted_Price,Discount,No_of_Guests,No_of_Bedrooms,Type_Bed_Studio,No_of_Beds
0,Entire apartment in VIII Arrondissement,Entire apartment,VIII Arrondissement,"Amazing 2 rooms, heart of the Champs Elysees!",4.71,18,4087,2134,1953,2,1,Bedroom,1
1,Entire apartment in Commerce - Dupleix,Entire apartment,Commerce - Dupleix,CHARMING STUDIO - PARIS 15 DISTRICT,4.15,102,2804,1314,1490,2,0,Studio,1
2,Entire apartment in Châtelet - Les Halles - Be...,Entire apartment,Châtelet - Les Halles - Beaubourg,Cosy and new apartment center of Paris,4.63,46,9566,2745,6821,4,1,Bedroom,2
3,Entire apartment in II Arrondissement,Entire apartment,II Arrondissement,Charming apartment Grands Boulevards,0.0,0,2207,2207,0,6,2,Bedroom,4
4,Entire apartment in X Arrondissement,Entire apartment,X Arrondissement,Fabulous Design LOFT 4P- High Marais / République,4.47,49,3484,2423,1061,4,1,Bedroom,2
5,Entire condominium in XIX Arrondissement,Entire condominium,XIX Arrondissement,STUDIO IN THE GREEN HEART OF PARIS,4.84,303,2142,1501,641,2,0,Studio,1
6,Entire apartment in Saint-Germain-des-Prés - O...,Entire apartment,Saint-Germain-des-Prés - Odéon,JARDIN DU LUXEMBOURG / Bail Mobilité,0.0,0,4670,3135,1535,4,2,Bedroom,2
7,Entire apartment in Le Marais,Entire apartment,Le Marais,COEUR du MARAIS Tres Bel Appartement avec Balcon,4.91,44,5722,3201,2521,4,1,Bedroom,2
8,Entire serviced apartment in XVII Arrondissement,Entire serviced apartment,XVII Arrondissement,Chambre séjours longue durée - sans cuisine,4.67,3,3400,1163,2237,2,1,Bedroom,1
9,Entire condominium in Nanterre,Entire condominium,Nanterre,Cosy flat close to tourist sites,4.29,7,1025,1025,0,2,1,Bedroom,1


In [91]:
new_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 280 entries, 0 to 279
Data columns (total 13 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Hotel_Type_with_Location  280 non-null    object 
 1   Hote_Type                 280 non-null    object 
 2   Location_in_Paris         280 non-null    object 
 3   Hotel_Name                280 non-null    object 
 4   Ratings                   280 non-null    float64
 5   No_of_Reviews             280 non-null    int32  
 6   Previous_Price            280 non-null    int32  
 7   Discounted_Price          280 non-null    int32  
 8   Discount                  280 non-null    int32  
 9   No_of_Guests              280 non-null    int32  
 10  No_of_Bedrooms            280 non-null    int32  
 11  Type_Bed_Studio           280 non-null    object 
 12  No_of_Beds                280 non-null    int32  
dtypes: float64(1), int32(7), object(5)
memory usage: 20.9+ KB
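
Before saving, a few invariants of the cleaned data can be checked automatically. The sketch below assumes a hypothetical function `check_cleaned`; the rules it tests follow from the steps above (no nulls, `Discount` equal to the price difference, ratings between 0 and 5). It is run here on two rows copied from the final dataframe.

```python
import pandas as pd

def check_cleaned(df: pd.DataFrame) -> None:
    """Raise AssertionError if a basic invariant of the cleaned data is broken."""
    assert not df.isna().any().any(), "unexpected missing values"
    assert (df["Discounted_Price"] <= df["Previous_Price"]).all(), "discounted price above previous price"
    expected_discount = df["Previous_Price"] - df["Discounted_Price"]
    assert (df["Discount"] == expected_discount).all(), "Discount does not match the prices"
    assert df["Ratings"].between(0, 5).all(), "rating outside the 0-5 range"

# Two rows copied from the final dataframe
sample = pd.DataFrame({
    "Ratings": [4.71, 0.0],
    "Previous_Price": [4087, 2207],
    "Discounted_Price": [2134, 2207],
    "Discount": [1953, 0],
})

check_cleaned(sample)
print("all checks passed")
```

In the notebook itself the call would be `check_cleaned(new_df)`.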


### Final Step: Saving the cleaned dataframe to a CSV file

In [92]:
new_df.to_csv(r"C:\Users\ktirumalelakshmana\Desktop\Paris_Airbnb_Final.csv", index = False)