## Project Title  
**Airbnb Top 10 Listings by Country**

***

## Objective  
Identify the **top 10** listings for each country using a combined score derived from listing rating and number of reviews.

***

## Dataset  
- Source file: `airbnb_v2.csv` with 12,805 rows and 23 columns.
- Core columns used: `id`, `name`, `rating`, `reviews`, `country`, `price`, `bathrooms`, `bedrooms`, `beds`, `guests`, `features`, `amenities`.

***

## Data Cleaning  
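- Remove unused columns (`Unnamed: 0`, `host_name`, `host_id`, `img_links`, `checkin`, `checkout`) to simplify the dataset.
- Convert `rating` to float (the placeholder value `New` becomes `0`) and `reviews` to integer (thousands separators removed).
- Strip leading and trailing whitespace from `country` so each country maps to exactly one category.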
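
The cleaning steps above can be sketched on a small, made-up frame. The sample values are assumptions chosen to mirror the cases the notebook handles (a `New` placeholder rating, a comma thousands separator, stray spaces around the country), and the helper name `clean_listings` is hypothetical rather than part of the project.

```python
import pandas as pd


def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy (`clean_listings` is a hypothetical helper name)."""
    out = df.copy()
    # Listings without a rating are assumed to be marked 'New'; score them as 0.
    out["rating"] = (
        out["rating"].astype(str).str.replace("New", "0", regex=False).astype(float)
    )
    # Review counts are assumed to use ',' as a thousands separator.
    out["reviews"] = (
        out["reviews"].astype(str).str.replace(",", "", regex=False).astype(int)
    )
    # Surrounding whitespace would otherwise split one country into several categories.
    out["country"] = out["country"].str.strip()
    return out


# Made-up rows for illustration only.
sample = pd.DataFrame({
    "rating": ["4.71", "New"],
    "reviews": ["1,024", "0"],
    "country": [" Turkey", "Turkey "],
})
cleaned = clean_listings(sample)
print(cleaned.dtypes)
```

Working on a copy keeps the raw frame available for comparison, which may help when checking how many rows each step changed.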

***

## Feature Engineering  
- Create a composite metric:  
  - `rating_and_reviews_filter = rating * reviews`  
  - This captures both listing quality (rating) and popularity (review volume) in a single score.
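
As a quick illustration of how the score behaves, the sketch below computes it for three rows taken from the sample table in this README. The column name follows the notebook's `rating_and_reviews_filter`; the variable names `listings` and `ranked` are only illustrative.

```python
import pandas as pd

# Three rows copied from the sample table in this README.
listings = pd.DataFrame({
    "name": ["Perla bungalov", "Sapanca Breathable Bungalow", "Bungalov Ev 2"],
    "rating": [4.71, 5.00, 0.00],
    "reviews": [64, 13, 0],
})

# Composite score: rating weighted by review volume.
listings["rating_and_reviews_filter"] = listings["rating"] * listings["reviews"]

# Highest score first.
ranked = listings.sort_values(by="rating_and_reviews_filter", ascending=False)
print(ranked)
```

Because the score is a plain product, a 4.71 listing with 64 reviews outranks a 5.00 listing with 13 reviews, and a listing with no reviews always scores 0.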

***

## Analysis Logic and Outputs  
- For each country in the dataset:  
  - Filter listings by `country`.  
  - Sort in descending order by `rating_and_reviews_filter`.  
  - Select the top 10 listings (`head(10)`).
- Export results:  
  - Save each country's top 10 listings to `csvs/<country>.csv`.
  - Archive all CSVs into `csvs.zip` for easy download and sharing.
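
The filter, sort, select, and export steps above could also be written with `groupby`, as in the sketch below. This is an alternative formulation under stated assumptions, not the notebook's own loop: the function name `export_top_n` is hypothetical, the demo rows (including the second country) are made up, and the output goes to a temporary directory instead of `csvs/`. It also passes `index=False`, which the notebook does not.

```python
import shutil
import tempfile
from pathlib import Path

import pandas as pd


def export_top_n(df, out_dir, n=10, score="rating_and_reviews_filter"):
    """Write one CSV per country holding its n highest-scoring listings.

    `export_top_n` is a hypothetical helper, not part of the notebook.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for country, group in df.groupby("country"):
        # Rank this country's listings by the composite score, best first.
        top = group.sort_values(by=score, ascending=False).head(n)
        path = out_dir / f"{country}.csv"
        top.to_csv(path, index=False)
        written.append(path)
    return written


# Made-up demo rows; n=2 keeps the example small.
demo = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "country": ["Turkey", "Turkey", "Turkey", "India"],
    "rating_and_reviews_filter": [301.44, 65.0, 128.96, 40.0],
})
workdir = Path(tempfile.mkdtemp())
files = export_top_n(demo, workdir / "csvs", n=2)

# Bundle the per-country CSVs into csvs.zip next to the folder.
archive = shutil.make_archive(str(workdir / "csvs"), "zip", str(workdir / "csvs"))
print(archive)
```

Grouping once avoids re-filtering the full frame for every country, which may matter little at 12,805 rows but keeps the loop body short.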

***

## Sample Data


| id       | name                                 | country | rating | reviews | price | bathrooms | bedrooms | beds | guests | rating_and_reviews_filter |
|----------|--------------------------------------|---------|--------|---------|-------|-----------|----------|------|--------|---------------------------|
| 49849504 | Perla bungalov                       | Turkey  | 4.71   | 64      | 8078  | 1         | 2        | 1    | 2      | 301.44                 |
| 49871422 | Sapanca Breathable Bungalow          | Turkey  | 5.00   | 13      | 11339 | 1         | 1        | 2    | 4      | 65.00                  |
| 51245886 | Bungalov Ev 2                        | Turkey  | 0.00   | 0       | 6673  | 1         | 1        | 1    | 2      | 0.00                   |
| 48650769 | CasaMia White Suite Treehouse        | Turkey  | 0.00   | 0       | 14729 | 1         | 1        | 2    | 2      | 0.00                   |
| 50765985 | Ladin Bungalow                       | Turkey  | 0.00   | 0       | 12312 | 1         | 1        | 1    | 2      | 0.00                   |
| 4047216  | Lavender House                       | Turkey  | 0.00   | 0       | 13655 | 1         | 1        | 2    | 8      | 0.00                   |
| 53192531 | New Chalets on Farm with Fireplace 6 | Turkey  | 4.93   | 15      | 12845 | 1         | 2        | 3    | 6      | 73.95                  |
| 53151582 | Sapancaguldibibugalov                | Turkey  | 4.96   | 26      | 8128  | 1         | 1        | 1    | 4      | 128.96                 |
| 48255254 | VAD BUNGALOV SAPANCA                 | Turkey  | 4.95   | 20      | 11289 | 1         | 1        | 2    | 4      | 99.00                  |
| 62024530 | 9297447077 Villa Tuba                | Turkey  | 5.00   | 8       | 22758 | 1         | 2        | 4    | 6      | 40.00                  |


### 1) Exploratory Data Analysis

In [None]:
import pandas as pd

In [None]:
df=pd.read_csv('/content/airbnb_v2.csv')
df.head()

#### 1.1) Check basic details

In [None]:
df.info()

#### 1.2) Check Null Values

In [None]:
df.isnull().sum()

#### 1.3) Check Duplicates

In [None]:
# Count rows whose id repeats an earlier row (0 means every id is unique)
print(df['id'].duplicated().sum())

#### 1.4) Removing the Unnecessary Columns

In [None]:
# Drop the columns that the analysis does not use
df=df.drop(columns=['Unnamed: 0','host_name','host_id','img_links','checkin','checkout'])

### 2) Data Cleaning

#### 2.1) Explore Necessary Columns

In [None]:
df['reviews'].unique()

In [None]:
df['rating'].unique()

In [None]:
df['country'].unique()

#### 2.2) Clean and Typecast Rating column

In [None]:
# 'New' appears in place of a numeric rating; treat it as 0 before casting
df['rating']=df['rating'].str.replace('New','0')
df['rating']=df['rating'].astype('float')


#### 2.3) Clean and Typecast Reviews column

In [None]:
df['reviews']=df['reviews'].str.replace(',','').astype('int')

#### 2.4) Clean Country column

In [None]:
df['country']=df['country'].str.strip()



### 3) Analyse The Data

#### 3.1) Create a new column rating_and_reviews_filter



In [None]:
df['rating_and_reviews_filter']=df['rating']*df['reviews']

#### 3.2) Find Top 10 Based on Ratings and Reviews For Each Country

In [None]:
!mkdir -p csvs

In [None]:
from tqdm.auto import tqdm

countries=df['country'].unique()

for country in tqdm(countries):
  # Keep only this country's listings
  data=df[df['country']==country]

  # Rank by the composite score and write the 10 best rows
  data.sort_values(by='rating_and_reviews_filter', ascending=False).head(10).to_csv('csvs/'+country+'.csv')

In [None]:
# Zip the csvs folder into csvs.zip
import shutil

shutil.make_archive('csvs', 'zip', 'csvs')