# Improving the AI Product of Your Startup

## Enhancing User Engagement and Insights through AI-driven Collaboration

- Notebook by [Nasr-edine DRAI](https://www.hackerrank.com/d_nasredine)
- [Openclassrooms](https://openclassrooms.com/en/)


<div style="text-align:center;">
    <img src="../../imgs/custom_seg.jpeg" width="400" />
</div>


#### Introduction

Welcome to the project "Improving the AI Product of Your Startup." In this project, you are an AI Engineer working for the startup "Avis Restau," which connects customers with restaurants. Your company aims to enhance its platform by introducing a new collaboration feature, allowing users to post reviews and photos of their favorite restaurants. Additionally, the company wants to gain better insights into the user-posted reviews.

#### Problem Statement and Objective

The primary objective of this project is to conduct a feasibility study for two specific functionalities: detecting dissatisfaction topics in user comments and automatically labeling the photos posted on the platform. To achieve this, you need to analyze existing data and collect new data to train your AI models.

#### Overview of the Dataset

The Avis Restau platform does not yet hold enough data for this study, so we rely on existing Yelp data instead. The recommended source is the Yelp dataset, which contains general information about restaurants, including consumer reviews: https://www.yelp.com/dataset. The cells below also collect restaurant records for Paris from the Yelp API (`https://api.yelp.com/v3/businesses/search`) and store them in a CSV file.

In [29]:
import csv
import os

import requests

# Yelp business search endpoint; each request returns one page of 50 results
base_url = "https://api.yelp.com/v3/businesses/search"
offsets = [0, 50, 100, 150]

filename = "Drai_Nasredine_1_csv_062023.csv"
path = "../data/"
csv_file = path + filename

# Read the API key from the environment instead of hard-coding it in the notebook
# (YELP_API_KEY is an assumed variable name)
headers = {"Authorization": f"Bearer {os.environ['YELP_API_KEY']}"}

# Open the CSV file once in write mode, so re-running the cell does not append
# a second copy of the rows
with open(csv_file, mode="w", newline="", encoding="utf-8") as file:
    writer = None
    for offset in offsets:
        params = {
            "sort_by": "best_match",
            "limit": 50,
            "offset": offset,
            "location": "paris",
            "term": "restaurants",
        }
        response = requests.get(base_url, headers=headers, params=params)
        response.raise_for_status()  # stop on HTTP errors instead of parsing an error body
        data = response.json()

        # Write each business as a row
        for business in data["businesses"]:
            if writer is None:
                # Take the header from the first business; DictWriter keeps the
                # columns aligned when a later business lacks one of these keys
                writer = csv.DictWriter(
                    file,
                    fieldnames=list(business.keys()),
                    restval="",
                    extrasaction="ignore",
                )
                writer.writeheader()
            writer.writerow(business)

print("JSON data converted to CSV successfully.")


JSON data converted to CSV successfully.


In [18]:
import pandas as pd

In [20]:
restaurants = pd.read_csv('../notebooks/data.csv')
restaurants.head()

| | id | alias | name | image_url | is_closed | url | review_count | categories | rating | coordinates | transactions | price | location | phone | display_phone | distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0iLH7iQNYtoURciDpJf6w | le-comptoir-de-la-gastronomie-paris | Le Comptoir de la Gastronomie | https://s3-media3.fl.yelpcdn.com/bphoto/xT4YkC... | False | https://www.yelp.com/biz/le-comptoir-de-la-gas... | 1240 | [{'alias': 'french', 'title': 'French'}] | 4.5 | {'latitude': 48.8645157999652, 'longitude': 2.... | [] | €€ | {'address1': '34 rue Montmartre', 'address2': ... | 33142333132 | +33 1 42 33 31 32 | 370.827517 |
| 1 | IU9_wVOGBKjfqTTpAXpKcQ | bistro-des-augustins-paris | Bistro des Augustins | https://s3-media2.fl.yelpcdn.com/bphoto/ctHDHM... | False | https://www.yelp.com/biz/bistro-des-augustins-... | 471 | [{'alias': 'bistros', 'title': 'Bistros'}, {'a... | 4.5 | {'latitude': 48.854754, 'longitude': 2.342119} | [] | €€ | {'address1': '39 quai des Grands Augustins', '... | 33143540441 | +33 1 43 54 04 41 | 801.11761 |
| 2 | cEjF41ZQB8-SST8cd3EsEw | l-avant-comptoir-paris-3 | L'Avant Comptoir | https://s3-media3.fl.yelpcdn.com/bphoto/mVwgxg... | False | https://www.yelp.com/biz/l-avant-comptoir-pari... | 649 | [{'alias': 'tapas', 'title': 'Tapas Bars'}, {'... | 4.5 | {'latitude': 48.85202, 'longitude': 2.3388} | [] | €€ | {'address1': "3 carrefour de l'Odéon", 'addres... | 33142384755 | +33 1 42 38 47 55 | 1131.333887 |
| 3 | ijqSzadlZ9SCXvUEpMimcA | angelina-paris | Angelina | https://s3-media1.fl.yelpcdn.com/bphoto/LRnBdO... | False | https://www.yelp.com/biz/angelina-paris?adjust... | 1498 | [{'alias': 'breakfast_brunch', 'title': 'Break... | 4.0 | {'latitude': 48.865092, 'longitude': 2.328464} | [] | €€€ | {'address1': '226 rue de Rivoli', 'address2': ... | 33142608200 | +33 1 42 60 82 00 | 1059.877518 |
| 4 | WHHt_Jb8Tgidn9mW7oDnIg | la-coïncidence-paris-4 | La Coïncidence | https://s3-media2.fl.yelpcdn.com/bphoto/5O4QPn... | False | https://www.yelp.com/biz/la-co%C3%AFncidence-p... | 509 | [{'alias': 'french', 'title': 'French'}] | 4.5 | {'latitude': 48.868105, 'longitude': 2.284365} | [] | €€ | {'address1': '15 rue Mesnil', 'address2': '', ... | 33147559644 | +33 1 47 55 96 44 | 4281.588159 |
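
The `categories`, `coordinates`, and `location` columns come back from the CSV as plain strings that look like Python lists and dicts. The sketch below shows one way to turn such a cell back into a Python object with `ast.literal_eval`. The two cell values are copied from the output above; the helper names `parse_cell` and `category_titles` are hypothetical and not part of the original notebook.

```python
import ast

# Cell values copied from the output above; after the CSV round trip they are strings
categories_cell = "[{'alias': 'french', 'title': 'French'}]"
coordinates_cell = "{'latitude': 48.854754, 'longitude': 2.342119}"


def parse_cell(cell):
    """Turn the string form of a list or dict back into a Python object."""
    return ast.literal_eval(cell)


def category_titles(cell):
    """Return the human-readable titles stored in a `categories` cell."""
    return [category["title"] for category in parse_cell(cell)]


titles = category_titles(categories_cell)
coordinates = parse_cell(coordinates_cell)

print(titles)
print(coordinates["latitude"], coordinates["longitude"])
```

On the full DataFrame, the same helper could presumably be applied column-wise, for example `restaurants["categories"].apply(category_titles)`. The `...` endings in the table are display truncation only; the stored strings are complete.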


In [21]:
restaurants.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 16 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   id             200 non-null    object 
 1   alias          200 non-null    object 
 2   name           200 non-null    object 
 3   image_url      200 non-null    object 
 4   is_closed      200 non-null    bool   
 5   url            200 non-null    object 
 6   review_count   200 non-null    int64  
 7   categories     200 non-null    object 
 8   rating         200 non-null    float64
 9   coordinates    200 non-null    object 
 10  transactions   200 non-null    object 
 11  price          200 non-null    object 
 12  location       198 non-null    object 
 13  phone          194 non-null    object 
 14  display_phone  196 non-null    object 
 15  distance       176 non-null    float64
dtypes: bool(1), float64(2), int64(1), object(12)
memory usage: 23.8+ KB
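
Four columns in the summary above have fewer than 200 non-null values. As a small worked example, the sketch below turns the non-null counts reported by `info()` into missing-value counts and percentages. The numbers are copied from the output above; the variable names are illustrative.

```python
# Non-null counts copied from the restaurants.info() output above
total_rows = 200
non_null = {
    "location": 198,
    "phone": 194,
    "display_phone": 196,
    "distance": 176,
}

# Missing values per column, as a count and as a share of all rows
missing = {column: total_rows - count for column, count in non_null.items()}
missing_pct = {column: 100 * count / total_rows for column, count in missing.items()}

for column in non_null:
    print(f"{column}: {missing[column]} missing ({missing_pct[column]:.1f}%)")
```

On the DataFrame itself, `restaurants.isna().sum()` reports the same per-column counts directly.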


In [22]:
restaurants.describe()

| | review_count | rating | distance |
|---|---|---|---|
| count | 200.0 | 200.0 | 176.0 |
| mean | 141.435 | 4.3875 | 1460.749956 |
| std | 231.221567 | 0.34158 | 1000.435696 |
| min | 2.0 | 3.5 | 36.28048 |
| 25% | 23.75 | 4.0 | 651.681385 |
| 50% | 61.0 | 4.5 | 1204.674346 |
| 75% | 152.5 | 4.5 | 2036.758138 |
| max | 1920.0 | 5.0 | 4724.525002 |


In [25]:
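# Check for duplicate rows; duplicated() flags only the repeats,
# so the first occurrence of a duplicated row is not marked
duplicate_rows = restaurants.duplicated()

# Display the rows flagged as duplicates
restaurants[duplicate_rows]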


| | id | alias | name | image_url | is_closed | url | review_count | categories | rating | coordinates | transactions | price | location | phone | display_phone | distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 150 | eKBoT4coMRZdGmgHZuKHaw | le-petit-médicis-paris-3 | Le Petit Médicis | https://s3-media3.fl.yelpcdn.com/bphoto/3tpaQd... | False | https://www.yelp.com/biz/le-petit-m%C3%A9dicis... | 18 | [{'alias': 'french', 'title': 'French'}] | 4.5 | {'latitude': 48.84787, 'longitude': 2.3399} | [] | €€€ | {'address1': '13 rue de Médicis', 'address2': ... | 33143269163 | +33 1 43 26 91 63 | 1575.188414 |


In [24]:
# Show every occurrence of the duplicated business, including the first one
restaurants[restaurants.id == 'eKBoT4coMRZdGmgHZuKHaw']

| | id | alias | name | image_url | is_closed | url | review_count | categories | rating | coordinates | transactions | price | location | phone | display_phone | distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 149 | eKBoT4coMRZdGmgHZuKHaw | le-petit-médicis-paris-3 | Le Petit Médicis | https://s3-media3.fl.yelpcdn.com/bphoto/3tpaQd... | False | https://www.yelp.com/biz/le-petit-m%C3%A9dicis... | 18 | [{'alias': 'french', 'title': 'French'}] | 4.5 | {'latitude': 48.84787, 'longitude': 2.3399} | [] | €€€ | {'address1': '13 rue de Médicis', 'address2': ... | 33143269163 | +33 1 43 26 91 63 | 1575.188414 |
| 150 | eKBoT4coMRZdGmgHZuKHaw | le-petit-médicis-paris-3 | Le Petit Médicis | https://s3-media3.fl.yelpcdn.com/bphoto/3tpaQd... | False | https://www.yelp.com/biz/le-petit-m%C3%A9dicis... | 18 | [{'alias': 'french', 'title': 'French'}] | 4.5 | {'latitude': 48.84787, 'longitude': 2.3399} | [] | €€€ | {'address1': '13 rue de Médicis', 'address2': ... | 33143269163 | +33 1 43 26 91 63 | 1575.188414 |
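
Rows 149 and 150 describe the same business, so one of them can be dropped before further analysis. The notebook does not show this step; the sketch below is one possible way to do it with `drop_duplicates`, on a toy frame whose ids and names are copied from the outputs above and reduced to two columns. The name `restaurants_sample` is hypothetical.

```python
import pandas as pd

# Toy frame: ids and names copied from the outputs above, reduced to two columns
restaurants_sample = pd.DataFrame(
    {
        "id": [
            "ijqSzadlZ9SCXvUEpMimcA",
            "eKBoT4coMRZdGmgHZuKHaw",
            "eKBoT4coMRZdGmgHZuKHaw",
        ],
        "name": ["Angelina", "Le Petit Médicis", "Le Petit Médicis"],
    }
)

# keep="first" retains the first occurrence of each business id;
# reset_index renumbers the remaining rows from 0
deduplicated = (
    restaurants_sample
    .drop_duplicates(subset="id", keep="first")
    .reset_index(drop=True)
)

print(len(restaurants_sample), "->", len(deduplicated))
```

Deduplicating on `id` rather than on all columns is a design choice: it would also catch a business returned twice with slightly different field values, which a full-row comparison would miss.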
