# Improving the AI Product of Your Startup

## Enhancing User Engagement and Insights through AI-driven Collaboration

- Notebook by [Nasr-edine DRAI](https://www.hackerrank.com/d_nasredine)
- [Openclassrooms](https://openclassrooms.com/en/)


<div style="text-align:center;">
    <img src="../../imgs/custom_seg.jpeg" width="400" />
</div>


#### Introduction

Welcome to the project "Improving the AI Product of Your Startup." In this project, you are an AI Engineer working for the startup "Avis Restau," which connects customers with restaurants. Your company aims to enhance its platform by introducing a new collaboration feature, allowing users to post reviews and photos of their favorite restaurants. Additionally, the company wants to gain better insights into the user-posted reviews.

#### Problem Statement and Objective

The primary objective of this project is to conduct a feasibility study for two specific functionalities: detecting dissatisfaction topics in user comments and automatically labeling the photos posted on the platform. To achieve this, you need to analyze existing data and collect new data to train your AI models.

#### Overview of the Dataset

The problem statement highlights that there is insufficient data available on the Avis Restau platform. Therefore, the solution is to utilize an existing dataset. The recommended dataset for this project is the Yelp dataset, which contains general information about restaurants, including consumer reviews. You can access the dataset through the following link: https://www.yelp.com/dataset

### Get restaurants

In [1]:

import sys
import os
import csv
import json
import requests

# Get the current working directory
current_dir = os.getcwd()
display(current_dir)
# Get the parent directory path
parent_dir = os.path.dirname(current_dir)
display(parent_dir)
# Add the parent directory to the Python path
sys.path.append(parent_dir)
import config  # Import the config module



'/Users/drainasr-edine/github/ingenieur_ia/P6_drai_nasr-edine/notebooks'

'/Users/drainasr-edine/github/ingenieur_ia/P6_drai_nasr-edine'

In [2]:
url = "https://api.yelp.com/v3/businesses/search"

# Define the Yelp API access token
access_token = config.API_KEY  # Access the API key from the config module

offset = 0
params = {
    "sort_by": "best_match",
    "limit": 50,
    "offset": offset,
    "location": "paris",
    "term": "restaurants"
}
headers = {
    "Authorization": f"Bearer {access_token}"    
}

# Define the file path and name
path = '../data/'
filename = "Drai_Nasredine_1_csv_062023.csv"
csv_file = path + filename

test = 0
for  i in range(4):
    response = requests.get(url, headers=headers, params=params)
    
    data = json.loads(response.text)
    offset += 50
    params['offset'] = offset
    # Open the CSV file in write mode
    with open(csv_file, mode='a', newline='') as file:
        writer = csv.writer(file)

        if test == 0:
            # Write the header row
            writer.writerow(data['businesses'][0].keys())
            test = 1
    
        # Write each business as a row
        for business in data['businesses']:
            writer.writerow(business.values())
    
print(f"Data successfully written to '{csv_file}'.")


Data successfully written to '../data/Drai_Nasredine_1_csv_062023.csv'.


In [4]:
import pandas as pd

In [5]:
restaurants = pd.read_csv(csv_file)
restaurants.head()

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,-0iLH7iQNYtoURciDpJf6w,le-comptoir-de-la-gastronomie-paris,Le Comptoir de la Gastronomie,https://s3-media3.fl.yelpcdn.com/bphoto/xT4YkC...,False,https://www.yelp.com/biz/le-comptoir-de-la-gas...,1240,"[{'alias': 'french', 'title': 'French'}]",4.5,"{'latitude': 48.8645157999652, 'longitude': 2....",[],€€,"{'address1': '34 rue Montmartre', 'address2': ...",33142333132,+33 1 42 33 31 32,370.827517
1,IU9_wVOGBKjfqTTpAXpKcQ,bistro-des-augustins-paris,Bistro des Augustins,https://s3-media2.fl.yelpcdn.com/bphoto/ctHDHM...,False,https://www.yelp.com/biz/bistro-des-augustins-...,472,"[{'alias': 'bistros', 'title': 'Bistros'}, {'a...",4.5,"{'latitude': 48.854754, 'longitude': 2.342119}",[],€€,"{'address1': '39 quai des Grands Augustins', '...",33143540441,+33 1 43 54 04 41,801.11761
2,cEjF41ZQB8-SST8cd3EsEw,l-avant-comptoir-paris-3,L'Avant Comptoir,https://s3-media3.fl.yelpcdn.com/bphoto/mVwgxg...,False,https://www.yelp.com/biz/l-avant-comptoir-pari...,649,"[{'alias': 'tapas', 'title': 'Tapas Bars'}, {'...",4.5,"{'latitude': 48.85202, 'longitude': 2.3388}",[],€€,"{'address1': ""3 carrefour de l'Odéon"", 'addres...",33142384755,+33 1 42 38 47 55,1131.333887
3,WHHt_Jb8Tgidn9mW7oDnIg,la-coïncidence-paris-4,La Coïncidence,https://s3-media2.fl.yelpcdn.com/bphoto/5O4QPn...,False,https://www.yelp.com/biz/la-co%C3%AFncidence-p...,509,"[{'alias': 'french', 'title': 'French'}]",4.5,"{'latitude': 48.868105, 'longitude': 2.284365}",[],€€,"{'address1': '15 rue Mesnil', 'address2': '', ...",33147559644,+33 1 47 55 96 44,4281.588159
4,wLgAxIB7111BcWLWh7KpFw,la-régalade-paris-3,La Régalade,https://s3-media3.fl.yelpcdn.com/bphoto/f_-Xgg...,False,https://www.yelp.com/biz/la-r%C3%A9galade-pari...,101,"[{'alias': 'french', 'title': 'French'}]",4.5,"{'latitude': 48.8616441182389, 'longitude': 2....",[],€€€,"{'address1': '106 rue Saint-Honoré', 'address2...",33142219240,+33 1 42 21 92 40,36.28048


In [6]:
restaurants.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 16 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   id             200 non-null    object 
 1   alias          200 non-null    object 
 2   name           200 non-null    object 
 3   image_url      200 non-null    object 
 4   is_closed      200 non-null    bool   
 5   url            200 non-null    object 
 6   review_count   200 non-null    int64  
 7   categories     200 non-null    object 
 8   rating         200 non-null    float64
 9   coordinates    200 non-null    object 
 10  transactions   200 non-null    object 
 11  price          200 non-null    object 
 12  location       199 non-null    object 
 13  phone          194 non-null    object 
 14  display_phone  195 non-null    object 
 15  distance       170 non-null    float64
dtypes: bool(1), float64(2), int64(1), object(12)
memory usage: 23.8+ KB


In [7]:
restaurants.describe()

Unnamed: 0,review_count,rating,distance
count,200.0,200.0,170.0
mean,128.855,4.375,1473.475752
std,210.993234,0.357089,1075.663552
min,1.0,3.5,24.624349
25%,22.0,4.0,603.481127
50%,53.0,4.5,1144.011124
75%,138.75,4.5,2133.960636
max,1920.0,5.0,4724.525002


In [8]:
# Check for duplicate rows
duplicate_rows = restaurants.duplicated()

# Print the duplicate rows
restaurants[duplicate_rows]


Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
50,xCkh4Sc0Ih6YAf4hEcrqQA,le-comptoir-paris-12,Le Comptoir,https://s3-media1.fl.yelpcdn.com/bphoto/DEg2ak...,False,https://www.yelp.com/biz/le-comptoir-paris-12?...,125,"[{'alias': 'bistros', 'title': 'Bistros'}, {'a...",4.5,"{'latitude': 48.8295898, 'longitude': 2.33391}",[],€€,"{'address1': '18 avenue René Coty', 'address2'...",+33143226191,+33 1 43 22 61 91,3652.456213
51,MN-I5rJBYAZwp2jeHXs_JQ,l-alsacien-paris,L'Alsacien,https://s3-media3.fl.yelpcdn.com/bphoto/Um1ian...,False,https://www.yelp.com/biz/l-alsacien-paris?adju...,67,"[{'alias': 'beerbar', 'title': 'Beer Bar'}, {'...",4.5,"{'latitude': 48.8582259709568, 'longitude': 2....",[],€€,"{'address1': '6 rue Saint-Bon', 'address2': ''...",+33142776422,+33 1 42 77 64 22,734.205149
52,oBry7omEm5kavp_l86AjCA,khajuraho-paris,Khajuraho,https://s3-media1.fl.yelpcdn.com/bphoto/WKFYGB...,False,https://www.yelp.com/biz/khajuraho-paris?adjus...,22,"[{'alias': 'indpak', 'title': 'Indian'}]",5.0,"{'latitude': 48.861542, 'longitude': 2.310126}",[],"{'address1': '14 bd de la Tour Maubourg', 'add...",+33142732918,+33 1 42 73 29 18,2342.938967141968,


### Get Reviews

In [9]:
import csv
import json
import requests

# Get the list of restaurant IDs
restaurant_ids = restaurants.id.to_list()[:5]

# Define the file path and name
path = '../data/'
filename = 'Drai_Nasredine_2_csv_062023.csv'
csv_file = path + filename
test = 0
# Define the Yelp API access token
access_token = config.API_KEY  # Access the API key from the config module

# Open the CSV file in append mode
with open(csv_file, mode='a', newline='') as file:
    writer = csv.writer(file)

    # Iterate over the restaurant IDs
    for id in restaurant_ids:
        url = f"https://api.yelp.com/v3/businesses/{id}/reviews"
        headers = {'Authorization': f'Bearer {access_token}'}
        response = requests.get(url, headers=headers)
        data = response.json()

        # Write the header row if it's the first restaurant
        if test == 0 and 'reviews' in data:
            writer.writerow(['restaurant_id'] + list(data['reviews'][0].keys()))
            test = 1

        # Write each review as a row
        if 'reviews' in data:
            for review in data['reviews']:
                writer.writerow([id] + list(review.values()))

print(f"Data successfully written to '{csv_file}'.")


Data successfully written to '../data/Drai_Nasredine_2_csv_062023.csv'.
