# McDonald's Review Analysis

# 📁 1. Project Definition

### General objective:
Analyze customer reviews of McDonald's stores to identify insights about customer satisfaction, store performance, positive/negative keywords, and hidden patterns that help improve the customer experience.

### Guiding questions (can be changed or extended):
- How satisfied are customers overall?
- Which stores have the best and worst ratings?
- Which words are associated with good or bad reviews?
- Can users be segmented by their reviewing behavior?
- Are there relationships between the rating and keywords in the text?
- Are there patterns in the geographic distribution of reviews?

---

# 🧹 2. Data Loading and Cleaning

### Checklist:
- Import the required libraries (pandas, numpy, matplotlib, seaborn, nltk, sklearn, etc.)
- Load the dataset
- Explore the structure (`head()`, `info()`, `describe()`)
- Check for null values and handle them
- Remove duplicates, if any
- Normalize column names
- Parse dates (if applicable)
- Create new columns where needed (e.g. text length, binary rating, etc.)

---

# 🧼 3. Text Cleaning (if there is free text)

### Checklist:
- Convert to lowercase
- Remove punctuation and symbols
- Remove stopwords
- Lemmatize or stem
- Build a TF-IDF or CountVectorizer matrix
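
The first three checklist items can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual cleaning step: the `STOPWORDS` set and the `clean_text` name are hypothetical, and lemmatization and vectorization are left out.

```python
import string

# Hypothetical, deliberately tiny stopword list; a real run would use a fuller
# one (for example the English list shipped with nltk).
STOPWORDS = {"the", "a", "an", "is", "was", "were", "and", "it", "to", "of"}

def clean_text(text):
    """Lowercase, strip punctuation, and drop stopwords; returns a list of tokens."""
    lowered = str(text).lower()
    # Replace every punctuation character with a space so "good,but" splits in two.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = lowered.translate(table).split()
    return [tok for tok in tokens if tok not in STOPWORDS]

print(clean_text("The fries were GREAT, and the staff was friendly!!!"))
```

On a DataFrame, a function like this could be applied to the review column with `apply`, storing the tokens in a new column.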

---

# 📊 4. Exploratory Data Analysis (EDA)

### Key plots:
- Rating distribution (`sns.countplot`)
- Top stores by review volume
- Top stores with the best/worst average rating
- Word cloud of positive and negative reviews
- Correlation plot (if applicable)
- Review length vs. rating
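
The best/worst average ranking hides one design choice: a store with a single 5-star review should probably not top the list. The sketch below shows the aggregation in plain Python on toy data. The `average_rating_by_store` name and the `min_reviews` cut-off are assumptions made for illustration.

```python
from collections import defaultdict

def average_rating_by_store(rows, min_reviews=2):
    """rows: iterable of (store, rating_score) pairs.
    Returns (store, mean, count) tuples sorted from best to worst mean."""
    totals = defaultdict(lambda: [0.0, 0])
    for store, score in rows:
        totals[store][0] += score
        totals[store][1] += 1
    ranked = [
        (store, total / count, count)
        for store, (total, count) in totals.items()
        if count >= min_reviews  # skip stores with too few reviews
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Toy data, not taken from the dataset.
sample = [("Store A", 5), ("Store A", 4), ("Store B", 1), ("Store B", 2), ("Store C", 5)]
ranking = average_rating_by_store(sample)
print(ranking)
```

Here "Store C" is dropped because it has only one review; the first and last entries of `ranking` are the best and worst stores.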

---

# 🔎 5. Text Analysis

- Word cloud of frequent words by category (positive/negative)
- Frequent bigrams or trigrams
- Association between specific words and ratings
- Sentiment classifier with Naive Bayes or Logistic Regression
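
Counting frequent bigrams or trigrams needs little more than a counter over consecutive tokens. A minimal sketch, assuming simple whitespace tokenization; `top_ngrams` is a hypothetical helper name and the reviews are toy examples.

```python
from collections import Counter

def top_ngrams(texts, n=2, k=3):
    """Count n-grams of whitespace-separated lowercase tokens across all texts."""
    counts = Counter()
    for text in texts:
        tokens = str(text).lower().split()
        # zip over n shifted copies of the token list yields consecutive n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)

reviews = ["slow service again", "very slow service", "great fries"]
print(top_ngrams(reviews, n=2, k=1))
```

Running the same function separately on positive and negative reviews gives the per-category comparison listed above.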

---

# 📊 6. Modeling (basic Machine Learning)

### Options:
- Clustering of stores or reviews (KMeans)
- Classification of reviews (positive/negative) with ML
- Dimensionality reduction with PCA/t-SNE to visualize clusters
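
To make the classification option concrete, here is a small multinomial Naive Bayes written with the standard library only. It is a teaching sketch on toy data, with hypothetical function names; a real run would more likely use a library implementation such as the one in scikit-learn.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized documents."""
    class_docs = Counter(labels)
    word_counts = {label: Counter() for label in class_docs}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_docs, word_counts, vocab

def predict_nb(model, doc):
    """Return the label with the highest log-posterior (Laplace smoothing, alpha=1)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in doc.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data: 1 = positive, 0 = negative.
docs = ["great friendly service", "fresh and great food", "rude staff", "cold food and rude staff"]
labels = [1, 1, 0, 0]
model = train_nb(docs, labels)
print(predict_nb(model, "great food"), predict_nb(model, "rude service"))
```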

---

# 📍 7. Dashboard / Final Visualization

_(Optional, but strong for a CV)_

- Build an interactive dashboard with Plotly Dash, Streamlit, or Power BI
- Visualize: heat maps by store, reviews by region, average rating, text analysis

---

# 📝 8. Conclusions & Recommendations

- Summarize the insights found
- Recommend improvements, e.g. stores with poor service, red flags, keywords to avoid
- Prepare a final slide for presentation (PDF or PPT in Notion)

---

# 💾 9. Delivery / Publication of the Project

- Upload to GitHub with a professional README
- Export the clean dataset and well-commented notebooks
- Publish a summary on LinkedIn, with images of the visualizations
- (Optional) Upload a video to TikTok/Instagram showing the analysis and findings 🔥

---



# Loading and Cleaning

## Data Loading and Exploration
We load the data and look at how it is distributed.

We use the dataset Exploring Customer Sentiments in McDonald's US Store Reviews.

This dataset contains more than 33,000 anonymous reviews of McDonald's stores in the United States, collected from Google Reviews. It offers insight into customers' experiences and opinions across McDonald's locations throughout the country.

The dataset includes information such as:

- Store names

- Categories

- Addresses

- Geographic coordinates

- Review ratings

- Review text

- When each review was posted (as relative times, e.g. "3 months ago")

| 🏷️ Column          | 🗣️ Meaning         | 💡 Quick explanation                                                      |
| ------------------- | ------------------- | ------------------------------------------------------------------------- |
| `reviewer_id`       | Reviewer ID         | Numeric identifier of the (anonymous) reviewer                            |
| `store_name`        | Store name          | Name of the restaurant ("McDonald's")                                     |
| `category`          | Category            | Type of business, e.g. "Fast food restaurant"                             |
| `store_address`     | Store address       | Full address of the store, including city, state and ZIP code             |
| `latitude `         | Latitude            | Geographic coordinate for maps (the raw header has a trailing space)      |
| `longitude`         | Longitude           | Geographic coordinate for maps                                            |
| `rating_count`      | Number of ratings   | Number of ratings the store has received (stored as text)                 |
| `review_time`       | Review time         | When the review was posted, as a relative time (e.g. "3 months ago")      |
| `review`            | Review text         | Opinion written by the user                                               |
| `rating`            | Rating              | Score given by the user, as text ("1 star" to "5 stars")                  |


We work in Google Colab and load the data from Google Drive, where the dataset is stored.

First, install the libraries that will be used.

In [None]:
!pip install pandas
!pip install numpy
!pip install matplotlib
!pip install seaborn
!pip install nltk
!pip install scikit-learn


In [None]:
# Import the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import nltk

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
df=pd.read_csv('/content/drive/MyDrive/McDonald_s_Reviews.csv', encoding='latin1')
display(df)

Unnamed: 0,reviewer_id,store_name,category,store_address,latitude,longitude,rating_count,review_time,review,rating
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,a month ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star
...,...,...,...,...,...,...,...,...,...,...
33391,33392,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,4 years ago,They treated me very badly.,1 star
33392,33393,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,a year ago,The service is very good,5 stars
33393,33394,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,a year ago,To remove hunger is enough,4 stars
33394,33395,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,5 years ago,"It's good, but lately it has become very expen...",5 stars


In [None]:
display(df.head())
display(df.info())
display(df.describe())

Unnamed: 0,reviewer_id,store_name,category,store_address,latitude,longitude,rating_count,review_time,review,rating
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,a month ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 33396 entries, 0 to 33395
Data columns (total 10 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   reviewer_id    33396 non-null  int64  
 1   store_name     33396 non-null  object 
 2   category       33396 non-null  object 
 3   store_address  33396 non-null  object 
 4   latitude       32736 non-null  float64
 5   longitude      32736 non-null  float64
 6   rating_count   33396 non-null  object 
 7   review_time    33396 non-null  object 
 8   review         33396 non-null  object 
 9   rating         33396 non-null  object 
dtypes: float64(2), int64(1), object(7)
memory usage: 2.5+ MB


None

Unnamed: 0,reviewer_id,latitude,longitude
count,33396.0,32736.0,32736.0
mean,16698.5,34.442546,-90.647033
std,9640.739131,5.344116,16.594844
min,1.0,25.790295,-121.995421
25%,8349.75,28.65535,-97.792874
50%,16698.5,33.931261,-81.471414
75%,25047.25,40.727401,-75.399919
max,33396.0,44.98141,-73.45982


In [None]:
df.shape

(33396, 10)

Check for null values

In [None]:
print("Null values", df.isnull().sum())
print("NA values ", df.isna().sum())

Null values reviewer_id        0
store_name         0
category           0
store_address      0
latitude         660
longitude        660
rating_count       0
review_time        0
review             0
rating             0
dtype: int64
NA values  reviewer_id        0
store_name         0
category           0
store_address      0
latitude         660
longitude        660
rating_count       0
review_time        0
review             0
rating             0
dtype: int64


The only null values are in the latitude and longitude columns. Before dropping those rows, we check whether the coordinates can be recovered, for example from the store address, and filled in instead.
The steps are:

- First, show the `store_address` of the rows with null coordinates

In [None]:
# Show the addresses of the rows with missing latitude
display(df[df['latitude '].isnull()]['store_address'])
print(df[df['latitude '].isnull()]['store_address'].value_counts())

Unnamed: 0,store_address
22141,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
22142,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
22143,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
22144,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
22145,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
...,...
27719,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
27720,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
27721,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...
27722,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...


store_address
2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½    660
Name: count, dtype: int64


As the output shows, all 660 rows belong to a single store, whose address begins with "2476 Kal" (the rest is garbled by the encoding). Next, list every row for this store to see whether any of them already has coordinates in the DataFrame. If none does, the coordinates will be looked up externally, and if they cannot be found the rows will be dropped.

In [None]:
display(df[df["store_address"].str.contains("Kalï")])

Unnamed: 0,reviewer_id,store_name,category,store_address,latitude,longitude,rating_count,review_time,review,rating
22141,22142,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,3 months ago,Breakfast specials are good. The sausage burri...,4 stars
22142,22143,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,a year ago,This isn't your typical McDonald's. This place...,5 stars
22143,22144,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,2 weeks ago,This place was serving good quality breakfast ...,4 stars
22144,22145,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,a month ago,I understand this is a very busy location but ...,1 star
22145,22146,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,2 months ago,"When I arrived at McDonald's, it was very crow...",4 stars
...,...,...,...,...,...,...,...,...,...,...
27719,27720,ýýýMcDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,3 years ago,This McDonald's is across the street from Waik...,5 stars
27720,27721,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,11 months ago,"Seems like, they always makes some mistakes wh...",2 stars
27721,27722,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,11 months ago,Convenient to the east end of Kalakaua Ave. Lo...,4 stars
27722,27723,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,,,2175,11 months ago,"Lost McDonald's in Honolulu, if you can avoid ...",1 star


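None of these rows has coordinates, so they were looked up externally. The lookup for this store gave latitude 12.313036691208465 and longitude 76.64454211632345, which are used to fill the missing values below. Note that the reviews shown above mention Kalakaua Ave. and Honolulu, which places the store in Hawaii, whereas these coordinates fall in southern India, so they should be re-checked before any geographic analysis.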

In [None]:
df.dtypes

Unnamed: 0,0
reviewer_id,int64
store_name,object
category,object
store_address,object
latitude,float64
longitude,float64
rating_count,object
review_time,object
review,object
rating,object


In [None]:
df.isnull().sum()

Unnamed: 0,0
reviewer_id,0
store_name,0
category,0
store_address,0
latitude,660
longitude,660
rating_count,0
review_time,0
review,0
rating,0


In [None]:
df.columns

Index(['reviewer_id', 'store_name', 'category', 'store_address', 'latitude ',
       'longitude', 'rating_count', 'review_time', 'review', 'rating'],
      dtype='object')

In [None]:
# Replace the missing values (assign back instead of using inplace on a column selection)
df['latitude '] = df['latitude '].fillna(12.313036691208465)
df['longitude'] = df['longitude'].fillna(76.64454211632345)


In [None]:
display(df[df["store_address"].str.contains("Kalï")])

Unnamed: 0,reviewer_id,store_name,category,store_address,latitude,longitude,rating_count,review_time,review,rating
22141,22142,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,3 months ago,Breakfast specials are good. The sausage burri...,4 stars
22142,22143,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,a year ago,This isn't your typical McDonald's. This place...,5 stars
22143,22144,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,2 weeks ago,This place was serving good quality breakfast ...,4 stars
22144,22145,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,a month ago,I understand this is a very busy location but ...,1 star
22145,22146,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,2 months ago,"When I arrived at McDonald's, it was very crow...",4 stars
...,...,...,...,...,...,...,...,...,...,...
27719,27720,ýýýMcDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,3 years ago,This McDonald's is across the street from Waik...,5 stars
27720,27721,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,11 months ago,"Seems like, they always makes some mistakes wh...",2 stars
27721,27722,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,11 months ago,Convenient to the east end of Kalakaua Ave. Lo...,4 stars
27722,27723,McDonald's,Fast food restaurant,2476 Kalï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿...,12.313037,76.644542,2175,11 months ago,"Lost McDonald's in Honolulu, if you can avoid ...",1 star


All rows now have coordinate values.
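
Since the dataset covers US stores, a quick range check on the coordinates can catch a wrong fill value. The sketch below uses a rough bounding box for the United States, including Alaska and Hawaii. The limits and the `in_us_bounds` name are assumptions chosen for illustration, not part of the dataset.

```python
# Rough bounding box for the United States including Alaska and Hawaii.
# These limits are an approximation chosen for illustration.
US_LAT_RANGE = (18.0, 72.0)
US_LON_RANGE = (-180.0, -66.0)

def in_us_bounds(lat, lon):
    """True when (lat, lon) falls inside the approximate US bounding box."""
    return (US_LAT_RANGE[0] <= lat <= US_LAT_RANGE[1]
            and US_LON_RANGE[0] <= lon <= US_LON_RANGE[1])

print(in_us_bounds(30.460718, -97.792874))  # Austin store from the dataset
print(in_us_bounds(12.313037, 76.644542))   # values used to fill the missing rows
```

Any row for which the check returns `False` is worth inspecting before it is plotted on a map.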

Check for duplicate rows

In [None]:
# Count duplicate rows
df.duplicated().sum()

np.int64(0)

Normalize the column names

In [None]:
# Normalize the column names: lowercase, spaces replaced with underscores
df.columns=df.columns.str.lower().str.replace(' ','_')

In [None]:
df.head()

Unnamed: 0,reviewer_id,store_name,category,store_address,latitude_,longitude,rating_count,review_time,review,rating
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,a month ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star


In [None]:
df.columns


Index(['reviewer_id', 'store_name', 'category', 'store_address', 'latitude_',
       'longitude', 'rating_count', 'review_time', 'review', 'rating'],
      dtype='object')
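
The `latitude_` name above comes from the trailing space in the original header (`'latitude '`), which the space-to-underscore replacement turned into an underscore. Stripping surrounding whitespace first avoids this. A minimal sketch in plain Python, with `normalize_columns` as a hypothetical helper name:

```python
def normalize_columns(columns):
    """Strip surrounding whitespace, lowercase, and replace inner spaces with underscores."""
    return [name.strip().lower().replace(" ", "_") for name in columns]

print(normalize_columns(["latitude ", "store_address", "Rating Count"]))
```

With pandas, the equivalent would be to chain `.str.strip()` before `.str.lower()` on `df.columns`.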

In [None]:
# Rename some columns
df.rename(columns={
    'category': 'store_category',
    'latitude_': 'latitude',
    'rating_count': 'num_ratings',
}, inplace=True)

In [None]:
df.head()

Unnamed: 0,reviewer_id,store_name,store_category,store_address,latitude,longitude,num_ratings,review_time,review,rating
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,a month ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star


Next, normalize the columns needed for the analysis, such as `review_time` and `rating`,
storing the results in new columns.

In [None]:
df['rating_score'] = df['rating'].str.extract(r'(\d+\.?\d*)').astype(float)

In [None]:
df.columns

Index(['reviewer_id', 'store_name', 'store_category', 'store_address',
       'latitude', 'longitude', 'num_ratings', 'review_time', 'review',
       'rating', 'rating_score'],
      dtype='object')

In [None]:
df['rating_score']

Unnamed: 0,rating_score
0,1.0
1,4.0
2,1.0
3,5.0
4,1.0
...,...
33391,1.0
33392,5.0
33393,4.0
33394,5.0


In [None]:
df['review_time']=df['review_time'].str.replace('a month ago','1 months ago')

In [None]:
df['review_time']=df['review_time'].str.replace('a week ago','1 weeks ago')
df['review_time']=df['review_time'].str.replace('a day ago','1 days ago')
df['review_time']=df['review_time'].str.replace('a year ago','1 years ago')

In [None]:
import pandas as pd
import numpy as np
import re

# Set the snapshot date (the date on which we assume the data was scraped)
snapshot_date = pd.to_datetime("2023-06-29")

# Convert text such as "3 months ago" into a Timedelta
def convertir_tiempo_relativo(texto):
    if pd.isnull(texto):
        return np.nan

    # Find the number
    match = re.search(r'(\d+)', texto)
    if not match:
        return np.nan

    cantidad = int(match.group(1))
    texto = texto.lower()

    # Compute the approximate elapsed time (a month counts as 30 days, a year as 365)
    if "month" in texto:
        return pd.Timedelta(days=cantidad * 30)
    elif "week" in texto:
        return pd.Timedelta(weeks=cantidad)
    elif "day" in texto:
        return pd.Timedelta(days=cantidad)
    elif "year" in texto:
        return pd.Timedelta(days=cantidad * 365)
    else:
        return pd.NaT

# Apply the function and build the approximate review date
df['review_date'] = snapshot_date - df['review_time'].apply(convertir_tiempo_relativo)

# (Optional) Extract month and year for analysis
df['review_month'] = df['review_date'].dt.to_period('M')
df['review_year'] = df['review_date'].dt.year

# Show the result
print(df[['review_time', 'review_date', 'review_month', 'review_year']].head())


    review_time review_date review_month  review_year
0  3 months ago  2023-03-31      2023-03       2023.0
1    5 days ago  2023-06-24      2023-06       2023.0
2    5 days ago  2023-06-24      2023-06       2023.0
3  1 months ago  2023-05-30      2023-05       2023.0
4  2 months ago  2023-04-30      2023-04       2023.0
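
The conversion can also be written so that forms like "a month ago" are handled directly, without rewriting the strings first. The sketch below uses only the standard library and the same approximations (30-day months, 365-day years). The `relative_to_date` name, the regular expression, and the set of units are assumptions; strings in other formats return `None`.

```python
import re
from datetime import datetime, timedelta

SNAPSHOT = datetime(2023, 6, 29)  # same assumed scrape date as above

# Approximate length of each unit in days; months and years are rough averages.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

PATTERN = re.compile(r"^(a|\d+)\s+(day|week|month|year)s?\s+ago$")

def relative_to_date(text, snapshot=SNAPSHOT):
    """Convert strings like '3 months ago' or 'a year ago' to an approximate datetime."""
    match = PATTERN.match(str(text).strip().lower())
    if match is None:
        return None  # unrecognized format
    quantity, unit = match.groups()
    count = 1 if quantity == "a" else int(quantity)
    return snapshot - timedelta(days=count * UNIT_DAYS[unit])

print(relative_to_date("3 months ago"))
print(relative_to_date("a year ago"))
```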


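Texts such as "a month ago" were replaced above with "1 months ago" so that the number could be parsed. Rows whose relative time still could not be converted have no date; fill them with the snapshot date.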

In [None]:
# Replace the missing dates with a specific date
fecha_default = pd.Timestamp("2023-06-29")
df['review_date'] = df['review_date'].fillna(fecha_default)

In [None]:
print(df.columns)

Index(['reviewer_id', 'store_name', 'store_category', 'store_address',
       'latitude', 'longitude', 'num_ratings', 'review_time', 'review',
       'rating', 'rating_score', 'review_date', 'review_month', 'review_year'],
      dtype='object')


In [None]:
df.head()

Unnamed: 0,reviewer_id,store_name,store_category,store_address,latitude,longitude,num_ratings,review_time,review,rating,rating_score,review_date,review_month,review_year
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star,1.0,2023-03-31,2023-03,2023.0
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars,4.0,2023-06-24,2023-06,2023.0
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star,1.0,2023-06-24,2023-06,2023.0
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,1 months ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars,5.0,2023-05-30,2023-05,2023.0
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star,1.0,2023-04-30,2023-04,2023.0


In [None]:
print(df[df['review_time']=='1 days ago'])

       reviewer_id  store_name        store_category  \
5651          5652  McDonald's  Fast food restaurant   
10681        10682  McDonald's  Fast food restaurant   
13871        13872  McDonald's  Fast food restaurant   
16988        16989  McDonald's  Fast food restaurant   
16989        16990  McDonald's  Fast food restaurant   
22824        22825  McDonald's  Fast food restaurant   
24285        24286  McDonald's  Fast food restaurant   
25296        25297  McDonald's  Fast food restaurant   
26587        26588  McDonald's  Fast food restaurant   
26592        26593  McDonald's  Fast food restaurant   
26599        26600  McDonald's  Fast food restaurant   
27533        27534  McDonald's  Fast food restaurant   
27716        27717  McDonald's  Fast food restaurant   
27743        27744  McDonald's  Fast food restaurant   
28716        28717  McDonald's  Fast food restaurant   
29858        29859  McDonald's  Fast food restaurant   
30960        30961  McDonald's  Fast food restau

In [None]:
print(df[['review','rating']])

                                                  review   rating
0      Why does it look like someone spit on my food?...   1 star
1      It'd McDonalds. It is what it is as far as the...  4 stars
2      Made a mobile order got to the speaker and che...   1 star
3      My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...  5 stars
4      I repeat my order 3 times in the drive thru, a...   1 star
...                                                  ...      ...
33391                        They treated me very badly.   1 star
33392                           The service is very good  5 stars
33393                         To remove hunger is enough  4 stars
33394  It's good, but lately it has become very expen...  5 stars
33395                          they took good care of me  5 stars

[33396 rows x 2 columns]


In [None]:
display(df['review'].iloc[0])

'Why does it look like someone spit on my food?\nI had a normal transaction,  everyone was chill and polite, but now i dont want to eat this. Im trying not to think about what this milky white/clear substance is all over my food, i d*** sure am not coming back.'

#### Create new columns for text length, binary rating, etc.

1. Text length
This shows how elaborate the reviews are

In [None]:
df['review_length'] = df['review'].apply(lambda x: len(str(x)))

2. Word count
This gives the number of words used in each review

In [None]:
df['word_count']=df['review'].apply(lambda x: len(str(x).split()))

3. Presence of positive or negative words

We can flag whether the text contains any word from a list defined as "positive" or "negative". This is useful when there is no direct label for good or bad reviews.

In [None]:
positive_words=['good','great','excellent','amazing','friendly', 'clean', 'Super', 'favorite',
# 🔥 Food / Flavor
    'delicious', 'mouth-watering', 'flavorful', 'perfectly seasoned',
    'fresh ingredients', 'tasty', 'exquisite', 'savory',
    'well-balanced flavors', 'cooked to perfection',

    # 🌟 Service
    'attentive staff', 'friendly service', 'welcoming atmosphere',
    'excellent customer service', 'prompt and polite', 'made us feel at home',
    'professional and courteous', 'fast service', 'super accommodating',
    'warm and inviting',

    # 🛋️ Atmosphere / Decor
    'cozy vibe', 'great ambiance', 'beautiful decor', 'trendy and modern',
    'chill atmosphere', 'romantic setting', 'stylish interior',
    'instagram-worthy spot', 'clean and well-maintained', 'comfortable seating',

    # 💸 Price / Value
    'great value', 'worth every penny', 'affordable and tasty',
    'reasonably priced', 'quality food at a fair price', 'bang for your buck',
    'impressive portions', 'not overpriced',

    # ✨ Overall Experience
    'highly recommended', 'a hidden gem', 'will definitely come back',
    'memorable dining experience', 'five-star experience',
    'exceeded expectations', 'top-notch', 'best meal I’ve ever had',
    'a must-visit spot', 'consistently amazing']

negative_words=['bad','worst','terrible', 'badly',
    # 🤢 Food / Flavor
    'bland', 'tasteless', 'overcooked', 'undercooked',
    'too salty', 'too greasy', 'dry', 'soggy',
    'lack of flavor', 'stale',

    # 🙄 Service
    'rude staff', 'slow service', 'unfriendly', 'unprofessional',
    'ignored us', 'bad attitude', 'unhelpful', 'inattentive',
    'took forever', 'poor customer service',

    # 🪫 Atmosphere / Decor
    'dirty', 'noisy', 'uncomfortable', 'bad lighting',
    'cramped space', 'outdated decor', 'too loud', 'not clean',
    'weird smell', 'poor hygiene',

    # 💸 Price / Value
    'overpriced', 'not worth the money', 'expensive and bad quality',
    'small portions', 'poor value', 'rip-off', 'felt cheated',
    'waste of money',

    # 😤 Overall Experience
    'disappointing', 'never coming back', 'worst experience',
    'would not recommend', 'low quality', 'below expectations',
    'nothing special', 'frustrating experience', 'bad impression',
    'regret going there'
]


In [None]:
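def has_positive_words(text):
    # Lowercase both sides so capitalized list entries such as 'Super' can match
    text = str(text).lower()
    return any(word.lower() in text for word in positive_words)

def has_negative_words(text):
    text = str(text).lower()
    return any(word.lower() in text for word in negative_words)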

df['has_positive_words'] = df['review'].apply(has_positive_words)
df['has_negative_words'] = df['review'].apply(has_negative_words)
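
Plain substring matching also fires inside longer words: "dry" is found in "laundry", for example. A possible refinement is to match whole words and phrases with a regular expression. This is a sketch of that alternative, not the method used above; `build_matcher` and `has_match` are hypothetical names and the phrase list is a small excerpt.

```python
import re

def build_matcher(phrases):
    """Compile one regex that matches any phrase as a whole word or phrase."""
    # Longer phrases first, so "slow service" is tried before any shorter entry.
    escaped = sorted((re.escape(p.lower()) for p in phrases), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

def has_match(text, matcher):
    """True when the text contains at least one of the phrases."""
    return matcher.search(str(text)) is not None

negative_matcher = build_matcher(["dry", "bad", "slow service"])

print(has_match("The laundry room smelled odd", negative_matcher))
print(has_match("The bun was DRY.", negative_matcher))
```

Compiling the pattern once and reusing it for every review also avoids scanning the whole word list per row.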

4. Binary rating based on the numeric score (if available)


In [None]:
df['binary_rating'] = df['rating_score'].apply(lambda x: 1 if x >= 4 else 0)
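
Note that with this rule a missing score compares as false and is therefore labeled 0, the same as a low rating. If that distinction matters, a variant can keep missing scores missing. A minimal sketch, where the `binary_rating` function name and the `threshold` parameter are illustrative assumptions:

```python
import math

def binary_rating(score, threshold=4):
    """1 for scores at or above threshold, 0 below, None when the score is missing."""
    if score is None or (isinstance(score, float) and math.isnan(score)):
        return None
    return 1 if score >= threshold else 0

print([binary_rating(s) for s in (5.0, 4.0, 3.0, float("nan"))])
```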

In [None]:
df.head()

Unnamed: 0,reviewer_id,store_name,store_category,store_address,latitude,longitude,num_ratings,review_time,review,rating,rating_score,review_date,review_month,review_year,review_length,word_count,has_positive_words,has_negative_words,binary_rating
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star,1.0,2023-03-31,2023-03,2023.0,259,51,False,False,0
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars,4.0,2023-06-24,2023-06,2023.0,237,42,True,False,1
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star,1.0,2023-06-24,2023-06,2023.0,415,70,False,False,0
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,1 months ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars,5.0,2023-05-30,2023-05,2023.0,176,13,False,False,1
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star,1.0,2023-04-30,2023-04,2023.0,312,68,False,False,0


In [None]:
df.describe()

Unnamed: 0,reviewer_id,latitude,longitude,rating_score,review_date,review_year,review_length,word_count,binary_rating
count,33396.0,33396.0,33396.0,33396.0,33396,33389.0,33396.0,33396.0,33396.0
mean,16698.5,34.005204,-87.340876,3.131363,2020-08-27 11:20:43.118936576,2020.154871,125.903042,22.096269,0.480926
min,1.0,12.313037,-121.995421,1.0,2011-07-02 00:00:00,2011.0,1.0,1.0,0.0
25%,8349.75,28.65535,-97.792874,1.0,2019-06-30 00:00:00,2019.0,17.0,3.0,0.0
50%,16698.5,33.931261,-81.461242,3.0,2020-06-29 00:00:00,2020.0,63.0,11.0,0.0
75%,25047.25,40.727401,-75.399919,5.0,2022-06-29 00:00:00,2022.0,160.0,28.0,1.0
max,33396.0,44.98141,76.644542,5.0,2023-06-29 00:00:00,2023.0,3115.0,589.0,1.0
std,9640.739131,6.12228,28.497791,1.615139,,1.840893,184.000878,33.127654,0.499644


In [None]:
# Count the missing values in review_date
prueba = df['review_date'].isna().sum()
prueba



np.int64(0)

In [None]:
print("Null values", df['review_date'].isnull())

Null values 0        False
1        False
2        False
3        False
4        False
         ...  
33391    False
33392    False
33393    False
33394    False
33395    False
Name: review_date, Length: 33396, dtype: bool


In [None]:
display(df[df['review_date'].isnull()]['review_time'])

Unnamed: 0,review_time
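
The checks above look at `review_date` only. A per-column count gives the whole picture in one call; the `describe()` output above, for instance, reports 33389 non-null `review_year` values against 33396 rows. A minimal sketch on a hypothetical toy frame (values invented for illustration):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "review_date": pd.to_datetime(["2023-03-31", "2023-06-24", "2019-06-30"]),
    "review_year": [2023.0, np.nan, 2019.0],
})

# Missing values per column, keeping only the columns that have any
null_counts = toy.isna().sum()
print(null_counts[null_counts > 0])

# Rows where the year is missing, for manual inspection
rows_missing_year = toy[toy["review_year"].isna()]
print(rows_missing_year)
```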


## 3. Text Cleaning (if there is free text)

Checklist:
- Convert to lowercase
- Remove punctuation and symbols
- Remove stopwords
- Lemmatize or stem
- Build a TF-IDF or CountVectorizer matrix

In [None]:
# Convert the text to lowercase
df['review_limpio'] = df['review'].str.lower()
df['review_limpio']

Unnamed: 0,review_limpio
0,why does it look like someone spit on my food?...
1,it'd mcdonalds. it is what it is as far as the...
2,made a mobile order got to the speaker and che...
3,my mc. crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...
4,"i repeat my order 3 times in the drive thru, a..."
...,...
33391,they treated me very badly.
33392,the service is very good
33393,to remove hunger is enough
33394,"it's good, but lately it has become very expen..."


In [None]:
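# Remove punctuation and symbols.
# Caution: Series.str.replace defaults to regex=False in pandas 2.0+, so this
# pattern is matched as a literal string and punctuation survives (see the output);
# pass regex=True to apply it as a regular expression.
df['review_limpio']=df['review_limpio'].str.replace(r'[^\w\s]','')
# Remove the ½ characters left by mis-decoded text
df['review_limpio']=df['review_limpio'].str.replace('½','')
# Remove ¿
df['review_limpio']=df['review_limpio'].str.replace('¿','')
# Remove Â
df['review_limpio']=df['review_limpio'].str.replace('Â','')
df['review_limpio']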

Unnamed: 0,review_limpio
0,why does it look like someone spit on my food?...
1,it'd mcdonalds. it is what it is as far as the...
2,made a mobile order got to the speaker and che...
3,my mc. crispy chicken sandwich was ïïïïïïïïïïï...
4,"i repeat my order 3 times in the drive thru, a..."
...,...
33391,they treated me very badly.
33392,the service is very good
33393,to remove hunger is enough
33394,"it's good, but lately it has become very expen..."
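
The output above still contains punctuation ("food?", "mcdonalds."). In pandas 2.0 and later, `Series.str.replace` defaults to `regex=False`, so `'[^\w\s]'` is searched for as a literal string. A minimal sketch of the regex variant on hypothetical sample strings; restricting the result to ASCII is an extra assumption made here to drop the leftover `ï` debris, and it would also strip legitimate accented letters:

```python
import pandas as pd

# Hypothetical sample strings modelled on the reviews shown above
sample = pd.Series([
    "why does it look like someone spit on my food?",
    "it'd mcdonalds. it is what it is",
    "sandwich was ï¿½ï¿½ great",
])

cleaned = (
    sample
    .str.replace(r"[^\w\s]", "", regex=True)       # drop punctuation and symbols
    .str.replace(r"[^\x00-\x7f]", "", regex=True)  # assumption: drop non-ASCII debris
    .str.replace(r"\s+", " ", regex=True)          # collapse repeated whitespace
    .str.strip()
)
print(cleaned.tolist())
```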


In [None]:
# Remove stopwords
import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')
stop = set(stopwords.words('english'))  # set membership is faster than list lookup
df['review_limpio'] = df['review_limpio'].apply(lambda x: ' '.join(word for word in x.split() if word not in stop))
df['review_limpio']


[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Unnamed: 0,review_limpio
0,look like someone spit food? normal transactio...
1,mcdonalds. far food atmosphere go. staff make ...
2,made mobile order got speaker checked in. line...
3,mc. crispy chicken sandwich ïïïïïïïïïïïïïïïïïï...
4,"repeat order 3 times drive thru, still manage ..."
...,...
33391,treated badly.
33392,service good
33393,remove hunger enough
33394,"good, lately become expensive."
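
Because the text is split on whitespace only, a stopword with punctuation attached (such as "in." in row 2 above) is not recognized and stays in the text. A minimal sketch that strips surrounding punctuation from each token before the lookup; the mini stopword set, the helper `remove_stopwords`, and the sample sentence are hypothetical stand-ins, and the notebook itself uses `stopwords.words('english')`:

```python
import string

# Hypothetical mini stopword set, used here to avoid the NLTK download
stop_sample = {"a", "and", "in", "the", "to"}

def remove_stopwords(text, stop_set):
    """Drop tokens whose punctuation-stripped form is a stopword."""
    kept = [tok for tok in text.split()
            if tok.strip(string.punctuation) not in stop_set]
    return " ".join(kept)

text = "made a mobile order got to the speaker and checked in. line"
print(remove_stopwords(text, stop_sample))
```

Note that a dropped token takes its attached punctuation with it, which is harmless when punctuation is removed anyway in a later step.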


In [None]:
import spacy

# Load the English model
nlp = spacy.load("en_core_web_sm")

# Lemmatize a text, dropping stopwords and punctuation
def lemmatize_text(text):
    doc = nlp(text)
    lemmatized = [token.lemma_ for token in doc if not token.is_stop and not token.is_punct]
    return " ".join(lemmatized)

# Apply to the DataFrame
df['lemmatized_review'] = df['review_limpio'].apply(lemmatize_text)


In [None]:
df

Unnamed: 0,reviewer_id,store_name,store_category,store_address,latitude,longitude,num_ratings,review_time,review,rating,...,review_date,review_month,review_year,review_length,word_count,has_positive_words,has_negative_words,binary_rating,review_limpio,lemmatized_review
0,1,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,3 months ago,Why does it look like someone spit on my food?...,1 star,...,2023-03-31,2023-03,2023.0,259,51,False,False,0,look like someone spit food? normal transactio...,look like spit food normal transaction chill p...
1,2,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,It'd McDonalds. It is what it is as far as the...,4 stars,...,2023-06-24,2023-06,2023.0,237,42,True,False,1,mcdonalds. far food atmosphere go. staff make ...,mcdonald far food atmosphere staff difference ...
2,3,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,5 days ago,Made a mobile order got to the speaker and che...,1 star,...,2023-06-24,2023-06,2023.0,415,70,False,False,0,made mobile order got speaker checked in. line...,mobile order get speaker check line move leave...
3,4,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,1 months ago,My mc. Crispy chicken sandwich was ï¿½ï¿½ï¿½ï¿...,5 stars,...,2023-05-30,2023-05,2023.0,176,13,False,False,1,mc. crispy chicken sandwich ïïïïïïïïïïïïïïïïïï...,mc crispy chicken sandwich ïïïïïïïïïïïïïïïïïïï...
4,5,McDonald's,Fast food restaurant,"13749 US-183 Hwy, Austin, TX 78750, United States",30.460718,-97.792874,1240,2 months ago,"I repeat my order 3 times in the drive thru, a...",1 star,...,2023-04-30,2023-04,2023.0,312,68,False,False,0,"repeat order 3 times drive thru, still manage ...",repeat order 3 time drive manage mess suppose ...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
33391,33392,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,4 years ago,They treated me very badly.,1 star,...,2019-06-30,2019-06,2019.0,27,5,False,True,0,treated badly.,treat badly
33392,33393,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,1 years ago,The service is very good,5 stars,...,2022-06-29,2022-06,2022.0,24,5,True,False,1,service good,service good
33393,33394,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,1 years ago,To remove hunger is enough,4 stars,...,2022-06-29,2022-06,2022.0,26,5,False,False,1,remove hunger enough,remove hunger
33394,33395,McDonald's,Fast food restaurant,"3501 Biscayne Blvd, Miami, FL 33137, United St...",25.810000,-80.189098,2810,5 years ago,"It's good, but lately it has become very expen...",5 stars,...,2018-06-30,2018-06,2018.0,51,9,True,False,1,"good, lately become expensive.",good lately expensive


In [None]:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Choose one of the two:
# CountVectorizer counts how many times each word appears
#vectorizer = CountVectorizer(max_features=1000, stop_words='english')

# Or use TF-IDF, which down-weights words that are common across reviews
vectorizer = TfidfVectorizer(max_features=1000, stop_words='english')

# Vectorize the cleaned text column
X_vectorized = vectorizer.fit_transform(df['review_limpio'])

# Convert to a readable DataFrame
X_df = pd.DataFrame(X_vectorized.toarray(), columns=vectorizer.get_feature_names_out())

# Show the first rows
print(X_df.head())


    00   10  100   11  11pm   12   15   19  1st   20  ...  young  yummy  zero  \
0  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0  0.0  0.0  ...    0.0    0.0   0.0   
1  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0  0.0  0.0  ...    0.0    0.0   0.0   
2  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0  0.0  0.0  ...    0.0    0.0   0.0   
3  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0  0.0  0.0  ...    0.0    0.0   0.0   
4  0.0  0.0  0.0  0.0   0.0  0.0  0.0  0.0  0.0  0.0  ...    0.0    0.0   0.0   

   zoo   ïï  ïïï  ïïïï  ïïïïïï  ïïïïïïïïïïïïïïïïïïïïïïïïïïïïïïïïïïïï  ýýý  
0  0.0  0.0  0.0   0.0     0.0                              0.000000  0.0  
1  0.0  0.0  0.0   0.0     0.0                              0.000000  0.0  
2  0.0  0.0  0.0   0.0     0.0                              0.000000  0.0  
3  0.0  0.0  0.0   0.0     0.0                              0.466638  0.0  
4  0.0  0.0  0.0   0.0     0.0                              0.000000  0.0  

[5 rows x 1000 columns]


In [None]:
# Save the resulting df and download it
df.to_csv('df_clean.csv', index=False)
from google.colab import files
files.download('df_clean.csv')

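
`google.colab` is only importable inside Colab, so the save cell above fails in a local Jupyter session. A minimal sketch that always writes the CSV and triggers the browser download only when the import succeeds; the helper name `save_and_maybe_download` and the toy frame are hypothetical:

```python
import pandas as pd

def save_and_maybe_download(frame, path="df_clean.csv"):
    """Write frame to CSV; offer a browser download when running in Colab."""
    frame.to_csv(path, index=False)
    try:
        from google.colab import files  # available only inside Colab
    except ImportError:
        return path  # outside Colab the file is simply left on disk
    files.download(path)
    return path

# Hypothetical toy frame standing in for the cleaned df
toy = pd.DataFrame({"review": ["service good"], "binary_rating": [1]})
saved_path = save_and_maybe_download(toy, "df_clean_sample.csv")
print(saved_path)
```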