We import the pandas and NumPy libraries.

# Chipotle

We have been asked to analyze the Chipotle restaurant chain and identify possible areas for improvement. For now we only have a sample of the data, available [here](https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv). Can we explore it to see what it contains? What kind of information is there, is it complete, and should we ask them to send it in a different format or to include additional fields?


In [48]:
%pip install pandas numpy

import pandas as pd
import numpy as np





### Step 2. Take the file at [this URL](https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv) and read it into a DataFrame.

Hint: [read_csv](https://numpy.org/doc/stable/user/absolute_beginners.html#importing-and-exporting-a-csv)

In [49]:
url = "https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv"

df = pd.read_csv(url, sep='\t')

print(df.head())


   order_id  quantity                              item_name  \
0         1         1           Chips and Fresh Tomato Salsa   
1         1         1                                   Izze   
2         1         1                       Nantucket Nectar   
3         1         1  Chips and Tomatillo-Green Chili Salsa   
4         2         2                           Chicken Bowl   

                                  choice_description item_price  
0                                                NaN     $2.39   
1                                       [Clementine]     $3.39   
2                                            [Apple]     $3.39   
3                                                NaN     $2.39   
4  [Tomatillo-Red Chili Salsa (Hot), [Black Beans...    $16.98   
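
The introduction asks what kind of information the file holds and whether it is complete. A minimal sketch of those checks follows, using a small in-memory sample instead of the real download. The three rows and the `sample_tsv` and `sample` names are made up for illustration, and the `NULL` marker for a missing description is an assumption about how the file encodes gaps.

```python
import io
import pandas as pd

# Hypothetical three-row sample in the same tab-separated layout as the file
sample_tsv = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tChips and Fresh Tomato Salsa\tNULL\t$2.39 \n"
    "1\t1\tIzze\t[Clementine]\t$3.39 \n"
    "2\t2\tChicken Bowl\t[Tomatillo-Red Chili Salsa (Hot)]\t$16.98 \n"
)

# sep="\t" tells read_csv that the columns are separated by tabs
sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

print(sample.shape)         # number of rows and columns
print(sample.isna().sum())  # missing values per column
print(sample.dtypes)        # item_price is read as text because of the "$"
```

The same three calls can be run on `df` to size the real dataset and spot incomplete columns.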


### Step 3. Let's look at the data types. Can we get the highest-priced product?

In [50]:
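print(df.dtypes)

# Caveat: item_price is still text (dtype object) at this point, so idxmax()
# compares the strings character by character rather than by numeric value.
# The row printed below is the lexicographic maximum, not necessarily the highest price.
highest_priced_product = df.loc[df['item_price'].idxmax()]

print("\nProduct with the highest price:")
print(highest_priced_product)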


order_id               int64
quantity               int64
item_name             object
choice_description    object
item_price            object
dtype: object

Product with the highest price:
order_id                                        250
quantity                                          1
item_name                          Steak Salad Bowl
choice_description    [Fresh Tomato Salsa, Lettuce]
item_price                                   $9.39 
Name: 607, dtype: object
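
Since `item_price` has dtype `object`, the maximum above comes from a text comparison, which is why a price starting with "9" wins over prices such as the $16.98 seen in `head()`. The sketch below contrasts the text maximum with the numeric one on a hypothetical three-row sample. The `sample`, `text_max`, `price` and `numeric_max_row` names are illustrative only.

```python
import pandas as pd

# Hypothetical sample; prices are strings with a "$" prefix and a trailing space
sample = pd.DataFrame({
    "item_name": ["Steak Salad Bowl", "Chicken Bowl", "Canned Soda"],
    "item_price": ["$9.39 ", "$16.98 ", "$1.09 "],
})

# Text comparison: "$9.39 " sorts after "$16.98 " because "9" > "1"
text_max = sample["item_price"].max()

# Numeric comparison: drop the "$", trim whitespace, then convert to float
price = pd.to_numeric(
    sample["item_price"].str.replace("$", "", regex=False).str.strip()
)
numeric_max_row = sample.loc[price.idxmax()]

print(text_max)                                   # $9.39
print(numeric_max_row["item_name"], price.max())  # Chicken Bowl 16.98
```

Converting the column before looking for the maximum avoids the problem, and Step 4 performs that conversion on the full DataFrame.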


### Step 4. Which products cost more than $10?

In [60]:
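# Strip the "$" sign; regex=False treats "$" as a literal character, not a regex anchor
df['item_price'] = df['item_price'].astype(str).str.replace('$', '', regex=False)

# Convert the cleaned strings to floats
df['item_price'] = pd.to_numeric(df['item_price'])

# Keep the rows whose price exceeds $10
expensive_products = df[df['item_price'] > 10]

print("Products that cost more than $10:")
print(expensive_products[['item_name', 'item_price']])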


Products that cost more than $10:
               item_name  item_price
4           Chicken Bowl       16.98
5           Chicken Bowl       10.98
7          Steak Burrito       11.75
13          Chicken Bowl       11.25
23       Chicken Burrito       10.98
...                  ...         ...
4610       Steak Burrito       11.75
4611      Veggie Burrito       11.25
4617       Steak Burrito       11.75
4618       Steak Burrito       11.75
4619  Chicken Salad Bowl       11.25

[1130 rows x 2 columns]
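
The first row of this result, a Chicken Bowl at 16.98, is the same line that `head()` showed with `quantity` 2. That suggests `item_price` is the total for the line rather than the price of one unit. Under that assumption, a per-unit price can be sketched as below. The `unit_price` column and the sample rows are hypothetical, not part of the original data.

```python
import pandas as pd

# Hypothetical sample, with item_price already converted to float
sample = pd.DataFrame({
    "item_name": ["Chicken Bowl", "Chicken Bowl", "Steak Burrito", "Canned Soda"],
    "quantity": [2, 1, 1, 4],
    "item_price": [16.98, 10.98, 11.75, 4.36],
})

# Assumption: item_price is the line total, so divide by quantity to get one unit
sample["unit_price"] = (sample["item_price"] / sample["quantity"]).round(2)

# Distinct products whose price per unit exceeds $10
over_10_per_unit = sample.loc[sample["unit_price"] > 10, "item_name"].unique()
print(sorted(over_10_per_unit))  # ['Chicken Bowl', 'Steak Burrito']
```

Here the two-bowl line (8.49 per unit) drops out of the filter, while the single Chicken Bowl at 10.98 keeps the product in the list.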


### Step 4.1: And how many orders include a product that costs more than $10? Is it the same?

In [62]:
orders_with_expensive_products = df[df['item_price'] > 10]['order_id'].nunique()

print("Number of orders made with a product that costs more than $10:", orders_with_expensive_products)


Number of orders made with a product that costs more than $10: 863


### Step 4.2: And how many orders total more than $10? Is it the same?

In [53]:
total_per_order = df.groupby('order_id')['item_price'].sum()

num_orders_total_over_10 = (total_per_order > 10).sum()

print("Number of orders with a total of more than $10:", num_orders_total_over_10)


Number of orders with a total of more than $10: 1834
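
Steps 4, 4.1 and 4.2 return three different numbers because each one counts a different thing: rows, orders containing an expensive row, and orders whose total is high. The toy example below, with made-up rows and variable names, puts the three counts side by side.

```python
import pandas as pd

# Hypothetical sample: four orders, prices already numeric
sample = pd.DataFrame({
    "order_id": [1, 1, 2, 2, 3, 3, 4],
    "item_name": [
        "Chicken Bowl", "Steak Burrito",
        "6 Pack Soft Drink", "Chips and Guacamole",
        "Steak Burrito", "Chicken Bowl",
        "Canned Soda",
    ],
    "item_price": [10.98, 11.75, 6.49, 4.45, 11.75, 10.98, 1.09],
})

is_expensive = sample["item_price"] > 10

# Step 4: rows priced above $10
rows_over_10 = int(is_expensive.sum())

# Step 4.1: distinct orders that contain at least one such row
orders_with_item_over_10 = int(sample.loc[is_expensive, "order_id"].nunique())

# Step 4.2: orders whose summed price exceeds $10
order_totals = sample.groupby("order_id")["item_price"].sum()
orders_total_over_10 = int((order_totals > 10).sum())

print(rows_over_10)              # 4
print(orders_with_item_over_10)  # 2
print(orders_total_over_10)      # 3
```

Order 2 has no single row above $10, yet its total (6.49 + 4.45) passes the threshold, so it only appears in the last count.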


### Step 4.3: And in how many orders was more than $10 paid for a single product? Is it the same?

In [54]:
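# Note: this groups by product name only, so it adds up each product's price
# across all orders and counts products (not orders) whose overall total exceeds $10.
total_per_product = df.groupby('item_name')['item_price'].sum()

expensive_products = total_per_product[total_per_product > 10]

num_products_over_10 = len(expensive_products)

print("Number of products that have been paid more than $10 for the same product:", num_products_over_10)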


Number of products that have been paid more than $10 for the same product: 47
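
The question asks about orders, while the count above is a number of products summed over the whole dataset. One possible reading of the question is to add up what each order spent on each product and then count the orders where any of those sums passes $10. The sketch below follows that reading on hypothetical data. It is an alternative interpretation, not the notebook's original computation.

```python
import pandas as pd

# Hypothetical sample, prices already numeric
sample = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 4],
    "item_name": ["Canned Soda", "Canned Soda", "Steak Burrito", "Steak Burrito", "Chips"],
    "item_price": [6.54, 4.36, 11.75, 11.75, 2.39],
})

# Spend per (order, product) pair
spend = sample.groupby(["order_id", "item_name"])["item_price"].sum()

# Pairs where a single product accounts for more than $10 within one order
pairs_over_10 = spend[spend > 10]

# Count distinct orders among those pairs
num_orders = int(pairs_over_10.index.get_level_values("order_id").nunique())

# For comparison: products whose total across all orders exceeds $10
per_product = sample.groupby("item_name")["item_price"].sum()
num_products = int((per_product > 10).sum())

print(num_orders)    # 3
print(num_products)  # 2
```

Order 1 qualifies only because its two Canned Soda lines are summed, which is the case a plain row filter would miss.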


### Step 5. What price does each product have across orders? Are there products with several prices?

In [55]:
prices_per_product = df.groupby('item_name')['item_price'].unique()

print("Prices per product:")
print(prices_per_product)


Prices per product:
item_name
6 Pack Soft Drink                                                            [6.49, 12.98]
Barbacoa Bowl                                      [11.75, 9.25, 8.99, 11.48, 8.69, 11.49]
Barbacoa Burrito                                   [8.99, 9.25, 11.75, 11.08, 8.69, 11.48]
Barbacoa Crispy Tacos                                     [11.75, 9.25, 11.48, 8.99, 18.5]
Barbacoa Salad Bowl                                                          [11.89, 9.39]
Barbacoa Soft Tacos                                             [9.25, 8.99, 11.75, 11.48]
Bottled Water                            [1.09, 1.5, 3.0, 3.27, 2.18, 6.0, 7.5, 4.5, 10...
Bowl                                                                           [22.2, 7.4]
Burrito                                                                              [7.4]
Canned Soda                                                             [2.18, 1.09, 4.36]
Canned Soft Drink                                           
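
To answer the second question directly, the number of distinct prices per product can be counted with `nunique()`. Some of the variation may simply reflect quantity (6.49 versus 12.98 for the 6 Pack Soft Drink), so the sketch below also repeats the count on a per-unit price. The sample rows and the `unit_price` column are hypothetical, and treating `item_price` as a line total is an assumption.

```python
import pandas as pd

# Hypothetical sample, prices already numeric
sample = pd.DataFrame({
    "item_name": ["6 Pack Soft Drink", "6 Pack Soft Drink", "Burrito", "Burrito"],
    "quantity": [1, 2, 1, 1],
    "item_price": [6.49, 12.98, 7.40, 7.40],
})

# Distinct line prices per product
price_counts = sample.groupby("item_name")["item_price"].nunique()
multi_price_items = price_counts[price_counts > 1].index.tolist()

# Same count on the per-unit price (assumes item_price is the line total)
sample["unit_price"] = (sample["item_price"] / sample["quantity"]).round(2)
unit_counts = sample.groupby("item_name")["unit_price"].nunique()
multi_unit_price_items = unit_counts[unit_counts > 1].index.tolist()

print(multi_price_items)       # ['6 Pack Soft Drink']
print(multi_unit_price_items)  # []
```

Products that still show several unit prices after this adjustment are the ones worth asking the client about.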

### Step 6. Sort the DataFrame by product name (`item_name`).

In [56]:
df_sorted = df.sort_values(by='item_name')

print("Sorted DataFrame based on item name:")
print(df_sorted)


Sorted DataFrame based on item name:
      order_id  quantity          item_name  \
3389      1360         2  6 Pack Soft Drink   
341        148         1  6 Pack Soft Drink   
1849       749         1  6 Pack Soft Drink   
1860       754         1  6 Pack Soft Drink   
2713      1076         1  6 Pack Soft Drink   
...        ...       ...                ...   
2384       948         1  Veggie Soft Tacos   
781        322         1  Veggie Soft Tacos   
2851      1132         1  Veggie Soft Tacos   
1699       688         1  Veggie Soft Tacos   
1395       567         1  Veggie Soft Tacos   

                                     choice_description  item_price  
3389                                        [Diet Coke]       12.98  
341                                         [Diet Coke]        6.49  
1849                                             [Coke]        6.49  
1860                                        [Diet Coke]        6.49  
2713                                            

### Step 7. How many times have the most expensive products been ordered?

In [57]:
# Rows whose item_price equals the overall maximum
most_expensive_products = df[df['item_price'] == df['item_price'].max()]

# Add up the quantities ordered on those rows
num_orders_most_expensive = most_expensive_products['quantity'].sum()

print("Number of times the most expensive products have been ordered:", num_orders_most_expensive)


Number of times the most expensive products have been ordered: 15


### Step 8. Now look at the Veggie Salad Bowl case. Extract that information.

In [58]:
veggie_salad_bowl_info = df[df['item_name'] == 'Veggie Salad Bowl']

print("Information for Veggie Salad Bowl:")
print(veggie_salad_bowl_info)


Information for Veggie Salad Bowl:
      order_id  quantity          item_name  \
186         83         1  Veggie Salad Bowl   
295        128         1  Veggie Salad Bowl   
455        195         1  Veggie Salad Bowl   
496        207         1  Veggie Salad Bowl   
960        394         1  Veggie Salad Bowl   
1316       536         1  Veggie Salad Bowl   
1884       760         1  Veggie Salad Bowl   
2156       869         1  Veggie Salad Bowl   
2223       896         1  Veggie Salad Bowl   
2269       913         1  Veggie Salad Bowl   
2683      1066         1  Veggie Salad Bowl   
3223      1289         1  Veggie Salad Bowl   
3293      1321         1  Veggie Salad Bowl   
4109      1646         1  Veggie Salad Bowl   
4201      1677         1  Veggie Salad Bowl   
4261      1700         1  Veggie Salad Bowl   
4541      1805         1  Veggie Salad Bowl   
4573      1818         1  Veggie Salad Bowl   

                                     choice_description  item_price  
1