## CS 248 Assignment 5

**Author:** _Manasa Kudumu_
**Date:** _Feb 27, 2025_

**Table of Contents**

1. [Part 1: Getting the menus from Wellesley Fresh](#sec1)
2. [Part 2: Working with the menu data](#sec2)

[Link to Github Repo](https://github.com/Wellesley-CS248/assignment-5-wellesley-fresh-manasakudumu)

<a id="sec1"></a>

### Part 1: Getting the menus from Wellesley Fresh

#### Task 1: creating wellesley-dining.csv

In [1]:

# # bae pao
# url1 = "https://dish.avifoodsystems.com/wellesley/96/148/week"
# url2 = "https://dish.avifoodsystems.com/wellesley/96/149/week"
# url3 = "https://dish.avifoodsystems.com/wellesley/96/312/week"

# #bates
# url4 = "https://dish.avifoodsystems.com/wellesley/95/145/week"
# url5 ="https://dish.avifoodsystems.com/wellesley/95/146/week"
# url6 ="https://dish.avifoodsystems.com/wellesley/95/311/week"

# # stone d
# url7 ="https://dish.avifoodsystems.com/wellesley/131/261/week"
# url8 ="https://dish.avifoodsystems.com/wellesley/131/262/week"
# url9 ="https://dish.avifoodsystems.com/wellesley/131/263/week"

# #tower
# url10 ="https://dish.avifoodsystems.com/wellesley/97/153/week"
# url11 ="https://dish.avifoodsystems.com/wellesley/97/154/week"
# url12 ="https://dish.avifoodsystems.com/wellesley/97/310/week"
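
The URLs above follow the pattern `.../wellesley/<locationID>/<mealID>/week`, so the CSV for this task can be generated rather than typed by hand. The following is a minimal sketch, not necessarily how the file was actually produced. The column names match what `write_menus` reads later and the location names match the JSON filenames in Task 4, but the Breakfast, Lunch, Dinner ordering of the three meal IDs per hall is an assumption, and `build_rows` and `write_dining_csv` are hypothetical helper names.

```python
import csv

# Location IDs and meal IDs read off the commented URLs above.
# ASSUMPTION: the three meal IDs per hall are Breakfast, Lunch, Dinner in that order.
LOCATIONS = {
    "Bao": (96, [148, 149, 312]),
    "Bates": (95, [145, 146, 311]),
    "StoneD": (131, [261, 262, 263]),
    "Tower": (97, [153, 154, 310]),
}
MEALS = ["Breakfast", "Lunch", "Dinner"]
FIELDS = ["location", "meal", "locationID", "mealID"]


def build_rows():
    """Return one dict per (location, meal) pair, 12 rows in total."""
    rows = []
    for location, (location_id, meal_ids) in LOCATIONS.items():
        for meal, meal_id in zip(MEALS, meal_ids):
            rows.append({"location": location, "meal": meal,
                         "locationID": location_id, "mealID": meal_id})
    return rows


def write_dining_csv(path="wellesley-dining.csv"):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(build_rows())
```

Calling `write_dining_csv()` would create `wellesley-dining.csv` in the working directory.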


In [2]:
import requests
import time
import csv
import json
import os
from datetime import datetime

#### Task 2: define a function, get_menu, which takes three parameters: a date, a location ID, and a meal ID.

In [3]:
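def get_menu(date, locationID, mealID):
    # Build the menu URL from the location ID, meal ID, and date (dashes become slashes).
    base_url = "https://dish.avifoodsystems.com/wellesley"
    url = f"{base_url}/{locationID}/{mealID}/{date.replace('-', '/')}"
    response = requests.get(url)
    # Parse the response body as JSON and return it.
    return response.json()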
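
To see what address `get_menu` requests, the URL construction can be pulled out into its own function and checked without touching the network. This is only a sketch that mirrors the f-string above; `build_menu_url` is a hypothetical helper name, and whether the server returns JSON at that address is not verified here.

```python
BASE_URL = "https://dish.avifoodsystems.com/wellesley"


def build_menu_url(date, locationID, mealID):
    """Mirror the URL built inside get_menu; date is expected as MM-DD-YYYY."""
    return f"{BASE_URL}/{locationID}/{mealID}/{date.replace('-', '/')}"


print(build_menu_url("02-20-2025", 96, 148))
# https://dish.avifoodsystems.com/wellesley/96/148/02/20/2025
```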

#### Task 3: write_menus, which takes two parameters: the CSV filename created in Task 1 and a date. It calls get_menu once for each row of the CSV and dumps each result into a new JSON file

In [4]:
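def write_menus(file, date):
    # Read the location/meal table, fetch each menu, and save it as its own JSON file.
    with open(file, "r") as f:
        csvr = csv.DictReader(f)

        for row in csvr:
            location = row['location']
            meal = row['meal']
            locationID = row['locationID']
            mealID = row['mealID']
            menu = get_menu(date, locationID, mealID)

            # Skip empty responses so that no empty JSON files are written.
            if menu:
                jsonf = f"{location}-{meal}-{date}.json"
                with open(jsonf, "w") as ff:
                    json.dump(menu, ff)
            # Pause between requests to avoid overloading the server.
            time.sleep(2)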
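
The same loop can be exercised offline by passing the fetching function in as a parameter. The sketch below is a hypothetical variant, not the function used in this notebook: `write_menus_with`, `fetch`, `out_dir`, and `pause` are names introduced here for illustration, and `fetch` stands in for `get_menu`.

```python
import csv
import json
import os
import time


def write_menus_with(fetch, csv_path, date, out_dir=".", pause=0.0):
    """Write one JSON file per CSV row whose menu is non-empty.

    fetch(date, locationID, mealID) is a stand-in for get_menu (assumption).
    Returns the list of file paths that were written.
    """
    written = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            menu = fetch(date, row["locationID"], row["mealID"])
            if menu:
                name = f"{row['location']}-{row['meal']}-{date}.json"
                out_path = os.path.join(out_dir, name)
                with open(out_path, "w") as out:
                    json.dump(menu, out)
                written.append(out_path)
            time.sleep(pause)  # set pause=2 to match the delay used above
    return written
```

With the real fetcher this might be called as `write_menus_with(get_menu, "wellesley-dining.csv", "02-20-2025", pause=2)`.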

In [5]:
today = datetime.now().date()
today.strftime("%m-%d-%Y")

'04-02-2025'

#### Task 4: checking the results, especially their completeness

In [6]:
import os
import json
 
files = os.listdir("provided-jsons")
json_files = sorted([file for file in files if file.endswith(".json")])
for file in json_files:
    file_path = os.path.join("provided-jsons", file)
    with open(file_path, "r") as jsonf:
        data = json.load(jsonf)
        print(f"{len(data)} rows in {file}")

59 rows in Bao-Breakfast-02-20-2025.json
111 rows in Bao-Dinner-02-20-2025.json
119 rows in Bao-Lunch-02-20-2025.json
71 rows in Bates-Breakfast-02-20-2025.json
146 rows in Bates-Dinner-02-20-2025.json
156 rows in Bates-Lunch-02-20-2025.json
48 rows in StoneD-Breakfast-02-20-2025.json
56 rows in StoneD-Dinner-02-20-2025.json
96 rows in StoneD-Lunch-02-20-2025.json
42 rows in Tower-Breakfast-02-20-2025.json
120 rows in Tower-Dinner-02-20-2025.json
106 rows in Tower-Lunch-02-20-2025.json
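
Row counts show how much each file holds, but completeness also means that every location and meal combination has a file at all. A minimal sketch of that check follows, assuming the `<location>-<meal>-<date>.json` naming used by `write_menus`; `missing_menu_files` is a hypothetical helper name.

```python
import os


def missing_menu_files(directory, locations, meals, date):
    """Return the expected JSON filenames that are absent from directory."""
    expected = [f"{loc}-{meal}-{date}.json" for loc in locations for meal in meals]
    present = set(os.listdir(directory))
    return [name for name in expected if name not in present]
```

For the listing above, `missing_menu_files("provided-jsons", ["Bao", "Bates", "StoneD", "Tower"], ["Breakfast", "Lunch", "Dinner"], "02-20-2025")` should return an empty list, since all 12 files appear.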


<a id="sec2"></a>

### Part 2: Working with the menu data


#### Task 1: merge the DataFrames together using pd.concat

In [7]:
import pandas as pd
df = pd.DataFrame()
files = os.listdir("provided-jsons")
json_files = sorted([file for file in files if file.endswith(".json")])
for file in json_files:
    file_path = os.path.join("provided-jsons", file)
    df1 = pd.read_json(file_path)
    df = pd.concat([df, df1], ignore_index=True)
    print(f"Appended {file}. Now size is {df.shape}")

Appended Bao-Breakfast-02-20-2025.json. Now size is (59, 12)
Appended Bao-Dinner-02-20-2025.json. Now size is (170, 12)
Appended Bao-Lunch-02-20-2025.json. Now size is (289, 12)
Appended Bates-Breakfast-02-20-2025.json. Now size is (360, 12)
Appended Bates-Dinner-02-20-2025.json. Now size is (506, 12)
Appended Bates-Lunch-02-20-2025.json. Now size is (662, 12)
Appended StoneD-Breakfast-02-20-2025.json. Now size is (710, 12)
Appended StoneD-Dinner-02-20-2025.json. Now size is (766, 12)
Appended StoneD-Lunch-02-20-2025.json. Now size is (862, 12)
Appended Tower-Breakfast-02-20-2025.json. Now size is (904, 12)
Appended Tower-Dinner-02-20-2025.json. Now size is (1024, 12)
Appended Tower-Lunch-02-20-2025.json. Now size is (1130, 12)
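
Growing a DataFrame inside the loop works, but each `pd.concat` call copies everything accumulated so far. A common alternative is to collect the frames in a list and concatenate once at the end. The sketch below shows this on two small made-up frames rather than on the menu files.

```python
import pandas as pd

# Two toy frames standing in for the per-file DataFrames (made-up data).
frames = [
    pd.DataFrame({"id": [1, 2], "name": ["a", "b"]}),
    pd.DataFrame({"id": [3, 4, 5], "name": ["c", "d", "e"]}),
]

# One concat call; ignore_index=True renumbers the rows 0..n-1.
combined = pd.concat(frames, ignore_index=True)
print(combined.shape)  # (5, 2)
```

Applied to this notebook, the list would be built with `pd.read_json(file_path)` for each file in `json_files`.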


#### Task 2: Using pandas commands to drop columns and rows and to fix the two columns “allergens” and “preferences”

In [8]:
dfLess = df.drop(columns=['date', 'image', 'stationName', 'stationOrder', 'price'])
dfLess.shape

(1130, 7)

In [9]:
dfFinal = dfLess.drop_duplicates(subset=['id'], keep='first')
dfFinal.shape

(399, 7)
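
The drop from 1130 to 399 rows comes from items that share an `id` across several menus. The effect of `drop` and `drop_duplicates` can be seen on a tiny frame; the sketch below uses three illustrative rows, with the first item repeated.

```python
import pandas as pd

# Toy data: the same item id appears twice, as it would across two menus.
toy = pd.DataFrame({
    "id": [16472, 20985, 16472],
    "name": ["Breakfast Bowl with Sausage", "Caramelized Onions (Oil)",
             "Breakfast Bowl with Sausage"],
    "price": [0.0, 0.0, 0.0],
})

less = toy.drop(columns=["price"])                           # remove a column
unique = less.drop_duplicates(subset=["id"], keep="first")   # keep first row per id
print(unique.shape)  # (2, 2)
```

Both calls return new DataFrames, so `toy` itself is left unchanged.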

In [10]:
def transform(cellLst):
    result = ""
    if cellLst:
        result = ",".join([item['name'] for item in cellLst])
    return result
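
A quick check of `transform` on a single cell shows what it does to a list of dictionaries. The function is repeated here so the snippet runs on its own. The sample cell is an assumption about the shape of the data: only the `name` key is used by the function, and the `id` values are made up.

```python
def transform(cellLst):
    result = ""
    if cellLst:
        result = ",".join([item['name'] for item in cellLst])
    return result


# Hypothetical cell value: a list of dicts, each with a 'name' key.
sample = [{"id": 1, "name": "Soy"}, {"id": 2, "name": "Dairy"}]

print(transform(sample))  # Soy,Dairy
print(repr(transform([])))  # ''
```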

In [11]:
df['allergens'] = df['allergens'].apply(transform) # notice, we don't pass any arguments to the function here
df['preferences'] = df['preferences'].apply(transform)
df.head()

|  | id | name | date | image | description | categoryName | stationName | stationOrder | allergens | preferences | price | nutritionals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 16472 | Breakfast Bowl with Sausage | 2025-02-17 |  | Scrambled eggs topped with sausage, home fries... | Breakfast | BREAKFAST | 4 | Soy,Dairy,Egg | Gluten Sensitive | 0.0 | {'id': 230891, 'servingSize': '6.00', 'serving... |
| 1 | 20985 | Caramelized Onions (Oil) | 2025-02-17 |  |  | Misc | BREAKFAST | 4 |  | Vegan,Gluten Sensitive,Vegetarian | 0.0 | {'id': 229549, 'servingSize': '1.02', 'serving... |
| 2 | 15890 | Home Fry Potatoes | 2025-02-17 |  | Crispy seasoned diced breakfast potatoes. | Breakfast | BREAKFAST | 4 | Soy,Dairy | Gluten Sensitive,Vegetarian | 0.0 | {'id': 233526, 'servingSize': '4.00', 'serving... |
| 3 | 19875 | Pork Sausage Link | 2025-02-17 |  | Pork breakfast sausage link. | Breakfast | BREAKFAST | 4 |  | Gluten Sensitive | 0.0 | {'id': 228338, 'servingSize': '0.80', 'serving... |
| 4 | 19611 | Sauteed Spinach | 2025-02-17 |  | Sautéed baby spinach with minced onion and gar... | Misc | BREAKFAST | 4 |  | Vegan,Gluten Sensitive,Vegetarian,NutriGOOD | 0.0 | {'id': 231288, 'servingSize': '2.38', 'serving... |


#### Task 4: separating nutritionals into different columns and then dropping nutritionals column at the end

In [12]:
import numpy as np  # needed for np.nan below

def to_float(d, key):
    # Parse one nutritionals field as a float; a missing dict or empty value becomes NaN.
    return float(d[key]) if d and d[key] else np.nan

df['servingSize'] = df['nutritionals'].apply(lambda d: to_float(d, 'servingSize'))
# The unit of measure is text, so it is copied as-is rather than converted.
df['servingSizeUOM'] = df['nutritionals'].apply(lambda d: d['servingSizeUOM'] if d else np.nan)
# The remaining fields are all numeric, so they share the same conversion.
for key in ['calories', 'fat', 'caloriesFromFat', 'saturatedFat', 'transFat', 'cholesterol',
            'sodium', 'carbohydrates', 'dietaryFiber', 'sugars', 'addedSugar', 'protein']:
    df[key] = df['nutritionals'].apply(lambda d: to_float(d, key))
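
The per-column conversion can be illustrated on a single `nutritionals` dictionary without pandas. This is a sketch under assumptions: the sample values are made up, `flatten_nutritionals` is a hypothetical helper, and it uses `dict.get` so that a missing key is treated like an empty value, which is slightly more lenient than the indexing used in the cell above.

```python
import math

NUMERIC_KEYS = ["servingSize", "calories", "fat", "caloriesFromFat", "saturatedFat",
                "transFat", "cholesterol", "sodium", "carbohydrates", "dietaryFiber",
                "sugars", "addedSugar", "protein"]


def flatten_nutritionals(d):
    """Turn one nutritionals dict into flat values; empty or missing becomes NaN."""
    out = {}
    for key in NUMERIC_KEYS:
        value = d.get(key) if d else None
        out[key] = float(value) if value else math.nan
    out["servingSizeUOM"] = d.get("servingSizeUOM", math.nan) if d else math.nan
    return out


# Made-up sample: values arrive as strings, and some may be empty or None.
sample = {"servingSize": "6.00", "servingSizeUOM": "oz", "calories": None, "fat": "9.5"}
flat = flatten_nutritionals(sample)
print(flat["servingSize"], flat["servingSizeUOM"], flat["fat"])  # 6.0 oz 9.5
```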

In [13]:
df.drop(columns=['nutritionals'], inplace=True)


In [14]:
df.to_csv("wellesley-meals.csv", index=False)

In [15]:
df.head()

|  | id | name | date | image | description | categoryName | stationName | stationOrder | allergens | preferences | ... | caloriesFromFat | saturatedFat | transFat | cholesterol | sodium | carbohydrates | dietaryFiber | sugars | addedSugar | protein |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 16472 | Breakfast Bowl with Sausage | 2025-02-17 |  | Scrambled eggs topped with sausage, home fries... | Breakfast | BREAKFAST | 4 | Soy,Dairy,Egg | Gluten Sensitive | ... | 335.0 | 9.0 | 0.0 | 286.0 | 670.0 | 9.0 | 1.0 | 2.0 | 0.0 | 15.0 |
| 1 | 20985 | Caramelized Onions (Oil) | 2025-02-17 |  |  | Misc | BREAKFAST | 4 |  | Vegan,Gluten Sensitive,Vegetarian | ... | 68.0 | 0.0 | 0.0 | 0.0 | 5.0 | 11.0 | 2.0 | 5.0 | 0.0 | 1.0 |
| 2 | 15890 | Home Fry Potatoes | 2025-02-17 |  | Crispy seasoned diced breakfast potatoes. | Breakfast | BREAKFAST | 4 | Soy,Dairy | Gluten Sensitive,Vegetarian | ... | 149.0 | 3.0 | 0.0 | 0.0 | 596.0 | 19.0 | 2.0 | 1.0 | 0.0 | 2.0 |
| 3 | 19875 | Pork Sausage Link | 2025-02-17 |  | Pork breakfast sausage link. | Breakfast | BREAKFAST | 4 |  | Gluten Sensitive | ... | 130.0 | 5.0 | 0.0 | 21.0 | 170.0 | 1.0 | 0.0 | 0.0 | 0.0 | 3.0 |
| 4 | 19611 | Sauteed Spinach | 2025-02-17 |  | Sautéed baby spinach with minced onion and gar... | Misc | BREAKFAST | 4 |  | Vegan,Gluten Sensitive,Vegetarian,NutriGOOD | ... | 61.0 | 0.0 | 0.0 | 0.0 | 61.0 | 3.0 | 2.0 | 0.0 | 0.0 | 2.0 |


In [16]:
df = dfFinal.copy()