# Custom Chatbot Project

##### I selected the "nyc_food_scrap_drop_off_sites.csv" dataset for this project because of the growing need for eco-friendly living practices, especially in dense urban environments like New York City. The dataset powers a chatbot that helps people find local food scrap drop-off locations, check when they are open, and understand what kinds of materials they can bring. Beyond being handy, this supports environmental efforts by promoting the recycling of organic waste. New York City is big on sustainability, and a chatbot like this can help get the community involved in eco-friendly initiatives. It's a perfect fit for what I'm trying to 

## Data Wrangling

TODO: In the cells below, load your chosen dataset into a `pandas` dataframe with a column named `"text"`. This column should contain all of your text data, separated into at least 20 rows.

In [1]:
import pandas as pd

In [2]:
data = pd.read_csv('data/nyc_food_scrap_drop_off_sites.csv')

In [3]:
data.columns

Index(['Unnamed: 0', 'borough', 'ntaname', 'food_scrap_drop_off_site',
       'location', 'hosted_by', 'open_months', 'operation_day_hours',
       'website', 'borocd', 'councildist', 'latitude', 'longitude', 'precinct',
       'object_id', 'location_point', ':@computed_region_yeji_bk3q',
       ':@computed_region_92fq_4b7q', ':@computed_region_sbqj_enih',
       ':@computed_region_efsh_h5xi', ':@computed_region_f5dn_yrer', 'notes',
       'ct2010', 'bbl', 'bin'],
      dtype='object')

In [4]:
# Handle missing values and concatenate the fields into a single text string
def create_text_column(row):
    drop_off_site = row['food_scrap_drop_off_site'] if pd.notnull(row['food_scrap_drop_off_site']) else 'Unknown drop-off site'
    location = row['location'] if pd.notnull(row['location']) else 'Unknown location'
    borough = row['borough'] if pd.notnull(row['borough']) else 'Unknown borough'
    hours = row['operation_day_hours'] if pd.notnull(row['operation_day_hours']) else 'Check website for hours'
    notes = row['notes'] if pd.notnull(row['notes']) else 'No additional notes available'
    website = row['website'] if pd.notnull(row['website']) else 'No website provided'
    
    # Concatenate the information into a single text string
    text = f"Drop-off site: {drop_off_site}. Location: {location}, {borough}. Hours: {hours}. {notes}. For more info: {website}."
    return text

# Add the "text" column to the DataFrame, one string per row
data['text'] = data.apply(create_text_column, axis=1)

# First 5 rows
print(data[['text']].head())


                                                text
0  Drop-off site: South Beach. Location: 21 Robin...
1  Drop-off site: SE Corner of Broadway & Academy...
2  Drop-off site: Old Stone House Brooklyn. Locat...
3  Drop-off site: SE Corner of Pleasant Avenue & ...
4  Drop-off site: Malcolm X FSDO. Location: 111-2...
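
The row-to-text step above can be checked without the CSV. The sketch below is a stand-alone variant that works on plain dictionaries instead of DataFrame rows. The helper names `is_missing` and `row_to_text` are hypothetical, and the example row copies the values shown for row 0 in this notebook. It also strips a trailing period from the notes so the sentence does not end in "..", which is an optional tweak rather than part of the original cell.

```python
import math


def is_missing(value):
    """Treat None, NaN, and blank strings as missing."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and not value.strip()


def row_to_text(row):
    """Build one sentence-like string per row, with fallbacks for gaps."""

    def field(key, fallback):
        value = row.get(key)
        return fallback if is_missing(value) else str(value).strip()

    site = field("food_scrap_drop_off_site", "Unknown drop-off site")
    location = field("location", "Unknown location")
    borough = field("borough", "Unknown borough")
    hours = field("operation_day_hours", "Check website for hours")
    # Drop a trailing period so the template below does not produce ".."
    notes = field("notes", "No additional notes available").rstrip(".")
    website = field("website", "No website provided")
    return (
        f"Drop-off site: {site}. Location: {location}, {borough}. "
        f"Hours: {hours}. {notes}. For more info: {website}."
    )


# Values taken from row 0 as displayed in this notebook; notes is missing there.
example = {
    "food_scrap_drop_off_site": "South Beach",
    "location": "21 Robin Road, Staten Island NY",
    "borough": "Staten Island",
    "operation_day_hours": "Friday (Start Time: 1:30 PM - End Time: 4:30 PM)",
    "notes": float("nan"),
    "website": "snug-harbor.org",
}
print(row_to_text(example))
```

With a DataFrame, the same function could be applied through `data.to_dict("records")`, assuming the column names match those listed by `data.columns`.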


In [5]:
data.head()

Unnamed: 0.1,Unnamed: 0,borough,ntaname,food_scrap_drop_off_site,location,hosted_by,open_months,operation_day_hours,website,borocd,...,:@computed_region_yeji_bk3q,:@computed_region_92fq_4b7q,:@computed_region_sbqj_enih,:@computed_region_efsh_h5xi,:@computed_region_f5dn_yrer,notes,ct2010,bbl,bin,text
0,0,Staten Island,Grasmere-Arrochar-South Beach-Dongan Hills,South Beach,"21 Robin Road, Staten Island NY",Snug Harbor Youth,Year Round,Friday (Start Time: 1:30 PM - End Time: 4:30 PM),snug-harbor.org,502,...,1.0,14.0,76.0,10692.0,30.0,,,,,Drop-off site: South Beach. Location: 21 Robin...
1,1,Manhattan,Inwood,SE Corner of Broadway & Academy Street,,Department of Sanitation,Year Round,24/7,www.nyc.gov/smartcomposting,112,...,,,,,,Download the app to access bins. Accepts all f...,,,,Drop-off site: SE Corner of Broadway & Academy...
2,2,Brooklyn,Park Slope,Old Stone House Brooklyn,"336 3rd St, Brooklyn, NY 11215",Old Stone House Brooklyn,Year Round,24/7 (Start Time: 24/7 - End Time: 24/7),,306,...,2.0,27.0,50.0,17617.0,14.0,,,,,Drop-off site: Old Stone House Brooklyn. Locat...
3,3,Manhattan,East Harlem (North),SE Corner of Pleasant Avenue & E 116 Street,,Department of Sanitation,Year Round,24/7,www.nyc.gov/smartcomposting,111,...,,,,,,Download the app to access bins. Accepts all f...,,,,Drop-off site: SE Corner of Pleasant Avenue & ...
4,4,Queens,Corona,Malcolm X FSDO,"111-26 Northern Blvd, Flushing, NY 11368",NYC Compost Project Hosted by Big Reuse,Year Round,Tuesdays (Start Time: 12:00 PM - End Time: 2:...,,404,...,3.0,21.0,68.0,14510.0,66.0,,,,,Drop-off site: Malcolm X FSDO. Location: 111-2...


## Custom Query Completion

TODO: In the cells below, compose a custom query using your chosen dataset and retrieve results from an OpenAI `Completion` model. You may copy and paste any useful code from the course materials.

In [6]:
import os

import openai

# Read the API key once from the environment instead of hard-coding it.
# The variable name OPENAI_API_KEY is an assumption; use whatever name you export.
openai.api_key = os.environ.get("OPENAI_API_KEY")

COMPLETION_ENGINE = "gpt-3.5-turbo-instruct"


def basic_response(question):
    """Ask the Completion model the question with no dataset context."""
    response = openai.Completion.create(
        engine=COMPLETION_ENGINE,
        prompt=question,
        temperature=0.5,
        max_tokens=150,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0
    )
    return response.choices[0].text.strip()


def get_chatbot_response(question, context):
    """Ask the same model, but include rows from the dataset as context in the prompt."""
    response = openai.Completion.create(
        engine=COMPLETION_ENGINE,
        prompt=f"Question: {question}\n\nContext: {context}\n\nAnswer:",
        temperature=0.5,
        max_tokens=150,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0
    )
    return response.choices[0].text.strip()

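# Example query: drop-off sites in Brooklyn
question = "Where can I drop off my food scraps in Brooklyn?"

# Use the first 5 rows whose text mentions "Brooklyn" as context
relevant_context = " ".join(data[data['text'].str.contains("Brooklyn", na=False)]['text'].tolist()[:5])
# Rough length guard: cap the context at 4000 whitespace-separated words (an approximation, not an exact token count)
limited_context = " ".join(relevant_context.split()[:4000])  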
response = get_chatbot_response(question, limited_context)
print(f"Question: {question}\nResponse: {response}")

Question: Where can I drop off my food scraps in Brooklyn?
Response: You can drop off your food scraps at the following locations in Brooklyn: 

1. Old Stone House Brooklyn - 336 3rd St, Brooklyn, NY 11215
2. NW Corner of Malcolm X Boulevard & Bainbridge Street (accessible through the app)
3. Walt L Shamel Community Garden - 1097 Dean St, Brooklyn, NY 11216
4. Underhill Avenue & Park Place (accessible through the app)
5. Los Colibríes Community Garden - 219 34th Street, Brooklyn 11232
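
The context selection above can also be sketched as a small helper that adds whole rows until a budget is reached, so no row is cut off mid-sentence. This is a minimal illustration, not the notebook's method: `build_context` and `build_prompt` are hypothetical names, the sample rows are abbreviated versions of rows shown earlier, and the budget counts whitespace-separated words as a stand-in for tokens. Unlike the `str.contains` call above, the keyword match here ignores case.

```python
def build_context(texts, keyword, max_rows=5, max_words=4000):
    """Collect whole rows that mention `keyword` until a row or word budget is hit."""
    selected = []
    used_words = 0
    for text in texts:
        if keyword.lower() not in text.lower():
            continue
        n_words = len(text.split())
        # Stop before exceeding either budget; rows are never truncated.
        if len(selected) >= max_rows or used_words + n_words > max_words:
            break
        selected.append(text)
        used_words += n_words
    return " ".join(selected)


def build_prompt(question, context):
    """Same prompt layout as get_chatbot_response uses."""
    return f"Question: {question}\n\nContext: {context}\n\nAnswer:"


# Abbreviated sample rows, for illustration only.
texts = [
    "Drop-off site: South Beach. Location: 21 Robin Road, Staten Island NY, Staten Island.",
    "Drop-off site: Old Stone House Brooklyn. Location: 336 3rd St, Brooklyn, NY 11215, Brooklyn.",
    "Drop-off site: Malcolm X FSDO. Location: 111-26 Northern Blvd, Flushing, NY 11368, Queens.",
]
context = build_context(texts, "Brooklyn")
print(build_prompt("Where can I drop off my food scraps in Brooklyn?", context))
```

In the notebook, `texts` would be `data['text'].tolist()`. Filtering on the `borough` column directly would be a stricter alternative to keyword matching on the text.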


## Custom Performance Demonstration

TODO: In the cells below, demonstrate the performance of your custom query using at least 2 questions. For each question, show the answer from a basic `Completion` model query as well as the answer from your custom query.

### Question 1

In [7]:
# Seasonal operation hours of drop-off sites
question_seasonal = "What are the seasonal operation hours of drop-off sites?"
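# Context: the first 10 rows of the text column (not filtered for seasonal information)
context_seasonal = " ".join(data['text'].tolist()[:10]) 

# Rough length guard: cap the context at 4000 whitespace-separated words (an approximation, not an exact token count)
limited_context_seasonal = " ".join(context_seasonal.split()[:4000])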

# Response based on the provided context
response_seasonal = get_chatbot_response(question_seasonal, limited_context_seasonal)
print(f"Question: {question_seasonal}\nResponse: {response_seasonal}\n")

Question: What are the seasonal operation hours of drop-off sites?
Response: The seasonal operation hours of drop-off sites vary depending on the location. Some sites, like the South Beach location in Staten Island, have specific hours on certain days (Fridays from 1:30 PM to 4:30 PM), while others, like the SE Corner of Broadway & Academy Street in Manhattan, are open 24/7. Some sites, like the Old Stone House Brooklyn location, have no specified hours and are open 24/7. It is best to check the specific location's website or contact them directly for more information on their seasonal operation hours.



In [8]:
# without using the context
basic_query = basic_response(question_seasonal)

In [9]:
#Result without using context
basic_query

'The seasonal operation hours of drop-off sites vary depending on the location and the type of site. Some drop-off sites may have limited hours during certain seasons, while others may have extended hours. It is best to check with the specific drop-off site for their current seasonal operation hours.'
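
Both questions in this section take the first 10 rows as context regardless of what is asked. One possible alternative, sketched here under simple assumptions, is to rank rows by how many words they share with the question and keep the best matches. The names `tokenize`, `rank_rows`, and `STOPWORDS` are hypothetical, the stopword list is a small hand-picked set, and the sample rows are abbreviated. Word overlap is a crude stand-in for a real relevance measure such as embeddings. Note also that `open_months` appears in the column list but is not part of the `text` column, so no ranking over `text` alone can surface it.

```python
import re

# Small hand-picked stopword set; an assumption for this sketch.
STOPWORDS = {
    "a", "an", "the", "of", "are", "is", "what", "which", "on", "in",
    "at", "to", "for", "can", "i", "my", "where", "do",
}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def rank_rows(question, texts, top_k=3):
    """Return up to top_k rows with the largest word overlap with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(t)), i) for i, t in enumerate(texts)]
    # Highest overlap first; ties keep the original row order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [texts[i] for score, i in scored[:top_k] if score > 0]


# Abbreviated sample rows, for illustration only.
texts = [
    "Drop-off site: South Beach. Hours: Friday (Start Time: 1:30 PM - End Time: 4:30 PM).",
    "Drop-off site: Old Stone House Brooklyn. Hours: 24/7.",
    "Drop-off site: Malcolm X FSDO. Location: 111-26 Northern Blvd, Flushing, NY 11368, Queens.",
]
best = rank_rows("Which drop-off sites are in Queens?", texts, top_k=1)
print(best)
```

The ranked rows could then be joined and passed to `get_chatbot_response` in place of the first 10 rows.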

### Question 2

In [10]:
# Drop-off sites open on weekends
question_weekend = "Which drop-off sites are open on the weekend?"
# Context: the first 10 rows of the text column, which include each site's operation hours

context_weekend = " ".join(data['text'].tolist()[:10])  
limited_context_weekend = " ".join(context_weekend.split()[:4000])

response_weekend = get_chatbot_response(question_weekend, limited_context_weekend)
print(f"Question: {question_weekend}\nResponse: {response_weekend}\n")


Question: Which drop-off sites are open on the weekend?
Response: The drop-off sites at South Beach, SE Corner of Broadway & Academy Street, Old Stone House Brooklyn, SE Corner of Pleasant Avenue & E 116 Street, Astoria Pug: 41st Street, SE Corner of Kings College Place & Gun Hill Rd., NW Corner of Malcolm X Boulevard & Bainbridge Street, Astoria Pug: Broadway, and SE Corner of Eastburn Avenue & East 174th Street are open on the weekends.
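
The response above lists South Beach, whose hours in the data are Friday only, so it may help to pre-filter rows by their hours before building the context. The sketch below is one way to do that with a regular expression. `mentions_weekend` is a hypothetical helper, and the pattern assumes hours are written with day names or "24/7" as in the rows shown earlier; other formats in the full dataset would need their own handling. The Saturday example string is made up for illustration.

```python
import re

# Matches weekend day names (with common abbreviations) or an always-open marker.
WEEKEND_PATTERN = re.compile(
    r"\b(saturdays?|sundays?|sat|sun|weekends?)\b|24/7", re.IGNORECASE
)


def mentions_weekend(hours):
    """Return True if the hours string names a weekend day or says 24/7."""
    if not isinstance(hours, str):
        # Missing values (None, NaN) carry no weekend information.
        return False
    return bool(WEEKEND_PATTERN.search(hours))


samples = [
    "Friday (Start Time: 1:30 PM - End Time: 4:30 PM)",
    "24/7",
    "Saturdays (Start Time: 10:00 AM - End Time: 1:00 PM)",  # hypothetical
    None,
]
for hours in samples:
    print(repr(hours), "->", mentions_weekend(hours))
```

In the notebook this could be applied as `data[data['operation_day_hours'].apply(mentions_weekend)]` before joining the `text` values into the context.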



In [11]:
# Basic response without using context from our dataset
basic_query = basic_response(question_weekend)

In [12]:
#Result
basic_query

'Please provide more information about the specific drop-off sites in question. Without this information, it is not possible to determine which sites are open on the weekend.'
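
The basic-versus-custom comparison repeated for each question can be wrapped in a small helper. This is a sketch under stated assumptions: `compare_answers` and `format_comparison` are hypothetical names, and the two stub functions stand in for the real model calls so the block runs without an API key.

```python
def compare_answers(question, basic_fn, custom_fn):
    """Call both answer functions on the same question and collect the results."""
    return {
        "question": question,
        "basic": basic_fn(question),
        "custom": custom_fn(question),
    }


def format_comparison(result):
    """Render one comparison as a short, readable text block."""
    return (
        f"Question: {result['question']}\n"
        f"Basic response: {result['basic']}\n"
        f"Custom response: {result['custom']}"
    )


# Stub answer functions, used here only so the example runs offline.
def stub_basic(question):
    return "stub answer without context"


def stub_custom(question):
    return "stub answer with context"


result = compare_answers("Which drop-off sites are open on the weekend?", stub_basic, stub_custom)
print(format_comparison(result))
```

In the notebook, the stubs would be replaced with `basic_response` and a wrapper such as `lambda q: get_chatbot_response(q, limited_context_weekend)`.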