In [29]:
# Melanie Schwartz
# sno122

## Lab 7B

<em>Lab 7 consists of two exercises on relational and document-oriented databases.</em>

In this exercise, you will write a program that plots New York City restaurants on a map using Folium, with a choropleth layer, from sample data on [MongoDB](http://www.mongodb.com). We will use the ```restaurants``` collection in the ```sample_restaurants``` database that comes with MongoDB.

**Tasks**

1. Query the ```restaurants``` collection to find documents where the cuisine is ```Irish``` and the last grade was an ```A```. The returned documents will include location information (i.e., the name of the restaurant, address, coordinates, etc.). You may need to put this data into a ```DataFrame```.

2. Create a ```DataFrame``` with a count of restaurants that met the conditions above by borough. A borough is like a neighborhood in New York City. There are five boroughs: Manhattan, Brooklyn, Queens, The Bronx, and Staten Island. The data will only refer to the Bronx as Bronx (without the *the*).

3. Create a map using Folium that centers and zooms in on New York City. The coordinates are (40.7128, -74.0060). All boroughs should be visible. 

4. Add a marker for every restaurant that met the conditions. The returned document will have the coordinates for each restaurant. However, the coordinates are in reversed order in the JSON! The JSON has the longitude first and then the latitude. The rest of the program expects latitude to come first. The marker will contain the name of the restaurant and the street address.

5. Download the GeoJSON file for the NYC boroughs from [NYC Open Data](https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm). Include this file in the same directory as this notebook. 

6. Using choropleth and the GeoJSON file, shade the boroughs based on the count of restaurants that met the conditions above. The ```key_on``` keyword argument is ```feature.properties.boro_name```, which contains the name of the borough and will match up with your counts ```DataFrame``` from step 2. 

7. Save the map in a file named ```map.html```.
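
The filter in task 1, the per-borough count in task 2, and the coordinate swap in task 4 can be tried offline before touching the database. The block below is only a sketch under stated assumptions: the two documents are made up (their field names mirror the ones this lab reads: ```name```, ```borough```, ```cuisine```, ```address.coord```, ```grades```), the ```grades``` array is assumed to list the most recent grade first, and ```last_grade``` and ```to_lat_lon``` are hypothetical helper names, not part of PyMongo or Folium.

```python
from collections import Counter

# Hypothetical documents, shaped like the fields this lab reads.
sample_docs = [
    {
        "name": "Example Pub",
        "borough": "Bronx",
        "cuisine": "Irish",
        "address": {"street": "Example Ave", "coord": [-73.8675, 40.8978]},
        "grades": [{"grade": "A"}, {"grade": "B"}],  # assumed: newest first
    },
    {
        "name": "Other Pub",
        "borough": "Queens",
        "cuisine": "Irish",
        "address": {"street": "Sample Street", "coord": [-73.9, 40.7]},
        "grades": [{"grade": "B"}, {"grade": "A"}],
    },
]


def last_grade(doc):
    """Return the grade at index 0 of 'grades', or None if there are no grades."""
    grades = doc.get("grades", [])
    return grades[0]["grade"] if grades else None


def to_lat_lon(coord):
    """Swap a [longitude, latitude] pair into [latitude, longitude]."""
    lon, lat = coord
    return [lat, lon]


# Task 1: Irish restaurants whose last grade is 'A'.
matches = [d for d in sample_docs
           if d["cuisine"] == "Irish" and last_grade(d) == "A"]

# Task 4: marker data with latitude first.
markers = [(d["name"], to_lat_lon(d["address"]["coord"])) for d in matches]

# Task 2: number of matching restaurants per borough.
counts = Counter(d["borough"] for d in matches)
```

Only the first document passes the filter, because the second one has a ```B``` at index 0 of ```grades```. In the actual lab the same filtering and grouping are done by MongoDB, so these helpers serve only as a sanity check on the logic.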

In [30]:
import os

import certifi  # CA bundle, used to avoid SSL certificate errors
import folium
import pandas as pd
from pymongo import MongoClient

# Get the path to the certifi CA bundle
ca = certifi.where()

# Read the connection string from the environment instead of hard-coding credentials.
# MONGODB_URI is an assumed variable name; set it to your own "mongodb+srv://..." string.
client = MongoClient(os.environ["MONGODB_URI"], tlsCAFile=ca)
database = client['sample_restaurants']
collection = database['restaurants']

# TODO: 1. Query the 'restaurants' collection to find documents where the cuisine is 'Irish' and its last grade was an 'A'. The returned documents will include location information (i.e., the name of the restaurants, address, coordinates, etc.). You may need to put this data into a 'DataFrame'.
# query Irish restaurants with last grade 'A'
query = {
    "cuisine": "Irish",
    "grades.0.grade": "A"  
}

# execute
documents = collection.find(query)

# load relevant data to DF
data = []

# extract name, address, coordinate
for doc in documents:
    name = doc.get('name', 'N/A')  
    address = doc.get('address', {}).get('street', 'N/A')  
    coord = doc.get('address', {}).get('coord', [0, 0])  
    
    # Append the data
    data.append({
        "Name": name,
        "Address": address,
        "Coordinates": coord
    })

# Convert the list of dictionaries into a DataFrame
df = pd.DataFrame(data)

# print the DataFrame
print(df)

# TODO: 2. Create a 'DataFrame' with a count of restaurants that met the conditions above by borough. A borough is like a neighborhood in New York City. There are five boroughs: Manhattan, Brooklyn, Queens, The Bronx, and Staten Island. The data will only refer to the Bronx as Bronx (without the *the*).

# Aggregation pipeline: keep Irish restaurants whose last grade (grades.0) is 'A',
# then count them per borough
pipeline = [
    {
        '$match': {
            "cuisine": "Irish",
            "grades.0.grade": "A"
        }
    },
    # Group by borough and count
    {
        '$group': {
            '_id': '$borough',
            'count': {'$sum': 1}
        }
    },
]

# Run the pipeline
result = collection.aggregate(pipeline)

# Convert the aggregation result into a DataFrame ('_id' holds the borough name)
df_boroughs = pd.DataFrame(list(result))

# Display the DataFrame
print(df_boroughs)

#TODO: 3. Create a map using Folium that centers and zooms in on New York City. The coordinates are (40.7128, -74.0060). All boroughs should be visible. 

# Create your map here - This has been started for you.
nyc_coordinates = (40.7128, -74.0060)

# Folium map centered on New York City
nyc_map = folium.Map(location=nyc_coordinates, zoom_start=10)

# TODO: 4. Add a marker for every restaurant that met the conditions. The returned document will have the coordinates for each restaurant. However, the coordinates are in reversed order in the JSON! The JSON has the longitude first and then the latitude. The rest of the program expects latitude to come first. The marker will contain the name of the restaurant and the street address.
# Loop through the DF to add each restaurant as a marker
for index, row in df.iterrows():
    # Reverse the coordinates 
    folium_coords = [row['Coordinates'][1], row['Coordinates'][0]]
    
    # Create the popups with restaurant name and address
    popup_text = "{} - {}".format(row['Name'], row['Address'])
    
    # Add a marker
    folium.Marker(
        location=folium_coords,
        popup=popup_text
    ).add_to(nyc_map)

# TODO: 6. Using choropleth and the GeoJSON file, shade the boroughs based on the count of restaurants that met the conditions above. The 'key_on' keyword argument is 'feature.properties.boro_name', which contains the name of the borough and will match up with your counts 'DataFrame' from step 2. 

# Path to the GeoJSON file downloaded in step 5 (must be in the same directory as this notebook)
geojson_path = 'Borough Boundaries.geojson'

# Add the Choropleth layer
folium.Choropleth(
    geo_data=geojson_path,
    name='Choropleth',
    data=df_boroughs,
    columns=['_id', 'count'],
    key_on='feature.properties.boro_name',
    fill_color='Reds',
    fill_opacity=0.7,
    line_opacity=0.3,
    legend_name='Restaurant Count'
).add_to(nyc_map)

# TODO: 7. Save the map in a file named 'map.html'.
nyc_map.save('map.html')

                               Name           Address  \
0    Dj Reynolds Pub And Restaurant  West   57 Street   
1                    Aqueduct North       Katonah Ave   
2                     Mcaleer'S Pub  Amsterdam Avenue   
3     Dorrian'S Red Hand Restaurant          2 Avenue   
4                         Twins Pub          9 Avenue   
..                              ...               ...   
159                   Austin Public     Austin Street   
160                The Copper Still          2 Avenue   
161                     The Jar Bar         48 Avenue   
162                     The Brewery         30 Avenue   
163      O'Neill'S Restaurant & Bar        Forest Ave   

                                 Coordinates  
0           [-73.98513559999999, 40.7676919]  
1                  [-73.8675389, 40.8977829]  
2                    [-73.977372, 40.783934]  
3                    [-73.952449, 40.776325]  
4            [-73.99682299999999, 40.753182]  
..                               