In [1]:
%load_ext raw_magic

# Restaurant Health Scores

The files `sf_restaurants.csv` and `sf_restaurant_inspections.csv` were taken from the San Francisco open data project and contain restaurant info and inspection scores respectively. 
In this example we calculate the average inspection score per restaurant from `sf_restaurant_inspections.csv`, join it with the location information in `sf_restaurants.csv`, and display the results in four categories, shown on the map in green, yellow, orange and red.

In [3]:
%buckets_register raw-tutorial

API error: S3 credentials already exists


In [3]:
%%query
restaurants := read("s3://raw-tutorial/ipython-demos/sf_restaurants.csv");

avg_scores := SELECT business_id AS b_id, AVG(inspection_score) AS avg_score
    FROM read("s3://raw-tutorial/ipython-demos/sf_restaurant_inspections.csv")
    GROUP BY business_id;
    
SELECT 
    name,
    address,
    CASE
        WHEN avg_score > 95 THEN "Definitely!"
        WHEN avg_score > 85 THEN "Yes"
        WHEN avg_score > 75 THEN "Probably not"
        ELSE "Nope!"
    END AS `Should I go there`
FROM restaurants, avg_scores
WHERE id = b_id

Showing only 100 values...


name,address,Should I go there
Henry's Hunan Restaurant,4753 MISSION St,Yes
Carousel/Sea Lion Cart,1 Zoo Rd,Definitely!
Jenny's Restaurant,91 06th St,Nope!
Pepples Donuts,1 Ferry Building #38C,Definitely!
Julie's Kitchen,50 FREMONT St,Definitely!
BLUE FRONT DELI,1430 HAIGHT St,Nope!
Nubi Yogurt,2300 16th St Suite #215,Definitely!
Cellar Door,3131 Fillmore St,Yes
RICHMOND NEW MAY WAH SUPERMARKET,719 CLEMENT St,Yes
DELANCEY ST. RESTAURANT,600 EMBARCADERO St,Nope!
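
For readers less familiar with the query language, the group-average-join logic of the query above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows and the helper names (`average_scores`, `verdict`, `join_with_verdict`) are hypothetical, and every inspection row is assumed to carry a numeric score.

```python
from collections import defaultdict

# Hypothetical sample rows; the real data lives in the two CSV files.
restaurants = [
    {"id": 1, "name": "Cafe A", "address": "1 Example St"},
    {"id": 2, "name": "Diner B", "address": "2 Example St"},
    {"id": 3, "name": "Never Inspected", "address": "3 Example St"},
]
inspections = [
    {"business_id": 1, "inspection_score": 98},
    {"business_id": 1, "inspection_score": 94},
    {"business_id": 2, "inspection_score": 70},
]


def average_scores(rows):
    """Group inspection rows by business_id and average their scores."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["business_id"]].append(row["inspection_score"])
    return {b_id: sum(vals) / len(vals) for b_id, vals in scores.items()}


def verdict(avg_score):
    """Apply the same thresholds as the CASE expression in the query."""
    if avg_score > 95:
        return "Definitely!"
    if avg_score > 85:
        return "Yes"
    if avg_score > 75:
        return "Probably not"
    return "Nope!"


def join_with_verdict(restaurants, inspections):
    """Inner join on id = business_id: uninspected restaurants are dropped."""
    avg = average_scores(inspections)
    return [
        (r["name"], r["address"], verdict(avg[r["id"]]))
        for r in restaurants
        if r["id"] in avg
    ]


rows = join_with_verdict(restaurants, inspections)
for row in rows:
    print(row)
```

Because the thresholds use strict comparisons, an average of exactly 95 falls into "Yes" rather than "Definitely!".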


In [4]:
%%query
euclidean_dist(x: DOUBLE, y: DOUBLE) := SQRT(x*x + y*y);

restaurants := SELECT latitude, longitude, name, "restaurant" as type,  "" as coment
               FROM read("s3://raw-tutorial/ipython-demos/sf_restaurants.csv")
               WHERE address LIKE "%New Montgomery%";

meters_distance := SELECT latitude, longitude, POST_ID,
    CMIN(SELECT euclidean_dist(latitude - r.latitude, longitude - r.longitude) FROM restaurants r) AS distance
    FROM read("s3://raw-tutorial/ipython-demos/sf_parking_meters.csv")
    WHERE SFPARKAREA = "Downtown";


SELECT latitude, longitude, POST_ID as `meter id`,
    CASE
        WHEN distance < 0.0002 THEN "next to a restaurant"
        WHEN distance < 0.0005 THEN "close to a restaurant"
        WHEN distance < 0.0008 THEN "far from a restaurant"
        ELSE "pretty far from a restaurant"
    END AS coment
FROM meters_distance
WHERE distance < 0.001

Showing only 100 values...


latitude,longitude,meter id,coment
37.788449698,-122.401629701,582-00090,next to a restaurant
37.7873499979,-122.4003047009,582-01030,next to a restaurant
37.7864862987,-122.399227699,582-01410,next to a restaurant
37.7873034001,-122.3996543881,570-01140,close to a restaurant
37.7873736014,-122.3991246016,202-01300,far from a restaurant
37.7876674988,-122.4003499017,568-06180,close to a restaurant
37.7856604018,-122.3991460984,463-00060,pretty far from a restaurant
37.7882194974,-122.4006894023,501-01121,close to a restaurant
37.7883251988,-122.4004280993,501-01010,far from a restaurant
37.7883900021,-122.4004737024,501-01020,far from a restaurant
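
The nearest-restaurant computation in the query above can be sketched in plain Python as follows. The coordinates and meter ids are hypothetical, and the helper names `nearest_distance` and `describe` are introduced only for illustration. As in the query, the distance is measured directly on latitude and longitude, so it is in degrees rather than metres; 0.0002 degrees of latitude is roughly 22 m, and a degree of longitude is somewhat shorter at San Francisco's latitude, so the labels are approximate.

```python
import math

# Hypothetical (latitude, longitude) pairs, not taken from the CSV files.
restaurants = [(37.7870, -122.4000), (37.7880, -122.4010)]
meters = {
    "meter-a": (37.7871, -122.4000),
    "meter-b": (37.7870, -122.4006),
    "meter-c": (37.7900, -122.4100),
}


def euclidean_dist(x, y):
    """Mirror of the query's euclidean_dist: sqrt(x*x + y*y)."""
    return math.sqrt(x * x + y * y)


def nearest_distance(point, places):
    """Smallest coordinate-space distance from one point to any place."""
    return min(
        euclidean_dist(point[0] - p[0], point[1] - p[1]) for p in places
    )


def describe(distance):
    """Apply the same thresholds (in degrees) as the CASE expression."""
    if distance < 0.0002:
        return "next to a restaurant"
    if distance < 0.0005:
        return "close to a restaurant"
    if distance < 0.0008:
        return "far from a restaurant"
    return "pretty far from a restaurant"


# Keep only meters within 0.001 degrees, as the final WHERE clause does.
labelled = {}
for meter_id, point in meters.items():
    distance = nearest_distance(point, restaurants)
    if distance < 0.001:
        labelled[meter_id] = describe(distance)

print(labelled)
```

Each meter is compared against every restaurant, which is fine for a small filtered set like the one in the query; a spatial index would be the usual choice for larger inputs.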
