# Belongs to Airbnb Lab

### Introduction
In this lab we continue to explore the relationships between tables in a database. The Airbnb database for this lab contains four tables: `hosts`, `listings`, `locations`, and `neighborhoods`. To analyze the data, we first need to understand how these tables relate to one another. The relationships we will use are "Has One" and "Has Many". For example, each record in the `listings` table has a `host_id` column that points to exactly one record in the `hosts` table, so a listing HAS ONE host. In the other direction, an `id` in the `locations` table can appear in the `location_id` column of many `listings` records, so a location HAS MANY listings.
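To make the two relationship types concrete, here is a minimal sketch that uses a throwaway in-memory database rather than `airbnb.db`. The rows are invented for illustration; only the column names `id`, `host_name`, `name`, and `host_id` mirror the lab's schema.

```python
import sqlite3

# Throwaway in-memory database; the rows below are invented for illustration.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE hosts (id INTEGER PRIMARY KEY, host_name TEXT);
CREATE TABLE listings (id INTEGER PRIMARY KEY, name TEXT, host_id INTEGER);
INSERT INTO hosts VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO listings VALUES
    (10, 'Sunny loft', 1),
    (11, 'Quiet studio', 1),
    (12, 'Garden room', 2);
""")

# "Has one": each listing row points at exactly one host row.
host_of_listing = demo.execute(
    "SELECT h.host_name FROM listings AS l JOIN hosts AS h ON h.id = l.host_id "
    "WHERE l.id = ?",
    (10,),
).fetchall()

# "Has many": one host row can be referenced by several listing rows.
listings_of_host = demo.execute(
    "SELECT name FROM listings WHERE host_id = ? ORDER BY id", (1,)
).fetchall()

print(host_of_listing)   # [('Ana',)]
print(listings_of_host)  # [('Sunny loft',), ('Quiet studio',)]
```

The foreign key lives on the "many" side: `listings.host_id` stores the host's `id`, so the same join condition answers questions in either direction.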

Let's begin by connecting to the database and reviewing the schema of the tables.

In [1]:
import sqlite3
conn = sqlite3.connect('airbnb.db')
cursor = conn.cursor()

In [2]:
cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
cursor.fetchall()

[('hosts',), ('neighborhoods',), ('locations',), ('listings',)]

In [3]:
cursor.execute('PRAGMA table_info(hosts)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0), (1, 'host_name', 'TEXT', 0, None, 0)]

In [4]:
cursor.execute('PRAGMA table_info(neighborhoods)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'name', 'TEXT', 0, None, 0),
 (2, 'neighbourhood_group', 'TEXT', 0, None, 0)]

In [5]:
cursor.execute('PRAGMA table_info(locations)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'longitude', 'REAL', 0, None, 0),
 (2, 'latitude', 'REAL', 0, None, 0),
 (3, 'neighborhood_id', 'INTEGER', 0, None, 0)]

In [6]:
cursor.execute('PRAGMA table_info(listings)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'name', 'TEXT', 0, None, 0),
 (2, 'host_id', 'INTEGER', 0, None, 0),
 (3, 'location_id', 'INTEGER', 0, None, 0),
 (4, 'number_of_reviews', 'INTEGER', 0, None, 0),
 (5, 'occupancy', 'INTEGER', 0, None, 0),
 (6, 'price', 'INTEGER', 0, None, 0),
 (7, 'room_type', 'TEXT', 0, None, 0),
 (8, 'host_listings_count', 'INTEGER', 0, None, 0)]
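The four `PRAGMA table_info` calls above can also be folded into one loop. The sketch below assumes a hypothetical helper named `table_columns` (not part of the lab) and runs against a small in-memory database so it works without `airbnb.db`; passing the lab's `conn` instead should list all four tables.

```python
import sqlite3

def table_columns(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for every table."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # The table name is interpolated into the statement; it comes from
        # sqlite_master, not from user input.
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        schema[table] = [(col[1], col[2]) for col in info]
    return schema

# Stand-in database with two of the lab's tables (no rows needed).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE hosts (id INTEGER, host_name TEXT)")
demo.execute(
    "CREATE TABLE neighborhoods (id INTEGER, name TEXT, neighbourhood_group TEXT)")

schema = table_columns(demo)
for table, columns in schema.items():
    print(table, columns)
```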

We'll start with some basic single-table queries:

* Which listing name has the highest price?

In [7]:
cursor.execute('SELECT name FROM listings ORDER BY price DESC LIMIT 1;')
cursor.fetchall()

[('Furnished room in Astoria apartment',)]

* What is the id of the location with the lowest longitude?

In [8]:
cursor.execute('SELECT id FROM locations ORDER BY longitude LIMIT 1;')
cursor.fetchall()

[(45652,)]

* What is the greatest occupancy of a listing?

In [9]:
cursor.execute('SELECT MAX(occupancy) FROM listings;')
cursor.fetchall()

[(365,)]

* What is the average price of a listing?

In [10]:
cursor.execute('SELECT AVG(price) FROM listings;')
cursor.fetchall()

[(152.7206871868289,)]

* How many hosts are there?

In [11]:
cursor.execute('SELECT COUNT(id) FROM hosts;')
cursor.fetchall()

[(37457,)]

### Moving on to relationships

Before writing any joins, map out the relationships between the tables:

* A host
    * `id`, `host_name`

* A location belongs to a neighborhood
    * `neighborhood_id`, `latitude`, `longitude`
* A neighborhood belongs to a neighborhood group

* A listing belongs to a host and to a location
    * `name`, `host_id`, `location_id`, `room_type`, `price`, `occupancy`
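One way to check that a mapped relationship actually holds in the data is to count "orphan" rows, meaning child rows whose foreign key matches no parent `id`. The sketch below is an illustration only: `BELONGS_TO` and `orphan_counts` are hypothetical names, the foreign-key map is read off the schema shown earlier, and the demo rows are invented (including one deliberately broken `host_id`).

```python
import sqlite3

# (child table, foreign-key column) -> parent table, read off the schema above.
BELONGS_TO = {
    ("listings", "host_id"): "hosts",
    ("listings", "location_id"): "locations",
    ("locations", "neighborhood_id"): "neighborhoods",
}

def orphan_counts(conn):
    """Count child rows whose foreign key matches no parent id."""
    counts = {}
    for (child, fk), parent in BELONGS_TO.items():
        # LEFT JOIN keeps every child row; unmatched ones have p.id IS NULL.
        query = (
            f"SELECT COUNT(*) FROM {child} AS c "
            f"LEFT JOIN {parent} AS p ON p.id = c.{fk} "
            f"WHERE p.id IS NULL"
        )
        counts[(child, fk)] = conn.execute(query).fetchone()[0]
    return counts

demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE hosts (id INTEGER, host_name TEXT);
CREATE TABLE neighborhoods (id INTEGER, name TEXT, neighbourhood_group TEXT);
CREATE TABLE locations (id INTEGER, longitude REAL, latitude REAL, neighborhood_id INTEGER);
CREATE TABLE listings (id INTEGER, name TEXT, host_id INTEGER, location_id INTEGER);
INSERT INTO hosts VALUES (1, 'Ana');
INSERT INTO neighborhoods VALUES (1, 'Astoria', 'Queens');
INSERT INTO locations VALUES (1, -73.92, 40.77, 1);
INSERT INTO listings VALUES (10, 'Sunny loft', 1, 1), (11, 'Lost room', 99, 1);
""")

counts = orphan_counts(demo)
print(counts)
```

In the demo, only `("listings", "host_id")` reports an orphan, because listing 11 points at a host `id` that does not exist.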

For the following queries, use the relationships between the tables to find the answers:

* What are the longitude and latitude of the listing with the highest price?

In [12]:
cursor.execute("""
SELECT a.longitude,
       a.latitude
  FROM locations AS a
       JOIN
       listings AS b ON a.id = b.location_id
 ORDER BY b.price DESC
 LIMIT 1;
""")
cursor.fetchall()

[(-73.91651, 40.7681)]

* What is the neighborhood id of the listing with the lowest price?

In [14]:
cursor.execute("""
SELECT a.neighborhood_id
  FROM locations AS a
       JOIN
       listings AS b ON a.id = b.location_id
 ORDER BY b.price
 LIMIT 1;

""")
cursor.fetchall()

[(6,)]

* What are the longitude and latitude of the listing with the lowest price?

In [15]:
cursor.execute("""
SELECT a.longitude,
       a.latitude
  FROM locations AS a
       JOIN
       listings AS b ON a.id = b.location_id
 ORDER BY b.price
 LIMIT 1;
""")
cursor.fetchall()

[(-73.95428000000001, 40.69023)]
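The three queries above rely on `ORDER BY ... LIMIT 1`, which returns a single row even when several listings share the highest or lowest price; the query does not say which of the tied rows comes back. A hedged alternative, sketched here on invented in-memory data rather than `airbnb.db`, compares against `MIN(price)` in a subquery so that every tied listing is kept.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE locations (id INTEGER, longitude REAL, latitude REAL, neighborhood_id INTEGER);
CREATE TABLE listings (id INTEGER, name TEXT, location_id INTEGER, price INTEGER);
INSERT INTO locations VALUES (1, -73.95, 40.69, 6), (2, -73.91, 40.76, 3), (3, -73.99, 40.72, 6);
INSERT INTO listings VALUES (10, 'A', 1, 20), (11, 'B', 2, 20), (12, 'C', 3, 95);
""")

# LIMIT 1 keeps a single row even though two listings share the lowest price.
one_row = demo.execute("""
SELECT lo.id
  FROM locations AS lo
       JOIN
       listings AS l ON lo.id = l.location_id
 ORDER BY l.price
 LIMIT 1;
""").fetchall()

# Comparing against MIN(price) keeps every listing tied for the lowest price.
all_tied = demo.execute("""
SELECT lo.id
  FROM locations AS lo
       JOIN
       listings AS l ON lo.id = l.location_id
 WHERE l.price = (SELECT MIN(price) FROM listings)
 ORDER BY lo.id;
""").fetchall()

print(len(one_row))  # 1
print(all_tied)      # [(1,), (2,)]
```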

### Relations and GROUP BY

* What is the name of the host with the most reviews?

In [24]:
cursor.execute("""
SELECT h.host_name,
       SUM(l.number_of_reviews) 
  FROM hosts AS h
       JOIN
       listings AS l ON h.id = l.host_id
 GROUP BY l.host_id
 ORDER BY SUM(l.number_of_reviews) DESC
 LIMIT 1;
""")
cursor.fetchall()

[('Maya', 2273)]

* What is the name of the host with the lowest average listing price?

In [17]:
cursor.execute("""
SELECT h.host_name
  FROM hosts AS h
       JOIN
       listings AS l ON h.id = l.host_id
 GROUP BY l.host_id
 ORDER BY AVG(l.price) 
 LIMIT 1;
""")
cursor.fetchall()

[('Aymeric',)]

* What is the name of the neighborhood with the most locations?

In [18]:
cursor.execute("""
SELECT n.name
  FROM neighborhoods AS n
       JOIN
       locations AS l ON n.id = l.neighborhood_id
 GROUP BY n.id
 ORDER BY COUNT(n.id) DESC
 LIMIT 1;

""")
cursor.fetchall()


[('Williamsburg',)]

* What are the names of the neighborhoods with 10 locations?

In [19]:
cursor.execute("""
SELECT n.name
  FROM neighborhoods AS n
       JOIN
       locations AS l ON n.id = l.neighborhood_id
 GROUP BY n.id
HAVING COUNT(n.id) = 10;
""")
cursor.fetchall()

[('North Riverdale',),
 ('Great Kills',),
 ('East Morrisania',),
 ('Melrose',),
 ('Bergen Beach',),
 ('Westchester Square',)]
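To see what `GROUP BY` and `HAVING` are doing in the last two queries, it can help to look at the per-group counts on a dataset small enough to verify by hand. The sketch below uses invented rows in an in-memory database and counts `l.id` (the joined location rows) rather than `n.id`; with an inner join the two give the same number.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE neighborhoods (id INTEGER, name TEXT);
CREATE TABLE locations (id INTEGER, neighborhood_id INTEGER);
INSERT INTO neighborhoods VALUES (1, 'Astoria'), (2, 'Melrose');
INSERT INTO locations VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
""")

# One output row per neighborhood, with the number of joined location rows.
per_neighborhood = demo.execute("""
SELECT n.name,
       COUNT(l.id)
  FROM neighborhoods AS n
       JOIN
       locations AS l ON n.id = l.neighborhood_id
 GROUP BY n.id
 ORDER BY n.id;
""").fetchall()

# HAVING filters the groups after counting; WHERE would filter rows before.
exactly_two = demo.execute("""
SELECT n.name
  FROM neighborhoods AS n
       JOIN
       locations AS l ON n.id = l.neighborhood_id
 GROUP BY n.id
HAVING COUNT(l.id) = 2;
""").fetchall()

print(per_neighborhood)  # [('Astoria', 3), ('Melrose', 2)]
print(exactly_two)       # [('Melrose',)]
```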

The following questions require joining three tables.

* What is the average occupancy of each neighborhood?

In [None]:
cursor.execute("""
SELECT n.name,
       AVG(l.occupancy) 
  FROM neighborhoods AS n
       JOIN
       locations AS lo ON n.id = lo.neighborhood_id
       JOIN
       listings AS l ON lo.id = l.location_id
 GROUP BY lo.neighborhood_id;
""")
cursor.fetchall()

* What is the total number of reviews for each neighborhood?

In [None]:
cursor.execute("""
SELECT n.name,
       SUM(l.number_of_reviews) 
  FROM neighborhoods AS n
       JOIN
       locations AS lo ON n.id = lo.neighborhood_id
       JOIN
       listings AS l ON lo.id = l.location_id
 GROUP BY lo.neighborhood_id;
""")
cursor.fetchall()

* Write a query that returns the name and average listing price of each neighborhood

In [None]:
cursor.execute("""
SELECT n.name,
       AVG(l.price) 
  FROM neighborhoods AS n
       JOIN
       locations AS lo ON n.id = lo.neighborhood_id
       JOIN
       listings AS l ON lo.id = l.location_id
 GROUP BY lo.neighborhood_id;
""")
cursor.fetchall()
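The three-table queries above all follow the same chain: `neighborhoods` to `locations` to `listings`. As a sanity check on the pattern, here is a minimal sketch on invented in-memory data where the averages can be worked out by hand. The `dict(rows)` step is just one convenient way to look up a neighborhood by name and is not part of the lab.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE neighborhoods (id INTEGER, name TEXT);
CREATE TABLE locations (id INTEGER, neighborhood_id INTEGER);
CREATE TABLE listings (id INTEGER, location_id INTEGER, price INTEGER);
INSERT INTO neighborhoods VALUES (1, 'Astoria'), (2, 'Melrose');
INSERT INTO locations VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO listings VALUES (10, 1, 100), (11, 2, 200), (12, 3, 80), (13, 3, 120);
""")

# Each listing reaches its neighborhood through its location row.
rows = demo.execute("""
SELECT n.name,
       AVG(l.price)
  FROM neighborhoods AS n
       JOIN
       locations AS lo ON n.id = lo.neighborhood_id
       JOIN
       listings AS l ON lo.id = l.location_id
 GROUP BY n.id
 ORDER BY n.name;
""").fetchall()

# (name, average) pairs convert directly into a lookup table.
avg_price = dict(rows)
print(avg_price)  # {'Astoria': 150.0, 'Melrose': 100.0}
```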

### Conclusion
In this lab we worked with the "Has One" and "Has Many" relationships in SQL. We began by mapping out the relationships between the tables, which gave us a better idea of how to join them in our queries. We finished the lab by writing queries whose JOIN clauses connect the tables through these relationships.