# Belongs to Airbnb Lab

### Introduction
In this lab we will continue to explore the relationships between tables in a database. The Airbnb database for this lab contains four tables: `hosts`, `listings`, `locations`, and `neighborhoods`. To analyze the data, we first need to understand how the tables relate to one another. The relationships we will work with are "Has One" (the other side of "Belongs To") and "Has Many". For example, the `listings` table has a `host_id` column, and each value in it corresponds to exactly one record in the `hosts` table (a listing has only one host). In the other direction, each `id` in the `locations` table can correspond to many records in the `listings` table (a location can have more than one listing).
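As a minimal sketch of these two relationships, the snippet below builds a toy in-memory database. The rows and the `demo_conn` / `demo_cursor` names are made up for illustration and are not part of the lab data; only the column names follow the lab's `hosts` and `listings` tables.

```python
import sqlite3

# Toy in-memory tables; every row here is invented for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE hosts (id INTEGER, host_name TEXT)")
demo_cursor.execute("CREATE TABLE listings (id INTEGER, name TEXT, host_id INTEGER)")
demo_cursor.executemany("INSERT INTO hosts VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
demo_cursor.executemany(
    "INSERT INTO listings VALUES (?, ?, ?)",
    [(10, "Loft", 1), (11, "Studio", 1), (12, "Cabin", 2)],
)

# "Has one": each listing row points at exactly one host through host_id.
demo_cursor.execute(
    """
    SELECT listings.name, hosts.host_name
    FROM listings
    JOIN hosts ON listings.host_id = hosts.id
    ORDER BY listings.id
    """
)
listing_hosts = demo_cursor.fetchall()

# "Has many": a single host id can appear on several listing rows.
demo_cursor.execute("SELECT COUNT(*) FROM listings WHERE host_id = 1")
ana_listing_count = demo_cursor.fetchone()[0]
```

Here every listing resolves to a single host name, while the host with `id` 1 is referenced by two listings.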

Let's begin by connecting to the database and reviewing the schema of the tables.

### Loading Data

In [5]:
import pandas as pd
neighborhoods_url = "https://raw.githubusercontent.com/sql-fundamentals-jigsaw/mod-1-sql-curriculum/master/2-sql-relations/3-belongs-to-bnb/data/neighborhoods.csv"
hosts_url = "https://raw.githubusercontent.com/sql-fundamentals-jigsaw/mod-1-sql-curriculum/master/2-sql-relations/3-belongs-to-bnb/data/hosts.csv"
locations_url = "https://raw.githubusercontent.com/sql-fundamentals-jigsaw/mod-1-sql-curriculum/master/2-sql-relations/3-belongs-to-bnb/data/locations.csv"
listings_url = "https://raw.githubusercontent.com/sql-fundamentals-jigsaw/mod-1-sql-curriculum/master/2-sql-relations/3-belongs-to-bnb/data/listings.csv"


hosts_df = pd.read_csv(hosts_url)
neighborhoods_df = pd.read_csv(neighborhoods_url)

locations_df = pd.read_csv(locations_url)
listings_df = pd.read_csv(listings_url)

In [6]:
import sqlite3
conn = sqlite3.connect('listings.db')

In [7]:
# Write each DataFrame to its own table.
# if_exists='replace' lets this cell be re-run without a "table already exists" error.
hosts_df.to_sql('hosts', conn, index=False, if_exists='replace')
neighborhoods_df.to_sql('neighborhoods', conn, index=False, if_exists='replace')
locations_df.to_sql('locations', conn, index=False, if_exists='replace')
listings_df.to_sql('listings', conn, index=False, if_exists='replace')

### Exploring Data

In [8]:
# Create a cursor for running queries against the connection.
cursor = conn.cursor()
cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
cursor.fetchall()

[('hosts',), ('neighborhoods',), ('locations',), ('listings',)]

In [9]:
cursor.execute('PRAGMA table_info(hosts)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0), (1, 'host_name', 'TEXT', 0, None, 0)]

In [10]:
cursor.execute('PRAGMA table_info(neighborhoods)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'name', 'TEXT', 0, None, 0),
 (2, 'neighbourhood_group', 'TEXT', 0, None, 0)]

In [11]:
cursor.execute('PRAGMA table_info(locations)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'longitude', 'REAL', 0, None, 0),
 (2, 'latitude', 'REAL', 0, None, 0),
 (3, 'neighborhood_id', 'INTEGER', 0, None, 0)]

In [12]:
cursor.execute('PRAGMA table_info(listings)')
cursor.fetchall()

[(0, 'id', 'INTEGER', 0, None, 0),
 (1, 'name', 'TEXT', 0, None, 0),
 (2, 'host_id', 'INTEGER', 0, None, 0),
 (3, 'location_id', 'INTEGER', 0, None, 0),
 (4, 'number_of_reviews', 'INTEGER', 0, None, 0),
 (5, 'occupancy', 'INTEGER', 0, None, 0),
 (6, 'price', 'INTEGER', 0, None, 0),
 (7, 'room_type', 'TEXT', 0, None, 0),
 (8, 'host_listings_count', 'INTEGER', 0, None, 0)]

We'll start off with some basic single-table queries:

* What is the name of the listing with the highest price?

* What is the id of the location with the lowest longitude?

* What is the greatest occupancy of a listing?

* What is the average price of a listing?

* How many hosts are there?
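Questions like these usually reduce to two patterns: sorting and keeping the first row, or collapsing the table with an aggregate function. The sketch below shows both on a toy `listings` table whose rows are invented for illustration; `demo_conn` and `demo_cursor` are hypothetical names chosen so they do not clash with the lab's `conn` and `cursor`.

```python
import sqlite3

# Toy in-memory table; the rows are invented for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute(
    "CREATE TABLE listings (id INTEGER, name TEXT, price INTEGER, occupancy INTEGER)"
)
demo_cursor.executemany(
    "INSERT INTO listings VALUES (?, ?, ?, ?)",
    [(1, "Loft", 100, 2), (2, "Studio", 80, 1), (3, "Cabin", 210, 4)],
)

# Pattern 1: find the row with an extreme value by sorting and keeping one row.
demo_cursor.execute("SELECT name FROM listings ORDER BY price DESC LIMIT 1")
priciest = demo_cursor.fetchone()

# Pattern 2: aggregate functions collapse the whole table into single values.
demo_cursor.execute("SELECT MAX(occupancy), AVG(price), COUNT(*) FROM listings")
max_occupancy, avg_price, row_count = demo_cursor.fetchone()
```

Note that `ORDER BY ... LIMIT 1` returns only one row even when several rows tie for the extreme value.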

### Relationships
To help us better understand the relationships, create queries below that JOIN the tables.

First, map out the relationships between the tables:

* A host has many listings
    * include `host_name` and the host `id`

* A location belongs to a neighborhood
    * include `neighborhood_id`, `latitude`, `longitude`
* A neighborhood belongs to a neighborhood group

* A listing belongs to a host and to a location
    * include `name`, `host_id`, `location_id`, `room_type`, `price`, `occupancy`
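One way to sanity-check a mapped-out "belongs to" relationship is to look for foreign keys that do not resolve to any parent row. The sketch below does this with a `LEFT JOIN` on toy data; the rows (including the deliberately dangling `neighborhood_id` of 9) and the `demo_` names are assumptions made for illustration, not values from the lab database.

```python
import sqlite3

# Toy in-memory tables; the rows are invented for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE neighborhoods (id INTEGER, name TEXT)")
demo_cursor.execute("CREATE TABLE locations (id INTEGER, neighborhood_id INTEGER)")
demo_cursor.executemany(
    "INSERT INTO neighborhoods VALUES (?, ?)", [(1, "Harlem"), (2, "Astoria")]
)
demo_cursor.executemany(
    "INSERT INTO locations VALUES (?, ?)",
    [(100, 1), (101, 1), (102, 2), (103, 9)],  # 9 matches no neighborhood
)

# LEFT JOIN keeps every location; locations without a matching neighborhood
# come back with NULL in the neighborhoods columns.
demo_cursor.execute(
    """
    SELECT locations.id
    FROM locations
    LEFT JOIN neighborhoods ON locations.neighborhood_id = neighborhoods.id
    WHERE neighborhoods.id IS NULL
    """
)
orphan_locations = demo_cursor.fetchall()
```

An empty result would mean every location belongs to a known neighborhood; here the one dangling row is reported.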

### JOINs

For the following queries, use the relationships between the tables to find the solutions:

* What are the longitude and latitude of the listing with the highest price?

* What is the neighborhood id of the listing with the lowest price?

* What are the longitude and latitude of the listing with the lowest price?
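The general shape of such a query is to join on the foreign key, sort by a column from one table, and select columns from the other. The sketch below applies that shape to a different question (the coordinates of the most-reviewed listing) on toy data; all rows and the `demo_` names are invented for illustration.

```python
import sqlite3

# Toy in-memory tables; the rows are invented for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE locations (id INTEGER, longitude REAL, latitude REAL)")
demo_cursor.execute(
    "CREATE TABLE listings (id INTEGER, name TEXT, location_id INTEGER, number_of_reviews INTEGER)"
)
demo_cursor.executemany(
    "INSERT INTO locations VALUES (?, ?, ?)",
    [(1, -73.95, 40.81), (2, -73.92, 40.76)],
)
demo_cursor.executemany(
    "INSERT INTO listings VALUES (?, ?, ?, ?)",
    [(10, "Loft", 1, 12), (11, "Studio", 2, 40), (12, "Cabin", 1, 3)],
)

# Sort on a listings column, but return columns from the joined locations row.
demo_cursor.execute(
    """
    SELECT locations.longitude, locations.latitude
    FROM listings
    JOIN locations ON listings.location_id = locations.id
    ORDER BY listings.number_of_reviews DESC
    LIMIT 1
    """
)
most_reviewed_coords = demo_cursor.fetchone()
```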

### Relations and GROUP BY

* What is the name of the host with the most reviews?

* What is the name of the host with the highest average listing price?

* What is the name of the host with the lowest average listing price?

* What is the name of the neighborhood with the most locations?

* What are the names of the neighborhoods with 10 locations?
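These questions combine a JOIN with `GROUP BY`, and the last one also needs `HAVING` to filter on an aggregate. The sketch below shows both on a different question (listings per host) using toy data; the rows and the `demo_` names are invented for illustration.

```python
import sqlite3

# Toy in-memory tables; the rows are invented for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE hosts (id INTEGER, host_name TEXT)")
demo_cursor.execute("CREATE TABLE listings (id INTEGER, host_id INTEGER)")
demo_cursor.executemany(
    "INSERT INTO hosts VALUES (?, ?)", [(1, "Ana"), (2, "Ben"), (3, "Cy")]
)
demo_cursor.executemany(
    "INSERT INTO listings VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 1), (13, 2), (14, 2), (15, 3)],
)

# GROUP BY produces one row per host; the aggregate runs within each group.
demo_cursor.execute(
    """
    SELECT hosts.host_name, COUNT(listings.id) AS listing_count
    FROM hosts
    JOIN listings ON listings.host_id = hosts.id
    GROUP BY hosts.id, hosts.host_name
    ORDER BY listing_count DESC
    """
)
listings_per_host = demo_cursor.fetchall()

# HAVING filters groups by their aggregate value (WHERE cannot do this).
demo_cursor.execute(
    """
    SELECT hosts.host_name
    FROM hosts
    JOIN listings ON listings.host_id = hosts.id
    GROUP BY hosts.id, hosts.host_name
    HAVING COUNT(listings.id) = 2
    """
)
hosts_with_two = demo_cursor.fetchall()
```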

The following questions will require joining three tables:

* What is the average occupancy of each neighborhood?

* What is the total number of reviews for each neighborhood?

* Write a query that returns the name and average listing price of each neighborhood.
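A three-table query chains two joins: `listings` to `locations` through `location_id`, then `locations` to `neighborhoods` through `neighborhood_id`. The sketch below shows that chain on a different aggregate (the highest price per neighborhood) with toy data; the rows and the `demo_` names are invented for illustration.

```python
import sqlite3

# Toy in-memory tables; the rows are invented for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE neighborhoods (id INTEGER, name TEXT)")
demo_cursor.execute("CREATE TABLE locations (id INTEGER, neighborhood_id INTEGER)")
demo_cursor.execute("CREATE TABLE listings (id INTEGER, location_id INTEGER, price INTEGER)")
demo_cursor.executemany(
    "INSERT INTO neighborhoods VALUES (?, ?)", [(1, "Harlem"), (2, "Astoria")]
)
demo_cursor.executemany(
    "INSERT INTO locations VALUES (?, ?)", [(100, 1), (101, 1), (102, 2)]
)
demo_cursor.executemany(
    "INSERT INTO listings VALUES (?, ?, ?)",
    [(10, 100, 100), (11, 101, 150), (12, 102, 90)],
)

# Chain the joins: listings -> locations -> neighborhoods, then group.
demo_cursor.execute(
    """
    SELECT neighborhoods.name, MAX(listings.price) AS max_price
    FROM listings
    JOIN locations ON listings.location_id = locations.id
    JOIN neighborhoods ON locations.neighborhood_id = neighborhoods.id
    GROUP BY neighborhoods.id, neighborhoods.name
    ORDER BY neighborhoods.name
    """
)
max_price_by_neighborhood = demo_cursor.fetchall()
```

Swapping `MAX` for another aggregate such as `AVG` or `SUM` keeps the same join structure.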

### Conclusion
In this lab we worked on the "Has One" and "Has Many" relations in SQL. We began by mapping out the relations between the tables, which gave us a better idea of how we could then join them in our queries. We finished the lab by creating queries using JOIN clauses that connect the tables using these relationships.