### Introduction
In order to efficiently store data, we often spread related information across multiple tables.

For instance, imagine that we're running a magazine company where users can have different types of subscriptions to different products. Different subscriptions might have many different properties. Each customer would also have lots of associated information.

We could have one table with all of the following information:

    order_id
    customer_id
    customer_name
    customer_address
    subscription_id
    subscription_description
    subscription_monthly_price
    subscription_length
    purchase_date
However, a lot of this information would be repeated. If the same customer has multiple subscriptions, that customer's name and address would be stored multiple times. If the same subscription type is ordered by multiple customers, then the subscription price and subscription description would be repeated. This would make our table unnecessarily large and hard to manage.

So instead, we can split our data into three tables:

orders would contain just the information necessary to describe what was ordered:
    order_id
    customer_id
    subscription_id
    purchase_date
subscriptions would contain the information to describe each type of subscription:
    subscription_id
    description
    price_per_month
    subscription_length
customers would contain the information for each customer:
    customer_id
    customer_name
    address
In this lesson, we'll learn the SQL commands that will help us work with data that is stored in multiple tables.

In [1]:
# SELECT * FROM orders LIMIT 5;

# SELECT * FROM subscriptions LIMIT 5;

# SELECT * FROM customers LIMIT 5;
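
The three-table layout described above can be tried out locally. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the column types and the sample rows are assumptions made for illustration and are not the lesson's actual data.

```python
import sqlite3

# In-memory database; column types and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT,
    address       TEXT
);
CREATE TABLE subscriptions (
    subscription_id     INTEGER PRIMARY KEY,
    description         TEXT,
    price_per_month     INTEGER,
    subscription_length TEXT
);
CREATE TABLE orders (
    order_id        INTEGER PRIMARY KEY,
    customer_id     INTEGER REFERENCES customers(customer_id),
    subscription_id INTEGER REFERENCES subscriptions(subscription_id),
    purchase_date   TEXT
);
""")

# One customer with two orders: the name and address are stored only once.
conn.execute("INSERT INTO customers VALUES (1, 'Example Customer', '1 Example St')")
conn.execute("INSERT INTO subscriptions VALUES (1, 'Example Magazine', 10, '12 months')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "2017-01-01"), (2, 1, 1, "2017-02-01")],
)

customer_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(customer_rows, order_rows)
```

Each order row carries only the two ids, so a second order by the same customer adds no extra copy of the customer's details.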

### Combining Tables Manually
Let's return to our magazine company. Suppose we have the three tables described in the introduction:

    orders
    subscriptions
    customers
If we just look at the orders table, we can't really tell what's happened in each order. However, if we refer to the other tables, we can get a complete picture.

Let's examine the order with an order_id of 2. It was purchased by the customer with a customer_id of 2.

To find out the customer's name, we look at the customers table and find the row with a customer_id value of 2. We can see that Customer 2's name is 'Jane Doe' and that she lives at '456 Park Ave'.

Doing this kind of matching is called joining two tables.
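
The manual lookup described above amounts to two separate queries: one to read the customer_id from the order, and one to find that customer. Here is a small sketch with sqlite3. Only the 'Jane Doe' row comes from the text; the reduced column set and the other rows are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
CREATE TABLE customers (customer_id INTEGER, customer_name TEXT, address TEXT);
INSERT INTO orders VALUES (1, 3), (2, 2);
INSERT INTO customers VALUES
    (2, 'Jane Doe', '456 Park Ave'),
    (3, 'Placeholder Name', 'Placeholder Address');
""")

# Step 1: find which customer placed order 2.
(customer_id,) = conn.execute(
    "SELECT customer_id FROM orders WHERE order_id = ?", (2,)
).fetchone()

# Step 2: look that customer up in the customers table.
customer = conn.execute(
    "SELECT customer_name, address FROM customers WHERE customer_id = ?",
    (customer_id,),
).fetchone()
print(customer)
```

Repeating these two steps for every order is the tedious part that a join automates.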

### Combining Tables with SQL
Combining tables manually is time-consuming. Luckily, SQL gives us an easy sequence for this: it's called a JOIN.

If we want to combine orders and customers, we would type:

In [1]:
'''
SELECT *
FROM orders
JOIN customers
  ON orders.customer_id = customers.customer_id;
'''


Let's break down this command:

    The first line selects all columns from our combined table. If we only want to select certain columns, we can specify which ones we want.
    The second line specifies the first table that we want to look in, orders.
    The third line uses JOIN to say that we want to combine information from orders with customers.
    The fourth line tells us how to combine the two tables. We want to match customer_id from orders with customer_id from customers.
Because column names are often repeated across multiple tables, we use the syntax table_name.column_name to be sure that our requests for columns are unambiguous. In our example, we use this syntax in the ON clause, but we can also use it in SELECT or any other clause where we refer to column names.
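
To see what the combined table looks like, the JOIN command broken down above can be run against a tiny database. This is a sketch only: the two-column tables and the customer names are assumptions, and ORDER BY is added just to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
CREATE TABLE customers (customer_id INTEGER, customer_name TEXT);
INSERT INTO orders VALUES (1, 2), (2, 1);
INSERT INTO customers VALUES (1, 'Customer A'), (2, 'Customer B');
""")

cursor = conn.execute("""
SELECT *
FROM orders
JOIN customers
  ON orders.customer_id = customers.customer_id
ORDER BY orders.order_id;
""")
# cursor.description holds one entry per result column; entry[0] is its name.
column_names = [entry[0] for entry in cursor.description]
joined_rows = cursor.fetchall()
print(column_names)
print(joined_rows)
```

Each result row contains the columns of orders followed by the columns of customers, so the customer_id value shows up twice, once from each table. That duplication is why the table_name.column_name syntax is needed.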

For example, if we only wanted to select the order_id from orders and the customer_name from customers, we could use the following query:

In [2]:
'''
SELECT orders.order_id,
   customers.customer_name
FROM orders
JOIN customers
  ON orders.customer_id = customers.customer_id;
'''


Join orders table and subscriptions table and select all columns.

Make sure to join on subscription_id.

In [3]:
'''
SELECT *
FROM orders
JOIN subscriptions
  ON orders.subscription_id = subscriptions.subscription_id;
'''


Don't remove the previous query.

Add a second query after your first one that only selects rows from the join where description is equal to 'Fashion Magazine'.

In [4]:
'''
SELECT *
FROM orders
JOIN subscriptions
  ON orders.subscription_id = subscriptions.subscription_id;

SELECT *
FROM orders
JOIN subscriptions
  ON orders.subscription_id = subscriptions.subscription_id
WHERE subscriptions.description = 'Fashion Magazine';
'''
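
The filtered join can be checked on sample data. In this sketch the order rows and the 'Sports Magazine' subscription are hypothetical; only 'Fashion Magazine' comes from the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, subscription_id INTEGER);
CREATE TABLE subscriptions (subscription_id INTEGER, description TEXT);
INSERT INTO orders VALUES (1, 1), (2, 2), (3, 1);
INSERT INTO subscriptions VALUES (1, 'Fashion Magazine'), (2, 'Sports Magazine');
""")

# WHERE is applied to the joined rows, so it can test a column from either table.
fashion_orders = conn.execute("""
SELECT orders.order_id, subscriptions.description
FROM orders
JOIN subscriptions
  ON orders.subscription_id = subscriptions.subscription_id
WHERE subscriptions.description = 'Fashion Magazine'
ORDER BY orders.order_id;
""").fetchall()
print(fashion_orders)
```

Only orders 1 and 3 remain, because order 2 joins to a subscription with a different description.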

### Inner Joins
Let's revisit how we joined orders and customers. For every possible value of customer_id in orders, there was a corresponding row of customers with the same customer_id.

What if that wasn't true?

For instance, imagine that our customers table was out of date, and was missing any information on customer 11. If that customer had an order in orders, what would happen when we joined the tables?

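When we perform a simple JOIN (often called an inner join), our result only includes rows that match our ON condition. The order placed by customer 11 would have no matching row in customers, so it would be left out of the result.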
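
The out-of-date customers scenario can be reproduced in a few lines. This sketch assumes two orders, one of them placed by customer 11, and a customers table that has no row for that customer; the names and ids other than 11 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
CREATE TABLE customers (customer_id INTEGER, customer_name TEXT);
INSERT INTO orders VALUES (1, 10), (2, 11);
-- customers is out of date: there is no row for customer 11
INSERT INTO customers VALUES (10, 'Customer Ten');
""")

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
matched = conn.execute("""
SELECT orders.order_id, customers.customer_name
FROM orders
JOIN customers
  ON orders.customer_id = customers.customer_id;
""").fetchall()
print(order_count, matched)
```

The orders table has two rows, but the join returns only the order whose customer_id has a match. The inner join drops order 2 without raising any error.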

Suppose we are working for The Codecademy Times, a newspaper with two types of subscriptions:

    print newspaper
    online articles
Some users subscribe to just the newspaper, some subscribe to just the online edition, and some subscribe to both.

The table newspaper contains information about the newspaper subscribers.

Count the number of subscribers who get a print newspaper using COUNT().

In [5]:
# SELECT COUNT(*) FROM newspaper;

Don't remove your previous query.

The table online contains information about the online subscribers.

Count the number of subscribers who get an online newspaper using COUNT().

In [6]:
'''
SELECT COUNT(*)
FROM newspaper;

SELECT COUNT(*)
FROM online;
'''


Don't remove your previous queries.

Join newspaper and online on id (the unique ID of the subscriber).

How many rows are in this table?

In [7]:
'''
SELECT COUNT(*)
FROM newspaper;

SELECT COUNT(*)
FROM online;

SELECT COUNT(*)
FROM newspaper
JOIN online
  ON newspaper.id = online.id;
'''


### Left Joins
What if we want to combine two tables and keep some of the un-matched rows?

SQL lets us do this through a command called LEFT JOIN. A left join will keep all rows from the first table, regardless of whether there is a matching row in the second table.
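
A small sketch of this behavior, assuming newspaper and online tables that share an id column. The first_name and email columns and all rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE newspaper (id INTEGER, first_name TEXT);
CREATE TABLE online (id INTEGER, email TEXT);
INSERT INTO newspaper VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO online VALUES (2, 'ben@example.com'), (4, 'dee@example.com');
""")

left_rows = conn.execute("""
SELECT newspaper.id, newspaper.first_name, online.email
FROM newspaper
LEFT JOIN online
  ON newspaper.id = online.id
ORDER BY newspaper.id;
""").fetchall()
print(left_rows)
```

All three newspaper rows appear in the result. Where online has no matching id, the online columns are NULL, which sqlite3 returns as None in Python. The online row with id 4 does not appear, because a left join keeps unmatched rows from the first table only.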

Let's return to our newspaper and online subscribers. Suppose we want to know how many users subscribe to the print newspaper, but not to the online edition.

Start by performing a left join of newspaper and online on id and selecting all columns.

In [8]:
'''
SELECT *
FROM newspaper
LEFT JOIN online
  ON newspaper.id = online.id;
'''


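In a left join, rows of newspaper that have no match in online get NULL in every column that comes from online. To find which users do not subscribe to the online edition, we add a WHERE clause that keeps only the rows where online.id IS NULL.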

In [9]:
'''
SELECT *
FROM newspaper
LEFT JOIN online
  ON newspaper.id = online.id
WHERE online.id IS NULL;
'''
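
The same filter can be used to answer the original question of how many print subscribers have no online subscription. This sketch uses made-up ids in single-column tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE newspaper (id INTEGER);
CREATE TABLE online (id INTEGER);
INSERT INTO newspaper VALUES (1), (2), (3);
INSERT INTO online VALUES (2), (4);
""")

# Keep only the newspaper rows for which the left join found no online match.
print_only_ids = [row[0] for row in conn.execute("""
SELECT newspaper.id
FROM newspaper
LEFT JOIN online
  ON newspaper.id = online.id
WHERE online.id IS NULL
ORDER BY newspaper.id;
""")]
print(print_only_ids, len(print_only_ids))
```

Subscriber 2 is filtered out because a matching online row exists, which leaves the two print-only subscribers.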
