INNER JOIN basics
The JOIN functionality is pretty much the main event when it comes to learning SQL. It's another clause with another specific place. Everything else you've learned fits around it.
How old and what gender is the person named Angela?
people table does not contain basic bio info. We need to JOIN
bios using the
id columns in
I intentionally split up
peoples so that the desired results would be obvious when figuring out
Start off with what we already know -- how to fetch the
people row for "Angela":
SELECT * FROM people WHERE name = 'Angela';
SELECT * FROM people INNER JOIN bios ON bios.id = people.id WHERE people.name = 'Angela'
Join all "people" and "bios"
This query is the same as above, except we leave out the
WHERE condition. By using
SELECT *, we are basically slapping together the columns of
bios, side by side:
SELECT * FROM people INNER JOIN bios; ON bios.id = people.id;
Note something about the results: even though
bios contains 6 rows, only 5 rows are returned. That's because there are only 5 possible rows in the
people table to match against.
Also, note that there are two
id columns, confusingly named the same (because they are named the same in their respective tables), with the same values. Using
SELECT * when doing JOINs is almost always a bit sloppy
Be more selective with columns
Same query concept as above, just a more verbose/explicit
SELECT people.id, people.name, bios.birthdate, bios.gender FROM people INNER JOIN bios ON bios.id = people.id;
What pets does the person named "Angela" own?
Same kind of query, using
Begin by finding the 'Angela' person record:
SELECT * FROM people WHERE name = 'Angela';
Pretend we weren't learning joins, how could we list the pets belonging to 'Angela'? We use the given
id value found in the
people table in a normal
SELECT * FROM pets WHERE owner_id = '001';
That's the expected result. If we only cared about 'Angela', this would work fine. But in most data situations, we care more than just records associated with a singular value that can be manually queried.
List Angela's name with her pets (hack)
One thing to notice in the above example, we still only have the "owner_id", not the owner's name. Of course, in this simple example, 'Angela' is our complete world and we don't need to list her on the table results to remind us of that.
We don't get the luxury of a simple, singular viewpoint (i.e. everyone is Angela!) when doing real-world data analysis. But as an exercise, pretend it was imperative to list the owner's actual name, 'Angela', instead of '001' in the result set. We can still hack it:
SELECT 'Angela' AS "owner_name", pets.name AS "pet_name", species, purchase_date, purchase_price FROM pets WHERE owner_id = '001';
List the pets with the name of their owners
The JOIN condition needed is a little more complicated in that
pets does not have an
id column. This will result in an error:
SELECT * FROM pets INNER JOIN people ON people.id = pets.id;
pets has an
owner_id column. Just a simple name change (though a change in name is all that's needed to make data joining terrifically painful) that an be accommodated by changing the reference in the JOIN ON clause:
SELECT * FROM pets INNER JOIN people ON people.id = pets.owner_id;
SELECT *, we told the interpreter to give us all the columns of both joined tables. Which it did. It didn't cause a meaningful issue for us now, but when joined tables have the same column names, confusions and errors arise easily.
Try to stick with best practice of not using
SELECT * and selecting only the columns that you need. The use of aliases can reduce confusion. More importantly, it's time to realize that columns are associated with tables, and when dealing with more than one table, we'll just have to identify columns with their corresponding table:
SELECT people.id AS owner_id, people.name AS owner_name, pets.name AS pet_name, pets.species FROM pets INNER JOIN people ON people.id = pets.owner_id;
Why did I include/alias certain columns but not others? It all depends on the context...
List pets.owner_id, people.name, bios.gender, pets.name, pets.species
This is a INNER JOIN between 3 tables. Syntax structure is the same order as before:
| owner_id | owner_name | owner_gender | pet_name | species | | -------- | ---------- | ------------ | -------- | ------- | | 001 | Angela | F | Einstein | dog | | 001 | Angela | F | Olly | cat | | 005 | Elliot | M | Flipper | dog | | 004 | Darlene | F | Shaggy | dog | | 002 | Bill | M | Garfield | cat | | 005 | Elliot | M | Qwerty | fish | | 002 | Bill | M | Pounce | cat | | 002 | Bill | M | Timba | cat |