# Pair Programming: Joins and Views

HD Sheets, February 6, 2025

Sources

https://www.sqlitetutorial.net/sqlite-join/

Beaulieu, Chapters 5 and 10

In [26]:
# Set Up and Connect

In [27]:
# Libraries

import sqlalchemy

# we will want Pandas for the data frame structure

import pandas as pd
import os

from dotenv import find_dotenv, dotenv_values

In [29]:
# Connect to the database
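# Alter this to reflect your username and password; this setup is for postgres on the same machine

# Read the .env entries as (name, value) pairs; the lookups below are positional,
# so they assume the password is the second entry and the username the third
keys = list(dotenv_values(find_dotenv('.env')).items())
os.environ['POSTGRES_PASS'] = keys[1][1]
os.environ['POSTGRES_USER'] = keys[2][1]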
host = 'localhost'
port = '5432'
db = 'chinook'

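# Build the connection URL, then create an Engine so that engine.dispose() works at the end
url = f"postgresql://{os.getenv('POSTGRES_USER')}:{os.getenv('POSTGRES_PASS')}@{host}:{port}/{db}"
engine = sqlalchemy.create_engine(url)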
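
A password containing characters such as `@` or `/` would break a connection URL built this way. Below is a minimal sketch of one way to guard against that, using the standard-library `urllib.parse.quote_plus`. The helper name `build_postgres_url` and the sample credentials are hypothetical and not part of the original notebook.

```python
from urllib.parse import quote_plus

def build_postgres_url(user, password, host="localhost", port="5432", db="chinook"):
    """Return a postgresql:// URL with the user and password percent-encoded."""
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{db}"

# Hypothetical credentials, for illustration only
example_url = build_postgres_url("bob", "p@ss/word")
print(example_url)
```

The encoded string can be passed to `sqlalchemy.create_engine` in place of a hand-built URL.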


In [30]:
# really just testing the connection

pd.read_sql_query("SELECT table_name  FROM information_schema.tables LIMIT 15",engine)

Unnamed: 0,table_name
0,employee
1,genre
2,invoice
3,pg_type
4,album
5,artist
6,customer
7,invoice_line
8,playlist
9,media_type


# Finding the artist for each album

Suppose we want a list of the artist for each album.

The album titles are in album, and the artist names are in artist.

Each row of album has album.artist_id, which holds the same ID number as artist.artist_id in artist, so we can join the two tables on these columns.

The result is ordered by title.



In [31]:
pd.read_sql_query("""SELECT
                        title,name
                    FROM album
                        INNER JOIN artist ON artist.artist_id =album.artist_id
                ORDER BY title;
                   """
                     ,engine)

Unnamed: 0,title,name
0,...And Justice For All,Metallica
1,[1997] Black Light Syndrome,"Terry Bozzio, Tony Levin & Steve Stevens"
2,20th Century Masters - The Millennium Collecti...,Scorpions
3,A-Sides,Soundgarden
4,"A Copland Celebration, Vol. I",Aaron Copland & London Symphony Orchestra
...,...,...
342,War,U2
343,Warner 25 Anos,Antônio Carlos Jobim
344,Weill: The Seven Deadly Sins,Kent Nagano and Orchestre de l'Opéra de Lyon
345,Worlds,Aaron Goldberg
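
To see the join mechanics on a small scale without a running postgres server, here is a rough sketch using Python's built-in `sqlite3` module and an in-memory database. The three artists and two albums are toy rows chosen for illustration, and SQLite stands in for postgres here as an assumption; the join syntax is the same.

```python
import sqlite3

# Toy in-memory database that mimics the two Chinook tables used above
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album  (album_id INTEGER PRIMARY KEY, title TEXT, artist_id INTEGER);
    INSERT INTO artist VALUES (1, 'Metallica'), (2, 'U2'), (3, 'Jaguares');
    INSERT INTO album  VALUES (10, 'War', 2), (11, '...And Justice For All', 1);
""")

# INNER JOIN keeps only the rows where album.artist_id matches an artist.artist_id
inner_rows = conn.execute("""
    SELECT title, name
    FROM album
        INNER JOIN artist ON artist.artist_id = album.artist_id
    ORDER BY title
""").fetchall()
conn.close()

print(inner_rows)
```

Jaguares has no album in the toy data, so it does not appear in the INNER JOIN result.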


# LEFT JOIN

We could also do this with a LEFT JOIN. Since every album has an associated artist, we get the same result as we did with the INNER JOIN.

In [32]:
pd.read_sql_query("""SELECT
                        title,name
                    FROM album
                        LEFT JOIN artist ON artist.artist_id =album.artist_id
                    ORDER BY title
                   """
                     ,engine)

Unnamed: 0,title,name
0,...And Justice For All,Metallica
1,[1997] Black Light Syndrome,"Terry Bozzio, Tony Levin & Steve Stevens"
2,20th Century Masters - The Millennium Collecti...,Scorpions
3,A-Sides,Soundgarden
4,"A Copland Celebration, Vol. I",Aaron Copland & London Symphony Orchestra
...,...,...
342,War,U2
343,Warner 25 Anos,Antônio Carlos Jobim
344,Weill: The Seven Deadly Sins,Kent Nagano and Orchestre de l'Opéra de Lyon
345,Worlds,Aaron Goldberg
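
The two joins only differ when a row in the left table has no match. Here is a small sketch of that case, again with `sqlite3` and toy data. The album 'Untitled Demo' with no artist is hypothetical and does not exist in Chinook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album  (album_id INTEGER PRIMARY KEY, title TEXT, artist_id INTEGER);
    INSERT INTO artist VALUES (1, 'U2');
    -- Hypothetical album with no artist, to show where the joins diverge
    INSERT INTO album  VALUES (10, 'War', 1), (11, 'Untitled Demo', NULL);
""")

query = """
    SELECT title, name
    FROM album
        {kind} JOIN artist ON artist.artist_id = album.artist_id
    ORDER BY title
"""
inner_result = conn.execute(query.format(kind="INNER")).fetchall()
left_result = conn.execute(query.format(kind="LEFT")).fetchall()
conn.close()

print(inner_result)  # the unmatched album is dropped
print(left_result)   # the unmatched album is kept, with None for the artist name
```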


# RIGHT JOIN

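If we do the same join with a RIGHT JOIN, the result changes. A RIGHT JOIN keeps every row of artist, so artists with no album appear with an empty title (see the last rows of the output), while artists with several albums appear once per album.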

In [33]:
pd.read_sql_query("""SELECT
                        title,name
                    FROM album
                        RIGHT JOIN artist ON artist.artist_id =album.artist_id
                    ORDER BY title
                   """
                     ,engine)

Unnamed: 0,title,name
0,...And Justice For All,Metallica
1,[1997] Black Light Syndrome,"Terry Bozzio, Tony Levin & Steve Stevens"
2,20th Century Masters - The Millennium Collecti...,Scorpions
3,A-Sides,Soundgarden
4,"A Copland Celebration, Vol. I",Aaron Copland & London Symphony Orchestra
...,...,...
413,,Jaguares
414,,Barão Vermelho
415,,João Gilberto
416,,Los Lonely Boys
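
A RIGHT JOIN gives the same rows as a LEFT JOIN with the two tables swapped. The sketch below uses that swapped form with `sqlite3` and toy data, on the assumption that some older SQLite builds do not support RIGHT JOIN. It orders by name and then title so the row order does not depend on how the database sorts NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album  (album_id INTEGER PRIMARY KEY, title TEXT, artist_id INTEGER);
    INSERT INTO artist VALUES (1, 'Metallica'), (2, 'U2'), (3, 'Jaguares');
    INSERT INTO album  VALUES (10, 'War', 2),
                              (11, '...And Justice For All', 1),
                              (12, 'Master Of Puppets', 1);
""")

# Equivalent to: FROM album RIGHT JOIN artist ON artist.artist_id = album.artist_id
right_style_rows = conn.execute("""
    SELECT title, name
    FROM artist
        LEFT JOIN album ON artist.artist_id = album.artist_id
    ORDER BY name, title
""").fetchall()
conn.close()

for row in right_style_rows:
    print(row)
```

Jaguares appears once with no title, and Metallica appears once for each of its two albums.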


# CROSS JOIN

A CROSS JOIN creates all possible combinations of rows from the two tables; it is also called a "Cartesian join".

In the SELECT below we get the first name of each employee paired with every possible media type.

Cross joins can be useful for creating large and varied test sets for use in development.

They can also generate a "grid" of all combinations to calculate over, for example 4 sales categories over each of 12 months.

In [34]:
pd.read_sql_query("""SELECT employee.first_name, media_type.name mt_name FROM employee
                     CROSS JOIN media_type""",engine)

Unnamed: 0,first_name,mt_name
0,Andrew,MPEG audio file
1,Nancy,MPEG audio file
2,Jane,MPEG audio file
3,Margaret,MPEG audio file
4,Steve,MPEG audio file
5,Michael,MPEG audio file
6,Robert,MPEG audio file
7,Laura,MPEG audio file
8,Andrew,Protected AAC audio file
9,Nancy,Protected AAC audio file
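
The "grid" idea of 4 sales categories over 12 months can be sketched the same way. The table names `category` and `month` and the four category names are made up for this illustration, and `sqlite3` is used so the example runs without postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (cat_name TEXT)")
conn.execute("CREATE TABLE month (month_no INTEGER)")

# Hypothetical sales categories and the twelve month numbers
conn.executemany("INSERT INTO category VALUES (?)",
                 [("Rock",), ("Jazz",), ("Latin",), ("Classical",)])
conn.executemany("INSERT INTO month VALUES (?)", [(m,) for m in range(1, 13)])

# Every category paired with every month
grid = conn.execute("""
    SELECT cat_name, month_no
    FROM category
        CROSS JOIN month
    ORDER BY cat_name, month_no
""").fetchall()
conn.close()

print(len(grid))  # 4 categories x 12 months = 48 rows
```

The row count of a cross join is the product of the two table sizes, which is why it grows quickly on larger tables.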


# Views

A view is a stored query that can be used like a table; PostgreSQL runs the query each time the view is read.

I haven't figured out how to create a view using SQLAlchemy; that seems to be an issue.
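
One approach that may work is to run the `CREATE VIEW` statement as plain SQL inside a transaction, using `sqlalchemy.text` and `Engine.begin()`. This is only a sketch, assuming SQLAlchemy 1.4 or later and a database user that is allowed to create views. The helper name `create_enames_view` is hypothetical, and the demo runs against an in-memory SQLite engine with one toy row rather than the chinook database.

```python
from sqlalchemy import create_engine, text

def create_enames_view(engine):
    """Create the enames view; engine.begin() commits when the block exits."""
    with engine.begin() as conn:
        conn.execute(text(
            "CREATE VIEW enames AS SELECT first_name, last_name FROM employee"
        ))

# Stand-in engine with a toy employee table (assumption: not the real chinook data)
demo_engine = create_engine("sqlite://")
with demo_engine.begin() as conn:
    conn.execute(text("CREATE TABLE employee (first_name TEXT, last_name TEXT, title TEXT)"))
    conn.execute(text("INSERT INTO employee VALUES ('Andrew', 'Adams', 'General Manager')"))

create_enames_view(demo_engine)

# Read the view back to confirm it exists
with demo_engine.connect() as conn:
    view_rows = [tuple(row) for row in conn.execute(text("SELECT * FROM enames"))]
demo_engine.dispose()

print(view_rows)
```

Against postgres, the same helper would be called with the notebook's own engine, provided the connected user has permission to create views in the schema.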

We can do it through the postgres command window (psql)

1.) Start the postgres command window and log in as the superuser postgres

2.) Connect to the chinook database

        \connect chinook

3.) Create a view

        CREATE VIEW enames AS SELECT first_name, last_name FROM employee;
        
4.) Use \dv to see all the views, and verify it works

5.) Grant your user access to the view

       GRANT SELECT ON ALL TABLES IN SCHEMA public TO bob;
       
       My user is bob; you may have a different username.
       
       Note: when we set up bob as a user, we granted him SELECT privileges, but that grant only covered the tables and
       views that existed at the time, so we have to grant it again after creating new ones. The ALTER DEFAULT PRIVILEGES
       command in postgres can change this default.

6.) We can now treat the view (enames) as though it were a table.
     This can be very helpful if we have a large database and really complex queries to carry out, since the view hides the query behind a simple name.
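
Because an ordinary view re-runs its query each time it is read, it always reflects the current contents of the underlying table. Here is a small sketch of that behavior with `sqlite3` and two toy employee rows (an assumption for illustration; postgres views behave the same way in this respect).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (first_name TEXT, last_name TEXT, title TEXT)")
conn.execute("INSERT INTO employee VALUES ('Andrew', 'Adams', 'General Manager')")

# Same view definition as in step 3
conn.execute("CREATE VIEW enames AS SELECT first_name, last_name FROM employee")
before = conn.execute("SELECT * FROM enames").fetchall()

# Add a row to the base table AFTER the view was created
conn.execute("INSERT INTO employee VALUES ('Nancy', 'Edwards', 'Sales Manager')")
after = conn.execute("SELECT * FROM enames ORDER BY first_name").fetchall()
conn.close()

print(before)
print(after)  # the new employee shows up without recreating the view
```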

In [35]:
pd.read_sql_query("""SELECT *
                     FROM enames;""",engine)

ProgrammingError: (psycopg2.errors.UndefinedTable) relation "enames" does not exist
LINE 2:                      FROM enames;
                                  ^

[SQL: SELECT *
                     FROM enames;]
(Background on this error at: https://sqlalche.me/e/20/f405)

In [None]:
engine.dispose()