I'm using the Chinook database, a sample relational database that models a digital music store, to demonstrate my SQL and data analysis skills in a Python environment. The project exercises queries over artists, albums, tracks, employees, customers, and purchases, and shows how SQL results can be loaded into pandas through sqlite3 for exploration and reporting.

Write a SQL query that shows the total number of distinct tracks sold per invoice ordered by invoice date.
How many tracks were sold in the earliest invoice?
How many tracks were sold in the latest invoice?

In [2]:
import sqlite3
import pandas as pd

# Connect to the Chinook database
conn = sqlite3.connect("chinook.db")

# SQL Query
query = """
SELECT 
    invoices.InvoiceId,
    invoices.InvoiceDate,
    COUNT(DISTINCT invoice_items.TrackId) AS DistinctTracksSold
FROM 
    invoices
JOIN 
    invoice_items ON invoices.InvoiceId = invoice_items.InvoiceId
GROUP BY 
    invoices.InvoiceId
ORDER BY 
    invoices.InvoiceDate ASC;
"""

# Run the query and store results in a DataFrame
df = pd.read_sql_query(query, conn)

# Display the first few rows
df.head()

Unnamed: 0,InvoiceId,InvoiceDate,DistinctTracksSold
0,1,2009-01-01 00:00:00,2
1,2,2009-01-02 00:00:00,4
2,3,2009-01-03 00:00:00,6
3,4,2009-01-06 00:00:00,9
4,5,2009-01-11 00:00:00,14


In [4]:
# Earliest invoice (first row)
print("Earliest invoice - Tracks sold:", df.iloc[0]["DistinctTracksSold"])

# Latest invoice (last row)
print("Latest invoice - Tracks sold:", df.iloc[-1]["DistinctTracksSold"])

Earliest invoice - Tracks sold: 2
Latest invoice - Tracks sold: 1
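
The first and last rows can also be fetched directly in SQL with ORDER BY and LIMIT 1. The sketch below is only an illustration on an in-memory database with made-up rows, not on chinook.db; the helper name `tracks_on_invoice` is hypothetical. It adds InvoiceId as a tiebreaker because two invoices can share the same date.

```python
import sqlite3

# In-memory stand-in with made-up rows (not the real Chinook data)
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE invoices (InvoiceId INTEGER PRIMARY KEY, InvoiceDate TEXT);
CREATE TABLE invoice_items (InvoiceLineId INTEGER PRIMARY KEY, InvoiceId INTEGER, TrackId INTEGER);
INSERT INTO invoices VALUES
    (1, '2009-01-01 00:00:00'),
    (2, '2009-01-02 00:00:00'),
    (3, '2009-01-02 00:00:00');
INSERT INTO invoice_items VALUES
    (1, 1, 10), (2, 1, 11),
    (3, 2, 12),
    (4, 3, 13), (5, 3, 13), (6, 3, 14);
""")

def tracks_on_invoice(conn, newest):
    """Return (InvoiceId, distinct track count) for the earliest or latest invoice."""
    # Only a fixed keyword is interpolated, never user input
    direction = "DESC" if newest else "ASC"
    return conn.execute(f"""
        SELECT invoices.InvoiceId,
               COUNT(DISTINCT invoice_items.TrackId) AS DistinctTracksSold
        FROM invoices
        JOIN invoice_items ON invoices.InvoiceId = invoice_items.InvoiceId
        GROUP BY invoices.InvoiceId
        ORDER BY invoices.InvoiceDate {direction}, invoices.InvoiceId {direction}
        LIMIT 1
    """).fetchone()

earliest = tracks_on_invoice(demo, newest=False)
latest = tracks_on_invoice(demo, newest=True)
print("Earliest:", earliest)
print("Latest:", latest)
```

In the toy data, invoices 2 and 3 share a date, so the tiebreaker decides which one counts as the latest.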


Get a "glimpse" of the entire Chinook database by listing its tables

In [42]:
# Get list of all tables in the database
tables_query = "SELECT name FROM sqlite_master WHERE type='table';"
tables_df = pd.read_sql_query(tables_query, conn)
print("Tables in the database:")
display(tables_df)

Tables in the database:


Unnamed: 0,name
0,albums
1,sqlite_sequence
2,artists
3,customers
4,employees
5,genres
6,invoices
7,invoice_items
8,media_types
9,playlists


Write a SQL query to find the number of distinct artists of the tracks sold in each invoice.
How many rows are in the result? 
What is the maximum such count?

In [6]:
# SQL query
query = """
SELECT 
    invoices.InvoiceId,
    COUNT(DISTINCT artists.ArtistId) AS DistinctArtists
FROM 
    invoice_items
JOIN 
    tracks ON invoice_items.TrackId = tracks.TrackId
JOIN 
    albums ON tracks.AlbumId = albums.AlbumId
JOIN 
    artists ON albums.ArtistId = artists.ArtistId
JOIN 
    invoices ON invoice_items.InvoiceId = invoices.InvoiceId
GROUP BY 
    invoices.InvoiceId
ORDER BY 
    invoices.InvoiceId;
"""

# Run query
df = pd.read_sql_query(query, conn)

# Show the first few results
df.head()

Unnamed: 0,InvoiceId,DistinctArtists
0,1,1
1,2,1
2,3,2
3,4,5
4,5,9


In [8]:
# Number of rows (invoices)
print("Number of invoices:", len(df))

# Maximum number of distinct artists in any invoice
print("Maximum distinct artists in an invoice:", df["DistinctArtists"].max())

Number of invoices: 412
Maximum distinct artists in an invoice: 11


Write a SQL query to list the number of distinct genres of tracks sold in each invoice. You may add WHERE conditions as needed to answer the following.
How many distinct genres were sold in invoice ID 138?
How many distinct genres were sold in invoice ID 40?

In [11]:
# SQL query
query = """
SELECT 
    invoice_items.InvoiceId,
    COUNT(DISTINCT tracks.GenreId) AS DistinctGenres
FROM 
    invoice_items
JOIN 
    tracks ON invoice_items.TrackId = tracks.TrackId
GROUP BY 
    invoice_items.InvoiceId
ORDER BY 
    invoice_items.InvoiceId;
"""

# Execute query
df = pd.read_sql_query(query, conn)

# Look up specific invoice IDs
genres_138 = df[df["InvoiceId"] == 138]["DistinctGenres"].values[0]
genres_40 = df[df["InvoiceId"] == 40]["DistinctGenres"].values[0]

print("Invoice 138 - Distinct genres:", genres_138)
print("Invoice 40 - Distinct genres:", genres_40)

Invoice 138 - Distinct genres: 7
Invoice 40 - Distinct genres: 4
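
Since the prompt allows WHERE conditions, the filter can also be pushed into SQL instead of being applied to the DataFrame afterwards. This is a minimal sketch on an in-memory database with made-up rows; the helper `distinct_genres` is a hypothetical name, and the `?` placeholder is the sqlite3 module's parameter style.

```python
import sqlite3

# In-memory stand-in with made-up rows (not the real Chinook data)
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, GenreId INTEGER);
CREATE TABLE invoice_items (InvoiceLineId INTEGER PRIMARY KEY, InvoiceId INTEGER, TrackId INTEGER);
INSERT INTO tracks VALUES (1, 10), (2, 10), (3, 20), (4, 30);
INSERT INTO invoice_items VALUES (1, 100, 1), (2, 100, 2), (3, 100, 3), (4, 101, 4);
""")

def distinct_genres(conn, invoice_id):
    """Return the number of distinct genres on one invoice (0 if it has no lines)."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT tracks.GenreId)
        FROM invoice_items
        JOIN tracks ON invoice_items.TrackId = tracks.TrackId
        WHERE invoice_items.InvoiceId = ?
        """,
        (invoice_id,),  # bound parameter, so the ID is never pasted into the SQL text
    ).fetchone()
    return row[0]

genres_100 = distinct_genres(demo, 100)
genres_101 = distinct_genres(demo, 101)
genres_missing = distinct_genres(demo, 999)
print(genres_100, genres_101, genres_missing)
```

Unlike the `.values[0]` lookup, this version returns 0 for an invoice ID that does not exist rather than raising an IndexError.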


Write a SQL query to list each customer’s name, the name of the employee that is their support representative, and the employee’s job title. You may add WHERE conditions as needed to answer the following.
Who was the support representative for invoice ID 100?
How many customers have the same first name as their support representative?
How many invoices total are for a customer with the same first name as their support representative?

In [13]:
# 1. List each customer's name, their support rep, and the rep's job title
query_customers_reps = """
SELECT 
    customers.CustomerId,
    customers.FirstName AS CustomerFirstName,
    customers.LastName AS CustomerLastName,
    employees.FirstName AS RepFirstName,
    employees.LastName AS RepLastName,
    employees.Title AS RepJobTitle
FROM 
    customers
JOIN 
    employees ON customers.SupportRepId = employees.EmployeeId;
"""
df_customers_reps = pd.read_sql_query(query_customers_reps, conn)
print("Customer and Support Rep Info:")
display(df_customers_reps.head())

Customer and Support Rep Info:


Unnamed: 0,CustomerId,CustomerFirstName,CustomerLastName,RepFirstName,RepLastName,RepJobTitle
0,1,Luís,Gonçalves,Jane,Peacock,Sales Support Agent
1,2,Leonie,Köhler,Steve,Johnson,Sales Support Agent
2,3,François,Tremblay,Jane,Peacock,Sales Support Agent
3,4,Bjørn,Hansen,Margaret,Park,Sales Support Agent
4,5,František,Wichterlová,Margaret,Park,Sales Support Agent


In [15]:
# 2. Who was the support rep for Invoice ID 100?
query_invoice_100 = """
SELECT 
    invoices.InvoiceId,
    customers.FirstName AS CustomerFirstName,
    customers.LastName AS CustomerLastName,
    employees.FirstName AS RepFirstName,
    employees.LastName AS RepLastName,
    employees.Title AS RepJobTitle
FROM 
    invoices
JOIN 
    customers ON invoices.CustomerId = customers.CustomerId
JOIN 
    employees ON customers.SupportRepId = employees.EmployeeId
WHERE 
    invoices.InvoiceId = 100;
"""
df_invoice_100 = pd.read_sql_query(query_invoice_100, conn)
print("Support Rep for Invoice ID 100:")
display(df_invoice_100)

Support Rep for Invoice ID 100:


Unnamed: 0,InvoiceId,CustomerFirstName,CustomerLastName,RepFirstName,RepLastName,RepJobTitle
0,100,František,Wichterlová,Margaret,Park,Sales Support Agent


In [17]:
# 3. How many customers share the same first name as their support rep?
query_matching_names = """
SELECT 
    COUNT(*) AS MatchingFirstNames
FROM 
    customers
JOIN 
    employees ON customers.SupportRepId = employees.EmployeeId
WHERE 
    customers.FirstName = employees.FirstName;
"""
df_matching_names = pd.read_sql_query(query_matching_names, conn)
print("Number of customers who share first name with their rep:")
display(df_matching_names)

Number of customers who share first name with their rep:


Unnamed: 0,MatchingFirstNames
0,1


In [19]:
# 4. How many invoices are from customers who share first name with their rep?
query_invoice_count = """
SELECT 
    COUNT(*) AS InvoiceCount
FROM 
    invoices
JOIN 
    customers ON invoices.CustomerId = customers.CustomerId
JOIN 
    employees ON customers.SupportRepId = employees.EmployeeId
WHERE 
    customers.FirstName = employees.FirstName;
"""
df_invoice_count = pd.read_sql_query(query_invoice_count, conn)
print("Number of invoices for matching name customers and reps:")
display(df_invoice_count)

Number of invoices for matching name customers and reps:


Unnamed: 0,InvoiceCount
0,7


Write a SQL command to add a new customer named 'John Smith' from 'San Francisco' with `CustomerId` 60 into the `Customers` table. Assume the highest existing CustomerId is 59.

In [25]:
# Create a cursor for executing the INSERT statement
cursor = conn.cursor()

# SQL command to insert a new customer
insert_query = """
INSERT INTO Customers (CustomerId, FirstName, LastName, City) 
VALUES (60, 'John', 'Smith', 'San Francisco');
"""

# Execute and commit
cursor.execute(insert_query)
conn.commit()

# Optional: Verify the insertion
verify_query = "SELECT * FROM Customers WHERE CustomerId = 60;"
df = pd.read_sql_query(verify_query, conn)
display(df)

IntegrityError: NOT NULL constraint failed: customers.Email
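
The IntegrityError shows that the customers table requires a value for Email, which the prompt does not provide. The sketch below reproduces the failure and one possible fix on a simplified in-memory table. The table definition is a cut-down assumption, not the full Chinook schema, and the email address is a placeholder I made up for illustration.

```python
import sqlite3

demo = sqlite3.connect(":memory:")

# Simplified stand-in for the customers table: only the columns used here,
# with Email declared NOT NULL as the error message indicates
demo.execute("""
CREATE TABLE customers (
    CustomerId INTEGER PRIMARY KEY,
    FirstName  TEXT NOT NULL,
    LastName   TEXT NOT NULL,
    City       TEXT,
    Email      TEXT NOT NULL
)
""")

# Omitting Email reproduces the failure
try:
    demo.execute(
        "INSERT INTO customers (CustomerId, FirstName, LastName, City) VALUES (?, ?, ?, ?)",
        (60, "John", "Smith", "San Francisco"),
    )
    error_message = None
except sqlite3.IntegrityError as exc:
    error_message = str(exc)

# Supplying a value for Email lets the insert succeed (placeholder address)
demo.execute(
    "INSERT INTO customers (CustomerId, FirstName, LastName, City, Email) VALUES (?, ?, ?, ?, ?)",
    (60, "John", "Smith", "San Francisco", "john.smith@example.com"),
)
demo.commit()

inserted = demo.execute(
    "SELECT FirstName, LastName, City, Email FROM customers WHERE CustomerId = 60"
).fetchone()
print("Error without Email:", error_message)
print("Inserted row:", inserted)
```
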

Write a query to find the names of customers who have bought more than the average number of tracks, ordered by customer first name.
How many are there?
What is the first name of the customer first in the list?
What is the first name of the customer last in the list?

In [27]:
# SQL query for customers who bought more than average tracks
query = """
SELECT 
    customers.FirstName,
    customers.LastName,
    COUNT(invoice_items.InvoiceLineId) AS TracksBought
FROM 
    customers
JOIN 
    invoices ON customers.CustomerId = invoices.CustomerId
JOIN 
    invoice_items ON invoices.InvoiceId = invoice_items.InvoiceId
GROUP BY 
    customers.CustomerId
HAVING 
    COUNT(invoice_items.InvoiceLineId) > (
        SELECT 
            AVG(TrackCount)
        FROM (
            SELECT 
                COUNT(*) AS TrackCount
            FROM 
                invoice_items
            JOIN 
                invoices ON invoice_items.InvoiceId = invoices.InvoiceId
            GROUP BY 
                invoices.CustomerId
        )
    )
ORDER BY 
    customers.FirstName;
"""

# Execute and display results
df = pd.read_sql_query(query, conn)
display(df)

# How many customers?
print("Number of customers above average:", len(df))

# First and last first names in the list
print("First in list:", df.iloc[0]["FirstName"])
print("Last in list:", df.iloc[-1]["FirstName"])

Unnamed: 0,FirstName,LastName,TracksBought
0,Aaron,Mitchell,38
1,Alexandre,Rocha,38
2,Astrid,Gruber,38
3,Bjørn,Hansen,38
4,Camille,Bernard,38
5,Daan,Peeters,38
6,Dan,Miller,38
7,Diego,Gutiérrez,38
8,Dominique,Lefebvre,38
9,Eduardo,Martins,38


Number of customers above average: 58
First in list: Aaron
Last in list: Wyatt
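
The HAVING clause above compares each customer's count against the average of the per-customer counts, computed in a nested subquery. An equivalent way to write the same idea is with a common table expression, sketched here on a small made-up table (the `purchases` table and its rows are assumptions for illustration only):

```python
import sqlite3

# In-memory stand-in with made-up rows (not the real Chinook data)
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE purchases (CustomerName TEXT, TrackId INTEGER);
INSERT INTO purchases VALUES
    ('Ana', 1), ('Ana', 2), ('Ana', 3), ('Ana', 4),
    ('Ben', 5), ('Ben', 6), ('Ben', 7),
    ('Cy', 8);
""")

rows = demo.execute("""
WITH per_customer AS (
    -- one row per customer with the number of tracks bought
    SELECT CustomerName, COUNT(*) AS TrackCount
    FROM purchases
    GROUP BY CustomerName
)
SELECT CustomerName, TrackCount
FROM per_customer
WHERE TrackCount > (SELECT AVG(TrackCount) FROM per_customer)
ORDER BY CustomerName
""").fetchall()

print(rows)
```

Here the counts are 4, 3, and 1, so the average is 8/3 and only Ana and Ben are above it. Naming the grouped result once means the counting logic is not repeated in the outer query and the subquery.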


Write a query to find the name of the genres that have more tracks than the average number of tracks per genre. How many are there?

In [29]:
# SQL query to find genres with more tracks than average
query = """
SELECT 
    genres.Name AS GenreName,
    COUNT(tracks.TrackId) AS TrackCount
FROM 
    genres
JOIN 
    tracks ON genres.GenreId = tracks.GenreId
GROUP BY 
    genres.GenreId
HAVING 
    COUNT(tracks.TrackId) > (
        SELECT 
            AVG(TrackCount)
        FROM (
            SELECT 
                COUNT(*) AS TrackCount
            FROM 
                tracks
            GROUP BY 
                GenreId
        )
    )
ORDER BY 
    TrackCount DESC;
"""

# Run the query
df = pd.read_sql_query(query, conn)
display(df)

# How many genres?
print("Number of genres above average:", len(df))

Unnamed: 0,GenreName,TrackCount
0,Rock,1297
1,Latin,579
2,Metal,374
3,Alternative & Punk,332


Number of genres above average: 4


Write a query to find the name of the genre that has the most tracks. What is it?

In [31]:
# Query to find the genre with the most tracks
query = """
SELECT 
    genres.Name AS GenreName,
    COUNT(tracks.TrackId) AS TrackCount
FROM 
    genres
JOIN 
    tracks ON genres.GenreId = tracks.GenreId
GROUP BY 
    genres.GenreId
ORDER BY 
    TrackCount DESC
LIMIT 1;
"""

# Run and display
df = pd.read_sql_query(query, conn)
display(df)

# Extract the genre name
print("Genre with the most tracks:", df.iloc[0]["GenreName"])

Unnamed: 0,GenreName,TrackCount
0,Rock,1297


Genre with the most tracks: Rock


Multiple-choice questions:

In [33]:
# List of questions with correct answers
quiz = [
    ("Which SQL statement is used to create a user in a database?", "CREATE USER"),
    ("Which SQL command is used to remove a user's privileges?", "REVOKE"),
    ("The GRANT command in SQL is used for what purpose?", "All of the above."),
    ("The ______ command is used to provide any type of permission to a user in SQL.", "GRANT"),
    ("Which SQL command removes all privileges of a user, but does not delete the user itself?", "REVOKE ALL"),
    ("What is the purpose of a Foreign Key in a relational database?", "To create a link between two tables"),
    ("What happens if you try to insert a row with a foreign key that does not exist in the referenced table?", "An error is thrown"),
    ("What does the ON DELETE CASCADE constraint do in SQL?", "It deletes rows in the referencing table if the referenced row is deleted"),
    ("Which ON DELETE constraint sets the foreign key field to NULL when the referenced row is deleted?", "ON DELETE SET NULL"),
    ("Which SQL command is used to create a new view?", "CREATE VIEW"),
    ("Which SQL command is used to remove an existing view?", "DROP VIEW"),
    ("What is the purpose of an index in a SQL database?", "To speed up queries that filter by the indexed column"),
    ("What command is used to create an index in SQL?", "CREATE INDEX")
]

# Print each question with its correct answer
print("SQL Quiz Answers:\n")
for i, (question, answer) in enumerate(quiz, start=18):
    print(f"Question {i}: {question}")
    print(f"Answer: {answer}\n")

SQL Quiz Answers:

Question 18: Which SQL statement is used to create a user in a database?
Answer: CREATE USER

Question 19: Which SQL command is used to remove a user's privileges?
Answer: REVOKE

Question 20: The GRANT command in SQL is used for what purpose?
Answer: All of the above.

Question 21: The ______ command is used to provide any type of permission to a user in SQL.
Answer: GRANT

Question 22: Which SQL command removes all privileges of a user, but does not delete the user itself?
Answer: REVOKE ALL

Question 23: What is the purpose of a Foreign Key in a relational database?
Answer: To create a link between two tables

Question 24: What happens if you try to insert a row with a foreign key that does not exist in the referenced table?
Answer: An error is thrown

Question 25: What does the ON DELETE CASCADE constraint do in SQL?
Answer: It deletes rows in the referencing table if the referenced row is deleted

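Question 26: Which ON DELETE constraint sets the foreign key field to NULL when the referenced row is deleted?
Answer: ON DELETE SET NULL

Question 27: Which SQL command is used to create a new view?
Answer: CREATE VIEW

Question 28: Which SQL command is used to remove an existing view?
Answer: DROP VIEW

Question 29: What is the purpose of an index in a SQL database?
Answer: To speed up queries that filter by the indexed column

Question 30: What command is used to create an index in SQL?
Answer: CREATE INDEX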
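
The foreign-key answers in the quiz can be checked in SQLite itself. This is a sketch on an in-memory database with made-up tables (`favorites` in particular is hypothetical). It assumes foreign-key enforcement is switched on with `PRAGMA foreign_keys = ON`, which SQLite requires per connection. The user and privilege questions are not covered, because SQLite does not implement GRANT or REVOKE.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on
demo.execute("PRAGMA foreign_keys = ON")
demo.executescript("""
CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE albums (
    AlbumId  INTEGER PRIMARY KEY,
    Title    TEXT,
    ArtistId INTEGER REFERENCES artists(ArtistId) ON DELETE CASCADE
);
CREATE TABLE favorites (
    FavoriteId INTEGER PRIMARY KEY,
    ArtistId   INTEGER REFERENCES artists(ArtistId) ON DELETE SET NULL
);
INSERT INTO artists VALUES (1, 'Demo Artist');
INSERT INTO albums VALUES (1, 'Demo Album', 1);
INSERT INTO favorites VALUES (1, 1);
""")

# Inserting a child row that points at a missing parent raises an error
try:
    demo.execute("INSERT INTO albums VALUES (2, 'Orphan Album', 999)")
    fk_error = None
except sqlite3.IntegrityError as exc:
    fk_error = str(exc)

# Deleting the parent removes the album (CASCADE) and nulls the favorite (SET NULL)
demo.execute("DELETE FROM artists WHERE ArtistId = 1")
albums_left = demo.execute("SELECT COUNT(*) FROM albums").fetchone()[0]
favorite_artist = demo.execute(
    "SELECT ArtistId FROM favorites WHERE FavoriteId = 1"
).fetchone()[0]

print("Insert error:", fk_error)
print("Albums left after delete:", albums_left)
print("Favorite's ArtistId after delete:", favorite_artist)
```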

Write an SQL query to find all albums with “Piano” in the title. 
How many are there?
Which of the following composers is listed in the titles?

In [35]:
# Query for albums with "Piano" in the title
query = """
SELECT 
    Title
FROM 
    albums
WHERE 
    Title LIKE '%Piano%';
"""

# Run query
df = pd.read_sql_query(query, conn)
display(df)

# How many albums contain "Piano"?
print("Number of albums with 'Piano' in the title:", len(df))

# List the matching titles so any composer names can be read off
print("\nPossible composer names in titles:")
for title in df["Title"]:
    print("-", title)

Unnamed: 0,Title
0,Chopin: Piano Concertos Nos. 1 & 2
1,Beethoven Piano Sonatas: Moonlight & Pastorale
2,"Szymanowski: Piano Works, Vol. 1"


Number of albums with 'Piano' in the title: 3

Possible composer names in titles:
- Chopin: Piano Concertos Nos. 1 & 2
- Beethoven Piano Sonatas: Moonlight & Pastorale
- Szymanowski: Piano Works, Vol. 1


In [37]:
composers = ["Mozart", "Beethoven", "Chopin", "Bach", "Debussy", "Liszt"]
found = [c for c in composers if any(c in title for title in df["Title"])]
print("\nComposers found in titles:", found)


Composers found in titles: ['Beethoven', 'Chopin']


Write an SQL query to find all invoices dated on or after December 1, 2013.
How many are there?
Where is the invoice with the highest total from?

In [40]:
# Query: all invoices on or after Dec 1, 2013
query = """
SELECT 
    InvoiceId,
    InvoiceDate,
    BillingCity,
    BillingCountry,
    Total
FROM 
    invoices
WHERE 
    InvoiceDate >= '2013-12-01'
ORDER BY 
    InvoiceDate;
"""

# Run the query
df = pd.read_sql_query(query, conn)
display(df)

# 1. How many invoices?
print("Number of invoices on or after Dec 1, 2013:", len(df))

# 2. Invoice with highest total
max_invoice = df.loc[df["Total"].idxmax()]
print("\nInvoice with highest total:")
print("Invoice ID:", max_invoice["InvoiceId"])
print("Total:", max_invoice["Total"])
print("City:", max_invoice["BillingCity"])
print("Country:", max_invoice["BillingCountry"])

Unnamed: 0,InvoiceId,InvoiceDate,BillingCity,BillingCountry,Total
0,406,2013-12-04 00:00:00,Reno,USA,1.98
1,407,2013-12-04 00:00:00,Boston,USA,1.98
2,408,2013-12-05 00:00:00,Madison,USA,3.96
3,409,2013-12-06 00:00:00,Toronto,Canada,5.94
4,410,2013-12-09 00:00:00,Porto,Portugal,8.91
5,411,2013-12-14 00:00:00,Helsinki,Finland,13.86
6,412,2013-12-22 00:00:00,Delhi,India,1.99


Number of invoices on or after Dec 1, 2013: 7

Invoice with highest total:
Invoice ID: 411
Total: 13.86
City: Helsinki
Country: Finland
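
The date filter works because InvoiceDate is stored as text in `YYYY-MM-DD HH:MM:SS` form, so plain string comparison matches chronological order. Below is a small sketch of the same filter with a bound cutoff and the highest total picked in SQL. It uses an in-memory table with made-up rows, so the numbers are illustrative only.

```python
import sqlite3

# In-memory stand-in with made-up rows (not the real Chinook data)
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE invoices (InvoiceId INTEGER PRIMARY KEY, InvoiceDate TEXT, Total REAL);
INSERT INTO invoices VALUES
    (1, '2013-11-30 00:00:00', 5.94),
    (2, '2013-12-01 00:00:00', 1.98),
    (3, '2013-12-14 00:00:00', 13.86);
""")

cutoff = "2013-12-01"

# '2013-12-01 00:00:00' sorts after '2013-12-01', so the cutoff day itself is included
recent = demo.execute(
    "SELECT InvoiceId FROM invoices WHERE InvoiceDate >= ? ORDER BY InvoiceDate",
    (cutoff,),
).fetchall()

# Highest total within the same date range, chosen by the database
top = demo.execute(
    "SELECT InvoiceId, Total FROM invoices "
    "WHERE InvoiceDate >= ? ORDER BY Total DESC LIMIT 1",
    (cutoff,),
).fetchone()

print("Invoices on or after cutoff:", recent)
print("Highest total:", top)
```

The comparison is only reliable while every date uses the same zero-padded format. Mixed formats such as `12/1/2013` would not sort chronologically as strings.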


Write an SQL query to find all tracks whose Composer is either ‘May’ or ‘Sykes’.
How many are there?
Which of these is a TrackID of one of these tracks?

In [46]:
# SQL query to find tracks with Composer = 'May' or 'Sykes'
query = """
SELECT 
    TrackId,
    Name,
    Composer
FROM 
    tracks
WHERE 
    Composer = 'May' OR Composer = 'Sykes';
"""

# Run the query
df = pd.read_sql_query(query, conn)
display(df)

# Number of matching tracks
print("Number of tracks by 'May' or 'Sykes':", len(df))

# Show one example TrackId
if not df.empty:
    print("Example TrackId from result:", df.iloc[0]["TrackId"])

Unnamed: 0,TrackId,Name,Composer
0,2271,We Will Rock You,May
1,2274,"All Dead, All Dead",May
2,2278,Sleep On The Sidewalk,May
3,2280,It's Late,May
4,3132,Still Of The Night,Sykes
5,3134,Is This Love,Sykes
6,3136,Looking For Love,Sykes
7,3141,You're Gonna Break My Hart Again,Sykes


Number of tracks by 'May' or 'Sykes': 8
Example TrackId from result: 2271
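
The two equality tests joined by OR can also be written with IN, which scales better to longer lists. In this sketch on made-up rows, the placeholder string is built from the list length so the values stay bound parameters. Note that both forms match the Composer value exactly, so a row crediting several composers is not returned.

```python
import sqlite3

# In-memory stand-in with made-up rows (not the real Chinook data)
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT, Composer TEXT);
INSERT INTO tracks VALUES
    (1, 'Track A', 'May'),
    (2, 'Track B', 'Sykes'),
    (3, 'Track C', 'May, Taylor'),
    (4, 'Track D', NULL);
""")

composers = ["May", "Sykes"]
# One '?' per value; only the placeholders are formatted into the SQL text
placeholders = ", ".join("?" for _ in composers)

exact = demo.execute(
    f"SELECT TrackId FROM tracks WHERE Composer IN ({placeholders}) ORDER BY TrackId",
    composers,
).fetchall()

print(exact)
```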


Write an SQL query to find all Customers whose email address includes ‘yahoo’. This means yahoo.com, yahoo.ca, yahoo.in, etc.
How many are there?
Are there any from Germany?
Are there any from Dublin?
Are there any from Bangalore?

In [48]:
# SQL query to find customers with 'yahoo' in email
query = """
SELECT 
    CustomerId,
    FirstName,
    LastName,
    Email,
    Country,
    City
FROM 
    customers
WHERE 
    Email LIKE '%yahoo%';
"""

# Run the query
df = pd.read_sql_query(query, conn)
display(df)

# 1. How many customers with 'yahoo' emails?
print("Number of customers with 'yahoo' emails:", len(df))

# 2. Are there any from Germany?
has_germany = 'Germany' in df['Country'].values
print("Any from Germany?", "Yes" if has_germany else "No")

# 3. Are there any from Dublin?
has_dublin = 'Dublin' in df['City'].values
print("Any from Dublin?", "Yes" if has_dublin else "No")

# 4. Are there any from Bangalore?
has_bangalore = 'Bangalore' in df['City'].values
print("Any from Bangalore?", "Yes" if has_bangalore else "No")

Unnamed: 0,CustomerId,FirstName,LastName,Email,Country,City
0,4,Bjørn,Hansen,bjorn.hansen@yahoo.no,Norway,Oslo
1,23,John,Gordon,johngordon22@yahoo.com,USA,Boston
2,25,Victor,Stevens,vstevens@yahoo.com,USA,Madison
3,32,Aaron,Mitchell,aaronmitchell@yahoo.ca,Canada,Winnipeg
4,34,João,Fernandes,jfernandes@yahoo.pt,Portugal,Lisbon
5,36,Hannah,Schneider,hannah.schneider@yahoo.de,Germany,Berlin
6,37,Fynn,Zimmermann,fzimmermann@yahoo.de,Germany,Frankfurt
7,39,Camille,Bernard,camille.bernard@yahoo.fr,France,Paris
8,42,Wyatt,Girard,wyatt.girard@yahoo.fr,France,Bordeaux
9,47,Lucas,Mancini,lucas.mancini@yahoo.it,Italy,Rome


Number of customers with 'yahoo' emails: 18
Any from Germany? Yes
Any from Dublin? No
Any from Bangalore? Yes


Write an SQL query to find all Invoices with a Total between 9 and 12.
How many are there?
Which of these countries is not a Billing Country of any of these invoices?

In [50]:
# Query: Invoices with total between 9 and 12
query = """
SELECT 
    InvoiceId,
    InvoiceDate,
    BillingCountry,
    Total
FROM 
    invoices
WHERE 
    Total BETWEEN 9 AND 12;
"""

# Run the query
df = pd.read_sql_query(query, conn)
display(df)

# 1. Number of invoices
print("Number of invoices with total between 9 and 12:", len(df))

# 2. List of unique BillingCountries in this range
billing_countries = df["BillingCountry"].unique()
print("\nBilling countries in this range:", billing_countries)

# 3. Check if certain countries are not present
check_countries = ["USA", "Canada", "Germany", "France", "Brazil", "India"]
missing_countries = [c for c in check_countries if c not in billing_countries]
print("\nCountries not present in these invoices:", missing_countries)

Unnamed: 0,InvoiceId,InvoiceDate,BillingCountry,Total
0,102,2010-03-16 00:00:00,Canada,9.91
1,298,2012-07-31 00:00:00,USA,10.91
2,311,2012-09-28 00:00:00,USA,11.94
3,312,2012-10-01 00:00:00,Portugal,10.91


Number of invoices with total between 9 and 12: 4

Billing countries in this range: ['Canada' 'USA' 'Portugal']

Countries not present in these invoices: ['Germany', 'France', 'Brazil', 'India']
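
One detail worth keeping in mind for this query is that BETWEEN includes both endpoints. The following sketch, on made-up totals in an in-memory table, compares it with the explicit `>=` and `<=` form:

```python
import sqlite3

# In-memory stand-in with made-up totals (not the real Chinook data)
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE invoices (InvoiceId INTEGER PRIMARY KEY, Total REAL);
INSERT INTO invoices VALUES (1, 8.91), (2, 9.0), (3, 10.91), (4, 12.0), (5, 13.86);
""")

between_ids = [row[0] for row in demo.execute(
    "SELECT InvoiceId FROM invoices WHERE Total BETWEEN 9 AND 12 ORDER BY InvoiceId"
)]
range_ids = [row[0] for row in demo.execute(
    "SELECT InvoiceId FROM invoices WHERE Total >= 9 AND Total <= 12 ORDER BY InvoiceId"
)]

print(between_ids)
print(range_ids)
```

Invoices with totals of exactly 9.0 and 12.0 appear in both results.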


Preview the invoices table and inspect its schema (column names and types)

In [52]:
# Query the first 5 rows of the invoices table
query = "SELECT * FROM invoices LIMIT 5;"
df_invoices = pd.read_sql_query(query, conn)

# Display the preview
print("Preview of the 'invoices' table:")
display(df_invoices)

Preview of the 'invoices' table:


Unnamed: 0,InvoiceId,CustomerId,InvoiceDate,BillingAddress,BillingCity,BillingState,BillingCountry,BillingPostalCode,Total
0,1,2,2009-01-01 00:00:00,Theodor-Heuss-Straße 34,Stuttgart,,Germany,70174,1.98
1,2,4,2009-01-02 00:00:00,Ullevålsveien 14,Oslo,,Norway,0171,3.96
2,3,8,2009-01-03 00:00:00,Grétrystraat 63,Brussels,,Belgium,1000,5.94
3,4,14,2009-01-06 00:00:00,8210 111 ST NW,Edmonton,AB,Canada,T6G 2C7,8.91
4,5,23,2009-01-11 00:00:00,69 Salem Street,Boston,MA,USA,2113,13.86


In [54]:
schema_query = "PRAGMA table_info(invoices);"
df_schema = pd.read_sql_query(schema_query, conn)

print("Schema of the 'invoices' table:")
display(df_schema)

Schema of the 'invoices' table:


Unnamed: 0,cid,name,type,notnull,dflt_value,pk
0,0,InvoiceId,INTEGER,1,,1
1,1,CustomerId,INTEGER,1,,0
2,2,InvoiceDate,DATETIME,1,,0
3,3,BillingAddress,NVARCHAR(70),0,,0
4,4,BillingCity,NVARCHAR(40),0,,0
5,5,BillingState,NVARCHAR(40),0,,0
6,6,BillingCountry,NVARCHAR(40),0,,0
7,7,BillingPostalCode,NVARCHAR(10),0,,0
8,8,Total,"NUMERIC(10,2)",1,,0
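
The notnull and pk columns in this output can be read programmatically, for example to see which fields an INSERT must supply. This is a minimal sketch on a cut-down in-memory version of the table. The definition below keeps only a few of the columns and is an assumption for illustration.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
# Cut-down stand-in for the invoices table (not the full Chinook definition)
demo.execute("""
CREATE TABLE invoices (
    InvoiceId   INTEGER PRIMARY KEY NOT NULL,
    CustomerId  INTEGER NOT NULL,
    InvoiceDate DATETIME NOT NULL,
    BillingCity NVARCHAR(40),
    Total       NUMERIC(10,2) NOT NULL
)
""")

# Each row is (cid, name, type, notnull, dflt_value, pk)
columns = demo.execute("PRAGMA table_info(invoices)").fetchall()

required = [name for _cid, name, _type, notnull, _default, _pk in columns if notnull]
primary_key = [name for _cid, name, _type, _notnull, _default, pk in columns if pk]

print("Required columns:", required)
print("Primary key:", primary_key)
```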


Do the same for the employees table: preview the first rows, then list the column names and types

In [56]:
# Query the first 5 rows of the employees table
query = "SELECT * FROM employees LIMIT 5;"
df_employees = pd.read_sql_query(query, conn)

print("Preview of the 'employees' table:")
display(df_employees)

Preview of the 'employees' table:


Unnamed: 0,EmployeeId,LastName,FirstName,Title,ReportsTo,BirthDate,HireDate,Address,City,State,Country,PostalCode,Phone,Fax,Email
0,1,Adams,Andrew,General Manager,,1962-02-18 00:00:00,2002-08-14 00:00:00,11120 Jasper Ave NW,Edmonton,AB,Canada,T5K 2N1,+1 (780) 428-9482,+1 (780) 428-3457,andrew@chinookcorp.com
1,2,Edwards,Nancy,Sales Manager,1.0,1958-12-08 00:00:00,2002-05-01 00:00:00,825 8 Ave SW,Calgary,AB,Canada,T2P 2T3,+1 (403) 262-3443,+1 (403) 262-3322,nancy@chinookcorp.com
2,3,Peacock,Jane,Sales Support Agent,2.0,1973-08-29 00:00:00,2002-04-01 00:00:00,1111 6 Ave SW,Calgary,AB,Canada,T2P 5M5,+1 (403) 262-3443,+1 (403) 262-6712,jane@chinookcorp.com
3,4,Park,Margaret,Sales Support Agent,2.0,1947-09-19 00:00:00,2003-05-03 00:00:00,683 10 Street SW,Calgary,AB,Canada,T2P 5G3,+1 (403) 263-4423,+1 (403) 263-4289,margaret@chinookcorp.com
4,5,Johnson,Steve,Sales Support Agent,2.0,1965-03-03 00:00:00,2003-10-17 00:00:00,7727B 41 Ave,Calgary,AB,Canada,T3B 1Y7,1 (780) 836-9987,1 (780) 836-9543,steve@chinookcorp.com


In [58]:
# Show column names and types
schema_query = "PRAGMA table_info(employees);"
df_emp_schema = pd.read_sql_query(schema_query, conn)

print("Schema of the 'employees' table:")
display(df_emp_schema)

Schema of the 'employees' table:


Unnamed: 0,cid,name,type,notnull,dflt_value,pk
0,0,EmployeeId,INTEGER,1,,1
1,1,LastName,NVARCHAR(20),1,,0
2,2,FirstName,NVARCHAR(20),1,,0
3,3,Title,NVARCHAR(30),0,,0
4,4,ReportsTo,INTEGER,0,,0
5,5,BirthDate,DATETIME,0,,0
6,6,HireDate,DATETIME,0,,0
7,7,Address,NVARCHAR(70),0,,0
8,8,City,NVARCHAR(40),0,,0
9,9,State,NVARCHAR(40),0,,0
