In [2]:
import pandas as pd

The data frames `Customers`, `Employees`, `Offices`, `OrderDetails`, `Orders`, `Payments`, `ProductLines`, and `Products` contain the data from the corresponding tables in the [ClassicModels database](https://www.richardtwatson.com/dm6e/Reader/ClassicModels.html).

The entity-relationship diagram is shown here: ![ERD](figures/ClassicModels.png)

Using Pandas merge and join operations, answer the following questions:

*One-to-many relationship*

- Report the account representative for each customer.
- Report total payments for Atelier graphique.
- Report the total payments by date.
- Report the products that have not been sold.
- List the amount paid by each customer.
- How many orders have been placed by Herkku Gifts?
- Who are the employees in Boston?
- Report those payments greater than \$100,000. Sort the report so the customer who made the highest payment appears first.
- List the value of 'On Hold' orders.
- Report the number of orders 'On Hold' for each customer.

*Many-to-many relationship*

- List products sold by order date.
- List the order dates in descending order for orders for the 1940 Ford Pickup Truck.
- List the names of customers and their corresponding order numbers where a particular order from that customer has a value greater than \$25,000.
- Are there any products that appear on all orders?
- List the names of products sold at less than 80% of the MSRP.
- Report those products that have been sold with a markup of 100% or more (i.e., the priceEach is at least twice the buyPrice).
- List the products ordered on a Monday.
- What is the quantity on hand for products listed on 'On Hold' orders?
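
As a warm-up for the one-to-many questions above, here is a minimal sketch of a key-based merge on toy data. The frames `reps` and `clients` and their values are hypothetical stand-ins, not the ClassicModels tables; only the column names mirror the real ones. The `validate` argument is optional and simply asks pandas to check that the right-hand key is unique.

```python
import pandas as pd

# Hypothetical toy tables (illustrative values only).
reps = pd.DataFrame({
    "employeeNumber": [1, 2],
    "lastName": ["Hernandez", "Thompson"],
})
clients = pd.DataFrame({
    "customerName": ["A", "B", "C"],
    "salesRepEmployeeNumber": [1, 1, 2],
})

# One-to-many: each rep row is repeated once per matching customer.
merged = pd.merge(
    clients, reps,
    left_on="salesRepEmployeeNumber",
    right_on="employeeNumber",
    how="inner",
    validate="many_to_one",  # raises if employeeNumber is not unique in reps
)
print(merged[["customerName", "lastName"]])
```

A many-to-many question is usually handled the same way, by chaining two such merges through the linking table.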

In [3]:
Customers = pd.read_csv('data/ClassicModels_Customers.csv', sep=';')

In [4]:
Customers.head(1)

Unnamed: 0,customerNumber,customerName,contactLastName,contactFirstName,phone,addressLine1,addressLine2,city,state,postalCode,country,salesRepEmployeeNumber,creditLimit,customerLocation
0,103,Atelier graphique,Schmitt,Carine,40.32.2555,"54, rue Royale",,Nantes,,44000,France,1370,21000.0,0


In [5]:
Employees = pd.read_csv('data/ClassicModels_Employees.csv', sep=';')

In [6]:
Employees.head(1)

Unnamed: 0,employeeNumber,lastName,firstName,extension,email,reportsTo,jobTitle,officeCode
0,1002,Murphy,Diane,x5800,dmurphy@classicmodelcars.com,0,President,1


In [7]:
Offices = pd.read_csv('data/ClassicModels_Offices.csv', sep=';')

In [8]:
Offices.head(1)

Unnamed: 0,officeCode,city,phone,addressLine1,addressLine2,state,country,postalCode,territory,officeLocation
0,1,San Francisco,+1 650 219 4782,100 Market Street,Suite 300,CA,USA,94080,,0


In [9]:
OrderDetails = pd.read_csv('data/ClassicModels_OrderDetails.csv', sep=';')

In [10]:
OrderDetails.head(1)

Unnamed: 0,orderNumber,productCode,quantityOrdered,priceEach,orderLineNumber
0,10107,S10_1678,30,81.35,2


In [11]:
Orders = pd.read_csv('data/ClassicModels_Orders.csv', sep=';')

In [12]:
Orders.head(1)

Unnamed: 0,orderNumber,orderDate,requiredDate,shippedDate,status,comments,customerNumber
0,10100,2003-01-06 00:00:00,2003-01-13 00:00:00,2003-01-10 00:00:00,Shipped,,363


In [13]:
Payments = pd.read_csv('data/ClassicModels_Payments.csv', sep=';')

In [14]:
Payments.head(1)

Unnamed: 0,checkNumber,paymentDate,amount,customerNumber
0,AB661578,2004-07-28 00:00:00,9415.13,471


In [15]:
ProductLines = pd.read_csv('data/ClassicModels_ProductLines.csv', sep=';')

In [16]:
ProductLines.head(1)

Unnamed: 0,productLine,textDescription,htmlDescription,image
0,Classic Cars,Attention car enthusiasts: Make your wildest c...,,


In [17]:
Products = pd.read_csv('data/ClassicModels_Products.csv', sep=';')

In [18]:
Products.head(1)

Unnamed: 0,productCode,productName,productScale,productVendor,productDescription,quantityInStock,buyPrice,MSRP,productLine
0,S10_1678,1969 Harley Davidson Ultimate Chopper,1:10,Min Lin Diecast,"This replica features working kickstand, front...",7933,48.81,95.7,Motorcycles


In [19]:
# Report the account representative for each customer.

pd.merge(Customers, Employees, 
         left_on = 'salesRepEmployeeNumber', 
         right_on = 'employeeNumber', how='inner')[['customerName', 
                                                    'firstName', 'lastName']]


Unnamed: 0,customerName,firstName,lastName
0,Atelier graphique,Gerard,Hernandez
1,La Rochelle Gifts,Gerard,Hernandez
2,Euro+ Shopping Channel,Gerard,Hernandez
3,Daedalus Designs Imports,Gerard,Hernandez
4,Mini Caravy,Gerard,Hernandez
5,Alpha Cognac,Gerard,Hernandez
6,Auto Associ,Gerard,Hernandez
7,Signal Gift Stores,Leslie,Thompson
8,Toys4GrownUps.com,Leslie,Thompson
9,Boards & Toys Co.,Leslie,Thompson


In [None]:
# Report total payments for Atelier graphique.

pd.merge(Customers, Payments).groupby('customerName')['amount'].sum().loc['Atelier graphique']
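
A caution when choosing between merge and join: by default `DataFrame.join` aligns rows on the index, whereas `pd.merge` matches rows on shared key columns. The sketch below uses hypothetical two-row frames (`left` and `right` are made-up names) to show how the two can pair different rows.

```python
import pandas as pd

# Hypothetical frames whose rows are deliberately in a different key order.
left = pd.DataFrame({"customerNumber": [103, 112], "customerName": ["X", "Y"]})
right = pd.DataFrame({"customerNumber": [112, 103], "amount": [10.0, 20.0]})

# join: pairs row 0 with row 0 and row 1 with row 1, ignoring customerNumber.
by_position = left.join(right, rsuffix="_r")

# merge: pairs rows whose customerNumber values are equal.
by_key = left.merge(right, on="customerNumber")

print(by_position)
print(by_key)
```

In this toy case, customer `X` is paired with the amount 10.0 by `join` but with 20.0 by `merge`, so an index-based join only answers the question when both indexes happen to carry the key.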

In [20]:
# Report total payments for Atelier graphique.
# Filter the customer first, then match payments on the shared key customerNumber.
# (An index-based Customers.join(...) would pair unrelated rows by position.)

atelier = Customers[Customers['customerName'] == 'Atelier graphique']
pd.merge(atelier, Payments, on='customerNumber')['amount'].sum().round(2)

22314.36

In [21]:
# Report the total payments by date

Payments.groupby('paymentDate')['amount'].sum()

paymentDate
2003-01-16 00:00:00     10223.83
2003-01-28 00:00:00     10549.01
2003-01-30 00:00:00      5494.78
2003-02-16 00:00:00     50218.95
2003-02-20 00:00:00     53959.21
2003-02-25 00:00:00     40206.20
2003-03-02 00:00:00     52151.81
2003-03-09 00:00:00     51001.22
2003-03-12 00:00:00     22292.62
2003-03-20 00:00:00     25833.14
2003-03-27 00:00:00     48425.69
2003-04-09 00:00:00     24212.79
2003-04-11 00:00:00     11044.30
2003-04-16 00:00:00     21665.98
2003-04-19 00:00:00      1627.56
2003-04-20 00:00:00     33383.14
2003-04-22 00:00:00     44380.15
2003-05-09 00:00:00      3101.40
2003-05-12 00:00:00     35826.33
2003-05-20 00:00:00     45864.03
2003-05-21 00:00:00     16700.47
2003-05-25 00:00:00     50824.66
2003-05-31 00:00:00      7565.08
2003-06-05 00:00:00     14571.44
2003-06-06 00:00:00     32641.98
2003-06-13 00:00:00     57131.92
2003-06-18 00:00:00     58841.35
2003-06-25 00:00:00     17032.29
2003-07-05 00:00:00      2880.00
2003-07-06 00:00:00      6036.9

In [None]:
# Report the total payments by date
months=Payments['paymentDate'].str[:7]
Payments.groupby(months)['amount'].sum()
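
As an alternative to slicing the date string, the dates can be parsed and grouped by calendar month. This is a sketch on a hypothetical three-row frame `pay`, assuming the dates are stored as strings in the format shown in the outputs above.

```python
import pandas as pd

# Hypothetical payments (illustrative values only).
pay = pd.DataFrame({
    "paymentDate": ["2003-01-16 00:00:00", "2003-01-28 00:00:00", "2003-02-16 00:00:00"],
    "amount": [100.0, 50.0, 25.0],
})

# Parse the strings once, then group by monthly period.
pay["paymentDate"] = pd.to_datetime(pay["paymentDate"])
monthly = pay.groupby(pay["paymentDate"].dt.to_period("M"))["amount"].sum()
print(monthly)
```

Working with parsed dates also makes weekday or year-based groupings available through the same `.dt` accessor.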

In [22]:
# Report the products that have not been sold.
#     same as products with no orders
df_join = pd.merge(Products, OrderDetails, how='left')
df_join[df_join['orderNumber'].isnull()]

Unnamed: 0,productCode,productName,productScale,productVendor,productDescription,quantityInStock,buyPrice,MSRP,productLine,orderNumber,quantityOrdered,priceEach,orderLineNumber
1122,S18_3233,1985 Toyota Supra,1:18,Highway 66 Mini Classics,"This model features soft rubber tires, working...",7733,57.01,107.57,Classic Cars,,,,
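
The same "rows with no match" pattern can be made explicit with the `indicator` option of `merge`. The sketch below uses hypothetical toy frames (`products` and `details` are illustrative, not the real tables).

```python
import pandas as pd

# Hypothetical toy tables (illustrative values only).
products = pd.DataFrame({"productCode": ["P1", "P2", "P3"],
                         "productName": ["a", "b", "c"]})
details = pd.DataFrame({"orderNumber": [1, 1, 2],
                        "productCode": ["P1", "P2", "P1"]})

# Left merge against the distinct product codes that were ordered;
# indicator=True adds a _merge column saying where each row came from.
flagged = products.merge(
    details[["productCode"]].drop_duplicates(),
    on="productCode", how="left", indicator=True,
)
unsold = flagged.loc[flagged["_merge"] == "left_only", ["productCode", "productName"]]
print(unsold)
```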


In [23]:
# List the amount paid by each customer.
pd.merge(Payments, Customers).groupby('customerName')['amount'].sum()

customerName
AV Stores, Co.                        148410.09
Alpha Cognac                           60483.36
Amica Models & Co.                     82223.23
Anna's Decorations, Ltd               137034.22
Atelier graphique                      22314.36
Australian Collectables, Ltd           44920.76
Australian Collectors, Co.            180585.07
Australian Gift Network, Co            55190.16
Auto Associ                            58876.41
Auto Canal+ Petit                      86436.97
Auto-Moto Classics Inc.                21554.26
Baane Mini Imports                    104224.79
Bavarian Collectables Imports, Co.     31310.09
Blauer See Auto, Co.                   75937.76
Boards & Toys Co.                       7918.60
CAF Imports                            46751.14
Cambridge Collectables Co.             32198.69
Canadian Gift Exchange Network         70122.19
Classic Gift Ideas, Inc                57939.34
Classic Legends Inc.                   69214.33
Clover Collections, Co.    

In [24]:
# How many orders have been placed by Herkku Gifts?

df_join = pd.merge(Customers, Orders)
df_join[df_join['customerName']=='Herkku Gifts'].count()['orderNumber']

3

In [25]:
# Who are the employees in Boston?
df_join = pd.merge(Employees, Offices)
df_join[df_join['city']=='Boston']

Unnamed: 0,employeeNumber,lastName,firstName,extension,email,reportsTo,jobTitle,officeCode,city,phone,addressLine1,addressLine2,state,country,postalCode,territory,officeLocation
15,1188,Firrelli,Julie,x2173,jfirrelli@classicmodelcars.com,1143,Sales Rep,2,Boston,+1 215 837 0825,1550 Court Place,Suite 102,MA,USA,2107,,0
16,1216,Patterson,Steve,x4334,spatterson@classicmodelcars.com,1143,Sales Rep,2,Boston,+1 215 837 0825,1550 Court Place,Suite 102,MA,USA,2107,,0


In [26]:
# Report those payments greater than $100,000. Sort the report so the customer who made the highest payment appears first.
pd.merge(Customers, Payments[Payments['amount'] > 100000]).sort_values(by='amount', ascending = False)

Unnamed: 0,customerNumber,customerName,contactLastName,contactFirstName,phone,addressLine1,addressLine2,city,state,postalCode,country,salesRepEmployeeNumber,creditLimit,customerLocation,checkNumber,paymentDate,amount
3,141,Euro+ Shopping Channel,Freyre,Diego,(91) 555 94 44,"C/ Moralzarzal, 86",,Madrid,,28034,Spain,1370,227600.0,0,JE105477,2005-03-18 00:00:00,120166.58
2,141,Euro+ Shopping Channel,Freyre,Diego,(91) 555 94 44,"C/ Moralzarzal, 86",,Madrid,,28034,Spain,1370,227600.0,0,ID10962,2004-12-31 00:00:00,116208.4
1,124,Mini Gifts Distributors Ltd.,Nelson,Susan,4155551450,5677 Strong St.,,San Rafael,CA,97562,USA,1165,210500.0,0,KI131716,2003-08-15 00:00:00,111654.4
4,148,"Dragon Souveniers, Ltd.",Natividad,Eric,+65 221 7555,Bronz Sok.,Bronz Apt. 3/6 Tesvikiye,Singapore,,79903,Singapore,1621,103800.0,0,KM172879,2003-12-26 00:00:00,105743.0
0,124,Mini Gifts Distributors Ltd.,Nelson,Susan,4155551450,5677 Strong St.,,San Rafael,CA,97562,USA,1165,210500.0,0,AE215433,2005-03-05 00:00:00,101244.59


In [27]:
# List the value of 'On Hold' orders.
df_join = pd.merge(Orders[Orders['status']=='On Hold'], OrderDetails)
df_join['total'] = df_join['priceEach'] * df_join['quantityOrdered']
df_join['total'].sum()

169575.61000000004
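
If the question is read as asking for the value of each 'On Hold' order rather than one grand total, the line totals can be grouped by order number. A sketch on hypothetical toy frames (names and values are illustrative):

```python
import pandas as pd

# Hypothetical toy tables (illustrative values only).
orders = pd.DataFrame({"orderNumber": [1, 2, 3],
                       "status": ["On Hold", "Shipped", "On Hold"]})
details = pd.DataFrame({"orderNumber": [1, 1, 2, 3],
                        "quantityOrdered": [2, 1, 5, 4],
                        "priceEach": [10.0, 5.0, 1.0, 2.5]})

# Keep only the held orders, attach their lines, then sum line totals per order.
held = orders[orders["status"] == "On Hold"].merge(details, on="orderNumber")
line_total = held["quantityOrdered"] * held["priceEach"]
value_per_order = line_total.groupby(held["orderNumber"]).sum()
print(value_per_order)
```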

In [28]:
# Report the number of orders 'On Hold' for each customer
pd.merge(Orders[Orders['status']=='On Hold'], Customers).groupby('customerName')['orderNumber'].count()

customerName
Gifts4AllAges.com            1
Tekni Collectables Inc.      1
The Sharp Gifts Warehouse    1
Volvo Model Replicas, Co     1
Name: orderNumber, dtype: int64

In [29]:
# List products sold by order date
pd.merge(pd.merge(Orders, OrderDetails), Products).sort_values(by='orderDate')

Unnamed: 0,orderNumber,orderDate,requiredDate,shippedDate,status,comments,customerNumber,productCode,quantityOrdered,priceEach,orderLineNumber,productName,productScale,productVendor,productDescription,quantityInStock,buyPrice,MSRP,productLine
0,10100,2003-01-06 00:00:00,2003-01-13 00:00:00,2003-01-10 00:00:00,Shipped,,363,S18_1749,30,136.00,3,1917 Grand Touring Sedan,1:18,Welly Diecast Productions,This 1:18 scale replica of the 1917 Grand Tour...,2724,86.70,170.00,Vintage Cars
50,10100,2003-01-06 00:00:00,2003-01-13 00:00:00,2003-01-10 00:00:00,Shipped,,363,S18_4409,22,75.46,4,1932 Alfa Romeo 8C2300 Spider Sport,1:18,Exoto Designs,This 1:18 scale precision die cast replica fea...,6553,43.26,92.03,Vintage Cars
75,10100,2003-01-06 00:00:00,2003-01-13 00:00:00,2003-01-10 00:00:00,Shipped,,363,S24_3969,49,35.29,1,1936 Mercedes Benz 500k Roadster,1:24,Red Start Diecast,This model features grille-mounted chrome horn...,2081,21.75,41.03,Vintage Cars
25,10100,2003-01-06 00:00:00,2003-01-13 00:00:00,2003-01-10 00:00:00,Shipped,,363,S18_2248,50,55.09,2,1911 Ford Town Car,1:18,Motor City Art Classics,"Features opening hood, opening doors, opening ...",540,33.30,60.54,Vintage Cars
128,10101,2003-01-09 00:00:00,2003-01-18 00:00:00,2003-01-11 00:00:00,Shipped,Check on availability.,128,S18_2795,26,167.06,1,1928 Mercedes-Benz SSK,1:18,Gearbox Collectibles,This 1:18 replica features grille-mounted chro...,548,72.56,168.75,Vintage Cars
156,10101,2003-01-09 00:00:00,2003-01-18 00:00:00,2003-01-11 00:00:00,Shipped,Check on availability.,128,S24_1937,45,32.53,3,1939 Chevrolet Deluxe Coupe,1:24,Motor City Art Classics,This 1:24 scale die-cast replica of the 1939 C...,7332,22.57,33.19,Vintage Cars
184,10101,2003-01-09 00:00:00,2003-01-18 00:00:00,2003-01-11 00:00:00,Shipped,Check on availability.,128,S24_2022,46,44.35,2,1938 Cadillac V-16 Presidential Limousine,1:24,Classic Metal Creations,This 1:24 scale precision die cast replica of ...,2847,20.61,44.80,Vintage Cars
100,10101,2003-01-09 00:00:00,2003-01-18 00:00:00,2003-01-11 00:00:00,Shipped,Check on availability.,128,S18_2325,25,108.06,4,1932 Model A Ford J-Coupe,1:18,Autoart Studio Design,This model features grille-mounted chrome horn...,9354,58.48,127.13,Vintage Cars
240,10102,2003-01-10 00:00:00,2003-01-18 00:00:00,2003-01-14 00:00:00,Shipped,,181,S18_1367,41,43.13,1,1936 Mercedes-Benz 500K Special Roadster,1:18,Studio M Art Models,This 1:18 scale replica is constructed of heav...,8635,24.26,53.91,Vintage Cars
212,10102,2003-01-10 00:00:00,2003-01-18 00:00:00,2003-01-14 00:00:00,Shipped,,181,S18_1342,39,95.55,2,1937 Lincoln Berline,1:18,Motor City Art Classics,"Features opening engine cover, doors, trunk, a...",8693,60.62,102.74,Vintage Cars


In [30]:
# List the order dates in descending order for orders for the 1940 Ford Pickup Truck
pd.merge(
    pd.merge(
        Products[Products.productName=='1940 Ford Pickup Truck'], OrderDetails
            ), Orders
        ).sort_values('orderDate', ascending=False)['orderDate']

27    2005-05-31 00:00:00
26    2005-05-01 00:00:00
25    2005-03-09 00:00:00
24    2005-02-17 00:00:00
23    2005-01-20 00:00:00
22    2004-12-10 00:00:00
21    2004-11-29 00:00:00
20    2004-11-18 00:00:00
19    2004-11-04 00:00:00
18    2004-10-21 00:00:00
17    2004-10-11 00:00:00
16    2004-09-08 00:00:00
15    2004-08-17 00:00:00
14    2004-07-19 00:00:00
13    2004-06-15 00:00:00
12    2004-05-04 00:00:00
11    2004-03-10 00:00:00
10    2004-01-29 00:00:00
9     2003-12-05 00:00:00
8     2003-11-25 00:00:00
7     2003-11-13 00:00:00
6     2003-11-06 00:00:00
5     2003-10-21 00:00:00
4     2003-09-19 00:00:00
3     2003-07-24 00:00:00
2     2003-05-28 00:00:00
1     2003-03-26 00:00:00
0     2003-01-29 00:00:00
Name: orderDate, dtype: object

In [31]:
# List the names of customers and their corresponding order numbers where a particular order from that customer has a value greater than $25,000.

In [32]:
# first, calculate amount
OrderDetails['amount'] = OrderDetails.quantityOrdered * OrderDetails.priceEach
# calculate total per order
order_sum = pd.merge(
    Orders, OrderDetails
    ).groupby('orderNumber')[['amount', 'customerNumber']].aggregate(
        {'amount':'sum', 'customerNumber':'first'})
# filter
high_orders = order_sum[order_sum.amount > 25000]
# join with customers
Customers.merge(high_orders.reset_index())[['customerName', 'orderNumber', 'amount']]

Unnamed: 0,customerName,orderNumber,amount
0,Signal Gift Stores,10124,32641.98
1,Signal Gift Stores,10278,33347.88
2,"Australian Collectors, Co.",10120,45864.03
3,"Australian Collectors, Co.",10223,44894.74
4,"Australian Collectors, Co.",10342,40265.60
5,"Australian Collectors, Co.",10347,41995.62
6,La Rochelle Gifts,10275,47924.19
7,La Rochelle Gifts,10375,49523.67
8,La Rochelle Gifts,10425,41623.44
9,Baane Mini Imports,10103,50218.95


In [33]:
# Are there any products that appear on all orders?
number_of_orders = len(Orders)
j = pd.merge(Products, OrderDetails)
product_orders = j.groupby('productName')['orderNumber'].count()

In [34]:
product_orders[product_orders==number_of_orders]

Series([], Name: orderNumber, dtype: int64)
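
Note that `count()` counts order lines. If a product could appear more than once on the same order, counting distinct order numbers is the safer test. A sketch on hypothetical toy frames, where `P2` has two lines on a single order:

```python
import pandas as pd

# Hypothetical toy tables (illustrative values only).
orders = pd.DataFrame({"orderNumber": [1, 2, 3]})
details = pd.DataFrame({"orderNumber": [1, 2, 3, 1, 1],
                        "productCode": ["P1", "P1", "P1", "P2", "P2"]})

# Number of distinct orders overall, and per product.
n_orders = orders["orderNumber"].nunique()
per_product = details.groupby("productCode")["orderNumber"].nunique()

# Products whose distinct-order count equals the total number of orders.
on_all = per_product[per_product == n_orders].index.tolist()
print(on_all)
```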

In [35]:
# List the names of products sold at less than 80% of the MSRP

In [40]:
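j = pd.merge(Products, OrderDetails)
# Ratio of the selling price to the MSRP (below 0.8 means sold at less than 80% of the MSRP)
j['msrp_ratio'] = j['priceEach'] / j['MSRP']
set(j[j['msrp_ratio'] < 0.8]['productName'])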

{'18th century schooner',
 '1911 Ford Town Car',
 '1930 Buick Marquette Phaeton',
 '1932 Alfa Romeo 8C2300 Spider Sport',
 '1936 Mercedes Benz 500k Roadster',
 '1937 Lincoln Berline',
 '1939 Chevrolet Deluxe Coupe',
 '1940s Ford truck',
 "1950's Chicago Surface Lines Streetcar",
 '1952 Alpine Renault 1300',
 '1952 Citroen-15CV',
 '1956 Porsche 356A Coupe',
 '1957 Chevy Pickup',
 '1957 Corvette Convertible',
 '1962 City of Detroit Streetcar',
 '1962 Volkswagen Microbus',
 '1965 Aston Martin DB5',
 '1971 Alpine Renault 1600s',
 '1972 Alfa Romeo GTA',
 '1976 Ford Gran Torino',
 '1980s Black Hawk Helicopter',
 '1982 Camaro Z28',
 '1992 Ferrari 360 Spider red',
 '1993 Mazda RX-7',
 '1995 Honda Civic',
 '1996 Moto Guzzi 1100i',
 '1996 Peterbilt 379 Stake Bed with Outrigger',
 '1997 BMW F650 ST',
 '1999 Indy 500 Monte Carlo SS',
 'ATA: B757-300',
 'American Airlines: B767-300',
 'American Airlines: MD-11S',
 'Collectable Wooden Train',
 'Corsair F4U ( Bird Cage)',
 'Diamond T620 Semi-Skirted 

In [45]:
# Report those products that have been sold with a markup of 100% or more
# (i.e., the priceEach is at least twice the buyPrice, hence >= rather than >)
j['markup2'] = j['priceEach'] / j['buyPrice']
set(j[j['markup2'] >= 2]['productName'])

{'1926 Ford Fire Engine',
 '1928 Ford Phaeton Deluxe',
 '1928 Mercedes-Benz SSK',
 '1932 Alfa Romeo 8C2300 Spider Sport',
 '1932 Model A Ford J-Coupe',
 '1936 Harley Davidson El Knucklehead',
 '1936 Mercedes-Benz 500K Special Roadster',
 '1937 Horch 930V Limousine',
 '1938 Cadillac V-16 Presidential Limousine',
 '1939 Cadillac Limousine',
 '1940 Ford Pickup Truck',
 '1948 Porsche Type 356 Roadster',
 "1950's Chicago Surface Lines Streetcar",
 '1952 Alpine Renault 1300',
 '1954 Greyhound Scenicruiser',
 '1957 Chevy Pickup',
 '1957 Corvette Convertible',
 '1957 Ford Thunderbird',
 '1958 Chevy Corvette Limited Edition',
 '1960 BSA Gold Star DBD34',
 '1961 Chevrolet Impala',
 '1962 Volkswagen Microbus',
 '1968 Ford Mustang',
 '1969 Ford Falcon',
 '1970 Plymouth Hemi Cuda',
 '1976 Ford Gran Torino',
 '1980s Black Hawk Helicopter',
 '1982 Camaro Z28',
 '1982 Lamborghini Diablo',
 '1992 Ferrari 360 Spider red',
 '1999 Indy 500 Monte Carlo SS',
 '2001 Ferrari Enzo',
 '2002 Suzuki XREO',
 '2002

In [79]:
# List the products ordered on a Monday.
Orders['weekday'] = pd.to_datetime(Orders['orderDate']).dt.day_name()
pd.merge(pd.merge(Products, OrderDetails), 
         Orders[Orders.weekday=='Monday'])[['productName', 'orderDate', 'weekday']]


Unnamed: 0,productName,orderDate,weekday
0,1969 Harley Davidson Ultimate Chopper,2003-02-24 00:00:00,Monday
1,1996 Moto Guzzi 1100i,2003-02-24 00:00:00,Monday
2,2003 Harley-Davidson Eagle Drag Bike,2003-02-24 00:00:00,Monday
3,2002 Suzuki XREO,2003-02-24 00:00:00,Monday
4,1936 Harley Davidson El Knucklehead,2003-02-24 00:00:00,Monday
5,1997 BMW R 1100 S,2003-02-24 00:00:00,Monday
6,1960 BSA Gold Star DBD34,2003-02-24 00:00:00,Monday
7,1997 BMW F650 ST,2003-02-24 00:00:00,Monday
8,1969 Harley Davidson Ultimate Chopper,2003-08-25 00:00:00,Monday
9,1996 Moto Guzzi 1100i,2003-08-25 00:00:00,Monday
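
The weekday filter can also be written without adding a column to the source frame. This sketch uses a hypothetical two-row `orders` frame; 2003-02-24 is a Monday, as the output above shows, and the following day is included as a non-match.

```python
import pandas as pd

# Hypothetical toy orders (illustrative values only).
orders = pd.DataFrame({
    "orderNumber": [1, 2],
    "orderDate": ["2003-02-24 00:00:00", "2003-02-25 00:00:00"],
})

# Parse the whole column at once and build a boolean mask for Mondays.
dates = pd.to_datetime(orders["orderDate"])
monday = orders[dates.dt.day_name() == "Monday"]
print(monday)
```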


In [None]:
# What is the quantity on hand for products listed on 'On Hold' orders?

In [83]:
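# Take one quantityInStock value per product; summing would double count a
# product that appears on more than one 'On Hold' order.
pd.merge(Products, pd.merge(Orders[Orders.status=='On Hold'], OrderDetails)
    ).groupby('productName')['quantityInStock'].first()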

productName
18th century schooner                  1898
1900s Vintage Tri-Plane                2756
1903 Ford Model A                      3913
1904 Buick Runabout                    8290
1911 Ford Town Car                      540
1912 Ford Model T Delivery Wagon       9173
1917 Grand Touring Sedan               2724
1926 Ford Fire Engine                  2018
1928 British Royal Navy Airplane       3627
1928 Ford Phaeton Deluxe                136
1930 Buick Marquette Phaeton           7062
1932 Alfa Romeo 8C2300 Spider Sport    6553
1940 Ford Delivery Sedan               6621
1940s Ford truck                       3128
1949 Jaguar XK 120                     2350
1952 Citroen-15CV                      1452
1957 Ford Thunderbird                  3209
1962 LanciaA Delta 16V                 6791
1962 Volkswagen Microbus               2327
1964 Mercedes Tour Bus                 8258
1965 Aston Martin DB5                  9042
1966 Shelby Cobra 427 S/C              8197
1969 Chevrolet Camar