# Challenge 3

In this challenge we will work on the `Orders` data set. In your work you will apply the thinking process and workflow we showed you in Challenge 2.

You are serving as a Business Intelligence Analyst at the headquarters of an international fashion goods chain store. Today your boss asked you to do two things for her:

**First, identify two groups of customers from the data set.** The first group is **VIP Customers**, whose **aggregated expenses** at your global chain stores are **above the 95th percentile** (i.e., the 0.95 quantile). The second group is **Preferred Customers**, whose **aggregated expenses** are **between the 75th and 95th percentiles**.

**Second, identify which country has the most of your VIP customers, and which country has the most of your VIP+Preferred Customers combined.**

## Q1: How to identify VIP & Preferred Customers?

We start by importing all the required libraries:

In [1]:
# import required libraries
import pandas as pd
import numpy as np

Next, import `Orders` from Ironhack's database into a dataframe variable called `orders`. Print the head of `orders` to get an overview of the data:

In [2]:
# your code here

orders = pd.read_csv('../your-code/orders.csv')

In [3]:
orders.head()

Unnamed: 0.1,Unnamed: 0,InvoiceNo,StockCode,year,month,day,hour,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,amount_spent
0,0,536365.0,85123A,2010.0,12.0,3.0,8.0,white hanging heart t-light holder,6.0,2010-12-01 08:26:00,2.55,17850.0,United Kingdom,15.3
1,1,536365.0,71053,2010.0,12.0,3.0,8.0,white metal lantern,6.0,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
2,2,536365.0,84406B,2010.0,12.0,3.0,8.0,cream cupid hearts coat hanger,8.0,2010-12-01 08:26:00,2.75,17850.0,United Kingdom,22.0
3,3,536365.0,84029G,2010.0,12.0,3.0,8.0,knitted union flag hot water bottle,6.0,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
4,4,536365.0,84029E,2010.0,12.0,3.0,8.0,red woolly hottie white heart.,6.0,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34


In [4]:
orders.columns = [e.replace(' ', '_').replace('.', '').lower() for e in orders.columns]
orders.head()

Unnamed: 0,unnamed:_0,invoiceno,stockcode,year,month,day,hour,description,quantity,invoicedate,unitprice,customerid,country,amount_spent
0,0,536365.0,85123A,2010.0,12.0,3.0,8.0,white hanging heart t-light holder,6.0,2010-12-01 08:26:00,2.55,17850.0,United Kingdom,15.3
1,1,536365.0,71053,2010.0,12.0,3.0,8.0,white metal lantern,6.0,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
2,2,536365.0,84406B,2010.0,12.0,3.0,8.0,cream cupid hearts coat hanger,8.0,2010-12-01 08:26:00,2.75,17850.0,United Kingdom,22.0
3,3,536365.0,84029G,2010.0,12.0,3.0,8.0,knitted union flag hot water bottle,6.0,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
4,4,536365.0,84029E,2010.0,12.0,3.0,8.0,red woolly hottie white heart.,6.0,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
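
The `unnamed:_0` column above looks like a row index left over from an earlier export. As a minimal sketch, assuming that is what it is, the index can be absorbed at load time with the `index_col` parameter of `pd.read_csv`. The CSV text below is a hypothetical stand-in, not the real `orders.csv`:

```python
import io
import pandas as pd

# Tiny stand-in for orders.csv (hypothetical rows, not the real data).
# The header starts with a comma, i.e. the first column has no name.
csv_text = """,InvoiceNo,Country,amount_spent
0,536365,United Kingdom,15.3
1,536365,United Kingdom,20.34
"""

# index_col=0 uses the first CSV column as the row index instead of
# loading it as a regular column named "Unnamed: 0".
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Normalize the column names the same way as the cell above.
sample.columns = [c.replace(' ', '_').replace('.', '').lower() for c in sample.columns]
print(sample.columns.tolist())
```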


---

"Identify VIP and Preferred Customers" is your boss's non-technical goal. You need to translate that goal into the technical language that data analysts use:

## How to label customers whose aggregated `amount_spent` is in a given quantile range?


We break down the main problem into several sub problems:

#### Sub Problem 1: How to aggregate the  `amount_spent` for unique customers?

#### Sub Problem 2: How to select customers whose aggregated `amount_spent` is in a given quantile range?

#### Sub Problem 3: How to label selected customers as "VIP" or "Preferred"?

*Note: If you want to break down the main problem in a different way, please feel free to revise the sub problems above.*

Now, in the workspace below, tackle each of the sub problems using the iterative problem-solving workflow. Insert cells as necessary to write your code and explain your steps.

In [65]:
# your code here
spent_by_customer = orders.groupby(['customerid', 'country'], as_index = False).sum()
spent_by_customer.head()

Unnamed: 0,customerid,country,unnamed:_0,invoiceno,stockcode,year,month,day,hour,description,quantity,invoicedate,unitprice,amount_spent
0,12346.0,United Kingdom,61619,541431.0,23166,2011.0,1.0,2.0,10.0,medium ceramic top storage jar,74215.0,2011-01-18 10:01:00,1.04,77183.6
1,12347.0,Iceland,1493814939149401494114942149431494414945149461...,100740725.0,8511622375714772249222771227722277322774227752...,363960.0,1377.0,437.0,2206.0,black candelabra t-light holderairline bag vin...,2446.0,2010-12-07 14:57:002010-12-07 14:57:002010-12-...,480.36,4299.8
2,12348.0,Finland,3408334084340853408634087340883408934090340913...,16869685.0,8499222951849918499121213212132261621981219822...,62324.0,257.0,111.0,472.0,72 sweetheart fairy cake cases60 cake cases do...,2341.0,2010-12-16 19:09:002010-12-16 19:09:002010-12-...,178.71,1797.24
3,12349.0,Italy,4855024855034855044855054855064855074855084855...,42165457.0,2311223460215642141121563221312219548194849782...,146803.0,803.0,73.0,657.0,parisienne curio cabinetsweetheart wall tidy p...,631.0,2011-11-21 09:51:002011-11-21 09:51:002011-11-...,605.1,1757.55
4,12350.0,Norway,8032380324803258032680327803288032980330803318...,9231629.0,219082241279066K79191C2234884086C2255122557218...,34187.0,34.0,51.0,272.0,chocolate this way metal signmetal sign neighb...,197.0,2011-02-02 16:01:002011-02-02 16:01:002011-02-...,65.3,334.4
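
In the output above, `.sum()` was applied to every column, so text columns such as `description` and `invoicedate` were concatenated and the numeric date parts were added up. Only `amount_spent` carries a meaningful total. A possible alternative, shown here on a hypothetical toy frame rather than the real data, is to select `amount_spent` before summing:

```python
import pandas as pd

# Hypothetical mini version of `orders` with the renamed columns.
toy_orders = pd.DataFrame({
    'customerid': [1.0, 1.0, 2.0, 3.0, 3.0],
    'country': ['France', 'France', 'Spain', 'Italy', 'Italy'],
    'description': ['a', 'b', 'c', 'd', 'e'],
    'amount_spent': [10.0, 5.0, 7.5, 1.0, 2.0],
})

# Select the column before summing so text columns are not concatenated.
toy_spent = (
    toy_orders
    .groupby(['customerid', 'country'], as_index=False)['amount_spent']
    .sum()
)
print(toy_spent)
```

Note that grouping by both `customerid` and `country` yields one row per customer and country pair, so a customer who ordered from two countries would appear twice.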


In [66]:
per_95 = spent_by_customer.amount_spent.quantile(0.95)  

per_75 = spent_by_customer.amount_spent.quantile(0.75)

In [67]:
per_95

5724.340000000001

In [68]:
per_75

1651.03
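
By default, `Series.quantile` interpolates linearly between neighbouring values, so a threshold does not have to equal any customer's actual total. A small sketch on made-up numbers (the `toy_totals` series is hypothetical) shows how the two thresholds are derived:

```python
import pandas as pd

# Hypothetical per-customer totals, already sorted for readability.
toy_totals = pd.Series([100.0, 200.0, 300.0, 400.0, 500.0])

# With 5 values, the 0.75 quantile sits at position 0.75 * 4 = 3,
# which is exactly the fourth value.
toy_p75 = toy_totals.quantile(0.75)

# The 0.95 quantile sits at position 3.8, so it is interpolated
# 80% of the way from 400 to 500 (about 480).
toy_p95 = toy_totals.quantile(0.95)

print(toy_p75, toy_p95)
```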

In [69]:
VIP = spent_by_customer[spent_by_customer.amount_spent >= per_95]  
VIP

Unnamed: 0,customerid,country,unnamed:_0,invoiceno,stockcode,year,month,day,hour,description,quantity,invoicedate,unitprice,amount_spent
0,12346.0,United Kingdom,61619,541431.0,23166,2011.0,1.0,2.0,10.0,medium ceramic top storage jar,74215.0,2011-01-18 10:01:00,1.04,77183.60
10,12357.0,Switzerland,4446854446864446874446884446894446904446914446...,71842500.0,2206421232220662206721555223162231521843220612...,251375.0,1375.0,875.0,2000.0,pink doughnut trinket pot strawberry ceramic t...,2616.0,2011-11-06 16:07:002011-11-06 16:07:002011-11-...,426.03,6065.71
12,12359.0,Cyprus,5438054381543825438354384543855438654387543885...,137108833.0,2251122510226562272022721226668248422423207048...,494706.0,1506.0,777.0,3112.0,retrospot babushka doorstopgingham babushka do...,1582.0,2011-01-12 12:43:002011-01-12 12:43:002011-01-...,2137.02,6355.78
52,12409.0,Switzerland,2222582222592222602222612222622222632222642222...,61157926.0,8475571053227842190721169217702219222427224172...,219199.0,789.0,499.0,1222.0,colour glass t-light holder hangingwhite metal...,5551.0,2011-06-10 12:19:002011-06-10 12:19:002011-06-...,389.62,11072.67
57,12415.0,Australia,4551145512455134551445515455164551745518455194...,395240333.0,2207822079220802207722505225162251722518225192...,1427810.0,4229.0,2155.0,7982.0,ribbon reel lace design ribbon reel hearts des...,75654.0,2011-01-06 11:12:002011-01-06 11:12:002011-01-...,2086.61,123146.21
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4214,18109.0,United Kingdom,7576757775787579758075817582758375847585758675...,246646849.0,2216922469224708248421333228558248685123A21363...,886830.0,3095.0,2145.0,6034.0,family album white picture frameheart of wicke...,4210.0,2010-12-05 10:58:002010-12-05 10:58:002010-12-...,2178.71,8028.04
4236,18139.0,United Kingdom,4867094867104867114867124867134867144867154867...,90134520.0,2208722046227082354623230232962121084991219772...,313716.0,1716.0,205.0,2051.0,paper bunting white lacetea party wrapping pa...,5449.0,2011-11-21 14:06:002011-11-21 14:06:002011-11-...,238.71,8392.98
4260,18172.0,United Kingdom,9916299163991649916599166991679916899169991709...,105352467.0,2243122432225022251422515225162251722518225192...,380079.0,1128.0,679.0,2506.0,watering can blue elephantwatering can pink bu...,3184.0,2011-02-23 10:44:002011-02-23 10:44:002011-02-...,587.23,7561.68
4299,18223.0,United Kingdom,3372833729337303373133732337333373433735337363...,149969229.0,8487922890227262265922414224122236522198221972...,540938.0,1818.0,772.0,3257.0,assorted colour bird ornamentnovelty biscuits ...,2924.0,2010-12-16 16:42:002010-12-16 16:42:002010-12-...,799.51,6423.34


In [70]:
# use a strict upper bound so that no customer can be both VIP and Preferred
preferred = spent_by_customer[(spent_by_customer.amount_spent < per_95) & (spent_by_customer.amount_spent >= per_75)]
preferred

Unnamed: 0,customerid,country,unnamed:_0,invoiceno,stockcode,year,month,day,hour,description,quantity,invoicedate,unitprice,amount_spent
1,12347.0,Iceland,1493814939149401494114942149431494414945149461...,100740725.0,8511622375714772249222771227722277322774227752...,363960.0,1377.0,437.0,2206.0,black candelabra t-light holderairline bag vin...,2446.0,2010-12-07 14:57:002010-12-07 14:57:002010-12-...,480.36,4299.80
2,12348.0,Finland,3408334084340853408634087340883408934090340913...,16869685.0,8499222951849918499121213212132261621981219822...,62324.0,257.0,111.0,472.0,72 sweetheart fairy cake cases60 cake cases do...,2341.0,2010-12-16 19:09:002010-12-16 19:09:002010-12-...,178.71,1797.24
3,12349.0,Italy,4855024855034855044855054855064855074855084855...,42165457.0,2311223460215642141121563221312219548194849782...,146803.0,803.0,73.0,657.0,parisienne curio cabinetsweetheart wall tidy p...,631.0,2011-11-21 09:51:002011-11-21 09:51:002011-11-...,605.10,1757.55
5,12352.0,Norway,9181791818918199182091821918229182391824918259...,47523155.0,2138022064212322264622779224232265422120217552...,170935.0,552.0,243.0,1193.0,wooden happy birthday garlandpink doughnut tri...,536.0,2011-02-16 12:33:002011-02-16 12:33:002011-02-...,1354.11,2506.04
9,12356.0,Portugal,6158361584615856158661587615886158961590615916...,32183405.0,2213821198211142119921231220602206222066221322...,118649.0,142.0,185.0,592.0,baking set 9 piece retrospot white heart confe...,1591.0,2011-01-18 09:50:002011-01-18 09:50:002011-01-...,188.87,2811.43
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4326,18259.0,United Kingdom,1824818249182501825118252182531825434291434291...,23741242.0,2269422470228672283422112221142283722834228672...,84455.0,427.0,133.0,553.0,wicker star heart of wicker largehand warmer b...,714.0,2010-12-08 13:38:002010-12-08 13:38:002010-12-...,136.90,2338.60
4327,18260.0,United Kingdom,3401834019340203402134022340233402434025340263...,73457520.0,7932122452224532245421258212572208721843215272...,269463.0,577.0,436.0,1616.0,chilli lightsmeasuring tape babushka pinkmeasu...,1478.0,2010-12-16 18:23:002010-12-16 18:23:002010-12-...,469.94,2643.20
4335,18272.0,United Kingdom,1481441481451481461481471481481481491481501481...,93342091.0,2075421563714592255722979229802075220751229892...,333826.0,1221.0,509.0,2107.0,retrospot red washing up glovesred heart shape...,2050.0,2011-04-07 09:35:002011-04-07 09:35:002011-04-...,380.91,3078.58
4343,18283.0,United Kingdom,4602646027460284602946030460314603246033460344...,425704048.0,22356207262238422386207172071885099F85099B2238...,1520316.0,5503.0,2489.0,10346.0,charlotte bag pink polkadotlunch bag woodlandl...,1397.0,2011-01-06 14:14:002011-01-06 14:14:002011-01-...,1220.93,2094.88


Now we'll leave it to you to solve Q2 & Q3, building on your solution for Q1:

## Q2: How to identify which country has the most VIP Customers?

In [80]:
# your code here
VIP.groupby('country', as_index = False).agg({'customerid': 'count'}).sort_values('customerid', ascending = False)

Unnamed: 0,country,customerid
16,United Kingdom,178
7,Germany,11
6,France,9
15,Switzerland,3
13,Spain,2
11,Portugal,2
8,Japan,2
4,EIRE,2
5,Finland,1
1,Channel Islands,1


In [81]:
preferred.groupby('country', as_index = False).agg({'customerid': 'count'}).sort_values('customerid', ascending = False)

Unnamed: 0,country,customerid
25,United Kingdom,755
10,Germany,28
9,France,20
2,Belgium,11
24,Switzerland,6
19,Norway,6
22,Spain,5
21,Portugal,5
14,Italy,5
8,Finland,4
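
If only the top country is needed rather than the full ranking, `value_counts` combined with `idxmax` gives it directly. The frame below is a hypothetical stand-in for `VIP`:

```python
import pandas as pd

# Hypothetical stand-in for the VIP dataframe (one row per customer).
toy_vip = pd.DataFrame({
    'customerid': [1, 2, 3, 4],
    'country': ['United Kingdom', 'Germany', 'United Kingdom', 'France'],
})

# value_counts sorts countries by number of rows, largest first.
vip_counts = toy_vip['country'].value_counts()

# idxmax returns the index label (country) with the highest count.
top_country = vip_counts.idxmax()
print(top_country, vip_counts.max())
```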


## Q3: How to identify which country has the most VIP+Preferred Customers combined?

In [82]:
# your code here

VIP_and_preferred = spent_by_customer[spent_by_customer.amount_spent >= per_75]  
VIP_and_preferred

Unnamed: 0,customerid,country,unnamed:_0,invoiceno,stockcode,year,month,day,hour,description,quantity,invoicedate,unitprice,amount_spent
0,12346.0,United Kingdom,61619,541431.0,23166,2011.0,1.0,2.0,10.0,medium ceramic top storage jar,74215.0,2011-01-18 10:01:00,1.04,77183.60
1,12347.0,Iceland,1493814939149401494114942149431494414945149461...,100740725.0,8511622375714772249222771227722277322774227752...,363960.0,1377.0,437.0,2206.0,black candelabra t-light holderairline bag vin...,2446.0,2010-12-07 14:57:002010-12-07 14:57:002010-12-...,480.36,4299.80
2,12348.0,Finland,3408334084340853408634087340883408934090340913...,16869685.0,8499222951849918499121213212132261621981219822...,62324.0,257.0,111.0,472.0,72 sweetheart fairy cake cases60 cake cases do...,2341.0,2010-12-16 19:09:002010-12-16 19:09:002010-12-...,178.71,1797.24
3,12349.0,Italy,4855024855034855044855054855064855074855084855...,42165457.0,2311223460215642141121563221312219548194849782...,146803.0,803.0,73.0,657.0,parisienne curio cabinetsweetheart wall tidy p...,631.0,2011-11-21 09:51:002011-11-21 09:51:002011-11-...,605.10,1757.55
5,12352.0,Norway,9181791818918199182091821918229182391824918259...,47523155.0,2138022064212322264622779224232265422120217552...,170935.0,552.0,243.0,1193.0,wooden happy birthday garlandpink doughnut tri...,536.0,2011-02-16 12:33:002011-02-16 12:33:002011-02-...,1354.11,2506.04
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4326,18259.0,United Kingdom,1824818249182501825118252182531825434291434291...,23741242.0,2269422470228672283422112221142283722834228672...,84455.0,427.0,133.0,553.0,wicker star heart of wicker largehand warmer b...,714.0,2010-12-08 13:38:002010-12-08 13:38:002010-12-...,136.90,2338.60
4327,18260.0,United Kingdom,3401834019340203402134022340233402434025340263...,73457520.0,7932122452224532245421258212572208721843215272...,269463.0,577.0,436.0,1616.0,chilli lightsmeasuring tape babushka pinkmeasu...,1478.0,2010-12-16 18:23:002010-12-16 18:23:002010-12-...,469.94,2643.20
4335,18272.0,United Kingdom,1481441481451481461481471481481481491481501481...,93342091.0,2075421563714592255722979229802075220751229892...,333826.0,1221.0,509.0,2107.0,retrospot red washing up glovesred heart shape...,2050.0,2011-04-07 09:35:002011-04-07 09:35:002011-04-...,380.91,3078.58
4343,18283.0,United Kingdom,4602646027460284602946030460314603246033460344...,425704048.0,22356207262238422386207172071885099F85099B2238...,1520316.0,5503.0,2489.0,10346.0,charlotte bag pink polkadotlunch bag woodlandl...,1397.0,2011-01-06 14:14:002011-01-06 14:14:002011-01-...,1220.93,2094.88


In [84]:
VIP_and_preferred.groupby('country', as_index = False).agg({'customerid': 'count'}).sort_values('customerid', ascending = False)

Unnamed: 0,country,customerid
27,United Kingdom,933
10,Germany,39
9,France,29
2,Belgium,11
26,Switzerland,9
24,Spain,7
22,Portugal,7
20,Norway,7
14,Italy,5
8,Finland,5
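
The VIP, Preferred, and combined counts can also be viewed side by side in one table. The sketch below uses `pd.crosstab` on a hypothetical labelled frame; the `segment` column and its values are assumptions for illustration, not part of the original data set:

```python
import pandas as pd

# Hypothetical customers that already carry a segment label.
toy_labeled = pd.DataFrame({
    'country': ['United Kingdom', 'United Kingdom', 'Germany', 'France', 'Germany'],
    'segment': ['VIP', 'Preferred', 'Preferred', 'VIP', 'Regular'],
})

# Keep only the two groups of interest.
selected = toy_labeled[toy_labeled['segment'].isin(['VIP', 'Preferred'])]

# Rows are countries, columns are segments, cells are customer counts.
counts = pd.crosstab(selected['country'], selected['segment'])

# Add the combined total and rank countries by it.
counts['combined'] = counts.sum(axis=1)
counts = counts.sort_values('combined', ascending=False)
print(counts)
```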
