# Challenge: Promotions

In this challenge, you'll write code to parse and analyze data returned from another Zalando API, such as the one behind [Promos homme (Men's Promotions)](https://www.zalando.fr/promo-homme/) or [Promos femme (Women's Promotions)](https://www.zalando.fr/promo-femme/). The workflow is almost the same as in the guided lesson, but you'll work with different data.

## Obtaining the link

Write your code in the cell below to obtain the data from the API endpoint you choose. A recap of the workflow:

1. Examine the webpages and choose one that you want to work with.

1. Use Google Chrome's DevTools to inspect the XHR network requests. Find out the API endpoint that serves data to the webpage.

1. Test the API endpoint in the browser to verify its data.

1. Change the page number offset of the API URL to test if it's working.
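
Step 4 can be tried without editing the URL by hand. The sketch below builds the endpoint URL from its query parameters using only the standard library. `build_catalog_url` is a hypothetical helper name, and the parameter names (`categories`, `limit`, `offset`, `sort`) are copied from the URL used in this notebook, on the assumption that the endpoint still accepts them.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.zalando.fr/api/catalog/articles"

def build_catalog_url(category, offset=0, limit=84, sort="popularity"):
    """Return the catalog URL for one page of results (hypothetical helper)."""
    query = urlencode({
        "categories": category,
        "limit": limit,
        "offset": offset,
        "sort": sort,
    })
    return f"{BASE_URL}?{query}"

first_page = build_catalog_url("promo-femme")
second_page = build_catalog_url("promo-femme", offset=84)
print(first_page)
print(second_page)
```

Opening both printed URLs in the browser and comparing the `current_page` field is one way to confirm that the offset behaves as expected.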

In [1]:
headers = {"User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0"}

In [2]:
# your code here
url = 'https://www.zalando.fr/api/catalog/articles?categories=promo-femme&limit=84&offset=0&sort=popularity'

## Reading the data

In the next cell, use Python to obtain data from the API endpoint you chose in the previous step. Workflow:

1. Import libraries.

1. Define the initial API endpoint URL.

1. Make a request to obtain the data of the 1st page. Flatten the data and store it in a variable.

1. Find out the total page count in the 1st page data.

1. Use a `for` loop to make requests for the additional pages, from page 2 to the page count. Append the data of each additional page to the flattened data object.

1. Print and review the data you obtained.
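
The workflow above can be sketched end to end without touching the network. In this sketch `fetch_page` is a hypothetical stand-in for `requests.get(url, headers=headers).json()`, `flatten` is a small standard-library approximation of what `json_normalize` does to nested dictionaries, and the response shape (`pagination.page_count`, `articles`) is assumed to match the first-page output shown in this notebook.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. price -> price.original."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value  # lists and scalars are kept as-is
    return flat

def collect_articles(fetch_page, per_page=84):
    """Fetch page 1, read the page count, then fetch pages 2..page_count."""
    first = fetch_page(0)
    rows = [flatten(article) for article in first["articles"]]
    page_count = first["pagination"]["page_count"]
    for page in range(2, page_count + 1):
        offset = per_page * (page - 1)  # assumes page 1 starts at offset 0
        data = fetch_page(offset)
        rows.extend(flatten(article) for article in data["articles"])
    return rows

# Hypothetical fake API with 3 pages of 2 articles each, used instead of a real request.
def fake_fetch_page(offset, per_page=2, page_count=3):
    articles = [
        {
            "sku": f"SKU-{offset + n}",
            "price": {"original": "10,00\xa0€", "promotional": "8,00\xa0€"},
        }
        for n in range(per_page)
    ]
    return {
        "pagination": {"page_count": page_count, "per_page": per_page},
        "articles": articles,
    }

rows = collect_articles(fake_fetch_page, per_page=2)
print(len(rows))
print(rows[0])
```

Passing the fetch function as an argument keeps the pagination logic testable; with real data, the list of rows could be handed to `pd.DataFrame(rows)` in one step.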

In [3]:
# your code here
import requests
import json
import pandas as pd
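# pandas >= 1.0 exposes json_normalize at the top level;
# on older versions use: from pandas.io.json import json_normalize
from pandas import json_normalize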

In [4]:
url = 'https://www.zalando.fr/api/catalog/articles?categories=promo-femme&limit=84&offset=0&sort=popularity'

In [5]:
response = requests.get(url, headers=headers)
results = response.json()
results

{'total_count': 112202,
 'pagination': {'page_count': 892, 'current_page': 1, 'per_page': 84},
 'sort': 'popularity',
 'articles': [{'sku': 'VA215B000-Q12',
   'name': 'OLD SKOOL - Chaussures de skate - black',
   'price': {'original': '74,95\xa0€',
    'promotional': '67,45\xa0€',
    'has_different_prices': False,
    'has_different_original_prices': False,
    'has_different_promotional_prices': False,
    'has_discount_on_selected_sizes_only': False},
   'sizes': ['34.5',
    '35',
    '36',
    '36.5',
    '37',
    '38',
    '38.5',
    '39',
    '40',
    '40.5',
    '41',
    '42',
    '42.5',
    '43',
    '44',
    '44.5',
    '45',
    '46',
    '47',
    '48',
    '49',
    '50'],
   'url_key': 'vans-old-skool-baskets-basses-va215b000-q12',
   'media': [{'path': 'VA/21/5B/00/0Q/12/VA215B000-Q12@12.jpg',
     'role': 'DEFAULT',
     'packet_shot': True}],
   'brand_name': 'Vans',
   'is_premium': False,
   'family_articles': [],
   'flags': [{'key': 'discountRate',
     'val

In [6]:
flattened_data = json_normalize(results)

In [7]:
flattened_data1 = json_normalize(flattened_data.articles[0])
flattened_data1

,sku,name,sizes,url_key,media,brand_name,is_premium,family_articles,flags,product_group,outfits,delivery_promises,price.original,price.promotional,price.has_different_prices,price.has_different_original_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,amount,price.base_price
0,VA215B000-Q12,OLD SKOOL - Chaussures de skate - black,"[34.5, 35, 36, 36.5, 37, 38, 38.5, 39, 40, 40....",vans-old-skool-baskets-basses-va215b000-q12,[{'path': 'VA/21/5B/00/0Q/12/VA215B000-Q12@12....,Vans,False,[],"[{'key': 'discountRate', 'value': '-10%', 'tra...",shoe,"[{'id': 'GS2rkdweTwu', 'url_key': '/outfits/GS...",[],"74,95 €","67,45 €",False,False,False,False,,
1,NL011N0AH-Q11,BRISK - Bottines - black,"[36, 37, 38, 39, 41, 42]",new-look-brisk-bottines-black-nl011n0ah-q11,[{'path': 'NL/01/1N/0A/HQ/11/NL011N0AH-Q11@9.j...,New Look,False,[],"[{'key': 'discountRate', 'value': '-25%', 'tra...",shoe,,[],"29,99 €","22,49 €",False,False,False,False,,
2,AD121D0HV-Q11,ADICOLOR TREFOIL GRAPHIC TEE - T-shirt imprimé...,"[32, 34, 36, 38, 40, 42, 44, 46]",adidas-originals-t-shirt-imprime-black-ad121d0...,[{'path': 'AD/12/1D/0H/VQ/11/AD121D0HV-Q11@13....,adidas Originals,False,[],"[{'key': 'discountRate', 'value': 'Jusqu’à -10...",clothing,,[],"24,95 €","22,45 €",True,False,True,False,,
3,TW421IA0N-Q11,Gilet - black/white,"[XS, S, M, L]",twintip-gilet-black-white-tw421ia0n-q11,[{'path': 'TW/42/1I/A0/NQ/11/TW421IA0N-Q11@10....,TWINTIP,False,[],"[{'key': 'discountRate', 'value': '-20%', 'tra...",clothing,,[],"38,99 €","31,15 €",False,False,False,False,,
4,AD121J0IH-G11,LOCK UP - Veste légère - scarlet,"[34, 36, 38, 40, 42, 44, 46, 48]",adidas-originals-lock-up-veste-de-survetement-...,[{'path': 'AD/12/1J/0I/HG/11/AD121J0IH-G11@10....,adidas Originals,False,[],"[{'key': 'discountRate', 'value': 'Jusqu’à -50...",clothing,,[],"59,95 €","29,95 €",True,False,True,False,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
79,AD115O0G9-A11,AMERICANA - Baskets basses - footwear white/co...,"[36, 38, 40, 42, 44, 46, 48, 36 2/3, 37 1/3, 3...",adidas-originals-americana-baskets-basses-foot...,[{'path': 'AD/11/5O/0G/9A/11/AD115O0G9-A11@11....,adidas Originals,False,[],"[{'key': 'discountRate', 'value': 'Jusqu’à -50...",shoe,,[],"79,95 €","39,95 €",True,False,True,False,,
80,EV421N032-K12,Jean slim - dark blue,"[36, 38, 40, 44, 46]",evenandodd-jegging-dark-blue-ev421n032-k12,[{'path': 'EV/42/1N/03/2K/12/EV421N032-K12@12....,Even&Odd,False,[],"[{'key': 'discountRate', 'value': '-20%', 'tra...",clothing,,[],"24,99 €","19,99 €",False,False,False,False,,
81,NI111A0IU-Q11,RYZ - Baskets basses - black/white,"[35.5, 36, 36.5, 37.5, 38, 38.5, 39, 40, 40.5,...",nike-sportswear-uptear-baskets-basses-ni111a0i...,[{'path': 'NI/11/1A/0I/UQ/11/NI111A0IU-Q11@3.j...,Nike Sportswear,False,[],"[{'key': 'campaign', 'value': 'HOT DROP', 'tra...",shoe,,[],"89,95 €","76,45 €",True,False,True,False,,
82,NM321G014-K11,NMDEBRA - Veste en jean - medium blue denim,"[36, 38, 40, 42, 44]",noisy-may-nmdebra-veste-en-jean-nm321g014-k11,[{'path': 'NM/32/1G/01/4K/11/NM321G014-K11@12....,Noisy May,False,[],"[{'key': 'discountRate', 'value': '-20%', 'tra...",clothing,,"[{'key': 'fast_delivery_flag', 'label': 'Livré...","29,99 €","23,99 €",False,False,False,False,,


In [8]:
# Get the total number of pages
total_pages=results['pagination']['page_count']
total_pages

892

In [9]:
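df = pd.DataFrame()
# i runs from 2 to total_pages - 1, so the offsets run from 168 to 84 * (total_pages - 1).
# The pages at offsets 0 and 84 (pages 1 and 2) are therefore not collected by this loop.
for i in range(2, total_pages):
    k = 84 * i  # offset of the first article on the requested page
    url = f'https://www.zalando.fr/api/catalog/articles?categories=promo-femme&limit=84&offset={k}&sort=popularity'
    response = requests.get(url, headers=headers)
    results = response.json()
    # Flatten the list of articles on this page into one row per article
    flattened_data = json_normalize(results)
    flattened_data1 = json_normalize(flattened_data.articles[0])
    flattened_data1 = flattened_data1.set_index('sku')
    # pd.concat is used because DataFrame.append was removed in pandas 2.0
    df = pd.concat([df, flattened_data1])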
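
To check which pages a loop like the one above actually requests, it helps to list the offsets it produces. The sketch below assumes that page `p` starts at offset `84 * (p - 1)` (page 1 at offset 0, as in the initial URL) and uses the `page_count` of 892 reported in the first-page output. The variable names are illustrative.

```python
per_page = 84
page_count = 892  # value of results['pagination']['page_count'] shown earlier

# Offsets requested by a loop of the form: for i in range(2, page_count): k = 84 * i
loop_offsets = [per_page * i for i in range(2, page_count)]

# Offsets needed to cover every page, assuming page p starts at per_page * (p - 1)
all_offsets = [per_page * (page - 1) for page in range(1, page_count + 1)]

missing = sorted(set(all_offsets) - set(loop_offsets))
print("requests made by the loop:", len(loop_offsets))
print("rows if every page is full:", len(loop_offsets) * per_page)
print("offsets never requested:", missing)
```

Under that assumption, the number of rows from full pages matches the `Length: 74760` reported further down, and the first two pages are the ones left out.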


In [10]:
df.columns

Index(['amount', 'brand_name', 'delivery_promises', 'family_articles', 'flags',
       'is_premium', 'media', 'name', 'outfits', 'price.base_price',
       'price.has_different_original_prices', 'price.has_different_prices',
       'price.has_different_promotional_prices',
       'price.has_discount_on_selected_sizes_only', 'price.original',
       'price.promotional', 'product_group', 'sizes', 'url_key'],
      dtype='object')

In [11]:
display(df)

sku,amount,brand_name,delivery_promises,family_articles,flags,is_premium,media,name,outfits,price.base_price,price.has_different_original_prices,price.has_different_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,price.original,price.promotional,product_group,sizes,url_key
JY121I0AI-K11,,JDY,[],[],"[{'key': 'discountRate', 'value': '-20%', 'tra...",False,[{'path': 'JY/12/1I/0A/IK/11/JY121I0AI-K11@13....,JDYMAX LONG - Pullover - cloud dancer/dark den...,,,False,False,False,False,"26,99 €","21,59 €",clothing,"[XS, S, M, L]",jdy-jdymax-long-pullover-jy121i0ai-k11
UG111X01X-N11,,UGG,[],[],"[{'key': 'discountRate', 'value': '-10%', 'tra...",False,[{'path': 'UG/11/1X/01/XN/11/UG111X01X-N11@6.j...,CLASSIC MINI II - Boots à talons - eucalytpus ...,,,False,False,False,False,"169,95 €","152,95 €",shoe,"[36, 37, 38, 39, 40, 41, 42, 43]",ugg-classic-mini-ii-bottines-ug111x01x-n11
TO151H0EB-Q11,,Tommy Hilfiger,[],[],"[{'key': 'discountRate', 'value': '-20%', 'tra...",False,[{'path': 'TO/15/1H/0E/BQ/11/TO151H0EB-Q11@11....,Cabas - black,,,False,False,False,False,"79,95 €","63,95 €",accessoires,[One Size],tommy-hilfiger-cabas-black-to151h0eb-q11
NL021I0E6-Q11,,New Look,[],[],"[{'key': 'discountRate', 'value': '-15%', 'tra...",False,[{'path': 'NL/02/1I/0E/6Q/11/NL021I0E6-Q11@4.j...,LETTUCE EDGE STAND NECK - Pullover - black,,,False,False,False,False,"23,99 €","20,39 €",clothing,"[34, 36, 38, 40, 42, 46]",new-look-lettuce-edge-stand-neck-pullover-blac...
NI121J0AW-C11,,Nike Sportswear,[],[],"[{'key': 'discountRate', 'value': 'Jusqu’à -15...",False,[{'path': 'NI/12/1J/0A/WC/11/NI121J0AW-C11@4.j...,Sweat à capuche - dark grey heather/white,,,False,True,True,False,"48,95 €","41,55 €",clothing,"[XS, S, M, L, XL, XXL]",nike-sportswear-hoodie-sweatshirt-ni121j0aw-c11
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
C7641E034-Q11,,Champion,[],[],"[{'key': 'discountRate', 'value': '-35%', 'tra...",False,[{'path': 'C7/64/1E/03/4Q/11/C7641E034-Q11@4.j...,CAPRI PANTS - Pantalon 3/4 de sport - black,,,False,False,False,False,"24,95 €","16,15 €",clothing,"[XS, S, XL]",champion-capri-pants-collants-black-c7641e034-q11
VA111A0M1-Q11,,Vagabond,[],[],"[{'key': 'discountRate', 'value': '-15%', 'tra...",False,[{'path': 'VA/11/1A/0M/1Q/11/VA111A0M1-Q11@11....,ZOE PLATFORM - Baskets basses - black,,,False,False,False,False,"99,95 €","84,95 €",shoe,"[36, 37, 40, 41]",vagabond-zoe-platform-baskets-basses-black-va1...
KS821C03T-K11,,Karen by Simonsen,[],[],"[{'key': 'discountRate', 'value': '-30%', 'tra...",False,[{'path': 'KS/82/1C/03/TK/11/KS821C03T-K11@25....,Robe d'été - night sky,,,False,False,False,False,"95,00 €","66,50 €",clothing,"[36, 40, 42, 44]",karen-by-simonsen-robe-dete-night-sky-ks821c03...
AX911N02H-O11,,Call it Spring,[],[],"[{'key': 'campaign', 'value': 'Végane', 'track...",False,[{'path': 'AX/91/1N/02/HO/11/AX911N02H-O11@5.j...,HIGHRISE - Bottines à talons hauts - rust,,,False,False,False,False,"69,95 €","62,95 €",shoe,"[35, 40, 41, 42.5]",call-it-spring-highrise-boots-a-talons-rust-ax...


## Bonus

Extract the following information from the data:

* The trending brand.

* The product(s) with the highest discount.

* The sum of discounts of all goods (sum_discounted_prices divided by sum_original_prices).

In [12]:
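# Keep only the numeric part of each price string, e.g. '74,95 €' -> '74,95'
df['price.original'] = df['price.original'].str.extract(r'(\d+,\d+)', expand=False)
df['price.promotional'] = df['price.promotional'].str.extract(r'(\d+,\d+)', expand=False)

# Use a dot as the decimal separator so the strings can be cast to float
df['price.original'] = df['price.original'].str.replace(',', '.', regex=False)
df['price.promotional'] = df['price.promotional'].str.replace(',', '.', regex=False)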
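
The same cleaning can be expressed as one small function, which is easier to test on individual strings. This is a sketch: `parse_price` is a hypothetical helper, and it assumes prices use a comma as the decimal separator followed by a space or non-breaking space and the euro sign, as in the output above.

```python
import re

# One or more digits, a comma, then one or more digits (e.g. "74,95")
PRICE_PATTERN = re.compile(r"(\d+),(\d+)")

def parse_price(text):
    """Convert a price string with a decimal comma to a float (hypothetical helper)."""
    match = PRICE_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    euros, cents = match.groups()
    return float(f"{euros}.{cents}")

print(parse_price("74,95\xa0€"))
print(parse_price("67,45\xa0€"))
```

Prices written with a thousands separator would need extra handling; none appear in the rows shown in this notebook.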

In [13]:
df['discount_amount']=df['price.original'].astype(float)-df['price.promotional'].astype(float)
df['discount_amount']

sku
JY121I0AI-K11     5.4
UG111X01X-N11    17.0
TO151H0EB-Q11    16.0
NL021I0E6-Q11     3.6
NI121J0AW-C11     7.4
                 ... 
C7641E034-Q11     8.8
VA111A0M1-Q11    15.0
KS821C03T-K11    28.5
AX911N02H-O11     7.0
ON321A10V-C11    14.0
Name: discount_amount, Length: 74760, dtype: float64

In [14]:
df1=df.copy()

In [15]:
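# Sum only the discount column per brand; summing every column would also try to add up list and text columns
total_disc = df1.groupby('brand_name')['discount_amount'].sum()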

In [16]:
# Trending brand, taken here as the brand with the largest total discount amount:
total_disc.sort_values(ascending=False).index[0]

'myMo'
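
"Trending" can be read in more than one way, and the choice of metric can change the answer. The sketch below contrasts two readings, the brand with the most discounted articles and the brand with the largest total discount. The brand names and amounts are hypothetical and serve only to illustrate the difference.

```python
from collections import Counter, defaultdict

# Hypothetical (brand, discount in euros) pairs, for illustration only
articles = [
    ("Brand A", 5.0),
    ("Brand A", 4.0),
    ("Brand A", 6.0),
    ("Brand B", 40.0),
]

# Reading 1: the brand with the most discounted articles
article_counts = Counter(brand for brand, _ in articles)
most_listed = article_counts.most_common(1)[0][0]

# Reading 2: the brand with the largest total discount amount
total_discount = defaultdict(float)
for brand, discount in articles:
    total_discount[brand] += discount
largest_total = max(total_discount, key=total_discount.get)

print("most articles:", most_listed)
print("largest total discount:", largest_total)
```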

In [17]:
# Move 'sku' from the index back into a regular column so it can be used like any other column.
# reset_index() returns a new DataFrame, so the result is assigned back to df1.
df1 = df1.reset_index()
df1

,sku,amount,brand_name,delivery_promises,family_articles,flags,is_premium,media,name,outfits,...,price.has_different_original_prices,price.has_different_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,price.original,price.promotional,product_group,sizes,url_key,discount_amount
0,JY121I0AI-K11,,JDY,[],[],"[{'key': 'discountRate', 'value': '-20%', 'tra...",False,[{'path': 'JY/12/1I/0A/IK/11/JY121I0AI-K11@13....,JDYMAX LONG - Pullover - cloud dancer/dark den...,,...,False,False,False,False,26.99,21.59,clothing,"[XS, S, M, L]",jdy-jdymax-long-pullover-jy121i0ai-k11,5.4
1,UG111X01X-N11,,UGG,[],[],"[{'key': 'discountRate', 'value': '-10%', 'tra...",False,[{'path': 'UG/11/1X/01/XN/11/UG111X01X-N11@6.j...,CLASSIC MINI II - Boots à talons - eucalytpus ...,,...,False,False,False,False,169.95,152.95,shoe,"[36, 37, 38, 39, 40, 41, 42, 43]",ugg-classic-mini-ii-bottines-ug111x01x-n11,17.0
2,TO151H0EB-Q11,,Tommy Hilfiger,[],[],"[{'key': 'discountRate', 'value': '-20%', 'tra...",False,[{'path': 'TO/15/1H/0E/BQ/11/TO151H0EB-Q11@11....,Cabas - black,,...,False,False,False,False,79.95,63.95,accessoires,[One Size],tommy-hilfiger-cabas-black-to151h0eb-q11,16.0
3,NL021I0E6-Q11,,New Look,[],[],"[{'key': 'discountRate', 'value': '-15%', 'tra...",False,[{'path': 'NL/02/1I/0E/6Q/11/NL021I0E6-Q11@4.j...,LETTUCE EDGE STAND NECK - Pullover - black,,...,False,False,False,False,23.99,20.39,clothing,"[34, 36, 38, 40, 42, 46]",new-look-lettuce-edge-stand-neck-pullover-blac...,3.6
4,NI121J0AW-C11,,Nike Sportswear,[],[],"[{'key': 'discountRate', 'value': 'Jusqu’à -15...",False,[{'path': 'NI/12/1J/0A/WC/11/NI121J0AW-C11@4.j...,Sweat à capuche - dark grey heather/white,,...,False,True,True,False,48.95,41.55,clothing,"[XS, S, M, L, XL, XXL]",nike-sportswear-hoodie-sweatshirt-ni121j0aw-c11,7.4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
74755,C7641E034-Q11,,Champion,[],[],"[{'key': 'discountRate', 'value': '-35%', 'tra...",False,[{'path': 'C7/64/1E/03/4Q/11/C7641E034-Q11@4.j...,CAPRI PANTS - Pantalon 3/4 de sport - black,,...,False,False,False,False,24.95,16.15,clothing,"[XS, S, XL]",champion-capri-pants-collants-black-c7641e034-q11,8.8
74756,VA111A0M1-Q11,,Vagabond,[],[],"[{'key': 'discountRate', 'value': '-15%', 'tra...",False,[{'path': 'VA/11/1A/0M/1Q/11/VA111A0M1-Q11@11....,ZOE PLATFORM - Baskets basses - black,,...,False,False,False,False,99.95,84.95,shoe,"[36, 37, 40, 41]",vagabond-zoe-platform-baskets-basses-black-va1...,15.0
74757,KS821C03T-K11,,Karen by Simonsen,[],[],"[{'key': 'discountRate', 'value': '-30%', 'tra...",False,[{'path': 'KS/82/1C/03/TK/11/KS821C03T-K11@25....,Robe d'été - night sky,,...,False,False,False,False,95.00,66.50,clothing,"[36, 40, 42, 44]",karen-by-simonsen-robe-dete-night-sky-ks821c03...,28.5
74758,AX911N02H-O11,,Call it Spring,[],[],"[{'key': 'campaign', 'value': 'Végane', 'track...",False,[{'path': 'AX/91/1N/02/HO/11/AX911N02H-O11@5.j...,HIGHRISE - Bottines à talons hauts - rust,,...,False,False,False,False,69.95,62.95,shoe,"[35, 40, 41, 42.5]",call-it-spring-highrise-boots-a-talons-rust-ax...,7.0


In [18]:
# The product(s) with the highest discount in euros (amounts are summed per sku):
df1.groupby('sku')['discount_amount'].sum().sort_values(ascending=False).head()

sku
MT021U00X-O11    796.0
23B21U00L-Q11    780.0
NOE21U005-B11    680.0
C7311X007-C11    465.0
MQ121B00K-Q11    450.0
Name: discount_amount, dtype: float64
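
The highest discount can also be read as the largest percentage reduction rather than the largest amount in euros, and the two rankings do not have to agree. A small sketch using three rows from the tables above (prices already converted to floats); the variable names are illustrative.

```python
# (sku, original price, promotional price), taken from rows displayed earlier
rows = [
    ("UG111X01X-N11", 169.95, 152.95),
    ("C7641E034-Q11", 24.95, 16.15),
    ("KS821C03T-K11", 95.00, 66.50),
]

discounts = [
    {
        "sku": sku,
        "amount": original - promotional,              # discount in euros
        "rate": (original - promotional) / original,   # discount as a share of the original price
    }
    for sku, original, promotional in rows
]

top_by_amount = max(discounts, key=lambda d: d["amount"])
top_by_rate = max(discounts, key=lambda d: d["rate"])
print("largest amount:", top_by_amount["sku"])
print("largest rate:", top_by_rate["sku"])
```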

In [19]:
# The sum of discounts of all goods (sum_discounted_prices divided by sum_original_prices):
df['sum_discounts'] = df['price.promotional'].astype(float).sum() / df['price.original'].astype(float).sum()
df['sum_discounts']

sku
JY121I0AI-K11    0.727879
UG111X01X-N11    0.727879
TO151H0EB-Q11    0.727879
NL021I0E6-Q11    0.727879
NI121J0AW-C11    0.727879
                   ...   
C7641E034-Q11    0.727879
VA111A0M1-Q11    0.727879
KS821C03T-K11    0.727879
AX911N02H-O11    0.727879
ON321A10V-C11    0.727879
Name: sum_discounts, Length: 74760, dtype: float64

In [20]:
# Every row of 'sum_discounts' holds the same ratio, so value_counts() shows that single value as a percentage:
(df['sum_discounts']*100).value_counts()

72.787866    74760
Name: sum_discounts, dtype: int64
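
Because the ratio is a single number, it can also be computed as a scalar instead of being stored in every row. A minimal sketch using the first five rows shown above; with the full DataFrame the same idea would be the sum of `price.promotional` divided by the sum of `price.original`.

```python
# (original, promotional) prices of the first five rows displayed earlier
prices = [
    (26.99, 21.59),
    (169.95, 152.95),
    (79.95, 63.95),
    (23.99, 20.39),
    (48.95, 41.55),
]

sum_original = sum(original for original, _ in prices)
sum_promotional = sum(promotional for _, promotional in prices)

ratio = sum_promotional / sum_original  # share of the original price still paid
overall_discount = 1 - ratio            # share of the original price taken off
print(f"ratio: {ratio:.4f}, overall discount: {overall_discount:.1%}")
```

Applied to the ratio reported above (0.727879), the overall discount is 1 - 0.727879, or about 27.2% of the summed original prices.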