# Challenge: Promotions

In this challenge, you'll write code to parse and analyze data returned from another Zalando API, such as the one behind [Promos homme (Men's Promotions)](https://www.zalando.fr/promo-homme/) or [Promos femme (Women's Promotions)](https://www.zalando.fr/promo-femme/). The workflow is almost the same as in the guided lesson, but you'll work with different data.

## Obtaining the link

Write your code in the cell below to obtain the data from the API endpoint you choose. A recap of the workflow:

1. Examine the webpages and choose one that you want to work with.

1. Use Google Chrome's DevTools to inspect the XHR network requests. Find out the API endpoint that serves data to the webpage.

1. Test the API endpoint in the browser to verify its data.

1. Change the page number offset of the API URL to test if it's working.

**1.- Examine the webpages and choose one that you want to work with.**

Category chosen: https://www.zalando.fr/promo-sport-femme/


**2.- Use Google Chrome's DevTools to inspect the XHR network requests. Find out the API endpoint that serves data to the webpage.**

API endpoint: https://www.zalando.fr/api/catalog/articles?categories=promo-sport-femme&limit=84&offset=84&sort=sale
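The `offset` parameter in this URL appears to advance in steps of `limit`, so page 2 corresponds to `offset=84`. A minimal sketch of that mapping follows; the helper name `build_endpoint` and its defaults are illustrative assumptions, not part of the Zalando API.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.zalando.fr/api/catalog/articles"


def build_endpoint(category: str, page: int, limit: int = 84, sort: str = "sale") -> str:
    """Return the endpoint URL for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    params = {
        "categories": category,
        "limit": limit,
        "offset": (page - 1) * limit,  # page 1 -> 0, page 2 -> 84, ...
        "sort": sort,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_endpoint("promo-sport-femme", page=2))
```

Calling it with `page=2` reproduces the endpoint recorded above, which is a quick way to check the offset arithmetic before making any requests.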

## Reading the data

In the next cell, use Python to obtain data from the API endpoint you chose in the previous step. Workflow:

1. Import libraries.

1. Define the initial API endpoint URL.

1. Make a request to obtain the data of the 1st page. Flatten the data and store it in an empty object variable.

1. Find out the total page count in the 1st page data.

1. Use a FOR loop to make requests for the additional pages from 2 to page count. Append the data of each additional page to the flattened data object.

1. Print and review the data you obtained.

**1.- Import libraries.**

In [1]:
import json
import requests
import pandas as pd
from pandas import json_normalize  # top-level import (pandas >= 1.0); the pandas.io.json path is deprecated

**2.- Define the initial API endpoint URL.**

In [2]:
# Send a browser-like User-Agent header and start at the first page (offset=0)
header = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}
initial_endpoint = 'https://www.zalando.fr/api/catalog/articles?categories=promo-sport-femme&limit=84&offset=0&sort=sale'

**3.- Make a request to obtain the data of the 1st page. Flatten the data and store it in an empty object variable.**

In [3]:
flattened_data = pd.DataFrame()
response = requests.get(initial_endpoint,headers=header)
results = response.json()
results

{'total_count': 6360,
 'pagination': {'page_count': 76, 'current_page': 1, 'per_page': 84},
 'sort': 'sale',
 'articles': [{'sku': 'AD544E0CT-A12',
   'name': 'Casquette - white/black',
   'price': {'original': '17,95\xa0€',
    'promotional': '16,16\xa0€',
    'has_different_prices': False,
    'has_different_original_prices': False,
    'has_different_promotional_prices': False,
    'has_discount_on_selected_sizes_only': False},
   'sizes': ['58'],
   'url_key': 'adidas-performance-casquette-whiteblack-ad544e0ct-a12',
   'media': [{'path': 'AD/54/4E/0C/TA/12/AD544E0CT-A12@13.1.jpg',
     'role': 'DEFAULT',
     'packet_shot': False},
    {'path': 'AD/54/4E/0C/TA/12/AD544E0CT-A12@7.1.jpg',
     'role': 'HOVER',
     'packet_shot': False}],
   'brand_name': 'adidas Performance',
   'is_premium': False,
   'family_articles': [{'sku': 'AD544E0CT-A12',
     'url_key': 'adidas-performance-casquette-whiteblack-ad544e0ct-a12',
     'media': [{'path': 'AD/54/4E/0C/TA/12/AD544E0CT-A12@10.1.jpg

**4.- Find out the total page count in the 1st page data**

In [4]:
pages = results['pagination']['page_count']
pages

76
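The page count can be cross-checked against the other values in the response. The sketch below assumes that `page_count` is simply `total_count` divided by `per_page`, rounded up; the numbers are copied from the output shown earlier.

```python
import math

total_count = 6360  # results['total_count']
per_page = 84       # results['pagination']['per_page']

# Round up, because a partially filled last page still counts as a page
expected_pages = math.ceil(total_count / per_page)

# Offsets to request, one per page: 0, 84, 168, ...
offsets = [page * per_page for page in range(expected_pages)]

# Number of articles left for the last page
last_page_size = total_count - offsets[-1]

print(expected_pages, offsets[-1], last_page_size)
```

Under that assumption the result agrees with the `page_count` of 76 reported by the API, and the last page would hold fewer than 84 articles.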

**5.- Use a FOR loop to make requests for the additional pages from 2 to page count. Append the data of each additional page to the flattened data object.**

In [5]:
frames = []  # one DataFrame of articles per page
for i in range(pages):  # page 1 is requested again so every page follows the same path
    k = i * 84  # offset of page i+1 at 84 articles per page
    url = f'https://www.zalando.fr/api/catalog/articles?categories=promo-sport-femme&limit=84&offset={k}&sort=sale'
    response = requests.get(url, headers=header)
    response.raise_for_status()  # stop early on an HTTP error
    results = response.json()
    articles = json_normalize(results['articles']).set_index('sku')  # flatten the articles, index by sku
    frames.append(articles)
# DataFrame.append was removed in pandas 2.0, so concatenate all pages in one call
data = pd.concat(frames, sort=False)
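One way to try the paging logic without hitting the network is to separate fetching from assembling. The sketch below is an illustration under assumptions: `collect_articles` and `fake_fetch_page` are hypothetical names, and the article returned by the stand-in fetcher is made up.

```python
import pandas as pd


def collect_articles(fetch_page, page_count):
    """Fetch pages 1..page_count and return all articles in one DataFrame.

    fetch_page is any callable that takes a 1-based page number and returns
    the decoded JSON payload (a dict with an 'articles' list).
    """
    frames = []
    for page in range(1, page_count + 1):
        payload = fetch_page(page)
        # Flatten nested fields, e.g. price -> price.original
        frames.append(pd.json_normalize(payload["articles"]))
    return pd.concat(frames, ignore_index=True, sort=False).set_index("sku")


def fake_fetch_page(page):
    """Hypothetical stand-in for the HTTP call: one made-up article per page."""
    return {
        "articles": [
            {
                "sku": f"SKU-{page}",
                "brand_name": "Example",
                "price": {"original": "10,00\xa0€", "promotional": "8,00\xa0€"},
            }
        ]
    }


demo = collect_articles(fake_fetch_page, page_count=3)
print(demo[["brand_name", "price.original", "price.promotional"]])
```

In the notebook, `fetch_page` could be a small function that builds the URL for the page, calls `requests.get` with the header defined earlier, and returns `response.json()`.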

**6.- Print and review the data you obtained.**

In [6]:
data.head()

sku,amount,brand_name,family_articles,flags,is_premium,media,name,price.has_different_original_prices,price.has_different_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,price.original,price.promotional,product_group,sizes,tracking_information.impression_beacon,tracking_information.metrigo_impression_urls,tracking_information.source,url_key,outfits
AD544E0CT-A12,,adidas Performance,"[{'sku': 'AD544E0CT-A12', 'url_key': 'adidas-p...","[{'key': 'sponsored', 'value': 'Sponsorisé', '...",False,[{'path': 'AD/54/4E/0C/TA/12/AD544E0CT-A12@13....,Casquette - white/black,False,False,False,False,"17,95 €","16,16 €",accessoires,[58],https://ccp-et.metrigo.zalan.do/event/sbv?z=39...,[https://ccp-et.metrigo.zalan.do/event/sbv?z=3...,ccp,adidas-performance-casquette-whiteblack-ad544e...,
D2941D00H-A11,,Diadora,"[{'sku': 'D2941D00H-A11', 'url_key': 'diadora-...","[{'key': 'sponsored', 'value': 'Sponsorisé', '...",False,[{'path': 'D2/94/1D/00/HA/11/D2941D00H-A11@11....,TANK CLAY - T-shirt de sport - optical white,False,False,False,False,"39,95 €","14,00 €",clothing,"[XS, S, M, L, XL]",https://ccp-et.metrigo.zalan.do/event/sbv?z=39...,[https://ccp-et.metrigo.zalan.do/event/sbv?z=3...,ccp,diadora-tank-clay-debardeur-optical-white-d294...,
AD541E0R4-Q11,,adidas Performance,"[{'sku': 'AD541E0R4-Q11', 'url_key': 'adidas-p...","[{'key': 'sponsored', 'value': 'Sponsorisé', '...",False,[{'path': 'AD/54/1E/0R/4Q/11/AD541E0R4-Q11@18....,HOW WE DO 3/4-TIGHTS - Pantalon 3/4 de sport -...,False,True,True,False,"59,95 €","44,95 €",clothing,"[XXS, XS, S, M, L, XL]",https://ccp-et.metrigo.zalan.do/event/sbv?z=39...,[https://ccp-et.metrigo.zalan.do/event/sbv?z=3...,ccp,adidas-performance-how-we-do-collants-ad541e0r...,
N1241A0NO-Q11,,Nike Performance,"[{'sku': 'N1241A0NO-Q11', 'url_key': 'nike-per...","[{'key': 'discountRate', 'value': 'Jusqu’à -30...",False,[{'path': 'N1/24/1A/0N/OQ/11/N1241A0NO-Q11@9.1...,REVOLUTION 4 - Chaussures de running neutres -...,False,True,True,False,"49,95 €","35,00 €",shoe,"[36, 37.5, 38, 38.5, 39, 40, 40.5, 41, 42.5, 4...",,,,nike-performance-revolution-4-eu-chaussures-de...,
N1241A0S2-A14,201 g,Nike Performance,"[{'sku': 'N1241A0S2-A14', 'url_key': 'nike-per...","[{'key': 'discountRate', 'value': 'Jusqu’à -50...",False,[{'path': 'N1/24/1A/0S/2A/14/N1241A0S2-A14@2.j...,EPIC REACT FLYKNIT 2 - Chaussures de running n...,False,True,True,False,"149,95 €","75,00 €",shoe,"[35.5, 36, 36.5, 37.5, 38, 38.5, 39, 40, 40.5,...",,,,nike-performance-epic-react-flyknit-2-chaussur...,


## Bonus

Extract the following information from the data:

* The trending brand.

* The product(s) with the highest discount.

* The sum of discounts of all goods (sum_discounted_prices divided by sum_original_prices).

#### Trending Brand

In [7]:
brands = pd.DataFrame(data['brand_name'].value_counts())

In [8]:
brands.head()

,brand_name
Nike Performance,720
adidas Performance,459
Puma,328
Reebok,278
ASICS,198


#### Products with the highest discount

In [9]:
# Strip every non-digit (comma, non-breaking space, euro sign), then convert cents to euros
data[['price.original_num', 'price.promotional_num']] = data[['price.original','price.promotional']].replace(r'\D', '', regex=True).astype(int).div(100)
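The conversion can be checked on a single value first. This is a minimal sketch of the same idea for one string, assuming every price is formatted with exactly two decimal places as in the API output above; `parse_price` is a hypothetical helper name.

```python
import re


def parse_price(text: str) -> float:
    """Convert a price string such as '17,95\xa0€' to a float in euros.

    Assumes exactly two decimal places, so the remaining digits are cents.
    """
    cents = re.sub(r"[^0-9]", "", text)  # drop comma, spaces and the euro sign
    return int(cents) / 100


for raw in ["17,95\xa0€", "14,00\xa0€", "149,95\xa0€"]:
    print(repr(raw), "->", parse_price(raw))
```

A price without decimals (for example a bare `15 €`) would be misread by this approach, so it is worth confirming the format before applying it to the whole column.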

In [10]:
data.head()

sku,amount,brand_name,family_articles,flags,is_premium,media,name,price.has_different_original_prices,price.has_different_prices,price.has_different_promotional_prices,...,price.promotional,product_group,sizes,tracking_information.impression_beacon,tracking_information.metrigo_impression_urls,tracking_information.source,url_key,outfits,price.original_num,price.promotional_num
AD544E0CT-A12,,adidas Performance,"[{'sku': 'AD544E0CT-A12', 'url_key': 'adidas-p...","[{'key': 'sponsored', 'value': 'Sponsorisé', '...",False,[{'path': 'AD/54/4E/0C/TA/12/AD544E0CT-A12@13....,Casquette - white/black,False,False,False,...,"16,16 €",accessoires,[58],https://ccp-et.metrigo.zalan.do/event/sbv?z=39...,[https://ccp-et.metrigo.zalan.do/event/sbv?z=3...,ccp,adidas-performance-casquette-whiteblack-ad544e...,,17.95,16.16
D2941D00H-A11,,Diadora,"[{'sku': 'D2941D00H-A11', 'url_key': 'diadora-...","[{'key': 'sponsored', 'value': 'Sponsorisé', '...",False,[{'path': 'D2/94/1D/00/HA/11/D2941D00H-A11@11....,TANK CLAY - T-shirt de sport - optical white,False,False,False,...,"14,00 €",clothing,"[XS, S, M, L, XL]",https://ccp-et.metrigo.zalan.do/event/sbv?z=39...,[https://ccp-et.metrigo.zalan.do/event/sbv?z=3...,ccp,diadora-tank-clay-debardeur-optical-white-d294...,,39.95,14.0
AD541E0R4-Q11,,adidas Performance,"[{'sku': 'AD541E0R4-Q11', 'url_key': 'adidas-p...","[{'key': 'sponsored', 'value': 'Sponsorisé', '...",False,[{'path': 'AD/54/1E/0R/4Q/11/AD541E0R4-Q11@18....,HOW WE DO 3/4-TIGHTS - Pantalon 3/4 de sport -...,False,True,True,...,"44,95 €",clothing,"[XXS, XS, S, M, L, XL]",https://ccp-et.metrigo.zalan.do/event/sbv?z=39...,[https://ccp-et.metrigo.zalan.do/event/sbv?z=3...,ccp,adidas-performance-how-we-do-collants-ad541e0r...,,59.95,44.95
N1241A0NO-Q11,,Nike Performance,"[{'sku': 'N1241A0NO-Q11', 'url_key': 'nike-per...","[{'key': 'discountRate', 'value': 'Jusqu’à -30...",False,[{'path': 'N1/24/1A/0N/OQ/11/N1241A0NO-Q11@9.1...,REVOLUTION 4 - Chaussures de running neutres -...,False,True,True,...,"35,00 €",shoe,"[36, 37.5, 38, 38.5, 39, 40, 40.5, 41, 42.5, 4...",,,,nike-performance-revolution-4-eu-chaussures-de...,,49.95,35.0
N1241A0S2-A14,201 g,Nike Performance,"[{'sku': 'N1241A0S2-A14', 'url_key': 'nike-per...","[{'key': 'discountRate', 'value': 'Jusqu’à -50...",False,[{'path': 'N1/24/1A/0S/2A/14/N1241A0S2-A14@2.j...,EPIC REACT FLYKNIT 2 - Chaussures de running n...,False,True,True,...,"75,00 €",shoe,"[35.5, 36, 36.5, 37.5, 38, 38.5, 39, 40, 40.5,...",,,,nike-performance-epic-react-flyknit-2-chaussur...,,149.95,75.0


In [11]:
data['discount'] =  data['price.original_num'] - data['price.promotional_num']

In [16]:
data = data.sort_values(by=['discount'], ascending=False)

In [19]:
data[['brand_name','name','discount']].head()

sku,brand_name,name,discount
PE441E00S-K11,Peak Performance,Pantalon classique - blue steel,350.0
F2241F03A-A11,Bogner Fire + Ice,BINE - Veste Hardshell - white,222.0
F1441G00C-O11,Filippa K,HOOD - Sweat à capuche - camel/melange,201.0
DI541F00U-B11,Didriksons,REX WOMEN'S - Veste imperméable - beige,168.0
1SM44E024-C11,Smith Optics,TRACE MIPS - Casque - matte gravy,165.0
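The ranking above measures the discount in euros, which favors expensive items. "Highest discount" could also be read as the largest percentage off. The sketch below compares both readings on the five rows shown in the earlier `data.head()` output; the column names `original`, `promotional`, and `discount_pct` are illustrative assumptions.

```python
import pandas as pd

# Prices copied from the first five rows of the data shown earlier
sample = pd.DataFrame(
    {
        "original": [17.95, 39.95, 59.95, 49.95, 149.95],
        "promotional": [16.16, 14.00, 44.95, 35.00, 75.00],
    },
    index=pd.Index(
        ["AD544E0CT-A12", "D2941D00H-A11", "AD541E0R4-Q11", "N1241A0NO-Q11", "N1241A0S2-A14"],
        name="sku",
    ),
)

sample["discount"] = sample["original"] - sample["promotional"]          # euros saved
sample["discount_pct"] = sample["discount"] / sample["original"] * 100   # percent saved

largest_absolute = sample["discount"].idxmax()
largest_relative = sample["discount_pct"].idxmax()

print("largest discount in euros:  ", largest_absolute)
print("largest discount in percent:", largest_relative)
```

On this small sample the two readings pick different products, so it helps to state which one an analysis uses.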


#### The sum of discounts of all goods (sum_discounted_prices divided by sum_original_prices).

In [22]:
sum_of_discounts = data['discount'].sum() / data['price.original_num'].sum()

In [23]:
print(f'The sum of discounts of all goods: {round(sum_of_discounts*100,2)}%')

The sum of discounts of all goods: 32.5%
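The wording "sum_discounted_prices divided by sum_original_prices" could also be read as the ratio of promotional prices to original prices, whereas the cell above divides the summed discount amounts by the summed original prices. The two quantities are complements, as this sketch illustrates on the five rows shown in the earlier `data.head()` output (not on the full dataset).

```python
# Prices copied from the first five rows of the data shown earlier
original = [17.95, 39.95, 59.95, 49.95, 149.95]
promotional = [16.16, 14.00, 44.95, 35.00, 75.00]

total_original = sum(original)
total_promotional = sum(promotional)
total_discount = sum(o - p for o, p in zip(original, promotional))

# Share of the original value that was discounted (what the notebook computes)
discount_share = total_discount / total_original

# Share of the original value still paid (the other reading of the formula)
price_ratio = total_promotional / total_original

print(f"discount share: {discount_share:.2%}")
print(f"price ratio:    {price_ratio:.2%}")
print(f"sum of both:    {discount_share + price_ratio:.2%}")
```

Because both ratios share the same denominator, they always add up to 100%, so either reading can be derived from the other.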
