# Challenge: Promotions

In this challenge, you'll write code to parse and analyze data returned from another Zalando API, such as the one behind [Promos homme (Men's Promotions)](https://www.zalando.fr/promo-homme/) or [Promos femme (Women's Promotions)](https://www.zalando.fr/promo-femme/). The workflow is almost the same as in the guided lesson, but you'll work with different data.

## Obtaining the link

Write your code in the cell below to obtain the data from the API endpoint you choose. A recap of the workflow:

1. Examine the webpages and choose one that you want to work with.

1. Use Google Chrome's DevTools to inspect the XHR network requests. Find out the API endpoint that serves data to the webpage.

1. Test the API endpoint in the browser to verify its data.

1. Change the page offset in the API URL to check that pagination works.

In [1]:
# your code here
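
The last step of the recap can be sketched as a small URL builder. This is only an illustration: it assumes the endpoint you found takes `limit` and `offset` query parameters, as the `promo-homme` endpoint used later in this notebook does, and the helper name `build_page_url` is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint path as used later in this notebook; yours may differ.
BASE_URL = "https://www.zalando.fr/api/catalog/articles"


def build_page_url(category, page, per_page=84, sort="sale"):
    """Return the catalog URL for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    params = {
        "categories": category,
        "limit": per_page,
        # Page 1 starts at offset 0, page 2 at offset 84, and so on.
        "offset": (page - 1) * per_page,
        "sort": sort,
    }
    return BASE_URL + "?" + urlencode(params)


first_page_url = build_page_url("promo-homme", 1)
second_page_url = build_page_url("promo-homme", 2)
print(first_page_url)
print(second_page_url)
```

Pasting both URLs into the browser and comparing the `current_page` value in each response is a quick way to confirm that the offset behaves as assumed.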

## Reading the data

In the next cell, use Python to obtain data from the API endpoint you chose in the previous step. Workflow:

1. Import libraries.

1. Define the initial API endpoint URL.

1. Make a request to obtain the data of the first page. Flatten the data and store it in a new variable.

1. Find out the total page count in the 1st page data.

1. Use a `for` loop to make requests for the additional pages, from page 2 to the page count. Append the data of each additional page to the flattened data object.

1. Print and review the data you obtained.

In [2]:
# your code here
# Import libraries.
import json
import requests
import pandas as pd
# json_normalize is exposed at the top level since pandas 1.0; the pandas.io.json path is deprecated.
from pandas import json_normalize

In [3]:
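# Define the initial API endpoint URL.
# Note: offset=84 requests the second page (see 'current_page': 2 in the output below);
# offset=0 requests the first page.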

url = 'https://www.zalando.fr/api/catalog/articles?categories=promo-homme&limit=84&offset=84&sort=sale'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}

In [4]:
# Make a request to obtain one page of data. The response is flattened in the next cells.

response = requests.get(url, headers=headers)
results = response.json()
results

{'total_count': 53198,
 'pagination': {'page_count': 634, 'current_page': 2, 'per_page': 84},
 'sort': 'sale',
 'articles': [{'sku': 'OS322T06P-Q11',
   'name': 'ONSCOME TRUCKER   - Veste en jean - black denim',
   'price': {'original': '33,99\xa0€',
    'promotional': '27,99\xa0€',
    'has_different_prices': False,
    'has_different_original_prices': False,
    'has_different_promotional_prices': False,
    'has_discount_on_selected_sizes_only': False},
   'sizes': ['XS', 'S', 'M', 'L', 'XL'],
   'url_key': 'only-and-sons-onscome-trucker-veste-en-jean-black-denim-os322t06p-q11',
   'media': [{'path': 'OS/32/2T/06/PQ/11/OS322T06P-Q11@8.jpg',
     'role': 'DEFAULT',
     'packet_shot': False},
    {'path': 'OS/32/2T/06/PQ/11/OS322T06P-Q11@3.jpg',
     'role': 'HOVER',
     'packet_shot': False}],
   'brand_name': 'Only & Sons',
   'is_premium': False,
   'family_articles': [{'sku': 'OS322T06P-Q11',
     'url_key': 'only-and-sons-onscome-trucker-veste-en-jean-black-denim-os322t06p-q11'

In [5]:
flat_data = json_normalize(results)
flat_data

Unnamed: 0,total_count,sort,articles,query_path,previous_page_path,next_page_path,page_gender,premium,filters,total_article_count,...,iconPaths.filters.standard_delivery_filter,iconPaths.filters.fast_delivery_filter,iconPaths.filters.zalando_plus,iconPaths.mobileFilters.standard_delivery_filter,iconPaths.mobileFilters.fast_delivery_filter,iconPaths.mobileFilters.zalando_plus,iconPaths.flags.slow_delivery_flag,iconPaths.flags.fast_delivery_flag,iconPaths.flags.plus_delivery_flag,iconPaths.flags.zalando_plus
0,53198,sale,"[{'sku': 'OS322T06P-Q11', 'name': 'ONSCOME TRU...",/promo-homme/?p=2&order=sale,/promo-homme/?order=sale,/promo-homme/?p=3&order=sale,men,False,"[{'key': 'sizes', 'label': 'Taille', 'url_key'...",53197,...,icons/truck.svg,icons/truck-fast.svg,icons/plus-short-1.svg,icons/truck.svg,icons/truck-fast.svg,icons/plus-short-1.svg,icons/clock.svg,icons/truck-fast-orange-3.svg,icons/plus-short-1.svg,icons/zalando-plus.svg


In [6]:
flat_data = json_normalize(flat_data.articles[0])
flat_data

Unnamed: 0,sku,name,sizes,url_key,media,brand_name,is_premium,family_articles,flags,product_group,delivery_promises,price.original,price.promotional,price.has_different_prices,price.has_different_original_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,outfits
0,OS322T06P-Q11,ONSCOME TRUCKER - Veste en jean - black denim,"[XS, S, M, L, XL]",only-and-sons-onscome-trucker-veste-en-jean-bl...,[{'path': 'OS/32/2T/06/PQ/11/OS322T06P-Q11@8.j...,Only & Sons,False,"[{'sku': 'OS322T06P-Q11', 'url_key': 'only-and...","[{'key': 'discountRate', 'value': '-18%', 'tra...",clothing,[],"33,99 €","27,99 €",False,False,False,False,
1,JAM22Q01I-K11,JPRMARK MERINO KNIT CREW NECK - Pullover - mar...,"[XS, S, M, L, XL, XXL]",jack-and-jones-premium-jprmark-crew-neck-pullo...,[{'path': 'JA/M2/2Q/01/IK/11/JAM22Q01I-K11@13....,Jack & Jones PREMIUM,False,"[{'sku': 'JAM22Q01I-K11', 'url_key': 'jack-and...","[{'key': 'discountRate', 'value': '-30%', 'tra...",clothing,[],"59,95 €","41,99 €",False,False,False,False,
2,C1822O04J-A11,CORE INSTITUTIONAL LOGO TEE - T-shirt imprimé ...,"[XS, S, L, XL, XXL]",calvin-klein-jeans-core-institutional-logo-sli...,[{'path': 'C1/82/2O/04/JA/11/C1822O04J-A11@20....,Calvin Klein Jeans,False,"[{'sku': 'C1822O04J-A11', 'url_key': 'calvin-k...","[{'key': 'discountRate', 'value': '-20%', 'tra...",clothing,[],"29,95 €","23,95 €",False,False,False,False,
3,VA222S03K-Q11,CLASSIC HOODIE - Sweat à capuche - black/white,"[XS, S, M, L, XL]",vans-classic-hoodie-sweatshirt-va222s03k-q11,[{'path': 'VA/22/2S/03/KQ/11/VA222S03K-Q11@7.j...,Vans,False,"[{'sku': 'VA222S03K-Q11', 'url_key': 'vans-cla...","[{'key': 'discountRate', 'value': '-25%', 'tra...",clothing,[],"64,95 €","48,45 €",False,False,False,False,"[{'id': 'AkWBSfFFTU2', 'url_key': '/outfits/Ak..."
4,JA222O2QK-K11,JCOSHAWN TEE CREW NECK - T-shirt imprimé - sky...,"[XXS, XS, S, M, L, XL, XXL]",jack-and-jones-jcoshawn-tee-crew-neck-t-shirt-...,[{'path': 'JA/22/2O/2Q/KK/11/JA222O2QK-K11@17....,Jack & Jones,False,"[{'sku': 'JA222O2QK-K11', 'url_key': 'jack-and...","[{'key': 'discountRate', 'value': '-40%', 'tra...",clothing,[],"12,95 €","7,74 €",False,False,False,False,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
79,JA222P04R-A11,JJEBASIC - Polo - white,"[XS, S, M, L, XL, XXL]",jack-and-jones-jjebasic-polo-ja222p04r-a11,[{'path': 'JA/22/2P/04/RA/11/JA222P04R-A11@8.j...,Jack & Jones,False,"[{'sku': 'JA222P04R-A11', 'url_key': 'jack-and...","[{'key': 'discountRate', 'value': '-20%', 'tra...",clothing,[],"14,99 €","11,99 €",False,False,False,False,
80,PI922S049-Q11,Sweatshirt - black,"[XS, S, M, L, XL, XXL]",pier-one-sweatshirt-black-pi922s049-q11,[{'path': 'PI/92/2S/04/9Q/11/PI922S049-Q11@9.j...,Pier One,False,"[{'sku': 'PI922S049-Q11', 'url_key': 'pier-one...","[{'key': 'discountRate', 'value': '-30%', 'tra...",clothing,[],"27,95 €","19,59 €",False,False,False,False,
81,GY322G00Z-K11,Jeans Skinny - raw indigo,"[28, 30, 32, 34, 36]",gym-king-jeans-skinny-raw-indigo-gy322g00z-k11,[{'path': 'GY/32/2G/00/ZK/11/GY322G00Z-K11@9.j...,Gym King,False,"[{'sku': 'GY322G00Z-K11', 'url_key': 'gym-king...","[{'key': 'discountRate', 'value': '-35%', 'tra...",clothing,[],"59,95 €","38,95 €",False,False,False,False,
82,PI922D01Q-A11,Polo - white,"[XS, S, M, L, XL, XXL, 3XL]",pier-one-polo-blanc-pi922d01q-a11,[{'path': 'PI/92/2D/01/QA/11/PI922D01Q-A11@17....,Pier One,False,"[{'sku': 'PI922D01Q-A11', 'url_key': 'pier-one...","[{'key': 'discountRate', 'value': '-20%', 'tra...",clothing,[],"14,99 €","11,99 €",False,False,False,False,
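
To see what the flattening step does on a small scale, here is a minimal sketch with a two-item toy list. The SKUs and brand names are made up; only the nested `price` structure mirrors the API response. It uses `pd.json_normalize`, which assumes pandas 1.0 or later.

```python
import pandas as pd

# Toy articles: invented identifiers, nested "price" dict like the real payload.
sample_articles = [
    {"sku": "A1", "brand_name": "Brand A",
     "price": {"original": "33,99\xa0€", "promotional": "27,99\xa0€"}},
    {"sku": "B2", "brand_name": "Brand B",
     "price": {"original": "59,95\xa0€", "promotional": "41,99\xa0€"}},
]

# Each nested key becomes its own column, named "<parent>.<child>".
flat = pd.json_normalize(sample_articles)
print(flat.columns.tolist())
print(flat.shape)
```

Columns that hold lists (such as `sizes` or `media` in the real data) are left as Python objects, which is why they still appear as lists in the output above.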


In [7]:
# Find out the total page count from the pagination metadata of the response.

page_count = results['pagination']['page_count'] 
page_count

634
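
As a quick sanity check, the reported page count should match the total item count divided by the page size, rounded up. The numbers below are copied from the response shown earlier.

```python
import math

total_count = 53198  # 'total_count' in the response above
per_page = 84        # 'per_page' in the pagination block

# Round up, because the last page may be only partly filled.
expected_pages = math.ceil(total_count / per_page)
print(expected_pages)
```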

In [8]:
# Use a for loop to request every page, starting at offset 0 (the first page).
# Collect the flattened data of each page, then concatenate everything once at the end.

frames = []

for page in range(page_count):
    url = 'https://www.zalando.fr/api/catalog/articles?categories=promo-homme&limit=84&offset=' + str(page*84) + '&sort=sale'
    r = requests.get(url, headers=headers)
    result = r.json()
    # Flatten the list of articles on this page into a DataFrame.
    frames.append(json_normalize(result['articles']))

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
# sort=True keeps the columns in alphabetical order, as the old append did.
data = pd.concat(frames, sort=True)

In [9]:
data.shape

(53189, 20)
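
The loop above sends several hundred requests back to back and assumes every page comes back with data. A slightly more defensive variant is sketched below. The function name `fetch_all_articles` and the `get_json` callable are hypothetical; the sketch uses an offline stand-in instead of the real API so it can be run without network access.

```python
import time


def fetch_all_articles(get_json, page_count, per_page=84, pause=0.0):
    """Collect the 'articles' list of every page into one flat list.

    get_json is any callable that takes an offset and returns the decoded
    JSON dict for that page (hypothetical interface).
    """
    articles = []
    for page in range(page_count):
        payload = get_json(page * per_page)
        page_articles = payload.get("articles", [])
        if not page_articles:
            break  # stop early if a page comes back empty
        articles.extend(page_articles)
        if pause:
            time.sleep(pause)  # optional delay between requests
    return articles


def fake_get_json(offset):
    """Offline stand-in for the API: five fake articles, served two per page."""
    catalog = [{"sku": f"SKU{i}"} for i in range(5)]
    return {"articles": catalog[offset:offset + 2]}


collected = fetch_all_articles(fake_get_json, page_count=3, per_page=2)
print(len(collected))
```

With the real endpoint, `get_json` could wrap `requests.get(...)`, call `raise_for_status()` on the response, and return `response.json()`. The collected list can then be passed to `json_normalize` in one call.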

## Bonus

Extract the following information from the data:

* The trending brand.

* The product(s) with the highest discount.

* The overall discount ratio across all goods (the sum of discounted prices divided by the sum of original prices).

In [10]:
# your code here
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 53189 entries, 0 to 16
Data columns (total 20 columns):
amount                                       1186 non-null object
brand_name                                   53189 non-null object
delivery_promises                            53189 non-null object
family_articles                              53189 non-null object
flags                                        53189 non-null object
is_premium                                   53189 non-null bool
media                                        53189 non-null object
name                                         53189 non-null object
outfits                                      1222 non-null object
price.base_price                             200 non-null object
price.has_different_original_prices          53189 non-null bool
price.has_different_prices                   53189 non-null bool
price.has_different_promotional_prices       53189 non-null bool
price.has_discount_on_selected_size

In [12]:
data['brand_name'].value_counts().head(1)

Pier One    2222
Name: brand_name, dtype: int64
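
The other two bonus items need numeric prices, but the API returns strings such as `'33,99\xa0€'`. One possible approach is sketched below on a three-row toy sample whose prices are taken from the table above. The helper `parse_price` is hypothetical, and the sketch assumes every row has exactly one original and one promotional price; rows with missing or varying prices in the full data would need extra handling.

```python
import re

import pandas as pd


def parse_price(text):
    """Convert a French-formatted price string (e.g. 33,99 €) to a float."""
    digits = re.sub(r"[^\d,]", "", text)  # drop the currency sign and spaces
    return float(digits.replace(",", "."))


# Toy sample: prices copied from the output above, names shortened.
sample = pd.DataFrame({
    "name": ["Veste en jean", "Pullover", "T-shirt imprimé"],
    "price.original": ["33,99\xa0€", "59,95\xa0€", "12,95\xa0€"],
    "price.promotional": ["27,99\xa0€", "41,99\xa0€", "7,74\xa0€"],
})

original = sample["price.original"].map(parse_price)
promotional = sample["price.promotional"].map(parse_price)

# Discount rate per product, as a fraction of the original price.
discount_rate = 1 - promotional / original
top_discounted = sample.loc[discount_rate == discount_rate.max(), "name"].tolist()

# Sum of discounted prices divided by sum of original prices.
overall_ratio = promotional.sum() / original.sum()

print(top_discounted)
print(round(overall_ratio, 4))
```

Comparing against `discount_rate.max()` rather than using `idxmax()` keeps all products that tie for the highest discount.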