# Challenge: Promotions

In this challenge, you'll write code to parse and analyze data returned from another Zalando API, such as the one behind [Promos homme (Men's Promotions)](https://www.zalando.fr/promo-homme/) or [Promos femme (Women's Promotions)](https://www.zalando.fr/promo-femme/). The workflow is almost the same as in the guided lesson, but you'll work with different data.

## Obtaining the link

Write your code in the cell below to obtain the data from the API endpoint you choose. A recap of the workflow:

1. Examine the webpages and choose one that you want to work with.

1. Use Google Chrome's DevTools to inspect the XHR network requests. Find out the API endpoint that serves data to the webpage.

1. Test the API endpoint in the browser to verify its data.

1. Change the page offset in the API URL and check that the endpoint returns a different page of data.
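
The last step can be sketched in Python before opening the URLs in the browser. This is a minimal sketch, assuming the `offset` parameter counts items rather than pages, so page 2 starts at `offset=limit`. The helper name `build_page_url` is hypothetical; the endpoint and the `categories`, `limit` and `offset` parameters are the ones used later in this notebook.

```python
from urllib.parse import urlencode

# Endpoint used later in this notebook
BASE_URL = "https://www.zalando.es/api/catalog/articles"


def build_page_url(page, category="calzado-hombre", limit=84):
    """Return the catalog URL for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    params = {
        "categories": category,
        "limit": limit,
        # Assumption: offset is measured in items, so page 1 -> 0, page 2 -> limit
        "offset": (page - 1) * limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_page_url(1))
print(build_page_url(2))
```

Pasting the two printed URLs into the browser should show different articles if the offset assumption holds for the endpoint you picked.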

## Reading the data

In the next cell, use Python to obtain data from the API endpoint you chose in the previous step. Workflow:

1. Import libraries.

1. Define the initial API endpoint URL.

1. Make a request to obtain the data of the first page. Flatten the data and store it in a variable.

1. Find the total page count in the first-page data.

1. Use a `for` loop to request the additional pages, from 2 up to and including the page count. Append the data of each additional page to the flattened data object.

1. Print and review the data you obtained.
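
The paging logic in steps 3 to 5 can be tried out without touching the network. The sketch below is one possible shape, not the required solution: `fetch_all_articles` and `fake_fetch_page` are hypothetical names, and the fake response only mimics the `articles` and `pagination.page_count` keys that the real response in this notebook contains.

```python
def fetch_all_articles(fetch_page):
    """Collect the articles of every page.

    fetch_page(page) must return the parsed JSON of one page as a dict
    with 'articles' and 'pagination' keys.
    """
    first = fetch_page(1)
    articles = list(first["articles"])
    page_count = first["pagination"]["page_count"]
    # range() excludes its stop value, so add 1 to include the last page
    for page in range(2, page_count + 1):
        articles.extend(fetch_page(page)["articles"])
    return articles


# Hypothetical stand-in for the real HTTP request: three pages, two items each
def fake_fetch_page(page):
    return {
        "pagination": {"page_count": 3},
        "articles": [{"sku": f"P{page}-A"}, {"sku": f"P{page}-B"}],
    }


all_articles = fetch_all_articles(fake_fetch_page)
print(len(all_articles))
```

Passing the fetch function in as an argument keeps the loop testable; in the real notebook it would be replaced by a function that requests and parses one page of the API.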

In [3]:
import json
import urllib.request  # urlopen lives in the request submodule
import requests
import pandas as pd
from pandas import json_normalize


In [4]:
url = 'https://www.zalando.es/api/catalog/articles?categories=calzado-hombre&limit=84&offset=0'

In [6]:
response = urllib.request.urlopen(url)
results = json.load(response)

flattened_data = json_normalize(results)
flattened_data1 = json_normalize(flattened_data.articles[0])
flattened_data1.head(2)

Unnamed: 0,sku,name,sizes,url_key,media,brand_name,is_premium,family_articles,flags,product_group,outfits,delivery_promises,price.original,price.promotional,price.has_different_prices,price.has_different_original_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,amount
0,NI112B02W-002,AIR FORCE 1 '07 - Zapatillas - white,"[38.5, 40, 40.5, 41, 42, 42.5, 43, 44, 44.5, 4...",nike-sportswear-air-force-1-07-bajas-blanco-ni...,[{'path': 'spp-media-p1/3bbdd0a4e3d83a49b75d5b...,Nike Sportswear,False,[],[],shoe,"[{'id': 'ymzVwiKKS9u', 'url_key': '/outfits/ym...",[],"99,95 €","99,95 €",False,False,False,False,
1,NI112B02W-802,AIR FORCE 1 '07 - Zapatillas - black,"[38.5, 39, 41, 42, 42.5, 44.5, 45, 45.5, 46, 4...",nike-sportswear-air-force-bajas-ni112b02w-802,[{'path': 'spp-media-p1/140673e1e9b1351cbb01cb...,Nike Sportswear,False,[],[],shoe,,[],"94,95 €","94,95 €",False,False,False,False,
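
The dotted column names in the output above, such as `price.original`, come from flattening the nested `price` object of each article. As an illustration of the idea only (this is a simplified stand-in, not how `json_normalize` is implemented), a nested dictionary can be flattened with a short recursive function. The sample record is abbreviated from the first row above, and `flatten` is a hypothetical helper.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one dict with dotted keys; lists are kept as-is."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Abbreviated version of the first article shown above
article = {
    "sku": "NI112B02W-002",
    "brand_name": "Nike Sportswear",
    "flags": [],
    "price": {"original": "99,95 €", "promotional": "99,95 €"},
}

print(flatten(article))
```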


In [10]:
page_count=results['pagination']['page_count']

In [13]:
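limit = 84
pages = [flattened_data1]  # start with the first page obtained above
for i in range(2, page_count + 1):  # + 1 so the last page is included
    offset = (i - 1) * limit  # page 1 starts at offset 0
    url = f'https://www.zalando.es/api/catalog/articles?categories=calzado-hombre&limit={limit}&offset={offset}'
    response = urllib.request.urlopen(url)
    results = json.load(response)  # parse the page that was just requested
    pages.append(json_normalize(results['articles']))
    print(f"getting page {i}")
# Combine all pages into one DataFrame with a fresh 0..n-1 index
flattened_data1 = pd.concat(pages, ignore_index=True)
flattened_data1.head(2)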

getting page 2
getting page 3
getting page 4
getting page 5
getting page 6
getting page 7
getting page 8
getting page 9
getting page 10
getting page 11
getting page 12
getting page 13
getting page 14
getting page 15
getting page 16
getting page 17
getting page 18
getting page 19
getting page 20
getting page 21
getting page 22
getting page 23
getting page 24
getting page 25
getting page 26
getting page 27
getting page 28
getting page 29
getting page 30
getting page 31
getting page 32
getting page 33
getting page 34
getting page 35
getting page 36
getting page 37
getting page 38
getting page 39
getting page 40
getting page 41
getting page 42
getting page 43
getting page 44
getting page 45
getting page 46
getting page 47
getting page 48
getting page 49
getting page 50
getting page 51
getting page 52
getting page 53
getting page 54
getting page 55
getting page 56
getting page 57
getting page 58
getting page 59
getting page 60
getting page 61
getting page 62
getting page 63
getting page 64


Unnamed: 0,sku,name,sizes,url_key,media,brand_name,is_premium,family_articles,flags,product_group,outfits,delivery_promises,price.original,price.promotional,price.has_different_prices,price.has_different_original_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,amount
0,NI112B02W-002,AIR FORCE 1 '07 - Zapatillas - white,"[38.5, 40, 40.5, 41, 42, 42.5, 43, 44, 44.5, 4...",nike-sportswear-air-force-1-07-bajas-blanco-ni...,[{'path': 'spp-media-p1/3bbdd0a4e3d83a49b75d5b...,Nike Sportswear,False,[],[],shoe,"[{'id': 'ymzVwiKKS9u', 'url_key': '/outfits/ym...",[],"99,95 €","99,95 €",False,False,False,False,
1,NI112B02W-802,AIR FORCE 1 '07 - Zapatillas - black,"[38.5, 39, 41, 42, 42.5, 44.5, 45, 45.5, 46, 4...",nike-sportswear-air-force-bajas-ni112b02w-802,[{'path': 'spp-media-p1/140673e1e9b1351cbb01cb...,Nike Sportswear,False,[],[],shoe,,[],"94,95 €","94,95 €",False,False,False,False,


## Bonus

Extract the following information from the data:

* The trending brand (for example, the brand with the most products listed).

* The product(s) with the highest discount.

* The overall discount ratio across all goods (sum_discounted_prices divided by sum_original_prices).

In [15]:
# Trending brand: the brand that appears most often in the data
flattened_data1['brand_name'].value_counts().index[0]

'Nike Sportswear'

In [28]:
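# Keep only the numeric part of strings such as "99,95 €"
flattened_data1['price.original']=flattened_data1['price.original'].str.extract(r'(\d+,\d+)', expand=False)
flattened_data1['price.promotional']=flattened_data1['price.promotional'].str.extract(r'(\d+,\d+)', expand=False)

# Use a decimal point so the values can later be converted with astype(float)
flattened_data1['price.original']=flattened_data1['price.original'].str.replace(',','.')
flattened_data1['price.promotional']=flattened_data1['price.promotional'].str.replace(',','.')

flattened_data1['price.original']
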
0      99.95
1      94.95
2      99.95
3      99.95
4     109.95
       ...  
79     59.95
80     49.95
81    139.95
82    109.95
83     64.95
Name: price.original, Length: 84, dtype: object
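
The regular expression above works for prices such as `99,95 €`, but it would keep only `299,95` from a string like `1.299,95 €`. The sketch below shows one way to handle that case, assuming the API formats large prices with `.` as the thousands separator (this is an assumption, not something the data above confirms). `parse_euro_price` is a hypothetical helper.

```python
import re


def parse_euro_price(text):
    """Convert a string such as '99,95 €' to a float; return None if no number is found."""
    if not isinstance(text, str):
        return None  # covers missing values such as None or NaN
    # A digit, then digits or thousands dots, then an optional decimal part after a comma
    match = re.search(r"\d[\d.]*(?:,\d+)?", text)
    if match is None:
        return None
    # Assumption: '.' groups thousands and ',' marks the decimals
    number = match.group().replace(".", "").replace(",", ".")
    return float(number)


print(parse_euro_price("99,95 €"))
print(parse_euro_price("1.299,95 €"))
```

A function like this could be applied to a price column with `Series.map`, which would also make the separate `astype(float)` conversion unnecessary.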

In [35]:
flattened_data1['total discount'] = flattened_data1['price.original'].astype(float) -  flattened_data1['price.promotional'].astype(float)

In [42]:
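# .iloc[0] takes the first row after sorting; .loc[0] would return the row labelled 0 regardless of the sort order
flattened_data1.sort_values('total discount',ascending=False).iloc[0]['name']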

"AIR FORCE 1 '07 - Zapatillas - white"

In [49]:
flattened_data1['price.promotional'].astype(float).sum() / flattened_data1['price.original'].astype(float).sum()

0.9276267387547691
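
The three bonus metrics can also be checked on a small hand-made data set where the answers are easy to verify by eye. This is a sketch under stated assumptions: the records and their key names are hypothetical, the prices are assumed to be converted to floats already, and "trending" is taken to mean "most frequent". It also returns every product that ties for the highest discount, since the task asks for "product(s)".

```python
from collections import Counter

# Hypothetical records; in the notebook the prices come from
# the 'price.original' and 'price.promotional' columns
articles = [
    {"name": "Shoe A", "brand_name": "Brand X", "original": 100.0, "promotional": 80.0},
    {"name": "Shoe B", "brand_name": "Brand X", "original": 60.0, "promotional": 60.0},
    {"name": "Shoe C", "brand_name": "Brand Y", "original": 50.0, "promotional": 30.0},
]

# Trending brand: the brand with the most articles
top_brand, _ = Counter(a["brand_name"] for a in articles).most_common(1)[0]

# Product(s) with the highest absolute discount, keeping ties
discounts = [a["original"] - a["promotional"] for a in articles]
largest = max(discounts)
most_discounted = [a["name"] for a, d in zip(articles, discounts) if d == largest]

# Overall ratio: sum of promotional prices divided by sum of original prices
ratio = sum(a["promotional"] for a in articles) / sum(a["original"] for a in articles)

print(top_brand)
print(most_discounted)
print(round(ratio, 4))
```

Here the ratio is 170 / 210, so the goods cost about 81% of their original prices overall; a value close to 1, like the one above, means that few items are actually discounted.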