# Challenge: Promotions

In this challenge, you'll write code to parse and analyze data returned from another API on Zalando, such as [Promos homme (Men's Promotions)](https://www.zalando.fr/promo-homme/) or [Promos femme (Women's Promotions)](https://www.zalando.fr/promo-femme/). The workflow is almost the same as in the guided lesson, but you'll work with different data.

## Obtaining the link

Write your code in the cell below to obtain the data from the API endpoint you choose. A recap of the workflow:

1. Examine the webpages and choose one that you want to work with.

1. Use Google Chrome's DevTools to inspect the XHR network requests. Find out the API endpoint that serves data to the webpage.

1. Test the API endpoint in the browser to verify its data.

1. Change the page offset in the API URL to check that pagination works.
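
The last step, changing the page offset, can be sketched as a small helper that builds the URL for a given page. This is only a minimal sketch: the helper name `build_page_url` is hypothetical, and the parameter names (`categories`, `limit`, `offset`, `sort`) are taken from the example URL used in this notebook. The real endpoint may behave differently or change over time.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.zalando.fr/api/catalog/articles"


def build_page_url(page, category="promo-homme", limit=84, sort="sale"):
    """Return the API URL for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page must be >= 1")
    params = {
        "categories": category,
        "limit": limit,
        # Page 1 starts at offset 0, page 2 at offset `limit`, and so on.
        "offset": (page - 1) * limit,
        "sort": sort,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_page_url(2))
```

Pasting the printed URL into the browser is a quick way to check that the offset returns a different set of articles.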

In [1]:
url = 'https://www.zalando.fr/api/catalog/articles?categories=promo-homme&limit=84&offset=84&sort=sale'

In [14]:
# your code here
import json
import urllib.request

import pandas as pd
from pandas import json_normalize

# Download the page and decode the JSON payload
with urllib.request.urlopen(url) as uf:
    response = uf.read().decode('utf-8')
results = json.loads(response)

# The 'articles' key holds a list of products: flatten it to one row per article
flattened_data1 = json_normalize(results['articles'])
flattened_data1.head()

## Reading the data

In the next cell, use Python to obtain data from the API endpoint you chose in the previous step. Workflow:

1. Import libraries.

1. Define the initial API endpoint URL.

1. Make a request to obtain the data of the 1st page. Flatten the data and store it in a variable.

1. Find out the total page count in the 1st page data.

1. Use a `for` loop to request the remaining pages, from page 2 up to the page count. Append the data of each additional page to the flattened data object.

1. Print and review the data you obtained.
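
The pagination part of this workflow can be sketched without touching the network. In the sketch below, `fetch_all_articles` and `fake_fetch_page` are hypothetical names. The `pagination.page_count` and `articles` keys are the ones used elsewhere in this notebook, while the fake page contents are made up purely for illustration.

```python
def fetch_all_articles(fetch_page, limit=84):
    """Collect the 'articles' lists from every page into one list.

    fetch_page(offset) is a caller-supplied function (an assumption of this
    sketch) that returns the decoded JSON dict for the page at that offset.
    """
    first = fetch_page(0)
    articles = list(first["articles"])
    page_count = first["pagination"]["page_count"]
    # Page 1 is already loaded, so request pages 2..page_count.
    for page in range(2, page_count + 1):
        data = fetch_page((page - 1) * limit)
        articles.extend(data["articles"])
    return articles


# Offline stand-in for the real API: 3 pages of 2 articles each.
def fake_fetch_page(offset, limit=2):
    return {
        "pagination": {"page_count": 3},
        "articles": [{"sku": f"SKU-{i}"} for i in range(offset, offset + limit)],
    }


all_articles = fetch_all_articles(fake_fetch_page, limit=2)
print(len(all_articles))
```

With the real endpoint, `fetch_page` could wrap `urllib.request.urlopen` and `json.loads`, and the resulting list of dicts could then be flattened in one call with `pd.json_normalize`.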

In [15]:
# your code here

# Get the total number of pages from the first response
total_pages = results['pagination']['page_count']

# Request every page and collect one DataFrame per page
limit = 84
frames = []
for i in range(total_pages):
    offset = limit * i
    url = f'https://www.zalando.fr/api/catalog/articles?categories=promo-homme&limit={limit}&offset={offset}&sort=sale'
    with urllib.request.urlopen(url) as uf:
        results = json.loads(uf.read().decode('utf-8'))
    frames.append(json_normalize(results['articles']).set_index('sku'))

# Concatenate once at the end (DataFrame.append was removed in pandas 2.0)
df = pd.concat(frames, sort=False)

df

sku,amount,brand_name,family_articles,flags,is_premium,media,name,outfits,price.base_price,price.has_different_original_prices,price.has_different_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,price.original,price.promotional,product_group,sizes,url_key
NI112O074-O11,,Nike Sportswear,"[{'sku': 'NI112O074-O11', 'url_key': 'nike-spo...","[{'key': 'campaign', 'value': 'HOT DROP', 'tra...",False,[{'path': 'NI/11/2O/07/4O/11/NI112O074-O11@2.j...,AIR FORCE 1 '07 LV8 2 - Baskets basses - deser...,,,False,True,True,False,"109,95 €","55,00 €",shoe,"[38.5, 39, 40, 40.5, 41, 42, 42.5, 43, 44, 44....",nike-sportswear-air-force-1-07-lv8-2-baskets-b...
L0642G00E-K11,,Lacoste Sport,"[{'sku': 'L0642G00E-K11', 'url_key': 'lacoste-...","[{'key': 'discountRate', 'value': 'Jusqu’à -50...",False,[{'path': 'L0/64/2G/00/EK/11/L0642G00E-K11@10....,HOODY SH2128 - Sweat à capuche - marine/argent...,,,False,True,True,False,"99,95 €","50,00 €",clothing,"[S, M, L, XL, XXL, 3XL]",lacoste-sport-hoody-sweatshirt-marineargent-ch...
NI112O05I-C11,,Nike Sportswear,"[{'sku': 'NI112O05I-C11', 'url_key': 'nike-spo...","[{'key': 'campaign', 'value': 'HOT DROP', 'tra...",False,[{'path': 'NI/11/2O/05/IC/11/NI112O05I-C11@9.j...,AIR MAX 270 - Baskets basses - anthracite/volt...,,,False,False,False,False,"149,95 €","90,00 €",shoe,"[38.5, 39, 40, 40.5, 41, 42, 42.5, 43, 44, 44....",nike-sportswear-air-max-270-baskets-basses-ni1...
NI122S09X-Q11,,Nike Sportswear,"[{'sku': 'NI122S09X-Q11', 'url_key': 'nike-spo...","[{'key': 'discountRate', 'value': '-35%', 'tra...",False,[{'path': 'NI/12/2S/09/XQ/11/NI122S09X-Q11@7.j...,Sweatshirt - black/white,,,False,False,False,False,"59,95 €","39,00 €",clothing,"[S, M, L, XL, XXL]",nike-sportswear-sweatshirt-blackwhite-ni122s09...
LE222S01V-Q11,,Levi's®,"[{'sku': 'LE222S01V-Q11', 'url_key': 'levisr-g...","[{'key': 'discountRate', 'value': 'Jusqu’à -35...",False,[{'path': 'LE/22/2S/01/VQ/11/LE222S01V-Q11@10....,GRAPHIC CREW - Sweatshirt - logo crew mineral...,,,False,True,True,False,"59,95 €","39,00 €",clothing,"[S, M, L, XL, XXL]",levisr-graphic-crew-sweatshirt-le222s01v-q11
BO122G06X-K11,,BOSS,"[{'sku': 'BO122G06X-K11', 'url_key': 'boss-cas...","[{'key': 'discountRate', 'value': '-25%', 'tra...",True,[{'path': 'BO/12/2G/06/XK/11/BO122G06X-K11@10....,TABER - Jeans fuselé - bright blue,,,False,False,False,False,"119,95 €","90,00 €",clothing,"[30x34, 31x32, 31x34, 32x32, 32x34, 33x32, 33x...",boss-casual-taber-jeans-fusele-bright-blue-bo1...
BO122E027-B12,,BOSS,"[{'sku': 'BO122E027-B12', 'url_key': 'boss-ora...","[{'key': 'discountRate', 'value': '-20%', 'tra...",True,[{'path': 'BO/12/2E/02/7B/12/BO122E027-B12@8.j...,SCHINO - Chino - open beige,,,False,False,False,False,"99,95 €","80,00 €",clothing,"[29x32, 30x32, 30x34, 31x32, 31x34, 32x32, 32x...",boss-orange-slim-pantalon-classique-bo122e027-b12
LE222O03M-A11,,Levi's®,"[{'sku': 'LE222O03M-A11', 'url_key': 'levisr-h...","[{'key': 'discountRate', 'value': '-30%', 'tra...",False,[{'path': 'LE/22/2O/03/MA/11/LE222O03M-A11@4.j...,HOUSEMARK GRAPHIC TEE - T-shirt imprimé - white,,,False,False,False,False,"24,95 €","17,46 €",clothing,"[XS, S, M, L, XL, XXL, 3XL]",levisr-housemark-graphic-tee-t-shirt-imprime-l...
SO222Q0CC-Q11,,s.Oliver,"[{'sku': 'SO222Q0CC-Q11', 'url_key': 'soliver-...","[{'key': 'discountRate', 'value': '-35%', 'tra...",False,[{'path': 'SO/22/2Q/0C/CQ/11/SO222Q0CC-Q11@10....,LANGARM - Pullover - black,,,False,False,False,False,"29,99 €","19,49 €",clothing,"[M, L, XL, XXL]",soliver-langarm-pullover-black-so222q0cc-q11
LA222O02C-K11,,Lacoste,"[{'sku': 'LA222O02C-K11', 'url_key': 'lacoste-...","[{'key': 'discountRate', 'value': '-25%', 'tra...",False,[{'path': 'LA/22/2O/02/CK/11/LA222O02C-K11@7.j...,T-shirt imprimé - navy blue,,,False,False,False,False,"49,95 €","37,46 €",clothing,"[XS, S, M, L, XL, XXL, 3XL]",lacoste-t-shirt-imprime-navy-blue-la222o02c-k11


## Bonus

Extract the following information from the data:

* The trending brand.

* The product(s) with the highest discount.

* The sum of discounts of all goods (sum_discounted_prices divided by sum_original_prices).
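
Since the prices arrive as French-formatted strings such as `109,95 €`, they must be converted to numbers before any of these values can be computed. The sketch below shows one possible approach using only the standard library. The helper name `parse_price` is hypothetical, the three price pairs are copied from the first rows of the table above, and prices with thousands separators are not handled.

```python
import re


def parse_price(text):
    """Convert a French-formatted price such as '109,95 €' to a float."""
    match = re.search(r"\d+(?:,\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return float(match.group().replace(",", "."))


# (original, promotional) pairs taken from the first three rows of the table
prices = [("109,95 €", "55,00 €"), ("99,95 €", "50,00 €"), ("149,95 €", "90,00 €")]

originals = [parse_price(o) for o, _ in prices]
promos = [parse_price(p) for _, p in prices]

# Per-item discount rate: 1 - promotional / original
rates = [1 - p / o for o, p in zip(originals, promos)]
best = max(range(len(prices)), key=lambda i: rates[i])

# Overall ratio as defined in the task: sum of discounted / sum of original
overall_ratio = sum(promos) / sum(originals)
print(prices[best], round(overall_ratio, 3))
```

Note that the ratio of the sums is generally not the same as the sum (or mean) of the per-item ratios, so the two should not be used interchangeably.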

In [16]:
# your code here

# Trending Brand

df.brand_name.value_counts().index[0]

'Pier One'

In [18]:
# Products with the highest discounts

# Prices are strings such as '109,95 €': keep the number and use a decimal point
for col in ['price.original', 'price.promotional']:
    df[col] = df[col].str.extract(r'(\d+,\d+)', expand=False).str.replace(',', '.', regex=False)

In [19]:
df['discount_amount']=df['price.original'].astype(float)-df['price.promotional'].astype(float)
df1=df.copy()

In [20]:
# total discount amount per brand

total_disc = df1.groupby('brand_name')['discount_amount'].sum()

In [24]:
max_disc_brand = total_disc.sort_values(ascending=False).index[0]

print('Brand with the highest total discount: {}'.format(max_disc_brand))

Brand with the highest total discount: Nike Performance


In [55]:
discountSorted = total_disc.sort_values(ascending=False)
print('\nBrands ranked by total discount: \n')
print('\nBrand Name:\t\t\t\t\t Sum of Discounts:\n')
for k, v in discountSorted.items():
    print('{:<40}\t{:>20}'.format(k, round(v, 2)))


Brands ranked by total discount: 


Brand Name:					 Sum of Discounts:

Nike Performance                        	              4082.0
Puma                                    	             3762.41
Tommy Hilfiger                          	             3684.51
Pier One                                	             3627.02
adidas Performance                      	              3518.5
KIOMI                                   	             2886.54
Superdry                                	             2700.08
Jack & Jones                            	             2644.27
BOSS                                    	             2546.06
YOURTURN                                	             2532.28
G-Star                                  	             2213.53
Dreimaster                              	             2169.97
Nike Sportswear                         	             2166.04
Zign                                    	             1705.29
Bugatti                                 	             1640.56
Evi


In [67]:
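# Ratio defined in the task: sum of discounted prices / sum of original prices
sum_original = df['price.original'].astype(float).sum()
sum_discounted = df['price.promotional'].astype(float).sum()
print('The sum of discounts of all goods: {}'.format(round(sum_discounted / sum_original, 3)))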
