# Challenge: Promotions

In this challenge, you'll write code to parse and analyze data returned from another Zalando API endpoint, such as the one behind [Promos homme (Men's Promotions)](https://www.zalando.fr/promo-homme/) or [Promos femme (Women's Promotions)](https://www.zalando.fr/promo-femme/). The workflow is almost the same as in the guided lesson, but you'll work with different data.

## Obtaining the link

Write your code in the cell below to obtain the data from the API endpoint you choose. A recap of the workflow:

1. Examine the webpages and choose one that you want to work with.

1. Use Google Chrome's DevTools to inspect the XHR network requests and find the API endpoint that serves data to the webpage.

1. Test the API endpoint in the browser to verify its data.

1. Change the page offset in the API URL to check that pagination works.
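
As a rough illustration of the last step, the sketch below builds the URL for a given page by changing only the `offset` parameter. The endpoint and its query parameters are the ones used later in this notebook. The helper name `build_page_url` and the 1-based page numbering are assumptions made for the example.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.zalando.fr/api/catalog/articles"


def build_page_url(page, category="promo-homme", per_page=84, sort="sale"):
    """Return the catalog URL for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    params = {
        "categories": category,
        "limit": per_page,
        # Page 1 starts at offset 0, page 2 at offset 84, and so on
        "offset": (page - 1) * per_page,
        "sort": sort,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_page_url(1))
print(build_page_url(2))
```

Pasting two consecutive URLs into the browser and comparing the `current_page` value in each response is a quick way to confirm that the offset behaves as expected.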

In [None]:
# your code here


## Reading the data

In the next cell, use Python to obtain data from the API endpoint you chose in the previous step. Workflow:

1. Import libraries.

1. Define the initial API endpoint URL.

1. Make a request to obtain the data for the first page. Flatten the data and store it in a new variable.

1. Find the total page count in the first page's data.

1. Use a `for` loop to request the remaining pages, from 2 to the page count. Append the data from each additional page to the flattened data object.

1. Print and review the data you obtained.
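
The workflow above can be sketched without touching the network by passing in the function that fetches a page. This is only an outline under stated assumptions: `collect_articles`, `fetch_page`, and `fake_fetch` are hypothetical names, and the fake response mimics only the `pagination` and `articles` keys of the real payload.

```python
def collect_articles(fetch_page):
    """Collect the 'articles' of every page into one list.

    fetch_page(page) must return the decoded JSON dict for a 1-based page.
    """
    first = fetch_page(1)
    articles = list(first["articles"])
    # The first page tells us how many pages exist in total
    page_count = first["pagination"]["page_count"]
    for page in range(2, page_count + 1):
        articles.extend(fetch_page(page)["articles"])
    return articles


def fake_fetch(page):
    """Stand-in for the real API: three small pages of made-up SKUs."""
    pages = {1: ["A", "B"], 2: ["C", "D"], 3: ["E"]}
    return {
        "pagination": {"page_count": 3, "current_page": page, "per_page": 2},
        "articles": [{"sku": sku} for sku in pages[page]],
    }


articles = collect_articles(fake_fetch)
print(len(articles), "articles collected")
```

In the notebook, `fetch_page` would wrap the `requests.get(...).json()` call for the chosen endpoint, and the resulting list could be flattened in one go.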

In [1]:
# your code here
import json
import requests
import pandas as pd
# pandas >= 1.0 exposes json_normalize at the top level; the pandas.io.json path is deprecated
from pandas import json_normalize

In [2]:
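# offset=84 requests the second page (the response below reports current_page 2); use offset=0 for the first page
url = 'https://www.zalando.fr/api/catalog/articles?categories=promo-homme&limit=84&offset=84&sort=sale'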
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}


In [3]:
response = requests.get(url, headers=headers)
results = response.json()
results

{'total_count': 41754,
 'pagination': {'page_count': 498, 'current_page': 2, 'per_page': 84},
 'sort': 'sale',
 'articles': [{'sku': 'TOB22O03X-A11',
   'name': 'CONTRAST POCKET TEE - T-shirt imprimé - white',
   'price': {'original': '29,95\xa0€',
    'promotional': '24,00\xa0€',
    'has_different_prices': False,
    'has_different_original_prices': False,
    'has_different_promotional_prices': False,
    'has_discount_on_selected_sizes_only': False},
   'sizes': ['XS', 'M', 'L', 'XL', 'XXL'],
   'url_key': 'tommy-jeans-contrast-pocket-tee-t-shirt-imprime-white-tob22o03x-a11',
   'media': [{'path': 'TO/B2/2O/03/XA/11/TOB22O03X-A11@6.jpg',
     'role': 'DEFAULT',
     'packet_shot': False},
    {'path': 'TO/B2/2O/03/XA/11/TOB22O03X-A11@3.jpg',
     'role': 'HOVER',
     'packet_shot': False}],
   'brand_name': 'Tommy Jeans',
   'is_premium': False,
   'family_articles': [{'sku': 'TOB22O03X-A11',
     'url_key': 'tommy-jeans-contrast-pocket-tee-t-shirt-imprime-white-tob22o03x-a11',
  

In [4]:
# Flatten the data and store it in a new variable

flat_data = json_normalize(results)
flat_data


Unnamed: 0,articles,articlesToShow,breadcrumbs,carouselTeaser,categoryTree,collection,contentPositions.entry-point-teasers,contentPositions.in-cat-carousel,contentPositions.in-cat-carousel-fullwidth,contentPositions.in-cat-carousel-mobile,...,total_article_count,total_count,upperInCatTeaser,variants.fullWidthCatalog,variants.hideCategories,variants.mobileLightFilters,variants.myBrandsFilter,variants.outwardTeaserCard,variants.premiumCatalog,wishlist
0,"[{'sku': 'TOB22O03X-A11', 'name': 'CONTRAST PO...",84,"[{'items': [{'label': 'Homme', 'url_key': 'hom...",,"[{'label': 'Promotions', 'id': '9191', 'url_ke...",,"[7, 14, 20, 26]",9,8,6,...,41755,41754,,False,False,False,True,False,False,


In [5]:
flat_data = json_normalize(flat_data.articles[0])
flat_data

Unnamed: 0,brand_name,family_articles,flags,is_premium,media,name,outfits,price.has_different_original_prices,price.has_different_prices,price.has_different_promotional_prices,price.has_discount_on_selected_sizes_only,price.original,price.promotional,product_group,sizes,sku,url_key
0,Tommy Jeans,"[{'sku': 'TOB22O03X-A11', 'url_key': 'tommy-je...","[{'key': 'discountRate', 'value': '-20%', 'tra...",False,[{'path': 'TO/B2/2O/03/XA/11/TOB22O03X-A11@6.j...,CONTRAST POCKET TEE - T-shirt imprimé - white,,False,False,False,False,"29,95 €","24,00 €",clothing,"[XS, M, L, XL, XXL]",TOB22O03X-A11,tommy-jeans-contrast-pocket-tee-t-shirt-imprim...
1,le coq sportif,"[{'sku': 'LE112O020-K11', 'url_key': 'le-coq-s...","[{'key': 'discountRate', 'value': '-65%', 'tra...",False,[{'path': 'LE/11/2O/02/0K/11/LE112O020-K11@12....,ZEPP - Baskets basses - dress blue,,False,False,False,False,"94,95 €","32,95 €",shoe,"[39, 40, 41, 42, 43, 44, 45, 46]",LE112O020-K11,le-coq-sportif-zepp-baskets-basses-dress-blue-...
2,Puma,"[{'sku': 'PU115B00D-Q12', 'url_key': 'puma-icr...","[{'key': 'discountRate', 'value': '-20%', 'tra...",False,[{'path': 'PU/11/5B/00/DQ/12/PU115B00D-Q12@12....,ICRA TRAINER - Baskets basses - black/white,,False,False,False,False,"46,95 €","37,45 €",shoe,"[36, 37.5, 38, 38.5, 40, 40.5, 42, 42.5, 44, 4...",PU115B00D-Q12,puma-icra-trainer-baskets-basses-pu115b00d-q12
3,Mennace,"[{'sku': 'MEF22O01J-A11', 'url_key': 'mennace-...","[{'key': 'discountRate', 'value': '-20%', 'tra...",False,[{'path': 'ME/F2/2O/01/JA/11/MEF22O01J-A11@7.j...,SNAKE PRINT PANNELED TEE - T-shirt imprimé - w...,,False,False,False,False,"35,95 €","28,75 €",clothing,"[XS, S, M, XL]",MEF22O01J-A11,mennace-snake-print-panneled-tee-t-shirt-impri...
4,Only & Sons,"[{'sku': 'OS322P01Q-A11', 'url_key': 'only-and...","[{'key': 'discountRate', 'value': '-30%', 'tra...",False,[{'path': 'OS/32/2P/01/QA/11/OS322P01Q-A11@7.j...,ONSLUCAS - Polo - white,,False,False,False,False,"19,99 €","14,00 €",clothing,"[M, L, XL, XXL]",OS322P01Q-A11,only-and-sons-onslucas-polo-tee-polo-os322p01q...
5,Versace Jeans,"[{'sku': '1VJ12N003-Q11', 'url_key': 'versace-...","[{'key': 'discountRate', 'value': '-50%', 'tra...",False,[{'path': '1V/J1/2N/00/3Q/11/1VJ12N003-Q11@8.j...,Baskets montantes - black,,False,False,False,False,"174,95 €","87,00 €",shoe,"[39, 40, 41, 42, 45]",1VJ12N003-Q11,versace-jeans-baskets-montantes-black-1vj12n00...
6,Fila,"[{'sku': '1FI42D01G-C11', 'url_key': 'fila-flo...","[{'key': 'discountRate', 'value': '-70%', 'tra...",False,[{'path': '1F/I4/2D/01/GC/11/1FI42D01G-C11@4.j...,FLO - T-shirt de sport - ebony heater,,False,False,False,False,"59,95 €","17,95 €",clothing,"[S, M, L, XXL]",1FI42D01G-C11,fila-flo-polo-ebony-heater-1fi42d01g-c11
7,Kings Will Dream,"[{'sku': 'KIE22G00B-K11', 'url_key': 'kings-wi...","[{'key': 'discountRate', 'value': '-60%', 'tra...",False,[{'path': 'KI/E2/2G/00/BK/11/KIE22G00B-K11@6.j...,HAZARD - Jeans Skinny - indigo,,False,False,False,False,"54,95 €","22,00 €",clothing,"[28, 30, 32, 36]",KIE22G00B-K11,kings-will-dream-hazard-jeans-skinny-kie22g00b...
8,Nike SB,"[{'sku': 'NS422S01H-C11', 'url_key': 'nike-sb-...","[{'key': 'discountRate', 'value': '-50%', 'tra...",False,[{'path': 'NS/42/2S/01/HC/11/NS422S01H-C11@9.j...,CREW ICON - Sweatshirt - dark grey heather/black,,False,False,False,False,"54,95 €","27,48 €",clothing,"[XS, S, M, L, XL]",NS422S01H-C11,nike-sb-crew-icon-sweatshirt-dark-grey-heather...
9,YOURTURN,"[{'sku': 'YO122E00V-B11', 'url_key': 'your-tur...","[{'key': 'discountRate', 'value': '-25%', 'tra...",False,[{'path': 'YO/12/2E/00/VB/11/YO122E00V-B11@17....,Pantalon cargo - camel,,False,False,False,False,"34,99 €","26,24 €",clothing,"[28, 30, 31, 32, 33, 34, 36]",YO122E00V-B11,your-turn-pantalon-cargo-camel-yo122e00v-b11


In [6]:
# Find the total page count in the first page's data

page_count = results['pagination']['page_count']
page_count

498

In [10]:
# Use a for loop to request every page and combine the flattened results.
# Starting at offset 0 covers the first page too, so nothing needs to be
# appended to the single-page flat_data from the earlier cells.

# Collect one flattened DataFrame per page, then concatenate them once at the end
frames = []

for page in range(page_count):
    # The offset advances by 84 articles (one page) per iteration
    url = 'https://www.zalando.fr/api/catalog/articles?categories=promo-homme&limit=84&offset=' + str(page * 84) + '&sort=sale'
    r = requests.get(url, headers=headers)
    result = r.json()
    # Flatten the list of articles directly (same result as the two-step flatten above)
    frames.append(json_normalize(result['articles']))

# DataFrame.append was removed in pandas 2.0; pd.concat builds the combined frame
data = pd.concat(frames)

In [11]:
data.shape

(41796, 22)

## Bonus

Extract the following information from the data:

* The trending brand.

* The product(s) with the highest discount.

* The overall discount ratio for all goods (`sum_discounted_prices` divided by `sum_original_prices`).
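
One possible route to the highest discount is the `flags` column, whose entries include items such as `{'key': 'discountRate', 'value': '-20%', ...}` in the output above. The sketch below is only an illustration: `discount_rate` is a hypothetical helper, the sample rows copy just the `key` and `value` fields of three articles shown earlier, and it assumes every rate is written as an ASCII minus sign, an integer, and a percent sign.

```python
def discount_rate(flags):
    """Return the discount as a positive integer percentage, or None."""
    for flag in flags or []:
        if flag.get("key") == "discountRate":
            # '-20%' -> 20 (assumes the '-NN%' format seen in the sample output)
            return int(flag["value"].strip().lstrip("-").rstrip("%"))
    return None


articles = [
    {"name": "CONTRAST POCKET TEE - T-shirt imprimé - white",
     "flags": [{"key": "discountRate", "value": "-20%"}]},
    {"name": "ZEPP - Baskets basses - dress blue",
     "flags": [{"key": "discountRate", "value": "-65%"}]},
    {"name": "FLO - T-shirt de sport - ebony heater",
     "flags": [{"key": "discountRate", "value": "-70%"}]},
    # Made-up row to show that articles without the flag are skipped
    {"name": "hypothetical article without a discount flag", "flags": []},
]

rates = [(discount_rate(a["flags"]), a["name"]) for a in articles]
flagged = [(rate, name) for rate, name in rates if rate is not None]
top_rate = max(rate for rate, _ in flagged)
# Keep every product that ties for the highest rate
top_products = [name for rate, name in flagged if rate == top_rate]
print(top_rate, top_products)
```

Collecting all products that share the top rate, rather than a single maximum, matches the "product(s)" wording of the task.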

In [12]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 41796 entries, 0 to 47
Data columns (total 22 columns):
amount                                          600 non-null object
brand_name                                      41796 non-null object
family_articles                                 41796 non-null object
flags                                           41796 non-null object
is_premium                                      41796 non-null bool
media                                           41796 non-null object
name                                            41796 non-null object
outfits                                         964 non-null object
price.base_price                                52 non-null object
price.has_different_original_prices             41796 non-null bool
price.has_different_prices                      41796 non-null bool
price.has_different_promotional_prices          41796 non-null bool
price.has_discount_on_selected_sizes_only       41796 non-null bool
p

In [15]:
# your code here
# Find the trending brand, taken here as the brand with the most articles on promotion

data['brand_name'].value_counts().head(1)

YOURTURN    1172
Name: brand_name, dtype: int64

In [16]:
# The product(s) with the highest discount



ValueError: could not convert string to float: '1\xa0189.95\xa0'
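
The error above suggests that a non-breaking space (`\xa0`) used as a thousands separator was still in the string when it reached `float()`. A minimal sketch of one way around this, assuming the raw prices keep the `'29,95\xa0€'` format shown earlier: strip everything except digits and the decimal comma before converting. `parse_price` is a hypothetical helper, and the sample rows reuse three price pairs from the output above.

```python
import re


def parse_price(text):
    """Convert a raw French-formatted price string to a float."""
    # Keep only digits and the decimal comma; this drops the euro sign and
    # any (non-breaking) spaces used as thousands separators.
    digits = re.sub(r"[^\d,]", "", text)
    return float(digits.replace(",", "."))


rows = [
    {"name": "CONTRAST POCKET TEE - T-shirt imprimé - white",
     "original": "29,95\xa0€", "promotional": "24,00\xa0€"},
    {"name": "ZEPP - Baskets basses - dress blue",
     "original": "94,95\xa0€", "promotional": "32,95\xa0€"},
    {"name": "FLO - T-shirt de sport - ebony heater",
     "original": "59,95\xa0€", "promotional": "17,95\xa0€"},
]

for row in rows:
    row["original_eur"] = parse_price(row["original"])
    row["promotional_eur"] = parse_price(row["promotional"])
    # Fraction of the original price that was taken off
    row["discount"] = 1 - row["promotional_eur"] / row["original_eur"]

best = max(rows, key=lambda row: row["discount"])

# Overall ratio as defined in the task: discounted total over original total
overall_ratio = (sum(row["promotional_eur"] for row in rows)
                 / sum(row["original_eur"] for row in rows))

print(best["name"], round(best["discount"], 3))
print(round(overall_ratio, 3))
```

On the full DataFrame, the same helper could be applied with `data['price.original'].map(parse_price)` and `data['price.promotional'].map(parse_price)`. It expects the raw strings, since a price whose comma has already been replaced by a dot would be misread.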