# Consuming data from an API for data analysis

#### In this project I use an API that exposes a variety of makeup products. By calling it with the GET method I retrieved the product list and built a DataFrame. From this DataFrame I kept the fields I considered interesting for a few analyses.

## Importing libraries

In [1]:
import pandas as pd
import requests
import json

## Requesting the product list from the API

In [2]:
# Calling the get method of the requests library
request = requests.get("https://makeup-api.herokuapp.com/api/v1/products.json")
retorno = json.loads(request.content)
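
The call above assumes the request succeeds and that the body is a JSON array. A minimal sketch of a more defensive version follows. The names `parse_products` and `fetch_products` and the 30-second timeout are hypothetical choices of mine, not part of the original notebook, and `FakeResponse` exists only so the example runs offline.

```python
import json


def parse_products(response):
    """Check the HTTP status, then decode the JSON body into a list of dicts."""
    response.raise_for_status()   # requests raises HTTPError on 4xx/5xx
    products = response.json()    # decodes the body as JSON
    if not isinstance(products, list):
        raise ValueError("expected a JSON array of products")
    return products


def fetch_products(url, timeout=30):
    """Hypothetical helper: GET the URL and return the decoded product list."""
    import requests  # imported here so the offline demo below does not need it
    return parse_products(requests.get(url, timeout=timeout))


class FakeResponse:
    """Stand-in for requests.Response, used only for this offline demo."""

    def __init__(self, body, status_code=200):
        self.content = body
        self.status_code = status_code

    def raise_for_status(self):
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")

    def json(self):
        return json.loads(self.content)


# Toy payload shaped like one API record (values are illustrative).
sample = FakeResponse(b'[{"id": 1048, "brand": "colourpop", "price": "5.0"}]')
products = parse_products(sample)
print(products[0]["brand"])
```

With a real connection, `fetch_products("https://makeup-api.herokuapp.com/api/v1/products.json")` would play the role of the two lines in the cell above.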

In [20]:
# Storing the response in a DataFrame
data = pd.DataFrame(retorno)

## Transforming the DataFrame

#### Here I change the original structure of the DataFrame so that I can run my analyses.

In [21]:
# One column holds a list of values; here I join each list into a single string, dropping duplicates while keeping their order.
data['tag_list'] = data['tag_list'].apply(lambda x: ', '.join(dict.fromkeys(x).keys()))
data.head()

Unnamed: 0,id,brand,name,price,price_sign,currency,image_link,product_link,website_link,description,rating,category,product_type,tag_list,created_at,updated_at,product_api_url,api_featured_image,product_colors
0,1048,colourpop,Lippie Pencil,5.0,$,CAD,https://cdn.shopify.com/s/files/1/1338/0845/co...,https://colourpop.com/collections/lippie-pencil,https://colourpop.com,Lippie Pencil A long-wearing and high-intensit...,,pencil,lip_liner,"cruelty free, Vegan",2018-07-08T23:45:08.056Z,2018-07-09T00:53:23.301Z,https://makeup-api.herokuapp.com/api/v1/produc...,//s3.amazonaws.com/donovanbailey/products/api_...,"[{'hex_value': '#B28378', 'colour_name': 'BFF ..."
1,1047,colourpop,Blotted Lip,5.5,$,CAD,https://cdn.shopify.com/s/files/1/1338/0845/pr...,https://colourpop.com/collections/lippie-stix?...,https://colourpop.com,Blotted Lip Sheer matte lipstick that creates ...,,lipstick,lipstick,"cruelty free, Vegan",2018-07-08T22:01:20.178Z,2018-07-09T00:53:23.287Z,https://makeup-api.herokuapp.com/api/v1/produc...,//s3.amazonaws.com/donovanbailey/products/api_...,"[{'hex_value': '#b72227', 'colour_name': 'Bee'..."
2,1046,colourpop,Lippie Stix,5.5,$,CAD,https://cdn.shopify.com/s/files/1/1338/0845/co...,https://colourpop.com/collections/lippie-stix,https://colourpop.com,"Lippie Stix Formula contains Vitamin E, Mango,...",,lipstick,lipstick,"cruelty free, Vegan",2018-07-08T21:47:49.858Z,2018-07-09T00:53:23.274Z,https://makeup-api.herokuapp.com/api/v1/produc...,//s3.amazonaws.com/donovanbailey/products/api_...,"[{'hex_value': '#F2DEC3', 'colour_name': 'Fair..."
3,1045,colourpop,No Filter Foundation,12.0,$,CAD,https://cdn.shopify.com/s/files/1/1338/0845/pr...,https://colourpop.com/products/no-filter-matte...,https://colourpop.com/products/no-filter-matte...,"Developed for the Selfie Age, our buildable fu...",,liquid,foundation,"cruelty free, Vegan",2018-07-08T18:22:25.273Z,2018-07-09T00:53:23.313Z,https://makeup-api.herokuapp.com/api/v1/produc...,//s3.amazonaws.com/donovanbailey/products/api_...,"[{'hex_value': '#F2DEC3', 'colour_name': 'Fair..."
4,1044,boosh,Lipstick,26.0,$,CAD,https://cdn.shopify.com/s/files/1/1016/3243/pr...,https://www.boosh.ca/collections/all,https://www.boosh.ca/,All of our products are free from lead and hea...,,lipstick,lipstick,"Chemical Free, Organic",2018-07-08T17:32:28.088Z,2018-09-02T22:52:06.669Z,https://makeup-api.herokuapp.com/api/v1/produc...,//s3.amazonaws.com/donovanbailey/products/api_...,"[{'hex_value': '#CB4975', 'colour_name': 'Babs..."
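
The `dict.fromkeys` idiom used on `tag_list` can be shown in isolation. This is a small sketch; `join_unique` is a hypothetical helper name introduced only for illustration.

```python
def join_unique(tags):
    """Join tags with ', ' after dropping duplicates, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return ", ".join(dict.fromkeys(tags))


print(join_unique(["cruelty free", "Vegan", "cruelty free"]))  # cruelty free, Vegan
print(repr(join_unique([])))                                   # ''
```

An empty list becomes an empty string rather than a missing value, which is why `tag_list` shows no nulls in `data.info()` and needs its own treatment later in the notebook.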


In [22]:
# Dropping columns whose data I do not need to analyze.
data.drop(columns=['price_sign', 'currency', 'image_link', 'product_link', 'website_link', 'rating', 'product_api_url', 'api_featured_image', 'product_colors'], inplace=True)

In [6]:
# Inspecting data types and null/empty values.
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 931 entries, 0 to 930
Data columns (total 10 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   id            931 non-null    int64 
 1   brand         919 non-null    object
 2   name          931 non-null    object
 3   price         917 non-null    object
 4   description   930 non-null    object
 5   category      517 non-null    object
 6   product_type  931 non-null    object
 7   tag_list      931 non-null    object
 8   created_at    931 non-null    object
 9   updated_at    931 non-null    object
dtypes: int64(1), object(9)
memory usage: 72.9+ KB
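
The `info()` output reports non-null counts, so the number of missing values has to be derived by subtraction: for `category`, 931 - 517 = 414. As a sketch on toy data (the frame below is illustrative, not taken from the API), `isna().sum()` gives the missing counts directly:

```python
import pandas as pd

# Toy frame with the same kind of gaps as the real data.
toy = pd.DataFrame({
    "brand": ["colourpop", None, "boosh"],
    "price": ["5.0", "26.0", None],
    "category": [None, None, "lipstick"],
})

# Count missing values per column, largest first.
missing = toy.isna().sum().sort_values(ascending=False)
print(missing)
```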


In [23]:
# Handling null/empty values: missing text fields get the placeholder "indefinido" (undefined).
data['brand'] = data['brand'].fillna("indefinido")
data['category'] = data['category'].fillna("indefinido")
data['description'] = data['description'].fillna("indefinido")


In [30]:
data.loc[data['tag_list'] == "", 'tag_list'] = 'indefinido'
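
The two steps above (filling nulls and replacing empty strings) can also be done together. A minimal sketch on toy data, assuming the same `"indefinido"` placeholder; `text_columns` is a name I introduce for illustration:

```python
import pandas as pd

toy = pd.DataFrame({
    "brand": ["colourpop", None],
    "tag_list": ["Vegan", ""],
})

text_columns = ["brand", "tag_list"]
# Fill real nulls first, then treat empty strings the same way.
toy[text_columns] = (
    toy[text_columns].fillna("indefinido").replace("", "indefinido")
)
print(toy)
```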

In [24]:
# Converting the price column, which holds numeric values stored as text, to a numeric dtype.
data['price'] = pd.to_numeric(data['price'])
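
By default `pd.to_numeric` raises an error if a value cannot be parsed. As a hedged sketch, assuming unparseable prices should be treated as missing rather than stop the notebook, `errors="coerce"` turns them into `NaN`. The sample values are invented for the example:

```python
import pandas as pd

raw = pd.Series(["5.0", "12", None, "n/a"])

# Unparseable entries ("n/a") and None both become NaN.
prices = pd.to_numeric(raw, errors="coerce")
print(prices.dtype)         # float64
print(prices.isna().sum())  # 2
```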

In [25]:
# Handling null/empty values: missing prices are replaced with the mean price.
media = data['price'].mean()
data['price'] = data['price'].fillna(media)
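
A small worked example of what mean imputation does, on invented numbers: the mean is computed over the known values only, filling with it leaves the mean unchanged, and the spread of the column shrinks.

```python
import pandas as pd

prices = pd.Series([10.0, None, 20.0, 30.0])

mean_price = prices.mean()          # NaN is skipped: (10 + 20 + 30) / 3 = 20.0
filled = prices.fillna(mean_price)

print(filled.tolist())              # [10.0, 20.0, 20.0, 30.0]
print(filled.mean() == mean_price)  # True
print(filled.std() < prices.std())  # True: imputed values sit exactly on the mean
```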

In [10]:
# Result after handling null/empty values.
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 931 entries, 0 to 930
Data columns (total 10 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   id            931 non-null    int64  
 1   brand         931 non-null    object 
 2   name          931 non-null    object 
 3   price         931 non-null    float64
 4   description   931 non-null    object 
 5   category      931 non-null    object 
 6   product_type  931 non-null    object 
 7   tag_list      931 non-null    object 
 8   created_at    931 non-null    object 
 9   updated_at    931 non-null    object 
dtypes: float64(1), int64(1), object(8)
memory usage: 72.9+ KB


## Starting the analysis

#### Here I begin with a few analyses of the data

In [11]:
# Average product price per product type and brand.
media_produto_marca = data[['product_type', 'brand', 'price']].groupby(['product_type', 'brand']).mean()
display(media_produto_marca)

Unnamed: 0_level_0,Unnamed: 1_level_0,price
product_type,brand,Unnamed: 2_level_1
blush,almay,14.490000
blush,anna sui,27.000000
blush,annabelle,7.990000
blush,cargo cosmetics,29.000000
blush,clinique,23.051074
...,...,...
nail_polish,salon perfect,6.990000
nail_polish,sante,19.290000
nail_polish,sinful colours,2.990000
nail_polish,suncoat,15.640000
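
A mean taken over a single product can look as authoritative as one taken over dozens. As a sketch on toy data (not the API results), aggregating the count alongside the mean makes that visible:

```python
import pandas as pd

toy = pd.DataFrame({
    "product_type": ["blush", "blush", "blush", "nail_polish"],
    "brand": ["almay", "almay", "clinique", "sante"],
    "price": [14.0, 15.0, 23.0, 19.0],
})

# One row per (product_type, brand) with the mean price and how many products back it.
summary = (
    toy.groupby(["product_type", "brand"])["price"]
       .agg(["mean", "count"])
       .reset_index()
)
print(summary)
```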


In [34]:
# Number of products per tag combination, category, and product type
# E.g.: product_type = eyeshadow -> tags = Vegan, Organic -> category = cream
caracteristica_produto = data[['product_type', 'category','tag_list', 'id']].groupby(['tag_list','category','product_type']).count()
display(caracteristica_produto)

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,id
tag_list,category,product_type,Unnamed: 3_level_1
Canadian,cream,blush,1
Canadian,cream,foundation,2
Canadian,indefinido,bronzer,2
Canadian,indefinido,eyeshadow,2
Canadian,indefinido,lip_liner,1
...,...,...,...
"purpicks, USDA Organic, Organic",,mascara,1
"purpicks, USDA Organic, Organic",cream,blush,1
"purpicks, USDA Organic, Organic",palette,eyeshadow,5
"purpicks, Vegan, Organic",cream,eyeshadow,1

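Because `tag_list` was joined into one string, the grouping above counts each combination of tags as its own key. If a count per individual tag is wanted instead, one possible approach is to split the string back and explode it. This is a sketch on toy data, and the column name `tag` is my own:

```python
import pandas as pd

toy = pd.DataFrame({
    "id": [1, 2, 3],
    "tag_list": ["cruelty free, Vegan", "Vegan", "indefinido"],
})

# Split the joined string back into a list, then emit one row per tag.
per_tag = (
    toy.assign(tag=toy["tag_list"].str.split(", "))
       .explode("tag")
       .groupby("tag")["id"]
       .count()
)
print(per_tag)
```

Here a product carrying two tags is counted once under each of them, so the totals no longer sum to the number of products.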

In [13]:
# Number of products per product type and category.
produto_categoria = data[['product_type', 'category', 'id']].groupby(['product_type', 'category']).count()
display(produto_categoria)

Unnamed: 0_level_0,Unnamed: 1_level_0,id
product_type,category,Unnamed: 2_level_1
blush,cream,12
blush,indefinido,25
blush,powder,41
bronzer,indefinido,67
bronzer,powder,2
eyebrow,indefinido,48
eyebrow,pencil,1
eyeliner,cream,3
eyeliner,gel,3
eyeliner,indefinido,15
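
The same counts can be laid out as a grid, with product types as rows and categories as columns, which is easier to scan than a long two-level index. A minimal sketch on toy data:

```python
import pandas as pd

toy = pd.DataFrame({
    "product_type": ["blush", "blush", "blush", "bronzer"],
    "category": ["cream", "powder", "powder", "powder"],
})

# size() counts rows per group; unstack() pivots category into columns.
counts = (
    toy.groupby(["product_type", "category"])
       .size()
       .unstack(fill_value=0)
)
print(counts)
```

Combinations that never occur show up as 0 instead of being absent.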


In [14]:
# Average product price per product type and category.
media_produto_categoria = data[['product_type', 'category', 'price']].groupby(['product_type', 'category']).mean()
display(media_produto_categoria)

Unnamed: 0_level_0,Unnamed: 1_level_0,price
product_type,category,Unnamed: 2_level_1
blush,cream,20.12
blush,indefinido,19.136344
blush,powder,15.048258
bronzer,indefinido,23.417612
bronzer,powder,0.0
eyebrow,indefinido,21.307471
eyebrow,pencil,0.0
eyeliner,cream,9.666667
eyeliner,gel,6.333333
eyeliner,indefinido,20.1
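
Two groups above (bronzer/powder and eyebrow/pencil) have a mean price of 0.0, which suggests the data contain prices of 0. The source does not say whether 0 means "free" or "unknown". The sketch below assumes it means unknown and leaves those rows out of the average; the numbers are invented for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "product_type": ["bronzer", "bronzer", "blush", "blush"],
    "category": ["powder", "powder", "cream", "cream"],
    "price": [0.0, 0.0, 20.0, 0.0],
})

# Assumption: a price of 0 is a placeholder, so only positive prices are averaged.
known = toy[toy["price"] > 0]
mean_known = known.groupby(["product_type", "category"])["price"].mean()
print(mean_known)
```

Groups with no known price drop out of the result entirely instead of reporting a misleading 0.0.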
