# Mercado Libre API (ML_api)

The goal is to build a dataset of real-estate listings published on the Mercado Libre site.
Classified-ad data is public and details prices, sale/rental conditions, features, and location (zoning) of each property.
The listings published by Mercado Libre (ML) at http://www.mercadolibre.com.ar are entered by owners or real-estate agents and are freely browsable.


For more detail, see the official ML site.

In [1]:
# Required imports
import requests
from bs4 import BeautifulSoup
import re
from lxml import etree
import pandas as pd
import json
from pprint import pprint

### Mercado Libre API: _properties_.

We will send GET requests to Mercado Libre's public API.

A search can be run by query (global search) or by category (restricted to one category), among the other options listed here:

Performs a search by query

    GET
        /sites/MLA/search?q=

Performs a search by category

    GET
        /sites/MLA/search?category=

Performs a search by seller_id

    GET
        /sites/MLA/search?seller_id=

Performs a search by nickname

    GET
        /sites/MLA/search?nickname=

Performs a search by special filters

    GET
        /sites/MLA/search?special_filter=

Performs a search with a specific sorting method

    GET
        /sites/MLA/search?q=ipod&sort=sortId

Performs a search applying filters

    GET
        /sites/MLA/search?q=ipod&FilterID=FilterValue

The query string takes the following parameters (the `{code}` placeholders), which we can change.

We will use the following method, but we will also look at how other searches behave.

url:

    https://api.mercadolibre.com/sites/MLA/search?category={code}&_PublishedToday_{code}_&limit={code}&offset={code}

Category parameters:

    **search?category**
    Property types (without the suffix after `_`, the category applies to every operation: 'venta', 'alquiler', etc.)
    casas = 'MLA1466' #_242075_242060'
    campos = 'MLA1496_242059'
    cocheras = 'MLA50541_267198'
    departamentos = 'MLA1472_242062'
    depositos_galpones = 'MLA1475_245003'
    locales = 'MLA79242_242065'
    oficinas = 'MLA50538_242067'
    phs = 'MLA105179_242069'
    quintas = 'MLA50547_242070'
    terrenos_lotes = 'MLA1493_242071'

Publication time:

    **_PublishedToday_** 
    ('YES', 'NO')

Pagination:

    **limit**
    max = 50

    **offset**
    index of the first record to return
    0, limit, 2*limit, ...

Operation codes:

    venta (sale) = 242075, alquiler (rent) = 242073, alquiler_temporal (temporary rent) = 242074
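
The pagination parameters above are enough to plan the requests for a full crawl once the total number of matching records is known. A minimal sketch, assuming `offset` is the index of the first record returned (consistent with the `offset * 50` stepping used in a later cell); `page_offsets` and its `max_results` cap are hypothetical helpers, not part of the API:

```python
def page_offsets(total, limit=50, max_results=None):
    """Yield the offset of each page needed to cover `total` records."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Optionally cap the crawl, e.g. if very deep offsets turn out to be refused.
    upper = total if max_results is None else min(total, max_results)
    yield from range(0, upper, limit)


# Example with a small, made-up total.
print(list(page_offsets(total=120, limit=50)))  # [0, 50, 100]
```

Each yielded value can be plugged into the `offset={code}` slot of the URL template.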

# Property types of interest.

In [2]:
# There are more; these are the ones of interest, by category id
property_type = {
    'departamentos': 'MLA1472',
    'casas': 'MLA1466',
    'terrenos_lotes': 'MLA1493',
    'phs': 'MLA105179',
    'locales': 'MLA79242',
    'oficinas': 'MLA50538',
    'depositos_galpones': 'MLA1475',
    'cocheras': 'MLA50541',
    'campos': 'MLA1496',
    'quintas': 'MLA50547',
}

# Structure of the API response

### Data response:

    {
      "site_id": "MLA",
      "country_default_time_zone": "GMT-03:00",
      "paging": {},
      "results": [],
      "sort": {},
      "available_sorts": [],
      "filters": [],
      "available_filters": []
    }

Site and time zone reference.

    "site_id": "MLA"
    "country_default_time_zone": "GMT-03:00"

Search and paging information for the request.

    "paging": {
        "total": 219007,            # Total records matching the search.
        "primary_results": 1000,    # Relevant records?
        "offset": 0,                # Start of the result block; `limit` records are returned from here
        "limit": 50,                # Records returned per request (max 50)
    }

The data we need.

    "results": []                   # len("results") == limit, unless it is the last page.

# Test inspection

We request only a few records (`limit = 3`) to learn how the information is structured.

In [3]:
# Test
#: params
today = 'YES'
# Minimal parameters for inspecting the structure.
limit = 3
offs = 0

In [5]:
# url = "https://api.mercadolibre.com/sites/MLA/search?q=Departamentos&date_from=2021-01-01&date_to=2022-02-01&limit=3"
# r = requests.get(url)
# r.status_code

In [4]:
#'https://api.mercadolibre.com/sites/MLA/search?category=¿?&_PublishedToday_¿?limit=¿?&offset=¿?' # f'search?q=Departamentos'\
api = f'https://api.mercadolibre.com/sites/MLA/'\
        f'search?category={property_type["departamentos"]}'\
        f'&_PublishedToday_{today}'\
        f'&limit={limit}'\
        f'&offset={offs}'

r = requests.get(api)
r.status_code

200
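
Checking `r.status_code` by eye works in a notebook, but a scripted crawl needs failures to surface on their own. A minimal sketch, assuming a hypothetical helper named `fetch_json`; the `timeout=10` value is arbitrary, and the injectable `get` argument exists only so the function can be exercised without network access:

```python
import requests


def fetch_json(url, get=requests.get, timeout=10):
    """GET `url` and return the decoded JSON body, raising on HTTP errors."""
    response = get(url, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx answers into exceptions
    return response.json()
```

With this helper, `data = fetch_json(api)` would replace the separate `requests.get` and `json.loads` steps.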

## Looking at the response

We want to see how the data is structured so that we record only the fields that may be of interest.

In [6]:
data = json.loads(r.content)

type(data), len(data), data.keys()

(dict,
 8,
 dict_keys(['site_id', 'country_default_time_zone', 'paging', 'results', 'sort', 'available_sorts', 'filters', 'available_filters']))

The JSON response contains 8 entries.
Let's see what information each of them carries.

In [7]:
for i, key in enumerate(data):
    # results and available_filters: inspected separately.
    if key != 'results' and key != 'available_filters':
        print(i ,'-> key: ',key)
        pprint(data[key])
        print(40*'-')


0 -> key:  site_id
'MLA'
----------------------------------------
1 -> key:  country_default_time_zone
'GMT-03:00'
----------------------------------------
2 -> key:  paging
{'limit': 3, 'offset': 0, 'primary_results': 1000, 'total': 299136}
----------------------------------------
4 -> key:  sort
{'id': 'relevance', 'name': 'Más relevantes'}
----------------------------------------
5 -> key:  available_sorts
[{'id': 'price_asc', 'name': 'Menor precio'},
 {'id': 'price_desc', 'name': 'Mayor precio'}]
----------------------------------------
6 -> key:  filters
[{'id': 'category',
  'name': 'Categorías',
  'type': 'text',
  'values': [{'id': 'MLA1472',
              'name': 'Departamentos',
              'path_from_root': [{'id': 'MLA1459', 'name': 'Inmuebles'},
                                 {'id': 'MLA1472', 'name': 'Departamentos'}]}]}]
----------------------------------------


## 'available_filters'

In [8]:
type(data['available_filters']), len(data['available_filters'])

(list, 32)

`available_filters` contains 32 entries.

Each entry is a dictionary with 4 keys. Together they list the filters that can be applied to the search (compare `filters`, which holds the ones already in effect).

    ['id', 'name', 'type', 'values']


In [9]:
# Inspect 'id', 'name', 'type', 'values'.
for i, ty in enumerate (data['available_filters']):
    print(f'{i} -> id = {ty["id"]}:\n{ty["name"]}[{ty["type"]}]: {ty["values"]}', end="\n")
    print(50*'-')

0 -> id = official_store:
Tiendas oficiales[text]: [{'id': 'all', 'name': 'Todas las tiendas oficiales', 'results': 38387}, {'id': '2973', 'name': 'Tormes Propiedades', 'results': 90}, {'id': '2622', 'name': 'Estudio Yacoub', 'results': 2888}, {'id': '2695', 'name': 'Goldstein Propiedades', 'results': 662}, {'id': '2743', 'name': 'Sistema Coldwell Banker', 'results': 1921}, {'id': '2636', 'name': 'Toribio Achaval', 'results': 1381}, {'id': '2980', 'name': 'KRELL BROKERS', 'results': 361}, {'id': '3035', 'name': 'DUIT Propiedades', 'results': 241}, {'id': '2900', 'name': 'Oppel Inmobiliaria', 'results': 121}, {'id': '2969', 'name': 'Turdo Estudio Inmobiliario', 'results': 1}]
--------------------------------------------------
1 -> id = state:
Ubicación[text]: [{'id': 'TUxBUENBUGw3M2E1', 'name': 'Capital Federal', 'results': 90030}, {'id': 'TUxBUEdSQWU4ZDkz', 'name': 'Bs.As. G.B.A. Norte', 'results': 36001}, {'id': 'TUxBUENPU2ExMmFkMw', 'name': 'Bs.As. Costa Atlántica', 'results': 32511}
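
Since every filter value is addressed by an opaque id, a name-to-id lookup makes the listing above easier to use. A minimal sketch, assuming a hypothetical helper `filter_lookup`; the sample is a trimmed copy of the `state` entry printed above:

```python
def filter_lookup(available_filters):
    """Map each filter id to a {value name: value id} dictionary."""
    return {
        f["id"]: {v["name"]: v["id"] for v in f.get("values", [])}
        for f in available_filters
    }


# Trimmed sample copied from the output above.
sample = [
    {"id": "state", "name": "Ubicación", "type": "text",
     "values": [
         {"id": "TUxBUENBUGw3M2E1", "name": "Capital Federal", "results": 90030},
         {"id": "TUxBUEdSQWU4ZDkz", "name": "Bs.As. G.B.A. Norte", "results": 36001},
     ]},
]
lookup = filter_lookup(sample)
print(lookup["state"]["Capital Federal"])  # TUxBUENBUGw3M2E1
```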

#### Summary of _available_filters_

To narrow a request to the ML API, several filters can be chained using
* Search group:
    /search?q=id or /search?category=id 
* Chained filters:
    &FilterID=FilterValue&FilterID=FilterValue  

Example:
Base URI:
'https://api.mercadolibre.com/sites/MLA/'

| Search | q_id_filter | category_id_filter |
| --- | --- | --- |
| property type = Departamentos (apartments) | /search?q=Departamentos | /search?category=MLA1472 |

URI:
'https://api.mercadolibre.com/sites/MLA/search?category=MLA1472'

| Filter | filter_key | value | syntax |
| --- | --- | --- | --- |
| Age = 6 to 25 years | PROPERTY_AGE | [6años-25años] | &PROPERTY_AGE=[6años-25años] |
| Rooms = 3 and 4 | ROOMS | [3-4] | &ROOMS=[3-4] |
| Location = Capital Federal | state | TUxBUENBUGw3M2E1 | &state=TUxBUENBUGw3M2E1 |


Resulting URI:
'https://api.mercadolibre.com/sites/MLA/search?category=MLA1472&PROPERTY_AGE=[6años-25años]&ROOMS=[3-4]&state=TUxBUENBUGw3M2E1'
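
Filter values such as `[6años-25años]` contain brackets and a non-ASCII character, so it is safer to let a library build the query string than to concatenate it by hand. A minimal sketch using the standard library; `search_url` is a hypothetical helper, and it assumes the server decodes percent-encoded values in the usual way:

```python
from urllib.parse import urlencode

BASE = "https://api.mercadolibre.com/sites/MLA/search"


def search_url(category, **filters):
    """Build a category search URL with extra FilterID=FilterValue pairs."""
    params = {"category": category, **filters}
    return f"{BASE}?{urlencode(params)}"


url = search_url(
    "MLA1472",
    PROPERTY_AGE="[6años-25años]",
    ROOMS="[3-4]",
    state="TUxBUENBUGw3M2E1",
)
print(url)
```

The printed URL carries the same three filters as the example, with `[`, `]` and `ñ` percent-encoded.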


## Inspecting 'results'

In [10]:
# # results
len(data['results']), type(data['results'])

(3, list)

It is a list with 3 items, one per record requested.
Let's look at the first one: it is a dict with 41 keys.

In [11]:
# results: length, type and contents.
len(data['results'][0]), type(data['results'][0]), data['results'][0].keys()

(41,
 dict,
 dict_keys(['id', 'site_id', 'title', 'seller', 'price', 'prices', 'sale_price', 'currency_id', 'available_quantity', 'sold_quantity', 'buying_mode', 'listing_type_id', 'stop_time', 'condition', 'permalink', 'thumbnail', 'thumbnail_id', 'accepts_mercadopago', 'installments', 'address', 'promotions', 'shipping', 'seller_address', 'seller_contact', 'location', 'attributes', 'original_price', 'category_id', 'official_store_id', 'domain_id', 'catalog_product_id', 'tags', 'order_backend', 'use_thumbnail_id', 'offer_score', 'offer_share', 'match_score', 'winner_item_id', 'melicoin', 'discounts', 'inventory_id']))

Contents of results[0].

In [12]:
# results
for i, k in enumerate (data['results'][0]):
    if k != 'attributes':
        print(f'{i} -> id = {k}:\nValue = {data["results"][0][k]}')#: {ty["title"]}', end="\n")
        print(50*'-')

0 -> id = id:
Value = MLA1218638438
--------------------------------------------------
1 -> id = site_id:
Value = MLA
--------------------------------------------------
2 -> id = title:
Value = Departamento. Semipiso. 4 Ambientes. Balcon Corrido
--------------------------------------------------
3 -> id = seller:
Value = {'id': 169745983, 'permalink': 'http://perfil.mercadolibre.com.ar/DIMITRI+PROPIEDADES', 'registration_date': '2014-10-28T11:40:40.000-04:00', 'car_dealer': False, 'real_estate_agency': True, 'tags': ['real_estate_agency', 'nsm_low', 'messages_as_seller'], 'home_image_url': 'https://resources.mlstatic.com/classifieds_accounts/MLA_real_estate_agency/169745983_home.png', 'seller_reputation': {'power_seller_status': None, 'level_id': None, 'metrics': {'cancellations': {'period': '365 days', 'rate': 0, 'value': 0}, 'claims': {'period': '365 days', 'rate': 0, 'value': 0}, 'delayed_handling_time': {'period': '365 days', 'rate': 0, 'value': 0}, 'sales': {'period': '365 days', 

### Looking inside '_results_' '_attributes_'

In [13]:
# keys.
list(data['results'][0]['attributes'][0].keys())

['name',
 'value_struct',
 'values',
 'source',
 'value_type',
 'id',
 'value_id',
 'value_name',
 'attribute_group_id',
 'attribute_group_name']

In [14]:
for i, attr in enumerate(data['results'][0]['attributes']):
    print(f'{i} -> id = {attr["id"]}:\n')
    pprint(attr)
    print(50*'-')

0 -> id = HAS_AIR_CONDITIONING:

{'attribute_group_id': 'COMOYAMEN',
 'attribute_group_name': 'Comodidades y amenities',
 'id': 'HAS_AIR_CONDITIONING',
 'name': 'Aire acondicionado',
 'source': 4709975701260268,
 'value_id': '242085',
 'value_name': 'Sí',
 'value_struct': None,
 'value_type': 'boolean',
 'values': [{'id': '242085',
             'name': 'Sí',
             'source': 4709975701260268,
             'struct': None}]}
--------------------------------------------------
1 -> id = HAS_TELEPHONE_LINE:

{'attribute_group_id': 'COMOYAMEN',
 'attribute_group_name': 'Comodidades y amenities',
 'id': 'HAS_TELEPHONE_LINE',
 'name': 'Línea telefónica',
 'source': 4709975701260268,
 'value_id': '242084',
 'value_name': 'No',
 'value_struct': None,
 'value_type': 'boolean',
 'values': [{'id': '242084',
             'name': 'No',
             'source': 4709975701260268,
             'struct': None}]}
--------------------------------------------------
2 -> id = BEDROOMS:

{'attribute_group
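
Each attribute is a verbose dictionary, but for a dataset only the `id` and `value_name` pair usually matters. A minimal sketch of flattening the list, assuming a hypothetical helper `attributes_to_dict`; the sample reuses the two attributes printed above:

```python
def attributes_to_dict(attributes, wanted=None):
    """Flatten a list of attribute dicts into {attribute id: value_name}."""
    flat = {a["id"]: a.get("value_name") for a in attributes}
    if wanted is None:
        return flat
    # Keep only the requested ids; missing ones come back as None.
    return {key: flat.get(key) for key in wanted}


sample = [
    {"id": "HAS_AIR_CONDITIONING", "name": "Aire acondicionado", "value_name": "Sí"},
    {"id": "HAS_TELEPHONE_LINE", "name": "Línea telefónica", "value_name": "No"},
]
print(attributes_to_dict(sample, wanted=["HAS_AIR_CONDITIONING", "BEDROOMS"]))
```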

In [15]:
data['results'][0]['seller']['real_estate_agency']

True

# Locating the fields of interest

| id | header | type | description | location |
| :----: | :----: | :----: | :---- | :---- |
| 0 | id | [str] | Listing identifier; also used in .../items/{id} | data['results'][0]['id'] |
| 1 | start_date | [dtime] | Start of the listing; only available by requesting each listing individually | .../items/{id} |
| 2 | end_date | [dtime] | Date on which the listing is scheduled to end | data['results'][0]['stop_time'] |
| 3 | last_update | [dtime] | Last update of the published price | data['results'][0]['prices']['prices'][0]['last_updated'] |
| 4 | seller_type | [bool] | Published by a real-estate agent or official store | data['results'][0]['seller']['real_estate_agency'] |
| 5 | title | [str] | Listing title | data['results'][0]['title'] |
| 6 | condition | [str] | Whether the property is new or used | data['results'][0]['condition'] |
| 7 | property_type | [str] | Apartment, house, PH, country house | data['results'][0]['attributes'][{'id':'PROPERTY_TYPE', 'value_name': 'n'}] |
| 8 | rooms | [int] | Number of rooms (ambientes) | data['results'][0]['attributes'][{'id':'ROOMS', 'value_name': 'n'}] |
| 9 | bathrooms | [int] | Number of bathrooms | data['results'][0]['attributes'][{'id':'FULL_BATHROOMS', 'value_name': 'n'}] |
| 10 | bedrooms | [int] | Number of bedrooms | data['results'][0]['attributes'][{'id':'BEDROOMS', 'value_name': 'n'}] |
| 11 | operation_type | [str] | Rent, temporary rent, sale | data['results'][0]['attributes'][{'id':'OPERATION', 'value_name': 'n'}] |
| 12 | currency | [str] | Currency type | data['results'][0]['currency_id'] |
| 13 | price | [float] | Published amount for the operation type | data['results'][0]['price'] |
| 14 | surface_covered | [float] | Covered area in m² | data['results'][0]['attributes'][{'id':'COVERED_AREA', 'value_name': 'n'}] |
| 15 | surface_total | [float] | Total area in m² | data['results'][0]['attributes'][{'id':'TOTAL_AREA', 'value_name': 'n'}] |
| 16 | real_estate_agency | [bool] | Published by a real-estate agency | data['results'][0]['seller']['real_estate_agency'] |
| 17 | country | [str] | Country, e.g. "AR" Argentina | data['results'][0]['location']['country']['name'] |
| 18 | state | [str] | Region, zone, localities, e.g. Bs.As. G.B.A. Oeste | data['results'][0]['location']['state']['name'] |
| 19 | city | [str] | City or district (partido), e.g. La Matanza, Capital Federal | data['results'][0]['location']['city']['name'] |
| 20 | neighborhood | [str] | Locality or neighborhood, e.g. Ramos Mejía, Belgrano | data['results'][0]['location']['neighborhood']['name'] |
| 21 | lat | [float] | Latitude | data['results'][0]['location']['latitude'] |
| 22 | lon | [float] | Longitude | data['results'][0]['location']['longitude'] |
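
Several of the locations in the table are nested two or three levels deep, and a listing may lack some of them. A minimal sketch of reading the non-attribute columns defensively; `dig` and `basic_fields` are hypothetical helpers, and returning `None` for missing keys is a design assumption rather than anything the API prescribes:

```python
def dig(record, *keys, default=None):
    """Follow `keys` through nested dicts, returning `default` on any gap."""
    current = record
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def basic_fields(result):
    """Pick the non-attribute columns of the table from one search result."""
    return {
        "id": result.get("id"),
        "end_date": result.get("stop_time"),
        "title": result.get("title"),
        "condition": result.get("condition"),
        "currency": result.get("currency_id"),
        "price": result.get("price"),
        "real_estate_agency": dig(result, "seller", "real_estate_agency"),
        "state": dig(result, "location", "state", "name"),
        "city": dig(result, "location", "city", "name"),
        "neighborhood": dig(result, "location", "neighborhood", "name"),
        "lat": dig(result, "location", "latitude"),
        "lon": dig(result, "location", "longitude"),
    }


# Partial sample: the neighborhood and coordinates are deliberately missing.
sample = {
    "id": "MLA1218638438",
    "seller": {"real_estate_agency": True},
    "location": {"state": {"name": "Capital Federal"}},
}
row = basic_fields(sample)
print(row["state"], row["neighborhood"])
```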
  

In [34]:
# requests.get("https://api.mercadolibre.com/items/MLA1214335876")
# df = pd.json_normalize()

In [36]:
df = pd.json_normalize(data, "results")

In [62]:
pd.json_normalize(df['attributes'])

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,"{'id': 'HAS_TELEPHONE_LINE', 'value_name': 'Sí...","{'value_name': 'Sí', 'attribute_group_id': 'CO...","{'source': 6316276763983939, 'id': 'BEDROOMS',...","{'source': 6316276763983939, 'value_type': 'nu...","{'value_name': '1', 'attribute_group_id': 'FIN...","{'attribute_group_name': 'Ficha técnica', 'sou...","{'value_type': 'number_unit', 'values': [{'id'...","{'id': 'OPERATION', 'name': 'Operación', 'valu...","{'id': 'PROPERTY_TYPE', 'value_id': '242062', ...","{'id': 'ITEM_CONDITION', 'name': 'Condición de...","{'values': [{'struct': None, 'source': 9423343..."
1,{'attribute_group_name': 'Características adic...,"{'id': 'HAS_AIR_CONDITIONING', 'values': [{'id...","{'attribute_group_name': 'Ficha técnica', 'sou...","{'value_id': None, 'attribute_group_id': 'FIND...","{'value_id': None, 'value_name': '1', 'value_s...","{'source': 7092, 'value_id': None, 'value_stru...","{'source': 7092, 'value_type': 'number_unit', ...","{'value_name': 'Alquiler temporal', 'attribute...","{'name': 'Inmueble', 'value_id': '242062', 'va...","{'id': 'WITH_VIRTUAL_TOUR', 'value_name': 'No'...",
2,"{'value_id': None, 'value_struct': None, 'valu...","{'attribute_group_id': 'FIND', 'attribute_grou...","{'name': 'Baños', 'value_id': None, 'values': ...","{'attribute_group_name': 'Ficha técnica', 'sou...","{'id': 'TOTAL_AREA', 'name': 'Superficie total...","{'attribute_group_id': 'MAIN', 'attribute_grou...","{'name': 'Inmueble', 'value_struct': None, 'va...","{'attribute_group_id': 'OTHERS', 'attribute_gr...","{'attribute_group_id': 'OTHERS', 'source': 942...",,
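
Normalizing the `attributes` column directly leaves one dictionary per cell, as the output above shows. A possible alternative, sketched on two made-up results with the same shape as `data["results"]`, is to let `json_normalize` explode the attribute lists and then pivot; the `item_` prefix is an arbitrary choice to keep the listing `id` apart from the attribute `id`:

```python
import pandas as pd

# Two made-up results with the same shape as data["results"].
results = [
    {"id": "MLA1", "attributes": [
        {"id": "ROOMS", "name": "Ambientes", "value_name": "3"},
        {"id": "OPERATION", "name": "Operación", "value_name": "Alquiler"},
    ]},
    {"id": "MLA2", "attributes": [
        {"id": "ROOMS", "name": "Ambientes", "value_name": "1"},
    ]},
]

# One row per (listing, attribute); meta_prefix avoids the clash between
# the listing "id" and the attribute "id".
long = pd.json_normalize(results, record_path="attributes",
                         meta=["id"], meta_prefix="item_")

# One row per listing, one column per attribute id.
wide = long.pivot(index="item_id", columns="id", values="value_name")
print(wide)
```

Attributes that a listing does not declare end up as missing values in `wide`.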


In [41]:
df.columns.to_list()

['id',
 'site_id',
 'title',
 'price',
 'sale_price',
 'currency_id',
 'available_quantity',
 'sold_quantity',
 'buying_mode',
 'listing_type_id',
 'stop_time',
 'condition',
 'permalink',
 'thumbnail',
 'thumbnail_id',
 'accepts_mercadopago',
 'installments',
 'promotions',
 'attributes',
 'original_price',
 'category_id',
 'official_store_id',
 'domain_id',
 'catalog_product_id',
 'tags',
 'order_backend',
 'use_thumbnail_id',
 'offer_score',
 'offer_share',
 'match_score',
 'winner_item_id',
 'melicoin',
 'discounts',
 'inventory_id',
 'seller.id',
 'seller.permalink',
 'seller.registration_date',
 'seller.car_dealer',
 'seller.real_estate_agency',
 'seller.tags',
 'seller.seller_reputation.power_seller_status',
 'seller.seller_reputation.level_id',
 'seller.seller_reputation.metrics.cancellations.period',
 'seller.seller_reputation.metrics.cancellations.rate',
 'seller.seller_reputation.metrics.cancellations.value',
 'seller.seller_reputation.metrics.claims.period',
 'seller.seller

In [37]:
# data = json.loads(r.content)
#import flatdict

In [30]:
pd.json_normalize(data, record_path = 'results', sep='-').columns.to_list()

['id',
 'site_id',
 'title',
 'price',
 'sale_price',
 'currency_id',
 'available_quantity',
 'sold_quantity',
 'buying_mode',
 'listing_type_id',
 'stop_time',
 'condition',
 'permalink',
 'thumbnail',
 'thumbnail_id',
 'accepts_mercadopago',
 'installments',
 'promotions',
 'attributes',
 'original_price',
 'category_id',
 'official_store_id',
 'domain_id',
 'catalog_product_id',
 'tags',
 'order_backend',
 'use_thumbnail_id',
 'offer_score',
 'offer_share',
 'match_score',
 'winner_item_id',
 'melicoin',
 'discounts',
 'inventory_id',
 'seller-id',
 'seller-permalink',
 'seller-registration_date',
 'seller-car_dealer',
 'seller-real_estate_agency',
 'seller-tags',
 'seller-seller_reputation-power_seller_status',
 'seller-seller_reputation-level_id',
 'seller-seller_reputation-metrics-cancellations-period',
 'seller-seller_reputation-metrics-cancellations-rate',
 'seller-seller_reputation-metrics-cancellations-value',
 'seller-seller_reputation-metrics-claims-period',
 'seller-seller

In [31]:
df = pd.json_normalize(data, "results")

In [49]:
#requests.get(api).json()
#pd.DataFrame.from_dict(df.attributes)['attributes'].to_dict()

In [17]:
# Columns were flattened with the default '.' separator, so anchor on 'prices.'
df.filter(regex=r'^prices\.')

Unnamed: 0,prices.id,prices.prices,prices.presentation.display_currency,prices.payment_method_prices,prices.reference_prices,prices.purchase_discounts
0,MLA1214575464,"[{'id': '1', 'type': 'standard', 'amount': 350...",ARS,[],[],[]
1,MLA1214335876,"[{'id': '4', 'type': 'standard', 'amount': 110...",ARS,[],"[{'id': '5', 'type': 'min_standard', 'conditio...",[]
2,MLA1214352880,"[{'id': '1', 'type': 'standard', 'amount': 120...",ARS,[],[],[]


In [194]:
data

{'site_id': 'MLA',
 'country_default_time_zone': 'GMT-03:00',
 'paging': {'total': 300926, 'primary_results': 1000, 'offset': 0, 'limit': 3},
 'results': [{'id': 'MLA1214140056',
   'site_id': 'MLA',
   'title': 'Triplex 3 Amb . 94 M² . Frente . Balcón . Terraza C/parrilla . 2 Cocheras . Muy Luminoso',
   'seller': {'id': 1120816653,
    'permalink': 'http://perfil.mercadolibre.com.ar/REALPROPSREALPROPS',
    'registration_date': '2022-08-09T14:01:54.000-04:00',
    'car_dealer': False,
    'real_estate_agency': True,
    'tags': ['real_estate_agency', 'messages_as_seller'],
    'seller_reputation': {'level_id': None,
     'power_seller_status': None,
     'transactions': {'canceled': 0,
      'completed': 0,
      'period': 'historic',
      'ratings': {'negative': 0, 'neutral': 0, 'positive': 0},
      'total': 0},
     'metrics': {'sales': {'period': '365 days', 'completed': 0},
      'claims': {'period': '365 days', 'rate': 0, 'value': 0},
      'delayed_handling_time': {'period': 

In [321]:
#, meta=['id', 'site_id', 'title', 'seller', 'price', 'prices', 'sale_price', 'currency_id', 'available_quantity', 'sold_quantity', 'buying_mode', 'listing_type_id', 'stop_time', 'condition', 'permalink', 'thumbnail', 'thumbnail_id', 'accepts_mercadopago', 'installments', 'address', 'promotions', 'shipping', 'seller_address', 'seller_contact', 'location', 'attributes', 'original_price', 'category_id', 'official_store_id', 'domain_id', 'catalog_product_id', 'tags', 'order_backend', 'use_thumbnail_id', 'offer_score', 'offer_share', 'match_score', 'winner_item_id', 'melicoin', 'discounts', 'inventory_id'])
#pd.json_normalize(data, "prices", ["results"][0])
# ["id", "type", "amount", "regular_amount", "currency_id", "last_updated", "conditions", "exchange_rate_context", "metadata"]
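# Note: Series.keys() returns the index (here a RangeIndex), not the price lists,
# so json_normalize only receives the row labels and produces no columns.
serie = pd.json_normalize(data["results"], sep="-")['prices-prices'].keys()
pd.json_normalize(serie)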

0
1
2


In [322]:
serie

RangeIndex(start=0, stop=3, step=1)

In [250]:
pd.json_normalize(data, "filters", )#, [, ["id", "type", "amount", "regular_amount", "currency_id", "last_updated", "conditions"]])#, errors='ignore')

Unnamed: 0,id,name,type,values
0,category,Categorías,text,"[{'id': 'MLA1472', 'name': 'Departamentos', 'p..."


In [183]:
df["prices.prices"]

0    [{'id': '1', 'type': 'standard', 'amount': 185...
1    [{'id': '1', 'type': 'standard', 'amount': 350...
2    [{'id': '1', 'type': 'standard', 'amount': 420...
Name: prices.prices, dtype: object
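
Each cell of `prices.prices` holds a list of price entries, so the current price and its `last_updated` stamp still have to be picked out. A minimal sketch, assuming the entry of interest is the one whose `type` is `'standard'`; `standard_price` is a hypothetical helper and the sample entries are made up from keys seen in the notebook output:

```python
def standard_price(price_entries):
    """Return the first entry whose type is 'standard', or None."""
    for entry in price_entries or []:
        if entry.get("type") == "standard":
            return entry
    return None


# Made-up entries using keys that appear in the notebook output.
entries = [
    {"id": "5", "type": "min_standard", "amount": 100000},
    {"id": "4", "type": "standard", "amount": 110000,
     "last_updated": "2022-11-03T11:43:32Z"},
]
chosen = standard_price(entries)
print(chosen["amount"], chosen["last_updated"])
```

Applied column-wise, this could look like `df["prices.prices"].apply(standard_price)`.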

In [None]:
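# Paging loop lifted from a class-based scraper: `paginators`, `self.ml_url`,
# `self.objeto`, `self.request_get` and `self.adapt` are assumed to be defined elsewhere.
for page in range(paginators):
    # offset advances in steps of 50, the maximum page size
    url = f'{self.ml_url}{self.objeto.mercadolibre_id}&_PublishedToday_YES&limit=50&offset={page * 50}'
    jsdata = self.request_get(url)
    if jsdata is not None:
        self.items = self.items + jsdata['results']
self.adapt()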
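
A standalone version of this loop can stop by itself when a page comes back shorter than `limit`, the condition noted earlier for `results`. A minimal sketch, assuming a hypothetical `collect_results` helper that receives the page-fetching function as an argument; `fake_fetch` stands in for a real HTTP call:

```python
def collect_results(fetch_page, limit=50, max_pages=None):
    """Accumulate `results` page by page until a short or empty page."""
    items = []
    page = 0
    while max_pages is None or page < max_pages:
        payload = fetch_page(offset=page * limit, limit=limit)
        if payload is None:        # request failed; stop rather than loop forever
            break
        batch = payload.get("results", [])
        items.extend(batch)
        if len(batch) < limit:     # last page reached
            break
        page += 1
    return items


# Stand-in for a real HTTP call: serves 120 fake records.
def fake_fetch(offset, limit):
    return {"results": list(range(120))[offset:offset + limit]}


print(len(collect_results(fake_fetch)))  # 120
```

Passing the fetcher in keeps the paging logic testable without touching the network.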

In [None]:
json.dumps(data['results'][0]['attributes'])

In [37]:
import requests
import json
import pandas as pd

url = 'https://api.mercadolibre.com/items/MLA1216652308'
r = requests.get(url)
data = json.loads(r.content)

pd.json_normalize(data, 'attributes')[['name', 'value_name']]

Unnamed: 0,name,value_name
0,Balcón,Sí
1,Dormitorio en suite,Sí
2,Placards,Sí
3,Comedor,Sí
4,Cocina,Sí
5,Living,Sí
6,Terraza,Sí
7,Agua corriente,Sí
8,Línea telefónica,Sí
9,Apto profesional,Sí


In [54]:
atrs = pd.json_normalize(data, 'attributes')[['name', 'value_name']].T
atrs = atrs.rename(columns=atrs.iloc[0]).drop(atrs.index[0])
atrs

Unnamed: 0,Balcón,Dormitorio en suite,Placards,Comedor,Cocina,Living,Terraza,Agua corriente,Línea telefónica,Apto profesional,...,Antigüedad,Ambientes,Superficie total,Número de piso de la unidad,Bodegas,Operación,Inmueble,Número del departamento,Condición del ítem,Tour virtual
value_name,Sí,Sí,Sí,Sí,Sí,Sí,Sí,Sí,Sí,Sí,...,8 años,1,38 m²,4,0,Alquiler,Departamento,C,Usado,No


In [55]:
df = pd.json_normalize(data)
df

Unnamed: 0,id,site_id,title,subtitle,seller_id,category_id,official_store_id,price,base_price,original_price,...,location.address_line,location.zip_code,location.neighborhood.id,location.neighborhood.name,location.city.id,location.city.name,location.state.id,location.state.name,location.country.id,location.country.name
0,MLA1216652308,MLA,Excepcional Monoambiente De Categoría Próximo ...,,29097609,MLA1473,,79000,79000,,...,"Bulnes 2710, Buenos Aires, Argentina",,TUxBQlBBTDI1MTVa,Palermo,TUxBQ0NBUGZlZG1sYQ,Capital Federal,TUxBUENBUGw3M2E1,Capital Federal,AR,Argentina


In [71]:
df.loc[:,df.columns.str.contains('time')]

Unnamed: 0,start_time,stop_time
0,2022-10-28T17:59:10.000Z,2022-12-27T04:00:59.000Z
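
`start_time` and `stop_time` arrive as ISO 8601 strings. A small sketch of parsing them with the standard library to get the scheduled listing duration; the format string is an assumption based on the two values shown above:

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f%z"  # %z accepts "Z" on Python 3.7+

start = datetime.strptime("2022-10-28T17:59:10.000Z", FMT)
stop = datetime.strptime("2022-12-27T04:00:59.000Z", FMT)

duration = stop - start
print(duration.days)  # 59
```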


In [75]:
df.loc[:,df.columns.str.startswith('location')]

Unnamed: 0,location.address_line,location.zip_code,location.neighborhood.id,location.neighborhood.name,location.city.id,location.city.name,location.state.id,location.state.name,location.country.id,location.country.name
0,"Bulnes 2710, Buenos Aires, Argentina",,TUxBQlBBTDI1MTVa,Palermo,TUxBQ0NBUGZlZG1sYQ,Capital Federal,TUxBUENBUGw3M2E1,Capital Federal,AR,Argentina


In [116]:
df.loc[:, df.columns.str.startswith('seller')].columns

Index(['seller_id', 'seller_address.city.id', 'seller_address.city.name',
       'seller_address.state.id', 'seller_address.state.name',
       'seller_address.country.id', 'seller_address.country.name',
       'seller_address.search_location.neighborhood.id',
       'seller_address.search_location.neighborhood.name',
       'seller_address.search_location.city.id',
       'seller_address.search_location.city.name',
       'seller_address.search_location.state.id',
       'seller_address.search_location.state.name', 'seller_address.id',
       'seller_contact.contact', 'seller_contact.other_info',
       'seller_contact.country_code', 'seller_contact.area_code',
       'seller_contact.phone', 'seller_contact.country_code2',
       'seller_contact.area_code2', 'seller_contact.phone2',
       'seller_contact.email', 'seller_contact.webpage'],
      dtype='object')

In [24]:

df.drop(columns=['site_id', 'location.neighborhood.id', 'location.city.id', 'location.state.id', 'location.country.id'], inplace=True)
df

Unnamed: 0,id,title,subtitle,seller_id,category_id,official_store_id,price,base_price,original_price,currency_id,...,seller_contact.area_code2,seller_contact.phone2,seller_contact.email,seller_contact.webpage,location.address_line,location.zip_code,location.neighborhood.name,location.city.name,location.state.name,location.country.name
0,MLA1216652308,Excepcional Monoambiente De Categoría Próximo ...,,29097609,MLA1473,,79000,79000,,ARS,...,,,,,"Bulnes 2710, Buenos Aires, Argentina",,Palermo,Capital Federal,Capital Federal,Argentina


In [11]:
df.drop(columns=['seller_contact.area_code2', 'seller_contact.phone2', 'seller_contact.email', 'seller_contact.webpage'], inplace=True)
df

Unnamed: 0,id,title,subtitle,seller_id,category_id,official_store_id,price,base_price,original_price,currency_id,...,seller_contact.country_code,seller_contact.area_code,seller_contact.phone,seller_contact.country_code2,location.address_line,location.zip_code,location.neighborhood.name,location.city.name,location.state.name,location.country.name
0,MLA1216652308,Excepcional Monoambiente De Categoría Próximo ...,,29097609,MLA1473,,79000,79000,,ARS,...,,,,,"Bulnes 2710, Buenos Aires, Argentina",,Palermo,Capital Federal,Capital Federal,Argentina


In [25]:
#list(df.filter(regex='seller_contact'))
df.loc[:,df.columns.str.startswith('seller')]

Unnamed: 0,seller_id,seller_address.city.id,seller_address.city.name,seller_address.state.id,seller_address.state.name,seller_address.country.id,seller_address.country.name,seller_address.search_location.neighborhood.id,seller_address.search_location.neighborhood.name,seller_address.search_location.city.id,...,seller_contact.contact,seller_contact.other_info,seller_contact.country_code,seller_contact.area_code,seller_contact.phone,seller_contact.country_code2,seller_contact.area_code2,seller_contact.phone2,seller_contact.email,seller_contact.webpage
0,29097609,TUxBQlBBTDI1MTVa,Palermo,AR-C,Capital Federal,AR,Argentina,TUxBQlBBTDI1MTVa,Palermo,TUxBQ0NBUGZlZG1sYQ,...,,,,,,,,,,


In [26]:
df = df.loc[:,~df.columns.str.startswith('seller')]
df

Unnamed: 0,id,title,subtitle,category_id,official_store_id,price,base_price,original_price,currency_id,initial_quantity,...,shipping.local_pick_up,shipping.free_shipping,shipping.logistic_type,shipping.store_pick_up,location.address_line,location.zip_code,location.neighborhood.name,location.city.name,location.state.name,location.country.name
0,MLA1216652308,Excepcional Monoambiente De Categoría Próximo ...,,MLA1473,,79000,79000,,ARS,1,...,False,False,,False,"Bulnes 2710, Buenos Aires, Argentina",,Palermo,Capital Federal,Capital Federal,Argentina


In [27]:
df.loc[:, df.columns.str.startswith('shipping')]

Unnamed: 0,shipping.mode,shipping.methods,shipping.tags,shipping.dimensions,shipping.local_pick_up,shipping.free_shipping,shipping.logistic_type,shipping.store_pick_up
0,not_specified,[],[],,False,False,,False


In [28]:
df = df.loc[:, ~df.columns.str.startswith('shipping')]
df

Unnamed: 0,id,title,subtitle,category_id,official_store_id,price,base_price,original_price,currency_id,initial_quantity,...,last_updated,health,catalog_listing,channels,location.address_line,location.zip_code,location.neighborhood.name,location.city.name,location.state.name,location.country.name
0,MLA1216652308,Excepcional Monoambiente De Categoría Próximo ...,,MLA1473,,79000,79000,,ARS,1,...,2022-11-03T11:43:32.000Z,0.6,False,[marketplace],"Bulnes 2710, Buenos Aires, Argentina",,Palermo,Capital Federal,Capital Federal,Argentina
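
The seller and shipping columns were removed with the same prefix test twice. A minimal sketch that does it in one call, assuming a hypothetical helper `drop_prefixed`; the demo frame is a tiny stand-in for the flattened item:

```python
import pandas as pd


def drop_prefixed(df, prefixes):
    """Return a copy of `df` without columns starting with any prefix."""
    doomed = [c for c in df.columns if c.startswith(tuple(prefixes))]
    return df.drop(columns=doomed)


# Tiny stand-in for the flattened item frame.
demo = pd.DataFrame([{
    "id": "MLA1216652308",
    "seller_id": 29097609,
    "shipping.mode": "not_specified",
    "location.city.name": "Capital Federal",
}])
print(drop_prefixed(demo, ["seller", "shipping"]).columns.tolist())
```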


In [30]:
df.columns.str.startswith('attributes')

array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False,  True, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False])