# Data acquisition for finance

## 1. Acquiring data from files

### 1.1. Text files.

### 1.2. Comma-separated files (CSV).

In [26]:
import pandas as pd

# The default separator is ','
invoices_df = pd.read_csv('./data/ecommerce.csv')
print(invoices_df.head())

  InvoiceNo StockCode                          Description  Quantity   
0    536365    85123A   WHITE HANGING HEART T-LIGHT HOLDER         6  \
1    536365     71053                  WHITE METAL LANTERN         6   
2    536365    84406B       CREAM CUPID HEARTS COAT HANGER         8   
3    536365    84029G  KNITTED UNION FLAG HOT WATER BOTTLE         6   
4    536365    84029E       RED WOOLLY HOTTIE WHITE HEART.         6   

      InvoiceDate  UnitPrice  CustomerID         Country  
0  12/1/2010 8:26       2.55     17850.0  United Kingdom  
1  12/1/2010 8:26       3.39     17850.0  United Kingdom  
2  12/1/2010 8:26       2.75     17850.0  United Kingdom  
3  12/1/2010 8:26       3.39     17850.0  United Kingdom  
4  12/1/2010 8:26       3.39     17850.0  United Kingdom  


In [27]:
# When the separator is not ',' it must be specified (it can be ';', a tab, '#' or others).
# Without it, each row is read as a single column, as the output below shows.
invoices_semicolon_sep_df = pd.read_csv('./data/ecommerce_semicolon_sep.csv')
print(invoices_semicolon_sep_df.head())

  InvoiceNo;StockCode;Description;Quantity;InvoiceDate;UnitPrice;CustomerID;Country
0  536365;85123A;WHITE HANGING HEART T-LIGHT HOLD...                               
1  536365;71053;WHITE METAL LANTERN;6;12/1/2010 8...                               
2  536365;84406B;CREAM CUPID HEARTS COAT HANGER;8...                               
3  536365;84029G;KNITTED UNION FLAG HOT WATER BOT...                               
4  536365;84029E;RED WOOLLY HOTTIE WHITE HEART.;6...                               


In [28]:
invoices_semicolon_sep_df = pd.read_csv('./data/ecommerce_semicolon_sep.csv', sep=';')
print(invoices_semicolon_sep_df.head())

  InvoiceNo StockCode                          Description  Quantity   
0    536365    85123A   WHITE HANGING HEART T-LIGHT HOLDER         6  \
1    536365     71053                  WHITE METAL LANTERN         6   
2    536365    84406B       CREAM CUPID HEARTS COAT HANGER         8   
3    536365    84029G  KNITTED UNION FLAG HOT WATER BOTTLE         6   
4    536365    84029E       RED WOOLLY HOTTIE WHITE HEART.         6   

      InvoiceDate  UnitPrice  CustomerID         Country  
0  12/1/2010 8:26       2.55     17850.0  United Kingdom  
1  12/1/2010 8:26       3.39     17850.0  United Kingdom  
2  12/1/2010 8:26       2.75     17850.0  United Kingdom  
3  12/1/2010 8:26       3.39     17850.0  United Kingdom  
4  12/1/2010 8:26       3.39     17850.0  United Kingdom  
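
When the separator of a file is not known in advance, it can be detected before calling `pd.read_csv`. The following is a minimal sketch that uses `csv.Sniffer` from the standard library on an inline sample; the sample text stands in for a file, and the candidate delimiter list is an assumption. pandas can also do the detection itself with `sep=None, engine='python'`.

```python
import csv
import io

import pandas as pd

# Inline sample standing in for a semicolon-separated file (illustrative data)
sample = (
    "InvoiceNo;StockCode;Description;Quantity\n"
    "536365;85123A;WHITE HANGING HEART T-LIGHT HOLDER;6\n"
    "536365;71053;WHITE METAL LANTERN;6\n"
)

# Restrict the sniffer to the separators we consider plausible (an assumption)
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t#")
detected = dialect.delimiter

# Read the data with the detected separator
df_sniffed = pd.read_csv(io.StringIO(sample), sep=detected)
print(detected)
print(df_sniffed.head())
```

With a real file, only the first few kilobytes need to be passed to `sniff`, so detection stays cheap even for large inputs.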


### 1.3. Excel files.

### 1.4. JSON files.
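
As a minimal sketch of reading JSON with pandas, assuming the file holds a list of records (one object per row), `pd.read_json` can build the DataFrame directly. The inline string below stands in for a file; its field names mirror the ecommerce CSV and are an assumption, not a file that ships with this notebook.

```python
import io

import pandas as pd

# Inline JSON standing in for a file such as './data/ecommerce.json' (hypothetical path)
json_text = """
[
  {"InvoiceNo": 536365, "StockCode": "85123A", "Quantity": 6, "UnitPrice": 2.55},
  {"InvoiceNo": 536365, "StockCode": "71053", "Quantity": 6, "UnitPrice": 3.39}
]
"""

# Each JSON object becomes one row; keys become columns
invoices_json_df = pd.read_json(io.StringIO(json_text))
print(invoices_json_df.head())
```

For a file on disk, the path can be passed to `pd.read_json` in place of the `StringIO` object.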

### 1.5. Parquet files.

## 2. Acquiring data through APIs.

In [26]:
import os

import requests
import pandas as pd
from pandas import json_normalize


url = "https://real-time-product-search.p.rapidapi.com/search"

querystring = {"q": "Nike shoes", "country": "us", "language": "en", "limit": "30"}

headers = {
    # Read the API key from the environment instead of hard-coding it;
    # RAPIDAPI_KEY is an assumed variable name
    "X-RapidAPI-Key": os.environ["RAPIDAPI_KEY"],
    "X-RapidAPI-Host": "real-time-product-search.p.rapidapi.com"
}

# Send the request and decode the JSON body into a Python dictionary
response = requests.get(url, headers=headers, params=querystring).json()
print(response)


{'status': 'OK', 'request_id': '17d7eace-781f-4077-b79e-a704b69a87a8', 'data': [{'product_id': '16474445837437288542', 'product_id_v2': '16474445837437288542:10863833567065188839', 'product_title': "Nike Court Borough Low Recraft White/Pink Foam Grade School Girls' Shoes, Size: 6.5", 'product_description': "Run (don't walk) to your new favourite neighbourhood. Built to last, this redesigned legend uses a combination of recycled materials in the upper and outsole for a revamped classic look. A redesigned toe cap and midfoot give your feet extra room to run, jump and play longer.Synthetic leather upper made from a combination of recycled materials.Pivot point in the pattern and grooved channels in the sole to provide strategic flexibility for growing feet.", 'product_photos': ['https://encrypted-tbn2.gstatic.com/shopping?q=tbn:ANd9GcRa9kfx7dco_zS608zwb1d8jDzlvfiIORHQ_ZgmTz6g5zTK7Xl3pvxQmZbOHvCLwCvv8KMDJSQJifNpWha0BICku_rTzwKvAw&usqp=CAE', 'https://encrypted-tbn0.gstatic.com/shopping?q=tb

In [25]:
data_dict = response["data"]
print(data_dict)


[{'product_id': '1895888000104236047', 'product_id_v2': '1895888000104236047:17750431743876774496', 'product_title': 'Nike PS Dunk Low - White / Black 11.5C', 'product_description': "The Nike Dunk Low Retro White Black (PS) sneakers combine iconic style with modern comfort. With its timeless white and black colorway, these sneakers are versatile and perfect for any occasion. The retro design pays homage to the original Nike Dunk, while the low-top silhouette offers a contemporary vibe. Crafted with premium materials, these sneakers provide durability and support. Whether you're hitting the skate park or strolling the streets, the Nike Dunk Low Retro White Black (PS) sneakers will elevate your footwear game.", 'product_photos': ['https://encrypted-tbn2.gstatic.com/shopping?q=tbn:ANd9GcSQa1yIcq2PSPAFale5P3hSHy0ztLtCv6BZlJfehg1BdCY17IzXYSrYa2oQuyh1sXxq2l1fODkh59QrNEiTQRgqmudtN2fx&usqp=CAE', 'https://encrypted-tbn3.gstatic.com/shopping?q=tbn:ANd9GcQZZn2-w-DfKNBSSrtVwvSPeklf-JVDRBlttQ52m3PI

In [24]:
selected_cols = [
    'product_id',
    'product_title',
    'product_rating',
    'typical_price_range',
    'offer'
]
data_df = pd.DataFrame(data_dict)[selected_cols]
print(data_df.head())

DataFrame generated from the data as a dictionary
             product_id                                      product_title  \
0   1895888000104236047             Nike PS Dunk Low - White / Black 11.5C   
1  16474445837437288542  Nike Court Borough Low Recraft White/Pink Foam...   
2   2334515098854897626      Jordan 4 Retro Travis Scott Cactus Jack (F&F)   
3   9068457195677879257  Nike Women's Court Legacy Lift Shoes, Size 10,...   
4   2060730182710679218  Nike Court Vision Low Next Nature White/Pink W...   

   product_rating typical_price_range  \
0             4.5    [$70.00, $87.00]   
1             4.7    [$50.00, $67.00]   
2             4.3  [$11,963, $12,095]   
3             4.5    [$90.00, $90.00]   
4             4.3          [$65, $85]   

                                               offer  
0  {'store_name': 'Nike', 'store_rating': 4.5, 'o...  
1  {'store_name': 'Macy's', 'store_rating': 4.4, ...  
2  {'store_name': 'StockX', 'store_rating': 4.1, ...  
3  {'store_nam

In [28]:
# Flatten the dictionaries stored in the 'offer' column
df_aplanado = json_normalize(data_df['offer'])

# Concatenate the flattened DataFrame with the original DataFrame
df_resultante = pd.concat([data_df, df_aplanado], axis=1)

cols_to_drop = [
    'offer',
    'offer_page_url',
    'store_reviews_page_url',
    'original_price',
    'product_condition',
    'buy_now_url',
    'on_sale',
    'shipping'
]
df_resultante = df_resultante.drop(columns=cols_to_drop)
print(df_resultante.head())

             product_id                                      product_title  \
0   1895888000104236047             Nike PS Dunk Low - White / Black 11.5C   
1  16474445837437288542  Nike Court Borough Low Recraft White/Pink Foam...   
2   2334515098854897626      Jordan 4 Retro Travis Scott Cactus Jack (F&F)   
3   9068457195677879257  Nike Women's Court Legacy Lift Shoes, Size 10,...   
4   2060730182710679218  Nike Court Vision Low Next Nature White/Pink W...   

   product_rating typical_price_range       store_name  store_rating  \
0             4.5    [$70.00, $87.00]             Nike           4.5   
1             4.7    [$50.00, $67.00]           Macy's           4.4   
2             4.3  [$11,963, $12,095]           StockX           4.1   
3             4.5    [$90.00, $90.00]  Rack Room Shoes           4.6   
4             4.3          [$65, $85]    Shoe Carnival           4.7   

   store_review_count       price                  tax  
0                1099      $70.00      +$
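
The price fields above arrive as strings such as `$70.00` or `$11,963`, so they need converting before any numeric analysis. The following is a minimal sketch under two assumptions: that `typical_price_range` holds a two-element list of strings, and that prices use US formatting (dollar sign, comma as thousands separator). `parse_price` is a hypothetical helper, and the sample rows reuse values from the output above.

```python
import pandas as pd


def parse_price(text):
    """Convert a US-formatted price string such as '$11,963' to a float (hypothetical helper)."""
    return float(text.replace("$", "").replace(",", ""))


# Small sample mirroring the shape of the API data shown above
sample_df = pd.DataFrame({
    "product_id": ["1895888000104236047", "2334515098854897626"],
    "typical_price_range": [["$70.00", "$87.00"], ["$11,963", "$12,095"]],
})

# Split the range into two numeric columns
sample_df["price_min"] = sample_df["typical_price_range"].apply(lambda r: parse_price(r[0]))
sample_df["price_max"] = sample_df["typical_price_range"].apply(lambda r: parse_price(r[1]))
print(sample_df[["product_id", "price_min", "price_max"]])
```

Real responses may include missing ranges or other currencies, which this sketch does not handle.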

## 3. Acquiring data through database connections (DBs).

### 3.1. Relational databases (SQL).

For this example, a free PostgreSQL database was created at https://console.neon.tech/app/projects. Eighteen records from the ecommerce CSV used earlier were inserted into it.

In [None]:
import os

import pandas as pd
from sqlalchemy import create_engine, URL

url_object = URL.create(
    "postgresql",
    username="ismaelcazalilla",
    # Read the password from the environment instead of hard-coding it;
    # NEON_DB_PASSWORD is an assumed variable name
    password=os.environ["NEON_DB_PASSWORD"],
    host="ep-throbbing-haze-36918596.eu-central-1.aws.neon.tech",
    database="adquisicion_datos",
)

# Create a database engine instance
db_engine = create_engine(url_object)

# Connect to the database and run a query to read the data
with db_engine.connect() as conn, conn.begin():
    # Pass the open connection so the query runs inside this transaction
    df = pd.read_sql_query("SELECT * FROM adquisicion.ecommerce WHERE invoiceno='536365'", con=conn)
    # Count the non-null values in each column
    print(df.count())




invoiceno      7
stockcode      7
description    7
quantity       7
invoicedate    7
unitprice      7
customerid     7
country        7
dtype: int64
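
The query above embeds the invoice number directly in the SQL string. When the value comes from user input or another variable, passing it as a bound parameter is safer. The following is a minimal sketch that uses an in-memory SQLite database from the standard library in place of the PostgreSQL instance, so it runs without credentials; the table layout and the third row are illustrative assumptions.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the remote PostgreSQL instance
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ecommerce (invoiceno TEXT, stockcode TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO ecommerce VALUES (?, ?, ?)",
    [
        ("536365", "85123A", 6),
        ("536365", "71053", 6),
        ("536366", "22633", 6),  # made-up row from a different invoice
    ],
)
conn.commit()

# The '?' placeholder is filled by the driver, not by string formatting
query = "SELECT * FROM ecommerce WHERE invoiceno = ?"
df_invoice = pd.read_sql_query(query, conn, params=("536365",))
conn.close()

print(df_invoice)
```

The placeholder syntax depends on the driver: SQLite uses `?`, while other drivers use named or `%s`-style placeholders.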
