Import the necessary libraries

In [1]:
import pandas as pd
import numpy as np
from datetime import datetime, timedelta


Load the files

In [2]:

auction_df = pd.read_csv('auction_small.csv')
historical_prices_df = pd.read_csv('historical_prices.csv')
print("Files loaded successfully.")
print(f"First rows of auction_df:\n{auction_df.head()}")
print(f"First rows of historical_prices_df:\n{historical_prices_df.head()}")

Files loaded successfully.
First rows of auction_df:
   auction_id  bid_in_gold  buyout_in_gold  unit_price  quantity time_left  \
0   962303457      24.4708         27.1898     27.1898         1     SHORT   
1   962303587      75.5135         83.9039     83.9039         1     SHORT   
2   962303596     260.2195        289.1328    289.1328         1     SHORT   
3   962303611      28.1145         31.2383     31.2383         1     SHORT   
4   962303620     117.5188        130.5764    130.5764         1     SHORT   

   item_id              item_name   quality item_class  ...  \
0    13009        Cow King's Hide      Rare      Armor  ...   
1    24684     Archmage Bracelets  Uncommon      Armor  ...   
2    24695  Bonechewer Chestpiece  Uncommon      Armor  ...   
3     7535          Cabalist Belt  Uncommon      Armor  ...   
4    36069       Daggercap Jerkin  Uncommon      Armor  ...   

  purchase_price_gold  required_level  item_level  sell_price_gold  \
0                   7        

Convert date columns to datetime type

In [3]:

historical_prices_df['datetime'] = pd.to_datetime(historical_prices_df['datetime'])
auction_df['first_appearance_timestamp'] = pd.to_datetime(auction_df['first_appearance_timestamp'])
print("Datetime conversion completed.")
print(f"First rows of auction_df with converted datetime:\n{auction_df.head()}")


Datetime conversion completed.
First rows of auction_df with converted datetime:
   auction_id  bid_in_gold  buyout_in_gold  unit_price  quantity time_left  \
0   962303457      24.4708         27.1898     27.1898         1     SHORT   
1   962303587      75.5135         83.9039     83.9039         1     SHORT   
2   962303596     260.2195        289.1328    289.1328         1     SHORT   
3   962303611      28.1145         31.2383     31.2383         1     SHORT   
4   962303620     117.5188        130.5764    130.5764         1     SHORT   

   item_id              item_name   quality item_class  ...  \
0    13009        Cow King's Hide      Rare      Armor  ...   
1    24684     Archmage Bracelets  Uncommon      Armor  ...   
2    24695  Bonechewer Chestpiece  Uncommon      Armor  ...   
3     7535          Cabalist Belt  Uncommon      Armor  ...   
4    36069       Daggercap Jerkin  Uncommon      Armor  ...   

  purchase_price_gold  required_level  item_level  sell_price_gold  \
0
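Malformed timestamps are easy to miss at this step, because `pd.to_datetime` raises on the first bad value by default. A minimal sketch, assuming you would rather turn unparseable values into `NaT` and count them; the sample values are made up for illustration and do not come from the dataset:

```python
import pandas as pd

# Made-up sample: one valid timestamp, one malformed string, one missing value.
raw = pd.Series(["2024-01-02 00:00:00", "not a date", None])

# errors="coerce" converts anything unparseable to NaT instead of raising.
parsed = pd.to_datetime(raw, errors="coerce")

# Count the values that could not be parsed (or were missing to begin with).
n_bad = int(parsed.isna().sum())
print(f"Unparseable or missing timestamps: {n_bad}")
```

Checking this count before computing date windows helps explain any `NaN` results that appear later.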

Function to calculate the historical average price of the last 7 days

In [28]:

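def calculate_historical_price(row):
    """Return the mean price of the row's item over the 7 days up to its first appearance.

    Reads the module-level historical_prices_df. Returns NaN when no records fall in the window.
    """
    item_id = row['item_id']
    end_date = row['first_appearance_timestamp']
    start_date = end_date - timedelta(days=7)
    # Keep records for this item inside the inclusive window [start_date, end_date].
    relevant_prices = historical_prices_df[(historical_prices_df['item_id'] == item_id) &
                                           (historical_prices_df['datetime'] >= start_date) &
                                           (historical_prices_df['datetime'] <= end_date)]
    
    if not relevant_prices.empty:
        # Debug output: printed once per auction row that has matching history.
        print(f"Item ID: {item_id}")
        print(f"Date range: from {start_date} to {end_date}")
        print(f"Number of historical records found: {len(relevant_prices)}")
        print(f"Relevant prices:\n{relevant_prices['price'].values}")
        return relevant_prices['price'].mean()
    else:
        return np.nan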
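Because the function above filters `historical_prices_df` once per auction row, its runtime grows with the product of the two table sizes. One possible alternative is a single merge followed by a window filter. The sketch below assumes the same column names (`item_id`, `datetime`, `price`, `first_appearance_timestamp`); the helper name `trailing_mean_price` and the sample data are hypothetical and not part of the original notebook:

```python
import numpy as np
import pandas as pd

def trailing_mean_price(auctions, history, days=7):
    """Mean price per auction row over the inclusive window [t - days, t]."""
    window = pd.Timedelta(days=days)
    left = auctions[["item_id", "first_appearance_timestamp"]].copy()
    left["_row"] = np.arange(len(left))  # positional key to map results back
    merged = left.merge(history[["item_id", "datetime", "price"]],
                        on="item_id", how="left")
    end = merged["first_appearance_timestamp"]
    # Rows without any history carry NaT here, which compares as False.
    in_window = (merged["datetime"] >= end - window) & (merged["datetime"] <= end)
    means = merged.loc[in_window].groupby("_row")["price"].mean()
    # Rows with no record in the window are absent from `means` and become NaN.
    values = means.reindex(range(len(left))).to_numpy()
    return pd.Series(values, index=auctions.index, name="historical_price")

# Made-up sample data for illustration only.
auctions = pd.DataFrame({
    "item_id": [1, 2, 3],
    "first_appearance_timestamp": pd.to_datetime(["2024-01-02"] * 3),
})
history = pd.DataFrame({
    "item_id": [1, 1, 1, 2],
    "datetime": pd.to_datetime(["2024-01-01", "2023-12-30", "2023-12-01", "2024-01-05"]),
    "price": [10.0, 20.0, 99.0, 5.0],
})
result = trailing_mean_price(auctions, history)
print(result)
```

The intermediate merge holds one row per auction and history record pair for each item, so on large tables it may need to be processed in chunks.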


Apply the function to each row of auction_df

In [29]:
print("Calculating the average historical price for the last 7 days for each auction...")
auction_df['historical_price'] = auction_df.apply(calculate_historical_price, axis=1)
print("Calculation of the historical price completed.")
print(f"First rows of auction_df with historical_price:\n{auction_df[['item_id', 'historical_price']].head()}")

Calculating the average historical price for the last 7 days for each auction...
Item ID: 13009
Date range: from 2023-12-26 00:00:00 to 2024-01-02 00:00:00
Number of historical records found: 1
Relevant prices:
[27.1895]
Item ID: 24684
Date range: from 2023-12-26 00:00:00 to 2024-01-02 00:00:00
Number of historical records found: 1
Relevant prices:
[83.9039]
Item ID: 24695
Date range: from 2023-12-26 00:00:00 to 2024-01-02 00:00:00
Number of historical records found: 1
Relevant prices:
[239.63945]
Item ID: 7535
Date range: from 2023-12-26 00:00:00 to 2024-01-02 00:00:00
Number of historical records found: 1
Relevant prices:
[27.72205]
Item ID: 36069
Date range: from 2023-12-26 00:00:00 to 2024-01-02 00:00:00
Number of historical records found: 1
Relevant prices:
[130.5764]
Item ID: 6419
Date range: from 2023-12-26 00:00:00 to 2024-01-02 00:00:00
Number of historical records 

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  limited_auction_df['historical_price'] = limited_auction_df.apply(calculate_historical_price, axis=1)
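The warning above refers to `limited_auction_df`, which is not defined in the cells shown here; it is presumably a subset of `auction_df` created in an earlier run. A minimal sketch, under that assumption, of how taking an explicit copy of the subset usually avoids the warning (the subset size and values are illustrative only):

```python
import pandas as pd

auction_df = pd.DataFrame({
    "item_id": [13009, 24684, 24695],
    "unit_price": [27.1898, 83.9039, 289.1328],
})

# Assumption: the subset was taken by slicing. Adding .copy() makes it an
# independent DataFrame, so column assignment is unambiguous.
limited_auction_df = auction_df.head(2).copy()
limited_auction_df["historical_price"] = [27.1895, 83.9039]

print(limited_auction_df)
```

With the copy in place, the new column exists only on `limited_auction_df`, and `auction_df` is left unchanged.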


Calculate the relative difference

In [30]:
print("Calculating the relative difference...")
auction_df['relative_historical_difference'] = (
    auction_df['unit_price'] - auction_df['historical_price']
) / auction_df['historical_price']
print("Calculation of the relative difference completed.")
print(f"First rows of auction_df with relative_historical_difference:\n{auction_df[['item_id', 'relative_historical_difference']].head()}")

Calculating the relative difference...
Calculation of the relative difference completed.
First rows of limited_auction_df with relative_historical_difference:
   item_id  relative_historical_difference
0    13009                        0.000011
1    24684                        0.000000
2    24695                        0.206533
3     7535                        0.126839
4    36069                        0.000000
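The printed values can be checked by hand. For item 24695, (289.1328 - 239.63945) / 239.63945 ≈ 0.206533. The sketch below reproduces three of the rows from the outputs above and adds one hypothetical row (item_id 0, not in the data) with a zero historical price. Replacing zero with NaN before dividing is an assumption about the desired behavior, not something the notebook does:

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "item_id": [13009, 24695, 7535, 0],          # item_id 0 is hypothetical
    "unit_price": [27.1898, 289.1328, 31.2383, 5.0],
    "historical_price": [27.1895, 239.63945, 27.72205, 0.0],
})

# A zero baseline would yield inf; treating it as missing gives NaN instead.
baseline = sample["historical_price"].replace(0, np.nan)
sample["relative_historical_difference"] = (sample["unit_price"] - baseline) / baseline

print(sample[["item_id", "relative_historical_difference"]])
```

A positive value means the auction is listed above its 7-day average price, and a negative value means it is listed below.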


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  limited_auction_df['relative_historical_difference'] = (


Save the new DataFrame to an updated CSV file

In [24]:

# A raw string needs single backslashes; doubling them would put literal '\\' in the path.
output_path = r'C:\Users\julio\Documents\work\auction_classic\data\historical\auction_small_updated.csv'
auction_df.to_csv(output_path, index=False)
print(f"New columns added and file saved as '{output_path}'")

New columns added and file saved as 'C:\\Users\\julio\\Documents\\work\\auction_classic\\data\\historical\\auction_small_limited_updated.csv'
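Hard-coded Windows paths tie the notebook to one machine. A minimal sketch, assuming a project-relative layout, that builds the path with `pathlib` and creates the folder if it is missing. The temporary base directory and the small DataFrame are stand-ins so the example runs anywhere:

```python
from pathlib import Path
import tempfile

import pandas as pd

# Stand-in for the project root; in the notebook this would be the real folder.
base_dir = Path(tempfile.mkdtemp())
output_path = base_dir / "data" / "historical" / "auction_small_updated.csv"

# Create missing parent folders so to_csv does not fail on a fresh checkout.
output_path.parent.mkdir(parents=True, exist_ok=True)

df = pd.DataFrame({"item_id": [13009, 24684], "historical_price": [27.1895, 83.9039]})
df.to_csv(output_path, index=False)

# Read the file back as a quick sanity check on what was written.
round_trip = pd.read_csv(output_path)
print(f"Saved {len(round_trip)} rows to '{output_path}'")
```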
