# Tibia Auction Market Analysis - Price Prediction Model

Building on the data exploration phase, this notebook builds a predictive model that estimates character auction prices. Using machine learning techniques, I aim to produce a price prediction system that helps players and traders assess a character's value.

## Analysis Goals:
1. **Correlation Analysis** - identifying key price drivers
2. **Feature Engineering** - creating vocation-specific variables  
3. **Model Development** - building and training ML algorithms
4. **Model Evaluation** - testing accuracy and feature importance

## Research Questions:
- Which character attributes have the strongest correlation with price?
- How accurately can we predict auction prices?
- What's the relative importance of different features?
- How do vocation-specific skills impact pricing?

## Expected Outcomes:
- Quantified relationships between character stats and prices
- Trained model capable of price prediction
- Feature importance ranking for pricing decisions


**Library imports**

In [2]:
# data manipulation and analysis
import pandas as pd
import numpy as np

# DB connection
import psycopg2
from sqlalchemy import create_engine, text
import os
from dotenv import load_dotenv

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# for ML
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler, LabelEncoder

# statistical analysis
from scipy.stats import pearsonr

print("Libraries imported successfully.")

Libraries imported successfully.


**Database connection**

In [3]:
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Database connection config from environment variables
DB_CONFIG = {
    'host': os.getenv('DB_HOST', 'localhost'),
    'port': os.getenv('DB_PORT', '5432'),
    'database': os.getenv('DB_NAME', 'auction_data'),
    'user': os.getenv('DB_USER', 'scraper'),
    'password': os.getenv('DB_PASSWORD')
}

# Create connection string
connection_string = f"postgresql://{DB_CONFIG['user']}:{DB_CONFIG['password']}@{DB_CONFIG['host']}:{DB_CONFIG['port']}/{DB_CONFIG['database']}"

# Create engine
engine = create_engine(connection_string)

print("Database connection configured.")


Database connection configured.
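
One caveat with building the connection string through an f-string: a password containing characters such as `@`, `/` or `:` produces an ambiguous URL, and an unset `DB_PASSWORD` is interpolated as the literal text `None`. A minimal sketch of one way to guard against both, using only the standard library, is shown below. The helper name `build_postgres_url` and the sample credentials are hypothetical and not part of the original notebook.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, host, port, database):
    """Build a postgresql:// URL with the user and password percent-encoded."""
    if password is None:
        # Fail early instead of silently embedding the string "None"
        raise ValueError("DB_PASSWORD is not set")
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical credentials, for illustration only
example_url = build_postgres_url(
    "scraper", "p@ss/w:rd", "localhost", "5432", "auction_data"
)
print(example_url)
```

The resulting string can be passed to `create_engine` in the same way as `connection_string`.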


In [5]:
# Indexes were created in data_exploration.ipynb

schema_query = """
SELECT 
    a.id AS auction_id,
    a.current_bid,
    a.auction_end,
    c.id AS character_id, 
    c.vocation_id, 
    c.level, 
    c.sex,
    c.achievement_points,
    c.boss_points,
    c.charm_total,
    c.charm_expansion,
    c.prey_slot,
    c.hunting_slot,
    c.transfer,
    c.gems_greater,
    c.outfits_count,
    c.mounts_count,
    c.store_mounts_count,
    c.store_outfits_count,
    c.hirelings_count,
    s.magic, s.axe, s.sword, s.club, s.distance, s.shielding, s.fist,
    w.pvp_type,
    w.battleye,
    w.location
FROM auctions a 
JOIN characters c ON a.character_id = c.id
LEFT JOIN skills s ON c.skills_id = s.id
LEFT JOIN worlds w ON c.world_id = w.id
WHERE a.has_been_bidded = true
    AND a.is_historical = true
"""
auctions_df = pd.read_sql(schema_query, engine)

# auction_end is stored as Unix epoch seconds; convert it and derive calendar features
auctions_df['auction_end_dt'] = pd.to_datetime(auctions_df['auction_end'], unit='s')
auctions_df['auction_month'] = auctions_df['auction_end_dt'].dt.month
auctions_df['auction_day_of_week'] = auctions_df['auction_end_dt'].dt.dayofweek  # Monday=0, Sunday=6

display(auctions_df.head())
# info() prints its summary directly and returns None, so it is not wrapped in print()
auctions_df.info()

| | auction_id | current_bid | auction_end | character_id | vocation_id | level | sex | achievement_points | boss_points | charm_total | ... | club | distance | shielding | fist | pvp_type | battleye | location | auction_end_dt | auction_month | auction_day_of_week |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1969938 | 5002 | 1759500000 | 1969938 | 1 | 508 | False | 931 | 5425 | 6565 | ... | 80.36 | 13.88 | 112.91 | 21.49 | Optional | False | BR | 2025-10-03 14:00:00 | 10 | 4 |
| 1 | 1967415 | 2000 | 1759136400 | 1967415 | 2 | 346 | False | 443 | 2610 | 2982 | ... | 12.94 | 115.01 | 103.26 | 16.17 | Open | False | | 2025-09-29 09:00:00 | 9 | 0 |
| 2 | 1919028 | 268 | 1753434000 | 1919028 | 1 | 199 | False | 45 | 10 | 240 | ... | 41.12 | 26.42 | 104.31 | 10.24 | Optional | False | EU | 2025-07-25 09:00:00 | 7 | 4 |
| 3 | 1311129 | 1772 | 1690966800 | 1311129 | 4 | 326 | False | 212 | 20 | 1045 | ... | 12.96 | 12.64 | 37.11 | 13.63 | Open | False | EU | 2023-08-02 09:00:00 | 8 | 2 |
| 4 | 907809 | 101 | 1656639900 | 907809 | 1 | 53 | True | 19 | 0 | 5 | ... | 40.42 | 24.61 | 98.18 | 12.32 | Retro Open | False | | 2022-07-01 01:45:00 | 7 | 4 |
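
To make the time-feature derivation concrete, the sketch below reproduces the conversion for the first row of the preview using only the standard library. It assumes the epoch values are interpreted as UTC, which is how `pd.to_datetime(..., unit='s')` treats them, and it relies on `datetime.weekday()` using the same Monday=0 convention as pandas' `dt.dayofweek`. The helper name `calendar_features` is hypothetical.

```python
from datetime import datetime, timezone


def calendar_features(epoch_seconds):
    """Derive the notebook's calendar features from a Unix timestamp (UTC)."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        # Drop tzinfo to mirror the naive datetimes pandas produces here
        "auction_end_dt": dt.replace(tzinfo=None),
        "auction_month": dt.month,
        # weekday(): Monday=0 ... Sunday=6, matching pandas dt.dayofweek
        "auction_day_of_week": dt.weekday(),
    }


# auction_end of the first preview row (auction_id 1969938)
features = calendar_features(1759500000)
print(features)
```

For this row the helper gives month 10 and day of week 4 (a Friday), in line with the `auction_month` and `auction_day_of_week` columns in the preview.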


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 815152 entries, 0 to 815151
Data columns (total 33 columns):
 #   Column               Non-Null Count   Dtype         
---  ------               --------------   -----         
 0   auction_id           815152 non-null  int64         
 1   current_bid          815152 non-null  int64         
 2   auction_end          815152 non-null  int64         
 3   character_id         815152 non-null  int64         
 4   vocation_id          815152 non-null  int64         
 5   level                815152 non-null  int64         
 6   sex                  815152 non-null  bool          
 7   achievement_points   815152 non-null  int64         
 8   boss_points          815152 non-null  int64         
 9   charm_total          815152 non-null  int64         
 10  charm_expansion      815152 non-null  bool          
 11  prey_slot            815152 non-null  bool          
 12  hunting_slot         815152 non-null  bool          
 13  transfer      