
In [25]:
%%bash
set -e
mkdir -p ai-trader/data/raw
mkdir -p ai-trader/data/processed
mkdir -p ai-trader/src/backtest
mkdir -p ai-trader/src/features
mkdir -p ai-trader/src/risk
mkdir -p ai-trader/configs
mkdir -p ai-trader/reports
mkdir -p ai-trader/docs
touch ai-trader/README.md


In [26]:
%%writefile ai-trader/configs/baseline.yaml
# Baseline configuration for the AI Trading Agent
market: "BTCUSDT"
timeframe: "15m"

sessions:
  - { start: "09:00", end: "11:00", tz: "America/Los_Angeles" }
  - { start: "13:00", end: "15:00", tz: "America/Los_Angeles" }

risk:
  risk_per_trade_pct: 0.01         # 1% of capital per trade
  daily_loss_limit_pct: 0.03       # 3% of capital per day

position_sizing:
  atr_period: 14
  atr_sl_mult: 1.2
  tp_mult: 2.0                     # 2R target (twice the risk)

data:
  source: "csv"                    # start with a local CSV
  min_history_days: 365
  min_history_days: 365

logging:
  trade_log: "reports/trades.csv"
  equity_log: "reports/equity.csv"


Writing ai-trader/configs/baseline.yaml
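The `risk` and `position_sizing` keys above imply a sizing rule, which the following minimal sketch spells out for a long trade. It assumes the stop sits `atr_sl_mult` ATRs below the entry, that the amount at risk is `risk_per_trade_pct` of capital, and that the target is `tp_mult` times the stop distance. The names `TradePlan` and `plan_long_trade` are hypothetical and not part of the project; fees and slippage are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TradePlan:
    quantity: float      # position size in base-asset units (e.g. BTC)
    stop_price: float    # exit price if the trade fails
    target_price: float  # exit price if the trade works


def plan_long_trade(capital: float, entry: float, atr: float,
                    risk_pct: float = 0.01,
                    atr_sl_mult: float = 1.2,
                    tp_mult: float = 2.0) -> TradePlan:
    """Size a long trade so that hitting the stop loses risk_pct of capital."""
    if capital <= 0 or entry <= 0 or atr <= 0:
        raise ValueError("capital, entry and atr must be positive")
    stop_distance = atr * atr_sl_mult       # price distance from entry to stop
    risk_amount = capital * risk_pct        # money lost if the stop is hit
    quantity = risk_amount / stop_distance  # units that lose exactly risk_amount
    return TradePlan(
        quantity=quantity,
        stop_price=entry - stop_distance,
        target_price=entry + tp_mult * stop_distance,
    )


# Illustrative numbers only: 10,000 of capital, entry at 16,500, ATR of 50.
plan = plan_long_trade(capital=10_000, entry=16_500, atr=50)
print(plan)
```

With these illustrative inputs the stop distance is 60, so the target sits 120 above the entry, which is the 2R relationship the config comment describes.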


In [27]:
%%writefile ai-trader/README.md
# AI Trader: Baseline Setup

## Goal
Build a trading agent based on **price action (candles)** with strict risk control.

## Initial settings
- Asset: BTC/USDT
- Timeframe: 15 minutes
- Risk per trade: 1% of capital
- Maximum daily loss: 3% of capital
- Position sizing: based on ATR(14)
- Trading sessions:
  - 09:00–11:00 PT
  - 13:00–15:00 PT

## Directory structure
ai-trader/
  ├─ data/
  │   ├─ raw/          # raw data (OHLCV CSV)
  │   └─ processed/    # cleaned data
  ├─ src/
  │   ├─ backtest/     # simulation engine
  │   ├─ features/     # indicators and price action
  │   └─ risk/         # sizing and kill-switch functions
  ├─ configs/          # YAML configs
  ├─ reports/          # trade and equity logs
  ├─ docs/             # weekly notes
  └─ README.md


Overwriting ai-trader/README.md
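The README mentions a kill-switch in `src/risk/` and a 3% maximum daily loss. One way such a guard could look is sketched below. It assumes the limit is measured against equity at the start of the day and that only realized PnL counts; the class name `DailyLossGuard` is hypothetical, and the project may define the rule differently.

```python
class DailyLossGuard:
    """Stops new trades once the day's realized loss reaches the limit."""

    def __init__(self, start_equity: float, daily_loss_limit_pct: float = 0.03):
        if start_equity <= 0:
            raise ValueError("start_equity must be positive")
        self.max_loss = start_equity * daily_loss_limit_pct  # loss allowed today
        self.realized_pnl = 0.0                              # running PnL today

    def record(self, pnl: float) -> None:
        """Add the PnL of a closed trade (negative for a loss)."""
        self.realized_pnl += pnl

    @property
    def halted(self) -> bool:
        """True once cumulative losses reach or exceed the daily limit."""
        return self.realized_pnl <= -self.max_loss


guard = DailyLossGuard(start_equity=10_000)
for pnl in (-100.0, -100.0):   # two full 1% losses
    guard.record(pnl)
print(guard.halted)            # still below the 3% limit
guard.record(-150.0)           # a third loss pushes the day past the limit
print(guard.halted)
```

A backtest loop would create a fresh guard at each day boundary and skip entries while `halted` is true.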


In [28]:
%%bash
echo "Structure created:"
find ai-trader -maxdepth 2 -type d | sort
echo
echo "Config file:"
sed -n '1,50p' ai-trader/configs/baseline.yaml


Structure created:
ai-trader
ai-trader/configs
ai-trader/data
ai-trader/data/processed
ai-trader/data/raw
ai-trader/docs
ai-trader/reports
ai-trader/src
ai-trader/src/backtest
ai-trader/src/features
ai-trader/src/risk

Config file:
# Baseline configuration for the AI Trading Agent
market: "BTCUSDT"
timeframe: "15m"

sessions:
  - { start: "09:00", end: "11:00", tz: "America/Los_Angeles" }
  - { start: "13:00", end: "15:00", tz: "America/Los_Angeles" }

risk:
  risk_per_trade_pct: 0.01         # 1% of capital per trade
  daily_loss_limit_pct: 0.03       # 3% of capital per day

position_sizing:
  atr_period: 14
  atr_sl_mult: 1.2
  tp_mult: 2.0                     # 2R target (twice the risk)

data:
  source: "csv"                    # start with a local CSV
  min_history_days: 365

logging:
  trade_log: "reports/trades.csv"
  equity_log: "reports/equity.csv"
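The `sessions` entries are given in Pacific time while the candle data is stored in UTC, so a session filter has to convert between the two. The sketch below shows one possible check using the standard-library `zoneinfo` module. It assumes each window is half-open (the start is included, the end is excluded), which the config does not specify; `SESSIONS` simply mirrors the YAML and `in_session` is a hypothetical helper name.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

# Mirrors the `sessions` block of baseline.yaml.
SESSIONS = [
    {"start": "09:00", "end": "11:00", "tz": "America/Los_Angeles"},
    {"start": "13:00", "end": "15:00", "tz": "America/Los_Angeles"},
]


def in_session(ts_utc: datetime, sessions=SESSIONS) -> bool:
    """Return True if a timezone-aware timestamp falls inside any session."""
    if ts_utc.tzinfo is None:
        raise ValueError("ts_utc must be timezone-aware")
    for s in sessions:
        # Convert to the session's local wall-clock time (handles DST).
        local = ts_utc.astimezone(ZoneInfo(s["tz"])).time()
        start = time.fromisoformat(s["start"])
        end = time.fromisoformat(s["end"])
        if start <= local < end:   # assumption: end of window is exclusive
            return True
    return False


# 17:30 UTC in January is 09:30 in Los Angeles (standard time).
print(in_session(datetime(2024, 1, 15, 17, 30, tzinfo=timezone.utc)))
# 16:30 UTC in July is 09:30 in Los Angeles (daylight time).
print(in_session(datetime(2024, 7, 15, 16, 30, tzinfo=timezone.utc)))
```

Converting per timestamp rather than hard-coding a UTC offset keeps the windows aligned with local time across daylight-saving changes.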


In [29]:
%%bash
set -e
mkdir -p ai-trader/data/raw
mkdir -p ai-trader/data/processed
pip -q install pandas pyarrow


In [32]:
from datetime import datetime, timezone

SYMBOL   = "BTCUSDT"
INTERVAL = "15m"

# 2 years of history (as agreed for Day 2)
START = "2023-01-01 00:00:00"  # UTC
END   = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")  # current time, UTC
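These parameters decide which monthly archives the download cell requests. The standard-library sketch below lists the month labels and URLs for a date range, following the same URL pattern the notebook uses. The helper names `month_labels` and `archive_url` are hypothetical, and nothing is downloaded.

```python
from datetime import datetime

BASE = "https://data.binance.vision/data/spot/monthly/klines"


def month_labels(start: datetime, end: datetime) -> list[str]:
    """Return "YYYY-MM" labels for every month from start to end, inclusive."""
    labels = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        labels.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:           # roll over into the next year
            year, month = year + 1, 1
    return labels


def archive_url(symbol: str, interval: str, label: str) -> str:
    """Build the archive URL for one symbol, interval and month label."""
    return f"{BASE}/{symbol}/{interval}/{symbol}-{interval}-{label}.zip"


for label in month_labels(datetime(2023, 11, 15), datetime(2024, 2, 1)):
    print(archive_url("BTCUSDT", "15m", label))
```

Listing the URLs first makes it easy to see how many requests a given `START`/`END` pair produces before running the real download.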


In [35]:
import io, zipfile, requests, pandas as pd
from datetime import datetime
from dateutil.relativedelta import relativedelta

BASE = "https://data.binance.vision/data/spot/monthly/klines"

COLS = [
    "open_time","open","high","low","close","volume",
    "close_time","quote_volume","num_trades",
    "taker_base_vol","taker_quote_vol","ignore"
]

def month_range(start_dt: datetime, end_dt: datetime):
    """Yield "YYYY-MM" labels for every month from start_dt to end_dt, inclusive."""
    cur = datetime(start_dt.year, start_dt.month, 1)
    endm = datetime(end_dt.year, end_dt.month, 1)
    while cur <= endm:
        yield cur.strftime("%Y-%m")
        cur += relativedelta(months=1)

def download_month(symbol: str, interval: str, ym: str) -> pd.DataFrame | None:
    """Download one monthly kline archive; return None if it is missing or unreadable."""
    fn = f"{symbol}-{interval}-{ym}.zip"
    url = f"{BASE}/{symbol}/{interval}/{fn}"
    try:
        r = requests.get(url, timeout=60)
        r.raise_for_status()
        z = zipfile.ZipFile(io.BytesIO(r.content))
    except (requests.RequestException, zipfile.BadZipFile):
        # Month not published yet, network error, or corrupt archive
        return None

    inner = [n for n in z.namelist() if n.endswith(".csv")]
    if not inner:
        return None

    with z.open(inner[0]) as f:
        df = pd.read_csv(
            f,
            header=None,
            names=COLS,
            dtype=str,            # read everything as strings to avoid overflow
            on_bad_lines="skip"   # skip corrupted lines
        )

    # Safe numeric conversion. Timestamps are included because calling
    # pd.to_datetime(..., unit="ms") on strings raises a FutureWarning in pandas 2.x.
    for c in ["open_time","close_time","open","high","low","close","volume",
              "quote_volume","num_trades","taker_base_vol","taker_quote_vol"]:
        df[c] = pd.to_numeric(df[c], errors="coerce")

    # Convert timestamps; values that are not valid milliseconds become NaT
    df["open_time"]  = pd.to_datetime(df["open_time"], unit="ms", utc=True, errors="coerce")
    df["close_time"] = pd.to_datetime(df["close_time"], unit="ms", utc=True, errors="coerce")

    # Drop invalid rows
    df = df.dropna(subset=["open_time","close_time"]).reset_index(drop=True)
    return df

def load_binance_vision(symbol: str, interval: str, start_str: str, end_str: str) -> pd.DataFrame:
    """Load every available month in the range and return one sorted, de-duplicated frame."""
    start_dt = datetime.strptime(start_str, "%Y-%m-%d %H:%M:%S")
    end_dt   = datetime.strptime(end_str,   "%Y-%m-%d %H:%M:%S")

    frames = []
    for ym in month_range(start_dt, end_dt):
        dfm = download_month(symbol, interval, ym)
        if dfm is None:
            continue
        frames.append(dfm)

    if not frames:
        raise RuntimeError("No monthly files found. Check symbol/interval/period.")

    df = pd.concat(frames, ignore_index=True)

    # Sort and remove duplicates
    df = df.sort_values("open_time").drop_duplicates(subset=["open_time"]).reset_index(drop=True)

    # Trim to exactly the requested range
    df = df[(df["open_time"] >= pd.to_datetime(start_str, utc=True)) &
            (df["open_time"] <= pd.to_datetime(end_str,   utc=True))]

    return df.reset_index(drop=True)
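Because `download_month` parses timestamps as milliseconds with `errors="coerce"` and then drops the resulting `NaT` rows, a file that stores timestamps in a different unit would disappear without any error. The run output later in the notebook ends at 2024-12-31 even though `END` is the current time, which would be consistent with newer files using microseconds, although the notebook does not confirm the cause. The sketch below shows one way to normalize both units with the standard library. The magnitude threshold and the name `epoch_to_utc` are assumptions for illustration, not part of the notebook.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumption: values at or above 1e14 cannot be plausible millisecond
# timestamps (1e14 ms is roughly the year 5138), so treat them as microseconds.
MICROSECOND_THRESHOLD = 10**14


def epoch_to_utc(raw: int) -> datetime:
    """Convert an epoch timestamp in ms or µs to a timezone-aware UTC datetime."""
    micros = raw if raw >= MICROSECOND_THRESHOLD else raw * 1000
    return EPOCH + timedelta(microseconds=micros)


print(epoch_to_utc(1672531200000))       # millisecond input
print(epoch_to_utc(1735689600000000))    # microsecond input
```

In the pandas loader the same idea could be applied column-wise before calling `pd.to_datetime`, after checking which unit the affected files actually use.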


In [39]:
from datetime import datetime, timezone
import os, pandas as pd

# Day 2 parameters
SYMBOL   = "BTCUSDT"
INTERVAL = "15m"
START    = "2023-01-01 00:00:00"
END      = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# Uses the functions defined above (download_month/load_binance_vision)
raw = load_binance_vision(SYMBOL, INTERVAL, START, END)

# Rename and validate
df = raw.rename(columns={"open_time": "timestamp"})
assert df["timestamp"].dt.tz is not None, "timestamp must be in UTC."

expected_delta = pd.Timedelta(minutes=15)
deltas = df["timestamp"].diff().dropna()
gaps = int((deltas != expected_delta).sum())

print("📊 Data summary")
print(f"- Rows:       {len(df):,}")
print(f"- Period:     {df['timestamp'].min()}  →  {df['timestamp'].max()}")
print(f"- 15m gaps:   {gaps}")
print(f"- Duplicates: {len(df) - len(df.drop_duplicates('timestamp'))}")

# Save
os.makedirs("ai-trader/data/raw", exist_ok=True)
os.makedirs("ai-trader/data/processed", exist_ok=True)

ohlcv = df[["timestamp","open","high","low","close","volume","close_time","num_trades"]]

raw_csv      = "ai-trader/data/raw/btcusdt_15m_raw.csv"
proc_csv     = "ai-trader/data/processed/btcusdt_15m.csv"
proc_parquet = "ai-trader/data/processed/btcusdt_15m.parquet"

ohlcv.to_csv(raw_csv, index=False)

processed = ohlcv.drop(columns=["close_time"]).sort_values("timestamp").reset_index(drop=True)
processed.to_csv(proc_csv, index=False)
processed.to_parquet(proc_parquet, index=False)

print("\n✅ Saved successfully:")
print("-", raw_csv)
print("-", proc_csv)
print("-", proc_parquet)


📊 Data summary
- Rows:       70,171
- Period:     2023-01-01 00:00:00+00:00  →  2024-12-31 23:45:00+00:00
- 15m gaps:   1
- Duplicates: 0

✅ Saved successfully:
- ai-trader/data/raw/btcusdt_15m_raw.csv
- ai-trader/data/processed/btcusdt_15m.csv
- ai-trader/data/processed/btcusdt_15m.parquet
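The summary reports how many gaps exist but not where they are or how many candles are missing. A continuous 15-minute series over the reported period would have 731 days × 96 candles = 70,176 rows, so the 70,171 rows loaded are 5 short. The standard-library sketch below locates gaps and counts the missing candles in each. It differs from the notebook's check in that it reports only deltas larger than the step; `find_gaps` and `expected_rows` are hypothetical helper names, and the timestamps in the demo are made up.

```python
from datetime import datetime, timedelta, timezone

STEP = timedelta(minutes=15)


def expected_rows(start: datetime, end: datetime, step: timedelta = STEP) -> int:
    """Number of candles in a gap-free series from start to end, inclusive."""
    return (end - start) // step + 1


def find_gaps(timestamps, step: timedelta = STEP):
    """Return (previous, next, missing_candles) for each jump larger than step."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        if delta > step:
            gaps.append((prev, cur, delta // step - 1))
    return gaps


# Row count implied by the period shown in the summary.
start = datetime(2023, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 12, 31, 23, 45, tzinfo=timezone.utc)
print(expected_rows(start, end))   # 70176

# Made-up series with one 90-minute jump, i.e. 5 missing candles.
demo = [start, start + STEP, start + 2 * STEP, start + 8 * STEP]
for prev, cur, missing in find_gaps(demo):
    print(prev, "->", cur, "missing:", missing)
```

On the real data, passing `df["timestamp"].tolist()` to `find_gaps` would show whether the single reported gap accounts for all 5 missing candles.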


In [40]:
import os, pandas as pd

paths = [
    "ai-trader/data/raw/btcusdt_15m_raw.csv",
    "ai-trader/data/processed/btcusdt_15m.csv",
    "ai-trader/data/processed/btcusdt_15m.parquet",
]

for p in paths:
    ok = os.path.exists(p)
    size = os.path.getsize(p) if ok else 0
    print(f"{'OK' if ok else 'X '}: {p} ({size/1_000_000:.2f} MB)")

# Sample of the processed data
df_check = pd.read_csv("ai-trader/data/processed/btcusdt_15m.csv", nrows=5)
print("\nPreview of the processed data:")
print(df_check)


OK: ai-trader/data/raw/btcusdt_15m_raw.csv (7.70 MB)
OK: ai-trader/data/processed/btcusdt_15m.csv (5.38 MB)
OK: ai-trader/data/processed/btcusdt_15m.parquet (3.60 MB)

Preview of the processed data:
                   timestamp      open      high       low     close  \
0  2023-01-01 00:00:00+00:00  16541.77  16544.76  16520.00  16520.69   
1  2023-01-01 00:15:00+00:00  16521.26  16545.70  16517.72  16544.19   
2  2023-01-01 00:30:00+00:00  16544.19  16544.61  16508.39  16515.43   
3  2023-01-01 00:45:00+00:00  16515.91  16536.84  16515.43  16529.67   
4  2023-01-01 01:00:00+00:00  16529.59  16541.80  16525.78  16538.21   

       volume  num_trades  
0  1172.53835       37484  
1  1102.62888       33528  
2  1365.65633       48518  
3   724.01214       30324  
4   977.24680       32200  
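The config sets `atr_period: 14` for position sizing, and the processed file now holds the high, low and close columns an ATR needs. As a rough sketch of that calculation, the code below computes true ranges and a simple-moving-average ATR in plain Python. The averaging method is an assumption (Wilder's smoothing is a common alternative), and `true_ranges` and `atr_sma` are hypothetical names. The sample rows are the five candles from the preview above, so the demo uses a period of 5 rather than 14.

```python
def true_ranges(candles):
    """candles: list of (high, low, close) tuples in time order."""
    ranges = []
    prev_close = None
    for high, low, close in candles:
        if prev_close is None:
            tr = high - low   # first candle has no previous close
        else:
            tr = max(high - low, abs(high - prev_close), abs(low - prev_close))
        ranges.append(tr)
        prev_close = close
    return ranges


def atr_sma(candles, period=14):
    """Simple average of the last `period` true ranges, or None if too few."""
    ranges = true_ranges(candles)
    if len(ranges) < period:
        return None
    return sum(ranges[-period:]) / period


# (high, low, close) taken from the five preview rows.
preview = [
    (16544.76, 16520.00, 16520.69),
    (16545.70, 16517.72, 16544.19),
    (16544.61, 16508.39, 16515.43),
    (16536.84, 16515.43, 16529.67),
    (16541.80, 16525.78, 16538.21),
]
print(atr_sma(preview, period=5))
print(atr_sma(preview))   # None: fewer than 14 candles
```

The resulting value is what a sizing rule would multiply by `atr_sl_mult` to obtain the stop distance.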
