# Stock Chart Pattern Recognition with Deep Learning
## CRISP-DM Methodology

1. Business Understanding

2. Data Understanding
Data Collection: Gather historical stock price data (OHLC: Open, High, Low, Close) and Volume.

Data Exploration: Inspect candlestick or line charts to identify periods where a Head and Shoulders (HS) pattern actually occurred in the past, and use them as a guide for labeling.
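Before labeling, it can help to confirm that the collected bars are internally consistent. The sketch below is an illustration rather than part of the original notebook: `check_ohlcv` is a hypothetical helper, the sample values are made up, and it assumes a DataFrame with `open`, `high`, `low`, `close`, and `volume` columns like the one used in the later cells.

```python
import pandas as pd

def check_ohlcv(df: pd.DataFrame) -> dict:
    """Count rows that violate basic OHLCV consistency rules (hypothetical helper)."""
    cols = ["open", "high", "low", "close", "volume"]
    body_high = df[["open", "close"]].max(axis=1)
    body_low = df[["open", "close"]].min(axis=1)
    return {
        "missing_values": int(df[cols].isna().any(axis=1).sum()),
        "high_below_body": int((df["high"] < body_high).sum()),
        "low_above_body": int((df["low"] > body_low).sum()),
        "negative_volume": int((df["volume"] < 0).sum()),
    }

# Made-up sample: the second bar has a high that sits below its close.
sample = pd.DataFrame({
    "open":   [10.0, 10.5, 10.2],
    "high":   [10.6, 10.4, 10.8],
    "low":    [9.9, 10.1, 10.0],
    "close":  [10.5, 10.6, 10.7],
    "volume": [1000, 1200, 900],
})
report = check_ohlcv(sample)
print(report)
```

Any non-zero count points at bars worth inspecting by eye before they are used for labeling.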

🟦 Cell 1: Install & Import Library

In [None]:
# On first use, uncomment the line below
# !pip install settrade-v2 cassandra-driver mplfinance tensorflow scikit-learn scipy python-dotenv

import os
import numpy as np
import pandas as pd
from dotenv import load_dotenv

from cassandra.cluster import Cluster
from settrade_v2 import Investor

from scipy.signal import find_peaks
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

import tensorflow as tf
from tensorflow.keras import layers


🟦 Cell 2: Load ENV & Connect SETTRADE

In [None]:
load_dotenv()

investor = Investor(
    app_id=os.getenv("SETTRADE_APP_ID"),
    app_secret=os.getenv("SETTRADE_APP_SECRET"),
    broker_id=os.getenv("SETTRADE_BROKER_ID"),
    app_code=os.getenv("SETTRADE_APP_CODE"),
    is_auto_queue=False
)

market = investor.MarketData()
print("✅ Connected to SETTRADE API")


🟦 Cell 3: Connect Cassandra

In [None]:
cluster = Cluster(['127.0.0.1'])
session = cluster.connect('stock_data')

print("✅ Connected to Cassandra")

🟦 Cell 4: Fetch OHLCV from SETTRADE

In [None]:
df = market.get_candlestick(
    symbol="PTT",
    timeframe="1D",
    limit=600
)

df.head()


🟦 Cell 5: Save OHLCV to Cassandra

In [None]:
insert_stmt = session.prepare("""
INSERT INTO ohlcv_by_symbol
(symbol, timeframe, candle_time, open, high, low, close, volume)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
""")

for _, r in df.iterrows():
    session.execute(insert_stmt, (
        "PTT", "1D",
        r['datetime'],
        float(r['open']),
        float(r['high']),
        float(r['low']),
        float(r['close']),
        float(r['volume'])
    ))


🟦 Cell 6: Load Data from Cassandra

In [None]:
rows = session.execute("""
SELECT * FROM ohlcv_by_symbol
WHERE symbol='PTT' AND timeframe='1D'
""")

df = pd.DataFrame(rows).sort_values("candle_time")
df.head()


3. Data Preparation
Data Labeling: Label which parts of the chart are the "Head", "Left Shoulder", "Right Shoulder", and "Neckline".

Image Transformation: If a CNN (Convolutional Neural Network) model is used, the numeric price data must be converted into chart images (for example, a 2D candlestick chart) so that the model processes them the way a human eye would.

Normalization: Scale prices appropriately so that the model learns more effectively.
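To make the image transformation step concrete, here is a minimal sketch that rasterizes one window of closing prices into a small binary image. It is an assumption-laden illustration: `window_to_image` is a hypothetical helper, it draws a line chart rather than full candlesticks, and the image height is arbitrary.

```python
import numpy as np

def window_to_image(close: np.ndarray, height: int = 32) -> np.ndarray:
    """Rasterize a 1-D close series into a (height, len(close)) binary image."""
    close = np.asarray(close, dtype=float)
    lo, hi = close.min(), close.max()
    span = hi - lo if hi > lo else 1.0  # avoid dividing by zero on a flat series
    # Map each price to a row index; row 0 is the top of the image (highest price).
    rows = np.round((hi - close) / span * (height - 1)).astype(int)
    image = np.zeros((height, close.size), dtype=np.uint8)
    image[rows, np.arange(close.size)] = 1
    return image

prices = np.array([10.0, 11.0, 10.5, 12.0, 10.5, 11.0, 10.0])
img = window_to_image(prices, height=5)
print(img)
```

Because the rows are scaled by the window's own minimum and maximum, this transformation also normalizes the price range, so the same shape at different price levels produces the same image.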

🟦 Cell 7: Sliding Window

In [None]:
WINDOW = 60

def create_windows(df):
    windows = []
    for i in range(len(df) - WINDOW):
        windows.append(df.iloc[i:i+WINDOW])
    return windows

windows = create_windows(df)
len(windows)


🟦 Cell 8: Normalization

In [None]:
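def normalize_window(w):
    w = w.copy()  # leave the caller's DataFrame untouched
    price_cols = ['open', 'high', 'low', 'close']
    w[price_cols] = w[price_cols] / w['close'].iloc[0]  # prices relative to the first close
    w['volume'] = w['volume'] / max(w['volume'].max(), 1)  # volume scaled to [0, 1]
    return w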


🟦 Cell 9: Labeling (Rule-based)

In [None]:
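def label_hs(close):
    """Rule-based label: 1 if the last three peaks form a head-and-shoulders shape."""
    # Peaks must be at least 5 bars apart to count as separate swings.
    peaks, _ = find_peaks(close, distance=5)
    if len(peaks) < 3:
        return 0

    # Left shoulder, head, right shoulder: the head must top both shoulders.
    l, h, r = peaks[-3:]
    return int(close[h] > close[l] and close[h] > close[r])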
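The rule above only checks that the middle peak is the highest. A stricter variant could also require roughly equal shoulders and troughs (the neckline area) that sit below both shoulders. The sketch below is one possible reading, not the notebook's method: `label_hs_strict`, `local_peaks`, and the 5% shoulder tolerance are assumptions, and a plain NumPy local-maximum test stands in for `find_peaks` so the example has no SciPy dependency.

```python
import numpy as np

def local_peaks(close: np.ndarray) -> np.ndarray:
    """Indices of strict local maxima (a simple stand-in for scipy's find_peaks)."""
    close = np.asarray(close, dtype=float)
    mid = close[1:-1]
    return np.where((mid > close[:-2]) & (mid > close[2:]))[0] + 1

def label_hs_strict(close, shoulder_tol: float = 0.05) -> int:
    """Return 1 if the last three peaks look like a head and shoulders, else 0.

    shoulder_tol is an assumed tolerance: the shoulder heights may differ by
    at most this fraction of the head height.
    """
    close = np.asarray(close, dtype=float)
    peaks = local_peaks(close)
    if len(peaks) < 3:
        return 0
    l, h, r = peaks[-3:]
    head_is_highest = close[h] > close[l] and close[h] > close[r]
    shoulders_match = abs(close[l] - close[r]) / close[h] <= shoulder_tol
    # Neckline area: the lowest closes between each shoulder and the head.
    neck_left = close[l:h + 1].min()
    neck_right = close[h:r + 1].min()
    troughs_below_shoulders = max(neck_left, neck_right) < min(close[l], close[r])
    return int(head_is_highest and shoulders_match and troughs_below_shoulders)

hs_like = np.array([1.0, 1.2, 1.0, 1.5, 1.0, 1.21, 1.0])
uptrend = np.array([1.0, 1.2, 1.1, 1.4, 1.3, 1.6, 1.5])
print(label_hs_strict(hs_like), label_hs_strict(uptrend))
```

Tightening the rule reduces the number of positive labels, so the class balance should be checked before training on the result.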


4. Modeling
Model Selection: Choose a suitable deep learning architecture, for example:

CNN: Detects shapes in chart images.

LSTM / GRU: Analyzes the data as a time series to capture the ordering of price movements.

Training: Train the model on the labeled dataset, optionally using data augmentation to increase the number of samples. Accuracy, Precision, Recall, and the Confusion Matrix on the validation set are then used to compare which algorithm gives the best technical results.
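As one way to picture the data augmentation step, the sketch below adds rescaled, slightly noisy copies of each window. It is an illustration under stated assumptions: `augment_windows` is a hypothetical helper, the 2% scale range and noise level are arbitrary, and the input is assumed to have the `(samples, 60, 5)` shape used by the dataset cells.

```python
import numpy as np

def augment_windows(X, y, copies: int = 2, noise_std: float = 0.002, seed: int = 0):
    """Return X and y extended with noisy, rescaled copies of every window."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_parts, y_parts = [X], [y]
    for _ in range(copies):
        # One random scale factor per window keeps the chart shape intact.
        scale = rng.uniform(0.98, 1.02, size=(X.shape[0], 1, 1))
        jitter = rng.normal(0.0, noise_std, size=X.shape)
        X_parts.append(X * scale + jitter)
        y_parts.append(y)  # labels are unchanged by the perturbation
    return np.concatenate(X_parts), np.concatenate(y_parts)

X_demo = np.ones((4, 60, 5))
y_demo = np.array([0, 1, 0, 1])
X_aug, y_aug = augment_windows(X_demo, y_demo, copies=2)
print(X_aug.shape, y_aug.shape)
```

Augmentation of this kind should be applied to the training split only, so that the validation metrics still describe unmodified data.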

🟦 Cell 10: Build Dataset

In [None]:
X, y = [], []

for w in windows:
    w = normalize_window(w.copy())
    X.append(w[['open','high','low','close','volume']].values)
    y.append(label_hs(w['close'].values))

X = np.array(X)
y = np.array(y)

X.shape, y.shape


🟦 Cell 11: Train / Validation Split

In [None]:
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, shuffle=False
)


🟦 Cell 12: LSTM Model

In [None]:
model = tf.keras.Sequential([
    layers.LSTM(64, return_sequences=True, input_shape=(60,5)),
    layers.LSTM(32),
    layers.Dense(1, activation='sigmoid')
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.summary()


🟦 Cell 13: Training

In [None]:
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=30,
    batch_size=32
)


5. Evaluation
Technical Metrics: Check Loss, Precision, and Recall to see how often the model flags an HS pattern that is not there (False Positive) or misses a pattern that actually occurred (False Negative).

Backtesting: Test the model on real historical stock data that it has never seen (Unseen Data).
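To show how precision and recall relate to false positives and missed patterns, the following sketch computes them directly from the four confusion-matrix counts. `binary_metrics` is a hypothetical helper and the labels are made up for illustration.

```python
import numpy as np

def binary_metrics(y_true, y_pred) -> dict:
    """Compute confusion counts, precision, recall, and accuracy for 0/1 labels."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # HS correctly detected
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false alarm
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # missed pattern
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / y_true.size,
    }

m = binary_metrics([1, 0, 1, 1, 0, 0], [1, 1, 0, 1, 0, 0])
print(m)
```

Low precision means many false alarms, while low recall means many real patterns were missed. When HS windows are rare, accuracy alone can look high even if both are poor.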

🟦 Cell 14: Evaluation

In [None]:
# Flatten the (n, 1) sigmoid output to a 1-D vector of 0/1 predictions.
y_pred = (model.predict(X_val) > 0.5).astype(int).ravel()

print(classification_report(y_val, y_pred))
print(confusion_matrix(y_val, y_pred))


🟦 Cell 15: Backtesting

In [None]:
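capital = 100000

# prices[i] is the close on the last bar of validation window i; the extra
# leading element makes prices[i + 1] (the next day's close) valid for every i.
prices = df['close'].values[-(len(y_pred) + 1):]

for i, signal in enumerate(np.ravel(y_pred)):
    # Short one share for the next bar when HS is detected, otherwise stay flat.
    position = -1 if signal == 1 else 0
    capital += position * (prices[i + 1] - prices[i])

capital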


6. Deployment
System Integration: Connect the model to a trading program or dashboard to send real-time alerts when an HS pattern is forming.

Monitoring: Track performance continuously, because stock market behavior can change with economic conditions, which may require retraining the model in the future.
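The alerting and monitoring ideas can be sketched with a small in-memory tracker. Everything here is hypothetical: `SignalMonitor`, the 0.7 alert threshold, and the 0.5 minimum hit rate are illustrative assumptions, and a real deployment would persist outcomes rather than keep them in memory.

```python
from collections import deque

class SignalMonitor:
    """Track recent alerts and flag when their hit rate drops (hypothetical helper)."""

    def __init__(self, alert_threshold: float = 0.7, window: int = 50,
                 min_precision: float = 0.5):
        self.alert_threshold = alert_threshold
        self.min_precision = min_precision
        # 1 = the alert turned out to be correct, 0 = false alarm.
        self.outcomes = deque(maxlen=window)

    def should_alert(self, probability: float) -> bool:
        """Alert only when the model's probability clears the threshold."""
        return probability >= self.alert_threshold

    def record_outcome(self, was_correct: bool) -> None:
        self.outcomes.append(1 if was_correct else 0)

    def needs_retrain(self) -> bool:
        """Suggest retraining once a full window of outcomes shows a low hit rate."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return sum(self.outcomes) / len(self.outcomes) < self.min_precision

monitor = SignalMonitor(alert_threshold=0.7, window=4, min_precision=0.5)
for correct in [True, False, False, False]:
    monitor.record_outcome(correct)
print(monitor.should_alert(0.82), monitor.needs_retrain())
```

The value passed to `should_alert` would be the probability returned by the real-time detection function, and `record_outcome` would be called once it is known whether the flagged pattern actually played out.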

🟦 Cell 16: Real-time Detection

In [None]:
def realtime_detect(symbol="PTT"):
    df = market.get_candlestick(symbol, "1D", 60)
    w = normalize_window(df.copy())
    X = w[['open','high','low','close','volume']].values
    prob = model.predict(X.reshape(1,60,5))[0][0]
    return prob

realtime_detect("PTT")


🟦 Cell 17: Save Signal to Cassandra

In [None]:
session.execute("""
INSERT INTO hs_signal
(symbol, timeframe, detect_time, probability)
VALUES (%s, %s, toTimestamp(now()), %s)
""", ("PTT", "1D", float(realtime_detect())))
