## This is just gpt bullshit, gonna need to change it later



## High-Level Architecture

### **Modules / Files**

| File                      | Purpose                                                               |
| ------------------------- | --------------------------------------------------------------------- |
| `main.py`                 | Orchestrates the full pipeline (scrape → features → predict → email). |
| `scraper.py`              | Contains your `OpenInsiderScraper` class and data saving logic.       |
| `features.py`             | Handles feature creation and data cleaning.                           |
| `model.py`                | Loads and applies the trained XGBoost model.                          |
| `notify.py`               | Sends email alerts or notifications.                                  |
| `config.py`               | Holds paths, email credentials, and thresholds.                       |
| `run_daily.sh` (optional) | A cron-friendly script to run `python main.py` daily.                 |

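As a rough sketch of what `config.py` might hold, the settings can be read from environment variables so that no credentials are committed to the repository. The `Config` class, the `load_config` function, and the variable names (`MODEL_PATH`, `ALERT_THRESHOLD`, `SMTP_USER`, `SMTP_PASSWORD`, `ALERT_RECIPIENT`) are all placeholders for illustration, not part of the existing code:

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Config:
    """Pipeline settings (hypothetical field names)."""

    model_path: Path
    alert_threshold: float
    smtp_user: str
    smtp_password: str
    recipient: str


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build the settings from environment variables, with safe defaults."""
    smtp_user = env.get("SMTP_USER", "")
    return Config(
        model_path=Path(env.get("MODEL_PATH", "xgboost_model.joblib")),
        alert_threshold=float(env.get("ALERT_THRESHOLD", "0.7")),
        smtp_user=smtp_user,
        smtp_password=env.get("SMTP_PASSWORD", ""),
        # Alerts go back to the sender unless a recipient is set explicitly.
        recipient=env.get("ALERT_RECIPIENT", smtp_user),
    )
```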
---

## Workflow Overview

### **Step 1 — Scrape Data**

```python
from scraper import OpenInsiderScraper

scraper = OpenInsiderScraper()
data = scraper.scrape()  # returns a DataFrame of today's trades
```

### **Step 2 — Feature Engineering**

```python
from features import create_features

features_df = create_features(data)
```

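* Handle missing values and encode categorical fields here, using the same transformations that were applied at training time.
* Make sure the feature columns appear in the same order as the model’s training columns; a mismatch can raise a feature-name error or silently produce wrong predictions.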

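A minimal sketch of what `create_features` could look like. The scraper's output columns are not shown in these notes, so the column names (`ticker`, `trade_type`, `qty`, `price`) and the `TRAINING_COLUMNS` list are invented for illustration; substitute the real ones:

```python
import pandas as pd

# Hypothetical training-time column order; replace with the model's real list.
TRAINING_COLUMNS = ["qty", "price", "value", "trade_type_P", "trade_type_S"]


def create_features(data: pd.DataFrame) -> pd.DataFrame:
    """Clean raw trades and return features in the training column order."""
    # Keep the ticker as the index so it survives into the prediction step
    # without becoming a model input.
    df = data.set_index("ticker")

    # Coerce numeric fields; unparseable values become NaN and are filled with 0.
    for col in ("qty", "price"):
        df[col] = pd.to_numeric(df[col], errors="coerce").fillna(0.0)
    df["value"] = df["qty"] * df["price"]

    # One-hot encode the categorical field.
    df = pd.get_dummies(df, columns=["trade_type"], dtype=float)

    # reindex adds any dummy column missing from today's data (filled with 0),
    # drops columns the model never saw, and enforces the training order.
    return df.reindex(columns=TRAINING_COLUMNS, fill_value=0.0)
```

Using `reindex` against a fixed column list is one way to guard against a day on which some category never occurs and its dummy column would otherwise be missing.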
### **Step 3 — Load and Run Model**

```python
from model import load_model, predict_trades

model = load_model("xgboost_model.joblib")
predictions = predict_trades(model, features_df)
```

* `predict_trades` returns a DataFrame with one row per trade: the ticker, the predicted class, and the model's probability for that trade (`prob_up`, which the next step filters on).

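One possible shape for `model.py`, sketched under several assumptions: the model was saved with `joblib.dump`, it is a binary classifier exposing `predict_proba` (as scikit-learn style XGBoost classifiers do), class 1 means "up", and `features` is indexed by ticker. The 0.5 cutoff for the `prediction` column is likewise an arbitrary placeholder:

```python
import pandas as pd


def load_model(path):
    """Load a model that was saved with joblib.dump during training."""
    import joblib  # imported here so the module loads even without joblib

    return joblib.load(path)


def predict_trades(model, features: pd.DataFrame) -> pd.DataFrame:
    """Score each row and return ticker, predicted class, and probability."""
    # predict_proba returns one column per class; column 1 is assumed to be "up".
    prob_up = model.predict_proba(features)[:, 1]
    return pd.DataFrame(
        {
            "ticker": features.index,  # assumes the index holds the tickers
            "prediction": (prob_up >= 0.5).astype(int),  # placeholder cutoff
            "prob_up": prob_up,
        }
    )
```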
### **Step 4 — Filter and Format Results**

```python
threshold = 0.7  # e.g., only include predictions above 70% probability
alerts = predictions[predictions["prob_up"] > threshold]
```
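
The snippet above only filters. For the formatting half of this step, something along these lines might work; `format_alerts` and the `confidence` column are hypothetical names, and the `ticker` column is assumed to exist in `predictions`:

```python
import pandas as pd


def format_alerts(predictions: pd.DataFrame, threshold: float = 0.7) -> pd.DataFrame:
    """Keep rows above the threshold, strongest first, with readable percentages."""
    alerts = predictions[predictions["prob_up"] > threshold]
    alerts = alerts.sort_values("prob_up", ascending=False).reset_index(drop=True)
    # Add a display column; prob_up stays numeric for any later processing.
    return alerts.assign(confidence=alerts["prob_up"].map("{:.1%}".format))
```

Sorting by probability puts the strongest signals at the top of the email, which helps when the list is long.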

### **Step 5 — Send Email Notification**

```python
from notify import send_email

if not alerts.empty:
    send_email(alerts, subject="OpenInsider Daily Trade Alerts")
```

---

## Example `main.py`

```python
from datetime import datetime

from features import create_features
from model import load_model, predict_trades
from notify import send_email
from scraper import OpenInsiderScraper

ALERT_THRESHOLD = 0.7  # minimum prob_up for a trade to be emailed


def main():
    print("Starting OpenInsider pipeline...")

    # 1. Scrape data
    scraper = OpenInsiderScraper()
    data = scraper.scrape()
    if data.empty:
        # Nothing was scraped (e.g. no filings today), so skip the remaining steps.
        print("No trades scraped today.")
        return

    # 2. Feature engineering
    features = create_features(data)

    # 3. Load model and predict
    model = load_model("xgboost_model.joblib")
    results = predict_trades(model, features)

    # 4. Filter
    alerts = results[results["prob_up"] > ALERT_THRESHOLD]

    # 5. Send email if any alerts
    if not alerts.empty:
        send_email(alerts, subject=f"OpenInsider Alerts - {datetime.now():%Y-%m-%d}")
        print("Alerts sent.")
    else:
        print("No high-confidence alerts today.")

if __name__ == "__main__":
    main()
```

---

## Email Example (`notify.py`)

```python
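import os
import smtplib
from email.mime.text import MIMEText


def send_email(df, subject):
    """Email the alerts table as plain text through Gmail's SMTP server."""
    # Read credentials from the environment instead of hardcoding them.
    # SMTP_USER, SMTP_PASSWORD, and ALERT_RECIPIENT are placeholder variable names.
    sender = os.environ["SMTP_USER"]
    password = os.environ["SMTP_PASSWORD"]  # a Gmail app password
    recipient = os.environ.get("ALERT_RECIPIENT", sender)

    body = df.to_string(index=False)

    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient

    # Port 465 expects TLS from the first byte, hence SMTP_SSL.
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(sender, password)
        server.send_message(msg)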
```
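
If the plain-text table turns out hard to read in a mail client, one option is to attach an HTML version alongside it. The sketch below uses the standard library's `EmailMessage`; `build_message` and its parameters are hypothetical names, not part of the code above:

```python
from email.message import EmailMessage

import pandas as pd


def build_message(df: pd.DataFrame, subject: str, sender: str, recipient: str) -> EmailMessage:
    """Create an email with a plain-text body and an HTML table alternative."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # Plain text first; clients that cannot render HTML fall back to it.
    msg.set_content(df.to_string(index=False))
    msg.add_alternative(df.to_html(index=False), subtype="html")
    return msg
```

The returned message can be passed to `server.send_message(...)` in the same way as the `MIMEText` object above.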

---

## Automation (Linux/Mac)

Schedule the script with cron so it runs without manual intervention:

```bash
crontab -e
# Run every weekday at 18:00 in the system time zone
# (this is Melbourne time only if the machine's time zone is Australia/Melbourne)
0 18 * * 1-5 /path/to/venv/bin/python /path/to/main.py >> /path/to/log.txt 2>&1
```

---

## Summary

The whole pipeline flow:

```
main.py
 ├── scraper.scrape()           → get data
 ├── features.create_features() → build features
 ├── model.predict_trades()     → generate predictions
 ├── filter by prob_up > threshold
 └── notify.send_email()        → send summary
```

