# The Ultimate Algo Trader (Neurobot 1.0)

## Introduction:
This Jupyter notebook contains code for developing and testing a dynamic algorithmic trader built with Python and a machine-learning strategy.

## Steps:
1. **Data preparation**: Access historical market data via the Alpaca API and preprocess it for analysis.
2. **Strategy creation**: Code an algorithmic trading strategy based on "TBD".
3. **Backtesting and optimization**: Backtest the strategy on historical data and fine-tune parameters for better performance.
4. **Risk/Reward**: Calculate both the risk and the reward from the entry price, position size, stop-loss, and target price.
5. **Live trading (optional)**: Implement the strategy for paper trading on Alpaca.

**Tools and Libraries**
- Python, pandas, NumPy, "TBD"

#### Notes: 
- This notebook is for educational and experimental purposes only.

### Imports and Dependencies

In [None]:
import os
import numpy as np
import random
import seaborn as sns
import pandas as pd
import yfinance as yf
import hvplot.pandas
import matplotlib.pyplot as plt
import alpaca_trade_api as tradeapi
from dotenv import load_dotenv
from scipy.interpolate import interp1d
from alpaca_trade_api.rest import REST, TimeFrame
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.metrics import mean_squared_error


# Initialize python files and import functions
import stock_data as data
import algo_strategy as strategy
import nn_models as model
import backtesting as backtest

import warnings
warnings.filterwarnings('ignore')


In [None]:
load_dotenv()
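
Once `load_dotenv()` has populated the environment, the Alpaca credentials can be collected in one place. The sketch below is a minimal illustration, not the notebook's actual setup: the variable names `ALPACA_API_KEY`, `ALPACA_SECRET_KEY`, and `ALPACA_BASE_URL`, and the helper `load_alpaca_credentials`, are assumptions and should be matched to whatever keys the `.env` file really uses.

```python
import os

def load_alpaca_credentials(env=None):
    """Collect Alpaca settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Hypothetical variable names; match them to the keys in your .env file.
    required = ("ALPACA_API_KEY", "ALPACA_SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "key_id": env["ALPACA_API_KEY"],
        "secret_key": env["ALPACA_SECRET_KEY"],
        # Defaulting to the paper-trading endpoint keeps experiments away from real funds.
        "base_url": env.get("ALPACA_BASE_URL", "https://paper-api.alpaca.markets"),
    }
```

Assuming the `alpaca_trade_api` client accepts these keyword names, the result could then be passed along as `api = tradeapi.REST(**load_alpaca_credentials())`.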

### Data Processing and collection

In [None]:
# Fetch list of tickers
# Get select top picks based on monthly highest performing stocks in sp500 
#ticker_list = data.get_clusters_from_sp500(sp500_url = os.getenv("SP500_URL"))

# fetch_stock_data based on top picks from ticker_list
stock_data = data.fetch_stock_data('2018-01-12', '2024-03-14', tickers= ['NVDA'], timeframe='1Day')

In [None]:
# data cleaning and organization
stock_df = stock_data #.rename(columns={'NVDA': 'NVDA close'})
stock_df['Daily Returns'] = stock_df['NVDA']['close'].pct_change()
stock_df = stock_df.dropna()
stock_df

#### Feature Engineering - Time Series Analysis

In [None]:
stock_df['Cumulative Returns'] = (1 + stock_df['Daily Returns']).cumprod()

In [None]:
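# shift(1) carries the previous day's return forward; shift(-1) would pull in the next day's return (look-ahead bias)
stock_df['Daily Returns Lagged'] = stock_df['Daily Returns'].shift(1)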

In [None]:
# set window sizes based on strategy
short_window = 5
long_window = 100

stock_df['SMA_Fast'] = stock_df['NVDA']['close'].rolling(window=short_window).mean()
stock_df['SMA_Slow'] = stock_df['NVDA']['close'].rolling(window=long_window).mean()
stock_df["EMA_Fast"] = stock_df['NVDA']["close"].ewm(span=short_window).mean()
stock_df["EMA_Slow"] = stock_df['NVDA']["close"].ewm(span=long_window).mean()
stock_df = stock_df.dropna()
stock_df

# later on -> trial different training windows using DateOffset()
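
The fast and slow moving averages could feed a dual moving average crossover (DMAC) signal. The sketch below is one possible formulation, not the strategy used in `algo_strategy`: the function name `dmac_signal`, the output column names, and the synthetic price series are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

def dmac_signal(close, short_window=5, long_window=100):
    """Return 1 while the fast SMA is above the slow SMA, else 0."""
    sma_fast = close.rolling(window=short_window).mean()
    sma_slow = close.rolling(window=long_window).mean()
    # Comparisons against NaN are False, so the warm-up period stays at 0.
    signal = (sma_fast > sma_slow).astype(int)
    # +1 marks an entry (fast crosses above slow), -1 marks an exit.
    entry_exit = signal.diff().fillna(0).astype(int)
    return pd.DataFrame({"Signal": signal, "Entry/Exit": entry_exit})

# Synthetic prices (a decline followed by a rally), purely for illustration.
prices = pd.Series(np.concatenate([np.linspace(100, 80, 20), np.linspace(80, 120, 20)]))
signals_df = dmac_signal(prices, short_window=3, long_window=10)
```

On real data the same call would take `stock_df['NVDA']['close']` with the `short_window` and `long_window` values defined in the cell above.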

### Algorithm Buy/Sell: Best Signal Selection

In [None]:
# Initialize Signals 
signals = {
    "signal_one": strategy.strategy_one(stock_data),
    "signal_two": strategy.strategy_two(stock_data),
    "signal_three": strategy.strategy_three(stock_data),
    "signal_four": strategy.strategy_four(stock_data),
    "signal_five": strategy.strategy_five(stock_data) }

# Function for simple winning trading strategy 
def simple_winning(signals):
    best_strategy = max(signals, key=signals.get)
    winning_signal = signals[best_strategy]
    return winning_signal

# Function for strategy that combines signals using a majority vote
def majority_vote(signals):
    buy_signal = sum(1 for signal in signals.values() if signal == 1)    # "Buy"
    sell_signal = sum(1 for signal in signals.values() if signal == -1)  # "Sell"
    return 1 if buy_signal > sell_signal else -1 if sell_signal > buy_signal else 0  # "Buy", "Sell", or "Hold"

# Add signals to stock_df
stock_df['Majority_vote'] = majority_vote(signals)
stock_df['Simple_winning'] = simple_winning(signals)

# Plot winning signal and hybrid
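
The `majority_vote` function compares scalar signals. If each strategy instead returns one signal per trading day, the same vote can be taken row by row. The sketch below assumes every strategy yields a pandas Series of -1, 0, or 1 values aligned on the same index. This is an assumption about `algo_strategy`, not something the notebook confirms, and `majority_vote_rows` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def majority_vote_rows(signal_frame):
    """Row-wise majority vote over columns holding -1 (sell), 0 (hold), 1 (buy)."""
    buys = (signal_frame == 1).sum(axis=1)
    sells = (signal_frame == -1).sum(axis=1)
    # sign(buys - sells) is 1 when buys lead, -1 when sells lead, 0 on a tie.
    return np.sign(buys - sells).astype(int)

# Toy signals for four days and three strategies.
example = pd.DataFrame({
    "signal_one": [1, -1, 0, 1],
    "signal_two": [1, -1, 1, -1],
    "signal_three": [-1, 0, 0, 0],
})
votes = majority_vote_rows(example)
```

The tie-breaking matches the scalar version: equal buy and sell counts produce a hold (0).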


### Machine Learning Best Model Selection

In [None]:
# Initialize Models
models = {
    "model_one": model.model_one(),
    "model_two": model.model_two(),
    "model_three": model.model_three(),
    "model_four": model.model_four(),
    "model_five": model.model_five() }

def select_best_model(models, X_train, y_train):
    # Define initial best score and best model
    best_model_name = None
    best_model = None
    best_accuracy = 0.0
    best_mse = float('inf')
    all_scores = {}
    all_mses = {}
    
    # Define the scoring metrics to use (scikit-learn's scorer name for F1 is 'f1')
    scoring = ['accuracy', 'precision', 'recall', 'f1']
    
    # Train and evaluate each model using cross-validation
    for name, model in models.items():
       
        metric_scores = {}
        
        # Evaluate model using cross-validation for each scoring metric
        for metric in scoring:
            scores = cross_val_score(model, X_train, y_train, cv=5, scoring=metric)
            
            metric_scores[metric] = scores
            
        # Calculate the mean scores
        mean_scores = {metric: scores.mean() for metric, scores in metric_scores.items()}
        mse_scores = -cross_val_score(model, X_train, y_train, cv=5, scoring='neg_mean_squared_error')
        mean_mse = mse_scores.mean()
        
        # Store the scores for the current model
        all_scores[name] = metric_scores
        all_mses[name] = mse_scores
        
        # Prefer the lowest mean MSE; break ties with the higher mean accuracy
        if mean_mse < best_mse or (mean_mse == best_mse and mean_scores['accuracy'] > best_accuracy):
            best_accuracy = mean_scores['accuracy']
            best_mse = mean_mse
            best_model_name = name
            best_model = model
    
    return best_model_name, best_model, all_scores, all_mses

# X_train (feature matrix) and y_train (labels) must be defined before this call
best_model_name, best_model, all_scores, all_mses = select_best_model(models, X_train, y_train)
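
The notebook does not show how `X_train` and `y_train` are built. The sketch below is one way to do it under stated assumptions: the features are the two previous daily returns, the target is whether the current day's return is positive, the data is synthetic, and `LogisticRegression` stands in for the models in `nn_models`. It also swaps the plain `cv=5` for `TimeSeriesSplit`, so each fold validates only on rows that come after its training rows.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Synthetic daily returns, purely for illustration.
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0, 0.01, 300), name="Daily Returns")

# Features use only past information: the previous two daily returns.
features = pd.DataFrame({"ret_lag1": returns.shift(1), "ret_lag2": returns.shift(2)})
# Target: 1 if the current day's return is positive, else 0.
target = (returns > 0).astype(int)

dataset = features.assign(target=target).dropna()
split = int(len(dataset) * 0.8)  # chronological split, no shuffling
X_train, y_train = dataset.iloc[:split, :2], dataset.iloc[:split]["target"]
X_test, y_test = dataset.iloc[split:, :2], dataset.iloc[split:]["target"]

# Each fold trains on earlier rows and validates on later ones.
cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(LogisticRegression(), X_train, y_train, cv=cv, scoring="accuracy")
```

Passing `cv=TimeSeriesSplit(n_splits=5)` to the `cross_val_score` calls inside `select_best_model` would apply the same ordering constraint there.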


### Backtesting - Best Model and Best Strategy
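
This section is still empty, so here is a minimal vectorized backtest sketch rather than the logic in the `backtesting` module. It assumes a signal of 1 (long), 0 (flat), or -1 (short) per day and ignores fees and slippage. The function name `backtest_signal` and the toy prices are illustrative.

```python
import pandas as pd

def backtest_signal(close, signal):
    """Compare a long/flat/short signal against buy-and-hold on the same prices."""
    daily_returns = close.pct_change().fillna(0)
    # Act on yesterday's signal so the backtest never trades on unseen data.
    position = signal.shift(1).fillna(0)
    strategy_returns = position * daily_returns
    return pd.DataFrame({
        "Buy and Hold": (1 + daily_returns).cumprod(),
        "Strategy": (1 + strategy_returns).cumprod(),
    })

# Toy prices: +10%, -10%, +10%. The signal steps aside before the losing day.
close = pd.Series([100.0, 110.0, 99.0, 108.9])
signal = pd.Series([1, 0, 1, 1])
result = backtest_signal(close, signal)
```

The cumulative series are built the same way as `Cumulative Returns` earlier in the notebook, so the two columns can be plotted together for a direct comparison.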

### Fundamental Analysis (Predictions and Plotting)

In [None]:
stock_df[['Daily Returns Lagged', 'Daily Returns']].corr()

# Maybe avoid the lagged-returns strategy
# Consider DMAC (dual moving average crossover)?

In [None]:
stock_df['Cumulative Returns'].plot()

In [None]:
# Print the selected best model name
print("Selected Best Model:", best_model_name)

# Print the cross-validation scores for each model and each metric
for name, scores in all_scores.items():
    print("Model:", name)
    for metric, metric_scores in scores.items():
        print(f"Mean {metric.capitalize()} Score:", metric_scores.mean())

# Print the cross-validation MSE scores for each model
for name, mse_scores in all_mses.items():
    print("Model:", name)
    print("Cross-Validation MSE Scores:", mse_scores)

### Risk management and Rewards
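
Step 4 of the outline derives risk and reward from the entry price, position size, stop-loss, and target price. The sketch below shows that arithmetic for a long position. The function name `risk_reward` and the example numbers are illustrative assumptions, and fees and slippage are ignored.

```python
def risk_reward(entry_price, stop_loss, target_price, position_size):
    """Risk, reward, and their ratio for a long position."""
    if not stop_loss < entry_price < target_price:
        raise ValueError("Expected stop_loss < entry_price < target_price for a long trade")
    risk = (entry_price - stop_loss) * position_size       # loss if the stop is hit
    reward = (target_price - entry_price) * position_size  # gain if the target is hit
    return {"risk": risk, "reward": reward, "ratio": reward / risk}

# Example: 10 shares bought at 100 with a stop at 95 and a target at 115.
trade = risk_reward(entry_price=100.0, stop_loss=95.0, target_price=115.0, position_size=10)
```

Here the trade risks 50 to make 150, a reward-to-risk ratio of 3.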

### Logic to place trade (Optional)

In [None]:
# From Algo Trading 3 live 

# Submit order (api, number_of_shares, orderSide, and limit_amount must be defined beforehand)
api.submit_order(
    symbol="META",
    qty=number_of_shares,
    side=orderSide,
    time_in_force="gtc",
    type="limit",
    limit_price=limit_amount,
)
