In [2]:
from langchain import hub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_community.embeddings import HuggingFaceEmbeddings

from langchain_core.prompts import PromptTemplate

### Notebook Loader

In [3]:
from langchain_community.document_loaders import NotebookLoader

loader = NotebookLoader(
    "stock-price-prediction-using-xgboost-prophet-arima.ipynb",
    include_outputs=True,
    max_output_length=50,
    remove_newline=False,
)

documents = loader.load()
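To make the effect of `include_outputs` and `max_output_length` concrete, here is a rough standard-library sketch of how a notebook can be flattened into a single string with truncated outputs. The helper `flatten_notebook` and the toy notebook are hypothetical. The sketch only approximates the idea and is not the `NotebookLoader` implementation.

```python
def flatten_notebook(nb, include_outputs=True, max_output_length=50):
    """Concatenate notebook cells into one string, truncating each text output."""
    parts = []
    for cell in nb.get("cells", []):
        # nbformat stores "source" either as a string or as a list of lines
        source = "".join(cell.get("source", []))
        entry = f"'{cell.get('cell_type')}' cell: '{source}'"
        if include_outputs and cell.get("cell_type") == "code":
            for output in cell.get("outputs", []):
                # Only plain-text stream outputs are handled in this sketch
                text = "".join(output.get("text", []))
                if text:
                    entry += f"\n with output: '{text[:max_output_length]}'"
        parts.append(entry)
    return "\n\n".join(parts)


# Hypothetical two-cell notebook; a real .ipynb file would be read with json.load
toy_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Load data"]},
        {
            "cell_type": "code",
            "source": ["print('x' * 200)"],
            "outputs": [{"output_type": "stream", "text": ["x" * 200]}],
        },
    ]
}

flat = flatten_notebook(toy_notebook)
print(flat)
```

With `max_output_length=50`, only the first 50 characters of each cell output reach the model, so long tables or logs in the notebook are seen only partially.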

### Azure GPT4 LLM

In [5]:
from langchain_core.messages import HumanMessage
from langchain_openai import AzureChatOpenAI
import os

llm = AzureChatOpenAI(
    temperature=0,
    deployment_name=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
    azure_endpoint=os.environ["AZURE_OPENAI_API_BASE"],
    openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    openai_api_key=os.environ["AZURE_OPENAI_API_KEY"],
)

In [8]:
text = documents[0].page_content

## Summary

In [13]:
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt_template = """Below is the content of a notebook that implements a machine learning process. Give me a concise, step-by-step summary of the content.
Focus on preprocessing, feature engineering, model training, and evaluation.
"{text}"
CONCISE SUMMARY:"""
PROMPT = PromptTemplate(template=prompt_template, input_variables=["text"])

# LLMChain maps the single positional input to the prompt's only variable, "text"
chain = LLMChain(llm=llm, prompt=PROMPT)
summary = chain.invoke(text)
print(summary["text"])

Preprocessing and Feature Engineering:
1. AAPL stock data is downloaded using the yfinance library.
2. Only the last 3 years of stock data are used, and all columns except 'Close' are dropped.
3. Lag features (lag1 to lag12) are created by shifting the 'Close' value by multiples of the prediction horizon (30 days).
4. Time-based features are created, including hour, day of the week, quarter, month, year, day of the year, day of the month, and week of the year.

Training Model - XGBoost:
1. The dataset is prepared for the XGBoost model by applying the feature engineering functions.
2. The dataset is split into training and test sets using a 70-30 split.
3. Optuna is used for hyperparameter optimization to find the best parameters for the XGBoost model.
4. The best parameters are used to train the final XGBoost model.
5. The model is evaluated on the test set, and the Root Mean Squared Error (RMSE) is reported.

Evaluation - XGBoost:
1. The Mean Absolute Percentage Error (MAPE) is calcul
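The summary mentions a 70-30 split and two error metrics, RMSE and MAPE. A minimal standard-library sketch of these follows. The function names and toy numbers are hypothetical, and splitting in time order without shuffling is an assumption: the summary does not say how the split is made.

```python
import math


def chronological_split(values, train_fraction=0.7):
    """Split a time-ordered sequence without shuffling (assumed split style)."""
    cut = round(len(values) * train_fraction)
    return values[:cut], values[cut:]


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Toy numbers for illustration only, not taken from the notebook
series = list(range(100, 110))
train, test = chronological_split(series)
print(len(train), len(test))

actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE is expressed in the units of the target (here, price), while MAPE is scale-free, which makes it easier to compare across stocks with different price levels.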

## QnA

In [13]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

# Use the entire notebook content as context for answering questions
notebook_content = documents[0].page_content

# Braces in the notebook text are swapped for parentheses so that
# PromptTemplate does not parse them as template variables
safe_content = notebook_content.replace("{", "(").replace("}", ")")

prompt = (
    PromptTemplate.from_template(
        "You are an expert Machine Learning Engineer and Data Scientist. "
        "Analyze the following ML pipeline notebook to answer the provided question.\n"
    )
    + f"Notebook: {safe_content} \n\n "
    + "Question: {question} \n"
    + "Helpful Answer: \n"
)

chain = LLMChain(llm=llm, prompt=prompt)
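Literal braces in the notebook text have to be neutralized before the text is concatenated into the template, otherwise they are parsed as template variables. Swapping them for parentheses works but alters the code the model sees (dictionaries, sets, f-strings). Doubling them keeps the text intact after rendering. The sketch below uses plain `str.format`, on the assumption that the default f-string style of `PromptTemplate` follows the same brace rules. The helper names and the snippet are hypothetical.

```python
# Hypothetical notebook fragment containing a dict literal and an f-string
notebook_snippet = "params = {'max_depth': 3}\nprint(f'RMSE: {rmse}')"


def swap_braces(text):
    """Replace braces with parentheses (lossy: the code is altered)."""
    return text.replace("{", "(").replace("}", ")")


def escape_braces(text):
    """Double the braces so that formatting renders them as literal braces."""
    return text.replace("{", "{{").replace("}", "}}")


template_swapped = "Notebook: " + swap_braces(notebook_snippet) + "\nQuestion: {question}"
template_escaped = "Notebook: " + escape_braces(notebook_snippet) + "\nQuestion: {question}"

rendered_swapped = template_swapped.format(question="Which models are tested?")
rendered_escaped = template_escaped.format(question="Which models are tested?")

print(rendered_swapped)
print(rendered_escaped)
```

Only the escaped variant reproduces the original snippet verbatim in the rendered prompt.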

In [15]:
response = chain.invoke({"question": "Which models are tested in my pipeline?"})
print(response['text'])

The models tested in your pipeline are:

1. XGBoost (Extreme Gradient Boosting)
2. Prophet (by Facebook)
3. ARIMA (AutoRegressive Integrated Moving Average)


In [16]:
response = chain.invoke({"question": "Which feature engineering steps were applied to build the model? Do you have any recommendations to improve them?"})

print(response['text'])

The feature engineering steps taken to build the model include:

1. **Lag Features**: The code creates lag features for the 'Close' column of the stock data. Specifically, it creates 12 lag features, each shifted by a multiple of the `num_days_pred` (30 days). These lag features are intended to capture the temporal dependencies in the stock prices.

2. **Time-based Features**: The code also creates time-based features from the index of the DataFrame, which is a DateTimeIndex. These features include:
   - `hour`: The hour of the timestamp (though this is not relevant for daily stock data and remains 0).
   - `dayofweek`: The day of the week.
   - `quarter`: The quarter of the year.
   - `month`: The month of the year.
   - `year`: The year.
   - `dayofyear`: The day of the year.
   - `dayofmonth`: The day of the month.
   - `weekofyear`: The week of the year.

Recommendations to improve feature engineering:

1. **Remove Irrelevant Features**: Since the stock data is daily, the `hour` fe
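The lag and time-based features described in the answer above can be sketched with pandas as follows. This is an illustration built from the description only, not the analyzed notebook's code. The function names and the synthetic price series are hypothetical. `num_days_pred=30` and the 12 lags come from the answer. Note that `shift` moves by rows, not by calendar days.

```python
import pandas as pd


def add_lag_features(df, num_days_pred=30, n_lags=12, target="Close"):
    """Add lag1..lagN, each shifted by a multiple of the prediction horizon (in rows)."""
    out = df.copy()
    for i in range(1, n_lags + 1):
        out[f"lag{i}"] = out[target].shift(i * num_days_pred)
    return out


def add_time_features(df):
    """Derive calendar features from a DatetimeIndex."""
    out = df.copy()
    idx = out.index
    out["hour"] = idx.hour  # constant 0 for daily data
    out["dayofweek"] = idx.dayofweek
    out["quarter"] = idx.quarter
    out["month"] = idx.month
    out["year"] = idx.year
    out["dayofyear"] = idx.dayofyear
    out["dayofmonth"] = idx.day
    out["weekofyear"] = idx.isocalendar().week.astype(int).to_numpy()
    return out


# Synthetic daily series for illustration only (not real AAPL prices)
dates = pd.date_range("2021-01-01", periods=400, freq="D")
prices = pd.DataFrame({"Close": [float(i) for i in range(400)]}, index=dates)

features = add_time_features(add_lag_features(prices))
print(features.iloc[[0, 30, 360]][["Close", "lag1", "lag12", "hour", "year"]])
```

The first `num_days_pred` rows of `lag1` are empty (and the first 360 rows of `lag12`), so rows with missing lags have to be dropped or handled by the model before training.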

In [18]:
response = chain.invoke({"question": "Considering the whole ML process, which other ML models should I test?"})

print(response['text'])

Given the context of the notebook, which is focused on time series forecasting for AAPL stock prices, you have already experimented with XGBoost, Prophet, and ARIMA models. Each of these models has its strengths and weaknesses, and the choice of additional models to test would depend on the specific characteristics of the data and the goals of the forecasting task.

Here are a few other machine learning and statistical models that you could consider testing:

1. **Long Short-Term Memory (LSTM) Networks**: LSTMs are a type of recurrent neural network (RNN) that are well-suited for sequence prediction problems like time series forecasting. They can capture long-term dependencies in time series data, which might be beneficial for stock price predictions.

2. **Gated Recurrent Units (GRUs)**: GRUs are another type of RNN that are similar to LSTMs but with a simpler structure. They can also be effective for time series forecasting and might be faster to train than LSTMs.

3. **Temporal Conv

In [21]:
response = chain.invoke({"question": "Explain the feature importances of my XGBoost model. Which features impact my predictions the most?"})

print(response['text'])

The feature importance from your XGBoost model indicates how much each feature contributes to the model's predictions. The importance is calculated for each feature by the amount it increases the prediction accuracy or decreases impurity in the model. In XGBoost, feature importance is typically measured by the "gain", "cover", or "frequency" of the features when building the trees.

From the output provided in your notebook, the feature importances are sorted in descending order, with the most important features at the top. Here's a breakdown of the top features and their relative importance:

1. `year`: This feature has the highest importance with a score of approximately 0.508546. This suggests that the year of the stock data is the most significant predictor in your model. It could be capturing long-term trends in the stock price.

2. `lag9`: The second most important feature is `lag9` with a score of about 0.247954. This feature represents the stock's closing price shifted by 9 tim
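To read such a ranking quantitatively, the scores can be sorted and accumulated. The sketch below uses only the two scores visible in the output above. Treating them as shares of a total of 1 is an assumption about how the importances were normalized.

```python
# Only the two scores quoted in the output above; the remaining features are unknown
importances = {"lag9": 0.247954, "year": 0.508546}

# Sort from most to least important
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)

cumulative = 0.0
for name, score in ranked:
    cumulative += score
    print(f"{name:<6} score={score:.6f} cumulative={cumulative:.6f}")
```

Under that assumption, `year` and `lag9` together account for roughly three quarters of the total importance, so the model leans mostly on the long-term trend and a single lag.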