# AI-Powered Stock Price Prediction: Tata Motors Closing Price Forecasting with Machine Learning



![AI Stock Prediction](./ai_stock.png)



## 📈 Overview

This Jupyter Notebook trains a machine learning model to predict the closing price of Tata Motors stock (`TATAMOTORS.NS`). It fits a **Decision Tree Regressor** to historical daily data and estimates each day's closing price from that day's open, high, low, and volume, with the aim of giving investors a data-driven reference for short-term price predictions.



## 🔍 Project Description

Using historical stock data from **Yahoo Finance**, this model analyzes key stock market indicators, including:

- **Open Price**: The opening price of the stock

- **Low Price**: The lowest price of the stock for the day

- **High Price**: The highest price of the stock for the day

- **Volume**: The trading volume of the stock



These features are used to train a **Decision Tree Regressor** model, which then predicts the closing price, offering insights into potential price movements and aiding decision-making for traders and investors.



## 📊 Data Collection

The dataset used for this project is sourced from **Yahoo Finance** via the `yfinance` library. Here's a sample code snippet used to download the data:



```python
import yfinance as yf

# Download Tata Motors stock data (replace the placeholder dates with a real range)
data = yf.download('TATAMOTORS.NS', start='YYYY-MM-DD', end='YYYY-MM-DD')
```



The Flask and HTML code for this project is available on GitHub:

https://github.com/OrhFusion/Tata-Motor-Stock-AI.git

In [41]:
import pandas as pd

import numpy as np

import yfinance as yf



from sklearn.model_selection import train_test_split

from sklearn.tree import DecisionTreeRegressor

import joblib

In [2]:
df = yf.download('TATAMOTORS.NS', start='2023-01-01', end='2024-10-29')

[*********************100%***********************]  1 of 1 completed


In [3]:
type(df)

pandas.core.frame.DataFrame

In [None]:
df.to_csv('output.csv', index=False, header=True)
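
Note that `index=False` leaves the `Date` index out of the file. If the dates are needed later, one option is to write the index and parse it back on load. The following is a minimal sketch on a small stand-in frame with single-level columns, written to an in-memory buffer rather than to `output.csv`; the frame and variable names are illustrative only.

```python
import io

import pandas as pd

# Small stand-in frame; the values are copied from the first two rows of the download.
frame = pd.DataFrame(
    {"Close": [394.799988, 393.899994], "Volume": [10501357, 9431220]},
    index=pd.to_datetime(["2023-01-02", "2023-01-03"]),
)
frame.index.name = "Date"

buffer = io.StringIO()
frame.to_csv(buffer)  # index=True is the default, so the Date column is written
buffer.seek(0)

# Read the file back and rebuild the DatetimeIndex from the Date column.
restored = pd.read_csv(buffer, index_col="Date", parse_dates=True)
print(restored)
```
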

In [19]:
df.head(3)

Ticker: TATAMOTORS.NS (all columns)

| Date | Adj Close | Close | High | Low | Open | Volume |
|---|---|---|---|---|---|---|
| 2023-01-02 00:00:00+00:00 | 392.362518 | 394.799988 | 396.0 | 391.0 | 392.5 | 10501357 |
| 2023-01-03 00:00:00+00:00 | 391.468109 | 393.899994 | 398.350006 | 393.0 | 396.0 | 9431220 |
| 2023-01-04 00:00:00+00:00 | 383.21936 | 385.600006 | 394.799988 | 385.0 | 394.799988 | 16121049 |
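
The frame returned by `yf.download` here has two column levels, `Price` and `Ticker`, so each column is a pair such as `(Close, TATAMOTORS.NS)`. A minimal sketch of flattening them to plain names follows. It uses a small hand-built frame in place of a real download (values copied from the first three rows), and `flatten_columns` is a hypothetical helper, not part of `yfinance`.

```python
import pandas as pd


def flatten_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with the 'Ticker' column level dropped, if present."""
    out = frame.copy()
    if isinstance(out.columns, pd.MultiIndex):
        out.columns = out.columns.droplevel("Ticker")
    return out


# Hand-built stand-in for the downloaded frame (not fetched from Yahoo Finance).
columns = pd.MultiIndex.from_product(
    [["Close", "High", "Low", "Open", "Volume"], ["TATAMOTORS.NS"]],
    names=["Price", "Ticker"],
)
sample = pd.DataFrame(
    [
        [394.799988, 396.0, 391.0, 392.5, 10501357],
        [393.899994, 398.350006, 393.0, 396.0, 9431220],
        [385.600006, 394.799988, 385.0, 394.799988, 16121049],
    ],
    index=pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04"]),
    columns=columns,
)
sample.index.name = "Date"

flat = flatten_columns(sample)
print(list(flat.columns))  # ['Close', 'High', 'Low', 'Open', 'Volume']
```

With single-level columns, `flat['Close']` is a Series rather than a one-column DataFrame, which is the shape scikit-learn expects for a regression target.
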


In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 448 entries, 2023-01-02 00:00:00+00:00 to 2024-10-28 00:00:00+00:00
Data columns (total 6 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   (Adj Close, TATAMOTORS.NS)  448 non-null    float64
 1   (Close, TATAMOTORS.NS)      448 non-null    float64
 2   (High, TATAMOTORS.NS)       448 non-null    float64
 3   (Low, TATAMOTORS.NS)        448 non-null    float64
 4   (Open, TATAMOTORS.NS)       448 non-null    float64
 5   (Volume, TATAMOTORS.NS)     448 non-null    int64  
dtypes: float64(5), int64(1)
memory usage: 24.5 KB


In [6]:
df.isnull().sum()

Price      Ticker       
Adj Close  TATAMOTORS.NS    0
Close      TATAMOTORS.NS    0
High       TATAMOTORS.NS    0
Low        TATAMOTORS.NS    0
Open       TATAMOTORS.NS    0
Volume     TATAMOTORS.NS    0
dtype: int64

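In [None]:
# Not run. 'Price' is the name of the column level, not a column,
# so this statement would raise a KeyError on the downloaded frame.
del df['Price']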

In [11]:
X = df[['High','Low','Volume']]

y = df['Close']

y

| Date | Close (TATAMOTORS.NS) |
|---|---|
| 2023-01-02 00:00:00+00:00 | 394.799988 |
| 2023-01-03 00:00:00+00:00 | 393.899994 |
| 2023-01-04 00:00:00+00:00 | 385.600006 |
| 2023-01-05 00:00:00+00:00 | 386.899994 |
| 2023-01-06 00:00:00+00:00 | 382.000000 |
| ... | ... |
| 2024-10-22 00:00:00+00:00 | 879.500000 |
| 2024-10-23 00:00:00+00:00 | 877.650024 |
| 2024-10-24 00:00:00+00:00 | 880.000000 |
| 2024-10-25 00:00:00+00:00 | 864.299988 |


In [12]:
df.Close

| Date | TATAMOTORS.NS |
|---|---|
| 2023-01-02 00:00:00+00:00 | 394.799988 |
| 2023-01-03 00:00:00+00:00 | 393.899994 |
| 2023-01-04 00:00:00+00:00 | 385.600006 |
| 2023-01-05 00:00:00+00:00 | 386.899994 |
| 2023-01-06 00:00:00+00:00 | 382.000000 |
| ... | ... |
| 2024-10-22 00:00:00+00:00 | 879.500000 |
| 2024-10-23 00:00:00+00:00 | 877.650024 |
| 2024-10-24 00:00:00+00:00 | 880.000000 |
| 2024-10-25 00:00:00+00:00 | 864.299988 |


In [22]:
df = pd.read_csv('output.csv')

In [23]:
df.head(4)

|   | Adj Close | Close | High | Low | Open | Volume |
|---|---|---|---|---|---|---|
| 0 | TATAMOTORS.NS | TATAMOTORS.NS | TATAMOTORS.NS | TATAMOTORS.NS | TATAMOTORS.NS | TATAMOTORS.NS |
| 1 | 392.3625183105469 | 394.79998779296875 | 396.0 | 391.0 | 392.5 | 10501357 |
| 2 | 391.4681091308594 | 393.8999938964844 | 398.3500061035156 | 393.0 | 396.0 | 9431220 |
| 3 | 383.2193603515625 | 385.6000061035156 | 394.79998779296875 | 385.0 | 394.79998779296875 | 16121049 |
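
Row 0 of this frame holds the ticker label rather than prices, because the CSV was written from a frame with two header levels. With that row present, every column is read as text. One way to skip it on load is sketched below with an in-memory CSV whose lines mirror the rows shown above. The `skiprows=[1]` setting assumes the file layout seen here: column names on the first line and ticker labels on the second.

```python
import io

import pandas as pd

# In-memory copy of the first lines of the CSV, as displayed in the output above.
csv_text = """Adj Close,Close,High,Low,Open,Volume
TATAMOTORS.NS,TATAMOTORS.NS,TATAMOTORS.NS,TATAMOTORS.NS,TATAMOTORS.NS,TATAMOTORS.NS
392.3625183105469,394.79998779296875,396.0,391.0,392.5,10501357
391.4681091308594,393.8999938964844,398.3500061035156,393.0,396.0,9431220
383.2193603515625,385.6000061035156,394.79998779296875,385.0,394.79998779296875,16121049
"""

# skiprows=[1] drops the second physical line (the ticker labels), so every
# column is parsed as a number instead of as text.
clean = pd.read_csv(io.StringIO(csv_text), skiprows=[1])

print(clean.shape)
print(clean.dtypes)
```
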


In [24]:
df.isnull().sum()

Adj Close    0
Close        0
High         0
Low          0
Open         0
Volume       0
dtype: int64

In [27]:
X = df[['Open','Low','High','Volume']]

y = df['Close']

In [29]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,random_state=42)
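
`train_test_split` shuffles rows by default, so test days end up interleaved with training days. For time-ordered data a chronological hold-out is a common alternative. The sketch below uses synthetic prices (not the Tata Motors data) and passes `shuffle=False`, which keeps the last 20% of rows as the test set.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_days = 100

# Synthetic random-walk prices standing in for real market data.
open_ = 400 + rng.normal(0, 2, n_days).cumsum()
frame = pd.DataFrame(
    {
        "Open": open_,
        "Low": open_ - rng.uniform(1, 5, n_days),
        "High": open_ + rng.uniform(1, 5, n_days),
        "Volume": rng.integers(1_000_000, 20_000_000, n_days),
    }
)
frame["Close"] = (frame["Low"] + frame["High"]) / 2

X = frame[["Open", "Low", "High", "Volume"]]
y = frame["Close"]

# shuffle=False preserves row order: the first 80 rows train, the last 20 test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)
print(X_train.index.max(), X_test.index.min())  # 79 80
```
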

In [31]:
tree = DecisionTreeRegressor()

tree.fit(X_train, y_train)

In [33]:
# R^2 on the training rows only; this does not measure accuracy on unseen data
tree.score(X_train, y_train)

1.0
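
A training score of 1.0 is expected from an unconstrained decision tree, which can memorise its training rows, so it says little about unseen days. A sketch of scoring on held-out rows instead is shown below. It runs on synthetic data, and `max_depth=5` is an illustrative setting, not a tuned value from this notebook.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
n = 300

# Synthetic daily bars: the close falls somewhere inside the day's range.
low = rng.uniform(380, 900, n)
high = low + rng.uniform(1, 15, n)
open_ = rng.uniform(low, high)
volume = rng.integers(1_000_000, 20_000_000, n)
close = rng.uniform(low, high)

X = np.column_stack([open_, low, high, volume])
X_train, X_test, y_train, y_test = train_test_split(
    X, close, test_size=0.2, random_state=42
)

# Limiting the depth keeps the tree from memorising every training row.
model = DecisionTreeRegressor(max_depth=5, random_state=42)
model.fit(X_train, y_train)

pred = model.predict(X_test)
train_r2 = model.score(X_train, y_train)
test_r2 = r2_score(y_test, pred)
test_mae = mean_absolute_error(y_test, pred)
print(f"train R^2={train_r2:.3f}  test R^2={test_r2:.3f}  test MAE={test_mae:.2f}")
```

Comparing the training and test figures side by side gives a rough sense of how much the model overfits.
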



In [43]:
# Feature order must match training: [Open, Low, High, Volume]
tree.predict([[44,33,46,3344]])



array([382.])
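
Decision trees do not extrapolate. An input far below the training range, such as the one above, lands in whichever leaf covers the lowest values, so the returned 382 is presumably just a stored training close. The sketch below passes a one-row DataFrame with named columns so the feature order is explicit. The toy model is trained on the three rows shown earlier, which is far too few for real use, and the query values are made up.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

feature_names = ["Open", "Low", "High", "Volume"]

# Three rows copied from the earlier df.head() output.
train = pd.DataFrame(
    {
        "Open": [392.5, 396.0, 394.799988],
        "Low": [391.0, 393.0, 385.0],
        "High": [396.0, 398.350006, 394.799988],
        "Volume": [10501357, 9431220, 16121049],
        "Close": [394.799988, 393.899994, 385.600006],
    }
)

toy_tree = DecisionTreeRegressor(random_state=0)
toy_tree.fit(train[feature_names], train["Close"])

# Hypothetical query row; selecting by name guards against ordering mistakes.
query = pd.DataFrame(
    [{"Open": 395.0, "Low": 392.0, "High": 399.0, "Volume": 9_500_000}]
)
prediction = toy_tree.predict(query[feature_names])
print(prediction)
```
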

In [40]:
df.shape

(449, 6)

In [42]:
joblib.dump(tree,'model')

['model']
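
`joblib.dump` returns the list of files it wrote, here `['model']`. A sketch of the matching reload step follows. It uses a throwaway model and a temporary directory so it does not touch the real `model` file, and the `model.joblib` file name is an arbitrary choice.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy training data: [Open, Low, High, Volume] rows and their closing prices.
X = np.array(
    [
        [392.5, 391.0, 396.0, 10501357],
        [396.0, 393.0, 398.350006, 9431220],
        [394.799988, 385.0, 394.799988, 16121049],
    ]
)
y = np.array([394.799988, 393.899994, 385.600006])
original = DecisionTreeRegressor(random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")
    joblib.dump(original, path)     # serialise the fitted tree
    restored = joblib.load(path)    # load it back as a new object

# The restored model should reproduce the original predictions exactly.
same = np.array_equal(original.predict(X), restored.predict(X))
print(same)  # True
```
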