# **BACKGROUND OF THE INDUSTRY:**

The investment landscape has evolved significantly over the years, driven by advancements in technology, changing market dynamics, and shifting investor preferences. Traditional investment approaches often relied on static asset allocation strategies that did not adapt to individual risk profiles or market conditions. However, with the advent of data analytics, machine learning, and financial modeling techniques, there is a growing opportunity to provide personalized investment advice tailored to investors' unique needs.

## **INTRODUCTION:**

In today's complex financial markets, investors face the challenge of optimizing their portfolios to achieve long-term financial goals while managing risk effectively. Without personalized guidance, investors may struggle to navigate the multitude of investment options available, leading to suboptimal outcomes. This underscores the need for a sophisticated recommendation system that can analyze individual risk tolerances, investment horizons, and market conditions to provide tailored portfolio recommendations.

## **PROBLEM STATEMENT:**

**Prevailing Circumstance:** 
Currently, many investors lack access to personalized investment advice tailored to their risk tolerance levels and financial objectives. They often resort to generic investment strategies that may not align with their individual needs, leading to potential underperformance or excessive risk exposure.

**Problem We're Trying to Solve:** 
Our project aims to address this gap by developing an intelligent recommendation system that leverages machine learning algorithms and financial modeling techniques to provide personalized portfolio recommendations. By considering an investor's risk tolerance, investment horizon, and market conditions, we aim to optimize portfolio allocations and compounding strategies to maximize returns while mitigating risk.

**How the Project Aims to Solve the Problem:**
Through data-driven analysis and advanced algorithms, our project seeks to empower investors with actionable insights that align with their unique financial goals and risk preferences. By harnessing historical stock price data, financial statements, and market indices, we aim to build a robust recommendation system capable of dynamically adjusting portfolio allocations and compounding strategies to optimize long-term wealth accumulation.

## **OBJECTIVES:**

**Main Objective:** 
To develop a personalized recommendation system that optimizes portfolio allocations and compounding strategies based on individual risk tolerances and investment horizons.

**Specific Objectives:**
1. Utilize machine learning algorithms to analyze historical stock price data, financial statements, and market indices to assess risk profiles and identify suitable investment opportunities.
2. Develop dynamic portfolio optimization techniques that adjust asset allocations based on changing market conditions and individual preferences.
3. Implement compounding strategies to maximize long-term wealth accumulation and enhance portfolio returns.
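
To make the compounding objective concrete, the following is a minimal sketch of compound growth with optional end-of-period contributions. The function name `future_value` and the example figures are illustrative assumptions, not part of the project code.

```python
def future_value(principal, annual_rate, years, periods_per_year=1, contribution=0.0):
    """Value of an investment after compounding, with optional end-of-period contributions."""
    rate = annual_rate / periods_per_year   # rate applied in each compounding period
    n = years * periods_per_year            # total number of compounding periods
    if rate == 0:
        # No growth: the result is just the money paid in
        return principal + contribution * n
    growth = (1 + rate) ** n
    # Lump sum grows by `growth`; contributions form an ordinary annuity
    return principal * growth + contribution * (growth - 1) / rate

# Hypothetical example: 10,000 invested at 7% per year for 10 years, compounded annually
lump_sum = future_value(10_000, 0.07, 10)
```

Under these assumptions the lump sum roughly doubles over ten years, which is the effect the compounding strategies are meant to exploit.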


## **BUSINESS UNDERSTANDING:**

**Stakeholders:**
- Investors: Individuals seeking personalized investment advice tailored to their risk tolerance levels and financial objectives.
- Financial Advisors: Professionals looking to enhance client relationships by offering sophisticated portfolio recommendations based on data-driven insights.
- Financial Institutions: Entities aiming to differentiate their services and attract clients through innovative investment solutions.

**Metric of Success:**
- Portfolio Return vs. Benchmark: Aim for a 3% annual outperformance against a suitable benchmark index, such as the S&P 500.
- Risk-adjusted Returns: Target a Sharpe ratio above 0.8 to ensure superior risk-adjusted performance compared to the benchmark.
- Portfolio Diversification: Ensure at least 90% of recommended portfolios are diversified across different asset classes and industries.
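
The first two metrics above can be computed from return series. The snippet below is a rough sketch under stated assumptions: returns are daily simple returns, a year has 252 trading days, and the helper names (`annualized_return`, `sharpe_ratio`, `outperformance`) are hypothetical rather than taken from the project.

```python
import numpy as np

TRADING_DAYS = 252  # assumed number of trading days per year

def annualized_return(daily_returns):
    """Geometric annualized return from a sequence of daily simple returns."""
    daily_returns = np.asarray(daily_returns, dtype=float)
    total_growth = np.prod(1.0 + daily_returns)
    return total_growth ** (TRADING_DAYS / len(daily_returns)) - 1.0

def sharpe_ratio(daily_returns, annual_risk_free=0.0):
    """Annualized Sharpe ratio: mean daily excess return over its standard deviation."""
    daily_returns = np.asarray(daily_returns, dtype=float)
    excess = daily_returns - annual_risk_free / TRADING_DAYS
    return np.sqrt(TRADING_DAYS) * excess.mean() / excess.std(ddof=1)

def outperformance(portfolio_returns, benchmark_returns):
    """Difference in annualized return between a portfolio and its benchmark."""
    return annualized_return(portfolio_returns) - annualized_return(benchmark_returns)
```

A portfolio would meet the stated targets if `outperformance(...)` is at least 0.03 and `sharpe_ratio(...)` exceeds 0.8 over the evaluation period.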

## **DATA UNDERSTANDING:**

The project will gather historical stock price data, financial statements, and market indices from various reliable sources such as financial databases, APIs, and market research firms. Detailed analysis will be conducted to understand the underlying trends, patterns, and correlations within the data to inform the development of the recommendation system. Additionally, data preprocessing techniques will be applied to clean and prepare the dataset for machine learning analysis, ensuring accurate and reliable results.

- Data source:
- Data size:
- Relevance to the project:

In [6]:
from bs4 import BeautifulSoup
import pickle
import requests
import datetime as dt
import os
import pandas as pd
import numpy as np
import pandas_datareader.data as web
import matplotlib.pyplot as plt
from math import sqrt
from matplotlib import style
style.use('ggplot')

In [8]:
tickers = pd.read_csv('csv_files/tickers_new.csv')
ticker_list = tickers['ticker'].values.tolist()
company_list = tickers['company'].values.tolist()
sector_list = tickers['sector'].values.tolist() 

#### Attach the company name and sector to each ticker's data

In [9]:
Y = []
for symbols,company,sector in zip(ticker_list,company_list,sector_list):
    df = pd.read_csv(f'stock_csvs/stock_pup_{symbols}.csv')
    df['symbol'] = symbols
    df['company'] = company
    df['sector']  = sector
    cols = df.columns.tolist()
    cols = cols[-3:] + cols[:-3]
    df = df[cols]
    Y.append(df)
fund_stocks = pd.concat(Y, sort = False)
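
The column reordering in the loop above can also be expressed with `DataFrame.insert`. The sketch below uses two made-up frames in place of the per-ticker CSV files; `tag_frame`, the tickers, and the values are hypothetical.

```python
import pandas as pd

def tag_frame(df, symbol, company, sector):
    """Return a copy of df with symbol, company and sector as the first three columns."""
    tagged = df.copy()
    tagged.insert(0, 'symbol', symbol)
    tagged.insert(1, 'company', company)
    tagged.insert(2, 'sector', sector)
    return tagged

# Toy stand-ins for two per-ticker CSV files (values are made up)
frames = {
    'AAA': pd.DataFrame({'Quarter end': ['2001-03-31'], 'Price': [10.0]}),
    'BBB': pd.DataFrame({'Quarter end': ['2001-03-31'], 'Price': [20.0]}),
}
meta = {'AAA': ('Alpha Co', 'Technology'), 'BBB': ('Beta Co', 'Energy')}

# Unlike the loop above, ignore_index=True renumbers the rows after concatenation
combined = pd.concat(
    [tag_frame(df, sym, *meta[sym]) for sym, df in frames.items()],
    sort=False,
    ignore_index=True,
)
```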

In [10]:
fund_stocks.shape

(62850, 44)

#### Quarter End to Date Time

In [11]:
# Quarter End to Date time
fund_stocks['Quarter end'] = pd.to_datetime(fund_stocks['Quarter end'])
fund_stocks.set_index("Quarter end", inplace=True)

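#### Keep data from 2000 onward

In [12]:
# Keep rows dated 2000 or later; a boolean mask also works when the index is not sorted.
# The name five_yr_fstock is kept as-is even though the span is longer than five years.
five_yr_fstock = fund_stocks[fund_stocks.index >= '2000-01-01']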

In [13]:
# Convert columns holding numeric strings to numeric dtypes; columns that cannot be parsed are left unchanged
five_yr_fstock = five_yr_fstock.apply(pd.to_numeric, errors='ignore')
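
Recent pandas releases deprecate `errors='ignore'` in `pd.to_numeric`. One possible replacement, sketched below on toy data, converts a column only when every non-null value parses. The helper name `to_numeric_if_possible` is an assumption for illustration and is not part of the notebook.

```python
import pandas as pd

def to_numeric_if_possible(column):
    """Convert a column to numeric if every non-null value parses; otherwise return it unchanged."""
    converted = pd.to_numeric(column, errors='coerce')
    # 'coerce' turns unparseable values into NaN; new NaNs mean the column is not fully numeric
    if converted.isna().sum() > column.isna().sum():
        return column
    return converted

# Toy frame: 'symbol' must stay text, the other columns hold numeric strings
toy = pd.DataFrame({
    'symbol': ['AAA', 'BBB'],
    'Price': ['10.5', '20'],
    'Shares': ['100', None],
})
toy_numeric = toy.apply(to_numeric_if_possible)
```

Applied column by column, this keeps text columns such as `symbol` intact while parsing the rest.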

#### Collect unmatched companies in a new DataFrame

In [14]:
# Rows whose company name could not be matched
nomatch_companies = five_yr_fstock[five_yr_fstock['company'] == 'nomatch']

#### Create a new DataFrame with matched companies

In [15]:
# Rows with a matched company and sector
company_fund = five_yr_fstock[five_yr_fstock['company'] != 'nomatch']

In [16]:
# 189 symbols were not matched to a Finviz company
nomatch_companies.groupby('symbol')['company'].nunique().value_counts()

company
1    189
Name: count, dtype: int64

In [17]:
# 566 companies were matched to a Finviz company
company_fund.groupby('company')['symbol'].nunique().value_counts()

symbol
1    566
Name: count, dtype: int64

#### Replace '0' and 'None' strings with NaN, then fill missing values with 0

In [18]:
# Treat the strings '0' and 'None' as missing so the columns can be parsed as numbers
company_fund = company_fund.replace(to_replace='0', value=np.nan)
company_fund = company_fund.replace(to_replace='None', value=np.nan)
company_fund = company_fund.apply(pd.to_numeric, errors='ignore')
# Fill every remaining missing value, including those in 'P/E ratio', with 0
company_fund = company_fund.fillna(0)

#### Assigning reported_pe for appropriate companies

In [19]:
company_fund['reported_pe'] = company_fund['P/E ratio'].apply(lambda x: 1 if x != 0 else 0)
company_fund['reported_pe'].value_counts()

reported_pe
1    35178
0     5388
Name: count, dtype: int64

#### Assigning reported_earnings for appropriate companies

In [20]:
company_fund['reported_earnings'] = company_fund['Earnings'].apply(lambda x: 1 if x != 0 else 0)
company_fund['reported_earnings'].value_counts()

reported_earnings
1    40414
0      152
Name: count, dtype: int64

In [21]:
# Flag a row as growth when the P/E ratio is unreported (0) or at least 25
company_fund['growth'] = company_fund['P/E ratio'].apply(lambda x: 1 if x == 0 or x >= 25 else 0)
company_fund['growth'].value_counts()

growth
0    24024
1    16542
Name: count, dtype: int64

####  Categorize Sectors

In [22]:
company_fund = company_fund.reset_index()
sectors = pd.get_dummies(company_fund['sector'], prefix= 'sector')
companies = company_fund.merge(sectors, left_index=True , right_index= True)

companies.set_index("Quarter end", inplace=True)
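
As a small illustration of the sector encoding above, the sketch below one-hot encodes a toy `sector` column. The tickers and sectors are made up, and `dtype=int` is an optional choice that yields 0/1 columns instead of booleans.

```python
import pandas as pd

# Toy data standing in for company_fund after reset_index (values are made up)
toy = pd.DataFrame({
    'symbol': ['AAA', 'BBB', 'CCC'],
    'sector': ['Energy', 'Technology', 'Energy'],
})

# One indicator column per sector, named sector_<value>
dummies = pd.get_dummies(toy['sector'], prefix='sector', dtype=int)

# join aligns on the shared row index, like merge(left_index=True, right_index=True)
encoded = toy.join(dummies)
```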

In [None]:
companies.to_csv('csv/companies.csv', index = True)

### Check Company Health And Growth Fundamentals

In [23]:
companies['health_cr'] = companies['Current ratio'].apply(lambda x: 1 if x >= 1.5 and x <= 3.0 else 0)
companies['health_dtbr'] = companies['Long-term debt to equity ratio'].apply(lambda x: 1 if x >= .05 else 0)
companies['growth_roa'] = companies['ROA'].apply(lambda x: 1 if x >= .05 else 0)
companies['growth_roe'] = companies['ROE'].apply(lambda x: 1 if x >= .1 else 0)
companies['health_dyp'] = companies['Dividend payout ratio'].apply(lambda x: 1 if x >= .55 else 0)
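
The threshold checks above can also be written without lambdas. The following sketch, on made-up values, uses `Series.between` for the two-sided current-ratio rule and `Series.ge` for a one-sided rule; the thresholds are copied from the cell above.

```python
import pandas as pd

# Toy data standing in for companies (values are made up)
toy = pd.DataFrame({
    'Current ratio': [1.0, 1.5, 2.2, 3.5],
    'ROA': [0.01, 0.05, 0.08, 0.20],
})

# between() is inclusive on both ends by default, matching 1.5 <= x <= 3.0
toy['health_cr'] = toy['Current ratio'].between(1.5, 3.0).astype(int)

# ge() is the method form of >=
toy['growth_roa'] = toy['ROA'].ge(0.05).astype(int)
```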

### Prep for Classification

In [24]:
classification = companies.drop(columns=['company', 'sector'])
classification_feat = classification.drop(columns =['P/E ratio', 'growth'])
classification_pred = pd.DataFrame(classification['growth'])
classification_pred = classification_pred.reset_index()
classification_pred = classification_pred.drop(columns= 'Quarter end')
sec_sym_com = pd.DataFrame(companies[['symbol', 'company', 'sector']])

In [26]:
classification_feat.to_csv('Classification_Models/classification_feat.csv')
classification_pred.to_csv('Classification_Models/classification_pred.csv')
sec_sym_com.to_csv('Classification_Models/sec_sym_com.csv')

##  Recommender System

In [27]:
# Map each symbol to an integer ID using the public ngroup() API rather than grouper internals
companies['symbol_ID'] = companies.groupby('symbol').ngroup().to_numpy()
companies_rec = companies
companies_rec.to_csv('recommendation_system/companies_rec.csv')
stz = companies[companies['symbol'] == 'STZ']
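
To show what the integer symbol IDs look like, here is a toy sketch using `GroupBy.ngroup()`, which numbers groups in sorted key order. The frame and prices are made up; only the `STZ` ticker comes from the cell above.

```python
import pandas as pd

# Toy data standing in for companies (values are made up)
toy = pd.DataFrame({
    'symbol': ['STZ', 'AAA', 'STZ', 'BBB'],
    'Price': [200.0, 10.0, 210.0, 20.0],
})

# Each distinct symbol gets one integer ID, assigned in sorted order of the symbols
toy['symbol_ID'] = toy.groupby('symbol').ngroup()

# Select all rows for a single ticker
stz_rows = toy[toy['symbol'] == 'STZ']
```

Every row of a given ticker shares the same ID, which gives a recommender model a compact numeric key for each item.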