In [1]:
from IPython.display import display, Math, Latex

import pandas as pd
import numpy as np
import numpy_financial as npf
import yfinance as yf
import matplotlib.pyplot as plt
from datetime import datetime

In [2]:
# Read all tickers from the stock file and convert them to a list
tickers = pd.read_csv('Tickers.csv', header=None)
tickers.columns = ['stocks']
data = tickers['stocks'].tolist()

In [None]:
# Keep only tickers that trade in USD and whose average daily volume over the
# sample period exceeds 10,000 shares; store them in the list 'filtered_stocks'
# and in the DataFrame 'df'
filtered_stocks = []
for stock in data:
    ticker = yf.Ticker(stock)  # Build the Ticker object once per symbol
    if ticker.info.get('currency') != 'USD':
        continue
    avg_volume = ticker.history(start='2023-01-01', end='2023-03-08')['Volume'].mean()
    if avg_volume > 10000:
        filtered_stocks.append(stock)
df = pd.DataFrame(filtered_stocks, columns=['ticker'])
df.head()

In [None]:
# Get the daily closing prices over the sample period for every filtered stock
# in the DataFrame called 'initial_portfolio_price' (one column per ticker)
initial_portfolio_price = pd.DataFrame({
    ticker: yf.Ticker(ticker).history(start='2023-01-01', end='2023-03-08')['Close']
    for ticker in df['ticker']
})
initial_portfolio_price

The cells above filter the ticker list and collect the closing prices needed for the calculations that follow.

In [None]:
# Get the daily returns of each stock in the DataFrame called 'initial_portfolio_pct',
# indexed by Date; the first row is dropped because it has no previous day to compare with
initial_portfolio_pct = initial_portfolio_price.pct_change().iloc[1:]
initial_portfolio_pct

In [None]:
# Get the standard deviation of each stock's daily returns for further processing
# in the DataFrame called 'initial_portfolio_std', indexed by ticker
initial_portfolio_std = initial_portfolio_pct.std().to_frame('Standard Deviation')
initial_portfolio_std.head()

Because the portfolio is valued on a single day, long-term stock information is of limited use. The calculations are therefore based on the daily percentage change in price and on the standard deviation of those changes, which describe how much each stock typically moves from one day to the next and so are more relevant to the day on which the portfolio value is decided.
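
As a minimal sketch of the two quantities described above, the example below computes daily percentage changes and their standard deviation on a tiny table of made-up prices. The tickers `AAA` and `BBB` and all price values are hypothetical and only illustrate the mechanics.

```python
import pandas as pd

# Hypothetical closing prices for two made-up tickers (illustrative values only)
prices = pd.DataFrame(
    {"AAA": [100.0, 102.0, 99.96], "BBB": [50.0, 50.0, 50.0]},
    index=pd.to_datetime(["2023-01-03", "2023-01-04", "2023-01-05"]),
)

# Daily percentage change; the first row is NaN (no previous day), so drop it
daily_returns = prices.pct_change().iloc[1:]

# Sample standard deviation (ddof=1) of the daily returns, one value per ticker
daily_std = daily_returns.std()

print(daily_returns)
print(daily_std)
```

`AAA` moves by about +2% and then -2%, so it has a positive standard deviation, while the flat `BBB` has a standard deviation of zero.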

In [None]:
# Get the ten stocks with the largest standard deviations among all filtered stocks in the DataFrame called 'biggest_std_stocks'
biggest_std_stocks = initial_portfolio_std.nlargest(10, 'Standard Deviation')
biggest_std_stocks.head()

The objective of this strategy is to build the riskiest portfolio possible by choosing the individual stocks with the highest volatility, measured by the standard deviation of each stock's returns. As a statistical measure, standard deviation describes how far the values in a data set typically lie from their mean. Applied to stocks, bonds, and other financial instruments, it indicates volatility through the spread of returns over a period of time.

The higher the standard deviation of an investment's returns, the higher the volatility and the risk associated with that investment. Volatile securities or funds typically display a higher standard deviation than stable ones.

A high standard deviation is generally regarded as riskier because the performance of the investment can change dramatically in either direction at any moment. The strategy therefore selects only ten stocks. This follows the principle of diversification established in previous analyses: the more stocks a portfolio holds, the less risky it tends to be. Limiting the number of stocks is thus a key element in raising the risk level.
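
The diversification argument can be illustrated with a small calculation. This is a sketch under simplifying assumptions that are not part of the notebook: every stock has the same daily standard deviation `sigma`, every pair has the same correlation `rho`, and the portfolio is equally weighted. The helper `equal_weight_std` is hypothetical.

```python
import numpy as np

sigma = 0.03  # assumed daily standard deviation of every stock (illustrative)

def equal_weight_std(n, sigma, rho=0.0):
    """Std of an equal-weighted portfolio of n stocks that share the same
    standard deviation and the same pairwise correlation rho."""
    cov = np.full((n, n), rho * sigma**2)  # off-diagonal covariances
    np.fill_diagonal(cov, sigma**2)        # variances on the diagonal
    w = np.full(n, 1.0 / n)                # equal weights summing to 1
    return float(np.sqrt(w @ cov @ w))

# Portfolio risk for 1, 10 and 50 uncorrelated stocks
for n in (1, 10, 50):
    print(n, equal_weight_std(n, sigma))
```

With uncorrelated stocks the portfolio standard deviation falls as `sigma / sqrt(n)`, which is why holding fewer stocks keeps the portfolio riskier.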

In [None]:
# Convert the DataFrame 'biggest_std_stocks' into a list of the same name
# holding the tickers we will invest in to achieve a risky portfolio
biggest_std_stocks = list(biggest_std_stocks.index.values)
biggest_std_stocks

In [None]:
# Get the daily returns of each invested stock in the DataFrame called 'formered_portfolio_pct'
formered_portfolio_pct = initial_portfolio_pct[biggest_std_stocks]
formered_portfolio_pct

In [None]:
# Get the covariance between every pair of invested stocks (and each stock's own variance
# on the diagonal) in the DataFrame called 'portfolio_variance'
portfolio_variance = formered_portfolio_pct.cov()
portfolio_variance.head()

As the matrix shows, the selected stocks are mostly positively related to each other, meaning that they tend to rise and fall at the same time. This is the desired combination, because stocks that move together offset each other less and so produce the largest swings in the value of our portfolio.
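
The sign of the relationship between two stocks can be read from either a correlation or a covariance matrix, but only the covariance matrix carries the scale needed for a portfolio variance. The sketch below, on randomly generated returns for three hypothetical tickers, shows how the two are related: each covariance equals the correlation multiplied by the two standard deviations.

```python
import numpy as np
import pandas as pd

# Randomly generated daily returns for three made-up tickers (illustrative only)
rng = np.random.default_rng(0)
returns = pd.DataFrame(
    rng.normal(0, 0.02, size=(40, 3)), columns=["AAA", "BBB", "CCC"]
)

corr = returns.corr()  # unit-free, entries between -1 and 1
cov = returns.cov()    # same sign pattern, scaled by the standard deviations
std = returns.std()

# cov[i, j] = corr[i, j] * std[i] * std[j]
rebuilt_cov = corr * np.outer(std, std)

print(cov.round(6))
print(rebuilt_cov.round(6))
```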

To find the optimal weight of each stock, we look for the weights that give the maximum expected return. For this we use the efficient frontier, which shows the maximum return available for a given level of volatility or, conversely, the volatility that must be accepted for a given level of return.

In [None]:
# Get the volatility of each invested stock over the period by scaling its daily standard
# deviation by the square root of the number of trading days, in the DataFrame called 'volatility'.
# The day count is taken from the data rather than hard-coded so it always matches the sample period.
n_days = len(formered_portfolio_pct)
volatility = (formered_portfolio_pct.std() * np.sqrt(n_days)).to_frame('Volatility')
volatility.head()

In [None]:
# Get the expected (mean daily) return of each invested stock in the DataFrame called 'expected_return'
expected_return = formered_portfolio_pct.mean().to_frame('Expected Return')

In [None]:
# Create a DataFrame called 'assets' for viewing the expected returns and volatility of the invested stocks
assets = pd.concat([expected_return, volatility], axis=1)
assets

In [None]:
p_ret = [] # Define an empty list for portfolio returns
p_vol = [] # Define an empty list for portfolio volatility
p_weights = [] # Define an empty list for asset weights

num_assets = len(formered_portfolio_pct.columns)
num_portfolios = 50000

In [None]:
# Run the loop 50,000 times; each pass draws random weights for the assets and
# calculates the return and volatility of that particular portfolio
for portfolio in range(num_portfolios):
    weights = np.random.random(num_assets) # Draw random positive numbers for the weights
    weights = weights/np.sum(weights) # Normalise so that the weights sum to 1
    p_weights.append(weights)
    # Portfolio return is the weighted sum of the individual expected returns (a scalar)
    returns = np.dot(weights, expected_return['Expected Return'])
    p_ret.append(returns)
    var = weights @ portfolio_variance.to_numpy() @ weights # Portfolio variance
    sd = np.sqrt(var) # Daily standard deviation
    period_sd = sd*np.sqrt(len(formered_portfolio_pct)) # Scale to the whole sample period = volatility
    p_vol.append(period_sd)

In [None]:
# Collect Returns and Volatility, then add one weight column per stock
data = {'Returns':p_ret, 'Volatility':p_vol}
for counter, symbol in enumerate(formered_portfolio_pct.columns.tolist()):
    data[symbol+' weight'] = [w[counter] for w in p_weights]

In [None]:
# Create a DataFrame called 'portfolios' that shows the return and volatility of each set of weights
portfolios = pd.DataFrame(data)
portfolios.head()

In [None]:
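# Keep only portfolios in which every weight is at least (100/(2n))% = 5% (for n = 10 stocks) and at most 35%
weight_columns = portfolios.columns[2:] # The first two columns are 'Returns' and 'Volatility'
within_limits = portfolios[weight_columns].ge(0.05) & portfolios[weight_columns].le(0.35)
portfolios = portfolios[within_limits.all(axis=1)]
portfolios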

In [None]:
# Plot the graph with x-axis 'Volatility' and y-axis 'Returns' to visualize each portfolio
portfolios.plot.scatter(x='Volatility', y='Returns', marker='o', s=10, alpha=0.3, grid=True, figsize=[10,10])

In [None]:
# Reset the index and discard the previous index instead of keeping it as a column
portfolios.reset_index(drop=True, inplace=True)

In [None]:
# Display the filtered portfolio weights
portfolios

Each point on the line (the left edge of the plot) represents an optimal portfolio of stocks that maximises the return for a given level of risk. The points in the interior are sub-optimal: for every interior point, there is another portfolio that offers a higher return for the same risk. Since we are looking for the riskiest portfolio, we choose the point with the maximum volatility, which, given the positive relationship between volatility and returns, is also where the return is at or near its highest.
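
One rough way to pick out the edge points and the riskiest point from a cloud of simulated portfolios is sketched below. It is not the notebook's method: the three assets, their expected returns `mu`, the diagonal covariance matrix `cov`, and the 20-bucket approximation of the frontier are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
mu = np.array([0.001, 0.002, 0.003])     # assumed daily expected returns
cov = np.diag([0.01, 0.02, 0.03]) ** 2   # assumed covariance matrix (uncorrelated assets)

# 2,000 random weight vectors, each summing to 1
weights = rng.dirichlet(np.ones(3), size=2000)
sim = pd.DataFrame({
    "Returns": weights @ mu,
    # w' * cov * w for every row of weights, then the square root
    "Volatility": np.sqrt(np.einsum("ij,jk,ik->i", weights, cov, weights)),
})

# Approximate frontier: the best return inside each volatility bucket
sim["bucket"] = pd.cut(sim["Volatility"], bins=20)
frontier = sim.loc[sim.groupby("bucket", observed=True)["Returns"].idxmax()]

# Riskiest portfolio: the single point with the largest volatility
riskiest = sim.loc[sim["Volatility"].idxmax()]

print(frontier[["Volatility", "Returns"]].sort_values("Volatility"))
print(riskiest[["Volatility", "Returns"]])
```

Every other point in a bucket lies below the frontier point of that bucket, which is the sense in which interior portfolios are sub-optimal.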

In [None]:
# Find the portfolio with the largest volatility and show the weight of each invested stock in it
# (idxmax returns an index label, so use label-based .loc)
max_vol_port = portfolios.loc[portfolios['Volatility'].idxmax()]
max_vol_port

In [None]:
# Plot the maximum-volatility portfolio on top of all simulated portfolios
plt.subplots(figsize=[10,10])
plt.scatter(portfolios['Volatility'], portfolios['Returns'], marker='o', s=10, alpha=0.3)
plt.scatter(max_vol_port['Volatility'], max_vol_port['Returns'], color='r', marker='*', s=500)
plt.xlabel('Volatility')
plt.ylabel('Returns')

In [None]:
# Extract each weight of invested stock from the above portfolio in the DataFrame called 'max_port_weight'
max_port_weight = pd.DataFrame(max_vol_port)
max_port_weight.drop(['Returns','Volatility'], axis=0, inplace=True)
max_port_weight.index = biggest_std_stocks
max_port_weight.columns = ['Weight']
max_port_weight

In [None]:
# Start building the DataFrame 'FinalPortfolio' with the column 'Ticker' and an index running from 1 to 10
FinalPortfolio = pd.DataFrame(biggest_std_stocks)
FinalPortfolio.columns = ['Ticker']
FinalPortfolio.index = FinalPortfolio.index + 1
FinalPortfolio

In [None]:
# Add a new column 'Price' to the DataFrame 'FinalPortfolio' which contains the closing price on 2021-11-26
# ('end' is exclusive in yfinance, so this window covers that single day)
FinalPortfolio['Price'] = [
    yf.Ticker(ticker).history(start='2021-11-26', end='2021-11-27')['Close'].iloc[0]
    for ticker in biggest_std_stocks
]
FinalPortfolio

In [None]:
# Build the DataFrame 'investment' from 'max_port_weight' by adding the columns 'Investment'
# (the principal allocated to each stock out of 100,000, according to its weight) and 'Shares'
# (that principal divided by the closing price on 2021-11-26)
investment = max_port_weight.copy()
investment['Investment'] = 100000 * investment['Weight']
# Both frames list the tickers in the same order, so divide position by position
investment['Shares'] = investment['Investment'] / FinalPortfolio['Price'].to_numpy()
investment

In [None]:
# Add a new column 'Shares' to the DataFrame 'FinalPortfolio' which contains data of shares bought that is extracted from the DataFrame 'investment'
investment.reset_index(inplace=True)
investment.index = investment.index + 1
FinalPortfolio['Shares'] = investment[['Shares']]
FinalPortfolio.head()

In [None]:
# Add a new column 'Value' to the DataFrame 'FinalPortfolio' which contains the value of each invested stock, i.e. closing price multiplied by shares bought
FinalPortfolio['Value'] = FinalPortfolio['Price'] * FinalPortfolio['Shares']
FinalPortfolio.head()

In [None]:
# Add a new column 'Weight' to the DataFrame 'FinalPortfolio' which contains the weight of each invested stock that is extracted from the DataFrame called 'max_port_weight'
max_port_weight.reset_index(inplace=True)
max_port_weight.index = max_port_weight.index + 1
FinalPortfolio['Weight'] = max_port_weight[['Weight']]
FinalPortfolio.head()

In [None]:
# Show that the values of the invested stocks sum to roughly 100,000
FinalPortfolio['Value'].sum()

In [None]:
# Show that the weights of the invested stocks sum to roughly 1
FinalPortfolio['Weight'].sum()

In [None]:
# Finally, build the DataFrame 'Stocks' containing only the ticker and number of shares of each invested stock and write it to the csv file 'Stocks_Group_21.csv'
Stocks = FinalPortfolio[['Ticker', 'Shares']]
Stocks.to_csv('Stocks_Group_21.csv')


To determine the weight of each stock in the investment, a loop is run 50,000 times, each time drawing a random set of weights and calculating the expected return and volatility of the resulting portfolio. The calculation uses the expected return and standard deviation of each stock, as well as the covariance between each pair of stocks, which together determine the portfolio volatility. This makes it possible to identify the portfolio with the highest volatility and expected return, which is then selected as the final portfolio.
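
The quantities calculated for each random set of weights can be written compactly. This is a sketch in standard mean-variance notation; the symbols $w$ (weights), $\mu$ (expected daily returns), $\Sigma$ (covariance matrix of daily returns), $n$ (number of stocks) and $T$ (number of trading days in the period) are introduced here for illustration and do not appear in the notebook.

```latex
\begin{aligned}
\sum_{i=1}^{n} w_i &= 1, \\
\mathbb{E}[R_p] &= \sum_{i=1}^{n} w_i \mu_i = w^{\top}\mu, \\
\sigma_p^{2} &= \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \Sigma_{ij} = w^{\top}\Sigma\, w, \\
\sigma_{\text{period}} &= \sigma_p \sqrt{T}.
\end{aligned}
```

The last line scales the daily standard deviation to the whole period, which assumes that daily returns are independent of one another.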

Combining the expected return, standard deviation, and covariance of the stocks in this way follows the principles of Markowitz Portfolio Theory: both the mean and the variance are considered when analysing each set of stock weights. The ultimate goal is to find the riskiest and most profitable portfolio. As illustrated in the graph, the simulated portfolios may form an ellipse-like shape, showing a positive relationship between volatility and returns. The output is therefore the point that represents the riskiest but most profitable portfolio.

Following this, the closing price of each stock is obtained from a finance data provider, here Yahoo Finance. With this data, the principal invested in each stock at its chosen weight is calculated, along with the number of shares bought and the value of each position. The result is a high-risk portfolio built on the mean-variance analysis described above.
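
The step from weights to share counts can be sketched on made-up numbers. The tickers, weights and prices below are hypothetical; only the 100,000 principal comes from the notebook. Fractional shares are allowed, as in the cells above.

```python
import pandas as pd

capital = 100_000  # total principal, as used in the notebook

# Hypothetical weights (summing to 1) and closing prices for three made-up tickers
weights = pd.Series({"AAA": 0.35, "BBB": 0.35, "CCC": 0.30})
prices = pd.Series({"AAA": 50.0, "BBB": 20.0, "CCC": 125.0})

allocation = pd.DataFrame({"Weight": weights, "Price": prices})
allocation["Investment"] = capital * allocation["Weight"]               # principal per stock
allocation["Shares"] = allocation["Investment"] / allocation["Price"]   # shares bought
allocation["Value"] = allocation["Shares"] * allocation["Price"]        # position value

print(allocation)
print(allocation["Value"].sum())
```

Because shares are computed as investment divided by price, the position values add back up to the principal, apart from floating-point rounding.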