# Setup

For my 6.S098 Final Project, I worked in collaboration with Andres Arroyo (Class Listener) to develop a risk adverse stock portfolio generator. The initial goal was to use stock variation to speculate which stocks vary together and diversify a stock portfolio across different investments. We only used data from Yahoo on the top 100 S&P 500 companies. A more rigirous approach to the problem would be using investment strageties across many types of investments, such as other stocks, bonds, mutual funds, and index traded funds and generate future data about market crashes and probable worst-case finance situations. 

Outline: 
In our setup, we first generated an initial porfolio based on market data for the top 10 S&P500 stocks based on a risk adversion parameter λ. We also tested adding an additional constraint "WeightRange", which constricts the maximum amount one can invest in any one company. This allows for some flexibility in the stock choice, but not that the most optimal solution for just risk adversion would result from simply increasing λ. Interestingly, increasing λ to extremely high values (like 1-10^-8) still does not nessasarily result in an even distribution between all stocks. This is likely because there are certain stocks to pick that have little variance. 

In [1]:
using Plots, LinearAlgebra, Convex, SCS, Random, Dates 

# Market Data

In [6]:
using StaticArrays
using MarketData
import Statistics as stats
using NLopt
using ForwardDiff

In [226]:
t = Dates.now()
# Example Data Pulled from just Apple 
MarketData.yahoo("AAPL", MarketData.YahooOpt(period1 = t - Year(2), period2 = t))

└ @ CSV /Users/jacobhansen/.julia/packages/CSV/0Elut/src/context.jl:266


504×6 TimeArray{Float64, 2, Date, Matrix{Float64}} 2020-01-27 to 2022-01-24
│            │ Open    │ High    │ Low     │ Close   │ AdjClose │ Volume     │
├────────────┼─────────┼─────────┼─────────┼─────────┼──────────┼────────────┤
│ 2020-01-27 │ 77.515  │ 77.9425 │ 76.22   │ 77.2375 │ 76.107   │ 1.6194e8   │
│ 2020-01-28 │ 78.15   │ 79.6    │ 78.0475 │ 79.4225 │ 78.26    │ 1.62234e8  │
│ 2020-01-29 │ 81.1125 │ 81.9625 │ 80.345  │ 81.085  │ 79.8982  │ 2.162292e8 │
│ 2020-01-30 │ 80.135  │ 81.0225 │ 79.6875 │ 80.9675 │ 79.7824  │ 1.267432e8 │
│ 2020-01-31 │ 80.2325 │ 80.67   │ 77.0725 │ 77.3775 │ 76.245   │ 1.995884e8 │
│ 2020-02-03 │ 76.075  │ 78.3725 │ 75.555  │ 77.165  │ 76.0356  │ 1.737884e8 │
│ 2020-02-04 │ 78.8275 │ 79.91   │ 78.4075 │ 79.7125 │ 78.5458  │ 1.366164e8 │
│ 2020-02-05 │ 80.88   │ 81.19   │ 79.7375 │ 80.3625 │ 79.1863  │ 1.188268e8 │
│ ⋮          │ ⋮       │ ⋮       │ ⋮       │ ⋮       │ ⋮        │ ⋮          │
│ 2022-01-13 │ 175.78  │ 176.62  │ 171.79  │ 172.19  │ 

Below is code for getting the stock data. We first pull stock prices over a given period and store them in arrays. We then calculate the mean returns and the variance. All of this then can be used to calculate the covariance matrix. Then returns the mean array, zeroMeanRets (just ignore that), and the covariance matrix. 

In [269]:
function pct_change(input::AbstractVector{<:Number})
    [i == 1 ? missing : (input[i]-input[i-1])/input[i-1] for i in eachindex(input)]
end

function getStockData(stocks, periodStart, periodEnd)
    dataArr = []

    for stock in stocks
        data = MarketData.yahoo(stock, MarketData.YahooOpt(period1 = periodStart, period2 = periodEnd))
        closeValues = values(data[:Close])
        push!(dataArr, closeValues)
    end
    dataDims = [length(dataArr[1]), length(dataArr)]
    meanRetsArr = Array{Float64, 2}(undef, 1, dataDims[2])
    zeroMeanRetsArr = Array{Float64, 2}(undef, dataDims[1]-1, dataDims[2])
    for i in 1:dataDims[2]
        rets = pct_change(dataArr[i])[2:end]
        meanRet = stats.mean(rets)
        meanRetsArr[i] = meanRet

        zeroMeanRets = [ret - meanRet for ret in rets]
        zeroMeanRetsArr[:, i] = zeroMeanRets
    end

    covsArr =  transpose(zeroMeanRetsArr) * zeroMeanRetsArr / (length(zeroMeanRetsArr[:, 1]) - 1)
    
    return vec(meanRetsArr), zeroMeanRetsArr, covsArr
    # return meanRetsArr, covars
end

getStockData (generic function with 1 method)

Starting with the top 10 S&P 500 Stocks

In [None]:
#stocks you want to check
stocks = ["AAPL", "MSFT", "AMZN", "GOOGL", "FB", "NVDA", "TSLA", "UNH", "BRK-B", "GOOG"]
meanRets, zeroMeanRets, covs = getStockData(stocks, t - Year(5), t - Year(3))
println(meanRets)
println(covs)

# Initial Porfolio Optimizations

In [65]:
function FindInitialPorfolio(mean, covs, λ, weightRange)

    w = Variable(length(mean))
    objective = λ*quadform(w, covs)-mean'*w
    problem = minimize(objective)

    problem.constraints += w .>= weightRange[1]
    problem.constraints += w .<= weightRange[2]
    problem.constraints += sum(w) == 1

    solve!(problem, SCS.Optimizer(verbose = false))
    return problem.optval, w.value
end

FindInitialPorfolio (generic function with 2 methods)

In [271]:
results, w = FindInitialPorfolio(meanRets, covs, 0.8, [0; 1/2])
println(results)
for i in 1:length(w)
    if w[i] > 0.01
        print("It may be optimal to invest ")
        print(w[i])
        print(" of your portfolio in ")
        println(stocks[i])
    end
end
#plot(results)

-0.0011632669824761606
It may be optimal to invest 0.046744238636264215 of your portfolio in MSFT
It may be optimal to invest 0.4999989284004524 of your portfolio in AMZN
It may be optimal to invest 0.4532568742562694 of your portfolio in UNH


-0.0011632669824761606
It may be optimal to invest 0.046744238636264215 of your portfolio in MSFT
It may be optimal to invest 0.4999989284004524 of your portfolio in AMZN
It may be optimal to invest 0.4532568742562694 of your portfolio in UNH

# Porfolio Diversification

In [125]:
text = "AAPL\nMSFT\nGOOG\nGOOGL\nAMZN\nTSLA\nFB\nBRK.B\nNVDA\nV\nJPM\nUNH\nJNJ\nPG\nWMT\nBAC\nHD\nMA\nXOM\nPFE\nDIS\nKO\nCVX\nCSCO\nADBE\nPEP\nABBV\nLLY\nTMO\nCMCSA\nAVGO\nWFC\nACN\nNFLX\nVZ\nNKE\nORCL\nABT\nCRM\nCOST\nINTC\nMRK\nPYPL\nDHR\nT\nMCD\nQCOM\nMS\nUPS\nSCHW\nLIN\nNEE\nTXN\nPM\nUNP\nLOW\nINTU\nAMD\nHON\nBMY\nMDT\nCVS\nRTX\nC\nTMUS\nAMGN\nBA\nBLK\nAXP\nAMAT\nIBM\nCAT\nGS\nPLD\nDE\nCOP\nSBUX\nAMT\nANTM\nEL\nGE\nTGT\nISRG\nLMT\nCHTR\nNOW\nSPGI\nMMM\nBKNG\nSYK\nZTS\nMU\nMDLZ\nADP\nMO\nPNC\nLRCX\nGILD\nF\nUSB\nTFC\nCB\nCME\nTJX\nMMC\nCI\nCSX\nGM\nDUK\nCCI\nSHW\nBDX\nHCA\nITW\nEW\nSO\nICE\nNSC\nCL\nFISV\nFIS\nMRNA\nFDX\nEQIX\nREGN\nETN\nMCO\nWM\nCOF\nD\nATVI\nAPD\nFCX\nNOC\nPGR\nBSX\nPSA\nECL\nEOG\nAON\nILMN\nADI\nGD\nVRTX\nKLAC\nMET\nEXC\nEMR\nADSK\nNXPI\nPXD\nSLB\nJCI\nMAR\nNEM\nTEL\nFTNT\nINFO\nBK\nHUM\nDG\nAIG\nSPG\nKMB\nSNPS\nIQV\nAPH\nROP\nCNC\nXLNX\nWBA\nSTZ\nMNST\nKHC\nAEP\nCTSH\nMPC\nDLR\nLHX\nORLY\nPAYX\nIDXX\nSRE\nBAX\nDOW\nMSCI\nPRU\nDXCM\nA\nGPN\nCDNS\nCARR\nHSY\nDD\nGIS\nTT\nMCHP\nAFL\nRSG\nTRV\nPH\nMSI\nSYY\nCMG\nAZO\nHLT\nKMI\nEBAY\nCTAS\nEA\nHPQ\nALGN\nADM\nMCK\nAPTV\nSIVB\nPPG\nPSX\nWELL\nTROW\nXEL\nROK\nYUM\nIFF\nODFL\nWMB\nOTIS\nTDG\nKR\nSBAC\nRMD\nAMP\nSTT\nROST\nALL\nDFS\nAVB\nCTVA\nMTD\nMTCH\nCBRE\nLVS\nVLO\nBIIB\nPEG\nEQR\nTSN\nDVN\nOXY\nKEYS\nCMI\nFAST\nAJG\nPCAR\nLYB\nVRSK\nFITB\nAME\nDHI\nBF.B\nARE\nWEC\nCPRT\nNDAQ\nGLW\nTWTR\nES\nFRC\nAWK\nBLL\nSWK\nANSS\nED\nLEN\nDLTR\nWY\nNUE\nWTW\nEFX\nWST\nBKR\nABC\nEPAM\nHES\nO\nCERN\nZBRA\nHRL\nOKE\nLUV\nVFC\nEXR\nLH\nEXPE\nZBH\nCDW\nFTV\nMKC\nALB\nGWW\nLYV\nHAL\nDOV\nVMC\nNTRS\nCHD\nDAL\nSYF\nMLM\nRJF\nHIG\nHBAN\nEIX\nVRSN\nTSCO\nTER\nGRMN\nSWKS\nIR\nMAA\nKEY\nCCL\nDTE\nIT\nPPL\nBBY\nSTE\nAEE\nK\nFE\nSTX\nURI\nCFG\nETR\nPKI\nFANG\nDRE\nVIAC\nESS\nCLX\nHPE\nMTB\nRF\nJBHT\nFOX\nFOXA\nCOO\nRCL\nSBNY\nVTR\nETSY\nEXPD\nPAYC\nPFG\nNTAP\nTDY\nMGM\nULTA\nWAT\nXYL\nPOOL\nFLT\nTTWO\nPEAK\nTYL\nGPC\nWDC\nCINF\nIP\nMPWR\nBR\nAMCR\nCMS\nAKAM\nTRMB\nBXP\nENPH\nBRO\nNVR\nCE\nGNRC\nVTRS\nBIO\nHOLX\nUDR\nDISH\nCTLT\nDRI\nWAB\nKMX\nCNP\nCAG\nDGX\nAVY\nCRL\nINCY\nDPZ\nCZR\nBEN\nIEX\nEMN\nJ\nCTRA\nTXT\nFDS\nOMC\nMAS\nLKQ\nAES\nROL\nMOS\nSJM\nNLOK\nTFX\nQRVO\nL\nLNT\nTECH\nEVRG\nWRB\nKIM\nCAH\nMRO\nHWM\nAAP\nPWR\nIPG\nMKTX\nATO\nFFIV\nUAL\nFMC\nCF\nABMD\nDISCA\nDISCK\nBBWI\nCHRW\nPTC\nHAS\nCPB\nPHM\nFBHS\nLDOS\nNWS\nNWSA\nPKG\nCBOE\nCTXS\nLNC\nAOS\nHST\nLUMN\nIRM\nCMA\nWHR\nSEDG\nRHI\nREG\nJKHY\nWRK\nCDAY\nXRAY\nRE\nAPA\nSNA\nDVA\nNI\nMHK\nUHS\nAAL\nJNPR\nALLE\nTAP\nPNR\nBWA\nIVZ\nGL\nHSIC\nTPR\nSEE\nWYNN\nZION\nNWL\nFRT\nLW\nNRG\nANET\nAIZ\nPBCT\nVNO\nDXC\nOGN\nRL\nPNW\nIPGP\nUA\nUAA\nHII\nPENN\nNLSN\nPVH\nALK\nGPS\nNCLH"
stocks2 = Vector{String}(split(text, "\n"))
for i in 1:length(stocks2)
    stocks2[i] = replace(stocks2[i], "." => "-")
end

In [None]:
diffValue = 10
meanRets60, zeroMeanRets60, covs60 = getStockData(stocks2[diffValue:60], t - Year(5), t - Year(3))

In [410]:
function SubsequentPort(mean, covs, λ, weightRange, r)

    w = Variable(length(mean))
    objective = λ*quadform(w, covs)-mean'*w+λ*w'*covs*r
    problem = minimize(objective)

    problem.constraints += w .>= weightRange[1]
    problem.constraints += w .<= weightRange[2]
    problem.constraints += sum(w) == 1

    solve!(problem, SCS.Optimizer(verbose = false))
    optValue = λ*w.value[1:end]'*covs*w.value[1:end]-mean'*w.value[1:end]
    return optValue, w.value[1:end]
end

SubsequentPort (generic function with 2 methods)

In [429]:
risk = 0.2
minCompanies = 1
similarCompanyPen = 10
penalty = zeros(length(meanRets60))*similarCompanyPen
results = [zeros(length(meanRets60)), [0], [0], [0], [0]]
optValues = zeros(5)
for i in 1:5
    opt, newW = SubsequentPort(meanRets60, covs60, risk, [0; 1/minCompanies], penalty)
    results[i] = newW
    optValues[i] = opt
    penalty += abs.(newW)*similarCompanyPen
end

In [430]:
for portNum in 1:5
    print("For Portfolio ")
    print(portNum)
    print(": Consider investing in ")
    for i in 1:length(results[portNum])
        if results[portNum][i] > 0.001
            print(stocks2[i+diffValue])
            print(" (")
            print(results[portNum][i])
            print("), ")
        end
    end
    print(" for an optValue of ")
    println(optValues[portNum])
end

For Portfolio 1: Consider investing in VZ (0.5370896632261059), HON (0.46291037488347825),  for an optValue of -0.0019391357356989113
For Portfolio 2: Consider investing in DHR (0.9999999998472113),  for an optValue of -0.0016817929914812147
For Portfolio 3: Consider investing in TXN (1.000000000027387),  for an optValue of -0.0008533451967226154
For Portfolio 4: Consider investing in TMO (0.4898508737214801), CRM (0.06546915053287419), QCOM (0.17470742562292788), TXN (0.26997334598274564),  for an optValue of -0.000917025173099823
For Portfolio 5: Consider investing in CRM (0.0022976026626978243), QCOM (0.33091437077184715), TXN (0.6667876870301028),  for an optValue of -0.0008749411466819875


# Portfolio Evaluation (build portfolios on 2017-2019, test on 2020-2022)

### Year 2017-2019 for top 10-100 stocks

In [None]:
meanRets100, zeroMeanRets100, covs100 = getStockData(stocks2[diffValue:100], t - Year(5), t - Year(3))

### Year 2015-2017 for top 10-50 stocks

In [None]:
meanRetsOld, zeroMeanRetsOld, covsOld = getStockData(stocks2[diffValue:40], t - Year(7), t - Year(5))

### Year 2017-2019 for top 10-50 stocks

In [None]:
meanRetsSmall, zeroMeanRetsSmall, covsSmall = getStockData(stocks2[diffValue:40], t - Year(5), t - Year(3))

### See performance of 2015-2017 analysis on 2017-2019 

In [314]:
optOld, newOptOld = FindInitialPorfolio(meanRetsOld, covsOld, 0.8, [0; 1/3])
print(0.8*newOptOld'*covsSmall*newOptOld-meanRetsSmall'*newOptOld)


[-0.0012382367445030909]

In [325]:
optSmall, newOptSmall = FindInitialPorfolio(meanRetsSmall, covsSmall, 0.8, [0; 1/3])
print(0.8*newOptSmall'*covsOld*newOptSmall-meanRetsOld'*newOptSmall)


[-0.0012858804290798807]

### Fudging Numbers to Guess the Best Portfolio

In [373]:
opt100, newOpt100 = FindInitialPorfolio(meanRets100, covs100, 0.8, [0; 1/6]) # Best guess in diverse portfolio
opt100s, newOpt100s = FindInitialPorfolio(meanRets100, covs100, 0, [0; 1]) # Investing everything in the current top stock 

(-0.00216081471384127, [-2.7413823764254177e-9; -5.313804031909187e-11; … ; 6.061657167668457e-9; 8.09712669399809e-9])

### Running My Best Guess on Years 2019-2022

In [None]:
meanRetsRecent, zeroMeanRetsRecent, covsRecent = getStockData(stocks2[diffValue:100], t - Year(3), t - Year(0))

In [387]:
print("For my diverse portfolio, the objective function results are: ")
println((0.8*newOpt100'*covsRecent*newOpt100-meanRetsRecent'*newOpt100)[1])
print("Investing ")
for i in 1:length(newOpt100)
    if newOpt100[i] > 0.001
        print(round(newOpt100[i], digits = 3))
        print(" in ")
        print(stocks2[i+diffValue])
        print(", ")
    end
end
println("")
print("Yields returns of ")
print(round((meanRetsRecent'*newOpt100*3*255)[1]*100))
println("%")
println("")
print("While for the naive investment, the objective function results are: ")
println((0.8*newOpt100s'*covsRecent*newOpt100s-meanRetsRecent'*newOpt100s)[1])


print("Naive returns are ")
print(round((meanRetsRecent'*newOpt100s*3*255)[1]*100))
println("%")
print("By investing ")
for i in 1:length(newOpt100s)
    if newOpt100s[i] > 0.001
        print(round(newOpt100s[i], digits = 3))
        print(" in ")
        print(stocks2[i+diffValue])
        print(", ")
    end
end

For my diverse portfolio, the objective function results are: -0.0007848650552123236
Investing 0.028 in PEP, 0.167 in VZ, 0.167 in DHR, 0.167 in HON, 0.167 in BLK, 0.167 in LMT, 0.139 in SPGI, 
Yields returns of 84.0%

While for the naive investment, the objective function results are: -0.001879938694584764
Naive returns are 210.0%
By investing 1.0 in HON, 

# The objective function results are: -0.0007848650552123236
# Returns are 84.0%
# The average for S&P 500 for the same time frame is 55%

Results: I'm quite impressed with how the porfolio preformed. Sadly, Andres beat me with an objective minimum of -0.00096. Though that's pretty close! Obviously we have substantial look back bias, but this does make me want to invest in stocks! Also, it is interesting that the naive solution to invest all your money in the 

In fudging the numbers in testing years 2015-2017 on 2017-2019, I found that it you could make a similar amount investing most of your money in one company, but our goal was to be able to build a secure stock portfolio. Sometime I'd like to figure out how to generate market crash data and see how our porfolios preform. 