# Get the data

Your data may look slightly different from case to case, but we start our portfolio optimization by acquiring information on a set of stocks to better understand their behaviour through, in this case, one year of market activity.

In [14]:
import json
import numpy as np
import pandas as pd
# Get the variable part of the filename from the terminal
n = input("Enter the number of symbols to crawl: ")

# Construct the filename
filename = f"{n}assetraw.csv"
data = pd.read_csv(filename)
data

# Unique asset list
asset_list = data["Asset"].unique()
# Expected return per asset
exp_ret = {}
return_list = []
for asset in asset_list:
    # Select the rows of this asset once and reuse them
    asset_rows = data[data["Asset"] == asset]
    open_price = asset_rows["Open"].astype("float").to_numpy()
    close_price = asset_rows["Close"].astype("float").to_numpy()

    # Relative price change per period; the sign gives the direction of the move
    returns = (close_price - open_price) / open_price
    exp_ret[asset] = returns.mean()
    return_list.append(returns)

# Stack the per-asset returns (one row per asset) and collect the expected returns
return_list = np.array(return_list)
mu = list(exp_ret.values())

# Covariance matrix of the returns (each row is treated as one asset)
sigma = np.cov(return_list)
# Keep the most recent row of each asset; its opening price is used as the asset cost
latest = data.groupby("Asset").agg({"Open time": "max"}).reset_index()
costs = data.merge(latest, how="inner").drop_duplicates()
cost_list = costs[["Asset", "Open"]].to_dict("records")
# Serialize the results to JSON (a separate name keeps the DataFrame in `data` intact)
payload = {"mu": mu, "sigma": sigma.tolist(), "assets": cost_list}
json_object = json.dumps(payload, indent=4)
jsonfilename = f"{n}asset.json"
with open(jsonfilename, "w") as file:
    file.write(json_object)



Next, we calculate the expected return of each asset. This is the average of the difference between the *closing* and *opening* prices, scaled by the opening price. Scaling by the opening price means each asset is evaluated independently of its price level.
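As a small worked illustration of this calculation, the sketch below computes the per-period returns and their mean for a single asset. The prices are made-up values chosen only for the example, and `expected_return` is a name introduced here.

```python
import numpy as np

# Hypothetical open/close prices for one asset over three periods (illustrative values only)
open_price = np.array([100.0, 102.0, 101.0])
close_price = np.array([102.0, 101.0, 103.0])

# Per-period return: price change relative to the opening price
returns = (close_price - open_price) / open_price

# The mean of the per-period returns is the expected return of the asset
expected_return = returns.mean()
```

A positive entry in `returns` means the asset closed above its opening price in that period, and a negative entry means it closed below.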

We also compute the covariance between every pair of assets so that we can use these values in our portfolio diversification constraint.
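A minimal sketch of the covariance step, using made-up returns for two assets. It assumes every asset has the same number of observations, since `np.cov` needs a rectangular array; the asset labels and values are hypothetical.

```python
import numpy as np

# Hypothetical returns: one row per asset, one column per period
return_list = np.array([
    [0.01, -0.02, 0.03],   # asset A
    [0.02, -0.01, 0.01],   # asset B
])

# np.cov treats each row as a variable, so the result is (n_assets x n_assets)
sigma = np.cov(return_list)

# Diagonal entries are each asset's variance; off-diagonal entries are covariances
variance_a = sigma[0, 0]
cov_ab = sigma[0, 1]
```

The matrix is symmetric, so `sigma[0, 1]` and `sigma[1, 0]` hold the same value.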

Here $\mu$ is the value associated with the expected average return for each asset.

And $\sigma$ is the covariance matrix of those very same assets.
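Written out, these quantities can be expressed as follows. The symbols $O_{i,t}$ and $C_{i,t}$ (opening and closing price of asset $i$ in period $t$), $r_{i,t}$ (its return) and $T$ (number of periods) are notation introduced here for illustration. The $T-1$ denominator matches the default behaviour of `np.cov`.

$$
r_{i,t} = \frac{C_{i,t} - O_{i,t}}{O_{i,t}}, \qquad
\mu_i = \frac{1}{T} \sum_{t=1}^{T} r_{i,t}, \qquad
\sigma_{ij} = \frac{1}{T-1} \sum_{t=1}^{T} \left(r_{i,t} - \mu_i\right)\left(r_{j,t} - \mu_j\right)
$$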

It is important to know the cost of each asset so that we can also limit the budget we spend on our investment. Here the cost of an asset is its opening price in the most recent record.
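To illustrate how the costs could feed a budget limit, the sketch below checks whether a candidate selection fits within a budget. The asset names, prices, `selection` and `budget` are all hypothetical; only the record layout (`"Asset"`, `"Open"`) follows the code above.

```python
# Hypothetical latest prices, in the same shape as the "assets" records stored above
cost_list = [
    {"Asset": "AAA", "Open": 120.0},
    {"Asset": "BBB", "Open": 45.5},
    {"Asset": "CCC", "Open": 300.0},
]

# Hypothetical selection: 1 means the asset is bought, 0 means it is not
selection = [1, 1, 0]
budget = 200.0  # illustrative budget, in the same currency as the prices

# Total cost of the selected assets
total_cost = sum(c["Open"] * x for c, x in zip(cost_list, selection))

# The selection is feasible only if it fits within the budget
within_budget = total_cost <= budget
```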

We store this information in a JSON file so that it can be used later.
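A later step might read the file back along these lines. This is a sketch under assumptions: the payload values are made up, and a temporary file stands in for the real `{n}asset.json` so the example runs on its own.

```python
import json
import os
import tempfile

import numpy as np

# Hypothetical payload with the same keys as the file written above
payload = {
    "mu": [0.01, 0.02],
    "sigma": [[0.0004, 0.0001], [0.0001, 0.0009]],
    "assets": [{"Asset": "AAA", "Open": 120.0}, {"Asset": "BBB", "Open": 45.5}],
}

# Write to a temporary file so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "2asset.json")
with open(path, "w") as file:
    json.dump(payload, file, indent=4)

# Read the file back and restore NumPy arrays for the optimization step
with open(path) as file:
    loaded = json.load(file)

mu = np.array(loaded["mu"])
sigma = np.array(loaded["sigma"])
costs = np.array([a["Open"] for a in loaded["assets"]])
```

Converting back to arrays is needed because JSON only stores plain lists, which is also why `sigma.tolist()` is used when writing.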