# An Asset Selling Model
In this notebook you will optimize some simple parametric policies for the asset selling problem.

We start by creating an instance of the model and an instance of a policy, beginning with the high-low policy. First, we instantiate a model and specify the initial value of the state (price) and the length of the time horizon `T` in the constructor. It would also be possible to exert more control over the exogenous information process by passing parameters that are used inside the process.

In [1]:
import sys
sys.path.append("../")
import numpy as np
import pandas as pd
import plotly.express as px
import AssetSellingModel as asm
import AssetSellingPolicies as asp
from BaseClasses import Util as util

model = asm.AssetSellingModel(S0={"price": 20}, T=30)

Next, we create a policy for this model. The high-low policy has two tunable parameters, `theta_low` and `theta_high`.
Then we run the policy for 100 iterations (episodes). The `run_policy` method returns the average objective function value over all episodes.

In [2]:
high_low_policy = asp.HighLowPolicy(model=model, theta_low=10, theta_high=30)
high_low_policy.run_policy(n_iterations=100)

20.404837312953806
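
To make the mechanics of such a run concrete, here is a minimal, self-contained sketch of a high-low rule evaluated on simulated prices. It is not the implementation in `AssetSellingModel` or `AssetSellingPolicies`: the Gaussian price change, the parameter `sigma`, and the function names `simulate_high_low` and `run_policy_sketch` are assumptions made purely for illustration.

```python
import random


def simulate_high_low(theta_low, theta_high, s0=20.0, T=30, sigma=2.0, rng=None):
    """Simulate one episode and return the price at which the asset is sold."""
    rng = rng or random.Random()
    price = s0
    for _ in range(T):
        # Sell as soon as the price leaves the band [theta_low, theta_high].
        if price < theta_low or price > theta_high:
            return price
        # Hypothetical exogenous information: a Gaussian price change,
        # clipped so that the price cannot become negative.
        price = max(0.0, price + rng.gauss(0.0, sigma))
    # Forced sale at the end of the time horizon.
    return price


def run_policy_sketch(theta_low, theta_high, n_iterations, seed=None):
    """Average selling price over n_iterations simulated episodes."""
    rng = random.Random(seed)
    sales = [
        simulate_high_low(theta_low, theta_high, rng=rng)
        for _ in range(n_iterations)
    ]
    return sum(sales) / n_iterations


print(run_policy_sketch(10, 30, n_iterations=100, seed=0))
print(run_policy_sketch(10, 30, n_iterations=100, seed=1))
```

Passing a `seed` makes a run reproducible; leaving it out draws fresh random prices on every call.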

## Exercise 1
Execute the cell several times. How do you explain that the result is different every time? Do you notice any difference in this behavior if you change the number of iterations to 10, 1000, or 10000?

---

We now investigate the results in more detail. The results of a policy run are stored in a DataFrame called `results`. Every row corresponds to one time step of one iteration (episode).

In [3]:
high_low_policy.results

Unnamed: 0,N,t,C_t sum,price,bias,price_smoothed,resource,sell,t_end,C_t
0,0,0,0.000000,20.000000,Neutral,20.000000,1,0.0,22,0.000000
1,0,1,0.000000,17.776240,Neutral,18.443368,1,0.0,22,0.000000
2,0,2,0.000000,19.414044,Up,19.122841,1,0.0,22,0.000000
3,0,3,0.000000,20.972127,Up,20.417341,1,0.0,22,0.000000
4,0,4,0.000000,23.993157,Up,22.920412,1,0.0,22,0.000000
...,...,...,...,...,...,...,...,...,...,...
1732,99,8,0.000000,15.706782,Neutral,16.503775,1,0.0,12,0.000000
1733,99,9,0.000000,12.204026,Down,13.493951,1,0.0,12,0.000000
1734,99,10,0.000000,10.267158,Down,11.235196,1,0.0,12,0.000000
1735,99,11,0.000000,6.957769,Down,8.240997,1,1.0,12,6.957769
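
A table like this can be summarized per episode with a `groupby`. The sketch below uses a small hand-made DataFrame with the same column names (`N`, `t`, `price`, `C_t`) so that it runs on its own. It assumes, as the rows above suggest, that the contribution `C_t` is nonzero only on the step where the asset is sold; whether the mean of these per-episode sums equals the value returned by `run_policy` is likewise an assumption.

```python
import pandas as pd

# Toy stand-in for `high_low_policy.results`: two episodes (N = 0 and N = 1).
results = pd.DataFrame(
    {
        "N": [0, 0, 0, 1, 1],
        "t": [0, 1, 2, 0, 1],
        "price": [20.0, 18.0, 9.5, 20.0, 31.0],
        "C_t": [0.0, 0.0, 9.5, 0.0, 31.0],
    }
)

# Total contribution of each episode (here: the selling price).
per_episode = results.groupby("N")["C_t"].sum()

# Last recorded time step of each episode, i.e. when the asset was sold.
selling_time = results.groupby("N")["t"].max()

print(per_episode)
print(selling_time)
print("Average over episodes:", per_episode.mean())
```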


Next we plot a few of the 100 paths using plotly. We notice that if the price never drops below `theta_low` the asset is sold at the end of the time horizon.

In [6]:
sample_paths = np.random.choice(100, size=5, replace=False)
df = high_low_policy.results.loc[high_low_policy.results.N.isin(sample_paths), :]
px.line(data_frame=df, x="t", y="price", facet_row="N", height=800)

The average amount of money that we make by selling the asset depends, of course, on the values of `theta_low` and `theta_high`. If, for example, we set `theta_low` to a higher value, the average profit appears to be higher (note that it will be slightly different every time we execute the cell).

In [5]:
high_low_policy.theta_low = 19
high_low_policy.run_policy(n_iterations=100)

21.07238599239833

Next, we will try to find the best values for `theta_low` and `theta_high`. This is called *parameter tuning*. To do this, we systematically try out different combinations of values for both parameters. This strategy is called a *grid search*, and there is a simple convenience method to automate it.

In [6]:
# Define a grid for combinations of theta_low and theta_high.
# theta_low should not be larger than the starting price (20) and theta_high should not be smaller than the starting price.
grid = {"theta_low": np.linspace(10, 20, 11), "theta_high": np.linspace(20, 30, 11)}
result = util.grid_search(grid, high_low_policy, n_iterations=10, ordered=True)

print(f"Best parameters: {result['best_parameters']} with an objective of {result['best_performance']}.")

Best parameters: {'theta_low': 10.0, 'theta_high': 30.0} with an objective of 22.35168604182948.
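
The idea behind a grid search can be sketched in a few lines. The function below is not `util.grid_search`; it is a simplified illustration that returns a dictionary with the same keys. The deterministic `toy_objective` is hypothetical and stands in for a policy run, and the reading of `ordered=True` as "skip combinations whose values are not in non-decreasing order" is an assumption.

```python
import itertools

import numpy as np


def grid_search_sketch(grid, objective, ordered=False):
    """Evaluate `objective` on every combination of values in `grid`."""
    names = list(grid)
    best_params, best_value, runs = None, -np.inf, []
    for values in itertools.product(*(grid[name] for name in names)):
        # Assumed meaning of `ordered`: keep only non-decreasing combinations,
        # e.g. theta_low <= theta_high.
        if ordered and any(a > b for a, b in zip(values, values[1:])):
            continue
        params = dict(zip(names, values))
        value = objective(**params)
        runs.append({**params, "performance": value})
        if value > best_value:
            best_params, best_value = params, value
    return {
        "best_parameters": best_params,
        "best_performance": best_value,
        "all_runs": runs,
    }


def toy_objective(theta_low, theta_high):
    # Hypothetical objective with its maximum at theta_low=14, theta_high=26.
    return -((theta_low - 14) ** 2) - (theta_high - 26) ** 2


grid = {"theta_low": np.linspace(10, 20, 11), "theta_high": np.linspace(20, 30, 11)}
result_sketch = grid_search_sketch(grid, toy_objective, ordered=True)
print(result_sketch["best_parameters"], result_sketch["best_performance"])
```

With a noisy objective such as a simulated policy run, the reported best combination depends on the random draws, so a small `n_iterations` can make the winner of the search change from run to run.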


The result object gives us the best parameters and the corresponding performance, and it also contains information about all runs. We transform the runs into matrix form and visualize them with a heatmap.

In [10]:
res_grid = result["all_runs"].pivot(index="theta_low", columns="theta_high", values="performance")
px.imshow(res_grid)

Apparently, with the given uncertainty model and the high-low policy, the best profit is only slightly above the start price.

## Exercise 2
1. Create an instance of the tracking policy that is implemented in the class `TrackPolicy` in the module `AssetSellingPolicies` and run the policy for 100 iterations. Describe in your own words how this policy makes a decision. 
2. The policy has one tunable parameter `theta`. Run a grid search to find the best value for `theta`. Is the tracking policy better than the high-low policy?

---

In the first version of our sequential decision model, we used a stochastic model to generate observations. We now introduce a new version, where we draw sample observations $W_{t+1}$ from historical data. Consider the following version of our problem:

*You own a share of a company at the beginning of the month. Every day, you need to decide whether to sell it (at that day's closing price) or not. If the stock is still in your possession at the end of the month, it is sold at the closing price of the last day of the month.*

To generate different observations for one month, we will use 10 years of historical data, shifting the prices so that each month starts at zero. This gives us 120 monthly sample paths in total that we will use to tune our policy.
 
As an example, we download data of the SAP stock using the package `yfinance` and reshape it to match our needs.

In [7]:
import yfinance as yf

def get_historical_monthly_paths(stock_name, start="2014-01-01", end="2023-12-31"):
    stock = yf.Ticker(stock_name)

    # Get historical market data (this makes an API call to Yahoo Finance)
    hist = stock.history(start=start, end=end, interval="1d")

    # We just keep the "Close" column
    hist = hist.drop(["Open", "High", "Low", "Volume", "Dividends", "Stock Splits"], axis=1)

    # Enumerate the months from the start and store as a separate column
    hist["N"] = hist.index.tz_convert(None).to_period('M')
    hist["N"] = hist["N"].apply(lambda x: x.ordinal) - hist["N"].iloc[0].ordinal

    # Get the first Close price of every month and subtract it from each Close value in that month
    hist_month_start = hist.groupby("N").head(1).rename({"Close": "Close_Month_Start"}, axis=1)
    hist_month_start = pd.merge(hist, hist_month_start, on="N")
    hist_month_start["price"] = (hist_month_start["Close"] - hist_month_start["Close_Month_Start"])
    
    return hist_month_start.drop(["Close", "Close_Month_Start"], axis=1)

hist_prices = get_historical_monthly_paths(stock_name="SAP")

Let's have a look at our historical sample paths $W_{t+1}$:

In [8]:
hist_prices

Unnamed: 0,N,price
0,0,0.000000
1,0,-0.484001
2,0,-0.408905
3,0,-0.609184
4,0,0.450645
...,...,...
2511,119,-7.298416
2512,119,-6.824371
2513,119,-4.138077
2514,119,-5.688614
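
The reshaping step can also be tried without a network connection. The sketch below applies the same idea to a few made-up closing prices (not real market data) and uses `groupby(...).transform("first")` instead of the merge in `get_historical_monthly_paths`; this is an alternative formulation shown for illustration, not the code used above.

```python
import pandas as pd

# Synthetic closing prices on five trading days spread over two months.
dates = pd.to_datetime(
    ["2023-01-02", "2023-01-03", "2023-01-04", "2023-02-01", "2023-02-02"]
)
hist = pd.DataFrame({"Close": [100.0, 101.5, 99.0, 110.0, 108.0]}, index=dates)

# Enumerate the months, starting from zero.
months = hist.index.to_period("M")
hist["N"] = [m.ordinal - months[0].ordinal for m in months]

# Subtract each month's first closing price from all prices of that month.
hist["price"] = hist["Close"] - hist.groupby("N")["Close"].transform("first")

paths = hist[["N", "price"]].reset_index(drop=True)
print(paths)
```

Every month then starts at a price of zero, so a policy parameter such as `theta_low` is interpreted as a deviation from the price at the beginning of the month.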


We now create a new model in which the exogenous information process is modified so that, at each iteration, one month of the historical data is selected. We also create an instance of the high-low policy and run it for 120 iterations.

In [9]:
model_hist = asm.AssetSellingModelHistorical(hist_data=hist_prices)
policy_hist = asp.HighLowPolicy(model=model_hist, theta_low=-10, theta_high=10)
policy_hist.run_policy(n_iterations=120)

0.5281286875406901

Next, we need to optimize `theta_high` and `theta_low`. To get a feeling for which values to try, we first look at the distribution of the price deviations from the start of the month.

In [10]:
px.histogram(hist_prices, x="price")
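
As a numerical complement to a histogram, empirical quantiles can suggest bounds for a parameter grid. The sketch below uses normally distributed synthetic numbers as a stand-in for `hist_prices["price"]`; the 1% and 99% quantile levels and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the monthly price deviations (not real data).
prices = rng.normal(loc=0.0, scale=10.0, size=2500)

# Bounds that cover the central 98% of the observed deviations.
q_low, q_high = np.quantile(prices, [0.01, 0.99])

# Candidate values: theta_low at or below zero, theta_high at or above zero.
theta_low_grid = np.linspace(q_low, 0.0, 16)
theta_high_grid = np.linspace(0.0, q_high, 16)

print(f"1% quantile: {q_low:.2f}, 99% quantile: {q_high:.2f}")
```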

We see that during one month, the difference of the stock price to the price at the beginning of the month is usually between -30 and +30 (with some outliers). We do a grid search on a 16x16 grid to find the best combination of `theta_low` and `theta_high`.

In [17]:
grid = {"theta_low": np.linspace(-30, 0, 16), "theta_high": np.linspace(0, 30, 16)}
result = util.grid_search(grid, policy_hist, n_iterations=120, ordered=True)
res_grid = result["all_runs"].pivot(index="theta_low", columns="theta_high", values="performance")
print(f"Best parameters: {result['best_parameters']} with an objective of {result['best_performance']}.")
# Keep the figure as the last expression so that the notebook renders it.
px.imshow(res_grid)


## Exercise 3
Go to https://finance.yahoo.com/ to look up names and historical charts of stocks. Repeat the steps above with a stock of your choice (Tesla? Wirecard? ...?). You should be able to use the function `get_historical_monthly_paths` from above to get the data in the appropriate format. Try to find a policy, i.e. "sell-low", "high-low", or "track", with corresponding parameters that maximizes the expected profit.

---