## Alpha 001 Formula:

$$
\text{Alpha001} = \left( \text{rank} \left( \text{argmax} \left( \text{SignedPower} \left( \left\{ \begin{array}{ll}
\text{stddev}(returns,20) & \text{if } returns<0 \\
\text{close} & \text{otherwise}
\end{array} \right\},2 \right),5 \right) \right) - 0.5 \right)
$$

Where:

- $ \text{stddev}(returns, 20) $ represents the 20-period standard deviation of returns.
- $ \text{close} $ represents the closing price.
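- $ \text{SignedPower}(x, 2) $ raises $ x $ to the power 2; it is commonly implemented as $ \text{sign}(x)\,|x|^2 $, which keeps the sign of $ x $.
- $ \text{argmax}(x, 5) $ returns the position (which day) of the maximum value of $ x $ within the last 5 periods.
- $ \text{rank}(x) $ is the cross-sectional rank: on each date it ranks the values of $ x $ across all stocks, with ties being averaged.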
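To make the difference between a plain square and a signed power concrete, here is a minimal sketch. The helper name `signed_power` and the definition `sign(x) * |x|**a` are assumptions for illustration (a commonly used reading of SignedPower), not something this notebook confirms.

```python
import numpy as np
import pandas as pd


def signed_power(x: pd.Series, a: float) -> pd.Series:
    """sign(x) * |x|**a, so negative inputs stay negative (assumed definition)."""
    return np.sign(x) * np.abs(x) ** a


x = pd.Series([-3.0, -0.5, 0.0, 2.0])
print(signed_power(x, 2).tolist())  # [-9.0, -0.25, 0.0, 4.0]
print((x ** 2).tolist())            # [9.0, 0.25, 0.0, 4.0]
```

The two only differ for negative inputs, which matters when judging whether the "signed" part has any effect in Alpha 001.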

## Demonstration

(Demonstrated on the Taiwan stock market)
Here I choose a relatively high-volatility stock (2492 華新科, Walsin Technology) as my example.

In [55]:
df2492 = pd.read_csv(r"C:\Users\micha\OneDrive\桌面\CODES\QUANT\Stockdata\stock2492_2020_5.csv")

## Close price or volatility measure

$$
\begin{cases}
\text{stddev}(returns,20) & \text{if } returns < 0 \\
\text{close} & \text{otherwise}
\end{cases}
$$

Personal Understanding:
- (1) $ \underline{\text{stddev}(returns, 20)} $
:  This is the standard deviation of returns over the past 20 periods. When the current day's return is negative, the formula uses this rolling standard deviation of returns as a measure of volatility. The idea is that higher volatility on a down day may indicate greater uncertainty or risk in the market.  
<br>
<br>
- (2) $ \underline{\text{close}} $: On any other day (the return is zero or positive), the formula uses the closing price instead. Here the strategy is primarily interested in price momentum for investment decisions.

Note: 
- (1) Why 20 and 5? Are these tunable parameters?
- (2) Isn't the return itself more directly related to the probability of a loss than volatility is?

In [56]:
df2492['returns'] = df2492['Close'].astype(float).pct_change()  # pct_change: (today's close - yesterday's close) / yesterday's close
df2492['stddev'] = df2492['returns'].astype(float).rolling(window=20).std()

df2492.iloc[25:30]

| | Unnamed: 0 | Date | Capacity | Turnover | Open | High | Low | Close | Change | Transaction | returns | stddev |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25 | 25 | 2020-06-08 | 10525882 | 2090375136 | 201.0 | 201.5 | 196.5 | 197.0 | -1.5 | 7453 | -0.007557 | 0.028542 |
| 26 | 26 | 2020-06-09 | 5732073 | 1134357454 | 196.5 | 200.5 | 196.0 | 197.5 | 0.5 | 4163 | 0.002538 | 0.027333 |
| 27 | 27 | 2020-06-10 | 7764593 | 1538726111 | 198.0 | 200.5 | 196.5 | 196.5 | -1.0 | 5168 | -0.005063 | 0.027258 |
| 28 | 28 | 2020-06-11 | 9788719 | 1877982110 | 196.0 | 197.5 | 188.0 | 189.5 | -7.0 | 7366 | -0.035623 | 0.025767 |
| 29 | 29 | 2020-06-12 | 7230564 | 1339772875 | 183.0 | 188.5 | 182.5 | 188.0 | -1.5 | 5320 | -0.007916 | 0.025835 |
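As a quick sanity check on the `returns` column, the sketch below recomputes it from the five closing prices shown in the output. It only illustrates that `pct_change` divides by the previous close; the variable names are my own.

```python
import pandas as pd

# Close prices for 2020-06-08 .. 2020-06-12, copied from the output above.
close = pd.Series([197.0, 197.5, 196.5, 189.5, 188.0], index=[25, 26, 27, 28, 29])

# pct_change divides the day-over-day change by the PREVIOUS close.
returns = close.pct_change()
manual = (close - close.shift(1)) / close.shift(1)

# The first entry is NaN here because the sample has no earlier close.
print(returns.round(6).tolist())
print(manual.round(6).tolist())
```

For example, row 26 gives (197.5 - 197.0) / 197.0 ≈ 0.002538, matching the table.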


## Power of 2

$$
\text{SignedPower} \left( \left\{ \begin{array}{ll}
\text{stddev}(returns,20) & \text{if } returns<0 \\
\text{close} & \text{otherwise}
\end{array} \right\}, 2 \right)
$$

Personal Understanding:
I don't fully understand what this step does. Squaring probably aims to amplify the difference between the value taken from returns and the one taken from the standard deviation (e.g. 2 and 3 become 4 and 9). However:
- (1) For 2020-06-09, the return (0.002538) and the standard deviation (0.027333) are only about one order of magnitude apart. I don't know whether the return was supposed to differ from the standard deviation by a larger margin; this seems illogical to me.
- (2) What is the point of using a 'signed' power? We only take the return when it is non-negative, and the standard deviation is always non-negative, so the sign never changes anything.
<br>
<br>
The signed power values are shown below.

In [57]:
import numpy as np
df2492['signed_power'] = np.where(df2492['returns'] < 0, np.power(df2492['stddev'], 2), np.power(df2492['returns'].astype(float), 2))
selected_columns = ['returns','stddev','signed_power']
df2492_selected = df2492[selected_columns]
df2492_selected.iloc[25:30]

| | returns | stddev | signed_power |
|---|---|---|---|
| 25 | -0.007557 | 0.028542 | 0.000815 |
| 26 | 0.002538 | 0.027333 | 6e-06 |
| 27 | -0.005063 | 0.027258 | 0.000743 |
| 28 | -0.035623 | 0.025767 | 0.000664 |
| 29 | -0.007916 | 0.025835 | 0.000667 |


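- For index 25, the return is negative, so the value in column 'signed_power' is the square of stddev: 0.028542² ≈ 0.000815.
- For index 26, the return is positive, so the value in column 'signed_power' is the square of returns: 0.002538² ≈ 6e-06.

Note: the code above squares `returns` in the non-negative branch, whereas the formula specifies `close`. The outputs in this notebook follow the code as written.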
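For comparison, here is a minimal sketch of the same step with `close` in the otherwise-branch, as the formula is written. It uses only rows 25 to 29 from the outputs above; the column name `signed_power_close` is hypothetical, and SignedPower is again assumed to mean `sign(x) * |x|**2`.

```python
import numpy as np
import pandas as pd

# Rows 25-29 copied from the outputs above (Close, returns, stddev).
rows = pd.DataFrame(
    {
        "Close": [197.0, 197.5, 196.5, 189.5, 188.0],
        "returns": [-0.007557, 0.002538, -0.005063, -0.035623, -0.007916],
        "stddev": [0.028542, 0.027333, 0.027258, 0.025767, 0.025835],
    },
    index=[25, 26, 27, 28, 29],
)

# Formula as written: stddev when returns < 0, otherwise the closing price.
base = np.where(rows["returns"] < 0, rows["stddev"], rows["Close"])

# SignedPower(x, 2), taken here as sign(x) * |x|**2 (an assumed definition).
rows["signed_power_close"] = np.sign(base) * np.abs(base) ** 2

print(rows[["returns", "signed_power_close"]])
```

With this variant, the one up day (row 26) yields 197.5² = 39006.25, which dwarfs the squared standard deviations on the down days, so the later argmax step would be dominated by days with non-negative returns.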

## Finding maximum value over the past 5 days

$$
\text{argmax} \left({\text{SignedPower}} \left( \left\{ \begin{array}{ll}
\text{stddev}(returns,20) & \text{if } returns<0 \\
\text{close} & \text{otherwise}
\end{array} \right\}, 2 \right), 5 \right)
$$

An example is best for explaining what this step does:

In [58]:
df2492['ts_argmax'] = df2492['signed_power'].rolling(window=5).apply(lambda x: np.argmax(x) + 1 if len(x) >= 5 else np.nan)

selected_columns = ['returns','stddev','signed_power', 'ts_argmax']
df2492_selected = df2492[selected_columns]
df2492_selected.iloc[25:35]


| | returns | stddev | signed_power | ts_argmax |
|---|---|---|---|---|
| 25 | -0.007557 | 0.028542 | 0.000815 | 3.0 |
| 26 | 0.002538 | 0.027333 | 6e-06 | 2.0 |
| 27 | -0.005063 | 0.027258 | 0.000743 | 1.0 |
| 28 | -0.035623 | 0.025767 | 0.000664 | 1.0 |
| 29 | -0.007916 | 0.025835 | 0.000667 | 1.0 |
| 30 | -0.018617 | 0.025142 | 0.000632 | 2.0 |
| 31 | 0.0271 | 0.024144 | 0.000734 | 1.0 |
| 32 | -0.010554 | 0.024197 | 0.000585 | 4.0 |
| 33 | 0.013333 | 0.024358 | 0.000178 | 3.0 |
| 34 | -0.015789 | 0.023512 | 0.000553 | 2.0 |


Let's examine the data from row indices 29 and 32:

- For row index 29: the ts_argmax function selects the maximum value from row indices 25 to 29, where row index 25 corresponds to day 1, row index 26 to day 2, and so on. The maximum value of signed_power between row indices 25 and 29 is at row index 25, which is indexed as 1. Therefore, the ts_argmax value in row index 29 is assigned 1.
- For row index 32: the maximum value of signed_power from row indices 28 to 32 is at row index 31, where the value is 0.000734. Thus, the assigned value for row index 32 is index 4.
<br>

This procedure indicates how far the current date is from the peak within the 5-day window: a value of 5 means the peak is today, and a value of 1 means it was four trading days ago. The peak could come from a high return or from high volatility. Under a belief in mean reversion, where stock prices tend to fluctuate around a certain value:

- If we are far from a period of high returns, we could anticipate that the stock price is likely to rise towards another peak.
- If we are far from a period of high volatility, it suggests that volatility is likely decreasing and the market is stabilizing.
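The "distance from the peak" reading can be sketched directly from the `signed_power` values for rows 25 to 34 in the output above. The `days_since_peak` column is a hypothetical helper added for illustration, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# signed_power for rows 25-34, copied from the output above.
signed_power = pd.Series(
    [0.000815, 6e-06, 0.000743, 0.000664, 0.000667,
     0.000632, 0.000734, 0.000585, 0.000178, 0.000553],
    index=range(25, 35),
)

window = 5
# 1-based position of the maximum inside each window (1 = oldest day, 5 = today).
ts_argmax = signed_power.rolling(window).apply(lambda w: np.argmax(w) + 1, raw=True)

# Hypothetical helper: trading days elapsed since that maximum.
days_since_peak = window - ts_argmax

# Rows 25-28 are NaN here only because this sample starts at row 25.
print(pd.DataFrame({"ts_argmax": ts_argmax, "days_since_peak": days_since_peak}))
```

Rows 29 to 34 reproduce the `ts_argmax` values 1, 2, 1, 4, 3, 2 from the table; for row 32 the peak (row 31) is one trading day back.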

Note: the parameter 5 (looking at the past five days) is probably worth modifying. In my opinion, it could be a dynamic parameter derived from sampling the fluctuation period of the stock.

## Rank

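This part simply normalizes the values and centers them around 0. In the formula, rank is taken across all stocks on each date; with only one stock in this demonstration, the code below instead divides ts_argmax by the window length 5 as a stand-in.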

In [59]:
df2492['rank'] = df2492['ts_argmax']/5-0.5

selected_columns = ['returns','stddev','signed_power', 'ts_argmax','rank']
df2492_selected = df2492[selected_columns]
df2492_selected.iloc[25:30]


| | returns | stddev | signed_power | ts_argmax | rank |
|---|---|---|---|---|---|
| 25 | -0.007557 | 0.028542 | 0.000815 | 3.0 | 0.1 |
| 26 | 0.002538 | 0.027333 | 6e-06 | 2.0 | -0.1 |
| 27 | -0.005063 | 0.027258 | 0.000743 | 1.0 | -0.3 |
| 28 | -0.035623 | 0.025767 | 0.000664 | 1.0 | -0.3 |
| 29 | -0.007916 | 0.025835 | 0.000667 | 1.0 | -0.3 |
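With more than one stock, the rank step could be done across stocks on each date rather than by dividing by 5. The sketch below assumes a wide table of `ts_argmax` values; the tickers `AAA` to `DDD`, the row labels, and all numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical ts_argmax values: rows are dates, columns are made-up tickers.
ts_argmax = pd.DataFrame(
    {"AAA": [1.0, 2.0], "BBB": [3.0, 2.0], "CCC": [5.0, 4.0], "DDD": [3.0, 1.0]},
    index=["day 1", "day 2"],
)

# Rank across stocks (axis=1) on each date; pct=True scales ranks into (0, 1],
# and tied values receive the average of their ranks.
cross_rank = ts_argmax.rank(axis=1, pct=True)
alpha001 = cross_rank - 0.5

print(alpha001)
```

On day 1, `BBB` and `DDD` tie at 3.0 and share the average rank 2.5 out of 4, giving 0.625 - 0.5 = 0.125 for both.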


## Summary 

Strategy: For each stock, find the position of the maximum value within the past 5 days of records (the squared 20-day standard deviation of returns on down days, or the squared closing price otherwise) and use it as the stock's weight. Then rank these weights across all stocks, and take the percentile of each stock's ranking position (a value between 0 and 1) minus 0.5 as the value of factor alpha001. Decision rule: if alpha001 > 0, buy more of the stock; if alpha001 < 0, sell the existing position in the stock.
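Putting the steps together, the whole factor might be sketched as one function over a price panel. This is an illustration under stated assumptions, not the notebook's implementation: the function name `alpha001`, the tickers, and the synthetic random-walk prices are hypothetical, and SignedPower is taken as `sign(x) * |x|**2`.

```python
import numpy as np
import pandas as pd


def alpha001(close: pd.DataFrame) -> pd.DataFrame:
    """Sketch of Alpha 001 for a wide price panel (rows = dates, columns = tickers)."""
    returns = close.pct_change()
    stddev = returns.rolling(20).std()
    # stddev on down days, closing price otherwise (as in the formula).
    base = stddev.where(returns < 0, close)
    # Assumed SignedPower definition: sign(x) * |x|**2.
    powered = np.sign(base) * base.abs() ** 2
    # 1-based position of the 5-day maximum (1 = oldest day, 5 = today).
    ts_argmax = powered.rolling(5).apply(lambda w: np.argmax(w) + 1, raw=True)
    # Cross-sectional percentile rank on each date, shifted to centre near 0.
    return ts_argmax.rank(axis=1, pct=True) - 0.5


# Synthetic random-walk prices for four hypothetical tickers.
rng = np.random.default_rng(0)
steps = rng.normal(0.0, 0.02, size=(60, 4))
close = pd.DataFrame(
    100 * np.exp(np.cumsum(steps, axis=0)),
    columns=["AAA", "BBB", "CCC", "DDD"],
)

factor = alpha001(close)
print(factor.tail())
```

The early rows are NaN until both the 20-day and the 5-day windows are filled; after that, with four tickers, every value lies between -0.25 and 0.5.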