In this notebook, we provide a minimal working example for running our algorithm.


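We simulate relative stock price data for $n=20$ days and $d=100$ stocks.
The generated data form the matrix $X=[X_1,\ldots,X_{n}]^{\top}\in\mathbb{R}^{n\times d}$, whose $i$-th row $X_i$ holds the relative prices of the $d$ stocks on day $i$.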

In [8]:
import numpy as np
n = 20
d = 100
np.random.seed(0)
X = np.random.rand(n,d) + 0.5

In [9]:
X.shape

(20, 100)

Our algorithm is designed to solve the cardinality-constrained problem over the simplex $\mathbb{S}^{d-1}=\{v\in\mathbb{R}^d:\|v\|_1=1,v\succeq 0\}$,
$$\min\limits_{w\in\mathbb{S}^{d-1},\|w\|_0\leq s} -\frac{1}{n}\sum\limits_{i=1}^n u(w^{\top}X_i),$$
by solving the easier $\ell_1$-regularized problem
$$\min\limits_{w\in\mathbb{R}^d_+} -\frac{1}{n}\sum\limits_{i=1}^n u(w^{\top}X_i) + \lambda\|w\|_1,$$
where $u$ is a utility function.
Our implementation supports the logarithmic utility (`func=0`) and the exponential utility (`func=1`, with `a=1` by default).
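
To make the regularized objective concrete, here is a minimal sketch that evaluates it for a given weight vector. The helper name `l1_objective` is hypothetical and not part of the `spo` package, and the exponential utility is assumed to take the form $u(x)=-e^{-ax}$, which the notebook does not spell out.

```python
import numpy as np

def l1_objective(w, X, lam, func=0, a=1.0):
    """Evaluate -mean(u(X @ w)) + lam * ||w||_1 for a nonnegative w.

    Hypothetical helper for illustration; not the spo implementation.
    """
    r = X @ w  # portfolio relative return on each day, shape (n,)
    if func == 0:
        u = np.log(r)          # logarithmic utility
    else:
        u = -np.exp(-a * r)    # ASSUMED form of the exponential utility
    return -np.mean(u) + lam * np.sum(np.abs(w))

# Same kind of simulated data as in the notebook: entries in [0.5, 1.5).
rng = np.random.default_rng(0)
X_demo = rng.random((20, 100)) + 0.5
w_equal = np.full(100, 1.0 / 100)  # equal-weight portfolio on the simplex
print(l1_objective(w_equal, X_demo, lam=0.1))
```

For a portfolio on the simplex, the penalty term equals $\lambda$ exactly, so the regularizer only changes the solution once the unit-sum constraint is dropped, as in the relaxed problem above.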





As an example, we generate $n_{\lambda}=50$ values of $\lambda$, logarithmically spaced from $\lambda_{\max}$ down to $10^{-2}\lambda_{\max}$.
This grid is built internally when calling our main function `spo_l1_path`. Other parameters can be specified as needed.
Please refer to the docstring with the command `?spo_l1_path` after importing the function.
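
For intuition, a grid with this spacing can be sketched with NumPy as follows. The value of `lambda_max` here is a placeholder, since the actual value is computed inside `spo_l1_path` from the data, and the variable names are chosen only for illustration.

```python
import numpy as np

lambda_max = 1.0   # placeholder; the real value is determined inside spo_l1_path
n_lambdas = 50

# Geometric (log-uniform) grid from lambda_max down to 1e-2 * lambda_max.
lambda_grid = np.geomspace(lambda_max, 1e-2 * lambda_max, num=n_lambdas)

# Consecutive values share a constant ratio, i.e. equal steps in log scale.
ratios = lambda_grid[1:] / lambda_grid[:-1]
```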

In [10]:
from spo import spo_l1_path
ws, lambdas, gaps, n_iters, n_active_features = spo_l1_path(X, func=0,
    n_lambdas=50, screen=True, max_iter=int(1e4), f=200, tol=1e-3)

100%|███████████████████████████████████████████| 50/50 [00:00<00:00, 87.49it/s]


The resulting portfolio weights `ws` have shape $(d,n_{\lambda})$, with one column per value of $\lambda$.

In [15]:
ws.shape

(100, 50)

In [14]:
np.sum(ws>0., axis=0)

array([0, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       3, 3, 3, 3, 4, 4])
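
The support sizes above suggest one way to relate the path back to the cardinality constraint $\|w\|_0\leq s$: pick a path point with at most $s$ active stocks and rescale it to sum to one. The sketch below does this on a toy weight matrix. The helper `pick_by_cardinality` is hypothetical, and the rescaling step is an assumption rather than a documented part of the method.

```python
import numpy as np

def pick_by_cardinality(ws, s):
    """Return (column index, simplex-normalized weights) for the last
    path point whose support size is between 1 and s.

    Hypothetical helper for illustration; not part of the spo package.
    """
    support = np.sum(ws > 0.0, axis=0)
    admissible = np.flatnonzero((support >= 1) & (support <= s))
    if admissible.size == 0:
        raise ValueError("no path point has between 1 and s active stocks")
    j = int(admissible[-1])   # path runs from large to small lambda
    w = ws[:, j]
    return j, w / w.sum()     # ASSUMED rescaling onto the simplex

# Toy path with 3 stocks and 4 values of lambda (support sizes 0, 1, 2, 3).
ws_toy = np.array([[0.0, 0.5, 0.6, 0.6],
                   [0.0, 0.0, 0.3, 0.3],
                   [0.0, 0.0, 0.0, 0.2]])
j_sel, w_sel = pick_by_cardinality(ws_toy, s=2)
```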

One can further use cross-validation to select the best $\lambda$ and then evaluate it on new data.
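
As a rough illustration of this selection step, the sketch below scores each column of a weight matrix by its average logarithmic utility on held-out days and returns the best one. It uses a single hold-out split, which is a simplification of full cross-validation. The helper `best_column_by_holdout`, the rescaling of each column to unit sum, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def best_column_by_holdout(ws, X_val):
    """Index of the column of ws with the highest mean log utility on X_val.

    Hypothetical helper for illustration; not part of the spo package.
    """
    scores = np.full(ws.shape[1], -np.inf)
    for j in range(ws.shape[1]):
        total = ws[:, j].sum()
        if total > 0:                       # skip the all-zero portfolio
            w = ws[:, j] / total            # ASSUMED rescaling onto the simplex
            scores[j] = np.mean(np.log(X_val @ w))
    return int(np.argmax(scores)), scores

# Toy held-out data: 2 days, 2 stocks; toy path with 3 values of lambda.
X_val_toy = np.array([[1.2, 0.8],
                      [1.1, 0.9]])
ws_val_toy = np.array([[0.0, 1.0, 0.5],
                       [0.0, 0.0, 0.5]])
j_best, val_scores = best_column_by_holdout(ws_val_toy, X_val_toy)
```

In practice, `ws` would come from running `spo_l1_path` on the training days only, with `X_val` holding the remaining days.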