The basic idea is to represent groups of positions using a sparse array. There can be a large number of these groups (think of the individual portfolios held by many investors), but each group's holdings across all active symbols will be sparse. For example, if there are 10,000 symbols and I hold 5 positions, the row representing my portfolio will contain 9,995 entries with zero shares held.

If we can represent a set of portfolios this way, can we efficiently calculate the market value as new price quotes arrive?

This notebook is for messing around with this idea...

For this sample we'll use the SciPy compressed sparse row (CSR) matrix.

In [1]:
from scipy.sparse import csr_matrix

For symbols, we'll grab the 10 most active from Yahoo Finance, which on the day I grabbed them were SNAP, F, FB, AMD, AAPL, BAC, T, SOFI, PINS, and NIO.

In [2]:
symbols = ["SNAP", "F", "FB", "AMD", "AAPL", "BAC", "T", "SOFI", "PINS", "NIO"]

In a non-sparse representation, a view of 3 sets of positions might look like this:

In [3]:
from numpy import array

In [5]:
positions = array([[1,0,0,0,0,0,0,0,1,0],[0,1,0,1,0,0,0,0,0,0],[0,0,0,0,0,1,1,0,0,0]])
print(positions)

[[1 0 0 0 0 0 0 0 1 0]
 [0 1 0 1 0 0 0 0 0 0]
 [0 0 0 0 0 1 1 0 0 0]]


Even in the trivial case above we can see the inefficiency: 24 of the 30 entries are zeros. Compare with the sparse representation.

In [6]:
sp = csr_matrix(positions)
print(sp)

  (0, 0)	1
  (0, 8)	1
  (1, 1)	1
  (1, 3)	1
  (2, 5)	1
  (2, 6)	1
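
To see what the CSR format actually stores, here is a minimal sketch that rebuilds the same matrix and prints its three underlying arrays. It only uses the standard `data`, `indices`, `indptr`, and `nnz` attributes of a SciPy CSR matrix; the comments describe the values for this particular example.

```python
import numpy as np
from scipy.sparse import csr_matrix

positions = np.array([
    [1, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
])
sp = csr_matrix(positions)

# Non-zero share counts, stored row by row
print(sp.data)
# Column (symbol) index of each stored value: 0, 8, 1, 3, 5, 6
print(sp.indices)
# Row i's entries live in data[indptr[i]:indptr[i + 1]]: 0, 2, 4, 6
print(sp.indptr)
# Only 6 of the 30 cells are stored explicitly
print(sp.nnz, "stored values out of", positions.size)
```

Storage grows with the number of positions held rather than with the number of symbols times the number of portfolios, which is the point of the approach.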


Now calculate market value given a set of quotes, first for the case where there is a price quote for every symbol.

In [12]:
latest_prices = [38.91, 17.96, 237.09, 123.60, 172.39, 48.28, 24.08, 11.89, 27.25, 23.96]
print(sp.multiply(latest_prices))

  (0, 0)	38.91
  (0, 8)	27.25
  (1, 1)	17.96
  (1, 3)	123.6
  (2, 5)	48.28
  (2, 6)	24.08
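
The element-wise product above gives the value of each individual position. If what we want is one total per portfolio, a matrix-vector product does the multiply and the row sum in one step. This is a sketch of that idea, not something from the original cells; `portfolio_values` is a name introduced here for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

positions = np.array([
    [1, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
])
sp = csr_matrix(positions)
latest_prices = np.array(
    [38.91, 17.96, 237.09, 123.60, 172.39, 48.28, 24.08, 11.89, 27.25, 23.96]
)

# (3 x 10 sparse) @ (10,) dense -> one total per portfolio, shape (3,)
portfolio_values = sp @ latest_prices
# Roughly 66.16, 141.56 and 72.36, since every position here is one share
print(portfolio_values)
```

Only the stored (non-zero) positions take part in the product, so the cost scales with the number of positions held.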


When only some symbols have price updates, we can represent those quotes as a sparse array too, and use it to calculate market values.

In [18]:
# Here we have prices only for Ford (F) and Apple (AAPL), at columns 1 and 4, respectively
row = array([0,0])
col = array([1,4])
data = array([17.96,172.39])
sparse_prices = csr_matrix((data, (row, col)), shape=(1, 10))
print(sparse_prices)

  (0, 1)	17.96
  (0, 4)	172.39


In [17]:
print(sparse_prices.toarray())

[[  0.    17.96   0.     0.   172.39   0.     0.     0.     0.     0.  ]]
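
Hard-coding column numbers gets error-prone as the symbol list grows. One possible way to build the sparse price row from quotes keyed by symbol is sketched below; `col_of` and `quotes_to_row` are hypothetical helpers introduced for illustration, not part of SciPy.

```python
import numpy as np
from scipy.sparse import csr_matrix

symbols = ["SNAP", "F", "FB", "AMD", "AAPL", "BAC", "T", "SOFI", "PINS", "NIO"]

# Hypothetical lookup: symbol -> column index in the positions matrix
col_of = {symbol: i for i, symbol in enumerate(symbols)}

def quotes_to_row(quotes):
    """Turn a {symbol: price} dict into a 1 x len(symbols) CSR row."""
    cols = np.array([col_of[s] for s in quotes], dtype=int)
    data = np.array(list(quotes.values()), dtype=float)
    rows = np.zeros(len(cols), dtype=int)  # every quote goes in row 0
    return csr_matrix((data, (rows, cols)), shape=(1, len(symbols)))

sparse_prices = quotes_to_row({"F": 17.96, "AAPL": 172.39})
print(sparse_prices.toarray())
```

A quote for a symbol that is not in `symbols` raises a `KeyError` here; whether to skip or reject such quotes is a design choice left open.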


Now calculate position market values based on the latest quotes:

In [20]:
print(sp.multiply(sparse_prices))
print(sp.multiply(sparse_prices).toarray())

  (1, 1)	17.96
[[ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.   17.96  0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]]


In [22]:
# Original set of positions - no Apple holdings, one Ford holding
print(positions)

[[1 0 0 0 0 0 0 0 1 0]
 [0 1 0 1 0 0 0 0 0 0]
 [0 0 0 0 0 1 1 0 0 0]]
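
Multiplying by a partial quote row only values the positions that were just quoted. To keep a running total per portfolio as quotes arrive, one option is to apply only the price changes. The sketch below assumes we keep a dense vector of last known prices; the new quotes (18.10 for F, 171.00 for AAPL) are made-up numbers for illustration, and `last_prices`, `values`, and `price_change` are names introduced here.

```python
import numpy as np
from scipy.sparse import csr_matrix

positions = np.array([
    [1, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
])
sp = csr_matrix(positions)

# Last known price for every symbol, and the portfolio totals they imply
last_prices = np.array(
    [38.91, 17.96, 237.09, 123.60, 172.39, 48.28, 24.08, 11.89, 27.25, 23.96]
)
values = sp @ last_prices

# Hypothetical quote update for F (column 1) and AAPL (column 4)
updated_cols = np.array([1, 4])
new_prices = np.array([18.10, 171.00])

# Sparse 1 x 10 row holding only the price changes
price_change = csr_matrix(
    (new_prices - last_prices[updated_cols],
     (np.zeros_like(updated_cols), updated_cols)),
    shape=(1, sp.shape[1]),
)

# (3 x 10) @ (10 x 1): change in value per portfolio from this update
delta = (sp @ price_change.T).toarray().ravel()
values += delta
last_prices[updated_cols] = new_prices

# Only the second portfolio moves (it holds F); nobody holds AAPL
print(values)
```

The result matches a full recomputation with `sp @ last_prices`, but the update itself only touches the quoted columns.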
