The following is a guided tour of portfolio_allocator's functional components.  We will see how portfolio_allocator prepares data and employs two methods to calculate allotments.  The code also lets users generate different input configurations to see how portfolio_allocator performs across scenarios.  For a detailed explanation of the purposes and uses of portfolio_allocator, see purpose_and_uses.md in the project repository.

In [31]:
# Import relevant libraries
import pandas as pd
import numpy as np
from random import randint
import sympy as sp

# Inputs
After the relevant libraries are imported, three basic inputs are defined:
- `cash`
- `target_pcts`
- `current_values`

The usefulness of the portfolio allocation methods depends on these inputs.  Sometimes portfolio allocation is so straightforward that advanced methods are not needed.  The default values used here provide a scenario in which advanced methods are useful: Method 1 is suboptimal, and multiple iterations of Method 2 are required to achieve an optimal allocation.  To explore other scenarios, users can uncomment the random-input code in the following cell.  Note, however, that randomly generated inputs will often produce uninteresting scenarios.
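
One way to recognize a straightforward scenario: because the target percentages sum to 1, the deficits (target value minus current value, with target values computed on the portfolio total including cash) always sum to the cash.  If no investment has a negative deficit, funding every deficit in full uses exactly the cash and brings every error to zero.  The sketch below checks for that case; `is_straightforward` is a hypothetical helper written for this tour, not part of portfolio_allocator.

```python
def is_straightforward(cash, current_values, target_pcts):
    """Return True when every investment is at or below its target value.

    Hypothetical helper for illustration; not part of portfolio_allocator.
    """
    total = sum(current_values) + cash
    deficits = [pct * total - value for pct, value in zip(target_pcts, current_values)]
    # Tolerate tiny floating-point noise around zero
    return all(d >= -1e-9 for d in deficits)


# Default inputs from this tour: three investments already exceed their target value
print(is_straightforward(500, [10, 20, 30, 50, 80, 130, 210, 340, 550, 890], [.1] * 10))  # False
# A balanced two-investment portfolio: funding each deficit in full zeroes all errors
print(is_straightforward(500, [100, 100], [.5, .5]))  # True
```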

In [32]:
# Define the basic features of the portfolio and investment
    # cash is the lump sum to be allotted to investments
    # current_values are the current values of the investments in the portfolio
    # target_pcts are the set of target allocations specified as 
    # proportions that sum to 1 (minus rounding error)

cash = 500
current_values = [10, 20, 30, 50, 80, 130, 210, 340, 550, 890]
target_pcts = [.1,.1,.1,.1,.1,.1,.1,.1,.1,.1]

# The default inputs used here capture the set of conditions that make the usefulness of portfolio_allocator most obvious
# However, the script will work with random values (generated by the commented out code) for the sake of experimentation

#cash = np.random.randint(1000,10000)
#number_of_investments = 10 #np.random.randint(3,20)
#def create_target_pcts(n_i):
#    n = 20 - n_i
#    arr = [1] * n_i
#    for i in range(n):
#        arr[randint(0, n) % n_i] += 1
#    return [(5*x)/100 for x in arr]
#target_pcts = create_target_pcts(number_of_investments)
#current_values = np.random.randint(10,1000, size=number_of_investments)

# The sum of target_pcts should equal 1
assert round(sum(target_pcts), 6) == 1, 'sum of target % allocations does not equal one'

# Create DataFrame
df = pd.DataFrame({'target%':target_pcts,'current_value':current_values})

# Output
print('cash = ',cash)
df

cash =  500


Unnamed: 0,target%,current_value
0,0.1,10
1,0.1,20
2,0.1,30
3,0.1,50
4,0.1,80
5,0.1,130
6,0.1,210
7,0.1,340
8,0.1,550
9,0.1,890


# Data Preparation

In [33]:
# Create calculated columns which will be used by the allocation methods below.
    # Note that target values reflect cash as shown in calculation
df['current%'] = df['current_value'] / df['current_value'].sum()
df['target_value'] = df['target%'] * (df['current_value'].sum()+cash)
df['deficit'] = df['target_value'] - df['current_value']
df['error'] = (df['current_value'] / df['current_value'].sum()) - df['target%']

# Create 'rank' column to reflect rankings of deficits (i.e., discrepancies between the target and current values)
    # Rankings are made in descending order such that the largest deficit gets a rank of 0
df['rank'] = df['deficit'].rank(method='first',ascending=False).astype(int)-1

# Having the DataFrame sorted by rank makes subsequent calculations easier to interpret
df.sort_values(by='rank',inplace=True)

# Reorder columns
for col in reversed(('target%','current%','target_value','current_value','rank','deficit','error')):
    extracted_col = df.pop(col)
    df.insert(0, col, extracted_col)

df

Unnamed: 0,target%,current%,target_value,current_value,rank,deficit,error
0,0.1,0.004329,281.0,10,0,271.0,-0.095671
1,0.1,0.008658,281.0,20,1,261.0,-0.091342
2,0.1,0.012987,281.0,30,2,251.0,-0.087013
3,0.1,0.021645,281.0,50,3,231.0,-0.078355
4,0.1,0.034632,281.0,80,4,201.0,-0.065368
5,0.1,0.056277,281.0,130,5,151.0,-0.043723
6,0.1,0.090909,281.0,210,6,71.0,-0.009091
7,0.1,0.147186,281.0,340,7,-59.0,0.047186
8,0.1,0.238095,281.0,550,8,-269.0,0.138095
9,0.1,0.385281,281.0,890,9,-609.0,0.285281
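
As a quick arithmetic check on the table above: the current values sum to 2310, so each target value is 0.1 × (2310 + 500) = 281.  Because the target percentages sum to 1, the deficits always sum to the cash.  A minimal standalone sketch in plain Python (variable names such as `portfolio_total` are chosen here for illustration):

```python
cash = 500
current_values = [10, 20, 30, 50, 80, 130, 210, 340, 550, 890]
target_pcts = [.1] * 10

# Target values are computed on the portfolio total after the cash is added
portfolio_total = sum(current_values) + cash
target_values = [pct * portfolio_total for pct in target_pcts]
deficits = [t - v for t, v in zip(target_values, current_values)]

print(portfolio_total)              # 2810
print(round(target_values[0], 6))   # 281.0
print(round(sum(deficits), 6))      # 500.0
```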


# Method 1: allocate-by-rank

This method allots the cash to completely eliminate the deficits in rank order until the cash is gone.  For example, if the cash is $3 and the two top-ranking deficits are each $2, this method will result in an allocation of $2 to the investment with the top-ranking deficit and $1 to the investment with the second-ranking deficit.
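
The $3 example can be reproduced with a small standalone function.  This is a sketch under the assumption that deficits are passed as a plain list; `allocate_by_rank_sketch` is a hypothetical name and is separate from the DataFrame-based code in the next cell.

```python
def allocate_by_rank_sketch(cash, deficits):
    """Fund deficits from largest to smallest until the cash runs out.

    Hypothetical standalone helper; returns allotments in the input order.
    """
    allotments = [0.0] * len(deficits)
    remaining = cash
    # Visit positions in descending order of deficit (ties keep input order)
    for i in sorted(range(len(deficits)), key=lambda k: -deficits[k]):
        # Stop once the cash is gone or only non-positive deficits remain
        if deficits[i] <= 0 or remaining <= 0:
            break
        allotments[i] = min(deficits[i], remaining)
        remaining -= allotments[i]
    return allotments


print(allocate_by_rank_sketch(3, [2, 2]))  # [2, 1]
```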

In [34]:
# Define a function that determines each allotment from the DataFrame index and the amount allotted so far
    # The function allots an amount equal to the deficit or whatever dollars remain
    # after accounting for previous allocations

def determine_amount(i, allocated_so_far):
    deficit = df.loc[i,'deficit']
    remaining = cash - allocated_so_far
    # Investments already above their target value receive nothing
    if deficit < 0:
        return 0
    # Otherwise cover the deficit, capped by the cash that remains
    return min(deficit, remaining)

# Calculate the allotment for each investment in rank order
allocated_cumsum = 0
for rank in df['rank']:
    b = df['rank']==rank
    index = df.loc[b].index[0]
    allocation = determine_amount(index,allocated_cumsum)
    df.loc[b,'allocate_by_rank'] = allocation
    allocated_cumsum += allocation

# calculate the error of the new values of investments in the hypothetical portfolio in which these allotments were made
df['error_m1'] = ((df['current_value'] + df['allocate_by_rank']) / (df['current_value'].sum()+cash)) - df['target%']

# make sure the sum of the allocations equals the cash
print('cash = ',cash)
assert round(df['allocate_by_rank'].sum()) == cash, "sum of Method 1 allocations does not equal cash"

df

cash =  500


Unnamed: 0,target%,current%,target_value,current_value,rank,deficit,error,allocate_by_rank,error_m1
0,0.1,0.004329,281.0,10,0,271.0,-0.095671,271.0,0.0
1,0.1,0.008658,281.0,20,1,261.0,-0.091342,229.0,-0.011388
2,0.1,0.012987,281.0,30,2,251.0,-0.087013,0.0,-0.089324
3,0.1,0.021645,281.0,50,3,231.0,-0.078355,0.0,-0.082206
4,0.1,0.034632,281.0,80,4,201.0,-0.065368,0.0,-0.07153
5,0.1,0.056277,281.0,130,5,151.0,-0.043723,0.0,-0.053737
6,0.1,0.090909,281.0,210,6,71.0,-0.009091,0.0,-0.025267
7,0.1,0.147186,281.0,340,7,-59.0,0.047186,0.0,0.020996
8,0.1,0.238095,281.0,550,8,-269.0,0.138095,0.0,0.09573
9,0.1,0.385281,281.0,890,9,-609.0,0.285281,0.0,0.216726


# Method 2: allocate-for-minimal-errors

Method 1 can lead to suboptimal results.  When allotting cash to eliminate deficits, errors are reduced to zero for investments with the top-ranking deficits, but may remain large for investments with lower-ranking deficits.  The optimal result would be a set of allotments that minimizes all errors as much as possible.

Method 2 achieves this by calculating the minimum equal error achievable for all investments with a deficit greater than 0, and then calculating the allotments required to produce that error.  Sometimes, however, this results in negative allotment values when one or more investments have a relatively small error in the current portfolio.  To prevent such impractical allotment values, Method 2 iteratively calculates the allotments and the shared error.  If any allotment values are negative, the investments with negative allotments are dropped from the subsequent calculations and iteration continues.  Iteration ends once the non-negative constraint for allotments is satisfied.
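
The iteration described above can be sketched as a standalone function.  This is a minimal sketch, assuming plain-list inputs, `cash` greater than 0, and target proportions that sum to 1; `allocate_for_minimal_errors_sketch` is a hypothetical name, not the portfolio_allocator implementation.

```python
def allocate_for_minimal_errors_sketch(cash, current_values, target_pcts):
    """Equalize the error across funded investments, dropping any that
    would need a negative allotment. Hypothetical standalone helper."""
    total = sum(current_values) + cash
    # Start with every investment whose deficit is positive
    selected = [i for i, (v, p) in enumerate(zip(current_values, target_pcts))
                if p * total - v > 0]
    while True:
        # Shared error e for the selected investments, given that their
        # allotments must sum to the cash
        e = ((sum(current_values[i] for i in selected) + cash) / total
             - sum(target_pcts[i] for i in selected)) / len(selected)
        allotments = {i: (e + target_pcts[i]) * total - current_values[i]
                      for i in selected}
        if all(a >= 0 for a in allotments.values()):
            break
        # Drop investments whose allotment came out negative and retry
        selected = [i for i, a in allotments.items() if a > 0]
    return [allotments.get(i, 0.0) for i in range(len(current_values))]


result = allocate_for_minimal_errors_sketch(
    500, [10, 20, 30, 50, 80, 130, 210, 340, 550, 890], [.1] * 10)
print([round(a, 2) for a in result])
```

With the default inputs, the first pass yields a negative allotment for the investment valued at 210, so it is dropped and the second pass satisfies the constraint.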

In [35]:
# For illustrative purposes, create the system of equations required to calculate the minimum equal error
# achievable for investments with a deficit greater than 0

# The first set of equations is derived from the same equation used to calculate the "error" columns above
    # the symbol e is reserved for error
    # all other single lowercase letter symbols represent allocations
cv = df['current_value'].to_list()
t_pct = df['target%'].to_list()
e = sp.Symbol('e')
equations = {}
allotment_symbols = []
b = df['deficit'] >= 0
deficit_cnt = df.loc[b,'rank'].max()
# The letter 'e' is skipped in the string below because it is reserved for the error term
for idx,letter in enumerate('abcdfghijklmnopqrstuvwxyz'[:deficit_cnt+1]):
    allotment_symbols.append(sp.Symbol(letter))
    equations.update({'eq'+str(idx):(((cv[idx] + allotment_symbols[idx]) / (sum(cv)+cash)) - t_pct[idx]) - e})

# The final equation reflects the constraint that the sum of all allotments should equal the cash available to invest
equations.update({'eq'+str(deficit_cnt+1):sum(allotment_symbols) - cash})
print('\nSystem of equations (all equal to zero): '); display(equations)

# solve the system of equations
symbol_values = sp.solve(equations.values(), allotment_symbols+[e])

print('\nInitial set of values: '); display(symbol_values)


System of equations (all equal to zero): 


{'eq0': a/2810 - e - 0.09644128113879,
 'eq1': b/2810 - e - 0.0928825622775801,
 'eq2': c/2810 - e - 0.0893238434163701,
 'eq3': d/2810 - e - 0.0822064056939502,
 'eq4': -e + f/2810 - 0.0715302491103203,
 'eq5': -e + g/2810 - 0.0537366548042705,
 'eq6': -e + h/2810 - 0.0252669039145908,
 'eq7': a + b + c + d + f + g + h - 500}


Initial set of values: 


{a: 137.142857142857,
 b: 127.142857142857,
 c: 117.142857142857,
 d: 97.1428571428572,
 f: 67.1428571428571,
 g: 17.1428571428572,
 h: -62.8571428571428,
 e: -0.0476359938993391}
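
The closed-form expressions used in the next cell follow from this system.  A short derivation, using notation introduced here for illustration: let S be the set of n selected investments, v_i the current value, t_i the target proportion, a_i the allotment, V the sum of all current values, and C the cash.

```latex
\begin{aligned}
\frac{v_i + a_i}{V + C} - t_i &= e \quad (i \in S), \qquad \sum_{i \in S} a_i = C \\
\sum_{i \in S} \left( \frac{v_i + a_i}{V + C} - t_i \right) &= n e
\;\Longrightarrow\;
e = \frac{1}{n} \left( \frac{\sum_{i \in S} v_i + C}{V + C} - \sum_{i \in S} t_i \right) \\
a_i &= (e + t_i)(V + C) - v_i
\end{aligned}
```

For the seven investments above, the selected current values sum to 530, so e = (1/7)(1030/2810 - 0.7) ≈ -0.047636, which agrees with the solver output.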

In [36]:
# This code uses equations derived from the system of equations above
# It iteratively calculates allotments and constant errors for each investment
# If any allotment values are negative, iteration continues, and
# the negative allotment term is dropped from the subsequent calculations.
# Iteration ends once the non-negative constraint is satisfied

cv = df['current_value'].to_list()
t_pct = df['target%'].to_list()
deficit = df['deficit'].to_list()
select_ranks = [i for i in range(len(df)) if deficit[i]>0]
iteration = 1
success = False

while not success:
    print('iteration = ',iteration)
    #Calculate equal errors
    #The equation for e is derived from the system of equations presented above
    select_cv = sum([cv[i] for i in select_ranks])
    select_t_pct = sum([t_pct[i] for i in select_ranks])
    select_ranks_cnt = len(select_ranks)
    e = 1/select_ranks_cnt * ((select_cv + cash)/(sum(cv) + cash) - select_t_pct)

    # Calculate cash allocations in a dictionary where keys are ranks and values are allocations
    # The equation for each allocation is derived from the system of equations presented above
    allotments = {i:((e + t_pct[i])*(sum(cv)+cash)) - cv[i] if i in select_ranks else 0 for i in range(len(df))}
    display(allotments)

    # Check if the non-negative constraint is satisfied
    if all(val >= 0 for val in allotments.values()):
        success = True
    else:
        select_ranks = [rank for rank,allotment in allotments.items() if allotment > 0]
        iteration += 1

# Create column containing allotments calculated via Method 2
df['allocate_for_minimal_errors'] = df['rank'].map(allotments)

# Calculate errors based on Method 2 allotments
df['error_m2'] = ((df['current_value'] + df['allocate_for_minimal_errors']) / (df['current_value'].sum()+cash)) - df['target%']

df

iteration =  1


{0: 137.14285714285717,
 1: 127.14285714285717,
 2: 117.14285714285717,
 3: 97.14285714285717,
 4: 67.14285714285717,
 5: 17.142857142857167,
 6: -62.85714285714283,
 7: 0,
 8: 0,
 9: 0}

iteration =  2


{0: 126.66666666666669,
 1: 116.66666666666669,
 2: 106.66666666666669,
 3: 86.66666666666669,
 4: 56.666666666666686,
 5: 6.666666666666686,
 6: 0,
 7: 0,
 8: 0,
 9: 0}

Unnamed: 0,target%,current%,target_value,current_value,rank,deficit,error,allocate_by_rank,error_m1,allocate_for_minimal_errors,error_m2
0,0.1,0.004329,281.0,10,0,271.0,-0.095671,271.0,0.0,126.666667,-0.051364
1,0.1,0.008658,281.0,20,1,261.0,-0.091342,229.0,-0.011388,116.666667,-0.051364
2,0.1,0.012987,281.0,30,2,251.0,-0.087013,0.0,-0.089324,106.666667,-0.051364
3,0.1,0.021645,281.0,50,3,231.0,-0.078355,0.0,-0.082206,86.666667,-0.051364
4,0.1,0.034632,281.0,80,4,201.0,-0.065368,0.0,-0.07153,56.666667,-0.051364
5,0.1,0.056277,281.0,130,5,151.0,-0.043723,0.0,-0.053737,6.666667,-0.051364
6,0.1,0.090909,281.0,210,6,71.0,-0.009091,0.0,-0.025267,0.0,-0.025267
7,0.1,0.147186,281.0,340,7,-59.0,0.047186,0.0,0.020996,0.0,0.020996
8,0.1,0.238095,281.0,550,8,-269.0,0.138095,0.0,0.09573,0.0,0.09573
9,0.1,0.385281,281.0,890,9,-609.0,0.285281,0.0,0.216726,0.0,0.216726


As shown in the output above, Method 1 and Method 2 result in different allotment sets.  Comparing the error columns shows that Method 2 spreads the cash so that every funded investment ends with the same error (-0.051364), whereas Method 1 leaves errors as large as -0.089324 among investments with a deficit greater than zero.
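
The comparison can be made concrete with two summary numbers.  The sketch below copies the error columns from the output table; using the largest shortfall and the sum of squared errors as summaries is an assumption made here for illustration, not a measure defined by portfolio_allocator.

```python
# Error columns copied from the output table above
error_m1 = [0.0, -0.011388, -0.089324, -0.082206, -0.07153,
            -0.053737, -0.025267, 0.020996, 0.09573, 0.216726]
error_m2 = [-0.051364, -0.051364, -0.051364, -0.051364, -0.051364,
            -0.051364, -0.025267, 0.020996, 0.09573, 0.216726]

# Largest shortfall below target under each method
worst_m1 = min(error_m1)
worst_m2 = min(error_m2)
print(worst_m1, worst_m2)  # -0.089324 -0.051364

# Sum of squared errors as one overall summary
sse_m1 = sum(x ** 2 for x in error_m1)
sse_m2 = sum(x ** 2 for x in error_m2)
print(sse_m2 < sse_m1)  # True
```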

While the portfolio size and allotment amounts are relatively small in this example, the difference between optimal and suboptimal allocation sets may have significant consequences for larger portfolios with larger allocation amounts.  In high-stakes situations involving hedge funds, company-owned assets, government or corporate budgets, or philanthropic initiatives, the choice of allocation method could affect overall portfolio performance or the recipients of the funding.

# Demonstration of portfolio_allocator()

In [37]:
%run "portfolio_allocator().py"
portfolio_allocator(cash,current_values,target_pcts)

Unnamed: 0,target%,current%,target_value,current_value,rank,deficit,error,allocate_by_rank,error_m1,allocate_for_minimal_errors,error_m2
0,0.1,0.004329,281.0,10,0,271.0,-0.095671,271.0,0.0,126.666667,-0.051364
1,0.1,0.008658,281.0,20,1,261.0,-0.091342,229.0,-0.011388,116.666667,-0.051364
2,0.1,0.012987,281.0,30,2,251.0,-0.087013,0.0,-0.089324,106.666667,-0.051364
3,0.1,0.021645,281.0,50,3,231.0,-0.078355,0.0,-0.082206,86.666667,-0.051364
4,0.1,0.034632,281.0,80,4,201.0,-0.065368,0.0,-0.07153,56.666667,-0.051364
5,0.1,0.056277,281.0,130,5,151.0,-0.043723,0.0,-0.053737,6.666667,-0.051364
6,0.1,0.090909,281.0,210,6,71.0,-0.009091,0.0,-0.025267,0.0,-0.025267
7,0.1,0.147186,281.0,340,7,-59.0,0.047186,0.0,0.020996,0.0,0.020996
8,0.1,0.238095,281.0,550,8,-269.0,0.138095,0.0,0.09573,0.0,0.09573
9,0.1,0.385281,281.0,890,9,-609.0,0.285281,0.0,0.216726,0.0,0.216726
