# Things I was in the middle of doing


I don't think the possible_trades function is working, so I need to fix that. I was going to finish off by randomly taking possible trades to complete the portfolio for subsequent days. This would be added in a separate function, or in addition to the do_the_trades function, and would be the final step in making the portfolio for the next day. After that, the things I think still need to be done are:

- I am thinking about how to deal with the closing of positions. I think right now it is best to close a position on the 1st of December and not make a trade in its place until the 2nd, as this makes passing the dataframes around a bit simpler. I haven't got to coding this yet. It means that when a position closes we need to remove it from our dataframe, which then frees up space for one more trade on the next day!



- Set up returns. This should be straightforward initially; however, one thing I am considering is that this code will need to be altered to account for price changes before the end of the period that we are looking at. In the end the system currently in place will need to be replaced, and we will need to re-do the stock weightings and exposures manually in a function, but I don't think this should be too bad.



- Incorporate sector and country limits of exposure. I reckon it will be challenging to get this working alongside the factor exposure limits. Then we need to consider gross factor exposure too.
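
The position-closing rule from the first bullet above could be sketched roughly as follows. This is only an illustration: it assumes each position carries its closing date in a column called `Close Date`, which is a hypothetical name (the real data may store this differently), and `remove_closed_positions` is a made-up helper that does not exist in this notebook yet.

```python
import pandas as pd

def remove_closed_positions(portfolio, today):
    """Drop positions whose (hypothetical) 'Close Date' is on or before today."""
    still_open = portfolio['Close Date'] > today
    return portfolio[still_open].reset_index(drop=True)

# Toy portfolio: the first position closes on the 1st of December.
portfolio = pd.DataFrame({
    'Ticker': ['AAA', 'BBB', 'CCC'],
    'Side': ['Buy', 'Sell Short', 'Buy'],
    'Close Date': pd.to_datetime(['2017-12-01', '2017-12-05', '2017-12-10']),
})

remaining = remove_closed_positions(portfolio, pd.Timestamp('2017-12-01'))

# One slot is now free; under the rule above it would only be refilled on the 2nd.
freed_slots = len(portfolio) - len(remaining)
```

Removing the row on the closing day and only refilling the slot the day after keeps each day's step to "drop closed rows, then add new rows", which is the simplification described in the bullet.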


## Portfolio Construction



This file is a first attempt at ensuring exposures do not exceed given levels. It is split up into sections, and these may eventually be split into separate documents depending on how computationally intensive this is to run...


I want to structure it something like: 


- Data Loading & Pre-Processing

- Set-up & Variable definitions

- Portfolio Construction (initial set up)

- Portfolio Construction (future days)

- Portfolio Returns

- Visualisations


There are a number of simplifications I make initially. To make stuff run I'm only going to deal with trades for one country, and initially this code does $\textbf{NOT}$ take into account sector exposure. This should be relatively easy to add.

In [None]:
import numpy as np
import sympy as sp
import pandas as pd

from datetime import datetime

### Data Loading & Pre-Processing


In [None]:
#General formatting. I use Japan as it is relatively small. The only important point here is that the Date column needs
#reformatting as datetime for easy filtering, and I am initially only working with 2017, again just for practical reasons. 

japan_trades = pd.read_csv('Japan Trades.csv')
japan_trades = japan_trades.dropna()

japan_trades['Date'] = pd.to_datetime(japan_trades['Date'])
japan_2017 = japan_trades[japan_trades['Date'].dt.year == 2017]
japan_2017 = japan_2017.drop(columns=['returns', 'company_name','Unnamed: 0'])
Days = japan_2017.Date.unique()

### Set-up & Variable definitions


In [None]:
total_val = 10000000 
trade_val = 250000

#This is effectively a measure of the exposure. For the given numbers we get a value of 1/40, and so if we want
#our maximum factor exposure to be 12% for each factor this means we cannot have a " 5 trade swing" in favour of 
#buy or sell. For example if we had 14 buy trades and 10 sell shorts this is fine, but if it was 15 and 10 we 
#would be over exposed to the given factor. 

stock_weighting = np.round(trade_val/total_val,3)

#unique factors and sectors. Importantly, these are taken over all days; on any given day a factor / sector may not be
#present, and so these need to be re-defined daily. 

factors = list(japan_trades['Factor'].unique())
sectors = list(japan_trades['Sector'].unique())

#inbuilt maximum net and sector exposure. Can be changed. 
maximum_net_exposure = np.round(1/len(factors),2)
maximum_sector_exposure = np.round(100/len(sectors),0)

#This will contain the portfolio for each given day 
portfolios = []
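
The "5 trade swing" arithmetic from the comments in the cell above can be checked with a small worked example. The helper `net_exposure` and the names `max_factor_exposure` and `max_swing` are introduced here only for illustration; the 12% limit is the figure quoted in the comments.

```python
import numpy as np

total_val = 10_000_000
trade_val = 250_000
stock_weighting = np.round(trade_val / total_val, 3)   # 0.025, i.e. 1/40
max_factor_exposure = 0.12                             # 12%, as quoted in the comments

def net_exposure(n_buys, n_shorts, weighting=stock_weighting):
    """Net exposure of equally weighted trades: buys positive, sell shorts negative."""
    return weighting * (n_buys - n_shorts)

# Largest allowed imbalance between buys and sell shorts for one factor.
max_swing = int(np.floor(max_factor_exposure / stock_weighting))

# 14 buys vs 10 sell shorts is a swing of 4; 15 vs 10 is a swing of 5.
ok_14_10 = abs(net_exposure(14, 10)) <= max_factor_exposure
ok_15_10 = abs(net_exposure(15, 10)) <= max_factor_exposure
```

With these numbers a swing of 4 trades gives an exposure of 0.10, which is within the limit, while a swing of 5 gives 0.125, which is not.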

### Portfolio Construction

##### Valid Exposure Checker

This function checks that a data set does not exceed the maximum net exposure. The data passed in here should already be filtered by factor. 

In [1]:
#To make the code more efficient, this valid_exposure function is called first to see if our input is already valid.
#It checks that the number of trades for a given factor does not exceed the allocation;
#for instance the ML models may have given us 10 datapoints for a given factor / day in the Japan Trades.csv file
#and this ensures we do not exceed that. It also ensures we do not allocate too much of our portfolio to one single factor. 

def valid_exposure(data,maximum_net_exposure,max_trades):
    #net exposure: buys count positively and sell shorts negatively (same convention as get_exposure).
    n_buys = len(data[data.Side == "Buy"])
    n_sell_shorts = len(data[data.Side == "Sell Short"])
    exposure = stock_weighting*(n_buys - n_sell_shorts)
    
    #valid only if both the exposure limit and the trade limit hold. Always returns a bool
    #(previously this fell through and returned None when only the trade limit was broken).
    return bool(np.abs(exposure) <= maximum_net_exposure and len(data) <= max_trades)

    
#This is the chunkier function, doing more checks. It constructs the portfolio while ensuring that it does not exceed
#a given maximum number of trades and that the factor exposure does not exceed a certain level. 

#data                 : should be a dataframe, pre-filtered by both date and factor.
#maximum_net_exposure : the maximum exposure to a given factor. Initially I take them all as 12%
#max_trades           : the maximum number of trades we can make for each factor. 


def get_valid(data,maximum_net_exposure,max_trades):
    
    #calls simpler function for efficiency. 
    if valid_exposure(data,maximum_net_exposure,max_trades):
        return data
    
    #extracts positions that are buy or sell short. 
    buys = data[data.Side == "Buy"]
    sell_shorts = data[data.Side == "Sell Short"]
    
    #this difference in length is the imbalance behind any over-exposure. For instance if we had 80 buys and 70 sell shorts
    #we would have a difference of 10, and after multiplying by the (initially equal) weighting of each of these
    #we would arrive at 10*0.025 = 0.25 = over exposed! 
    difference = abs(len(buys) - len(sell_shorts))
    n_pairs = min(len(buys),len(sell_shorts))
    
    #this sets up our dataframe first by filling with pairs of buys and sell shorts, which leave net exposure
    #unchanged. Entries here are going to be filtered down later in the code.
    #(pd.concat is used because DataFrame.append was removed in pandas 2.0.)
    portfolio = pd.concat([buys.iloc[:n_pairs],sell_shorts.iloc[:n_pairs]])
    
    #This sets up the remaining (unpaired) trades, ensuring we do not over-expose for a given factor.
    #int() is needed because np.floor returns a float.
    available_trades_to_reach_net_exposure = int(np.floor(maximum_net_exposure/stock_weighting))
    trade_limit = min(difference,available_trades_to_reach_net_exposure)
    
    #adds our new trades to the portfolio, starting after the rows already used in pairs.
    #If we have more buys we add buys, if we have more sell shorts we add sell shorts. 
    if len(buys) > len(sell_shorts):
        portfolio = pd.concat([portfolio,buys.iloc[n_pairs:n_pairs+trade_limit]])
    elif len(sell_shorts) > len(buys):
        portfolio = pd.concat([portfolio,sell_shorts.iloc[n_pairs:n_pairs+trade_limit]])
    
    #this bit is just ensuring we do not make too many trades. It checks the length of our constructed portfolio
    #and, if too long, removes random buy / sell short pairs. Removing whole pairs keeps net exposure unchanged and
    #leaves the portfolio within 1 unit of the maximum allocated trades (when the excess is odd). 
    if len(portfolio) > max_trades:
        excess = len(portfolio) - max_trades
        sample = excess // 2
        for side in ["Sell Short","Buy"]:
            side_rows = portfolio[portfolio['Side'] == side]
            #never ask for more rows than exist on this side.
            portfolio = portfolio.drop(side_rows.sample(n=min(sample,len(side_rows))).index)
    
    return portfolio
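
To see the pairing idea in isolation, here is a minimal self-contained sketch on toy data. It is not the notebook's `get_valid`: `balance_trades` is a hypothetical helper, the `Ticker` values are invented, and the trimming to a maximum number of trades is left out on purpose.

```python
import numpy as np
import pandas as pd

def balance_trades(data, max_net_exposure, weighting):
    """Keep all matched Buy / Sell Short pairs, plus as many unpaired
    trades from the longer side as the exposure limit allows."""
    buys = data[data['Side'] == 'Buy']
    shorts = data[data['Side'] == 'Sell Short']
    n_pairs = min(len(buys), len(shorts))

    # Number of unpaired trades that still fits inside the exposure limit.
    extra_allowed = int(np.floor(max_net_exposure / weighting))
    longer = buys if len(buys) > len(shorts) else shorts
    n_extra = min(len(longer) - n_pairs, extra_allowed)

    return pd.concat([buys.iloc[:n_pairs],
                      shorts.iloc[:n_pairs],
                      longer.iloc[n_pairs:n_pairs + n_extra]])

# Toy input: 8 buys and 2 sell shorts, i.e. an imbalance of 6 trades.
toy = pd.DataFrame({'Ticker': list('ABCDEFGHIJ'),
                    'Side': ['Buy'] * 8 + ['Sell Short'] * 2})

kept = balance_trades(toy, max_net_exposure=0.12, weighting=0.025)
n_buys = int((kept['Side'] == 'Buy').sum())
n_shorts = int((kept['Side'] == 'Sell Short').sum())
```

Here the two pairs are kept, and only 4 of the 6 unpaired buys are added, because a fifth would push the net exposure to 0.125.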

#### Test Cell

Use this to test valid exposure checker & get valid exposure. It is important to make sure the dataframe being passed in is already sorted by date! 

In [2]:
first_day = Days[0]
japan_2017_initial = japan_2017[japan_2017['Date'] == first_day]

data = japan_2017_initial[japan_2017_initial.Factor == "Behavioural"]
maximum_net_exposure = 0.12
maximum_trades = 10

get_valid(data,maximum_net_exposure,maximum_trades)

NameError: name 'Days' is not defined

## Initial Portfolio (On Day 1)

The below code is simply constructing the initial portfolio. 

In [None]:
#gets the data for the first day
first_day = Days[0]
japan_2017_initial = japan_2017[japan_2017['Date'] == first_day]

initial_portfolio = [] 
available_trades_by_factor = []
#just in case you ran the test cell, max_trades is re-defined here to ensure the variable is still correct. 
max_trades = int(np.round(total_val/trade_val,0))

#gets the factors for which we may trade for a current day. Most days in the current data we would not be trading every factor.
factors_current_day = japan_2017_initial.Factor.unique()

#gets maximum amount of trades we can make for each factor so that our portfolio remains within feasible cost (i.e. not
#exceeding 10 million). 

max_trades_by_factor = int(max_trades/len(factors_current_day))

#this is just creating the portfolio from the already defined functions. 
for i in japan_2017_initial.Factor.unique():
    current_factor = japan_2017_initial[japan_2017_initial.Factor == i ]
    initial_portfolio.append(get_valid(current_factor,maximum_net_exposure,max_trades_by_factor))  
portfolios.append(pd.concat(initial_portfolio))

## Portfolio (Future Days)


### Needed Functions

In [None]:
#initial_data         : the portfolio from previous time period. 
#potential_trades     : the japan_2017 dataframe filtered for the current day.
#maximum_net_exposure : as defined previously, in this case 12%
#max_trades_by_factor : as defined previously, in this case 40 (for equal weightings).


#this gets the factor exposure for a data-frame, given constant stock weightings. This will be easy to change to non-constant weightings.
#so for instance the stock_weighting is 0.025 in our example, and thus exposure is just the number of buys * weighting minus 
#the number of shorts * weighting


def get_exposure(data,stock_weighting):
    exposure = 0
    for i in ["Sell Short","Buy"]:
        if i == "Sell Short":
            exposure -= stock_weighting*len(data[data['Side'] == "Sell Short"])
        else:
            exposure += stock_weighting*len(data[data['Side'] == "Buy"])
    return exposure
            
            
        
#this returns a df of the maximum number of trades still available for each factor. Rows are in the order in which
#the factors appear in the original dataset. 

def max_factor_trades(initial_data,maximum_net_exposure):
    available_trades = [] 
    for i in factors:
        current_factor = initial_data[initial_data['Factor'] == i]
        current_exposure = get_exposure(current_factor,stock_weighting)
        if len(current_factor['Factor']) == 0:
            available_trades.append(int(np.floor(maximum_net_exposure/stock_weighting)))
        else:
            current_trades = len(current_factor)
            if np.abs(current_exposure) < maximum_net_exposure:
                difference = maximum_net_exposure - current_exposure 
                trade_exposure = stock_weighting
                trades = int(np.floor(difference/stock_weighting) - current_trades)
                if trades > 0:
                    available_trades.append(trades)
                else:
                    available_trades.append(0)
            else:
                available_trades.append(0)
    #built directly from a dict: set_axis(..., inplace=False) no longer works in pandas 2.0.
    trades = pd.DataFrame({'Available Trades': available_trades, 'Factor': factors})
    return trades
        

# we can trade if:
# 1) we have not already met our trade limit, i.e. 10 million / (250,000*n) where n is the number of trades must be greater
#than 1. 
# 2) we have an available trade for the current day that we could do. This needs to check to see that by adding this trade
# we would not exceed factor exposure limit, which is done by calls to the max_factor_trades function. 
    
#we leave the portfolio as it was the day before if we cannot trade. 
    
def possible_trades(initial_data,potential_trades,maximum_net_exposure,max_trades_by_factor):
        trade_max = [] 
        n = len(initial_data)
        available_trades_all_factors = 10*maximum_net_exposure/stock_weighting - len(initial_data)
        #checking we can make a trade (i.e. we have not met our trade limit). I have made a mistake here and the 10* needs changed
        if available_trades_all_factors >0:
            #calls max trades to see the maximum available number, based only on yesterdays data.
            max_trades = max_factor_trades(initial_data,maximum_net_exposure)
            for i in range(0,len(factors)):
                #effectively checks if we have availability from yesterday and a potential trade to make today that we could do.
                if max_trades['Available Trades'][i] > 0 and len(potential_trades[potential_trades['Factor'] == factors[i]]) > 0:
                    trade_max.append(min(max_trades['Available Trades'][i],len(potential_trades[potential_trades['Factor'] == factors[i]])))
                else:
                    trade_max.append(0) 
            
            #checking to make sure we do not exceed the available trades.
            if sum(trade_max) > available_trades_all_factors:
                #if we do exceed it, we remove trades one at a time from factors with non-zero trades. Note this cycles
                #through the factors in order rather than picking them randomly.
                #the excess is sum - available (the other way round is negative, so the loop below never ran).
                difference = sum(trade_max) - available_trades_all_factors
                counter = 0
                
                #loops till we have removed the required number of elements, without overshooting. 
                while (counter < difference):
                    for i in range (0,len(factors)):
                        if trade_max[i] > 0 and counter < difference:
                            trade_max[i] -= 1 
                            counter +=1
            
            #built directly from a dict: set_axis(..., inplace=False) no longer works in pandas 2.0.
            trade_max = pd.DataFrame({'Trades to Make': trade_max, 'Factor': factors})
            return trade_max
        else:
            return 1
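
The comments in `possible_trades` talk about removing excess trades randomly. One way that step might look, written as a standalone sketch: `trim_randomly` is a hypothetical helper, the fixed `seed` is only there to make the example repeatable, and `limit` is assumed to be non-negative.

```python
import numpy as np

def trim_randomly(trade_counts, limit, seed=0):
    """Remove one trade at a time from a randomly chosen factor that still
    has trades left, until the total is within the limit (limit >= 0)."""
    rng = np.random.default_rng(seed)
    counts = list(trade_counts)
    while sum(counts) > limit:
        non_zero = [i for i, c in enumerate(counts) if c > 0]
        counts[rng.choice(non_zero)] -= 1
    return counts

# Four factors with 9 candidate trades in total, but only room for 5.
original = [3, 0, 4, 2]
trimmed = trim_randomly(original, limit=5)
```

Because only factors with a non-zero count are candidates for removal, no count can go negative, and the loop stops as soon as the total reaches the limit.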

### Functions Testing

In [None]:
print("Current Factor Exposure: " + str(np.round(get_exposure(portfolios[0],stock_weighting),2)))


print("Trades function output: ")

#I use .head to only get the first 2 rows, this indicates a change is needed as the current possible_trades function is wrong!!!!!!!!!

print(possible_trades(portfolios[0].head(2),japan_2017[japan_2017['Date'] ==Days[2]],maximum_net_exposure,max_trades_by_factor))



print("And corresponding available trades : ")
max_factor_trades(portfolios[0],maximum_net_exposure)
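
When checking these outputs it can help to see the net exposure of every factor at once. The sketch below does this with a `groupby`; `exposure_by_factor` is a hypothetical helper, the toy data is invented, and "Other" is a made-up factor name used only for the example.

```python
import pandas as pd

def exposure_by_factor(data, stock_weighting):
    """Net exposure per factor: +weighting per Buy, -weighting per Sell Short."""
    signed = data['Side'].map({'Buy': 1, 'Sell Short': -1}) * stock_weighting
    return signed.groupby(data['Factor']).sum()

toy = pd.DataFrame({
    'Factor': ['Behavioural', 'Behavioural', 'Behavioural', 'Other'],
    'Side': ['Buy', 'Buy', 'Sell Short', 'Sell Short'],
})

exposures = exposure_by_factor(toy, stock_weighting=0.025)
```

This uses the same sign convention as `get_exposure` (buys positive, sell shorts negative), so the per-factor values should add up to what `get_exposure` reports for the whole frame.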

## Portfolio (Future Days)



(This code is not finished. I haven't actually incorporated the new trades yet, as two required pieces are still missing: actually making the trades, and removing trades that have ended.)


In [None]:
#this skips the first day and goes from there. Initially I only go as far as the second day. 
for i in Days[1:2]:
    
    #gets portfolio from previous day (the most recently stored one). 
    initial_portfolio = portfolios[-1]
    
    #gets the dataframe of trades we may choose to make. 
    potential_trades = japan_2017[japan_2017['Date'] == i]
    
    #these set up max trades by factor for the current day.
    factors_current_day = potential_trades.Factor.unique()
    max_trades_by_factor = int(max_trades/len(factors_current_day))
        
     
    #(make calls to the functions, write the function to remove closed trades)
    #also finish the function to actually add the intended trades, I have not done that yet. Only identified
    #which factors we can make more trades for and how many. This should be straightforward but I also want to check
    #that the other parts of the code work. 
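
The missing "actually add the trades" step could look something like the sketch below. It is only a starting point under stated assumptions: `add_random_trades` is a hypothetical helper, the plan frame is assumed to have the `Trades to Make` and `Factor` columns that `possible_trades` is meant to return, the toy tickers and the "Other" factor are invented, and no exposure check is repeated after sampling.

```python
import pandas as pd

def add_random_trades(portfolio, potential_trades, trades_to_make, seed=0):
    """Append a random sample of the day's candidate trades, factor by factor."""
    picked = []
    for _, row in trades_to_make.iterrows():
        candidates = potential_trades[potential_trades['Factor'] == row['Factor']]
        # Never sample more trades than are actually on offer for this factor.
        n = min(int(row['Trades to Make']), len(candidates))
        if n > 0:
            picked.append(candidates.sample(n=n, random_state=seed))
    return pd.concat([portfolio] + picked, ignore_index=True)

# Toy data: yesterday's portfolio, today's candidates, and the per-factor plan.
yesterday = pd.DataFrame({'Ticker': ['AAA'], 'Factor': ['Behavioural'], 'Side': ['Buy']})
today = pd.DataFrame({'Ticker': ['BBB', 'CCC', 'DDD'],
                      'Factor': ['Behavioural', 'Behavioural', 'Other'],
                      'Side': ['Buy', 'Sell Short', 'Buy']})
plan = pd.DataFrame({'Trades to Make': [1, 0], 'Factor': ['Behavioural', 'Other']})

new_portfolio = add_random_trades(yesterday, today, plan)
```

Since the sample ignores whether a picked trade is a buy or a sell short, the result would still need to go through the exposure checks before being appended to `portfolios`.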
    

        


### Portfolio Returns


### Visualisations