**Fin 585**  
**Diether**  
**Problem Set**  
**Introduction to Portfolios**  
**Application: Short Selling**

**Overview**

This problem set introduces you to portfolio construction using a real portfolio application from the finance literature. The assignment builds on the concepts and code I introduced last time in class. I think you'll find the "Intro to Portfolios" Jupyter notebook helpful for this assignment.


**Short Selling Background**

When someone shorts a stock, they profit if the price of the stock goes down instead of going up, but short-selling transactions are more complicated than going long (buying a stock and then later selling it).  There are four basic steps to short selling:

1. *The short seller borrows the desired number of shares from someone.* This is usually arranged by the broker, who locates the shares and acts as the middleman between the short seller and the lender (the broker is often both the middleman and the lender). The lender expects to be paid interest on the loan, which is the main cost of shorting. The loan is callable by the lender at any time, and the short seller can repay the loan at any time.

2. *The short seller sells the shares.* The proceeds are put into an interest-bearing account called the collateral account. Most lenders require the collateral account to contain 102% of the value of the proceeds. The collateral account usually invests in low-risk, short-term securities (e.g., Treasury bills). Borrowing the stock carries a lending fee: the short seller pays interest on the loan, and the interest rate is typically small. The overall interest earned on the collateral account is split between the lender and the short seller. The portion of the interest rate received by the short seller is called the rebate rate. The **loan fee** is the portion paid to the lender, and it is equivalent to the interest rate the short seller pays on the loan. Therefore, the **loan fee** is the main direct cost of shorting. The rebate rate can be zero or negative; a negative rebate rate corresponds to a situation where the lender receives all the interest in the collateral account and the short seller pays additional interest out of her pocket to the lender.

3. *Pay any dividends while the loan is open.* The short seller must pay to the lender the cash equivalent of any dividends paid out on the stock. If you short Apple, and Apple pays a 2 dollar dividend per share during the time you short the stock, then you owe the lender 2 dollars for every share you shorted.

4. *Buy the shares back.* The short seller's profit is:
$$
\text{Profit} = \text{Sell} - \text{Buy} - \text{Interest Paid}
$$
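
The four steps can be tied together with a small numerical sketch. The function below is illustrative only: its name and parameters are hypothetical, it assumes simple interest charged on the sale proceeds, it ignores the rebate on the collateral account, and it subtracts the dividends from step 3 as a separate term even though the formula above does not list them.

```python
def short_sale_profit(sell_price, buy_price, shares, loan_fee_annual,
                      years_open, dividends_per_share=0.0):
    """Profit = Sell - Buy - Interest Paid, less any dividends owed to the lender."""
    proceeds = sell_price * shares        # step 2: cash received from selling the borrowed shares
    cost_to_cover = buy_price * shares    # step 4: cash spent buying the shares back
    # Assumption: the loan fee is simple interest on the sale proceeds.
    interest_paid = proceeds * loan_fee_annual * years_open
    # Step 3: dividends paid while the loan is open are passed through to the lender.
    dividends_owed = dividends_per_share * shares
    return proceeds - cost_to_cover - interest_paid - dividends_owed


# Short 100 shares at 50, cover at 45 three months later, with a 3% annual loan fee.
print(f"{short_sale_profit(50.0, 45.0, 100, 0.03, 0.25):.2f}")  # 462.50
# Same trade, but the stock paid a 2 dollar dividend per share along the way.
print(f"{short_sale_profit(50.0, 45.0, 100, 0.03, 0.25, dividends_per_share=2.0):.2f}")  # 262.50
```

The price drop earns 500, the loan fee costs 37.50, and the dividend in the second case costs another 200.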
  
The finance literature has been interested in whether short sellers have good information. Do short sellers typically make money when they short? If loan fees are high, then short sellers are paying a high price to short (that is, to bet that the stock will perform poorly). Therefore, if short sellers are paying a lot to short, those are likely times when they have good information.

In this problem set you will create portfolios based on the lagged loan fee to test the preceding hypothesis. Remember, the returns in the data are from going long (buying and then selling the stock), and your portfolios will reflect that fact. If the returns for a portfolio are very low or negative, then short sellers of those stocks are experiencing high returns.


**Data**

The data are monthly stock data for all stocks in the U.S. with non-missing loan fee data. The basic unit of observation is the stock month. You can download the data directly using the following link: [the data](https://diether.org/prephd/03-mstk_short_02-12.csv). There is also a link on *Learning Suite*. The data contain the following variables:

|Variable | Description                                       |
|---------|---------------------------------------------------|
|permno   | stock identifier                                  |
|caldt    | calendar date                                     |
|ret      | monthly return                                    |
|prclag   | stock price, lagged                               |   
|melag    | market equity, lagged                             |
|feelag   | the loan fee expressed as a percent per annum, lagged |


**Tasks and Questions**  

1. What is the sample period of this data?

2. There are some observations where `feelag` is less than zero. These are data errors. Remove these observations from the dataframe and then compute the summary statistics for `feelag`.

3. Output the median and 90th percentile of `feelag` for every month. This can be done in separate commands.

4. Construct three equal-weight portfolios using `feelag` as the criterion variable. Portfolio 0: an equal-weight portfolio that includes all stocks with `feelag` less than or equal to 3% (loan fees are expressed as the interest rate per annum). Portfolio 1: an equal-weight portfolio that includes all stocks with `feelag` greater than 3% and less than or equal to 5%. Portfolio 2: an equal-weight portfolio that includes all stocks with `feelag` greater than 5%. Hint: use `pd.cut` to bin the data and use my Jupyter notebook from last class as a guide.

5. Compute summary statistics for each of the portfolios you created, including the mean, the standard deviation, and the t-statistic testing whether the mean return of the portfolio is zero.

6. Are your results consistent with the hypothesis that short sellers have good information?


In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.read_csv("03-mstk_short_02-12.csv",parse_dates=['caldt'])
df.head()

Unnamed: 0,permno,caldt,ret,prclag,melag,feelag
0,10001,2005-06-30,0.12843,8.02,21.053,0.15
1,10001,2005-07-29,0.009945,9.05,26.363,0.32701
2,10001,2005-08-31,0.039387,9.14,26.625,0.15
3,10001,2005-10-31,-0.11904,11.51,33.529,0.1625
4,10001,2005-11-30,-0.059397,10.1,29.421,0.15


1. _The sample period runs from June 2002 through July 2012 (122 monthly dates, as the output of cell 4 shows). The data frequency is monthly._
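
One way to read the sample period off the data is to take the first and last value of `caldt` and count the distinct months. The sketch below runs on a tiny made-up frame (the `toy` name and its rows are placeholders, not the assignment data); the same three lines should apply to `df` once `caldt` has been parsed as a date.

```python
import pandas as pd

# Placeholder rows standing in for the real file; only the caldt column matters here.
toy = pd.DataFrame({
    "caldt": pd.to_datetime(["2002-06-28", "2002-07-31", "2002-07-31", "2012-07-31"]),
})

first_date = toy["caldt"].min()
last_date = toy["caldt"].max()
# Convert dates to year-month periods so that several stocks in one month count once.
n_months = toy["caldt"].dt.to_period("M").nunique()

print(first_date.date(), last_date.date(), n_months)
```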

In [3]:
# 2
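# Keep only non-negative fees; .copy() makes the later column assignment act on an independent frame
df = df.loc[df["feelag"] >= 0].copy()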
df["feelag"].describe()

count    419500.000000
mean          1.030662
std           3.764338
min           0.000000
25%           0.095028
50%           0.150000
75%           0.361825
max          98.180000
Name: feelag, dtype: float64

In [4]:
# 3
grouped = df.groupby("caldt")
grouped_med = grouped["feelag"].median()
grouped_90th = grouped["feelag"].quantile(0.9)
print(f"Median:\n{grouped_med}\n\n90th Percentile:\n{grouped_90th}")

Median:
caldt
2002-06-28    0.442105
2002-07-31    0.250000
2002-08-30    0.250000
2002-09-30    0.227160
2002-10-31    0.189690
                ...   
2012-03-30    0.104045
2012-04-30    0.099750
2012-05-31    0.108170
2012-06-29    0.109820
2012-07-31    0.099295
Name: feelag, Length: 122, dtype: float64

90th Percentile:
caldt
2002-06-28    1.87500
2002-07-31    1.55537
2002-08-30    1.62500
2002-09-30    1.75000
2002-10-31    1.39574
               ...   
2012-03-30    3.44536
2012-04-30    3.43792
2012-05-31    3.59560
2012-06-29    3.91192
2012-07-31    4.19535
Name: feelag, Length: 122, dtype: float64
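
The two statistics can also be produced in one table by passing a list of quantiles to the grouped series. This is a sketch on made-up numbers (the `toy` frame and the column names `median` and `p90` are illustrative choices), shown as an alternative to the two separate commands used above.

```python
import pandas as pd

# Made-up fees for two months, five stocks each.
toy = pd.DataFrame({
    "caldt": pd.to_datetime(["2010-01-29"] * 5 + ["2010-02-26"] * 5),
    "feelag": [0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0],
})

# quantile() with a list returns one row per (month, quantile); unstack() moves
# the quantile level into columns so each month occupies a single row.
by_month = toy.groupby("caldt")["feelag"].quantile([0.5, 0.9]).unstack()
by_month.columns = ["median", "p90"]
print(by_month)
```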


In [5]:
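# 4
# Bins are (0, 3], (3, 5], (5, 100]. With right=True each bin includes its upper edge,
# but the first bin excludes 0, so rows with feelag exactly 0 are left unbinned (NaN).
# Passing include_lowest=True would place them in Portfolio 0.
df["binned"] = pd.cut(df["feelag"], bins=[0, 3, 5, 100], labels=["Portfolio 0", "Portfolio 1", "Portfolio 2"], right=True)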
portfolio_zero = df[df["binned"] == "Portfolio 0"]
portfolio_one = df[df["binned"] == "Portfolio 1"]
portfolio_two = df[df["binned"] == "Portfolio 2"]
df

Unnamed: 0,permno,caldt,ret,prclag,melag,feelag,binned
0,10001,2005-06-30,0.128430,8.02,21.053,0.15000,Portfolio 0
1,10001,2005-07-29,0.009945,9.05,26.363,0.32701,Portfolio 0
2,10001,2005-08-31,0.039387,9.14,26.625,0.15000,Portfolio 0
3,10001,2005-10-31,-0.119040,11.51,33.529,0.16250,Portfolio 0
4,10001,2005-11-30,-0.059397,10.10,29.421,0.15000,Portfolio 0
...,...,...,...,...,...,...,...
437770,93436,2012-03-30,0.114640,33.41,3494.800,13.73900,Portfolio 2
437771,93436,2012-04-30,-0.110370,37.24,3916.300,18.09700,Portfolio 2
437772,93436,2012-05-31,-0.109570,33.13,3485.700,13.02700,Portfolio 2
437773,93436,2012-06-29,0.060678,29.50,3103.800,10.79800,Portfolio 2
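
The edge behavior of `pd.cut` matters here because the minimum of `feelag` is 0 after cleaning. The sketch below uses a handful of made-up fee values placed on and around the bin edges to compare the call used in cell 5 with the same call plus `include_lowest=True`.

```python
import pandas as pd

# Made-up fees sitting on and just past each bin edge.
fees = pd.Series([0.0, 0.15, 3.0, 3.01, 5.0, 5.01])
labels = ["Portfolio 0", "Portfolio 1", "Portfolio 2"]

# Same arguments as cell 5: intervals (0, 3], (3, 5], (5, 100].
default_bins = pd.cut(fees, bins=[0, 3, 5, 100], labels=labels, right=True)
# include_lowest=True also admits the left edge of the first interval.
inclusive_bins = pd.cut(fees, bins=[0, 3, 5, 100], labels=labels, right=True,
                        include_lowest=True)

print(pd.DataFrame({"feelag": fees, "default": default_bins,
                    "include_lowest": inclusive_bins}))
```

In the `default` column the fee of 0.0 is missing, while in the `include_lowest` column it lands in Portfolio 0. Fees of exactly 3 and 5 fall in Portfolio 0 and Portfolio 1 in both columns, matching the "less than or equal to" wording of task 4.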


In [6]:
# 5
# These statistics are computed on `feelag`, so they describe the loan fee
# inside each bin rather than the return of each portfolio.
portfolio_zero["feelag"].describe()

portfolio_one["feelag"].describe()

portfolio_two["feelag"].describe()


mean_fee_zero = portfolio_zero["feelag"].mean()
std_dev_zero = portfolio_zero["feelag"].std()

mean_fee_one = portfolio_one["feelag"].mean()
std_dev_one = portfolio_one["feelag"].std()

mean_fee_two = portfolio_two["feelag"].mean()
std_dev_two = portfolio_two["feelag"].std()


# t-statistic for the null hypothesis that the mean is zero: mean / (std / sqrt(n))
t_stat_zero = (mean_fee_zero - 0) / (std_dev_zero / np.sqrt(len(portfolio_zero["feelag"])))

t_stat_one = (mean_fee_one - 0) / (std_dev_one / np.sqrt(len(portfolio_one["feelag"])))

t_stat_two = (mean_fee_two - 0) / (std_dev_two / np.sqrt(len(portfolio_two["feelag"])))


print(f"Portfolio 0:\n Mean feelag: {mean_fee_zero}\n Standard Deviation: {std_dev_zero}\n T-Stat: {t_stat_zero}")
print(f"Portfolio 1:\n Mean feelag: {mean_fee_one}\n Standard Deviation: {std_dev_one}\n T-Stat: {t_stat_one}")
print(f"Portfolio 2:\n Mean feelag: {mean_fee_two}\n Standard Deviation: {std_dev_two}\n T-Stat: {t_stat_two}")

Portfolio 0:
 Mean feelag: 0.3237846534210129
 Standard Deviation: 0.5180659537426436
 T-Stat: 387.35401342925235
Portfolio 1:
 Mean feelag: 3.924816062420125
 Standard Deviation: 0.5806885483104175
 T-Stat: 824.1148074681291
Portfolio 2:
 Mean feelag: 12.375491369224669
 Standard Deviation: 12.03837464810272
 T-Stat: 146.00545311384278
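
Task 5 asks about the return of each portfolio, which is a different calculation from the fee statistics in cell 6. A minimal sketch of one common approach follows, assuming an equal-weight portfolio return is the simple average of `ret` across the stocks in a bin each month, and that the t-statistic uses the time series of those monthly averages. The `toy` data is made up purely to keep the block runnable, so its numbers say nothing about the actual results.

```python
import numpy as np
import pandas as pd

# Made-up data with the same column names as the assignment file (not real results).
toy = pd.DataFrame({
    "caldt": pd.to_datetime(["2010-01-29"] * 4 + ["2010-02-26"] * 4 + ["2010-03-31"] * 4),
    "ret": [0.02, 0.04, -0.01, -0.03, 0.01, 0.03, 0.00, -0.02, -0.02, 0.00, -0.04, -0.06],
    "feelag": [0.2, 1.0, 6.0, 9.0] * 3,
})

# Bin on the lagged fee; integer labels 0, 1, 2 stand for Portfolio 0, 1, 2.
toy["port"] = pd.cut(toy["feelag"], bins=[0, 3, 5, np.inf], labels=[0, 1, 2],
                     include_lowest=True).astype(int)

# Equal-weight return: average ret across stocks within each month and portfolio,
# giving one column of monthly returns per portfolio.
port_ret = toy.groupby(["caldt", "port"])["ret"].mean().unstack("port")

# Time-series statistics of the monthly portfolio returns.
summary = pd.DataFrame({
    "mean": port_ret.mean(),
    "std": port_ret.std(),
    "n_months": port_ret.count(),
})
summary["t_stat"] = summary["mean"] / (summary["std"] / np.sqrt(summary["n_months"]))
print(summary)
```

Because the sample size in the t-statistic is the number of months rather than the number of stock-month rows, the resulting t-statistics are typically far smaller than the ones printed above for `feelag`.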


6. 
_The statistics in cell 6 are computed on `feelag`, not `ret`, so they describe the loan fee inside each bin rather than the return of each portfolio. Portfolio 0, the bin with the lowest fees, averages about 0.32% per annum. Portfolio 1 averages about 3.92%, more than ten times as much, with a similar standard deviation. Portfolio 2 averages about 12.38% with a much larger standard deviation (about 12.04). This ordering follows directly from how the bins were defined, and the very large t-statistics only show that the average fee in each bin is far from zero._

_What these numbers do establish is that shorting the stocks in Portfolio 2 is far more expensive than shorting those in Portfolio 0, so a short seller needs more reason to be confident in order to win on a high-fee trade. They do not show whether that confidence pays off. The hypothesis predicts low or negative long returns for the high-fee portfolio, and checking it requires the mean, standard deviation, and t-statistic of each portfolio's monthly equal-weight `ret`._

_Overall, I still think the hypothesis that short sellers have good information when shorting high-fee stocks is likely to be true, because the cost of shorting those stocks is much greater. That conclusion rests on an assumption about the people who are trading: given that many who short stay in business, I choose to assume they make good decisions. It is not yet supported by the return evidence that the question asks for._