[![Binder](img/badge-binder.svg)](https://mybinder.org/v2/gh/nhirschey/teaching/gh-pages?filepath=assignment-signal-portfolio.ipynb)&emsp;
[![Script](img/badge-script.svg)](/Teaching//assignment-signal-portfolio.fsx)&emsp;
[![Notebook](img/badge-notebook.svg)](/Teaching//assignment-signal-portfolio.ipynb)

&#32; | Student Name | Student Number
--- | --- | ---
**1** | &#32; | &#32;
**2** | &#32; | &#32;


**Signal Name (e.g., Book to Market):**

**Signal Code (e.g., be_me):**

This is an assignment. You may work in pairs (two students), using either student's signal to answer the questions below. Sections labeled **Task** ask you to do each piece of analysis; make sure that you complete all of them. Make use of the course resources and example code on the course website. It should be possible to complete every task using the information given below or elsewhere on the course website.



In [None]:
#r "nuget:FSharp.Data"
#r "nuget: FSharp.Stats"
#r "nuget: Plotly.NET, 2.0.0-preview.17"




In [None]:
#r "nuget: Plotly.NET.Interactive, 2.0.0-preview.17"




In [None]:
open System
open FSharp.Data
open Plotly.NET
open FSharp.Stats


In [None]:
// Set dotnet interactive formatter to plaintext
Formatter.Register(fun (x:obj) (writer: TextWriter) -> fprintfn writer "%120A" x )
Formatter.SetPreferredMimeTypesFor(typeof<obj>, "text/plain")
// Make plotly graphs work with interactive plaintext formatter
Formatter.SetPreferredMimeTypesFor(typeof<GenericChart.GenericChart>,"text/html")


## Load Data

First, make sure that you're referencing the correct files.

Here I'm assuming that you have a class folder containing this notebook and a `data` folder. The folder hierarchy should look like the one below, with these files and folders accessible:

```code
/class
    Portfolio.fsx
    Common.fsx
    notebook.ipynb
    /data
        id_and_return_data.csv
        zero_trades_252d.csv
    
```


In [None]:
let [<Literal>] ResolutionFolder = __SOURCE_DIRECTORY__
Environment.CurrentDirectory <- ResolutionFolder

#load "Portfolio.fsx"
open Portfolio

#load "Common.fsx"
open Common


### Data files

We assume that the `id_and_return_data.csv` file and the signal CSV file are in the `data` folder. In this example the signal file is `zero_trades_252d.csv`. Replace that file name with the name of your own signal file.



In [None]:
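let [<Literal>] IdAndReturnsFilePath = "data/id_and_return_data.csv"
// Set this to the path of your signal file, e.g. "data/zero_trades_252d.csv".
let [<Literal>] MySignalFilePath = ""
// Set this to a descriptive name for your strategy.
let strategyName = ""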


If the paths are correct, this code reads the first few lines of each file.
If it does not show the first few lines, fix the file paths above.



In [None]:
IO.File.ReadLines(IdAndReturnsFilePath) |> Seq.truncate 5


seq
  ["id(string),eom(date),source(string),sizeGrp(string),obsMain(string),exchMain(string),primarySec(bool),gvkey(string),iid(string),permno(int Option),permco(int Option),excntry(string),curcd(string),fx(string),common(bool),compTpci(string),crspShrcd(int Option),compExchg(string),crsp_exchcd(int Option),adjfct(float Option),shares(float Option),gics(int Option),sic(int Option),naics(int Option),ff49(int Option),ret(float Option),retExc(float Option),prc(float Option),marketEquity(float Option)";
   "crsp_86432,2000-01-31T00:00:00.0000000,CRSP,micro,1,1,true,115876,01,86432,16313,USA,USD,1,true,,11,,3,2,5.218,40101010,6020,522110,45,-0.003906,-0.00824925,15.9375,83.161875";
   "crsp_85640,2000-01-31T00:00:00.0000000,CRSP,small,1,1,true,002193,01,85640,20300,USA,USD,1,true,,11,,1,1,102.496,35102020,8051,623110,11,-0.157143,-0.161485863,3.6875,377.954";
   "crsp_86430,2000-01-31T00:00:00.0000000,CRSP,micro,1,1,true,115946,01,86430,16319,USA,USD,1,true,,11,,3,1,10.764,45103010,7372,511

In [None]:
IO.File.ReadLines(MySignalFilePath) |> Seq.truncate 5


seq
  ["id(string),eom(date),signal(float option)"; "comp_001034_01,2008-12-31T00:00:00.0000000,5";
   "comp_001043_01,2000-01-31T00:00:00.0000000,5"; "comp_001076_02,2010-12-31T00:00:00.0000000,"; ...]


Assuming those paths are correct, the code below will work.
I put all of the prep code in one block so that it is easy to run.



In [None]:
/// Type for Id and returns csv data
type IdAndReturnsType = 
    CsvProvider<Sample=IdAndReturnsFilePath,
                // The schema parameter is not required,
                // but I am using it to override some column types
                // to make filtering easier.
                // If I didn't do this these particular columns 
                // would have strings of "1" or "0", but explicit boolean is nicer.
                Schema="obsMain(string)->obsMain=bool,exchMain(string)->exchMain=bool",
                ResolutionFolder=ResolutionFolder>

/// Type for the signal data
type MySignalType = 
    CsvProvider<MySignalFilePath,
                ResolutionFolder=ResolutionFolder>

/// Id and returns data indexed by security id and month
let msfBySecurityIdAndMonth =
    IdAndReturnsType.GetSample().Rows
    |> Seq.map(fun row -> 
        let id = Other row.Id
        let month = DateTime(row.Eom.Year,row.Eom.Month,1)
        let key = id, month
        key, row)
    |> Map    

/// Signal data indexed by security id and month
let signalBySecurityIdAndMonth =
    MySignalType.GetSample().Rows
    |> Seq.choose(fun row -> 
        // we'll use choose to drop the security if the signal is None.
        // The signal is None when it is missing.
        match row.Signal with
        | None -> None // choose will drop these None observations
        | Some signal ->
            let id = Other row.Id
            let month = DateTime(row.Eom.Year,row.Eom.Month,1)
            let key = id, month
            // choose will convert Some(key,signal) into
            // (key,signal) and keep that.
            Some (key, signal))
    |> Map    

/// Investment universe for each month, indexed by month.
let securitiesByFormationMonth =
    msfBySecurityIdAndMonth
    |> Map.values
    |> Seq.groupBy(fun x -> DateTime(x.Eom.Year, x.Eom.Month,1))
    |> Seq.map(fun (ym, obsThisMonth) -> 
        let idsThisMonth = [ for x in obsThisMonth do Other x.Id ]
        ym, idsThisMonth)
    |> Map

/// Function to get investment universe for a month.
let getInvestmentUniverse formationMonth =
    match Map.tryFind formationMonth securitiesByFormationMonth with
    | Some securities -> 
        { FormationMonth = formationMonth 
          Securities = securities }
    | None -> failwith $"{formationMonth} is not in the date range"
   
/// Function to get signal for entire investment universe.
let getMySignals (investmentUniverse: InvestmentUniverse) =
    let getMySignal (securityId, formationMonth) =
        match Map.tryFind (securityId, formationMonth) signalBySecurityIdAndMonth with
        | None -> None
        | Some signal ->
            Some { SecurityId = securityId 
                   // if a high signal means low returns,
                   // use `-signal` here instead of `signal`
                   Signal = signal }
    let listOfSecuritySignals =
        investmentUniverse.Securities
        |> List.choose(fun security -> 
            getMySignal (security, investmentUniverse.FormationMonth))    
    
    { FormationMonth = investmentUniverse.FormationMonth 
      Signals = listOfSecuritySignals }


/// Function to get market cap for security in a month.
let getMarketCap (security, formationMonth) =
    match Map.tryFind (security, formationMonth) msfBySecurityIdAndMonth with
    | None -> None
    | Some row -> 
        match row.MarketEquity with
        | None -> None
        | Some me -> Some (security, me)

/// Function to get returns for a security in a month.
let getSecurityReturn (security, formationMonth) =
    // If the security has a missing return, assume that we got 0.0.
    // Note: If we were doing excess returns, we would need 0.0 - rf.
    let missingReturn = 0.0
    match Map.tryFind (security, formationMonth) msfBySecurityIdAndMonth with
    | None -> security, missingReturn
    | Some x ->  
        match x.Ret with 
        | None -> security, missingReturn
        | Some r -> security, r

let isObsMain (security, formationMonth) =
    match Map.tryFind (security, formationMonth) msfBySecurityIdAndMonth with
    | None -> false
    | Some row -> row.ObsMain

let isPrimarySecurity (security, formationMonth) =
    match Map.tryFind (security, formationMonth) msfBySecurityIdAndMonth with
    | None -> false
    | Some row -> row.PrimarySec

let isCommonStock (security, formationMonth) =
    match Map.tryFind (security, formationMonth) msfBySecurityIdAndMonth with
    | None -> false
    | Some row -> row.Common

let isExchMain (security, formationMonth) =
    match Map.tryFind (security, formationMonth) msfBySecurityIdAndMonth with
    | None -> false
    | Some row -> row.ExchMain

let hasMarketEquity (security, formationMonth) =
    match Map.tryFind (security, formationMonth) msfBySecurityIdAndMonth with
    | None -> false
    | Some row -> row.MarketEquity.IsSome

/// Data filters
let myFilters securityAndFormationMonth =
    isObsMain securityAndFormationMonth &&
    isPrimarySecurity securityAndFormationMonth &&
    isCommonStock securityAndFormationMonth &&
    isExchMain securityAndFormationMonth &&
    hasMarketEquity securityAndFormationMonth

/// Function to filter investment universe by filters.
let doMyFilters (universe:InvestmentUniverse) =
    let filtered = 
        universe.Securities
        // my filters expect security, formationMonth
        |> List.map(fun security -> security, universe.FormationMonth)
        // do the filters
        |> List.filter myFilters
        // now convert back from security, formationMonth -> security
        |> List.map fst
    { universe with Securities = filtered }

/// Months in the sample where we can form portfolios.
let sampleMonths = 
    let startSample = 
        msfBySecurityIdAndMonth.Keys
        |> Seq.map(fun (id, dt) -> dt)
        |> Seq.min
    let endSample = 
        let lastMonthWithData = 
            msfBySecurityIdAndMonth.Keys
            |> Seq.map(fun (id, dt) -> dt)
            |> Seq.max
        // The end of sample is the last month when we have returns.
        // So the last month when we can form portfolios is one month
        // before that.
        lastMonthWithData.AddMonths(-1) 
    getSampleMonths (startSample, endSample)
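
The prep code above relies on two patterns: `Seq.choose` to drop rows whose signal is missing, and a `Map` keyed by `(id, month)` for lookups. The sketch below shows both on a tiny made-up dataset. The rows and the names `toyRows` and `toySignalByIdAndMonth` are hypothetical and are not part of the course files.

```fsharp
open System

// Hypothetical rows: (security id, end-of-month date, optional signal).
let toyRows =
    [ "A", DateTime(2020, 1, 31), Some 1.5
      "B", DateTime(2020, 1, 31), None
      "C", DateTime(2020, 2, 29), Some 0.2 ]

// Keep only rows that have a signal and key them by
// (id, first day of the month), as the prep code does.
let toySignalByIdAndMonth =
    toyRows
    |> Seq.choose (fun (id, eom, signal) ->
        match signal with
        | None -> None // choose drops this row
        | Some s -> Some ((id, DateTime(eom.Year, eom.Month, 1)), s))
    |> Map

// "A" has a signal in January 2020, so the lookup succeeds.
let found = Map.tryFind ("A", DateTime(2020, 1, 1)) toySignalByIdAndMonth
// "B" had a missing signal, so it never made it into the map.
let missing = Map.tryFind ("B", DateTime(2020, 1, 1)) toySignalByIdAndMonth
```

Here `found` is `Some 1.5` and `missing` is `None`, which is why the lookup functions in the prep code all match on `Map.tryFind` results.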


## Start of assignment

> **Task:** Complete the below function. It should take a month, a strategy name, and a number `n` of portfolios as input. The output should be a list of assigned portfolios, where stocks are assigned to `n` portfolios by sorts on the signal. I've included type signatures to constrain the output to the correct type.
> 
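
As a rough illustration of what sorting stocks into `n` portfolios means, the sketch below sorts a made-up list of signals and splits it into groups of near-equal size. It is only an assumption about the general idea: the course's `assignSignalSort` may handle ties and breakpoints differently, and `toySignals` and `sortIntoGroups` are hypothetical names.

```fsharp
// Hypothetical (security id, signal) pairs; not from the course data.
let toySignals =
    [ "A", 0.9; "B", 0.1; "C", 0.5; "D", 0.7; "E", 0.3; "F", 0.2 ]

/// Sort by signal (low to high), split into at most `n` groups of
/// near-equal size, and number the groups starting from 1.
let sortIntoGroups (n: int) (signals: list<string * float>) =
    signals
    |> List.sortBy snd
    |> List.splitInto n
    |> List.mapi (fun i group -> i + 1, group |> List.map fst)

let terciles = sortIntoGroups 3 toySignals
// [(1, ["B"; "F"]); (2, ["E"; "C"]); (3, ["D"; "A"])]
```

Group 1 holds the lowest signals and group 3 the highest, which mirrors how portfolio index 1 and index `n` are interpreted in the tasks below.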



In [None]:
let formAssignedPortfolios (ym: DateTime) (strategyName: string) (n: int) : list<AssignedPortfolio> =
    ym
    |> getInvestmentUniverse
    |> doMyFilters
    |> getMySignals
    |> assignSignalSort strategyName n

formAssignedPortfolios sampleMonths[2] strategyName 2

[{ PortfolioId = Indexed { Index = 1
                           Name = "Piotroski F-Score" }
   FormationMonth = 3/1/2000 12:00:00 AM
   Signals =
    [{ SecurityId = Other "crsp_26331"
       Signal = 0.0 }; { SecurityId = Other "crsp_39029"
                         Signal = 0.0 }; { SecurityId = Other "crsp_48697"
                                           Signal = 0.0 }; { SecurityId = Other "crsp_69033"
                                                             Signal = 0.0 }; { SecurityId = Other "crsp_75388"
                                                                               Signal = 0.0 };
     { SecurityId = Other "crsp_76215"
       Signal = 0.0 }; { SecurityId = Other "crsp_76740"
                         Signal = 0.0 }; { SecurityId = Other "crsp_77427"
                                           Signal = 0.0 }; { SecurityId = Other "crsp_79379"
                                                             Signal = 0.0 }; { SecurityId = Other "crsp_80307"
        

> **Task:** Using your `formAssignedPortfolios` function, calculate quintile portfolios (5 portfolios) for your signal for August 2017. Assign the result to `aug2017Assignments5`. I've assigned type signatures to constrain the output to the correct type.
> 



In [None]:
let aug2017Assignments5: list<AssignedPortfolio> =
    formAssignedPortfolios (DateTime(2017,8,1)) strategyName 5

aug2017Assignments5

[{ PortfolioId = Indexed { Index = 1
                           Name = "Piotroski F-Score" }
   FormationMonth = 8/1/2017 12:00:00 AM
   Signals =
    [{ SecurityId = Other "crsp_76934"
       Signal = 0.0 }; { SecurityId = Other "crsp_88173"
                         Signal = 0.0 }; { SecurityId = Other "crsp_11368"
                                           Signal = 1.0 }; { SecurityId = Other "crsp_11511"
                                                             Signal = 1.0 }; { SecurityId = Other "crsp_12411"
                                                                               Signal = 1.0 };
     { SecurityId = Other "crsp_12413"
       Signal = 1.0 }; { SecurityId = Other "crsp_12573"
                         Signal = 1.0 }; { SecurityId = Other "crsp_12577"
                                           Signal = 1.0 }; { SecurityId = Other "crsp_13652"
                                                             Signal = 1.0 }; { SecurityId = Other "crsp_13706"
        

> **Task:** Assign quintile 5 to a value named `quintile5`. I've assigned type constraints to make sure the output is the correct type.
> 



In [None]:
let quintile5: AssignedPortfolio = 
    aug2017Assignments5.[4]

quintile5

{ PortfolioId = Indexed { Index = 5
                          Name = "Piotroski F-Score" }
  FormationMonth = 8/1/2017 12:00:00 AM
  Signals =
   [{ SecurityId = Other "crsp_14231"
      Signal = 7.0 }; { SecurityId = Other "crsp_14271"
                        Signal = 7.0 }; { SecurityId = Other "crsp_14329"
                                          Signal = 7.0 }; { SecurityId = Other "crsp_14401"
                                                            Signal = 7.0 }; { SecurityId = Other "crsp_14408"
                                                                              Signal = 7.0 };
    { SecurityId = Other "crsp_14434"
      Signal = 7.0 }; { SecurityId = Other "crsp_14452"
                        Signal = 7.0 }; { SecurityId = Other "crsp_14478"
                                          Signal = 7.0 }; { SecurityId = Other "crsp_14487"
                                                            Signal = 7.0 }; { SecurityId = Other "crsp_14493"
                       

> **Task:** How many stocks are in quintile 5?

In [None]:
let countQuintile5 = 
    quintile5.Signals
    |> List.distinct
    |> List.length

countQuintile5

648



> **Task:** Calculate value-weights for quintile 5. The result should have type `Portfolio`.
> 
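
For intuition, a value weight is a stock's market capitalization divided by the total market capitalization of the portfolio. The sketch below computes this by hand on made-up numbers. It does not use the course's `giveValueWeights`, and `toyMarketCaps` and `valueWeights` are hypothetical names.

```fsharp
// Hypothetical (security id, market cap) pairs; not from the course data.
let toyMarketCaps = [ "A", 50.0; "B", 30.0; "C", 20.0 ]

/// Weight of each stock = its market cap / total market cap.
let valueWeights (marketCaps: list<string * float>) =
    let total = marketCaps |> List.sumBy snd
    marketCaps |> List.map (fun (id, me) -> id, me / total)

let toyWeights = valueWeights toyMarketCaps
// [("A", 0.5); ("B", 0.3); ("C", 0.2)]
```

The weights sum to one (up to floating-point rounding), so larger stocks take proportionally larger positions.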



In [None]:
let quintile5VW: Portfolio =
    giveValueWeights getMarketCap quintile5

quintile5VW

{ PortfolioId = Indexed { Index = 5
                          Name = "Piotroski F-Score" }
  FormationMonth = 8/1/2017 12:00:00 AM
  Positions =
   [{ SecurityId = Other "crsp_14231"
      Weight = 0.001046385039 }; { SecurityId = Other "crsp_14271"
                                   Weight = 0.0002073260205 }; { SecurityId = Other "crsp_14329"
                                                                 Weight = 0.001612446484 };
    { SecurityId = Other "crsp_14401"
      Weight = 0.0006358262147 }; { SecurityId = Other "crsp_14408"
                                    Weight = 0.0008046652856 }; { SecurityId = Other "crsp_14434"
                                                                  Weight = 8.845944869e-05 };
    { SecurityId = Other "crsp_14452"
      Weight = 4.313042843e-06 }; { SecurityId = Other "crsp_14478"
                                    Weight = 0.0001512561704 }; { SecurityId = Other "crsp_14487"
                                                           

> **Task:** Plot a histogram of the position weights for the stocks in quintile 5.


In [None]:
let weights = [for i in quintile5VW.Positions do i.Weight]

let histogram =
    weights
    |> Chart.Histogram

// Evaluate the chart as the last expression in the cell to display it.
histogram


> **Task:** Calculate the minimum, 5th percentile, 50th percentile, 95th percentile, and maximum of the position weights for quintile 5.
> 


In [None]:
let minimum =
    weights
    |> List.min

let maximum =
    weights
    |> List.max

printfn "The minimum is equal to %f" minimum
printfn "The maximum is equal to %f" maximum

The minimum is equal to 0.000000
The maximum is equal to 0.076159


In [None]:
let percentiles = [0.05; 0.5; 0.95]

let weightInArrayForm =
    weights
    |> List.toArray

let percentilesWeights = 
    [for p in percentiles do
        Quantile.compute p weightInArrayForm]

percentilesWeights

[6.040153243e-06; 0.0003399500363; 0.006313485887]



> **Task:** Calculate the total weight put in quintile 5's top 10 positions when using value weights. How does this compare to the total weight of the top 10 positions if you used equal weights instead of value weights?
> 
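
The sketch below shows the comparison on a made-up four-stock portfolio, assuming that under equal weights each of `N` stocks gets weight `1/N`, so the top `k` positions hold `k/N`. It also shows why the equal-weight calculation needs floats. The names `toyPositionWeights` and `topWeight` are hypothetical.

```fsharp
// Hypothetical value weights for a 4-stock portfolio.
let toyPositionWeights = [ 0.5; 0.25; 0.15; 0.1 ]

/// Total weight held in the `k` largest positions.
let topWeight (k: int) (weights: float list) =
    weights
    |> List.sortDescending
    |> List.truncate k
    |> List.sum

// Value-weighted: sum of the two largest weights.
let top2VW = topWeight 2 toyPositionWeights            // 0.75

// Equal-weighted: each stock gets 1/N, so the top 2 hold 2/N.
let top2EW = 2.0 / float toyPositionWeights.Length     // 0.5

// Pitfall: dividing two ints is integer division and truncates to 0.
let intDivision = 2 / toyPositionWeights.Length        // 0
```

In this toy example the value-weighted portfolio puts 75% in its top two positions, compared with 50% under equal weights.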



In [None]:
let TopVW = 
    weights
    |> List.sortDescending
    |> List.truncate 10
    |> List.sum

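let TopEW =
    // Equal weights give each stock 1/N, so the top 10 positions hold 10/N.
    // Use floats: `10/648` is integer division and evaluates to 0.
    10.0 / float weights.Length

In [None]:
TopVW 

0.348673364


In [None]:
weights.Length

648


In [None]:
TopEW

0.01543209877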


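The top 10 positions hold about 34.9% of the portfolio under value weights, compared with about 1.5% (10/648) under equal weights, so the value-weighted portfolio is far more concentrated in its largest stocks.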