Merge pull request #38 from tidy-finance/assign-portfolio-change
Assign portfolio change
patrick-weiss committed May 17, 2023
2 parents beeed3c + 2576c75 commit 2209bb1
Showing 5 changed files with 65 additions and 52 deletions.
2 changes: 1 addition & 1 deletion changelog.qmd
@@ -5,4 +5,4 @@ title: Changelog
- [Mar. 30, 2023, Issue 29:](https://github.com/tidy-finance/website/issues/29) We upgraded to `tidyverse` 2.0.0 and R 4.2.3 and removed all explicit loads of `lubridate`
- [Feb. 15, 2023, Commit bfda6af: ](https://github.com/tidy-finance/website/commit/bfda6af6169a42f433568e32b7a9cce06cb948ac) We corrected an error in the calculation of the annualized average return volatility in the Chapter [Introduction to Tidy Finance](https://tidy-finance.quarto.pub/website/introduction-to-tidy-finance.html#the-efficient-frontier)
- [Mar. 06, 2023, Commit 857f0f5: ](https://github.com/tidy-finance/website/commit/857f0f5893a8e7e4c2b4475e1461ebf3d0abe2d6) We corrected an error in the label of [Figure 6](https://tidy-finance.quarto.pub/website/introduction-to-tidy-finance.html#fig-106) which wrongly claimed to show the efficient tangency portfolio.
-- [Mar. 09, 2023, Commit fae4ac3: ](https://github.com/tidy-finance/website/commit/fae4ac3fd12797d66a48f43af3d8e84ded694f13) We corrected a typo in the definition of the power utility function in Chapter [Portfolio Performance](https://tidy-finance.quarto.pub/website/parametric-portfolio-policies.html#portfolio-performance). The utility function implemented in the code is now consistent with the text.
+- [Mar. 09, 2023, Commit fae4ac3: ](https://github.com/tidy-finance/website/commit/fae4ac3fd12797d66a48f43af3d8e84ded694f13) We corrected a typo in the definition of the power utility function in Chapter [Portfolio Performance](https://tidy-finance.quarto.pub/website/parametric-portfolio-policies.html#portfolio-performance). The utility function implemented in the code is now consistent with the text.
27 changes: 15 additions & 12 deletions replicating-fama-and-french-factors.qmd
@@ -78,24 +78,27 @@ variables_ff <- me_ff |>
Next, we construct our portfolios with an adjusted `assign_portfolio()` function.\index{Portfolio sorts} Fama and French rely on NYSE-specific breakpoints, form two portfolios in the size dimension at the median and three portfolios in the book-to-market dimension at the 30% and 70% percentiles, and use independent sorts. The sorts for book-to-market require an adjustment to the function in Chapter 9 because the evenly spaced `seq()` used there cannot produce these breakpoints. Instead of `n_portfolios`, we now specify `percentiles`, which takes the breakpoint sequence as an object specified in the function's call. Specifically, we pass `percentiles = c(0, 0.3, 0.7, 1)` to the function. Additionally, as a first step, we perform an `inner_join()` with our return data to ensure that we only use traded stocks when computing the breakpoints.\index{Breakpoints}
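
To see why an evenly spaced `seq()` cannot deliver the 30% and 70% breakpoints, the following minimal sketch compares both probability vectors on simulated data. The vector `bm_example` is a hypothetical stand-in for book-to-market values and is not part of the chapter's data.

```r
# Hypothetical book-to-market values, simulated purely for illustration.
set.seed(1234)
bm_example <- rlnorm(1000)

# Three evenly spaced portfolios imply cut points at 1/3 and 2/3 ...
even_probs <- seq(0, 1, length.out = 3 + 1)

# ... whereas the Fama-French sort cuts at the 30% and 70% percentiles.
ff_probs <- c(0, 0.3, 0.7, 1)

even_breakpoints <- quantile(bm_example, probs = even_probs, names = FALSE)
ff_breakpoints <- quantile(bm_example, probs = ff_probs, names = FALSE)

# Share of observations that lands in each portfolio under both schemes.
even_shares <- prop.table(table(
  findInterval(bm_example, even_breakpoints, all.inside = TRUE)
))
ff_shares <- prop.table(table(
  findInterval(bm_example, ff_breakpoints, all.inside = TRUE)
))
```

Under these assumptions, `even_shares` puts roughly a third of the observations into each portfolio, while `ff_shares` puts about 40% into the middle portfolio.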

```{r}
-assign_portfolio <- function(data, var, percentiles) {
+assign_portfolio <- function(data,
+                             sorting_variable,
+                             percentiles) {
   breakpoints <- data |>
     filter(exchange == "NYSE") |>
-    reframe(breakpoint = quantile(
-      {{ var }},
-      probs = {{ percentiles }},
-      na.rm = TRUE
-    )) |>
-    pull(breakpoint) |>
-    as.numeric()
+    pull({{ sorting_variable }}) |>
+    quantile(
+      probs = percentiles,
+      na.rm = TRUE,
+      names = FALSE
+    )
   assigned_portfolios <- data |>
-    mutate(portfolio = findInterval({{ var }},
+    mutate(portfolio = findInterval(
+      pick(everything()) |>
+        pull({{ sorting_variable }}),
       breakpoints,
       all.inside = TRUE
     )) |>
     pull(portfolio)
   return(assigned_portfolios)
 }
@@ -105,12 +108,12 @@ portfolios_ff <- variables_ff |>
   mutate(
     portfolio_me = assign_portfolio(
       data = pick(everything()),
-      var = me_ff,
+      sorting_variable = me_ff,
       percentiles = c(0, 0.5, 1)
     ),
     portfolio_bm = assign_portfolio(
       data = pick(everything()),
-      var = bm_ff,
+      sorting_variable = bm_ff,
       percentiles = c(0, 0.3, 0.7, 1)
     )
   ) |>
15 changes: 9 additions & 6 deletions size-sorts-and-p-hacking.qmd
@@ -152,22 +152,25 @@ To replicate the NYSE-centered sorting procedure, we introduce `exchanges` as an
 assign_portfolio <- function(n_portfolios,
                              exchanges,
                              data) {
+  # Compute breakpoints
   breakpoints <- data |>
     filter(exchange %in% exchanges) |>
-    reframe(breakpoint = quantile(
-      mktcap_lag,
+    pull(mktcap_lag) |>
+    quantile(
       probs = seq(0, 1, length.out = n_portfolios + 1),
-      na.rm = TRUE
-    )) |>
-    pull(breakpoint) |>
-    as.numeric()
+      na.rm = TRUE,
+      names = FALSE
+    )
+  # Assign portfolios
   assigned_portfolios <- data |>
     mutate(portfolio = findInterval(mktcap_lag,
       breakpoints,
       all.inside = TRUE
     )) |>
     pull(portfolio)
+  # Output
   return(assigned_portfolios)
 }
```
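
The effect of exchange-specific breakpoints can be sketched on simulated data. The data frame `stocks_example` and its values are hypothetical; only the column names `exchange` and `mktcap_lag` follow the function in this hunk.

```r
# Hypothetical cross-section: few large NYSE stocks, many small NASDAQ stocks.
set.seed(42)
stocks_example <- data.frame(
  exchange = rep(c("NYSE", "NASDAQ"), times = c(200, 800)),
  mktcap_lag = c(rlnorm(200, meanlog = 7), rlnorm(800, meanlog = 4))
)

# Breakpoints come from NYSE stocks only, as in the NYSE-centered procedure.
nyse_breakpoints <- quantile(
  stocks_example$mktcap_lag[stocks_example$exchange == "NYSE"],
  probs = seq(0, 1, length.out = 2 + 1),
  na.rm = TRUE,
  names = FALSE
)

# All stocks, regardless of exchange, are assigned with the NYSE breakpoints.
stocks_example$portfolio <- findInterval(
  stocks_example$mktcap_lag,
  nyse_breakpoints,
  all.inside = TRUE
)

# Cross-tabulate exchange and portfolio to inspect the resulting split.
portfolio_counts <- table(stocks_example$exchange, stocks_example$portfolio)
```

In this sketch the NYSE stocks split evenly across the two portfolios, while nearly all simulated NASDAQ stocks fall into the small portfolio. `all.inside = TRUE` matters here because non-NYSE stocks can lie below the smallest or above the largest NYSE breakpoint.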
33 changes: 20 additions & 13 deletions univariate-portfolio-sorts.qmd
@@ -102,27 +102,34 @@ The results indicate that we cannot reject the null hypothesis of average return

## Functional Programming for Portfolio Sorts

-Now we take portfolio sorts to the next level. We want to be able to sort stocks into an arbitrary number of portfolios. For this case, functional programming is very handy: we employ the [curly-curly](https://www.tidyverse.org/blog/2019/06/rlang-0-4-0/#a-simpler-interpolation-pattern-with-)-operator to give us flexibility concerning which variable to use for the sorting, denoted by `var`.\index{Curly-curly} We use `quantile()` to compute breakpoints for `n_portfolios`. Then, we assign portfolios to stocks using the `findInterval()` function. The output of the following function is a new column that contains the number of the portfolio to which a stock belongs.\index{Functional programming}
+Now we take portfolio sorts to the next level. We want to be able to sort stocks into an arbitrary number of portfolios. For this case, functional programming is very handy: we employ the [curly-curly](https://www.tidyverse.org/blog/2019/06/rlang-0-4-0/#a-simpler-interpolation-pattern-with-)-operator to give us flexibility concerning which variable to use for the sorting, denoted by `sorting_variable`.\index{Curly-curly} We use `quantile()` to compute breakpoints for `n_portfolios`. Then, we assign portfolios to stocks using the `findInterval()` function. The output of the following function is a new column that contains the number of the portfolio to which a stock belongs.\index{Functional programming}

In some applications, the variable used for the sorting might be clustered (e.g., at a lower bound of 0). Then, multiple breakpoints may be identical, leading to empty portfolios. Similarly, some portfolios might have a very small number of stocks at the beginning of the sample. Cases where the number of portfolio constituents differs substantially due to the distribution of the characteristics require careful consideration and, depending on the application, might require customized sorting approaches.
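
A minimal sketch of the clustering problem, assuming a hypothetical sorting variable `clustered_example` in which 60% of the observations sit at the lower bound of 0:

```r
# Hypothetical sorting variable: 600 observations at zero, 400 above zero.
set.seed(2023)
clustered_example <- c(rep(0, 600), runif(400))

# Breakpoints for five portfolios, computed as in assign_portfolio().
breakpoints <- quantile(
  clustered_example,
  probs = seq(0, 1, length.out = 5 + 1),
  na.rm = TRUE,
  names = FALSE
)

# The breakpoints at 0%, 20%, and 40% all equal zero.
n_zero_breakpoints <- sum(breakpoints == 0)

portfolios <- findInterval(clustered_example, breakpoints, all.inside = TRUE)

# Count stocks per portfolio while keeping empty portfolios visible.
portfolio_counts <- table(factor(portfolios, levels = 1:5))
```

Because `findInterval()` places a value into the last interval whose lower breakpoint it reaches, all zeros end up in portfolio 3 and portfolios 1 and 2 stay empty. Tabulating against the full set of levels, as above, is one simple way to detect such cases before computing portfolio returns.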

```{r}
-assign_portfolio <- function(data, var, n_portfolios) {
+assign_portfolio <- function(data,
+                             sorting_variable,
+                             n_portfolios) {
+  # Compute breakpoints
   breakpoints <- data |>
-    reframe(
-      breakpoint = quantile(
-        {{ var }},
-        probs = seq(0, 1, length.out = n_portfolios + 1),
-        na.rm = TRUE)
-    ) |>
-    pull(breakpoint) |>
-    as.numeric()
+    pull({{ sorting_variable }}) |>
+    quantile(
+      probs = seq(0, 1, length.out = n_portfolios + 1),
+      na.rm = TRUE,
+      names = FALSE
+    )
+  # Assign portfolios
   assigned_portfolios <- data |>
-    mutate(portfolio = findInterval({{ var }},
+    mutate(portfolio = findInterval(
+      pick(everything()) |>
+        pull({{ sorting_variable }}),
       breakpoints,
       all.inside = TRUE
     )) |>
     pull(portfolio)
+  # Output
   return(assigned_portfolios)
 }
```
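
The switch from `as.numeric()` to `names = FALSE` can be checked in isolation. The sketch below uses a small hypothetical vector, `beta_example`, and base R only; it is meant as an illustration of the two `quantile()` call styles, not as part of the commit.

```r
# Hypothetical sorting variable with one missing value.
beta_example <- c(0.4, 1.2, NA, 0.9, 1.7, 0.1, 1.0)
probs <- seq(0, 1, length.out = 2 + 1)

# By default, quantile() labels its result with percentile names.
named_breakpoints <- quantile(beta_example, probs = probs, na.rm = TRUE)
breakpoint_names <- names(named_breakpoints)

# names = FALSE returns a bare numeric vector, so as.numeric() is not needed.
plain_breakpoints <- quantile(
  beta_example,
  probs = probs,
  na.rm = TRUE,
  names = FALSE
)

# Both routes yield the same numbers.
same_values <- identical(as.numeric(named_breakpoints), plain_breakpoints)

# findInterval() returns NA for the missing input; all.inside = TRUE keeps
# the maximum in the last portfolio instead of opening an extra one.
assigned <- findInterval(beta_example, plain_breakpoints, all.inside = TRUE)
```

Here `assigned` contains only the portfolio numbers 1 and 2 plus one `NA`, which mirrors how the function treats missing values of the sorting variable.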
@@ -135,7 +142,7 @@ beta_portfolios <- data_for_sorts |>
   mutate(
     portfolio = assign_portfolio(
       data = pick(everything()),
-      var = beta_lag,
+      sorting_variable = beta_lag,
       n_portfolios = 10
     ),
     portfolio = as.factor(portfolio)
40 changes: 20 additions & 20 deletions value-and-bivariate-sorts.qmd
@@ -10,9 +10,6 @@ The current chapter relies on this set of packages.
```{r, eval = TRUE, message = FALSE}
library(tidyverse)
library(RSQLite)
-library(scales)
-library(lmtest)
-library(sandwich)
```

## Data Preparation
@@ -92,28 +89,31 @@ data_for_sorts <- data_for_sorts |>
drop_na()
```

-The last step of preparation for the portfolio sorts is the computation of breakpoints. We continue to use the same function allowing for the specification of exchanges to use for the breakpoints. Additionally, we reintroduce the argument `var` into the function for defining different sorting variables via `curly-curly`.\index{Curly-curly}
+The last step of preparation for the portfolio sorts is the computation of breakpoints. We continue to use the same function allowing for the specification of exchanges to use for the breakpoints. Additionally, we reintroduce the argument `sorting_variable` into the function for defining different sorting variables.

```{r}
-assign_portfolio <- function(data, var, n_portfolios, exchanges) {
+assign_portfolio <- function(data,
+                             sorting_variable,
+                             n_portfolios,
+                             exchanges) {
   breakpoints <- data |>
     filter(exchange %in% exchanges) |>
-    reframe(
-      breakpoint = quantile(
-        {{ var }},
-        probs = seq(0, 1, length.out = n_portfolios + 1),
-        na.rm = TRUE)
-    ) |>
-    pull(breakpoint) |>
-    as.numeric()
+    pull({{ sorting_variable }}) |>
+    quantile(
+      probs = seq(0, 1, length.out = n_portfolios + 1),
+      na.rm = TRUE,
+      names = FALSE
+    )
   assigned_portfolios <- data |>
-    mutate(portfolio = findInterval({{ var }},
+    mutate(portfolio = findInterval(
+      pick(everything()) |>
+        pull({{ sorting_variable }}),
       breakpoints,
       all.inside = TRUE
     )) |>
     pull(portfolio)
   return(assigned_portfolios)
 }
```
@@ -132,13 +132,13 @@ value_portfolios <- data_for_sorts |>
   mutate(
     portfolio_bm = assign_portfolio(
       data = pick(everything()),
-      var = bm,
+      sorting_variable = "bm",
       n_portfolios = 5,
       exchanges = c("NYSE")
     ),
     portfolio_me = assign_portfolio(
       data = pick(everything()),
-      var = me,
+      sorting_variable = "me",
       n_portfolios = 5,
       exchanges = c("NYSE")
     ),
@@ -170,22 +170,22 @@ The resulting annualized value premium is 4.608 percent.

In the previous exercise, we assigned the portfolios without considering the second variable in the assignment. This protocol is called independent portfolio sorts. The alternative, i.e., dependent sorts, creates portfolios for the second sorting variable within each bucket of the first sorting variable.\index{Portfolio sorts!Dependent bivariate} In our example below, we sort firms into five size buckets, and within each of those buckets, we assign firms to five book-to-market portfolios. Hence, we have monthly breakpoints that are specific to each size group. The decision between independent and dependent portfolio sorts is another choice for the researcher. Notice that dependent sorts ensure an equal number of stocks within each portfolio.
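
The difference between the two protocols can be sketched on simulated data. The helper `quantile_portfolio()` and the data frame `sorts_example` are hypothetical and exist only for this illustration; the helper computes breakpoints from all observations it receives, without the exchange filter used in `assign_portfolio()`.

```r
library(dplyr)

# Hypothetical cross-section with positively correlated characteristics.
set.seed(99)
sorts_example <- data.frame(me = rlnorm(1000)) |>
  mutate(bm = me * rlnorm(1000))

# Simplified, hypothetical stand-in for assign_portfolio():
# n quantile portfolios based on all values in x.
quantile_portfolio <- function(x, n) {
  breakpoints <- quantile(
    x,
    probs = seq(0, 1, length.out = n + 1),
    na.rm = TRUE,
    names = FALSE
  )
  findInterval(x, breakpoints, all.inside = TRUE)
}

# Independent sorts: each variable is sorted on the full cross-section.
independent <- sorts_example |>
  mutate(
    portfolio_me = quantile_portfolio(me, 2),
    portfolio_bm = quantile_portfolio(bm, 2)
  ) |>
  count(portfolio_me, portfolio_bm)

# Dependent sorts: book-to-market is sorted within each size portfolio.
dependent <- sorts_example |>
  mutate(portfolio_me = quantile_portfolio(me, 2)) |>
  group_by(portfolio_me) |>
  mutate(portfolio_bm = quantile_portfolio(bm, 2)) |>
  ungroup() |>
  count(portfolio_me, portfolio_bm)
```

In this sketch, the dependent sort yields four equally sized cells because the breakpoints are recomputed from all stocks in each size bucket. The independent sort concentrates stocks in the diagonal cells because the two simulated characteristics are positively correlated.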

-To implement the dependent sorts, we first create the size portfolios by calling `assign_portfolio()` with `var = me`. Then, we group our data again by month and by the size portfolio before assigning the book-to-market portfolio. The rest of the implementation is the same as before. Finally, we compute the value premium.
+To implement the dependent sorts, we first create the size portfolios by calling `assign_portfolio()` with `sorting_variable = "me"`. Then, we group our data again by month and by the size portfolio before assigning the book-to-market portfolio. The rest of the implementation is the same as before. Finally, we compute the value premium.

```{r}
 value_portfolios <- data_for_sorts |>
   group_by(month) |>
   mutate(portfolio_me = assign_portfolio(
     data = pick(everything()),
-    var = me,
+    sorting_variable = "me",
     n_portfolios = 5,
     exchanges = c("NYSE")
   )) |>
   group_by(month, portfolio_me) |>
   mutate(
     portfolio_bm = assign_portfolio(
       data = pick(everything()),
-      var = bm,
+      sorting_variable = "bm",
       n_portfolios = 5,
       exchanges = c("NYSE")
     ),
