Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add new summarize_weighted_buildout function #141

Merged
merged 38 commits into from Jul 29, 2020
Merged

add new summarize_weighted_buildout function #141

merged 38 commits into from Jul 29, 2020

Conversation

jdhoffa
Copy link
Member

@jdhoffa jdhoffa commented Jul 21, 2020

This is a WIP, and requires #136 to be merged before checks will pass. I open here as a draft just to track my work.

Remains to be done:

  • Add tests
  • Write an article outlining the methodology
  • Refactor (summarize_weighted_buildout is very similar to summarize_weighted_production and I imagine there must be a more DRY way to accomplish this
  • Filter out input data if the start-year production is 0 (as this would result in "infinite" build-out percentage)

Relates to #132

@maurolepore maurolepore added this to In progress in r2dii via automation Jul 23, 2020
@maurolepore
Copy link
Contributor

maurolepore commented Jul 23, 2020

Surprisingly, all calls to join seem to use by and that I expected to suppress this message:

#> Joining, by = c("sector", "technology")

--

library(r2dii.data)
library(r2dii.match)
#> Warning: package 'r2dii.match' was built under R version 4.0.2
devtools::load_all()
#> Loading r2dii.analysis

master <- loanbook_demo %>%
  match_name(ald_demo) %>%
  prioritize() %>%
  join_ald_scenario(ald_demo,
                    scenario_demo_2020,
                    region_isos = region_isos_demo
  )

summarize_weighted_buildout(master)
#> Joining, by = c("sector", "technology")
#> # A tibble: 138 x 4
#>    sector     technology  year weighted_buildout
#>    <chr>      <chr>      <int>             <dbl>
#>  1 automotive electric    2020       0          
#>  2 automotive electric    2021       0.000000765
#>  3 automotive electric    2022       0.00000153 
#>  4 automotive electric    2023       0.00000230 
#>  5 automotive electric    2024       0.00000306 
#>  6 automotive electric    2025       0.00000383 
#>  7 automotive electric    2026       0.00000459 
#>  8 automotive hybrid      2020       0.0000528  
#>  9 automotive hybrid      2021       0.0000501  
#> 10 automotive hybrid      2022       0.0000474  
#> # … with 128 more rows

Created on 2020-07-23 by the reprex package (v0.3.0)

--

Found the issue here. A call to inner_join() was passing to by not the character vector c("sector", "technology") but the columns c(.data$sector, .data$technology).

@maurolepore
Copy link
Contributor

@jdhoffa, is it safe to update the regression references? I can't figure out what is that changed them. The change happens seems to happen between f5d9959 and 0e57f9c but it's not clear to me what it is. Do you know?

The previous approach is too recent in rlang, and would break
unless the user has a recent version of rlang.
@jdhoffa
Copy link
Member Author

jdhoffa commented Jul 28, 2020

@maurolepore yes it's safe to update. Im truly not sure what is causing them to mess up.

Comment on lines +310 to +324
test_that("is sensitive to `use_credit_limit`", {
data <- fake_master(
name_ald = rep(c("company a", "company b"), 2),
year = c(2020, 2020, 2021, 2021),
id_loan = c("i1", "i2", "i1", "i2"),
loan_size_credit_limit = c(20, 30, 20, 30),
production = c(10, 10, 20, 40),
)
out2 <- data %>%
summarize_weighted_percent_change(name_ald, use_credit_limit = TRUE)
out3 <- data %>%
summarize_weighted_percent_change(name_ald, use_credit_limit = FALSE)

expect_false(identical(out2, out3))
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from expectations that I found outside a call to test_that(). I change the code to reflect how I would test that summarize_weighted_percent_change() is sensitive to use_credit_limit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Comment on lines 303 to 329
# TODO: Come back to this after the methodology article is written
# test_that("with known input outputs as expected", {
# # styler: off
data <- fake_master(
name_ald = rep(c("company a", "company b"),2),
year = c(2020, 2020, 2021, 2021),
id_loan = c("i1", "i2", "i1", "i2"),
production = c( 10, 10, 20, 40),
)
out1 <- summarize_weighted_percent_change(data, name_ald)
out1$weighted_percent_change
expect_equal(out1$weighted_percent_change, c(0,0,50,150))

# Is sensitive to `use_credit_limit`
# Reversing loan_size and production outputs reverse result
data2 <- fake_master(
name_ald = rep(c("company a", "company b"),2),
year = c(2020, 2020, 2021, 2021),
id_loan = c("i1", "i2", "i1", "i2"),
loan_size_credit_limit = c( 20, 30, 20, 30),
production = c( 10, 10, 20, 40),
)
out2 <- summarize_weighted_percent_change(data2, name_ald, use_credit_limit = TRUE)
expect_equal(out2$weighted_percent_change, c(0,0,40,180))
# # styler: on
# })

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdhoffa, I'm not sure about removing this. It wasn't inside a test, nor it seems that the code and comments correspond to each other. Can you check this is really to throw away?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, i didn't mean to remove the test entirely! i meant only to remove the TODO comment haha.
I put the test back in

@@ -0,0 +1,61 @@
---
title: "Indicator Choices: Production vs. Percent Change in Production"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "production" enough or do we need to qualify "production" with some word to more precisely distinguish it from "percent change in production"? Without a qualifier, percent in production seems like a subgroup of production when it is instead an alternative.

Three qualifiers that come to my mind are weighted, absolute, and raw.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I elected to add "Weighted" in every case

We aggregate the percent-change in production for each company to the portfolio-level, by using the same loan-weighted average as above. In particular, for each loan $l_j$ to company $j$, we have:
$$ \overline{\chi_i} = \sum_j \left[ \chi_{i,j} * \dfrac{l_j}{\sum_j l_j} \right]$$

It should be noted that the percent change, $\chi$, is undefined for $0$ initial production. Intuitively, this makes sense, since you would require an "infinite percent" build-out to grow to anything from 0. For this reason, any company having $0$ initial production is filtered out prior to calculating the percent change indicator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where it says "$0$ initial production" to replace $0$ with just 0. You seem to refer not to the equation but to the value 0.

Maybe you mean "When initial production ($0$) takes a value of 0 (zero) ... ". If so, maybe you need to reword a bit.


```{r}
# using the master dataset defined in the previous chunk:
summarize_weighted_percent_change(master)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the calls to head() because they seem necessary for tibbles. Is there a reason you prefer head()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I thought they were formatting strangely without head(), but it looks fine to me now.

Copy link
Contributor

@maurolepore maurolepore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdhoffa, I suggested a few changes and made a few comments. If you feel it's ready, then let's merge. If not, I'm happy to review again after you incorporate my review.

@jdhoffa
Copy link
Member Author

jdhoffa commented Jul 29, 2020

I will merge once the checks pass! Thanks @maurolepore

@jdhoffa jdhoffa merged commit 7cf1385 into RMI-PACTA:master Jul 29, 2020
r2dii automation moved this from In progress to Done Jul 29, 2020
@jdhoffa jdhoffa deleted the 132-volume-build-out branch July 29, 2020 10:17
@AlexAxthelm AlexAxthelm removed this from Done in r2dii Sep 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants