The presence of some zero coefficients might cause robyn_allocator to report all coefficients are zero #345

Closed
harell opened this issue Mar 16, 2022 · 24 comments

@harell

harell commented Mar 16, 2022

Project Robyn

Describe issue

Hey Team

1st iteration:

I am running Robyn 3.6.1 and failing to produce meaningful allocation.
When I extract the coefficients of a Robyn model named 1_237_4 from the Robyn::robyn_outputs output, I see that only the coefficients of cinema_spend and print_spend are zero (see attached table).

image

When I run Robyn::robyn_allocator with model 1_237_4, I get the following message, which says that all of my coefficients are zero.

>>> Running budget allocator for model ID 1_237_4 ...
cinema, cpcv, cpm, native, ooh, print, radio, sem, tv are excluded in optimiser because their coeffients are 0

Why does Robyn::robyn_allocator claim that all coefficients of the selected model are zero while the returning object from Robyn::robyn_outputs shows otherwise?

2nd iteration:

In this iteration, I hypothesise that the presence of zero coefficients leads to the above discrepancy.

To test that I:

  • Exclude cinema_spend and print_spend from paid_media_vars (the variable with zero coefficient in the first iteration);
  • Rerun the model; and
  • Call Robyn::robyn_allocator

This time, Robyn::robyn_allocator produces meaningful allocation with the following message

>>> Running budget allocator for model ID 1_248_4 ...

Therefore, it seems that the presence of some zero coefficients led Robyn::robyn_allocator to report that all coefficients are zero.
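
The exclusion step in this experiment can be sketched roughly as follows. This is only an illustration: the coefficient table, its column names (rn, coef), the coefficient values, and the extra channel names are made up and are not taken from the actual model output.

```r
# Hypothetical coefficient table; values are invented for illustration only.
coefs <- data.frame(
  rn   = c("cinema_spend", "print_spend", "tv_spend", "radio_spend"),
  coef = c(0, 0, 0.42, 0.17),
  stringsAsFactors = FALSE
)

# Assumed starting set of media variables.
paid_media_vars <- coefs$rn

# Variables whose coefficient is exactly zero.
zero_coef_vars <- coefs$rn[coefs$coef == 0]

# Reduced set to use when rerunning the model.
reduced_media_vars <- setdiff(paid_media_vars, zero_coef_vars)
print(reduced_media_vars)
```

The matching entries in paid_media_spends would presumably need to be dropped as well, so that both vectors keep the same length and order.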

Provide dummy data & model configuration

See the above experiment

Environment & Robyn version

  • Robyn v3.6.1
  • R version 4.1.1 (2021-08-10)
@laresbernardo
Collaborator

Hi @harell thanks for reporting this issue. Before I start helping you and Kyle debug this issue, can you please check this, follow the steps to check if this issue was solved already in that branch and if not, let us know?

@laresbernardo
Collaborator

Feel free to re-open if this particular error or issue persists when updating to 3.6.2 @harell

@harell
Author

harell commented Apr 18, 2022

This is still an issue. The allocator works when all the channel boundaries are between 0.5 and 2 (on all the channels). But once the boundaries are between 0.9 and 1.1, the allocator prompts the '0' coefficients message.

@kyletgoldberg
Contributor

@harell could you run remotes::install_github("facebookexperimental/Robyn/R") and try this again? If you don't mind, could you also share the actual coefficients that it reports as 0? Thanks again for your patience.

@kyletgoldberg kyletgoldberg reopened this Apr 19, 2022
@harell
Author

harell commented Apr 20, 2022

What commit to master do you think will resolve it?

I tried running the code with tag v3.6.2. The problem persists.

@johnscherrer

@kyletgoldberg I still have this issue. Three channels with 0-coefficients are included in the allocator, and three channels with non zero coefficients (our most contributing channel as well as smaller channels) are excluded.

@laresbernardo
Collaborator

@harell running remotes::install_github("facebookexperimental/Robyn/R") will get you the latest commit. Please try with this instead of v3.6.2 tag, or from commit e04a6ce.
@johnscherrer I'm checking the latest code and I see that we identify coef > 0 channels, and then we keep only those channels. Can you validate that this is happening in your case? If not, can you please try debugging the robyn_allocator() function line by line to check in which part of the code we are incorrectly excluding the non-0 channels?

@johnscherrer

johnscherrer commented Apr 20, 2022

@laresbernardo I'm debugging rows 161-169 (the code section you linked):

The first line coefSelectorSorted <- dt_coefSorted[, coef > 0] works.

Next, chn_coef0 returns all the paid_media_vars, so it prints out all paid impressions variables in the message saying all of them will be excluded in the allocator.

Then, mediaSpendSortedFiltered <- mediaSpendSorted[coefSelectorSorted] returns the correct number of variables (as many as there are coefficients > 0), but some of the returned variables are ones with 0-coefficients. So it seems like this line is where the wrong channels for the allocator are picked out.

Lmk if you need more info. Thank you!

@laresbernardo
Collaborator

Hi @johnscherrer there was def an error in the message displayed. I fixed it earlier today. Now, let me see if I understand what's happening in your case for the last part: it's filtering non 0-coef and leaving some 0-coef variables in mediaSpendSortedFiltered result?

@johnscherrer

@laresbernardo correct

@harell
Author

harell commented Apr 21, 2022

The allocator message reports the zero coefficients now

Before: Using Robyn 3.6.2 I get
image

Now: Using commit 990aa8f I get
image

@laresbernardo
Collaborator

@harell thanks for confirming. Is it fixed for your case then?

@johnscherrer if the error persists, can you help me debug what's wrong with your case? Can you check the sortings? The selection should be fixed now and I think you shouldn't have problems anymore but you say you do. Do you have a reproducible example? I suggest you try running robyn_allocator() line by line to check where and why you get this unexpected behavior.

@johnscherrer

johnscherrer commented Apr 25, 2022

@laresbernardo The error persists for me, but think I found the error source for my case. Will try to explain what goes on when debugging lines 160-173:

  1. Lines 160 to 166 seem to work correctly. I can confirm that coefSelectorSorted sets the 0-coeff boolean labels on the variables correctly.
  2. chn_coef0 <- setdiff(names(coefSelectorSorted), mediaSpendSorted[coefSelectorSorted]) does not return correct values.
    mediaSpendSorted[coefSelectorSorted] returns the correct number of variables, but some of them are the wrong variables (with 0-coefficients), so the output here is wrong. When I run mediaSpendSorted[coefSelectorSorted], the TRUE/FALSE labels are applied by their position in coefSelectorSorted to mediaSpendSorted, in which the media channel names are sorted differently (alphabetically?). Hence it picks out some of the wrong channels due to the difference in sorting.

Example of what happens in my case - if:
mediaSpendSorted = "channel_1", "channel_2", "channel_3", "channel_4", and
coefSelectorSorted = channel_2: TRUE, channel_3: TRUE, channel_1: FALSE, channel_4: FALSE
Then mediaSpendSorted[coefSelectorSorted] returns channel_1, channel_2, when I suppose it should return channel_2, channel_3.

The same method is used at line 173: mediaSpendSortedFiltered <- mediaSpendSorted[coefSelectorSorted]. That's probably the reason for using wrong variables in the budget allocator.
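
The behaviour described above can be reproduced in plain R with the example values. This is a minimal sketch that only borrows the variable names from the discussion; the name-based lookup shown as an alternative is an illustration, not necessarily how Robyn fixed it, and it assumes the selector's names match the channel names exactly.

```r
# Channel names in one order (values taken from the example above).
mediaSpendSorted <- c("channel_1", "channel_2", "channel_3", "channel_4")

# Named logical selector stored in a different order.
coefSelectorSorted <- c(channel_2 = TRUE, channel_3 = TRUE,
                        channel_1 = FALSE, channel_4 = FALSE)

# Logical subsetting is positional: the names on the selector are ignored.
by_position <- mediaSpendSorted[coefSelectorSorted]
print(by_position)

# Name-based alternative: reorder the selector by channel name first.
by_name <- mediaSpendSorted[coefSelectorSorted[mediaSpendSorted]]
print(by_name)

# Channels that should be reported as skipped (coefficient of zero).
skipped <- setdiff(mediaSpendSorted, by_name)
print(skipped)
```

Positional subsetting selects the first two elements regardless of their names, which is why the count is right while the selected channels are wrong.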

Please let me know if this makes sense from your end. Thanks

@laresbernardo
Collaborator

laresbernardo commented Apr 25, 2022

The sorting part should happen here so the order of the variables is always the same (alphabetically), or at least that's what's expected to happen. Can you try adapting the code until it behaves as it should for your variables please? Maybe changing get_rn_order <- media_order could work!

@johnscherrer

johnscherrer commented Apr 25, 2022

You're right. It was the mediaSpendSorted that wasn't sorted alphabetically in my case, because line 128 uses media_order <- order(paid_media_vars) and not media_order <- order(paid_media_spends). I had named the paid_media_vars and paid_media_spends inconsistently, causing different alphabetical orders.
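
A small sketch of how inconsistent naming can produce different alphabetical orders. The variable names below are hypothetical and chosen only to make the two orderings differ; position i in each vector is assumed to refer to the same channel.

```r
# Hypothetical exposure and spend variables, paired by position.
paid_media_vars   <- c("impressions_fb", "grps_tv", "clicks_search")
paid_media_spends <- c("fb_spend", "tv_spend", "search_spend")

# Alphabetical order computed from each vector separately.
order_by_vars   <- order(paid_media_vars)
order_by_spends <- order(paid_media_spends)
print(order_by_vars)
print(order_by_spends)

# Spend names arranged by the order of the exposure names ...
spends_by_vars_order <- paid_media_spends[order_by_vars]
# ... versus spend names sorted alphabetically on their own.
spends_alphabetical <- paid_media_spends[order_by_spends]

print(spends_by_vars_order)
print(spends_alphabetical)

# The two arrangements disagree, so position-based selectors built
# against one of them pick the wrong channels in the other.
print(identical(spends_by_vars_order, spends_alphabetical))
```

With consistently prefixed names (for example fb_impressions and fb_spend), both calls to order() would return the same permutation and the mismatch would not appear.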

Now the issue is resolved. Thanks!

@laresbernardo
Collaborator

laresbernardo commented Apr 25, 2022

That's great!! So I'm fixing this issue right away

@laresbernardo
Collaborator

Please, can you run robyn_update(ref = "6f0e467"), restart your session and test if it works OK for your case? Thanks again @johnscherrer

@johnscherrer

Got the error message: Error in robyn_update(ref = "6f0e467"): could not find function "robyn_update". So I haven't been able to check. I changed one variable name in my paid_media_spends to fix the issue.

@laresbernardo
Collaborator

laresbernardo commented Apr 25, 2022

Ah sorry, this should work then: install_github(repo = "facebookexperimental/Robyn/R", ref = "6f0e467"); robyn_update is a new function you might not have yet. Can you please try without changing the variable name just to be sure this worked? Thanks a lot

@johnscherrer

johnscherrer commented Apr 25, 2022

Getting this error now when running AllocatorCollect:
Error in glued("\nModel ID: {x$dt_optimOut$solID[1]}\nScenario: {scenario}\nMedia Skipped (coef = 0): {paste0(x$skipped, collapse = ',')}\nRelative Spend Increase: {spend_increase_p}% ({spend_increase}{scenario_plus})\nTotal Response Increase (Optimized): {signif(100 * x$dt_optimOut$optmResponseUnitTotalLift[1], 3)}%\nWindow: {x$dt_optimOut$date_min[1]}:{x$dt_optimOut$date_max[1]} ({x$dt_optimOut$periods[1]})\nAllocation Summary:\n {summary}\n", : could not find function "glued" Called from: print(glued("\nModel ID: {x$dt_optimOut$solID[1]}\nScenario: {scenario}\nMedia Skipped (coef = 0): {paste0(x$skipped, collapse = ',')}\nRelative Spend Increase: {spend_increase_p}% ({spend_increase}{scenario_plus})\nTotal Response Increase (Optimized): {signif(100 * x$dt_optimOut$optmResponseUnitTotalLift[1], 3)}%\nWindow: {x$dt_optimOut$date_min[1]}:{x$dt_optimOut$date_max[1]} ({x$dt_optimOut$periods[1]})\nAllocation Summary:\n {summary}\n", scenario = ifelse(x$scenario == "max_historical_response", "Maximum Historical Response", "Maximum Response with Expected Spend"), spend_increase_p = signif(100 * x$dt_optimOut$expSpendUnitDelta[1], 3), spend_increase = formatNum(sum(x$dt_optimOut$optmSpendUnitTotal) - sum(x$dt_optimOut$initSpendUnitTotal), abbr = TRUE, sign = TRUE), scenario_plus = ifelse(x$scenario == "max_response_expected_spend", sprintf(" in %s days", x$expected_spend_days), ""), summary = paste(sprintf("\n- %s:\n Optimizable Range (bounds): [%s%%, %s%%]\n Mean Spend Share (avg): %s%% -> Optimized = %s%%\n Mean Response: %s -> Optimized = %s\n Mean Spend (per time unit): %s -> Optimized = %s [Delta = %s%%]", x$dt_optimOut$channels, 100 * x$dt_optimOut$constr_low - 100, 100 * x$dt_optimOut$constr_up - 100, signif(100 * x$dt_optimOut$initSpendShare, 3), signif(100 * x$dt_optimOut$optmSpendShareUnit, 3), formatNum(x$dt_optimOut$initResponseUnit, 0), formatNum(x$dt_optimOut$optmResponseUnit, 0), formatNum(x$dt_optimOut$initSpendUnit, 3, abbr = TRUE), 
formatNum(x$dt_optimOut$optmSpendUnit, 3, abbr = TRUE), formatNum(100 * x$dt_optimOut$optmSpendUnitDelta, signif = 2)), collapse = "\n ")))

@laresbernardo
Collaborator

That's an error in the print method (i.e. printing AllocatorCollect results). Were you actually able to run robyn_allocator() successfully first?

@johnscherrer

Yes, now the budget allocator picked out the correct channels, without changing variable name!!

@laresbernardo
Collaborator

Ok, that's good news! So the allocator is fixed but you seem to have problems printing the results when running print(allocator_results), right? This is what we are running to print results in a nice format. Would you be able to check these values and see why you get an error? The message is not very informative, given that it's literally printing the whole code without saying "you are missing X value".

@johnscherrer

print(allocator_results) works for me now too. I made a mistake in the code on my end.


4 participants