Description
Issue Summary
When using plot_ly() in R with a scatter plot (mode = "lines+markers"), missing (NA) values are expected to create gaps in the line. However, if there are exactly two NA values per category, the line is drawn straight across the missing values instead of leaving a gap.
Interestingly, when the hovertemplate is removed, the line plot behaves as expected (i.e., creating a gap for NA values). This issue only occurs when there are exactly two NA values per category; the code works with any other number of NA values.
Additional Discovery:
The issue is resolved if I include the argument split = ~Category, but I cannot find documentation for split in Plotly, which makes me think it may be deprecated. Moreover, when the hovertemplate is removed, the inclusion of split does not work as expected and does not resolve the issue.
Reproducible Example
The following R code demonstrates the issue:
library(plotly)
df <- data.frame(
  Category = rep(c("A", "B"), each = 6),
  Date = c(2020, 2021, 2022, 2023, 2024, 2025, 2020, 2021, 2022, 2023, 2024, 2025),
  Value = c(10, 15, NA, NA, 20, 25, 12, 14, NA, 22, NA, 27)
)
df$Date <- factor(df$Date, levels = unique(df$Date), ordered = TRUE)
plot_ly(
  df,
  x = ~Date,
  y = ~Value,
  color = ~Category,
  type = 'scatter',
  mode = 'lines+markers',
  text = ~Category,
  hovertemplate = paste0("Date: %{x}<br>Category: %{text}")
)
Expected Behavior
- NA values should create a gap in the line plot, i.e., they should not be connected.
- This works correctly when there is any number of NA values other than exactly two in any category.
Actual Behavior
- When there are exactly two NA values per category, the line is drawn straight across the missing values instead of leaving a gap.
- When the hovertemplate is removed, the lines create a gap as expected.
- The issue only arises when there are exactly two NA values per category; any other instance of NA works fine.
- Including split = ~Category resolves the issue, but:
  - I cannot find any Plotly documentation on split, leading me to believe it may be deprecated.
  - Interestingly, when the hovertemplate is removed, including split does not resolve the issue.
Additional Notes
This seems to be an issue specifically triggered by the combination of NA handling and the hovertemplate. I would appreciate further insight on why this happens or suggestions for a workaround to preserve the gap in the case of two NA values.
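For reference, a minimal sketch of the split workaround mentioned above, using the same data as the reproducible example. To my knowledge split is listed among the arguments on the ?plot_ly help page (one trace per value), but whether it avoids the problem in every case is only what was observed here, not a guarantee. The object name p is just for illustration.

```r
library(plotly)

df <- data.frame(
  Category = rep(c("A", "B"), each = 6),
  Date = rep(2020:2025, times = 2),
  Value = c(10, 15, NA, NA, 20, 25, 12, 14, NA, 22, NA, 27)
)
df$Date <- factor(df$Date, levels = unique(df$Date), ordered = TRUE)

p <- plot_ly(
  df,
  x = ~Date,
  y = ~Value,
  color = ~Category,
  split = ~Category,  # workaround reported above: one trace per category
  type = "scatter",
  mode = "lines+markers",
  text = ~Category,
  hovertemplate = "Date: %{x}<br>Category: %{text}"
)
# print(p) to render the plot
```

As noted above, this only helped while the hovertemplate was present; with the hovertemplate removed, adding split brought the problem back.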
System Info
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows Server 2019 x64 (build 19045)
Matrix products: default
Activity
romanzenka commented on Feb 21, 2025
I looked in depth into what is happening, and the generated plotly.js data is genuinely missing the null values. I will need to investigate deeper; at the moment it looks like a genuine bug in the R code to me.
romanzenka commented on Feb 21, 2025
I found the culprit: the function traceify in plotly_build splits "traces" by looking for trace attributes of the proper length. However, the hidden parameter .plotlyVariableMapping that is used within the trace does not contain actual values; it contains a list of variable names for the trace. If the length of that variable list EXACTLY matches the number of entries in the trace, the list of variables gets cut in two halves, which switches off grouping on whichever trace no longer contains .plotlyGroupIndex. Here you have 12 data points, of which 8 are kept after dropping the NAs, and the number of variables is also 8: x, y, text, hovertemplate, color, .plotlyTraceIndex, .plotlyMissingIndex and .plotlyGroupIndex.
This sounds like very fragile code, and I need to think more about how this could be fixed. The culprit is https://github.com/plotly/plotly.R/blob/aa619dc2fbc2fa786e15a8d11444a18863661ed4/R/plotly_build.R#L996C57-L996C71 - this needs to be replaced with a more robust check for "is this entry splittable?"
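To illustrate the failure mode, here is a simplified base-R sketch. It is not plotly's actual code: split_trace and the vars element are hypothetical stand-ins for the length-based splitting and the variable-name mapping described above, and the x and y values are the 8 non-NA rows from the reporter's data.

```r
# Hypothetical, simplified stand-in for a length-based trace splitter.
# NOT plotly's implementation; it only mimics the "subset anything of length n" rule.
split_trace <- function(trace, group) {
  n <- length(group)
  lapply(unique(group), function(g) {
    keep <- group == g
    # Any element whose length happens to equal n is treated as per-row data.
    lapply(trace, function(z) if (length(z) == n) z[keep] else z)
  })
}

group <- rep(c("A", "B"), each = 4)  # 8 rows remain once the NAs are dropped
trace <- list(
  x = c(2020, 2021, 2024, 2025, 2020, 2021, 2023, 2025),
  y = c(10, 15, 20, 25, 12, 14, 22, 27),
  # Metadata: 8 variable NAMES, not 8 data values.
  vars = c("x", "y", "text", "hovertemplate", "color",
           ".plotlyTraceIndex", ".plotlyMissingIndex", ".plotlyGroupIndex")
)

parts <- split_trace(trace, group)
parts[[1]]$vars  # only the first four names; ".plotlyGroupIndex" is gone
parts[[2]]$vars  # only the last four names
```

Because vars has the same length as the data columns, the sketch slices it like data, so the first part ends up without ".plotlyGroupIndex". A more robust rule would mark which elements are per-row data instead of inferring it from length.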
romanzenka commented on Feb 21, 2025
Here is a "minimal" counterexample: 7 rows with 1 NA (so the number of rows drops to 6), which matches the 6 variables present without hovertemplate. Add another variable, any variable, and it will start working. You can break it with an arbitrary number of rows, as long as the number of non-NA rows is exactly 6.
Shows incorrectly
Should show
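The coincidence can also be checked by simple counting. The sketch below uses the reporter's Value column and the variable names listed earlier in this thread; collides is a hypothetical helper, and the assumption that split = ~Category adds exactly one entry to the variable list is mine, not something confirmed in the plotly source.

```r
# Reporter's data: 12 rows, 4 of them NA in Value.
value <- c(10, 15, NA, NA, 20, 25, 12, 14, NA, 22, NA, 27)
n_kept <- sum(!is.na(value))

# Variable names as listed earlier in this thread.
vars <- c("x", "y", "text", "hovertemplate", "color",
          ".plotlyTraceIndex", ".plotlyMissingIndex", ".plotlyGroupIndex")

# Hypothetical helper: TRUE when the variable list is as long as the kept rows.
collides <- function(var_names, n_rows) length(var_names) == n_rows

with_template    <- collides(vars, n_kept)
without_template <- collides(setdiff(vars, "hovertemplate"), n_kept)
# Assumption: split contributes one extra mapped variable.
split_no_template <- collides(c(setdiff(vars, "hovertemplate"), "split"), n_kept)
```

Under these assumptions with_template is TRUE, without_template is FALSE, and split_no_template is TRUE, which lines up with the reported behavior: broken with the hovertemplate, fine without it, and broken again when split is added without the hovertemplate.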
Protect list of variables