Improve `$`-completion of method-chaining syntax and more #1469

sorhawell · 2024-01-02T22:13:50Z

What problem did you solve?

with setting.json

    "r.lsp.debug": true,
    "r.session.useWebServer": true,

vscode-R webserver has a simple regex based isolation of code to eval when using the $/@ to trigger in-session completion. This regex will fail for many but the simplest cases.

Instead I suggest to rely on the R parser and the AST to decide what the left-hand-side of e.g. $ operator is.

This will allow multi-line completion, chained method completion, and potentially anything the R parser accepts.

I'm interested in improving vscode-R similar as this patch for Rstudio completions to support completion for method chaining r-polars (... and also R6 and arrow),

How can I check this pull request?

check out branch with vscode and emulate the extension
"r.session.useWebServer": true

example 1

#$-complete after arbitrary parenthesis
(
    iris |> # |> is supported if the R interpreter supports it
    (\(iris) {
        iris$is_alien = FALSE
        head(iris, 3)
    })()
)$

example 2

#$ complete chained R6 methods 
FruitBasket <- R6::R6Class(
  "FruitBasket",
  private = list(content = character()),
  public = list(
    initialize = \() {},
    mutate_content = \(fruit) {
        private$content = c(private$content, fruit)
        self
    },
    add_banana = \() self$clone()$mutate_content("banana"),
    add_apple = \() self$clone()$mutate_content("apple"),
    to_count_list = \() as.list(table(private$content))
  )
)

#$-complete any multiline chain. If it is valid R syntax is works
{
  print("this print statement is not evaluated, because not a child of `$` call in AST");
  FruitBasket$new()$
    add_apple()$
    add_banana(
    
    )$
    add_apple()$
    to_count_list()$
}

example 3

support r-polars completion ;) (disclaimer I have co-authored the polars bindings for R)

# load polars and disable completion () and <- in suggestions
library(polars) #polars as of https://github.com/pola-rs/r-polars/pull/597
cs = polars:::completion_symbols
cs$method = ""
cs$setter = ""


pl$
  DataFrame(
    a = 1:5,
    b = letters[1:5]
  )$
  lazy()$
  select(
    pl$all()$head(5)
  )

example 4 completion for the arrow package (R6)

library(arrow)


arrow_table(iris[,1:2])$
  Slice(2,10)$
  print()$
  Take(5)$

sorhawell · 2024-01-03T13:05:09Z

4b3044d commit adds finish_expression which is related to the rstudio approach of filling in missing parenthesis. I forgot to add that to PR sry.

renkun-ken · 2024-01-06T14:11:12Z

Thanks for working on this! It is definitely a super useful improvement of the current completion.

I notice that the code in scope has to be evaluated (silently) to provide the completion, which makes the completion result accurate. I wonder if it is safe to do so in many use cases?

For example:

df <- data.frame(id = 1:10)
obj <- local({
  file.remove("/path/to/something/important")
  df$x <- 1
  df
}$)

If a user writes some critical code that changes some important state, it seems only desirable that the code should only be evaluated when manually sent to the terminal by user. However, maybe a typo $ could trigger a completion that evaluates the code in the scope, i.e. the code is silently evaluated in the background.

A less damaging example is that the user code to be evaluated might be too time-consuming before the user realizes it. If the code contains C/C++ impl without caring user interruption, then the session might be blocked.

Evaluting any user code other than simple name1$name2$name3 cases seems to suffer such problem. What do you think?

sorhawell · 2024-01-07T21:13:42Z

@renkun-ken

I principally agree with your concern. I will think in practice, it is likely not as problematic. I suggest to add a user option something like "evaluation_level" to control how eager or cautious completion should be.

What do you think of that?

In any case, parsing is a fast first step before evaluation and could likely replace regex at any desired "evaluation_level".

evaluation_level = "super_secure"

If opting for this level. The auto-completion must only extract variables and not run user-defined or package defined generic functions. A malicious developer could sneak in bad side effects disguised as variables via e.g. activeBindings or the generic methods .$, .[, .[[ or .dollarNames. That would also be an issue in the current vscode-R. I think it should be possible to block most of that.

# sneaky evaluation #1
makeActiveBinding("xyz", f = \() {print("hello there");42L}, env = .GlobalEnv)
xyz
#> [1] "hello there"
#> [1] 42

# sneaky evaluation #2
l = list()
class(l) = "TomatoCan"
"$.TomatoCan" = \(self, name) {print("hi again");self[[name]]}
l$foo = "bar"
l$foo
#> [1] "hi again"
#> [1] "bar"

^{Created on 2024-01-07 with reprex v2.0.2}

evaluation_level = "only_generic_getters"

allow (chained) evaluation generic functions like .[ .[[ and .$, but no function calls ( in the AST which are not on the positive-list. This is pretty much what rstudio is running with today I think. BTW .[ in many ways behaves just as function call ( and can be used to circumvent rstudio completion limits. But by convention .[ is expected to only index so that is ok.

example of what could work:

example of what would not work (unless c and list is added to the positive list)

evaluation_level = "only_single_line_scope"

There could be a choice which will only parse the same code line to make a meaningful evaluation.

evaluation_level = "full_evaluation"

Parsing like rstudio, but unlimited evaluation. I will argue if using most package APIs, then there are very few mutable methods out there. The polars and arrow packages is built around immutable structures. Most R objects SEXPs have imuttable behavior, where environment is the central exception. Most R package have APIs where the central objects are imuttable. On top of that polars has lazy evaluation (just as dplyrDB) where evaluation does not trigger computation before explicitly calling collect(). A good R package running low level code, should check for R user interrupts every 100-1000ms and abort if requested.

There are exceptions e.g. data.table set-functions, (file)-connections ...

I think some users would happily opt-in for "full_evaluation".

renkun-ken · 2024-01-08T06:30:08Z

Thanks for the detailed explanation. I think the evaluation_level should be good enough to handle such problem.

sorhawell · 2024-03-27T08:08:13Z

Update. I have been away for a while. I failed to make good safe list of functions which could be evaluated by IDE auto completion to avoid side effects. In the end, I think this feature is something a user has to willingly opt in to.

sorhawell added 2 commits December 28, 2023 01:06

rm trigger from expr string, only eval last expr

2d17b41

use ast to isolate left side

7861dd4

sorhawell changed the title ~~Improve complete~~ Improve $-completion of method-chaining syntax and more Jan 2, 2024

forgot finish_expression

4b3044d

ElianHugh mentioned this pull request Mar 2, 2024

Expand Session watcher when parantheses #1495

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve `$`-completion of method-chaining syntax and more #1469

Improve `$`-completion of method-chaining syntax and more #1469

sorhawell commented Jan 2, 2024 •

edited

Loading

sorhawell commented Jan 3, 2024

renkun-ken commented Jan 6, 2024 •

edited

Loading

sorhawell commented Jan 7, 2024 •

edited

Loading

renkun-ken commented Jan 8, 2024

sorhawell commented Mar 27, 2024

Improve $-completion of method-chaining syntax and more #1469

Are you sure you want to change the base?

Improve $-completion of method-chaining syntax and more #1469

Conversation

sorhawell commented Jan 2, 2024 • edited Loading

What problem did you solve?

How can I check this pull request?

example 1

example 2

example 3

example 4 completion for the arrow package (R6)

sorhawell commented Jan 3, 2024

renkun-ken commented Jan 6, 2024 • edited Loading

sorhawell commented Jan 7, 2024 • edited Loading

evaluation_level = "super_secure"

evaluation_level = "only_generic_getters"

evaluation_level = "only_single_line_scope"

evaluation_level = "full_evaluation"

renkun-ken commented Jan 8, 2024

sorhawell commented Mar 27, 2024

Improve `$`-completion of method-chaining syntax and more #1469

Improve `$`-completion of method-chaining syntax and more #1469

sorhawell commented Jan 2, 2024 •

edited

Loading

renkun-ken commented Jan 6, 2024 •

edited

Loading

sorhawell commented Jan 7, 2024 •

edited

Loading