
Split ghcide actions into different descriptors #1857

Merged
merged 7 commits into from May 25, 2021

Conversation

berberman
Collaborator

This PR attempts to split the big code action provider in ghcide. I wanted to minimize changes to the implementation functions (suggestExtendImport, suggestImportDisambiguation, etc.), so I refactored Development.IDE.Plugin.CodeAction.Args to keep the automatic passing of function arguments, making it compatible with the old code. However, if I understand correctly, the provider should be broken down into different providers, rather than into descriptors as this PR does. One of our initial goals is to make those code actions switchable, but providers are anonymous, whereas descriptors have names, so it's not clear how to generate configuration per provider. Thus this PR introduces some new descriptors for ghcide code actions. This might not be the right split, but with the CodeActionArgs abstraction it won't be hard to reorganize them into providers.

For now the code actions are grouped somewhat blindly, and I haven't tested them.

@pepeiborra
Collaborator

We want descriptors, not providers, so this PR is doing the right thing. I may have said provider where I really meant descriptor.

@berberman berberman marked this pull request as ready for review May 23, 2021 02:21
@pepeiborra
Collaborator

@jneira Does the split have the granularity that you expected? It seems fine to me, but it might not be enough to support disabling specific code actions.

@jneira
Member

jneira commented May 23, 2021

Fine with me too. If this PR opens the way to disabling code actions, even only in small groups, we can wait for issues asking for finer granularity.

Collaborator

@pepeiborra pepeiborra left a comment


Thanks, awesome work @berberman

@berberman berberman changed the title [WIP] Split ghcide actions into different descriptors Split ghcide actions into different descriptors May 24, 2021
@berberman berberman merged commit 13a2cc2 into haskell:master May 25, 2021
@berberman berberman deleted the ghcide-ca-split branch May 25, 2021 13:33
@pepeiborra
Collaborator

Out of curiosity I just checked the benchmark results, and they look quite impressive:

  • code actions: total time doesn't change, but userTime drops to roughly 15% of upstream. I'm not sure where this improvement comes from (did we change to use delayed data?), but it's nice.
  • code actions after edit: total time does decrease significantly (~35%), which is probably thanks to the extra parallelism.
version    name                             success   samples   startup              setup                   userTime             delayedTime             totalTime            maxResidency   allocatedBytes
upstream   code actions                     True      50        2.7157943            2.1442910000000003e-2   7.908476607000001    2.5013517999999995e-2   7.938987232000001    171MB          6844MB
HEAD       code actions                     True      50        2.676736845          1.5433488e-2            1.183338935          7.195596829000001       8.384399064          170MB          6933MB
upstream   code actions after edit          True      50        2.455734591          0.0                     34.89700874300001    2.9194995000000005e-2   34.932517633         195MB          29005MB
HEAD       code actions after edit          True      50        2.566527158          0.0                     14.851208827999997   8.068489619000003       22.927151044000002   195MB          18714MB

@Ailrun
Member

Ailrun commented May 25, 2021

What happened to memory allocation (for "code actions after edit" bench)? That's quite impressive.

@berberman
Collaborator Author

I'm not sure where this improvement comes from (did we change to use delayed data?)

Previously, we waited for all the rules used to finish running, including TypeCheck, GetHieAst, GetBindings, etc., and only then computed code actions. But only suggestHideShadow uses the typechecked AST and the HIE AST, so there is no need for the other code action computations to wait for typechecking to finish. This PR makes rule results shared per group, so only the code actions in bindingsPluginDescriptor are delayed until typechecking finishes.

If I'm right, I guess moving suggestHideShadow out into a standalone plugin descriptor would make it even faster 🤔

@berberman
Collaborator Author

We could also enable parallelism within each group, but I'm not sure whether contention would make it worse, given:

-- | There's no concurrency in each provider,
-- so we don't need to be thread-safe here
onceIO :: MonadIO m => IO a -> m (IO a)
onceIO io = do
  var <- liftIO $ newIORef Nothing
  pure $
    readIORef var >>= \case
      Just x -> pure x
      _ -> io >>= \x -> writeIORef' var (Just x) >> pure x
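(For reference: if concurrency within a provider were ever enabled, a thread-safe variant could serialise evaluation through an MVar instead of a plain IORef. A self-contained sketch under that assumption, not part of this PR; `onceIOSafe` is a hypothetical name, simplified to plain IO:)

```haskell
{-# LANGUAGE LambdaCase #-}
import Control.Concurrent.MVar
import Data.IORef

-- A thread-safe variant: modifyMVar holds the lock during the first
-- evaluation, so even concurrent callers cannot run the action twice.
onceIOSafe :: IO a -> IO (IO a)
onceIOSafe io = do
  var <- newMVar Nothing
  pure $ modifyMVar var $ \case
    Just x  -> pure (Just x, x)
    Nothing -> io >>= \x -> pure (Just x, x)

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  cached <- onceIOSafe (modifyIORef' counter (+ 1) >> readIORef counter)
  a <- cached
  b <- cached
  runs <- readIORef counter
  if a == 1 && b == 1 && runs == 1
    then putStrLn "ok"
    else error "underlying action ran more than once"
```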

@pepeiborra
Collaborator

Do we need that onceIO at all? Shake already guarantees that these rules will only run once per session, and subsequent reruns will be lookups, so the onceIO can be dropped.

@berberman
Collaborator Author

But a delayed action will still be enqueued, no? Let's try removing onceIO and see what happens.

@berberman
Collaborator Author

After removing onceIO:

version    name                             success   samples   startup              setup                   userTime             delayedTime
upstream   code actions                     True      50        2.079015637          2.4707264000000003e-2   3.6805140430000005   5.06126036
HEAD       code actions                     True      50        2.554838365          0.0                     1.3539628190000002   7.1702762180000015
upstream   code actions after edit          True      50        2.210849848          0.0                     18.546111556         6.303753765000001
HEAD       code actions after edit          True      50        2.427225427          0.0                     24.017535580999997   7.096961165000001

@pepeiborra
Collaborator

pepeiborra commented May 26, 2021

But a delayed action will still be enqueued, no? Let's try removing onceIO and see what will happen.

So you are saying that we will end up with multiple copies of the same redundant delayed action? This is true, but I'm assuming that the actual work will only be done once.

What does the benchmark show, is upstream the version with onceIO and HEAD the version without?

@berberman
Collaborator Author

berberman commented May 26, 2021

What does the benchmark show, is upstream the version with onceIO and HEAD the version without?

Yes, I think HEAD is the version without onceIO: berberman@6201d94

@pepeiborra
Collaborator

The numbers in the benchmark look contradictory. Why is userTime ~66% lower in the "code actions" experiment, but ~33% higher in the "code actions after edit" one?

@berberman
Collaborator Author

berberman commented May 26, 2021

I re-ran the benchmark CI. It seems the gap has narrowed, but HEAD is still slightly slower than upstream overall.

version    name                             success   samples   startup              setup          userTime             delayedTime             totalTime            maxResidency   allocatedBytes
upstream   code actions                     True      50        2.039180623          0.0            1.6012565329999997   5.546552721000003       7.152452846          169MB          6887MB
HEAD       code actions                     True      50        2.318210653          1.2344288e-2   1.1096260069999995   7.146050902000002       8.261057349          170MB          6945MB
upstream   code actions after edit          True      50        2.171567757          0.0            21.922519571000002   5.966397669             27.894771285         196MB          26335MB
HEAD       code actions after edit          True      50        2.4681653640000003   0.0            22.032106895000002   6.8835301719999995      28.921297008000003   195MB          26841MB

@pepeiborra
Collaborator

Based on those results, I would remove the onceIO. Otherwise, contributors will think that they need to do their own caching and we'll end up duplicating state and introducing subtle bugs.

@berberman
Collaborator Author

berberman commented May 27, 2021

Otherwise, contributors will think that they need to do their own caching

I understand your concern, but the situation here is quite different, and I'm still a bit worried about the overhead of executing those IO actions many times. For example, the ToCodeAction instances wrap a function like:

suggestExtendImport :: ExportsMap -> ParsedSource -> Diagnostic -> [(T.Text, CodeActionKind, Rewrite)]

to something like:

f :: CodeActionArgs -> IO [(T.Text, CodeActionKind, [TextEdit])]
f CodeActionArgs {..} = do
  x <- caaExportsMap
  y <- fmap astA <$> caaAnnSource
  case y of
    Just ps -> do
      let results = suggestExtendImport x ps caaDiagnostic
      concat
        <$> mapM
          ( \(a, b, c) -> do
              d <- caaDf
              e <- fmap annsA <$> caaAnnSource
              case (d, e) of
                (Just df, Just anns)
                  | Right edit <- rewriteToEdit df anns c -> pure [(a, b, edit)]
                _ -> pure []
          )
          results
    _ -> pure []

where

data CodeActionArgs = CodeActionArgs
  { caaExportsMap :: IO ExportsMap,
    caaIdeOptions :: IO IdeOptions,
    caaParsedModule :: IO (Maybe ParsedModule),
    caaContents :: IO (Maybe T.Text),
    caaDf :: IO (Maybe DynFlags),
    caaAnnSource :: IO (Maybe (Annotated ParsedSource)),
    caaTmr :: IO (Maybe TcModuleResult),
    caaHar :: IO (Maybe HieAstResult),
    caaBindings :: IO (Maybe Bindings),
    caaGblSigs :: IO (Maybe GlobalBindingTypeSigsResult),
    caaDiagnostic :: Diagnostic
  }

As you can see, for N results produced by suggestExtendImport, caaDf is executed N times and caaAnnSource is executed N + 1 times. I don't know if our experiments reflect this fact. Although Shake lets us avoid unnecessary runs of rules in defineEarlyCutoff', without the help of onceIO, runAction will enqueue many duplicate DelayedActions into the action queue. Even worse, functions like suggestExtendImport in the same plugin descriptor will not be able to share rule results, ending up with even more runActions. Apart from that, the ExportsMap may also be computed repeatedly.
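The caching effect in question can be demonstrated in isolation (a self-contained sketch, not HLS code — `expensive` stands in for a field like caaDf): without onceIO, an IO field consulted once per result runs once per result; with it, the underlying work happens only once.

```haskell
{-# LANGUAGE LambdaCase #-}
import Control.Monad (forM_)
import Data.IORef

-- Same shape as the onceIO in this PR, specialised to IO.
onceIO :: IO a -> IO (IO a)
onceIO io = do
  var <- newIORef Nothing
  pure $ readIORef var >>= \case
    Just x  -> pure x
    Nothing -> io >>= \x -> writeIORef var (Just x) >> pure x

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  let expensive = modifyIORef' counter (+ 1) >> pure "df"
  -- Without caching: the action runs once per consumer, N times for N results.
  forM_ [1 .. 5 :: Int] (const expensive)
  raw <- readIORef counter
  -- With onceIO: the first call does the work, later calls hit the cache.
  writeIORef counter 0
  cached <- onceIO expensive
  forM_ [1 .. 5 :: Int] (const cached)
  onceN <- readIORef counter
  putStrLn $ "raw runs: " ++ show raw ++ ", cached runs: " ++ show onceN
```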

@pepeiborra
Collaborator

I see what you mean now, thank you for elaborating on the argument. I agree that this is a bit different, because every provider may execute an Action multiple times. This is odd!
