Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do-block syntax in Elixir confuses show-smartparens-mode #707

Closed
gausby opened this issue Feb 7, 2017 · 30 comments · Fixed by #1058
Closed

Do-block syntax in Elixir confuses show-smartparens-mode #707

gausby opened this issue Feb 7, 2017 · 30 comments · Fixed by #1058

Comments

@gausby
Copy link
Contributor

gausby commented Feb 7, 2017

The Elixir syntax is a bit unforgiving because there are a couple of shorthands which currently confuses smartparens 'show pairs' functionality, such as:

screen shot 2017-02-07 at 20 56 35

Do-blocks can be written as do (expressions) end and , do: expression.

I don't know if there are a way to support this in the show-parens mode.

@Fuco1
Copy link
Owner

Fuco1 commented Feb 8, 2017

Ouch :)

This should be no problem, if the syntax requires do: we can decide on the : if there is an associated end or not.

@gausby
Copy link
Contributor Author

gausby commented Feb 8, 2017

I think ignoring do:'s on pairs would suffice :)

There are also some problems with matching defp's. Could it be because of the order in this list: https://github.com/Fuco1/smartparens/blob/master/smartparens-elixir.el#L39-L52 ?—and should I open another issue?

@gausby
Copy link
Contributor Author

gausby commented Feb 8, 2017

Also…as I read the code: it will look for a do, go to the start of the line and look for one of the keywords.

Unfortunately, if this is the case: that assumption does not suffice: The do doesn't have to be on the same line as the keyword. An example could be this function defintion where I use guards:

  defp drop_tailing_bits(%__MODULE__{size: size} = bitfield)
  when rem(size, 8) == 0 do
    {:ok, bitfield}
  end

Am I correct in my understanding of the code, and should I open another issue for this one?

@Fuco1
Copy link
Owner

Fuco1 commented Feb 8, 2017

Yep, it works as you say. I guess we don't need another issue, maybe you could open a PR with added failing tests displaying these cases so I can then fix the issues. What do you think?

@gausby
Copy link
Contributor Author

gausby commented Feb 9, 2017

I will look into it asap, but I am kinda busy until Sunday.

Thanks for a great package, btw :)

@Fuco1 Fuco1 added this to New in Elixir Feb 20, 2017
@Fuco1 Fuco1 moved this from New to In Progress in Elixir Apr 27, 2017
@Fuco1 Fuco1 added this to the Backlog milestone Dec 4, 2017
@g-belmonte
Copy link

@Fuco1 I'm fairly new to emacs and elisp, so I'm sorry if the question sounds dumb, but...
wouldn't something like this solve the issue?

here: https://github.com/Fuco1/smartparens/blob/master/smartparens-elixir.el#L41
like this:

...
 (when (equal (rx "do[^:]") id)
...

Are there any issues using a regex in this place?

@Fuco1
Copy link
Owner

Fuco1 commented Dec 8, 2019

I don't remember what the smartparens code does there, but we need to make sure this do: construct doesn't expect a corresponding end. Which the regular do does expect.

It might still work though. If you use elixir you can test it and if it works we can create a patch.

@nicobao
Copy link

nicobao commented Mar 18, 2020

Hi! This error still pops up in latest version of elixir-mode. Or at least, that's what I am experimenting. Is the issue supposed to be fixed already?
(and thanks a lot for your work! not sure I know enough about Emacs Lisp to help but still...)

@nicobao
Copy link

nicobao commented Mar 19, 2020

Hi! Here is another example where similar errors occur in Elixir: with the quote do construct.
defmodule_end
defmodule_quote_do_end

If someone is ready to provide some initial guidance, I'd be happy to help.

@Fuco1
Copy link
Owner

Fuco1 commented Mar 22, 2020

The quote do should already be handled by sp-elixir-def-p, so if that doesn't work there's likely a bug in that code.

As for the inline do:, can you provide some minimal valid elixir code where this causes a problem? I do not use elixir or ruby very much.

@axelson
Copy link

axelson commented Mar 22, 2020

@Fuco1 Here is a minimal example with do:

defmodule ElixirExampleOne do
  def func_one do
    IO.puts("hi 1")
  end

  def func_two, do: IO.puts("hi 2")
end

The def for func_two ends up paired with the end on the last line which is incorrect.

Another related bit of syntax that is interpreted incorrectly is anonymous functions, in this example func_two is a function that returns an anonymous function. The end inside func_two should be paired with fn, but instead it is paired with the def of func_two:

defmodule ElixirExampleTwo do
  def func_one do
    IO.puts("hi 1")
  end

  def func_two do
    fn name -> IO.puts("hello " <> name) end
  end
end

The fn ... end could be solved separately or in a different PR, but I mention it since it is another incorrect pairing and it might be easier to solve both of them at the same time rather than one after another (I'm not familiar enough with the code to really know).

You can streamline the examples even further by removing func_one from each of them.

@andreyorst
Copy link
Contributor

@Fuco1 I'm interested in working on this, as I experience this problem and it makes it a lot harder to navigate code. Any points towards where I should look at? E.g. Docs or maybe there are already languages supported with similar feature?

@andreyorst
Copy link
Contributor

@axelson, sorry to disturb you, but it seems that I've fixed the do: issue, and fn ... end is already handled by Smartparens. However I would like to ask if often see linebreaks before do: or even do forms in the code, e.g.:

defmodule ElixirExampleOne do
  def func_one
  do
    IO.puts("hi 1")
  end

  def func_two,
    do: IO.puts("hi 2")
end

Because currently Smartparens is completely broken for this particular case and my PR does not yet fix this. The question is if this needs to be fixed, although I suppose this will be a llittle more performance hungry.

@axelson
Copy link

axelson commented Dec 6, 2020

@andreyorst Thanks for taking a stab at this issue! Those two forms are indeed valid elixir code, although the first one is not seen very often, usually there is a when clause as a guard before the do. Here's a couple longer examples:

defmodule ElixirExampleTwo do
  def can_destroy_message(nil, _), do: {:error, %{reason: "not_found"}}

  def can_destroy_message(%{destroyed_at: destroyed_at}, _)
      when not is_nil(destroyed_at),
      do: {:error, %{reason: "not_found"}}

  def can_destroy_message(%{room_id: message_room_id}, %{room_id: member_room_id})
      when message_room_id != member_room_id,
      do: {:error, %{reason: "unauthorized"}}

  def can_destroy_message(%{content: content} = message, room_member)
      when not is_nil(content) and not is_nil(room_member) do
    # implementation
  end
end

The def being on a separate line from the do is rather common in Elixir code, so it would be great if it was well-handled.

@andreyorst
Copy link
Contributor

Good news, I've managed to make this case work last night:

defmodule ElixirExampleTwo do
  def can_destroy_message(nil, _),
    do: {:error, %{reason: "not_found"}}

  def can_destroy_message(%{destroyed_at: destroyed_at}, _)
      when not is_nil(destroyed_at),
      do: {:error, %{reason: "not_found"}}

  def can_destroy_message(%{room_id: message_room_id}, %{room_id: member_room_id})
      when message_room_id != member_room_id,
      do: {:error, %{reason: "unauthorized"}}

  def can_destroy_message(%{content: content} = message, room_member)
      when not is_nil(content) and not is_nil(room_member) do
    # implementation
  end
end

However your example doesn't quite work. The difference is in the newline before first do:. I'm not sure if I can fix this.

@andreyorst andreyorst mentioned this issue Dec 7, 2020
4 tasks
@andreyorst
Copy link
Contributor

@axelson if you can test implementation from #1058 on some more or less big codebases (I don't write in Elixir and don't know big projects except maybe the language server), and provide some more examples where it breaks it would be very helpful :)

@andreyorst
Copy link
Contributor

Yay, I've fixed that bug with do: not being parsed when it is on the same line! Now all code I've seen so far is parsed correctly. Relevant changes are in #1058

@axelson
Copy link

axelson commented Dec 8, 2020

Great! I'm having some trouble getting the git version of smartparens installed locally but I should be able to figure it out eventually.

@andreyorst
Copy link
Contributor

andreyorst commented Dec 8, 2020

Great! I'm having some trouble getting the git version of smartparens installed locally but I should be able to figure it out eventually.

If you use straight and use-package it's easy:

(use-package smartparens
  :straight (:host github :repo "andreyorst/smartparens" :branch "elixir-better-search"))

If not, I guess you can simply replace smartparens-elixir.el with code from my repo https://raw.githubusercontent.com/andreyorst/smartparens/elixir-better-search/smartparens-elixir.el and bytecompile it

@andreyorst
Copy link
Contributor

andreyorst commented Dec 9, 2020

I guess that the only quirk left out now is bodyless clauses. I'll try to handle it as well.

Also, I've noticed that without (modify-syntax-entry ?_ "w") Smartparens will think that end_ is also a valid pair. @Fuco1 is this intentional?

@andreyorst
Copy link
Contributor

andreyorst commented Dec 10, 2020

Uh oh. I've also noticed that this is a valid syntax:

defmodule Example do
  def foo(do: body) do
    body
  end
end

Example.foo do
  IO.puts("1")
  IO.puts("2")
end

Which currently breaks all heuristics I've introduced...

And given that this all can be minified to singlle line like this

defmodule Example do def foo(do: body) do body end end; Example.foo do IO.puts("1"); IO.puts("2") end

Making this extremely hard to find matching keywords with my approach. I guess that language with such syntax should use a proper parser instead, something like tree-sitter, to make structural navigation really reliable.

Edit: OK, I might be able to fix this, but I'm not sure if other things will not pop out

@Fuco1
Copy link
Owner

Fuco1 commented Dec 10, 2020

I was thinking of using something like LSP for parsing but it might be too slow for interactive navigation. Maybe fetching entire file's AST or some partial AST around point would work, but it's probably a lot of work.

Once you include some test cases for the new work I'll be happy to merge this. Alternatively, if either of you wants to maintain the elixir extension, I'll give you push access.

@andreyorst
Copy link
Contributor

andreyorst commented Dec 10, 2020

I was thinking of using something like LSP for parsing but it might be too slow for interactive navigation.

@Fuco1 have you heard about https://tree-sitter.github.io/tree-sitter/? It is amazingly fast and provides all necessary info for structural navigation and manipulation. Albeit it is an external library that must be loaded separately, but there are works in Emacs for this https://github.com/ubolonton/emacs-tree-sitter.

Once you include some test cases for the new work I'll be happy to merge this. Alternatively, if either of you wants to maintain the elixir extension, I'll give you push access.

I don't think I'll go as far as Elixir extension maintainer, as I don't program in it :) At least rightnow. I've just decided to help by fixing the issue, and maybe some others that I have not yet looked at.

@axelson
Copy link

axelson commented Dec 11, 2020

I am a maintainer of the Elixir Language Server, so I'm curious how we'd potentially leverage the language server from smartparens. Are there any examples from other languages? Also I'm not currently well-versed enough in elisp to feel comfortable writing elisp for a use-case like this.

@andreyorst with the ElixirExampleTwo above and your code (546274a) I'm not seeing the first do for the overall module matching with the end on the very last line. Hmm, actually it looks like if I hover over the defmodule then the final end is highlighted, but if I hover over the final end then the defmodule is not highlighted.

Also, I'm very curious about TreeSitter and would like to investigate that more in the future, although last time I checked it didn't seem quite ready to use, but perhaps that's changed by now.

@andreyorst
Copy link
Contributor

andreyorst commented Dec 11, 2020

if I hover over the final end then the defmodule is not highlighted.

Hm, that's weird. Here's how it works for me

image

Here's a really good talk on Tree Sitter: https://www.youtube.com/watch?v=Jes3bD6P0To, and at 5:03 Tree Sitter's author explains why LSP should not be used for such features as syntax highlighting and structural navigation. In short - embedding several language servers into smartparens will almost turn smartparens into another LSP client, so perhaps, if a server provides such features as strucural navigation (I don't know any server that does that), it is better to make smartparens work with LSP-mode or eglot instead.

although last time I checked it didn't seem quite ready to use, but perhaps that's changed by now.

If you mean in Emacs, than maybe yeah. In Atom it is fairly useful since around 2018. Although support for new languages comes at a quite slow pace, but I guess that's because VSCode pushes LSP for highlighting syntax.

@nicobao
Copy link

nicobao commented Dec 11, 2020

if I hover over the final end then the defmodule is not highlighted.

Hm, that's weird. Here's how it works for me

Here's a really good talk on Tree Sitter: https://www.youtube.com/watch?v=Jes3bD6P0To, and at 5:03 Tree Sitter's author explains why LSP should not be used for such features as syntax highlighting and structural navigation. In short - embedding several language servers into smartparens will almost turn smartparens into another LSP client, so perhaps, if a server provides such features as strucural navigation (I don't know any server that does that), it is better to make smartparens work with LSP-mode or eglot instead.

The video you mentioned explains that tree-sitter is more adapted for Atom and GitHub because they deal with plenty of languages.
As of smartparens, it's the same thing. But the question is: do all editors have an equivalent of smartparens? Besides, isn't it the point of an LSP to decouple IDE-capibilities from specific frontend? Finally, is providing excellent syntax highlighting without any mistakes for all the possible grammars really feasible without some errors here and there?

LSP specification now supports semantic tokens: https://microsoft.github.io/language-server-protocol/specifications/specification-3-16/#textDocument_semanticTokens

The LSP Server maintainers can implement it using tree-sitter or any other parser they prefer, and support syntax highlighting from there.

However - I do agree that syntax highlighting is a little special, because it's something you always want to work, and language server means I/O, which means errors. Nothing prevent lsp clients to keep in cache the latest syntax information though.
Besides, it's better if syntax highlighting work similarly between multiple languages: it's confusing to have different theme for multiple languages so I am a little conflicted here. This is easier to deal with if you have one client-side project - like smartparens - to deal with syntax highlighting for all the languages.

At the end of the day, tree-sitter is just a parser, and LSP is just a protocol. Whether you use tree-sitter from LSP or directly on the client side is a matter of choice, but they are in no way incompatible.

@axelson
Copy link

axelson commented Dec 12, 2020

if I hover over the final end then the defmodule is not highlighted.

Hm, that's weird. Here's how it works for me

Probably something specific to my setup (and likely related to something that Doom is adding). I haven't noticed any other issues so 👍 from me on merging the PR.

@andreyorst
Copy link
Contributor

But the question is: do all editors have an equivalent of smartparens?

@nicobao Didn't seen that coming, but yeah, actually a lot of text editors have smartparens-like plugins. Don't really understand why this is relevant though.

is providing excellent syntax highlighting without any mistakes for all the possible grammars really feasible without some errors here and there?

Tree sitter was implemented with this in mind - to be able to parse correctly as many languages as possible, with a lot of ambiguity and a way to recover from error state while still providing useful syntax tree. This is mostly what Smartparens does, except it is much simpler than tree-sitter as it features small regexp based parser for each language it supports. And as we see in this issue, some languages may require a more complete parsing to work reliably.

Also, perhaps this was an analogy, but just to make sure - Smartparens does not provide syntax highlighting at all. Using syntax highlighting info from a language server just for querying the syntax tree seems overkill to me. Treesitter on the other hand provides just the syntax tree that can be analysed however you want to.

Anyway, I think this gets away from the main topic.

@axelson, @Fuco1 I was able to implement searchers that work for minified version of the code, where everything is in single line, but this actually makes parsing very slow. For files that contain about 100 files the delay between highlighing and jumping at the top level form is significant. I guess we can keep the line based search, that can't detect everything but still a bit better than what we have ATM?

@axelson
Copy link

axelson commented Dec 14, 2020

@axelson, @Fuco1 I was able to implement searchers that work for minified version of the code, where everything is in single line, but this actually makes parsing very slow. For files that contain about 100 files the delay between highlighing and jumping at the top level form is significant. I guess we can keep the line based search, that can't detect everything but still a bit better than what we have ATM?

👍 from me on that approach. I don't think that keeping everything in a single line is very common in elixir (and most people use the formatter which wraps at 100 characters).

@andreyorst
Copy link
Contributor

@Fuco1 I've added tests, and worked out most edge cases I could reliably do. I consider this as the point of as far as I can go right now.

@Fuco1 Fuco1 moved this from In Progress to Done in Elixir Mar 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

6 participants