New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide defguard #2469

Closed
josevalim opened this Issue Jul 1, 2014 · 42 comments

Comments

Projects
None yet
@josevalim
Copy link
Member

josevalim commented Jul 1, 2014

Today implementing a guard is somehow complicated as we need to quote/unquote arguments depending if the function is being invoked inside or outside of a guard. Here is an example from the Record module:

defmacro record?(data, kind) do
  case Macro.Env.in_guard?(__CALLER__) do
    true ->
      quote do
        is_tuple(unquote(data)) and tuple_size(unquote(data)) > 0
          and :erlang.element(1, unquote(data)) == unquote(kind)
      end
    false ->
      quote do
        result = unquote(data)
        is_tuple(result) and tuple_size(result) > 0
          and :erlang.element(1, result) == unquote(kind)
      end
  end
end

The idea is to provide a defguard(p) macro that does all the work automatically and assert the expressions in the guard are valid. The example above would be rewritten as:

defguard record?(data, kind) do
  is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind
end

defguard(p) is extremely limited in scope:

  • the function head cannot pattern match
  • code blocks are not allowed in guards
  • no assignment
  • no local functions

Since we have all limitations above, it is debatable if defguard should use the do/end syntax as it "promotes" code blocks. Here are other alternatives:

defguard record?(data, kind) =
  is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

or:

defguard record?(data, kind),
  is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

or even:

defguard record?(data, kind) when
  is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

Thoughts?

@josevalim josevalim added this to the 1.0 milestone Jul 1, 2014

@kluthen

This comment has been minimized.

Copy link

kluthen commented Jul 1, 2014

I love this future feature but I think that
= might be awkward since this is no matching nor an assignation, even attributes don't use = sign.
when might prove confusing, one might want to implement different guarding techniques based upon other guard.
thus I prefer , or as :
defguard record?(data, kind) as is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

@ericmj

This comment has been minimized.

Copy link
Member

ericmj commented Jul 1, 2014

Even though the scope of what you can do with defguard is very limited i would say that it should use do blocks.

@josevalim josevalim removed this from the 1.0 milestone Jul 1, 2014

@scrogson

This comment has been minimized.

Copy link
Contributor

scrogson commented Jul 1, 2014

Yeah, do: end is more consistent.

On Tuesday, July 1, 2014, Eric Meadows-Jönsson notifications@github.com
wrote:

Even though the scope of what you can do with defguard is very limited i
would say that it should use do blocks.


Reply to this email directly or view it on GitHub
#2469 (comment).

@christhekeele

This comment has been minimized.

Copy link
Contributor

christhekeele commented Jul 1, 2014

I did something like this some time ago. Quite out of date now, and it does no work to enforce the limitations of guards. However, I found using do: end to be quite convenient and consistent.

@chrismccord

This comment has been minimized.

Copy link
Member

chrismccord commented Jul 1, 2014

My vote goes for do/end + friendly docs that says keep it simple.

@meh

This comment has been minimized.

Copy link
Contributor

meh commented Jul 1, 2014

:do all the way.

@alco

This comment has been minimized.

Copy link
Member

alco commented Jul 1, 2014

Somehow this looks like the most natural choice to me:

defguard record?(data, kind) when
     is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

Elixir has body-less functions. This one extends the idea a bit to associate a guard expression (it is obvious here that only things allowed in guards expressions are also allowed in defguard) with a name.


Using do end and limiting what you can do inside is something that hasn't been done and is outright unexpected.

= doesn't work with def* (unless other def* macros start taking it).

This doesn't look Elixirsh:

defguard record?(data, kind),
   is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

It could instead be

defguard :record?, [:data, :kind], 
    is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

to look somewhat similar to how EEx defines functions with arguments. Still, that trailing expression that is not enclosed in a block looks awkward.

@ericmj

This comment has been minimized.

Copy link
Member

ericmj commented Jul 1, 2014

Body-less functions have no logic in elixir, so we shouldnt introduce a type of body-less function that has logic. Additionally, with when you can still think that you could pattern match on the function head, so why use when if it doesn't limit everything.

Seeing a when guard without a body may be even more confusing and the guard has different semantics from normal guards. A defguard cannot have multiple clauses, so what will happen if the guard doesn't evaluate to true? Since there are no other clauses the user may think it will error since that is what function guards do.

Use do ... end and document the semantics of defguard instead.

@nurugger07

This comment has been minimized.

Copy link
Contributor

nurugger07 commented Jul 21, 2014

+1 one on the do...end. This would maintain consistent with macro generation and the language semantics.

@knewter

This comment has been minimized.

Copy link
Contributor

knewter commented Aug 2, 2014

Yup, reading through these comments I'm inclined to prefer the do...end here as well

@devinus

This comment has been minimized.

Copy link
Contributor

devinus commented Aug 8, 2014

, do: works just fine IMHO.

@josevalim josevalim modified the milestone: v1.1.0 Jan 4, 2015

@whatyouhide

This comment has been minimized.

Copy link
Member

whatyouhide commented May 2, 2015

I tried to prototype defguard this morning and it shouldn't be very hard. The only annoying thing that defguard doesn't allow (as far as I understand) is something like this:

defmacro my_guard(arg) do
  arg = operation_on_arg(arg)
  quote do
    unquote(arg)
  end
end

That is, doing something in the macro before entering the quote block. Did any of you have any thoughts about this?

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented May 2, 2015

@whatyouhide that's a very, very good point. Maybe, instead of providing defguard as a Kernel macro, we could provide Macro.defguard/2 (or Macro.to_guard/2), it would receive the quoted expression and the environment and do the operation described above. This way we keep Kernel clean and we are still flexible.

@whatyouhide

This comment has been minimized.

Copy link
Member

whatyouhide commented May 2, 2015

@josevalim so, would we write something like:

defmacro ends_in?(n, digit_or_digits) do
  digits = List.wrap(digit_or_digits)
  Macro.to_guard do
    rem(n, 10) == digits
  end
end

I'm not sure if I got this right :\

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented May 2, 2015

More like this:

defmacro ends_in?(n, digit_or_digits) do
  digits = List.wrap(digit_or_digits)
  expr  = quote do: rem(n, 10) == digits
  vars  = [digits: digits]
  Macro.to_guard(expr, vars, __CALLER__)
end
@vendethiel

This comment has been minimized.

Copy link

vendethiel commented May 2, 2015

I'm not sure why n is not in vars, could you explain?

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented May 2, 2015

It is a bug, both should be there. :)

@josevalim josevalim removed this from the v1.1.0 milestone Aug 10, 2015

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented Aug 10, 2015

I am stepping back on this one.

@josevalim josevalim closed this Aug 10, 2015

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented Aug 13, 2016

Reopening given the mailing list discussion. @whatyouhide if someone needs to process the arguments before quoting, then they shouldn't use defguard. :) We will also go with the third option:

defguard is_record(data, kind) when
  is_tuple(data) and tuple_size(data) > 0 and elem(data, 0) == kind

It is the one that makes most sense because the right side of when needs to be guard expressions.

@josevalim josevalim reopened this Aug 13, 2016

@christhekeele

This comment has been minimized.

Copy link
Contributor

christhekeele commented Aug 13, 2016

So, to reify the spec on this, you want:

  • to create a Kernel macro defguard\1

  • that accepts a single argument of the function definition AST format name(args) when guard clauses

  • that raises an error if there are no when clauses

  • that otherwise splits on the when to give function head name(args) and guards guard clauses

  • that creates two versions of guards: one where every variable reference is wrapped in unquote, and one that unquotes each variable reference once and uses those throughout the guard definition

  • and becomes:

    defmacro function_head do
      case Macro.Env.in_guard?(__CALLER__) do
        true -> quote do: guards_with_all_variables_unquoted
        false -> quote do: guards_with_each_variable_unquoted_once
      end
    end

I still think there are some problems with the when style but my post exploring those options is getting long so I'm going to break this up.

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented Aug 13, 2016

@christhekeele yes. And it should also raise if a non-valid guard expression is used in when.

@christhekeele

This comment has been minimized.

Copy link
Contributor

christhekeele commented Aug 13, 2016

Here's my issue with the when syntax: idiomatically wherever you can specify a when clause (function/macro heads), you can write multiple heads for matching different clauses. So you can do:

def foo(bar) when is_list bar
def foo(bar) when is_map bar

Parallel syntax suggests you could do the following:

defguard foo(bar) when is_list bar
defguard foo(bar) when is_map bar

Of course that makes no sense in this context. The implementation discussed above would generate macros that warn this clause cannot match because a previous clause always matches at compile time. With functions that suggests your guards overlap somehow but they clearly don't in our defguards so it's particularly cryptic to the end user.

The do...end syntax avoids unidiomatic usage of when and fits in better with def, defmacro, defprotocol and peers.

defguard foo(bar) do
  is_list(bar) or is_map(bar)
end

Admittedly, if the guard writer does provide guard clauses things get a little weird:

defguard foo(bar) when is_list bar do
  hd(bar) == :baz
end

If that's translated using do..end syntax, it becomes:

defmacro foo(bar) when is_list bar do
  case Macro.Env.in_guard?(__CALLER__) do
    true -> quote do
      hd(unquote(bar)) == :bar
    end
    false -> quote do
      r = unquote(bar)
      hd(bar) == :bar
    end
  end
end

This will never work inside guards because bar is an AST tuple. If the guard provided is is_tuple bar, then it will match the AST in guards but not outside it.

I can't think of a good way to support guards in defguards at the macro level, perhaps someone really clever could figure it out though. Failing a proposal for that, though, defguard ... when .. do .. end could simply raise. If someone figures something out later, then the implementation can just change to support that use-case, backwards-compatible.


TL;DR: I think the when syntax goes counter to Elixir defxx ... when guard clause conventions so might be a little confusing. Furthermore the do...end syntax is more inline with Elixir conventions for other things like def, defmacro, defprotocol, and defimpl, and could support guard clauses on defguards down the line.

@christhekeele

This comment has been minimized.

Copy link
Contributor

christhekeele commented Sep 17, 2016

This naive one-clause addtion to Macro.validate seems surprisingly robust for scanning guard expressions.

It whitelists the allowed local calls documented here, permits simple variable references, and doesn't allow blocks, remote calls, function definition, assignment, or pattern matching.

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented Nov 19, 2016

@christhekeele a defguard implementation would need to expand the AST because of both external macros (such as Integer.is_odd) and internal macros (like the ones in defmacrop). On the positive side, the expansion can be done by calling Macro.traverse + Macro.expand. Then you can use erl_internal to check if all entries are valid guard expressions or not.

@christhekeele

This comment has been minimized.

Copy link
Contributor

christhekeele commented Mar 8, 2017

Just wanted to post some notes here. I took another deep dive into implementing this last night on top of edge 1.5.x and got a quite a bit further, but still ran into some blockers in isolating what's valid in guards.

This is much trickier than I originally surmised and has really gotten me acquainted with Elixir's internal implementation, it's been a blast. 😃

I'm keeping a log of my investigation here. Documenting it as I go has helped me make many incremental gains over my last attempt, although I've spent at least much time on the writeup as the implementation. If you happen to read your way through it all, I'd appreciate any comments or insights!

@OvermindDL1

This comment has been minimized.

Copy link
Contributor

OvermindDL1 commented Mar 8, 2017

Idea dump:

I'm curious how it would be to implement an is_struct/1/is_struct/2 with this, it seems as it is discussed now it is not possible, however if we want these both to work:

def blah(some_struct) when is_struct(some_struct), do: :blah

def blorp(some_struct) when is_struct(some_struct, MyStruct), do: :blorp

def bleep(some_struct, dyn_struct_name) when is_struct(some_struct, dyn_struct_name), do: :bleep

Then there should be some way to create an unnamed match as well, so perhaps something like this:

defguard is_struct(%{__struct__: struct_name}) when is_atom(struct_name) # Maybe some other checks?

defguard is_struct(%{__struct__: struct_name}, struct_name) when is_atom(struct_name)

Could cause the above first 3 examples of is_struct to de-sugar like this:

def blah(some_struct) when is_struct(some_struct), do: :blah
# Desugars into:
def blah(some_struct = %{__struct__: struct_name_guardmatch_0}) when is_atom(struct_name_guardmatch_0), do: :blah

def blorp(some_struct) when is_struct(some_struct, MyStruct), do: :blorp
# Desugars into:
def blorp(some_struct = %{__struct__: struct_name_guardmatch_0}) when is_atom(struct_name_guardmatch_0) and (struct_name_guardmatch_0 === MyStruct), do: :blorp

def bleep(some_struct, dyn_struct_name) when is_struct(some_struct, dyn_struct_name), do: :bleep
# Desugars into:
def bleep(some_struct = %{__struct__: struct_name_guardmatch_0}, dyn_struct_name) when is_atom(struct_name_guardmatch_0 and (struct_name_guardmatch_0 === dyn_struct_name), do: bleep

Or something like that. It is definitely not a normal macro, very special, but this would give the most power. When 'def' walks over the defguard calls it just has a counter it increments for every usage to make unique variable names and so forth. Even calling is_struct in multiple when branches would be safe, just make multiple internal names for each one unless you really wanted to unify them (no need though as it is all internal and 'should' not cost runtime). You could even optimize calls so that the equals match for the is_struct/2 case is optimized into the match itself, however you could only do that if all when cases used the same struct name, so probably not worth it for the slight tiny matching speed boost.

@ericmj

This comment has been minimized.

Copy link
Member

ericmj commented Mar 8, 2017

@OvermindDL1 It's not possible since guards can have boolean expressions. def foo(bar) when is_struct(Bar) or is_atom(bar) would have to be transformed into multiple function clauses. You can imagine how complicated more complex guards would be. It also goes from defguard being a macro into is_struct being some special form that has to be handled by the compiler or the def macros.

EDIT: Changed the example.

@OvermindDL1

This comment has been minimized.

Copy link
Contributor

OvermindDL1 commented Mar 8, 2017

@ericmj Yeah I'd thought about those when thinking of them before, would probably have to break up the function head into multiple heads and have them call back down to the base defined body. It is doable, but a pain, however it would make for a very powerful defguard.

@ericmj

This comment has been minimized.

Copy link
Member

ericmj commented Mar 8, 2017

@OvermindDL1 It's not possible with defguard because macros can only expand where they are called, they cannot change their surrounding code. It would have to be an entirely different feature implemented directly in the compiler and the def macros.

@OvermindDL1

This comment has been minimized.

Copy link
Contributor

OvermindDL1 commented Mar 8, 2017

@ericmj Correct, that's why I mentioned that It is definitely not a normal macro, though I should have worded that as It can definitely not be implemented with a normal macro, the defguard itself would have to be a special form and the def special form would have to check its head ast to resolve the {:defguard, _, [matchers, when] into the final output. It definitely could not be implemented as a macro for sure without overriding def as well (which might be fine for a test of syntax, but absolutely not as a final input).

@OvermindDL1

This comment has been minimized.

Copy link
Contributor

OvermindDL1 commented Mar 8, 2017

And I just implemented that hack to see how the syntax works. ^.^

A shell session:

blah@blah MINGW64 ~/projects/tmp/defguard
$ iex -S mix
Eshell V8.2  (abort with ^G)
Interactive Elixir (1.4.0) - press Ctrl+C to exit (type h() ENTER for help)
iex>defmodule StructEx do
...>   import Defguard
...>   defguard is_struct(%{__struct__: struct_name}) when is_atom(struct_name)
...> end
{:module, StructEx,
 <<70, 79, 82, 49, 0, 0, 5, 252, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 0, 155,
   131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115,
   95, 118, 49, 108, 0, 0, 0, 4, 104, 2, ...>>, {:is_struct, 1}}
iex> defmodule Testering do
...>   use Defguard
...>   import StructEx
...>   def blah(any_struct) when is_struct(any_struct), do: any_struct
...>   def blah(_), do: nil
...> end
warning: unused import StructEx
{  iex:4
:module
, Testering,
 <<70, 79, 82, 49, 0, 0, 5, 24, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 0, 151,
   131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115,
   95, 118, 49, 108, 0, 0, 0, 4, 104, 2, ...>>, {:blah, 1}}
iex> Testering.blah(%{__struct__: Blah})
%{__struct__: Blah}
iex> Testering.blah(%{blah: 42})
nil
iex> Testering.blah(42)
nil
iex>

As you can see, it works fine. ^.^

I'm also playing with the idea to have defguard accept an optional do/end so it can do compile-time checks and alter the ast directly, but not needed for something as simple as is_struct/1. :-)

My hack is 'just' implemented enough for something this simple to work, would need work to have more, but considering it is less than 60 lines of code to do what I have now and adding more cases is easily expressed with how I have things split up, this could easily become a full defguard hack^H^H^H^Himplementation. ^.^

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented Mar 8, 2017

It's not possible since guards can have boolean expressions. def foo(bar) when is_struct(Bar) or is_atom(bar) would have to be transformed into multiple function clauses. You can imagine how complicated more complex guards would be.

@ericmj is correct. The only scenario where we would accept this trade-off is if someone did a careful study of the generated beam code under all of those circumstances and how BEAM will or won't optimize those patterns.

@OvermindDL1

This comment has been minimized.

Copy link
Contributor

OvermindDL1 commented Mar 8, 2017

@ericmj is correct. The only scenario where we would accept this trade-off is if someone did a careful study of the generated beam code under all of those circumstances and how BEAM will or won't optimize those patterns.

What I'm doing is at each new when it is making a new function head where the bodies are just copied between them. It'd probably be better to move the body to a private function and just call that when the guards match, but this is a lazy hack so far. ^.^

I think that works well since if you do an is_struct(bar) as shown above then the match will very quickly fail regardless if bar is not a map, but these already have to be thought of when using or as any call that constrains the type would cause an early fail anyway (hence why multiple whens are supported on a function anyway). I could dump the core code is curious, but I doubt doing this is any slower than doing it manually in any case, just with a lot less code. :-)

@christhekeele

This comment has been minimized.

Copy link
Contributor

christhekeele commented Mar 9, 2017

@OvermindDL1

This comment has been minimized.

Copy link
Contributor

OvermindDL1 commented Mar 9, 2017

I can't think of what you'd put there.

Yeah I could not come up with a good example either, though I just chalked that up to still having a bit of flu. ^.^

A do block could be used to dynamically build up some conditions, might be good for a database or something maybe? I'm unsure...

@Eiji7

This comment has been minimized.

Copy link
Contributor

Eiji7 commented Mar 9, 2017

How about add to def and defp something like with that just calls method? Is it possible?
Example:

defmodule Example do
  defstruct [:sample]
  defguardp is_not_empty_struct(%Example{sample: sample}) when sample != false
  def prepare_struct(add_sample, struct),
      do: if add_sample, do: Map.put(struct, :sample, "sample"), else: struct
  def method_that_should_have_prepared_struct(struct) when is_not_empty_struct(struct),
      do: # do something with struct ...
  def method_that_should_check_struct(_, struct) with new_struct from &prepare_struct/2
      when is_not_empty_struct(new_struct), do: # do something with struct ...
  # or something similar ...
end

So before call &method_that_should_check_struct/2 we calling: &prepare_struct/2 (with same arguments like: apply(callback_from_with, args) - to prepare data and finally match them in guard same as data from normal caller) and in both cases we use guard matcher or maybe only macro?

Summary: function that when matching calls another function that determines if first function will match and returns extra prepared data, so we do not need to use if or case in function body.
What do you think about it? Am I missed something?

@ericmj

This comment has been minimized.

Copy link
Member

ericmj commented Mar 9, 2017

How about add to def and defp something like with that just calls method? Is it possible?

Anything is possible, but like we have said multiple times, let's move this discussion to a more appropriate forum like the elixir core mailing list or the forum. This issue is about defguard and you are talking about a separate feature which only relation to defguard is that they are both about guards.

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented Mar 9, 2017

Folks, please stop derailing the discussion with other features. defguards is well specified. If you want to suggest alternatives or propose extensions to how guards work, please write a well thought-out proposal to elixir-core, considering the pros and cons.

@josevalim

This comment has been minimized.

Copy link
Member

josevalim commented Jul 2, 2017

Closing in favor of the PR which is pretty much complete at this point.

@josevalim josevalim closed this Jul 2, 2017

@avocade

This comment has been minimized.

Copy link

avocade commented May 25, 2018

Just love this.

@manuel-rubio

This comment has been minimized.

Copy link

manuel-rubio commented Jul 8, 2018

Hey! guys, I was trying this feature with pattern matching and looks like it's not working as I expected or maybe I'm wrong with something basic:

defmodule Data do
  defguard is_capital(<<letter::binary-size(1), _ :: binary>>) when letter >= ?A and letter <= ?Z
  def greet(name) when is_capital(name), do: "Hello, #{name}"
  def greet(thing_name), do: "Hello, to #{thing_name}"
end

The error is:

** (ArgumentError) invalid syntax in defguard is_capital(<<letter::binary-size(1), _::binary>>)
    (elixir) lib/kernel.ex:4623: anonymous fn/2 in Kernel.validate_variable_only_args!/2
    (elixir) lib/enum.ex:737: Enum."-each/2-lists^foreach/1-0-"/2
    (elixir) lib/enum.ex:737: Enum.each/2
    (elixir) lib/kernel.ex:4592: Kernel.define_guard/3
    (elixir) expanding macro: Kernel.defguard/1
    iex:2: Data (module)

Is not let to use pattern matching with defguard?

@fertapric

This comment has been minimized.

Copy link
Member

fertapric commented Jul 8, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment