Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extend self and module_function are not the same #556

Open
jaredbeck opened this issue Apr 8, 2016 · 34 comments
Open

extend self and module_function are not the same #556

jaredbeck opened this issue Apr 8, 2016 · 34 comments

Comments

@jaredbeck
Copy link
Contributor

extend self and module_function are not the same. The guide says:

Favor the use of module_function over extend self when you want to turn a module's instance methods into class methods.
https://github.com/bbatsov/ruby-style-guide#module-function

Unfortunately, extend self keeps the instance methods public, but module_function makes them private.

# `extend self` makes the copies public
module Pub
  extend self
  def x; end
end

class PubClass
  include Pub
end

PubClass.new.x #=> nil

# `module_function` makes the copies private
module Priv
  def x; end
  module_function :x
end

class PrivClass
  include Priv
end

PrivClass.new.x
# NoMethodError: private method `x' called for #<PrivClass:0x007fba1c0e7e90>

Is making them private the point?

@jaredbeck
Copy link
Contributor Author

Per the ruby documentation:

The instance-method versions are made private.
http://ruby-doc.org/core-2.2.2/Module.html#method-i-module_function

@cremno
Copy link

cremno commented Apr 15, 2016

There is another difference that might matter:

No method copying: If you want to modify a method's behavior via meta-programming, you only need to do this in one place

(from http://idiosyncratic-ruby.com/8-self-improvement.html#advantages-of-extend-self)

@rustamgasanov
Copy link

Can someone from maintainers give an answer on this question? Why do you need to prefer one functionality over another. How does it go together?

@tibbon
Copy link

tibbon commented Dec 3, 2016

I'm still hoping to know why as well.

@ifokeev
Copy link

ifokeev commented Apr 11, 2017

Confusing me too

@kke
Copy link

kke commented Apr 12, 2017

I don't see any situation where extend self would actually be the correct choice.

  1. You have a module that provides a utility method such as Math.pi
  2. You want to use pi in the instance methods of your own class without having to type Math.pi all the time

Is there any other situation where you would want to take a module method like Math.pi or FileUtils.mkdir_p as an instance method in your own class?

A: You create a local method def pi; Math.pi; end

  • Constant lookup + Double method lookup on use
  • It makes no sense to have your instances offer .pi so you make it private, now you have the same situation as with module_function but with extra lookups.

B: You include Math

def self vs extend self vs module_function

using def self.pi

include Math will not give you any sort of accessor to .pi, not YourClass.pi, not instance.pi, nothing. You must use Math.pi and that's it.

using extend self with public methods

include Math will give your instances a public instance method .pi.

using extend self with private methods

include Math will give you a private instance method .pi but Math.pi now says "private method pi called"

using module_function :pi

include Math will give your instances a private .pi method and Math.pi still works just fine. You can do something like def pi; super + 1; end in your class to overload it or make it public.

When to use what?

  1. You want to provide a module that adds instance methods to whatever is including it:
module Foo
  private # if you want them to be private
  def foo
  end
end
  1. You want to provide a module that adds instance methods and has some module level utility methods of its' own that should not end up into any instances of classes that include the module:
module Foo
  # won't appear as YourClass.foo or your_instance.foo when you `include Foo`
  def self.foo
    "foo" 
  end

  def bar
    "the bar is open"
  end
end
  1. You want to provide a module that offers singleton/utility methods. If someone happens to include your module, it does pretty much what you would expect it to do. This is what stdlib does.
module Foo
  # Foo.all_good? works, including Foo will give instances a private all_good? method
  module_function
  def all_good?
    true unless false
  end
end

This got a bit lengthy and off topic. After writing this, my solution to the original issue would be to change it to "never use extend self".

@ifokeev
Copy link

ifokeev commented Apr 12, 2017

@kke but what if I need private methods in the module?

@kke
Copy link

kke commented Apr 12, 2017

@ifokeev you shouldn't need instance methods in the module. Private module methods can probably be achieved with private and module_function and your code will still work if included.

Edit: nope, use module_function and private_class_method

@ifokeev
Copy link

ifokeev commented Apr 12, 2017

@kke show me, please, the correct way of using module_function if I have one public and 10 private methods

@kke
Copy link

kke commented Apr 12, 2017

@ifokeev You need to use private_class_method in addition.

module Bedtime
  def now?
    after_midnight?
  end
  module_function :now?

  def after_midnight?
    Time.now.hour < 12
  end
  module_function :after_midnight?
  private_class_method :after_midnight?

  def work_in_the_morning?
    (0..4).cover?(Time.now.wday)
  end
  module_function :work_in_the_morning?
  private_class_method :work_in_the_morning?
end

class Test
  include Bedtime

  def print
    puts now?
  end
end

# Test.new.print => works
# Test.new.now? => private method now? called
# Test.now? => undefined method 'now?'
# Bedtime.after_midnight? => private method after_midnight? called
# Bedtime.now? => works

Btw, FileUtils in stdlib has defined private_module_function that does that with single line: https://apidock.com/ruby/v1_9_3_392/FileUtils/private_module_function/class

@ifokeev
Copy link

ifokeev commented Apr 12, 2017

@kke your code doesn't look good. That's why I prefer extend self. I didn't find any other solution.

@kke
Copy link

kke commented Apr 12, 2017

And your code doesn't work correctly :)

You can write it a bit shorter:

module Foo

  private_class_method def bar(foo)
    puts "yo #{foo}"
  end
  module_function :bar
end

Maybe:

module PrivateModuleMethod
  def module_private(sym)
    module_function sym
    private_class_method sym
  end
end
  
module Foo
  extend PrivateModuleMethod

  module_private def foo(bar)
    puts bar
  end
end

I suppose that with some trickery it could even be def module_private.foo(bar)

But that's another story. The story here is that extend self almost never does what you actually wanted
(or should want) to do. It's pretty but incorrect.

@jaredbeck
Copy link
Contributor Author

After 14 months, this issue has not attracted the attention of any maintainers, so I'm closing it. Thanks to everyone who provided feedback. I think if someone were to summarize the discussion here, a useful PR could be written.

@bbatsov
Copy link
Collaborator

bbatsov commented Jul 3, 2017

After 14 months, this issue has not attracted the attention of any maintainers, so I'm closing it. Thanks to everyone who provided feedback. I think if someone were to summarize the discussion here, a useful PR could be written.

The maintainers are extremely busy. PR welcome!

@printercu
Copy link

module_function does not work with attr_accessor, visibility modifiers and methods which define new ones (ex, ActiveSupport's Module#delegate). So module_function is useful only for very simple cases, while extend self provides robust features.

@jaredbeck @bbatsov Should not this rule be removed?

Greyschale added a commit to gocardless/gc_ruboconfig that referenced this issue Nov 13, 2018
This cop implies that `extend self` and `module_function` behave
the same way, but they actually handle reflections and private method
visibility in slightly different ways. See here for more info:
https://idiosyncratic-ruby.com/8-self-improvement.html

Issue against Rubocop:
rubocop/ruby-style-guide#556
hoffm pushed a commit to codecademy-engineering/codecademy-ruby-style that referenced this issue Dec 13, 2019
@marcandre
Copy link
Contributor

marcandre commented Jul 4, 2020

I wouldn't mind resurrecting this thread.

tldnr: I would reverse that rule, i.e. favor extend self and avoid module_function, or at least not provide any definite recommendation and list the caveats of each if there is disagreement.

I ❤️ extend self.

First, it adds zero cognitive load. Everyone knows what extend does (or should), everyone knows what self is within a Module (or should). module_function requires extra knowledge and having a precise mental model of what it does is difficult.

Failures

A major plus for extend self: it works 🎉

module_function does not work as expected in many cases:

module MyMod
  def foo
  end

  module_function # called too late to work for :foo

  include OtherMod # won't work

  attr_reader :bar # won't work

  alias baz foo # won't work

  delegate ... # won't work

  def_delegator ... # won't work either
end

Visibility

To illustrate visibility issues, here's a typical example in the wild:

module Concurrent
  def dataflow(*inputs, &block)
    dataflow_with(Concurrent.global_io_executor, *inputs, &block)
  end
  module_function :dataflow

  def dataflow_with(executor, *inputs, &block)
    call_dataflow(:value, executor, *inputs, &block)
  end
  module_function :dataflow_with

  private

  def call_dataflow(method, executor, *inputs, &block)
    # ...
  end
  module_function :call_dataflow
end

extend self

What if instead of the bunch of module_function, the author had used extend self?

The only downside of extend self is that if a user actually include Concurrent, then dataflow and dataflow_with are public.

I don't see this as an important downside because I fail to see when or why anyone would call some_instance.dataflow instead of Concurrent.dataflow, but if it happens, it's the same public code being called anyways.

For anyone who really cares, it's easily remedied too:

class MyClass
  include Concurrent
  private *Concurrent.instance_methods
end

# or the module can do it itself:
module Concurrent
  extend self
  def self.included(base)  # Mark our methods as private for anyone including us
    base.private *instance_methods
  end
end

module_function gotchas

Did you spot the problem in the example code?

The main issue is that one can actually call Concurrent.call_dataflow, even though that method is meant to be private. This feels like a more important issue than someone calling a public method through an instance.

I find the multiple calls to module_function ugly and error prone (best not forget a single one!). Note that if you instead call module_function at the beginning, it won't work either, because confusingly module_function ignores private methods:

module Mod
  module_function

  private def private_1  # Singleton method will be created, but will be public
  end

  private
  def private_2 # No singleton method created at all!
  end
end

Note: private_1 is not ignored because it is first created as public and then right after is made private, while private_2 is created private and is thus ignored unless one calls module_function :private_2 explicitly 🤯

Conclusion

I personally never use module_function and I had to review what it did exactly before commenting here. I will now proceed to forget its confusing intricacies and continue using extend self

@marcandre
Copy link
Contributor

@bbatsov did you get a chance to read this?

@bbatsov
Copy link
Collaborator

bbatsov commented Aug 8, 2020

Just now.

While I no longer remember the original reasoning I do recall it was a combination of several factors:

  • some of the points made by @kke
  • the fact that with extend self you end up with two version of each method, which is rarely what you need in a utility type of module
  • the extend self idiom is confusing for many people; obviously that's subjective and impacts the most people without much Ruby experience. On the other hand module_function clearly communicates the intent, at least IMHO.

Looking at the examples, however, it seems that many people are discussing modules that go beyond a pure utility module where module_function excels, so at the very least we should clarify this aspect of the recommendation and the reasoning behind it. I definitely won't argue there are cases where extend self will be needed or will be the preferable approach.

@bbatsov
Copy link
Collaborator

bbatsov commented Aug 8, 2020

First, it adds zero cognitive load. Everyone knows what extend does (or should), everyone knows what self is within a Module (or should). module_function requires extra knowledge and having a precise mental model of what it does is difficult.

Well, in my experience that has rarely been the case. 😆

The only downside of extend self is that if a user actually include Concurrent, then dataflow and dataflow_with are public.

Yeah, that's what I meant in the second point.

Btw, these days you can also use module_function like this:

module_function def dataflow(*inputs, &block)
    dataflow_with(Concurrent.global_io_executor, *inputs, &block)
end

@bbatsov
Copy link
Collaborator

bbatsov commented Aug 8, 2020

@jaredbeck

Is making them private the point?

Yep, that was the point. 😄

As remarked earlier I didn't expect people would be including utility modules, as those are typically used stand-alone. Obviously, if also plan to re-use some methods in classes the module_function advice doesn't make much sense.

@jaredbeck
Copy link
Contributor Author

Yep, [making them private] was the point. 😄

Cool. Yeah, are we ready to turn this discussion into a PR, and add an explanation to this rule?

jonallured added a commit to jonallured/omniauth-artsy that referenced this issue Sep 11, 2020
I blindly followed some RuboCop changes that were actually breaking
things. I found this thread enlightening:

rubocop/ruby-style-guide#556

Especially this comment:

rubocop/ruby-style-guide#556 (comment)

> module_function does not work with attr_accessor
@AlexWayfer
Copy link
Contributor

I have a case with modules inheritance, without classes.

module BaseSpecHelper
  def environment_specs_coefficients
    {
      -> { ENV['CI'] } => 1,
      -> { RUBY_PLATFORM == 'java' } => 3,
      -> { Gem::Platform.local.os == 'darwin' } => 1
    }
  end

  extend self
end

# some specs using `BaseSpecHelper.environment_specs_coefficients`

module CLISpecHelper
  include BaseSpecHelper

  def environment_specs_coefficients
    super.merge(
      lambda do
        RUBY_PLATFORM == 'java' && ENV['CI'] && is_a?(Filewatcher::Spec::ShellWatchRun)
      end => 2
    )
  end

  extend self
end

# some specs for CLI

I don't need classes here, just scoped static methods.

I'd be glad to resolve it with module_function, class << self or something else, but I didn't find any way do this except extend self.

So, extend self seems pretty uniq and if it's so — there is should not be any cop with suggestion to replace it, at least without surrounding code analysis.

@bhasan
Copy link

bhasan commented Oct 23, 2020

I agree with the above. I have an issue with Module B including Module A. I then have Class A including Module B.

See this example:

module ModuleA
  private_class_method
  module_function
  def insert
     puts "inserting"
  end
end

module ModuleB
  include ModuleA
  private_class_method
  module_function
  def select
      puts "selecting"
  end
end

class ClassA
  include ModuleB
  # lets consider this a base class
  def common_base_class_method
    puts 'base class'
  end
end

class ClassB
  include ClassA
  def run
    select
    insert #fails here with NoMethodError!!!
  end
end

I can't use Module A function in Class A with module_function. I need extend self.

If you're trying to prevent people from creating 2 copies of methods, move this rule to RuboCop-performance, please.

I think there's enough of an argument that extend self and module_function are NOT the same and doesn't work in some cases.

Please undo this or provide a way to disable this in the rubocop.yml file.

@AlexWayfer
Copy link
Contributor

or provide a way to disable this in the rubocop.yml file.

You can do it:

Style/ModuleFunction:
  Enable: false

Or even do inline a code:

# rubocop:disable Style/ModuleFunction
extend self
# rubocop:enable Style/ModuleFunction

@pirj
Copy link
Member

pirj commented Feb 21, 2021

@AlexWayfer Here you go!

module BaseSpecHelper
  module_function

  def environment_specs_coefficients
    1
  end
end

module CLISpecHelper
  extend BaseSpecHelper

  module_function

  def environment_specs_coefficients
    super + 1
  end
end
BaseSpecHelper.environment_specs_coefficients # => 1
CLISpecHelper.environment_specs_coefficients # => 2

I'm not insisting to burn extend self with fire. @marcandre's arguments show a number of serious culprits of module_function.

Still, I believe both extend self and module_function have their usages.

@marcandre
Copy link
Contributor

marcandre commented Feb 22, 2021

@pirj: Your example is error prone:

class Anything
  include CLISpecHelper
end

Anything.new.environment_specs_coefficients # => Error, no `super` method

both extend self and module_function have their usages.

Which ones are they?

It's one datapoint, but I've survived all my Ruby programming years so far without module_function but I can't with extend nor with self.

@bbatsov
Copy link
Collaborator

bbatsov commented Feb 22, 2021

Again - I'll clarify the this guideline about module_function was always intended for the case of a module with functions (class methods) that are only meant to be used directly (with the module as the receiver). I see here that most people are discussing including modules in classes and what happens then, which is completely orthogonal IMO. It may be that my experience was such that I never wanted to mixin class methods as instance methods, but I do understand such use-cases.

One thing is clear - we need more precises wording at this point. :-)

@pirj
Copy link
Member

pirj commented Feb 22, 2021

@marcandre

Anything.new.environment_specs_coefficients
NoMethodError: private method `environment_specs_coefficients' called for #<Anything:0x000056243acd4768>

and this is intentional. environment_specs_coefficients is intended to be called on CLISpecHelper/BaseSpecHelper, not to be included.

Just tested on 2.5 and 3.0. I was frightened this might have changed at some point of time.

The purpose of this is to pass BaseSpecHelper or CLISpecHelper as an argument, so those methods can be called on it. Not to be included so that those methods are called on self.

@marcandre
Copy link
Contributor

Sorry, forgot that they would be private. So a method from Anything calling environment_specs_coefficients would trigger an error.

class Anything
  include CLISpecHelper
  def foo
    environment_specs_coefficients
  end
end
Anything.new.foo # => no super defined

@marcandre
Copy link
Contributor

Again - I'll clarify the this guideline about module_function was always intended for the case of a module with call functions that it's meant to be used directly.

It's not clear to me what use case there is. For example, are there any example of uses of module_function in the RuboCop source where extend self wouldn't be at least as good if not better?

If they are truly meant to be used directly, and only directly, why not define them as singleton methods directly?

module Example
  class << self
    def call_me_directly
      #...
    end
  end
end

@pirj
Copy link
Member

pirj commented Feb 22, 2021

Can't say about RuboCop, but:

https://github.com/rails/rails/blob/291a3d2ef29a3842d1156ada7526f4ee60dd2b59/activesupport/lib/active_support/security_utils.rb#L25 and its usage https://github.com/rails/rails/blob/291a3d2ef29a3842d1156ada7526f4ee60dd2b59/actionpack/lib/action_controller/metal/request_forgery_protection.rb#L378

Well, it can probably be replaces with extend self, but how would you prevent users of that modules from including them when they were not designed to be included?

There's a subtle difference:

module A
  module_function

  def a
    1
  end
end

class C
  include A

  def b
    a
  end
end

> C.new.a
NoMethodError: private method `a' called for #<C:0x000055a9491a4b98>

> C.new.b
=> 1

module B
  extend self
  def a
    1
  end
end

class D
  include B
end

> D.new.a
=> 1

And yes. If A would extend some other module and called super, it would error out.

The point here is that extend self allows for a double-purpose, permissive for both D.new.a and B.a usages.
While module_function only allows for direct usage of A.a.

And yes, module_function is not the same as:

module A
  class << self
    def a
      1
    end
  end
end

class C
  include A
  def b
    a
  end
end

C.new.a # => undefined method `a'
C.new.b # => undefined local variable or method `a'

Revenons a nos moutons

Favor the use of module_function over extend self when you want to turn a module's instance methods into class methods.

I guess in the light of the recent discoveries this statement still has a chance to stand.

While a different one, something like "if you design your public API to both allow to access your module's instance methods by directly calling them on a module, and to be able to call them on a class when this method is included, use extend self". Does this sound reasonable?

AlexWayfer added a commit to filewatcher/filewatcher that referenced this issue Feb 22, 2021
AlexWayfer added a commit to filewatcher/filewatcher-cli that referenced this issue Feb 22, 2021
@AlexWayfer

This comment has been minimized.

@AlexWayfer
Copy link
Contributor

@pirj it's a bit complex use-case, but I found a difference (in wrong real-life behavior without any error), here is an example: https://replit.com/@AlexWayfer/InfiniteLividTrees

The difference of my example from your is additional usage method (wait) and an additional class (Runner; I can't use CLISpecHelper.wait because it becomes private method).

So, I still see the difference between module_function and extend self not only with an ability "to call them on a class when this method is included", but also with modules inheritance.

@AlexWayfer
Copy link
Contributor

AlexWayfer commented Apr 29, 2021

Another example with constants, which is pretty hard to reproduce with module_function (if it's possible at all): https://replit.com/@AlexWayfer/FrightenedAgedQbasic#main.rb

extend self is more flexible solution, as I see.

no-reply pushed a commit to samvera/bulkrax that referenced this issue Apr 6, 2023
there's some debate about the value of this style guide rule at
rubocop/ruby-style-guide#556.

it seems in our case like using `module_function` is going to be prohibitively
hard, and i think the longer term desire is to remove both `extend Forwardable`
and `extend self`, expecting callers to just call `#config` directly. the
behavior around defaults needs to be cleaned up to make this possible, but it
seems like a bad plan to bend this implementation around (unhelpful?) style
rules for code we'd rather work on removing.
orangewolf pushed a commit to samvera/bulkrax that referenced this issue Apr 7, 2023
* move config to a Configurable class

the older style config setup is incompatible with Rails 6, preventing support
for Hyrax 4.

this is a minimal pass at extracting the config and keeping the current API.

* disable rubocop Style/ModuleFunction for Bulkrax module

there's some debate about the value of this style guide rule at
rubocop/ruby-style-guide#556.

it seems in our case like using `module_function` is going to be prohibitively
hard, and i think the longer term desire is to remove both `extend Forwardable`
and `extend self`, expecting callers to just call `#config` directly. the
behavior around defaults needs to be cleaned up to make this possible, but it
seems like a bad plan to bend this implementation around (unhelpful?) style
rules for code we'd rather work on removing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests