Syntax for passing arguments to a module's __init__ method #3894

Closed
stevengj opened this issue Jul 31, 2013 · 49 comments

@stevengj
Member

It would be very useful to be able to specify options to modules when they are loaded, e.g. to specify the Python interpreter to use for PyCall or to specify the graphics backend to use for a plotting package.

The syntax could be as simple as module Foo(kw1=opt1, kw2=opt2...) in the module file and a corresponding using Foo(kw1=opt1, kw2=opt2...) or import Foo(kw1=opt1, kw2=opt2...) to load it with optional keywords. The keywords would get turned into const globals within the module namespace.

It would be an error to load a module twice with different keywords, since the only reason to have these as module parameters would be for one-time initialization tasks. (However, simply import Foo on subsequent loads would be fine, since this would just stick with the previous option values.)

@Keno
Member

Keno commented Jul 31, 2013

It seems to me that this is better solved with a module local init function that takes these arguments and sets the appropriate constants.

@quinnj
Member

quinnj commented Jul 31, 2013

+1 for this. I think it's much cleaner than the "package registry" ideas that were going on a while ago about managing global constants.

@JeffBezanson
Sponsor Member

There are several good things about this idea, but I'm worried about the possible conflicts. It would basically never be safe to use this feature in a library.

@vtjnash
Sponsor Member

vtjnash commented Jul 31, 2013

Jeff beat me to mentioning my concerns.

The "package registry" or my earlier "config dictionary" have the benefit that they separate the user configuration from usage. Thus other packages don't need to be aware of the specific configurations in use (unless they need to care). I think this is similar to how I made DL_LOAD_PATH a global configuration variable, rather than a package (or library) specific parameter.

related: #3881 #2716

@nolta
Member

nolta commented Jul 31, 2013

related: #1268

@stevengj
Member Author

Let me rephrase the proposal. Basically, using Foo(...) could be syntactic sugar for

using Foo
Foo.init(...)

Whether it is safe to do this multiple times would then be entirely up to the Foo.init function.

@JeffBezanson
Sponsor Member

That's better, but we already need an init function that's automatically called once on startup. If it became common to call it multiple times, many packages would need to manually check for this --- a sort of header file ifndef idiom.

@StefanKarpinski
Sponsor Member

@stevengj, can you give some specific use cases for this? I have the same concerns as @JeffBezanson and @vtjnash and wonder if there isn't some way we can address those without making this so brittle.

@stevengj
Member Author

One use case is something like a plotting library for which you want to set a backend at startup.

As for making it brittle, if you adopt my "syntactic sugar" proposal then it is no more brittle than having a separate init function, just a little bit cleaner.

@StefanKarpinski
Sponsor Member

Oh, I agree that it's no worse than having an init function, but that's also brittle since everyone has to agree on who gets to call it the one and only time it gets to be called.

@stevengj
Member Author

stevengj commented Aug 1, 2013

There are two possibilities: either the initialization is inherently one-time only, in which case the init function can simply be written to ignore subsequent calls by setting a flag the first time it is called (and there is not much one can do about it, syntax or no syntax), or the initialization is something whose parameters can be changed at run-time, in which case init can do this. In either case, however, if the semantics of the module involve some parametric initialization, introducing a nice syntax makes the problem neither better nor worse, so I'm baffled as to how that objection is relevant here. Are you worried that having a nice syntax will make people use it badly?

Let me give a concrete example: PyCall. PyCall needs to do some runtime initialization: it queries Python to find the correct library to link to, then it loads that library, initializes it, queries its version, and caches a bunch of types and functions in global variables. All of this is really a one-time-only task, because Py_Initialize in the Python library can only be called once. The code and usage would be simplest if this could be done when you run import PyCall or using PyCall: it could just run the initialization and cache the necessary stuff in const globals.

However, initializing PyCall automatically on import would make it difficult for the user to change which Python version is used, as there is no simple way to pass parameters during import (except by some kind of global configuration database à la #2716, but this is somewhat opaque). So, instead, I have to have a separate init function (called pyinitialize) that can be passed parameters. Since I don't want to crash if the user forgets to call pyinitialize, I have to add a pyinitialize call to essentially every PyCall routine, which is a pain. Furthermore, without eval hacks, the global cache variables can no longer be const (since they must be changed by pyinitialize), which means that Julia cannot infer their types properly.

It is not "brittle" in the sense that as many modules as you want can call pyinitialize; it simply sets a flag and ignores subsequent calls. But yes, the parameters to PyCall (the python library) are determined by the first call to pyinitialize. That's life.

All of this would be made immensely nicer if I could simply do using PyCall("python3") to pass initialization parameters.

The situation of plotting libraries is similar, because they often need to choose a backend (wx, Qt, GTK, Tk, etc.) once at startup and stick with that, so there is a one-time parametric initialization (and subsequent attempts at initialization need to be ignored because you can't easily switch backends on the fly).
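
For illustration, here is a minimal sketch of the "set a flag and ignore subsequent calls" idiom described above, written for current Julia. The module name Foo, the init function, and the backend_name option are hypothetical and are not PyCall's actual code. Storing the state in a const Ref is an assumption of this sketch; it keeps the globals typed even though they are set at run time.

```julia
module Foo

# Hypothetical one-time settings. The bindings are const; only the contents change.
const initialized = Ref(false)
const backend = Ref{String}("tk")

# Hypothetical init function: the first call wins, later calls are ignored.
function init(; backend_name::AbstractString = "tk")
    initialized[] && return backend[]   # already initialized: keep the first choice
    backend[] = String(backend_name)
    initialized[] = true
    return backend[]
end

end # module

Foo.init(backend_name = "qt")    # first call sets the backend
Foo.init(backend_name = "gtk")   # ignored: the backend from the first call is kept
```

Any number of callers can invoke Foo.init safely, but which arguments take effect still depends on who calls it first.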

@JeffBezanson
Sponsor Member

A nice thing about using special syntax for this is that it lessens the expectation that initialization behaves like a normal run-time function call. For example, Winston.set_backend(...) would imply that you can do it any time (perhaps you can, not sure), but using Winston("Tk") looks more like it is a one-time thing done when the module is first loaded.

@stevengj
Member Author

stevengj commented Aug 1, 2013

I still like the behavior in which using Foo(kw1=val1, ...) defines const kw1 = val1 etcetera in the module, as that way these constants could easily be used to define other const values in the module, whereas this would be harder to do if using Foo(kw1=val1, ...) were merely sugar for calling Foo.init(kw1=val1,...). Subsequent using Foo statements could simply ignore any arguments, in the same way that const members are not reinitialized if the module is re-imported.

@StefanKarpinski
Sponsor Member

I guess my main objection is the idea that different pieces of code could initialize modules in different ways and the somewhat arbitrary ordering of how they're called ends up determining which arguments take effect and which don't. I think shrugging that off as "that's life" seems a bit premature. The real issue is that you want these choices all made in a single place – that's why @vtjnash brought up the config stuff he worked on previously since that provides a single central place for this kind of configuration. Another idea would be to centralize initialization in the Main module, whether by convention or by actual enforcement.

Yet another approach would be to let various modules register their preferences but delay actual initialization until the module actually needs to be used, at which point initialization must happen, but more parties have gotten a chance to register their preferences. In many cases, there may be no conflict between those preferences, and perhaps warnings can be issued in the case that conflicting options are requested.
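
A rough sketch of that "register preferences, initialize lazily" idea, assuming current Julia. Every name here (LazyPlots, prefer!, ensure_init, backend) is hypothetical, and "first registered wins" is only one possible conflict policy:

```julia
module LazyPlots

# Hypothetical registry: option name => list of (requester => value) pairs.
const requests = Dict{Symbol,Vector{Pair{String,Any}}}()
const settings = Dict{Symbol,Any}()
const initialized = Ref(false)

# Record a preference. Nothing is initialized at this point.
function prefer!(requester::AbstractString, option::Symbol, value)
    if initialized[]
        @warn "module already initialized; ignoring preference" requester option value
        return nothing
    end
    push!(get!(requests, option, Pair{String,Any}[]), String(requester) => value)
    return nothing
end

# Resolve the preferences on first real use and warn when requesters disagree.
function ensure_init()
    initialized[] && return settings
    for (option, reqs) in requests
        vals = unique(last.(reqs))
        if length(vals) > 1
            @warn "conflicting preferences; using the first one registered" option reqs
        end
        settings[option] = first(vals)
    end
    initialized[] = true
    return settings
end

# Any function that needs the module to be ready calls ensure_init() first.
backend() = get(ensure_init(), :backend, "tk")

end # module

LazyPlots.prefer!("PkgA", :backend, "qt")
LazyPlots.prefer!("PkgB", :backend, "gtk")   # disagrees with PkgA
LazyPlots.backend()                          # initialization happens here, with a warning
```

This does not remove conflicts; it only delays the decision until every interested party has had a chance to speak and makes the disagreement visible.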

@quinnj
Member

quinnj commented Jun 10, 2014

Bump. Where does this stand? #5960 introduced the __init__() function called at module load time. Is this syntax still relevant for the rewrite @stevengj mentions? Or is it not worth it.

@stevengj
Member Author

It would be nice to be able to pass arguments to __init__ ...

@StefanKarpinski
Sponsor Member

I don't think the objection that different pieces of code can try to initialize a module in different ways has been addressed. I'd rather have a central configuration place for modules.

@stevengj
Member Author

Right now, the central configuration place is ENV.

@StefanKarpinski
Sponsor Member

That's probably not ideal. @vtjnash had a PR for a central config file a long time ago, but @JeffBezanson didn't like the idea. Maybe we need to revisit that. I'm not a big fan of config files that can change the way everything works in a non-obvious manner, but having conflicting configuration scattered around a program seems worse.

@stevengj
Member Author

@StefanKarpinski, you're referring to #2716. @JeffBezanson wrote there:

We're using environment variables because they're the standard mechanism people already know how to work with (e.g. adding to your bashrc). Whether they are "scattered" is more of a documentation and presentation issue, which exists with any configuration system (how to know what options and values are available).

@vtjnash
Sponsor Member

vtjnash commented Jun 10, 2014

| (how to know what options and values are available).

The primary feature of that pull request over using ENV was that packages were expected to add all of their options there, so the user could always query a list of options, defaults, values, and perhaps even doc strings.

@nalimilan
Member

One could imagine keeping the idea that modules can list the supported options with defaults, types, docs, etc., but that the only way to set them would be to use a central configuration.

I agree ENV is really a poor configuration source: one cannot easily see what options applying to a given module are currently customized, variables do not have types and thus require each module to do its own parsing, and everything @vtjnash said...
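
To make the parsing point concrete, here is a small sketch of a module configured through ENV, assuming current Julia. The module name and the variable names ENVCONFIGURED_BACKEND and ENVCONFIGURED_VERBOSE are made up for illustration:

```julia
# The variable has to be set before the module is loaded.
ENV["ENVCONFIGURED_BACKEND"] = "qt"

module EnvConfigured

const backend = Ref("tk")
const verbose = Ref(false)

# ENV values are always strings, so the module must parse and validate them itself.
function __init__()
    backend[] = get(ENV, "ENVCONFIGURED_BACKEND", "tk")
    raw = get(ENV, "ENVCONFIGURED_VERBOSE", "false")
    verbose[] = something(tryparse(Bool, raw), false)   # fall back on unparsable input
end

end # module

EnvConfigured.backend[]   # reflects the environment at load time
```

Nothing in this pattern tells the user which variables exist or what values they accept, which is the discoverability problem raised above.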

@fxbrain

fxbrain commented Jun 10, 2014

A central config for per-module configuration smells a little bit too much of Windows registry to me.
(Nightmare on DEC and on Windows likewise)

Considering C++, there is the possibility of static fields for e.g. classes, which are not part of the object, to have an influence on how an object is created.

What @stevengj wants, it seems to me, is such a static initialization phase in order to construct an object appropriately, and I can only agree with him that this is decent and needed.

@vtjnash
Sponsor Member

vtjnash commented Jun 10, 2014

The problem with the Windows registry is that it is hard to make backups and is unobservable. I think that is a stronger argument against using ENV than against a central config location (the implementation, not the functionality, was flawed). Gnome and Mac have centralized, managed config that works well.

@fxbrain

fxbrain commented Jun 10, 2014

@vtjnash can't agree with you.

So the idea is that, before you instantiate or import a namespace into Main, you have to change a configuration dictionary to set up whatever suits you best in the environment you are actually in?

such as:

Image["sink" -> "browser"] # nevermind the syntax chosen...
using Image; # distribute images as png/svg

?

@StefanKarpinski
Sponsor Member

You would only need to do anything if you need to modify the default behavior.

@fxbrain

fxbrain commented Jun 10, 2014

@StefanKarpinski well...ok. But the default behavior can be altered by changing the default argument of some CTor e.g. init() as well at construction time, agree?

If someone is mixing up domain-specific object constructions which can't fit together this fails. Fair enough. But you can misconfigure a chain of module instantiations as well.

Which means that the central config must check for inter-module incompatibilities...and this may become a nightmare to support.

@StefanKarpinski
Sponsor Member

So far the only concrete example we have is the using PyCall(python = "python3") one. Can we maybe have some more concrete examples to help reason about this? @stevengj, you mentioned wx, Qt, GTK, Tk – that actually seems like it might be a good case for a common graphics backend option rather than a haphazard per-package option. @vtjnash, presumably you had something in mind when you opened #2716.

@stevengj
Member Author

@StefanKarpinski, if the graphics backend is used for e.g. PyPlot, then it depends on what matplotlib supports, and may be totally different from backend choices for other applications.

@StefanKarpinski
Sponsor Member

Fair enough, although you could provide a global preference on backends and then a package picks the best one that it supports.

@fxbrain

fxbrain commented Jun 10, 2014

@StefanKarpinski hmmm...well, for instance, I am working here without any X server installed. The only GUI is Chrome. That is what I have. So, I adore the possibility I get with IPython notebook et al.

I tried to install Winston, but stopped, since it wanted to install Tk, which leads to some X server setup as well, which I have to (and want to) avoid.

In a perfect world, I would wish for constraints to be checked and handled accordingly at installation time, e.g. no X, no Tk, etc.
If I finally decided to set up an X server, I would want the opportunity to override the default settings and have some extra modules loaded to support this.

But the default shall be the situation/constraints at installation time.

Which, on the other hand speaks for a global package config registry....

So why not connect the CTor/init with the config settings? Presumably that is what you and @vtjnash had in mind already, isn't it?

@fxbrain

fxbrain commented Jun 10, 2014

I guess what I try to say is, the moment you provide the opportunity for a global module registry, you are very near to a standard distribution such as anaconda, but this has to be supported.
If you want to avoid this you should give the user the possibility to change the default at instantiation time, which brings the responsibility back to the module creator/author.

This may be an ENV-setting or, favoured by me, change of default settings using init(), initializing some const globals within the module namespace.

@StefanKarpinski
Sponsor Member

Passing configuration to the module init function would be the way to go. The init function is not called explicitly by the user.

@quinnj
Member

quinnj commented Aug 21, 2014

Seems like this has stagnated a bit without any really compelling uses. Is there still interest in this? I think it would generally be useful to have a way to pass arguments to __init__() for modules; should we go with something like using Foo(1,2) that then calls __init__(1,2)?

quinnj changed the title from "Syntax for passing options to modules: using Foo(kw1=opt1, kw2=opt2...)?" to "Syntax for passing arguments to a module's __init__ method" on Aug 29, 2014
@quinnj
Member

quinnj commented Aug 29, 2014

Changed the title. How about syntax like using Foo.__init__(arg1, arg2). This has the advantage of being a little more explicit that you're passing arguments to the __init__ method while using Foo

@StefanKarpinski
Sponsor Member

The objection has never been that it's not sufficiently explicit that init is being called. The issue is that this only happens the first time the package is loaded. Do we just ignore the arguments all the other times it's loaded? Or do we call init every time it's loaded? In that case the init functions all have to be prepared to be called repeatedly and not fail when they are. If we're going to do that, then maybe we should consider passing the init function the module object that called it so that it can do things like inject bindings or whatever. But that's a pretty slippery slope that we may not want to go down.

@StefanKarpinski
Sponsor Member

Bold init = __init__.

@stevengj
Member Author

@StefanKarpinski, sounds like a great application of Unicode in Julia: Let's get rid of underscores and use bold characters instead.

@StefanKarpinski
Sponsor Member

I do think that calling dunder-init every time is a viable option to consider, but it's a lot of burden on the package author, which worries me. It basically puts the problem of idempotency and handling conflicts fully on the developer.

@ntessore

ntessore commented Nov 1, 2014

How about a very simple approach similar to LaTeX package options?

In your module, you can define a set of options such as

module MyModule

option vectorize = true

# ...

end

Such options could be treated just like regular const variables at module scope, and used accordingly in any functions (including __init__).

When loading the module, options would be passed like keyword arguments:

using MyModule(vectorize = false)

Just as in LaTeX, options are set the first time the module is loaded, and then it's the responsibility of the module user to check that the options are compatible with what's expected. Furthermore, also as in LaTeX, one could define a function that can be used to pass options to modules before they are used:

options(:MyModule, vectorize = false)

# ...

using MyModule    # vectorize option gets passed here

While maybe more limited than a generic init-function approach, this seems to be very easy to use and straightforward. Plus, it's something that at least the LaTeX-using scientific community should be familiar with.

@StefanKarpinski
Sponsor Member

How to invoke this is not really the question – the issue is what to do about multiple users of a module having conflicting requirements of that module. In LaTeX this is not an issue.

@ntessore

ntessore commented Nov 1, 2014

I understand that this is the issue, and my comment was not trying to add any insight in this respect. In the spirit of the original "Syntax for passing arguments to a module" question, I was merely suggesting that instead of passing arguments to the init function, how about just specifying a set of options akin to const variables, and let the module do with those whatever it wants.

As for the multiple requirements, this is an issue in LaTeX occasionally, and you will be left with some "incompatible package options" error. In any case, this seems to be a quite natural solution. You are trying to use two incompatible packages. They won't work together, and that's the end of it.

@StefanKarpinski
Sponsor Member

I hate getting that kind of error in LaTeX. If that started happening with Julia packages I would be really, really unhappy, and that is exactly why we can't add this feature. If we allow package/module options, they have to be specified in a single global place so that there can't be conflicts.

@ntessore

ntessore commented Nov 1, 2014

I am in no way advocating for this, I was just offering an observation that other ecosystems just do not care about this issue. I do not pretend to be knowledgeable enough to have an opinion on how to resolve this, or even understand all implications of this problem.

What I proposed was the mechanism of passing options to modules, as per the title of this issue. I think a system based on variables keeps things simple and stupid. For example, I have a wrapper for an external library, which can be located in a non-standard location. I want to allow the user to pass the library path to the module. To me, it seems to be easiest to write my module like this:

module MyWrapper

option library::String = "libfoo" # does not get evaluated when option is passed, but checks type

# use option like a regular const variable: ccall( ( :run, library ), ... )

Now someone using my module can pass the library to it:

using MyWrapper(library = "/home/me/lib/libfoo.so") # overrides the option in MyWrapper

@vtjnash
Sponsor Member

vtjnash commented Nov 1, 2014

other ecosystems just do not care about this issue

I feel that LaTeX may be one of the few ecosystems where it is OK to just say, "Pkg A and Pkg B can't be used together? Oh, well. Too bad for you". This is typically a bad idea, however, since it breaks with the concept of modularity and other such nice abstractions of code independence.

For example, I have a wrapper for an external library, which can be located in a non-standard location. I want to allow the user to pass the library path to the module

Exactly true. Which is why there is one canonical global location where the user can specify such a thing:

push!(DL_LOAD_PATH, "/where/to/find/library")
using MyWrapper

vtjnash closed this as completed on Dec 21, 2014
@vtjnash
Sponsor Member

vtjnash commented Dec 21, 2014

dropping this, because the feature hasn't come up as being all that essential in Julia's several years of growth

@zouhairm

It's unfortunate that this has been dropped... I have a need for this in one of my research projects: I would like to be able to set a resolution for a solver which is defined as a module (I can't pass the resolution to a function as there are a few arrays/dictionaries/etc. that need to be initialized differently depending on the parameters).

What's the suggested way to get around this? Should I have all of these parameters initialized by some other function? If so, how should I declare these parameters to avoid type instability? Would the following work?

module test
mesh = Array(Float64)

function setResolution(dx)
    global mesh
    mesh = linspace(0, 1., 1./dx)
    # more complicated setting up of mesh and other structures
end

# solver is called repeatedly, so do not want to have to call setResolution
# over and over again
function solver(pars)
    global mesh
    # use the mesh to generate solution
end
end

@yuyichao
Contributor

You will need to do this manually anyway because a module can be imported multiple times and it is unclear who should initialize it.

I'd say passing the parameter is a better way since using a global state will cause interference between different users of the same package. If you really want a global state that's type stable but mutable, the current workaround is using a Ref{T} (which might be done under the hood automatically in the future)

Untested code

const parameter = Ref{Int}()

setParameter(x::Int) = (parameter[] = x)
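
Applied to the mesh question above, the same Ref idea could look roughly like the sketch below. This assumes current Julia (range instead of linspace), and the names Solver, set_resolution! and solve, as well as the trivial body of solve, are placeholders rather than anything from the original code:

```julia
module Solver

# Typed, const container: the binding never changes, only its contents do.
const mesh = Ref{Vector{Float64}}()

# Placeholder setter: builds the mesh once for a given spacing dx.
function set_resolution!(dx::Real)
    n = round(Int, 1 / dx) + 1
    mesh[] = collect(range(0.0, 1.0; length = n))
    return nothing
end

# The solver reads mesh[] with a concrete element type, so type inference is preserved.
function solve(scale::Real)
    isassigned(mesh) || error("call set_resolution! before solve")
    return scale .* mesh[]
end

end # module

Solver.set_resolution!(0.25)   # done once
Solver.solve(2.0)              # can be called repeatedly without rebuilding the mesh
```

The caveat from the comment above still applies: the mesh is global state shared by every user of the module.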

@hayd
Member

hayd commented Jan 7, 2016

I guess a workaround is:

type _Config
  secretkey::UTF8String
  other_thing::Bool
end
const config = _Config("", false)  # defaults
set_secretkey!(secretkey::UTF8String) = config.secretkey = secretkey
# etc
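
The snippet above uses the syntax of its time. In Julia 1.x, type is spelled mutable struct and UTF8String has been folded into String, so a roughly equivalent sketch would be the following. The wrapping module name Settings is hypothetical; the field names are taken from the snippet above:

```julia
module Settings

# Mutable container for module-wide options, with concretely typed fields.
mutable struct Config
    secretkey::String
    other_thing::Bool
end

const config = Config("", false)  # defaults

# Setter for one option; the value is converted to String on assignment.
set_secretkey!(secretkey::AbstractString) = (config.secretkey = secretkey; nothing)

end # module

Settings.set_secretkey!("abc")
Settings.config.secretkey
```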
