Use julia_project to manage Julia dependency #100

Closed
wants to merge 7 commits into from

jlapeyre

This PR uses julia_project and find_julia to handle installing and managing Julia and Julia packages.

@ChrisRackauckas
Member

Hmm, this project is still set up for Travis. @tfk do you have a suggested CI setup to test this?

@jlapeyre
Author

jlapeyre commented Jan 11, 2022

You could use julia_project for other projects such as JuliaPOMDP/quickpomdps#7 referenced above. But you could not use both diffeqpy and quickpomdps together in one Python runtime. julia_project should be modified to accommodate this. I'm not sure how to do it.

I think you can use these two projects together as they are, without julia_project. But any messiness, such as a conflicting libpython (which will happen often under Windows IIUC), creating a new Julia project, etc., would have to be handled manually. One goal of julia_project is to insulate the Python user from Julia (at least they have the choice to ignore Julia). Python modules written in Rust don't require the Python user to touch Rust in any way. I see this as a way to drive Julia adoption.

I should probably fork and modify quickpomdps just so I can experiment with ways to get the two projects to work together. One clue might be in something David Anthoff wrote: The Project.toml (and Manifest.toml) serves two purposes, to define packages and to define environments. The uses are separate and the file Project.toml could (should) have been given two different names. I think in julia_project it's not clear which role it is playing. We might want to manage two Project.tomls. One would be for each Python package, e.g., diffeqpy. This is like a package Project.toml. It can include a compat section. But julia_project would also manage a Python module-level Project.toml that represents the environment. pyjulia already stores things at the module level, i.e., managing Julia is not all encapsulated in a class. This is more complicated than I like, but we may be forced into something like this.

EDIT: PythonCall.jl manages Julia dependencies from Python by using Pkg at a lower level:
https://github.com/cjdoris/PythonCall.jl/blob/main/juliacall/deps.py

But, PythonCall.jl is not flexible enough. In pyjulia you have entry points in the initialization process to manage your own system image, etc. But, PythonCall.jl is hermetically sealed. The author is interested in splitting some stuff out.

Contributor

@tkf tkf left a comment

As commented in JuliaPy/pyjulia#473, I think there should be a language-agnostic way to handle julia installations with a single transparent user interface. It'd be bad if each language/framework handles Julia installations in its own way. It seems like https://github.com/JuliaLang/juliaup is the closest and most official approach to this.

That said, I'm not actively working on this right now and I don't want to block people who want to improve Julia-Python interop.

console_logging=False
)

julia_project.run()
Contributor

@tkf tkf Jan 11, 2022

I strongly suggest avoiding side-effects on import, at least from the top-level module (i.e., on import diffeqpy). It makes it impossible to provide APIs that can/should be used before initialization. Maybe it's OK to do it in import diffeqpy.de.

Author

@jlapeyre jlapeyre Jan 11, 2022

makes it impossible to provide APIs that can/should be used before initialization

I don't see at the moment when this would happen. But, I suspect you are correct. So, you would have to do import diffeqpy, then diffeqpy.setup() or something like that I suppose.

Contributor

import diffeqpy, then diffeqpy.setup()

Yes, that's the idea. It'd be better if

>>> import diffeqpy
>>> diffeqpy.update(julia=True)  # hypothetical API to update DifferentialEquations and Julia
>>> from diffeqpy import de

works. If we initialize things in import diffeqpy, it'd be impossible to use updated julia and DifferentialEquations (though the latter is somewhat possible if we integrate Revise).

I also consider that it's a language-agnostic best practice to avoid "magic" initialization on module/library import.
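A minimal sketch of the ordering argument, assuming a hypothetical JuliaBackend class that stands in for the package (none of these names are part of diffeqpy or julia_project). Creating the object does no work, so an update can still run before the expensive startup:

```python
class JuliaBackend:
    """Hypothetical stand-in for a package that wraps a Julia runtime."""

    def __init__(self):
        self.initialized = False
        self.log = []  # records the order of operations, for illustration only

    def update(self):
        # Updating only makes sense before the runtime has been started.
        if self.initialized:
            raise RuntimeError("update() must be called before initialization")
        self.log.append("update")

    def ensure_init(self):
        # Idempotent: the expensive startup happens at most once.
        if not self.initialized:
            self.log.append("init")
            self.initialized = True


backend = JuliaBackend()  # analogous to `import diffeqpy`: no side effects
backend.update()          # analogous to the hypothetical `diffeqpy.update(julia=True)`
backend.ensure_init()     # analogous to `from diffeqpy import de`
backend.ensure_init()     # a second call is a no-op
```

If the constructor had started the runtime itself, the update() call would already be too late.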

Re #100 (comment)

It installs packages and builds a system image in both the source dir and the environment created for tox.

I don't think this is related to the package initialization, though. Do you generate a sysimage for each libpython? If so, it might be because tox (or rather virtualenv) creates a Python executable and libpython for each environment (IIRC). I remember venv being more "lightweight" than virtualenv since it doesn't copy the executable and libpython (IIRC). So maybe using tox-venv could help.

Author

I also consider that it's a language-agnostic best practice to avoid "magic" initialization

Yes, the idea of import package reading a bunch of state from the filesystem plus implicit inputs (environment variables) and then changing a bunch of state seems like it's abusing the import system. I can't see an actual serious problem, but it doesn't feel right. I really like the idea of the user not being required to remember to do anything to use the package. But, requiring setup or install the first time is perhaps not so bad.

it'd be impossible to use updated julia and DifferentialEquations

You could require that the user restart after upgrading. I think people are pretty used to software that tells them they have to do that.

Do you generate a sysimage for each libpython?

If there is a conflicting libpython then julia_project gives the user the option of either rebuilding PyCall (and breaking whatever built it previously) or using a depot particular to the project and building a new PyCall (and all other packages) there. (I don't know of a more fine-grained option at the moment) In the tox test, so far, I only choose the latter. But, I think it may not be that. I did not spend much time debugging tox so far.

Contributor

But, requiring setup or install the first time is perhaps not so bad.

I'm OK with from diffeqpy import de working out-of-the-box by doing some magic by default. I'm just suggesting to provide an option to do some operations before booting up PyJulia and DifferentialEquations. In fact, this is exactly why diffeqpy does the initialization in diffeqpy/de.py and not in diffeqpy/__init__.py.

You could require that the user restart after upgrading. I think people are pretty used to software that tells them they have to do that.

I don't think we need to copy badly designed software when there is a simple and better solution that is already implemented.

Comment on lines +9 to +16
julia_project = JuliaProject(
name="diffeqpy",
package_path=diffeqpy_path,
preferred_julia_versions = ['1.7', '1.6', 'latest'],
env_prefix = 'DIFFEQPY_',
logging_level = logging.INFO, # or logging.WARN,
console_logging=False
)
Contributor

What's the reason for doing this in Python? Since Chris is the maintainer of this package, it's very likely he and his contributors would rather deal with Julia programs than Python ones. That's why I wrote things mainly in Julia (e.g., install.jl) and put a thin Python wrapper around them.

Is this because of julia binary discovery and installation?

Author

I understand the gist, but maybe not precisely what you are asking. Yes, JuliaProject does binary discovery and installation via another Python package, find_julia, which uses jill.py. That part could be separated out, and much of the rest could be done in Julia. I considered this. One argument for using Python is that errors and stack traces are more likely to be in Python this way. And a Python user is more likely to be able to spot an error, or make a suggestion or improvement. I'm trying, to the extent possible, to hide Julia from the end user. Admittedly, in diffeqpy, the interface to Julia is rather low level. But in other Python projects, Julia is better hidden behind a Python API. I often work in environments where 0% of the people have enough interest in Julia to ever try it for anything. And 100% of the people love Python. Something like julia_project is a good way to introduce the possibility of Julia. On the other hand, I'll think more about whether more of it might be better written in Julia.

Contributor

I often work in environments where 0% of the people have enough interest in Julia to ever try it for anything. And 100% of the people love Python. Something like julia_project is a good way to introduce the possibility of Julia. On the other hand, I'll think more about whether more of it might be better written in Julia.

I 100% agree with this. The only reason I could get my team to agree to use diffeqpy in our project is because of the speedup relative to SciPy ODE solvers. But the users of my project would not be interested in having to work with Julia code.

I also strongly agree with @jlapeyre. I like Julia very much, but I have to force myself to use Python every time because the organization I am associated with deals only in Python. I doubt anyone will be ready to deal with Julia while working on a project that is all in Python. It would be a good idea to hide Julia altogether while still getting all the benefits that Julia provides :)

@jlapeyre
Author

I think there should be a language-agnostic way to handle julia installations with a single transparent user interface. It'd be bad if each language/framework handles Julia installations in its own way.

Here are several thoughts on choosing an installer. The bottom line is that using juliaup would be much more difficult for my purposes, which is to make installing a python module that depends on Julia as easy as installing a python module that depends on a rust or c++ library.

  • I agree it would be best to have a single interface. But I agree with the comments in JuliaPy/pyjulia#473 that it is premature to pick a winner.

  • jill.py is a library, but also a cross-platform command line application. It's not Python centric. It's an application written in Python.

  • Python is a great language for writing an installer. juliaup is written in Rust, a high-performance systems language. It is much harder to develop, and has a much smaller base of possible contributors and eyeballs. This is worth it if you need high performance. But, that's not the case here.

  • When writing juliaup David Anthoff went C++ -> Julia -> Rust. I'm not sure why, except that clearly the last choice is the best of the three. It could well be that it is easier to deploy a self-contained rust program than a python program. I don't really know. But that would be a great argument for using rust over python. There are probably other good reasons to choose rust that I don't know about. But, the advantages of jill.py are transparent to me.

  • The location of Julia installations, and links to them is clearly documented for all platforms in jill.py. It is not documented at all for juliaup.

  • juliaup is more complicated than jill.py and the organization of the installed files and auxiliary binaries is a bit more complicated. The whole thing includes shell scripts, windows cmd scripts, xml files, rust source and compiled binary. There may be some benefit to the complexity.

  • juliaup for osx and linux is labeled experimental with known bugs. (In practice, it worked for the basic functionality on my system)

  • I am looking for something that allows a Python user to do pip install thepackage, and then import thepackage. So, I would have to detect which platform is being used and then download the appropriate juliaup installer (there are seven of them) (unless it's already installed) and then run the installer.

  • David Anthoff currently wants to hide the installation location of julia as an implementation detail and automatically add it to the user's path. This is a reasonable choice. But, it conflicts with other choices. For example, I like to make julia a script that runs julia with a particular system image. Johnny Chen already agreed to make returning a list of installed versions and their paths part of the jill.py library API (I did not finish the PR yet, but I copied the code to find_julia). Looking at my juliaup installation, I see that getting this information would be more complicated.

  • jill.py is both a library and an application. juliaup is only an application. So, find_julia and julia_project can be optionally controlled by environment variables, rather than interactive questions. This makes using containers, and test frameworks easier. As far as I can tell, this is not possible with juliaup.
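To illustrate the last point, here is a rough sketch of a question that is answered from an environment variable when one is set and only falls back to an interactive prompt otherwise. The function name ask_yes_no and the variable name DIFFEQPY_INSTALL_JULIA are made up for this example; only the DIFFEQPY_ prefix comes from the PR.

```python
import os


def ask_yes_no(question, env_var, environ=None, input_fn=input):
    """Return True/False, preferring an environment variable over a prompt."""
    environ = os.environ if environ is None else environ
    value = environ.get(env_var)
    if value is not None:
        # Non-interactive path: containers and CI set the variable up front.
        return value.strip().lower() in ("1", "y", "yes", "true")
    # Interactive fallback for a user sitting at a terminal.
    return input_fn(question + " [y/n] ").strip().lower().startswith("y")


# Hypothetical usage; the variable name is an assumption:
# ask_yes_no("No Julia found. Install it?", "DIFFEQPY_INSTALL_JULIA")
```

Passing environ and input_fn as parameters keeps the function testable without touching the real environment or stdin.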

@jlapeyre
Author

One problem is that tox currently takes 15 minutes to run locally. It installs packages and builds a system image in both the source dir and the environment created for tox. I'm not sure, but I suspect this may be because of what @tkf mentioned. You should be able to import the module without doing any work that has side effects.

@tkf
Contributor

tkf commented Jan 12, 2022

The bottom line is that using juliaup would be much more difficult for my purposes, which is to make installing a python module that depends on Julia as easy as installing a python module that depends on a rust or c++ library.

Why not install juliaup on the fly and then use it to install Julia? The point is that the user has access to the application storage from a CLI and it's shared across all languages and frameworks.

So, I would have to detect which platform is being used and then download the appropriate juliaup installer (there are seven of them) (unless it's already installed) and then run the installer.

I'm not sure if that's the downside. You only have to validate at least one binary for each platform. You can then use the cryptographic verification for all possible Julia binaries (and possibly new juliaup binary with self-update) implemented in juliaup. It seems like a big upside, given that other user-facing installation management interfaces come for free as well.

Python is a great language for writing an installer. juliaup is written in Rust, a high-performance systems language.

While I love Python as an excellent language for writing scripts easily, I disagree that it's a good language for creating simple-to-distribute self-contained CLI. Of course, there are various ways to create a self-contained Python application but it's not as straightforward as using a language with an AOT compiler. While I respect the effort and passion that went into jill.py and your julia_project, I don't think the argument "jill.py is a language-agnostic application" works for non-Python users if it requires them to understand how to use pip or to have Python already installed.


All that said, let me note again that I'm not working on this and I have no intention to be a blocker. Chris seems to like how things are handled in R which is similar to what is suggested in this PR, IIUC. So, I think there's a good chance this gets in.

@jlapeyre
Author

Why not install juliaup on the fly and then use it to install Julia? The point is that the user has access to the application storage from a CLI and it's shared across all languages and frameworks.

As I said above, jill.py is a cross-platform application. It's a CLI application:

shell> jill --help | cat
INFO: Showing help with the command 'jill -- --help'.

NAME
    jill

SYNOPSIS
    jill COMMAND

COMMANDS
    COMMAND is one of the following:

     download
       download julia release from nearest servers

     install
       Install the Julia programming language for your current system

     upstream
       print all registered upstream servers

     mirror
       Download/sync all Julia releases

     list
       List all Julia executable versions in symlink dir

     switch
       Switch the julia target version or path.

If someone already has Python installed or is willing to install it, then jill.py is clearly a far easier solution than juliaup. @sibyjackgrove tested julia_project on Windows even though I never tested it on Windows, and it works. I could not do that with juliaup. (There were some install problems with julia_project in that case, not due to Windows, but rather cross-platform install issues that I corrected.) For julia_project the user always has Python installed already, so jill.py is a clear winner. I wanted to have find_julia look for julia where it is installed by juliaup. But, unlike jill.py, that's not documented. I could probably even use jill.py to install to the juliaup locations if I knew where they are. But that would require me to test on several platforms or try to find someone to do it for me. Some or all of the installation path is meant to be hidden from the API. On Linux, I can install several versions to find how the links in ~/.juliaup/bin are done, and hope that that part is stable (the links point to a tree inside ~/.julia). If I knew what I could count on for all platforms, I would strongly consider being at least compatible with juliaup.

disagree that it's a good language for creating simple-to-distribute self-contained CLI. Of course, there are various ways to create a self-contained Python application but it's not as straightforward as using a language with an AOT compiler.

I strongly suspected this from the beginning. Why else would someone use an AOT compiler for this? I looked briefly yesterday for how to package a simple-to-distribute self-contained python application. I think I saw dead projects, old ill-maintained projects. It did not look encouraging at all. So yeah, using jill.py for people who don't want, or can't, install python would be tough.

Python rules the world. In the spaces I am targeting, nothing else matters. I have to be practical given my environment and very scarce resources (mainly time). They have no incentive to accommodate us. I have to accommodate them. I want to maximize the probability that the Python world accepts things like this. The more Python they see, the happier they are. (Of course, there is a small minority that has a broader view.) I also have to do all of this myself, including the project that I originally wanted to do. If I get time in the future to try to support juliaup, I think it would be a good idea. By the way, find_julia does the searching and downloading, and julia_project depends only on the find_julia API. So, juliaup could be used in find_julia in the future. I have every incentive to try to support juliaup (I could also support the shell jill). If I can avoid downloading Julia, so much the better.

All that said, let me note again that I'm not working on this and I have no intention to be a blocker.

Well you have by far the most experience in designing things to call Julia from Python. So, it is very useful to hear your opinions. For instance, not doing work with side-effects when importing. So thanks for taking the time to weigh in! (By the way, can you explain a bit more the situations in which side-effects on import are a problem?)

Chris seems to like how things are handled in R which is similar to what is suggested in this PR, IIUC.

Oh, I need to check that out.

@ChrisRackauckas
Member

Rebase onto master for CI?

@tkf
Contributor

tkf commented Jan 13, 2022

Python rules the world. In the spaces I am targeting, nothing else matters.

Yeah, I support the idea even though the implementation is not to my taste. It'd be great to see more Julia-based packages on PyPI. Anyway, now that we have the POC jill integration merged (#86), I'll stop complaining about this 🙂

BTW, consider #86 as a sketch of an implementation and feel free to tweak the CI setup if you have something else based on julia_project

(By the way, can you explain a bit more the situations in which side-effects on import are a problem?)

I'll comment on #100 (comment) to keep the conversation linear

@jlapeyre
Author

I'll stop complaining about this

It's important to think about the options and defend your choice. I plan to ask the juliaup people some questions on paths and so forth to see what's possible.

I'm fairly sure the combination of the CI in #86 and tox.ini will not work with julia_project without tweaking.

@jlapeyre
Author

Rebase onto master for CI?

I tried to do that. It was a bit of a mysterious process. I think what I pushed now is correct.

@jlapeyre
Author

jlapeyre commented Jan 14, 2022

  • Hm, CI found a code path with a bug. But it shouldn't have taken that path anyway. Um, I'll fix the bug first. EDIT: So the bug is fixed. But it occurred in this path: The consumer (diffeqpy) gave find_julia a list of preferred versions, including 1.7, which jill.py installs by default. Then a jill.py-installed Julia was found, but it's not 1.7. I've never seen this in any tests.

  • tox on my local machine seems pretty slow even accounting for installing and building twice.

  • Maybe for another PR: There are some scripts that exercise DifferentialEquations during the building of the system image. But they don't actually seem to speed up the examples. I'm not sure if the examples' ratio of run time to compilation time is really high, or if there is some subtlety that I'm missing.

Latest julia_project depends on latest find_julia which has a bug fix.
@jlapeyre
Author

jlapeyre commented Jan 14, 2022

tox succeeds locally, even on a machine with no Julia executable or packages installed.

EDIT: no wait, ignore below.

The path that is failing seems to imply a dict that evaluates to logical True in a conditional, yet iterating over values iterates zero times. That is the dict is apparently empty, but is True. Makes no sense. Also the dict being empty is correct, if there is no Julia installed which is the case.

@@ -56,3 +56,6 @@ docs/_build/

# PyBuilder
target/

**/Manifest.toml
Contributor

Does julia_project generate files inside diffeqpy directory? I don't think that's a good idea. For example, the directory may not be writable after the installation.

I see Pluto is using ~/.julia/environments/__pluto_$VERSION for its internal env. So, similarly, maybe you can generate ~/.julia/environments/__python_julia_project_$VERSION_$SLUG/ where $VERSION is the Julia version and $SLUG is a hash of the path of the current python environment (say).

Author

Yes it does. But more than just Manifest.toml and Project.toml: it optionally puts a depot in ./depot in the diffeqpy directory. I also don't like it, but this was easier to start with. I agree it should be changed, probably to somewhere under ~/.julia/julia_project. Both juliaup and PythonCall claim directories there.

Contributor

I think it's very rare that you'd need a separate depot. If a Julia environment is enough, I think that's much better. It definitely helps reduce precompile time. It's different from how Python virtualenv/venv works but I think a slug-based mapping is enough for bridging the gap.

Author

I'd rather just use a Julia environment as well. In fact, I do use one. But the separate depot is just to work around the problem of incompatible libpython. I want to avoid having multiple python projects fighting over PyCall, always rebuilding it.... Or maybe I don't follow what you mean here.

Author

I don't have a great understanding of the problem with incompatible libpython. My assumption, which I am not sure of, is that it is not enough to have a separate .ji file for PyCall for each python project. Some other .ji files might need to be recompiled and be incompatible as well. Or, is it enough to use multiple depots and one of them only contains PyCall (including the compiled .ji file) ?

Contributor

But the separate depot is just to work around the problem of incompatible libpython.

I'm pretty sure the direction of JuliaPy/PyCall.jl#945 makes this hack unnecessary. It lets us configure libpython for each Julia project, so PyCall.jl can be precompiled for each libpython. It'll be a game changer for how Julia packages are used from Python.

Author

Wow. I looked around quite a bit, but missed this. It's a big deal. I'll have to do some reading to absorb it all.

@tkf
Contributor

tkf commented Jan 14, 2022

https://github.com/SciML/diffeqpy/runs/4811765599?check_suite_focus=true#step:5:29

return next(iter(self.results.jill_julia_bin_paths.values())) # Take the first one

I'd write something like

for x in self.results.jill_julia_bin_paths.values():
    return x
return ??default???
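For reference, a small sketch of why the one-liner fails on an empty dict, together with the built-in alternative: next() accepts a default as its second argument. The variable paths here is just an illustrative empty dict, not the real find_julia data.

```python
paths = {}  # e.g. an install directory that exists but holds no Julia versions

# next() on an exhausted iterator raises StopIteration ...
try:
    next(iter(paths.values()))
    raised = False
except StopIteration:
    raised = True

# ... unless a default is supplied as the second argument.
first = next(iter(paths.values()), None)
```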

@jlapeyre
Author

jlapeyre commented Jan 14, 2022

That would probably be more clear, but the logic would be slightly different. In any case, the bug is because this is macos. The directory that jill installs to is always present, but has no julia installations. I did not exercise this path till now.
EDIT: it now reads as follows with the first line catching both None for non-existing directory and an empty dict for a directory with no julias in it. Might not be a bad idea to change the last line anyway.

        if not self.results.jill_julia_bin_paths:
            return None
        for pref in self.preferred_julia_versions:
            bin_path = self.results.jill_julia_bin_paths.get(pref)
            if bin_path:
                return bin_path
        if self._strict_preferred_julia_versions:
            return None
        return next(iter(self.results.jill_julia_bin_paths.values())) # Take the first one
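The same selection logic can be tried outside the class. This is a standalone sketch, assuming the attribute is either None or a dict mapping version strings to executable paths; the function name choose_julia and the example paths are invented for illustration.

```python
def choose_julia(bin_paths, preferred_versions, strict=False):
    """Pick a Julia executable path from a {version: path} mapping."""
    # Covers both None (directory missing) and {} (directory empty).
    if not bin_paths:
        return None
    # Honor the caller's preference order first.
    for pref in preferred_versions:
        bin_path = bin_paths.get(pref)
        if bin_path:
            return bin_path
    # In strict mode, a non-preferred version is not acceptable.
    if strict:
        return None
    # Otherwise fall back to the first installed version.
    return next(iter(bin_paths.values()))


installed = {"1.5": "/opt/julia-1.5/bin/julia", "1.6": "/opt/julia-1.6/bin/julia"}
picked = choose_julia(installed, ["1.7", "1.6", "latest"])
```

With the mapping above, 1.7 is not installed, so the loop moves on and returns the 1.6 path.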

There is another bug fix in find_julia
Contributor

@tkf tkf left a comment

(deleted)

@tkf
Contributor

tkf commented Jan 14, 2022

So, I think there is still too much bookkeeping logic inside diffeqpy. I think most of the stuff should go into julia_project (mainly so that you can improve things without bothering Chris or me). I suggest the following design.

(1) We have diffeqpy/_julia_project.py that "declares" but does not execute julia_project.JuliaProject:

from julia_project import JuliaProject

project = JuliaProject(
    name="diffeqpy",
    package_path=__file__,
    # ... other things ...
)
# end of file

(2) diffeqpy/__init__.py directly exports julia_project.JuliaProject API:

from ._julia_project import project

This way, a user can run diffeqpy.project.update() etc. to manage the Julia project from Python. Crucially, each Python package does not need to define its own bookkeeping logic. That is to say, we have $PYTHON_PACKAGE.project.$MANAGING_COMMAND() as a consistent UI/API across all Julia-Python bridging packages.

(3) Invoke the magic command in diffeqpy/de.py:

from . import project
project.ensure_init()

Ideally, julia_project can provide an API like project.disable() or even julia_project.disable_all() so that project.ensure_init() is a no-op. This is useful for users who know and want to control the exact versions of Julia packages.
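A rough sketch of how such switches could behave. Everything here is hypothetical: the Project class, its attributes, and the module-level flag only mirror the method names suggested above and are not the julia_project API.

```python
_all_disabled = False  # hypothetical module-level switch


def disable_all():
    """Turn every project's ensure_init() into a no-op."""
    global _all_disabled
    _all_disabled = True


class Project:
    """Hypothetical stand-in for a declared-but-not-executed project."""

    def __init__(self, name):
        self.name = name
        self.disabled = False
        self.initialized = False

    def disable(self):
        self.disabled = True

    def ensure_init(self):
        # No-op when the user has opted out, globally or per project.
        if _all_disabled or self.disabled:
            return
        # The real work (install, instantiate, load) would go here, once.
        self.initialized = True


managed = Project("diffeqpy")
managed.ensure_init()        # automatic management runs

manual = Project("otherpkg")
manual.disable()
manual.ensure_init()         # skipped: the user manages Julia packages
```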

Looking at the julia_project README, it sounds like project.run() activates diffeqpy/Project.toml. This would be problematic when there are multiple Julia-based Python packages. Instead, I suggest the following:

  1. If it does not already exist, copy diffeqpy/Project.toml to ~/.julia/environments/__python_julia_project_$VERSION_$SLUG/Project.toml, where $VERSION is the Julia version and $SLUG is a hash of the realpath of sys.executable. Let us call ~/.julia/environments/__python_julia_project_$VERSION_$SLUG a $LOCAL_ENV.
  2. Instantiate $LOCAL_ENV/Manifest.toml if it does not exist.
  3. Push $LOCAL_ENV to the end of Base.LOAD_PATH (if it is not already there). Or to the beginning, if you want to make it more magical (i.e., ignore stale packages that exist in the user's default environment).
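Step 1 of the list above can be sketched in a few lines of Python. The choice of SHA-1, the 8-character slug length, and the function names are assumptions made for illustration; steps 2 and 3 need a running Julia and are not covered.

```python
import hashlib
import os
import shutil
import sys


def local_env_dir(julia_version, depot=None, python_exe=None):
    """Return the per-Python-environment Julia environment directory."""
    depot = depot or os.path.join(os.path.expanduser("~"), ".julia")
    # The slug ties the Julia environment to one Python executable.
    exe = os.path.realpath(python_exe or sys.executable)
    slug = hashlib.sha1(exe.encode("utf-8")).hexdigest()[:8]  # length is arbitrary
    name = "__python_julia_project_{}_{}".format(julia_version, slug)
    return os.path.join(depot, "environments", name)


def ensure_local_env(package_project_toml, julia_version, depot=None):
    """Copy the package's Project.toml into the local env if it is missing."""
    env_dir = local_env_dir(julia_version, depot)
    target = os.path.join(env_dir, "Project.toml")
    if not os.path.exists(target):
        os.makedirs(env_dir, exist_ok=True)
        shutil.copyfile(package_project_toml, target)
    return env_dir
```

Because an existing Project.toml is never overwritten, edits made in the local environment survive later calls.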

This way, julia_project should be usable from multiple Python projects. Of course, this is still rather wacky since Pkg cannot ensure all the packages in the "stacked environment" Base.LOAD_PATH are of consistent versions. But that's an inherent problem for using an automagic approach like julia_project. For a sane behavior, users need to use the Pkg API as in julia.Pkg.activate(PATH). (Of course, julia_project can do more magic, like keeping the entire stacked environment consistent.)

@jlapeyre
Author

This would be problematic when there are multiple Julia-based Python packages.

I anticipated this in a comment above. I had planned to tackle this later because I did not have a clear idea of what to do. As a first step, I planned to provide a way to avoid activating the Project.toml so that the user would have a chance to manage the packages manually; something like disabling ensure_init.

$PYTHON_PACKAGE.project.$MANAGING_COMMAND()

In the end, I think this is better. I did it the other way because I wanted to hide more of the JuliaProject stuff. But, I agree the advantage of having a uniform UI is more important.

I thought of using "stacked environments", but I never managed to make that work for myself, so I shied away. I imagined I might have to do something more low-level, like parsing Project.toml and calling lower-level Pkg functions. But, maybe a stacked environment is fine.

Pkg cannot ensure all the packages in the "stacked environment" Base.LOAD_PATH are of consistent versions.

Isn't this problem inherent to using stacked environments? I mean, is this peculiar to the "automagic" approach?

For a sane behavior, users need to use the Pkg API as in julia.Pkg.activate(PATH)

You mean, if the user wants to use two Python packages that depend on Julia, then activate a Julia project and add the necessary Julia packages for each? We could make something like this possible, but I would not want to require it.

Then there is the question of building system images. This is important because I want to reduce latency. If you use only a single Python package that uses julia_project, then this is not difficult. So I want to preserve this option. For two or more Python packages, I suppose you would load a system image (or not) for the first package. The remaining packages will have to be loaded and compiled.

@tkf
Contributor

tkf commented Jan 14, 2022

Isn't this problem inherent to using stacked environments? I mean, is this peculiar to the "automagic" approach?

Yes, you are right. I was sloppy. The problem is inherent to how Julia itself handles LOAD_PATH. I wanted to emphasize that the approach I was proposing is wacky since bad things can happen behind the user's back.

You mean, if the user wants to use two Python packages that depend on Julia, then activate a Julia project and add the necessary Julia packages for each? We could make something like this possible, but I would not want to require it.

Yeah, I get that this PR is about automation. I just wanted to point out that something like julia_project.disable_all() provides a solution for users who want strong reproducibility. For example, you can check in Project.toml and Manifest.toml for Julia projects and something similar, say, pyproject.toml and poetry.lock, for Python. You can then write a small activation script that sets the JULIA_PROJECT environment variable and starts a program via poetry.
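A minimal sketch of such an activation script follows. The helper name `build_launch` and the file name `main.py` are made up for illustration, and it assumes the checked-in Project.toml and Manifest.toml live in the directory passed as `project_dir`.

```python
import os


def build_launch(project_dir, program_args, base_env=None):
    """Return (command, environment) for running a program under poetry
    with JULIA_PROJECT pointing at a checked-in Julia project."""
    env = dict(os.environ if base_env is None else base_env)
    # Julia reads JULIA_PROJECT at startup to choose the active project.
    env["JULIA_PROJECT"] = os.path.abspath(project_dir)
    command = ["poetry", "run", "python", *program_args]
    return command, env


command, env = build_launch(".", ["main.py"])  # main.py is hypothetical
print(command, env["JULIA_PROJECT"])
# To actually start the program:
#   import subprocess
#   subprocess.run(command, env=env, check=True)
```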

Then there is the question of building system images.

This is where the "no magic init" principle is useful. If all Julia-based Python packages follow this principle and don't initialize PyJulia on import, you can create a sysimage for each combination (in principle):

import diffeqpy
import makie  # hypothetical

import julia_project
julia_project.compileall()  # also initialize PyJulia (maybe not a good name)

from diffeqpy import de  # loaded from sysimage

where julia_project.compileall() combines and compiles all projects into a sysimage and then initializes PyJulia. But it's rather challenging, and I can see that sysimage-per-project covers a lot of use cases.
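The bookkeeping behind this idea can be sketched without Julia at all. Everything here is hypothetical: `register` is an invented hook that a package would call on import, the Julia package lists are illustrative, and `compileall` only returns the combined list where a real version would build the sysimage and initialize PyJulia.

```python
_registered = {}  # Python package name -> Julia packages it needs
_initialized = False


def register(name, julia_packages):
    """Called at import time by a Julia-based Python package.

    Only records requirements; it must not start Julia.
    """
    if _initialized:
        raise RuntimeError("register() called after initialization")
    _registered[name] = list(julia_packages)


def compileall():
    """Combine the requirements of every registered package.

    A real implementation would build one sysimage from the combined
    list and then initialize PyJulia; this sketch only returns the list.
    """
    global _initialized
    combined = sorted({pkg for pkgs in _registered.values() for pkg in pkgs})
    _initialized = True
    return combined


register("diffeqpy", ["DifferentialEquations", "PyCall"])
register("makie", ["Makie", "PyCall"])  # hypothetical package, as above
print(compileall())
```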

@jlapeyre
Author

I can't afford to make something really robust at once. If I can get something that works well enough, my company (or others) might be more interested in allocating resources. But it's probably a good idea to try to anticipate, so that the interface doesn't change too quickly. No auto-init is one item to start with. I can spend some time redesigning, but I will have less time for this in the near future; I did a lot of it over the holidays.

Your system image idea is nice. What I have currently is simple: it just uses the API that PackageCompiler offers, and it is on the packager (me, or you, or Chris) to include compile_julia_project.jl etc. Very easy. But for two projects, I will instead have to invent a system to record the packages, and the code that is passed via the keyword argument compile_execution_file, in, say, a TOML file. Then I have to read this from each project and combine it. I also need a system for storing the images. It has to be something that is somehow cached or not retriggered. I can't have it happen every time a Python user starts a new Jupyter notebook. Maybe have a system image for each combination of Julia-based Python packages.
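One way to get "a system image for each combination" without rebuilding on every start is to derive the image file name from the set of packages. This is only a sketch: the cache directory, the `.so` suffix (which differs per platform), and both function names are assumptions. A real key would probably also need the Julia version and the resolved package versions.

```python
import hashlib
from pathlib import Path


def sysimage_path(python_packages, cache_dir="~/.cache/julia_project"):
    """Map a set of Julia-based Python packages to one cached sysimage file."""
    # Sort and de-duplicate so that import order does not change the key.
    names = sorted(set(python_packages))
    digest = hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir).expanduser() / f"sys_{digest}.so"


def needs_build(python_packages, cache_dir="~/.cache/julia_project"):
    """True only when no image exists yet for this combination."""
    return not sysimage_path(python_packages, cache_dir).exists()


print(sysimage_path(["diffeqpy", "quickpomdps"]))
```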

Currently the density of Julia-based packages in use is very low, so package-package interactions are negligible. It would be great to be in a situation where we are forced to deal with interactions.

@jlapeyre
Author

still too much book keeping logic inside of diffeqpy.

I don't understand what you are referring to here. I don't see any book keeping. All I see that could be removed is

def compile_diffeqpy():
    """
    Compile a system image for `diffeqpy` in the subdirectory `./sys_image/`. This
    system image will be loaded the next time you import `diffeqpy`.
    """
    julia_project.compile_julia_project()

which I made as an obvious convenience. I did it this way so the user does not have to know that julia_project exists (except for some intrusions during installation). Of course, there is a good argument for removing it and exporting project, so that the user must call diffeqpy.project.compile_julia_project() (maybe rename these). It's slightly more robust. But you do lose something, in that the docstring explaining what it does now has to be put somewhere discoverable. Of course we should remove the string "julia" everywhere.

@tkf
Contributor

tkf commented Jan 15, 2022

I can't afford to make something really robust at once.

Of course, it's not like everything has to be implemented in one go. But I thought the basic design (1)--(3) that I described in #100 (comment) (without the future/ideal improvements I discussed) could be done with very little effort. Essentially everything is already in this PR. So, isn't it "just" removing compile_diffeqpy and update_diffeqpy and adding something like a JuliaProject.ensure_init method, like this?

class JuliaProject:
    ...

    initialized = False

    def ensure_init(self):
        if not self.initialized:
            self.initialized = True
            self.run()
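As a quick check of the guard, here is a self-contained stub. The `run` method is a stand-in that only counts calls; the real one would presumably install packages and start Julia.

```python
class JuliaProject:
    """Stub with the same guard; run() only counts calls here."""

    initialized = False

    def __init__(self):
        self.run_count = 0

    def run(self):
        # The real method would install packages and start Julia.
        self.run_count += 1

    def ensure_init(self):
        if not self.initialized:
            self.initialized = True
            self.run()


project = JuliaProject()
project.ensure_init()
project.ensure_init()  # guarded: run() is not called again
print(project.run_count)
```

Setting `self.initialized` creates an instance attribute, so the class-level default stays False and each project object tracks its own state.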

I did it this way so the user does not have to know that julia_project exists

I think we just have to document that you can call diffeqpy.project.compile_julia_project etc. For example, we can put something like the following in the docstring of diffeqpy/__init__.py

Project management
------------------

You can call methods of ``diffeqpy.project`` to manage underlying Julia projects.
Notable methods are:

``diffeqpy.project.compile_julia_project()``
  Compile a system image for ``diffeqpy`` in ...

``diffeqpy.project.update()``
  Remove possible stale Manifest.toml files and compiled system image.
  ...

For more details, see: https://github.com/jlapeyre/julia_project

This way, users can get an overview by typing diffeqpy? in the REPL. Furthermore, we don't need to update diffeqpy every time julia_project adds a new feature (e.g., new methods, new optional arguments).

you do lose something in that the docstring explaining what it does

A per-instance docstring is tricky, but I think there are various ways to do it. Maybe you can create a subclass in __new__ (untested):

class JuliaProject:
    def __new__(cls, *, name, **_kwargs):
        class NewJuliaProject(cls):
            __doc__ = f"docstring for {name}"
        return object.__new__(NewJuliaProject)

or a simpler solution is to provide a factory function julia_project.new_project and then do something like

def new_project(*, name, **kwargs):
    class NewJuliaProject(JuliaProject):
        __doc__ = f"docstring for {name}"
    return NewJuliaProject(name=name, **kwargs)

Maybe the __new__-based solution is OK, but, for maximal flexibility on your side, using a factory function is better.
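The factory-function variant can be checked with a stub class. JuliaProject here is a hypothetical stand-in, not the real julia_project class, and the docstring text is illustrative.

```python
class JuliaProject:
    """Generic Julia project manager (stub for illustration)."""

    def __init__(self, *, name, **kwargs):
        self.name = name
        self.options = kwargs


def new_project(*, name, **kwargs):
    # Each call defines a fresh subclass, so every project object gets
    # its own class-level docstring without touching JuliaProject.__doc__.
    class NewJuliaProject(JuliaProject):
        __doc__ = f"Manage the Julia project behind `{name}`."

    return NewJuliaProject(name=name, **kwargs)


project = new_project(name="diffeqpy")
print(type(project).__doc__)
print(JuliaProject.__doc__)
```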

@ChrisRackauckas
Member

So do I merge this?

@jlapeyre
Author

jlapeyre commented Feb 3, 2022

I'd like to make some or most of the changes that @tkf asked for first. I was doing other things, but I am now finishing organizing/opensourcing the other application of julia_project, part of which will be making these tweaks to julia_project. Then I can update this PR to match the tweaks.

@ChrisRackauckas
Member

Now managed by JuliaCall.
