Fortran needs packaging ecosystem #55

Open
certik opened this issue Oct 27, 2019 · 22 comments

@certik
Member

certik commented Oct 27, 2019

Most other languages have one, whether Python, Julia, Go, Rust, JavaScript, ...

Goals:

  • Make it easy to create an application that depends on N external libraries written in Fortran, and have the build system build it easily and robustly

  • Make it easy to create a new Fortran package (that depends on other packages) and distribute it

  • Package index where people can easily search for available packages provided by other people

  • Build a community and culture of such packages

  • Must work with any Fortran compiler

This needs to be carefully designed; we need to learn from the above-mentioned languages.

Some related projects to consider:

Note: initially opened at https://gitlab.com/lfortran/lfortran/issues/109.

@zmiimz

zmiimz commented Oct 28, 2019

IMHO, one of the best to consider is dub, the package manager from the D language project (it is all-in-one: both a package manager and a build system)
https://github.com/dlang/dub

@septcolor

septcolor commented Nov 1, 2019

FWIW, this site gathers statistics for package registries of various languages. We can see more details by clicking the name of the registries.
http://www.modulecounts.com/

@sblionel
Member

sblionel commented Nov 3, 2019

Looks like all the languages mentioned are interpreted. Keep in mind that there's more to the Fortran world than Windows, Linux and Mac. The Fortran world seems to have gotten along well with libraries without something in the standard for a package manager. These sorts of things tend to go obsolete quickly, anyway. I don't see it as something appropriate to add to the standard, which doesn't even discuss what source file names look like.

@gronki

gronki commented Nov 3, 2019 via email

@sblionel
Member

sblionel commented Nov 3, 2019

Given that compiled Fortran objects are not interoperable with those from different compilers, much less modules, I don't see a way forward for this proposal. The standard offers features (especially submodules) that help library developers. Build from source works.

Keep in mind that the Fortran standard doesn't say anything about the world outside the "processor" (compiler). Source lines are delivered by fairies in the night, input and output files are up to the whim of the environment, etc. You can use any packaging system that suits your fancy. What does one do for C or C++?

@gronki

gronki commented Nov 3, 2019 via email

@certik
Member Author

certik commented Nov 3, 2019

@sblionel, of the languages mentioned, Go and Rust are compiled, and Julia is a hybrid. Regarding your second question, there are two package managers specifically for C++: Conan and Vcpkg. Language-neutral package managers that many recommend for C++ (and Fortran!) are Spack and Conda (already linked in the issue description above).

@gronki thanks for the feedback.

Fortran needs a standard way to create and distribute libraries. There is a lot to improve.

What is not clear at this stage is what, if anything, needs to be improved in the Fortran standard itself. But there might be some things to improve there, and for that reason I would like to keep discussing it here.

I discussed this issue with many people, and there are generally two camps: those who favor a language-specific package manager (as in Julia, Python, ...) and those who advocate packaging all languages (such as C++, Fortran, Python, ...) in language-neutral package managers such as Conda or Spack.

I would much prefer if we can figure out ways to just use Conda, Spack or another solution, so that we do not need to maintain things ourselves. However, there might be some Fortran specific things that we might need to figure out.

Regarding building from source (Spack) or distributing binaries (Conda), I think we need both. We need to build from source, as that is what is needed on HPC to produce an optimized static build for a specific architecture, but being able to distribute binaries (Conda) is also very helpful for users who just want to get something working quickly and do not want to wait hours to build all the dependencies.

@septcolor

septcolor commented Nov 4, 2019

FWIW, D, Nim, and Chapel (as well as Rust and Go) are also compiled languages, each of which has its own package repository. Examples of major registries include Dub for D, Crates for Rust, Gopm for Go, Hackage for Haskell, and so on.

Personally, I think this kind of package registry does not (necessarily) have to be "in the standard", but it would be very nice if such a repository allowed users to find "good" packages and install them efficiently with the least trouble. Ideally, a package registry (or manager) provides a search mechanism for candidate packages, shows the maintenance level explicitly (e.g. by showing validation/test results for major compilers/versions), shows dependencies (e.g. third-party libraries/versions required), explicitly states the license (to facilitate open-source use), and provides feedback mechanisms such as popularity measures and issue reports.

@septcolor

septcolor commented Nov 4, 2019

As for Rust, it is not only a new language but also the "most loved" one in the StackOverflow survey (2019)
https://insights.stackoverflow.com/survey/2019#most-loved-dreaded-and-wanted
and seems even considered as a possible replacement of C/C++
(according to Microsoft)
https://visualstudiomagazine.com/articles/2019/07/18/microsoft-eyes-rust.aspx
https://msrc-blog.microsoft.com/2019/07/18/we-need-a-safer-systems-programming-language/
so I guess it may also provide a useful reference for various aspects, including package management (in comparison to more traditional languages like C++ and Java).

@everythingfunctional
Member

In order for this to work well, you would need to tie all the packages together with a standardized build tool. I've started putting the beginnings of this together in my own packages, but I haven't formalized it or properly automated the package management side of it yet. Basically, I put together a build tool that can scan the source tree and determine the dependency tree. Then I just manually add the src folder to the list in the build system and use git submodules to manage the dependencies. Take a look here and let me know what you think.
It only works if everything is in modules and doesn't deal with submodules yet. I also have extended it to work with linking in C/C++ code in one project.
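
As a rough illustration of the scanning idea described above (not the actual tool), here is a minimal Python sketch that finds `module` and `use` statements and orders files so that providers are compiled first. The regular expressions and the `build_order` helper are hypothetical and deliberately simplified: like the tool described, the sketch only handles plain modules, and it ignores submodules, continuation lines, and `!` inside string literals.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Simplified, hypothetical patterns; real Fortran parsing has more cases.
MODULE_RE = re.compile(r"^\s*module\s+(\w+)\s*$", re.IGNORECASE)
USE_RE = re.compile(r"^\s*use\b\s*(?:,\s*\w+\s*)?(?:::)?\s*(\w+)", re.IGNORECASE)


def build_order(sources):
    """Return file names ordered so that module providers precede users.

    `sources` maps a file name to its Fortran source text.
    """
    provides = {}  # module name -> file that defines it
    uses = {}      # file name -> set of module names it uses
    for name, text in sources.items():
        uses[name] = set()
        for line in text.splitlines():
            line = line.split("!", 1)[0]  # drop trailing comments
            if m := MODULE_RE.match(line):
                provides[m.group(1).lower()] = name
            elif m := USE_RE.match(line):
                uses[name].add(m.group(1).lower())
    # Keep only dependencies on modules defined in this tree; intrinsic or
    # external modules are assumed to be provided elsewhere.
    graph = {
        name: {provides[mod] for mod in mods
               if mod in provides and provides[mod] != name}
        for name, mods in uses.items()
    }
    return list(TopologicalSorter(graph).static_order())


sources = {
    "main.f90": "program main\n  use math_utils, only: add\nend program main\n",
    "math_utils.f90": (
        "module math_utils\n"
        "  use, intrinsic :: iso_fortran_env\n"
        "  use kinds\n"
        "end module math_utils\n"
    ),
    "kinds.f90": "module kinds\nend module kinds\n",
}
order = build_order(sources)
print(order)
```

A cyclic dependency would make `static_order()` raise `graphlib.CycleError`, which a real build tool would want to report as a user-facing error.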

@traversaro

Regarding your second question, there are two package managers specifically for C++: Conan and Vcpkg.

It may be relevant that there has been some effort in the past to add support for Fortran in vcpkg, although so far it has not been merged upstream:

@certik
Member Author

certik commented Apr 23, 2020

@traversaro thanks for the update!

We are developing a Fortran Package Manager (fpm) here: https://github.com/fortran-lang/fpm/, and anyone is welcome to join us. It's very much a work in progress; we will announce it once it is ready for users. If anyone wants to help us get there faster, please definitely join.

@everythingfunctional
Member

To follow up on @certik's comment, the latest developments have made fpm usable, provided you have no dependencies. Supporting dependencies is the next step. I have some vague idea about how to implement a minimal version, but I need to find a few hours to dedicate to it.

@wolfv

wolfv commented May 1, 2020

FPM is implemented in Haskell?

We are actively working on mamba (https://github.com/quantstack/mamba) again, which is becoming a complete rewrite of conda in C++ -- this will shed the dependency of conda for a Python interpreter and make it much more lightweight. In the end, you'll be able to drop a statically compiled binary on a system and use it as package manager -- and it works on Linux, Windows and OS X.
Mamba is also based upon well established dependency management libraries (libsolv, and libcurl / libarchive). So not too much NIH.

So far we're following conda's ideas very closely to make it 99% interoperable with existing conda packages and environments.

We are also toying around with the idea of adding source distribution capabilities, which would be part of mamba & the yet to be made mamba-build.

If the only thing you're missing from Conda is the ability to distribute source easily, maybe we can formulate a plan together to add this to mamba? I think yet-another language specific package manager is not the way to go (but I am not a fortran expert so there might be good reasons, which I didn't see in this thread at least).

@certik
Member Author

certik commented May 1, 2020

@wolfv thanks for getting in touch. Yes, you and I talked about this, and I also talked at length with @SylvainCorlay and discussed this exact question on your Gitter. We also discussed it with the Julia developers a few times.

FPM is still just a prototype. I started it in Rust, but I really wanted @everythingfunctional to join our effort and he already had a similar version implemented in Haskell, so I convinced @milancurcic to switch to Haskell for the prototype. For the production version, I still think it should be Rust or C++, to make it easier for people to contribute. But let's discuss that later; for the prototype it doesn't matter from the user perspective, as long as it produces a statically linked binary, which Haskell does.

About 80% of the arguments are the same for Rust as for Fortran. So let's discuss Rust, because it already has a mature ecosystem. Why couldn't Rust just use Conda? There are multiple reasons:

  • Must be fast and a simple binary / easy to distribute. Conda fails, but Mamba might fix this.
  • Must be a source distribution, in other words, build everything from source, not a binary distribution. Conda traditionally has been a binary distribution, and so a non-starter. If there was a way to make it a hybrid, as you mention, then let's discuss this more, maybe there is a way.
  • The key really is to build from source and have a robust build system from source.
  • Cargo understands the Rust default layout: https://doc.rust-lang.org/cargo/guide/project-layout.html, which makes it trivial to make a new package: just put files in the correct place on the filesystem, write a generic Cargo.toml and things will just work -- no need to write any manual build system (in CMake let's say), nor telling Cargo where files are.

In addition to these, Fortran has a few specific things:

  • It must work with any Fortran compiler (see https://fortran-lang.org/compilers/ for a full list: several open source ones and about 12 commercial ones). One cannot mix and match compilers; one must recompile everything from scratch.
  • You must be able to compile everything for your given computer with all optimizations on, to get the best performance (Fortran applications must be fast). One cannot just compile on older hardware ahead of time to ensure that things run everywhere (as Conda does), because you will lose performance.

FPM will also have Fortran-specific knowledge, such as figuring out the dependencies between modules, enforcing a proper module naming convention based on where things are in the filesystem, and enforcing a Fortran-specific layout. I don't know how that could be done with Mamba, as this is really Fortran specific.
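
To make the "naming convention based on where things are in the filesystem" idea concrete, here is a minimal Python sketch. The rule itself is purely an assumption for illustration (a file `src/<a>/<b>.f90` must define module `<a>_<b>`); the thread does not say which convention fpm would enforce, and both helper names are hypothetical.

```python
from pathlib import PurePosixPath


def expected_module_name(path, src_root="src"):
    """Derive the module name from the file's location under `src_root`.

    Hypothetical rule: join the path components (without the file
    extension) with underscores, in lower case.
    """
    rel = PurePosixPath(path).relative_to(src_root)
    return "_".join(rel.with_suffix("").parts).lower()


def check_module_name(path, declared, src_root="src"):
    """Return (ok, expected) for a module declared in the file at `path`."""
    expected = expected_module_name(path, src_root)
    # Fortran identifiers are case-insensitive, so compare in lower case.
    return declared.lower() == expected, expected


print(check_module_name("src/linalg/solvers.f90", "linalg_solvers"))
print(check_module_name("src/linalg/solvers.f90", "my_solvers"))
```

A check like this lets a tool locate the file for a `use`d module without scanning every source, which is one reason to tie module names to paths.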

Also, we want FPM to eventually become the default front end to Fortran: compiler-independent invocation (i.e. you can use a compiler of your choice, and FPM will figure out the different ways Fortran compilers are called), easy creation of new projects, all kinds of checks, automatic formatting, etc. (just as Cargo does for Rust: you don't call rustc by hand, you just call cargo).
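
One way to picture compiler-independent invocation is a small table that maps each compiler to its own spelling of common options. The Python sketch below is an assumption about how such a front end might be organized, not fpm's design; the `COMPILERS` table and `compile_command` helper are hypothetical. The flags themselves are the documented ones: gfortran uses `-J<dir>` for the module output directory and `-march=native` for host-specific tuning, while ifort uses `-module <dir>` and `-xHost`.

```python
# Hypothetical table: each entry spells the same intent in one compiler's flags.
COMPILERS = {
    "gfortran": {
        "module_dir": ["-J{dir}"],
        "optimize": ["-O3", "-march=native"],
    },
    "ifort": {
        "module_dir": ["-module", "{dir}"],
        "optimize": ["-O3", "-xHost"],
    },
}


def compile_command(compiler, source, obj, module_dir, optimize=True):
    """Build the argument list that compiles `source` into `obj`."""
    try:
        flags = COMPILERS[compiler]
    except KeyError:
        raise ValueError(f"unsupported compiler: {compiler}") from None
    cmd = [compiler, "-c", source, "-o", obj]
    # Tell the compiler where to write (and look for) .mod files.
    cmd += [arg.format(dir=module_dir) for arg in flags["module_dir"]]
    if optimize:
        cmd += flags["optimize"]
    return cmd


print(compile_command("gfortran", "src/hello.f90", "build/hello.o", "build/mod"))
print(compile_command("ifort", "src/hello.f90", "build/hello.o", "build/mod"))
```

The returned list can be passed directly to `subprocess.run`; supporting another compiler then means adding one table entry rather than touching the build logic.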

In general, we are aiming for a smooth and nice user experience, just like Cargo delivers it for Rust.

Let's discuss more if you are interested.

@wolfv

wolfv commented May 2, 2020

Thanks for the lengthy reply! I know you did your homework thoroughly :)

Regarding source distribution: I don't see anything that would prevent this in Mamba -- conda packages are (almost) just tarballs of whatever was installed into the prefix that wasn't there before. So if your build script just copies the source over to some magic directory, then I think that's totally fine.
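
The "tarball of whatever was installed into the prefix that wasn't there before" idea can be sketched in a few lines of Python. This is only an illustration of the concept under stated assumptions: real conda packages also carry metadata (the "almost" above), and the `share/fortran-src` directory and helper names here are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def snapshot(prefix):
    """Return the set of files (as relative paths) currently under `prefix`."""
    prefix = Path(prefix)
    return {p.relative_to(prefix) for p in prefix.rglob("*") if p.is_file()}


def pack_new_files(prefix, before, archive):
    """Archive the files that appeared under `prefix` since `before` was taken."""
    prefix = Path(prefix)
    new_files = sorted(snapshot(prefix) - before)
    with tarfile.open(archive, "w:bz2") as tar:
        for rel in new_files:
            tar.add(prefix / rel, arcname=rel.as_posix())
    return [rel.as_posix() for rel in new_files]


with tempfile.TemporaryDirectory() as tmp:
    prefix = Path(tmp) / "prefix"
    (prefix / "lib").mkdir(parents=True)
    (prefix / "lib" / "libold.a").write_text("pre-existing file")
    before = snapshot(prefix)

    # Stand-in for the build script: copy sources into an agreed directory.
    src_dir = prefix / "share" / "fortran-src" / "hello"
    src_dir.mkdir(parents=True)
    (src_dir / "hello.f90").write_text("program hello\nend program hello\n")

    archive = Path(tmp) / "hello-0.1.0.tar.bz2"  # kept outside the prefix
    packed = pack_new_files(prefix, before, archive)
    with tarfile.open(archive) as tar:
        names = tar.getnames()

print(packed)
```

Only the newly copied source file ends up in the archive; the pre-existing library is left out because it was already in the snapshot.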

I understand that it's nice to have the build system and the package manager integrated tightly. In my opinion those are two slightly different roles.

We definitely want to do a conda-compatible mamba-build as well which should be much faster.
With conda-build or mamba-build nothing prevents you today from adding a package fortran-build-scripts that contains some shell scripts, depends on cmake etc. so that building Fortran packages becomes a one-liner in the meta.yaml. Here is a sample meta.yaml (for others, that's how one expresses dependencies and build steps in conda):

package:
  name: my_super_fortran_pkgs
  version: 0.12.2

source:
  url: https://.../download.tar.gz

build:
  script: fcomp -DSOME_ARG -MHELLO_WORLD

requirements:
  build:
    - my_fortran_buildscripts
    - {{ FORTRAN_IMPL }}
  host:
    - some_dependency 0.14.*

In this case, fcomp would be a shell script (or some other executable) that's part of the my_fortran_buildscripts package.
I have a conda-forge enhancement proposal that I want to push next week that would add these kinds of build scripts to conda-forge, at least for CMake and autogen.

One other thing I want to mention: I think the API surface of mamba is somewhat cleaner. For example, here is an example of how one can use the mamba API from Python to get a solution for a set of package specs:

https://gist.github.com/wolfv/cd12bd4a448c77ff02368e97ffdf495a

So if you wanted to, you could also build on top of Mamba (and conda packages) and implement the build system as a layer on top of mamba (the same APIs shown in Python are obviously available from C++ as well). These APIs will cover everything from prefix activation to repodata downloading and then to package dependency solving and installation.

I would be incredibly excited if you decided to do this with us, and obviously I would be happy to discuss this further.

@certik
Member Author

certik commented May 2, 2020

@wolfv thanks for the reply. Yes, we would love to collaborate!

Here is what we really care about: the end user experience. Here is our initial tutorial that explains how to use fpm:

https://github.com/fortran-lang/fpm/blob/ed5dd080d45ea4a409e63a5f9b2ff26f1d82d2db/PACKAGING.md

Everything in there already works with the current fpm, but obviously fpm is still a prototype. As I mentioned, it is heavily inspired by Cargo, so if you want to play with a good, well-designed production tool, play with Cargo a little bit.

We are completely open about the underlying technology, but we really care about the end user experience, which we want to be exactly (or very close to) what is in the above PACKAGING.md document. The key part is that users just write a simple fpm.toml file:

name = "hello"
version = "0.1.0"
license = "MIT"
author = "Jane Programmer"
maintainer = "jane@example.com"
copyright = "2020 Jane Programmer"

and fpm figures out how to build the project from the file layout (the same idea as Cargo). It knows how to build the application / executable (if present), library (if present) and tests / benchmarks (in the future).
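
As a sketch of what "figuring out how to build the project from the file layout" could look like, the Python snippet below inspects a project directory and reports which targets are present. The directory names (`src/` for the library, `app/` for the executable, `test/` for tests) and the `detect_targets` helper are assumptions made for illustration; the PACKAGING.md linked above is the authority on the layout fpm actually expects.

```python
import tempfile
from pathlib import Path

# Assumed convention, for illustration only: target kind -> directory name.
LAYOUT = (("library", "src"), ("executable", "app"), ("tests", "test"))


def detect_targets(project_root):
    """Return a mapping of target kind -> sorted list of source files found."""
    root = Path(project_root)
    targets = {}
    for kind, subdir in LAYOUT:
        directory = root / subdir
        if not directory.is_dir():
            continue  # this kind of target is simply absent
        files = sorted(p.relative_to(root).as_posix()
                       for p in directory.rglob("*.f90"))
        if files:
            targets[kind] = files
    return targets


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "app").mkdir()
    (root / "src" / "hello.f90").write_text("module hello\nend module hello\n")
    (root / "app" / "main.f90").write_text("program main\nend program main\n")
    targets = detect_targets(root)

print(targets)
```

Because the layout carries the information, the manifest can stay as short as the one above: nothing in it has to list source files or targets.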

So for example, we do want fpm to be able to generate a Conda package, in fact we already have an issue for it: fortran-lang/fpm#70

In there the easiest would be to simply call fpm from meta.yaml. That's similar to what you mentioned in your last comment.

Once we have dependency management (we'll start working on that very soon) there might be a way to link with mamba to help out there too.

@wolfv

wolfv commented May 5, 2020

What I am proposing is to use mamba as the tool to do everything related to "build-environment" and dependency management, as well as installing third party dependencies (or sources) into the environment.

I believe you could already achieve that with what we have in mamba today:

You can define some dependencies and install them into a build environment, then activate the environment (prefix) and build your package in that context.
We might have to think about how we can do source packages well in mamba / as conda packages but I am convinced that there are great solutions out there that don't require a lot of work to get done.

Do you guys have some sort of regular meeting / video chat? I would be happy to drop by to see how we could work together if you're interested.

@wolfv

wolfv commented May 5, 2020

This is the basic mamba CLI right now, which can create new prefixes based on conda packages: https://gist.github.com/wolfv/4827a7c18ffae89242cbc46ddf012b4e

@certik
Member Author

certik commented May 5, 2020

@milancurcic literally just yesterday suggested to have a video chat. @milancurcic would you have time to set it up with @wolfv, @everythingfunctional and others? Let's brainstorm this.

Honestly, using Conda for non-Fortran dependencies, especially on macOS and Windows, would really make the user experience awesome. Things like HDF5 are notoriously slow to install, and just being able to install a binary would go a long way. For Fortran stuff I think we still want to build it ourselves, but let's brainstorm. I think there is a huge opportunity for collaboration.

@milancurcic
Member

Do you guys have some sort of regular meeting / video chat? I would be happy to drop by to see how we could work together if you're interested.

@wolfv Great, thank you, I appreciate your time! I sent an email.

@odiferousmint

odiferousmint commented Apr 11, 2021

Just because no one has mentioned it yet: OCaml has a pretty great package manager too, called opam: https://opam.ocaml.org/ (https://github.com/ocaml/opam). In all fairness, I never built it from source, I just grab the binary, which is all you need to run opam. For the curious, on my current system ldd /usr/bin/opam prints:

	linux-vdso.so.1 (0x00007ffddfc59000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f8c17e81000)
	libglpk.so.40 => /lib/x86_64-linux-gnu/libglpk.so.40 (0x00007f8c17ba2000)
	libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f8c17b8f000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f8c17b73000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8c17a24000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8c17a1e000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f8c17a01000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8c1780f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f8c1895f000)
	libcolamd.so.2 => /lib/x86_64-linux-gnu/libcolamd.so.2 (0x00007f8c17806000)
	libamd.so.2 => /lib/x86_64-linux-gnu/libamd.so.2 (0x00007f8c177fb000)
	libltdl.so.7 => /lib/x86_64-linux-gnu/libltdl.so.7 (0x00007f8c177f0000)
	libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f8c1776c000)
	libsuitesparseconfig.so.5 => /lib/x86_64-linux-gnu/libsuitesparseconfig.so.5 (0x00007f8c17765000)

There is also a popular build system developed by Jane Street: https://opam.ocaml.org/packages/dune/.
