How to implement same procedures for different numeric kinds #35

Closed
milancurcic opened this issue Dec 22, 2019 · 33 comments

Comments

@milancurcic
Member

milancurcic commented Dec 22, 2019

This question comes up in #34 and elsewhere. How to implement specific procedures that work on different kinds (sp, dp, qp, int8, int16, int32, int64) as well as characters, where the body of the procedure is the same (can be copy/pasted entirely without breaking it). Let's first just focus on this scenario, and we can consider more complex cases later.

I know of a few approaches:

  1. Repeat the code, that is, implement all specific procedures explicitly. That's what I did in functional-fortran, see https://github.com/wavebitscientific/functional-fortran/blob/master/src/lib/mod_functional.f90. Repeating is fine if you do it once and forget about it. The upside is that you can see the specific code and it needs no extra tooling. The downside is combinatorial explosion if you have procedures that are to handle all combinations of types and kinds. Most procedures are rather simple (one or two arguments), and I ended up with > 3K lines of code for 23 generic procedures. Most work was in editing the argument types to specific procedures, and less work was in copy/pasting of the repeatable content. I don't recommend this approach for stdlib.

  2. Approach 1 can be somewhat eased by explicitly typing out the interfaces, and using #include 'procedure_body.inc', defined in a separate file. Then your procedure body collapses to one line. This reduces the total amount of code, but not so much the amount of work needed, as most work is in spelling out the interfaces. This approach still doesn't need extra tooling as a C preprocessor comes with all compilers that I'm aware of.

  3. Use a custom preprocessor or templating tool. For example, take a function that returns the set of unique elements of an array:

pure recursive function set(x) result(res)
  integer, intent(in) :: x(:) !! Input array
  integer, allocatable :: res(:)
  if(size(x) > 1)then
    res = [x(1), set(pack(x(2:), .not. x(2:) == x(1)))]
  else
    res = x
  endif
end function set

A template could look like this:

pure recursive function set(x) result(res)
  {int*, real*}, intent(in) :: x(:) !! Input array
  {int*, real*}, allocatable :: res(:)
  ... ! body omitted for brevity
end function set

or similar, where the custom preprocessor would spit out specific procedures for all integer and real kinds. Some additional or alternative syntax would be needed if you wanted all combinations of type kinds between arguments.
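To make the goal concrete, here is a minimal hand-written sketch of the kind of output such a tool might produce for two of the kinds. The module name `set_specifics` and the `set_int32`/`set_real32` naming scheme are assumptions for illustration only, not the output of any particular tool.

```fortran
! Hand-written stand-in for generated code; all names are illustrative.
module set_specifics
  use iso_fortran_env, only: int32, real32
  implicit none
  private
  public :: set

  ! One generic name that dispatches to one specific per type and kind.
  interface set
    module procedure set_int32
    module procedure set_real32
  end interface set

contains

  pure recursive function set_int32(x) result(res)
    integer(int32), intent(in) :: x(:) !! Input array
    integer(int32), allocatable :: res(:)
    if (size(x) > 1) then
      ! Keep the first element, then recurse on the rest without its duplicates.
      res = [x(1), set_int32(pack(x(2:), .not. x(2:) == x(1)))]
    else
      res = x
    end if
  end function set_int32

  pure recursive function set_real32(x) result(res)
    real(real32), intent(in) :: x(:) !! Input array
    real(real32), allocatable :: res(:)
    if (size(x) > 1) then
      res = [x(1), set_real32(pack(x(2:), .not. x(2:) == x(1)))]
    else
      res = x
    end if
  end function set_real32

end module set_specifics

program demo_set
  use iso_fortran_env, only: int32, real32
  use set_specifics, only: set
  implicit none
  ! The generic name resolves to the matching specific at compile time.
  print *, set([1_int32, 2_int32, 2_int32, 3_int32, 1_int32])
  print *, set([1.0_real32, 1.0_real32, 2.5_real32])
end program demo_set
```

The two bodies differ only in the declared type and in the name used for the recursive call, which is exactly the part a template would parameterize.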

There may be tools that do this already, and I think @zbeekman mentioned one that he uses. In general, for stdlib I think this is the way to go because we are likely to see many procedures that support multiple arguments with inter-compatible type kinds. The downside (strong downside IMO) is that we're likely to introduce a tool dependency that also depends on another language. If the community agrees, we can use this thread to review existing tools and which would be most fitting for stdlib.

Let's say we pick a tool to do the templating for us, we have two choices:

a) Have user build specifics from templates. In this scenario, the user must install the templating tool in order to build stdlib. I think we should avoid this.
b) Use the templating tool as developers only, and maintain the pre-built specifics in the repo. This means that when we're adding new code that will work on many type kinds, we use the tool on our end to generate the source, and commit that source to the repo (alongside the templates in a separate, "for developers" directory).

Assuming we can find a fitting tool, I'm in favor of the 3b approach here. There may be other approaches I'm not aware of or forgot about. What do you think and any other ideas?

@certik
Member

certik commented Dec 22, 2019

Actually I propose 3c).

c) The git repository contains the templated code, depends on a 3rd party tool, and does not contain any autogenerated files. Release tarballs are then created automatically on a CI. A release tarball contains all the necessary generated files, its only dependencies are cmake (or make) and a Fortran compiler, and it does not contain the git history, the templated files, or any CI files and other things that are not needed to actually build the library. Users, as well as distributions (Debian, Ubuntu, Homebrew, Conda, Spack, etc.) only use the tarball, not the git repository.

I follow exactly this approach with LFortran, and it seems to work great. The advantage is that the git repository does not have autogenerated files, which greatly simplifies PRs (a simple diff versus hundreds of lines of modified autogenerated files) and makes it obvious how things should be modified --- so that people who want to contribute do not accidentally modify the autogenerated files, or forget to generate the files. Rather, the files are automatically generated using a CI, so they are always generated correctly.

@milancurcic
Member Author

Great! I didn't think of this and indeed it seems to me like the best way to go.

@zbeekman
Member

The best templating tool I have found is Jin2For. It uses Jinja2 templating, which people may be familiar with from web technology stuff and which is the templating back end for FORD. These are Python based, and Python is a language that Fortran developers are likely to be familiar with. It can auto-generate default type aliases, kind info, and declarations by querying the compiler and enumerating ISO_Fortran_ENV's real_kinds, integer_kinds, logical_kinds and character_kinds arrays.

By providing a generic implementation for each intrinsic kind (where it is sensible to do this) the user doesn't need to care about setting dp, sp, rk, or whatever other convention you have for selecting kinds. Things just work™️. The downside is that compilers are not required to support all kinds, so you end up generating code specific to the kinds that a given compiler supports. This is not necessarily a bad thing, but it means that you may want to distribute different source versions tailored to different compilers. Since Fortran doesn't have a standard/interoperable ABI this is not really an issue at all, IMO.
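As a rough illustration of why the generated code ends up compiler-specific, the following sketch prints the kind arrays that the Fortran 2008 `iso_fortran_env` module exposes. The program name is arbitrary, and the values printed depend entirely on the compiler used, so no output is shown.

```fortran
! Print the kinds the current compiler supports (Fortran 2008 constants).
program list_kinds
  use iso_fortran_env, only: integer_kinds, real_kinds, logical_kinds, &
                             character_kinds, compiler_version
  implicit none

  print '(a)', compiler_version()
  ! Each array holds the kind values available for that intrinsic type.
  print '(a, *(1x, i0))', 'integer kinds:  ', integer_kinds
  print '(a, *(1x, i0))', 'real kinds:     ', real_kinds
  print '(a, *(1x, i0))', 'logical kinds:  ', logical_kinds
  print '(a, *(1x, i0))', 'character kinds:', character_kinds
end program list_kinds
```

Running this with two different compilers is a quick way to see whether specifics generated from one compiler's kind list would be accepted by the other.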

@milancurcic
Member Author

I reviewed jin2for and I like its simplicity (a minimal and to the point tool) and the fact that it uses an existing templating language rather than inventing a new one. I think it's a good candidate.

@certik
Member

certik commented Dec 22, 2019

Let's try to use jin2for and see how it goes.

In general, I think the Fortran language itself should make it easier to write subroutines that operate on different kinds. This is something I would love to experiment with in LFortran in the future, and using jin2for is a solid starting point -- the future goal would be to simplify the syntax using (future) Fortran features.

@jvdp1
Member

jvdp1 commented Dec 23, 2019

I gave a try to jin2for with loadtxt/savetxt (see https://github.com/jvdp1/stdlib/blob/loadtxt_autogen/src/stdlib_experimental_io.F90 and other tests/loadtxt/*.F90 files).

I am not sure how it should be done with cases that involve both integer and real kinds. Should the pre-defined templates be used? If yes, how to use kinds defined as sp, dp, qp?

Anyway, jin2for seems to be a nice and useful tool, and the option 3c proposed by @certik seems to be a good approach (not implemented in my branch).

@marshallward

marshallward commented Dec 23, 2019

What are the disadvantages of using CPP for this? I am worried about deepening the dependence on external tools, which can hinder portability.

CPP is almost always available and often baked into the compiler (I think it's literally a library inside gfortran). Another advantage of CPP is that the compiler is often aware of the step, and debugging can point directly to the template file, rather than to a copy placed in some scratch directory of which the user is unaware.

We've used it in FMS for this task without much issue. Readability and debugging are the only major drawbacks, but this would be true for any templating approach.

(I'm on the road right now but can supplement with links when I get a chance.)

@milancurcic
Member Author

As far as I understand CPP is more limited in what can be done with it. I'm surprised that you could do this with CPP alone.

In the scenario 3c, the external tool is required only of stdlib developers and not of end users, so I don't see much of a portability issue.

@marshallward

This post outlines what we can do with FMS with CPP templating:

j3-fortran/fortran_proposals#4 (comment)

@milancurcic
Member Author

Thanks @marshallward, I just read the comment and the sources you linked and I agree, it does seem quite bloated and is likely to get more complicated when considering different combinations of argument types and kinds.

I think this illustrates well the downsides -- with CPP we can't loop, but only define/undefine macros and branch. There may be more esoteric stuff to it, but this is what I've seen.

@jvdp1 I looked at your templates and the code is quite clean and readable to my eyes. I like it. We probably shouldn't use the .F90 suffix here -- .F90 is still a valid Fortran source file, whereas these templates aren't. I think jin2for suggests .t90 suffix for templates.

@jvdp1
Member

jvdp1 commented Dec 23, 2019

I tried to implement the IO module with CPP template (see https://github.com/jvdp1/stdlib/tree/loadtxt_cpp/src ).
Honestly, it was easier for me to use CPP (it passed the CI) than jin2for. I could also extend the IO module to integers using CPP. For these simple subroutines, using CPP is easy to implement. But using CPP could become quite difficult when combining multiple options.

> @jvdp1 I looked at your templates and the code is quite clean and readable to my eyes. I like it. We probably shouldn't use the .F90 suffix here -- .F90 is still a valid Fortran source file, whereas these templates aren't. I think jin2for suggests .t90 suffix for templates.

@milancurcic I followed the syntax implemented by @zbeekman in one of his libraries. I agree that the .t90 suffix might be better.

@certik
Member

certik commented Dec 23, 2019

@jvdp1 thanks a lot for implementing both approaches. Here they are, side by side:

It seems the jin2for version is a lot shorter. Am I right? Was it more difficult to implement because it is new, but as we (now) have an example how to use it, it will be perhaps even easier than CPP?

@jvdp1
Member

jvdp1 commented Dec 23, 2019

@certik jin2for is indeed less verbose, and I think less error prone in this example.
jin2for was more difficult because it was new (e.g., I couldn't extend the subroutines to support integers (as I did with CPP), but I didn't try hard to find the solution; @zbeekman may have some hints).
Both approaches have pros and cons (e.g., CPP passed the CI without any change to it, while it was not the case for jin2for).

@certik
Member

certik commented Dec 23, 2019

@jvdp1 we have to update our CI to support jin2for obviously. I would not hold it against it. :)

@marshallward

marshallward commented Dec 24, 2019

I can see the advantages of jin2for, most notably iteration, and think option 3c addresses my concerns about portability. I also agree that the files should use the .t90 suffix, or at least not .[fF]90.

I had resisted Jinja2 integration in another project, because I was concerned that the Jinja2 tokens may clash with the native file's own tokenisation (usually various config files); Jinja2's syntax was designed to safely work with HTML and not much else. But I also wondered if I was being too conservative.

Are there any known limitations to using Jinja2 on Fortran markup, such as token mix ups?

@certik
Member

certik commented Dec 24, 2019

Does Fortran use { for anything? The combination {% is almost for sure safe. And if there is some possibility of a clash, I think we can tackle it on a case by case basis by rewriting things appropriately.

@certik
Member

certik commented Dec 24, 2019

Down the road, I would like to prototype some of this limited templated functionality into LFortran and then propose it for the Fortran language itself. jin2for is a good start, as the code looks pretty nice. If Fortran language was extended, then the syntax would get even better perhaps. And LFortran could in the future be used instead of jin2for to do the rewrite, until all compilers support it.

@nshaffer
Contributor

I was really hoping that lexical macro processing would get re-introduced into this upcoming Fortran standard. In fact, a fleshed-out macro processing specification was already given in Fortran 2008 drafts (e.g., https://j3-fortran.org/doc/year/07/07-007.pdf) but was dropped then. It was also considered for the upcoming standard as a means of supporting generic programming. But we know how that went.

After templating/macro processing was forgone for F2020, I looked into m4 as a solution for my own generic-interface-producing needs, since it is the tool gfortran uses to generate specific implementations of generic intrinsics. Its strengths are its power and that it is a standard POSIX utility. It has been fun to learn, but I do not think it is a good solution for a standard library. POSIX standardization is not enough to compensate for the fact that it's really hard for a 21st century programmer to grok, and I think it will lead to heavy technical debt. The other downside is that it's hard to make m4 programs look like marked-up Fortran source, so it could be tricky to "port" the m4 workflow to a hypothetical standard macro/templating scheme (assuming J3 ever produces one).

@jvdp1
Member

jvdp1 commented Dec 30, 2019

I slightly modified the CPP implementation (https://github.com/jvdp1/stdlib/tree/loadtxt_cpp) to clarify a few things, and renamed the files .F90 to .t90 (https://github.com/jvdp1/stdlib/tree/loadtxt_autogen).
These 2 options seem to be the most acceptable among all proposed. Should we make a choice now?

@nshaffer
Contributor

One thing I don't understand about the cpp approach shown is what it does better than, e.g.,

module ex
    use iso_fortran_env, only: real32, real64, int32, int64
    implicit none

    interface foo
        module procedure foo_real32
        module procedure foo_real64
        module procedure foo_int32
        module procedure foo_int64
    end interface foo

contains

    ! The specific must carry the name listed in the interface block above.
    function foo_real32(x) result(y)
        real(real32), intent(in) :: x
        real(real32) :: y
        include "foo.inc"
    end function foo_real32

! etc.       
end module ex

The main downside of this approach is the repetition of each subroutine "skeleton" and the need to manually populate the interface blocks, but the cpp example has those same issues at the cost of introducing a foreign (albeit well-supported) program. I see cpp as having the worst of both worlds: it's an external tool, but it's not significantly more powerful (in this application) compared with the technique above. Of course, this assessment is invalid if I've overlooked some cpp technique that's not used in the examples posted so far.

@jvdp1
Member

jvdp1 commented Dec 30, 2019

> One thing I don't understand about the cpp approach shown is what it does better than, e.g.,

With this proposed scenario, 1 file per skeleton would be needed (if I understand well your proposition), while with the CPP approach, all skeletons could be included in 1 same file. CPP is an external tool, but it is well supported by most compilers. However, I don't appreciate the use of additional .inc files in both approaches.

The jin2for approach also requires an external tool. While no additional files were used in my example, jin2for might require them for more complex implementations.

However, if we use a strategy as described by @certik where end users and distributions only use tarballs automatically generated by a CI, using an external tool should not be a problem.

@nshaffer
Contributor

> > One thing I don't understand about the cpp approach shown is what it does better than, e.g.,
>
> With this proposed scenario, 1 file per skeleton would be needed (if I understand well your proposition), while with the CPP approach, all skeletons could be included in 1 same file. CPP is an external tool, but it is well supported by most compilers. However, I don't appreciate the use of additional .inc files in both approaches.

I didn't show it, but you would include "foo.inc" for each type you want to implement. This works as long as the contents of "foo.inc" are actually type-generic. I think this is totally equivalent to the cpp approach. I also dislike the disembodied ".inc" files, but it is the most economical approach the standard gives us right now.

> The jin2for approach also requires an external tool. While no additional files were used in my example, jin2for might require them for more complex implementations.
>
> However, if we use a strategy as described by @certik where end users and distributions only use tarballs automatically generated by a CI, using an external tool should not be a problem.

Agreed. This is the best-sounding approach, provided we trust the external tool will continue to be maintained until we have proper generics facilities in the standard and in compilers.

@zbeekman
Member

I haven't gone through this thread in the detail it deserves yet, but a few broad observations:

  1. Sometimes Jin2For and pre-processing via cpp or fpp (Intel) are not mutually exclusive and mixing them can be an elegant solution where simpler approaches are exceedingly complex
  2. Jin2For's compiler interrogation is great, but, for the scope of this project, we should probably limit kinds to the most reasonable/common kinds
  3. Jin2For's generic macro, looping and other capabilities make it quite powerful with or without its compiler introspection

The biggest problem with the automatically generated types is that they are very non-portable: They basically interrogate the available kinds from iso_fortran_env and then just blindly use them. So using the default aliases & kinds provided by this and generating them from GFortran may (almost certainly will, but I have yet to confirm) generate code that can't run with Intel's ifort.

My personal preference is to use Jin2For, but don't use the built in type declarations, aliases and kinds that are created from compiler introspection. Instead, for reals at least, attempt to target single, double and quad precision. CMake introspection can be used to confirm which kinds exist for a given compiler and then Jin2For can be used to generate interfaces and implementations for each kind supported by the compiler.

Otherwise, if you use the built in t.decl, t.alias, t.kind macros you will be generating code specific to the compiler being used that won't be portable.

Since the existence of various kinds is not guaranteed by the standard, much less the integer associated with each kind, this is a rather sticky situation. But I would rather loop over a list of kinds (possibly generated from CMake introspection) using Jin2For templates than contend with the awkward square peg in a round hole that is CPP and other non-standardized pre-processing. The advantages of Jin2For (or Jinja2 really...) are its widespread use in other domains (so that it is battle hardened and good enough to be popular) and the fact that it's Python based and extensible.

But, by preprocessing the code for the end user so they don't need Jin2For (unless we want to provide different pre-processed code for different compilers) you lose a non-trivial quantity of its utility. Whereas if you can stick to standardized fortran and a subset of cpp/fpp that's implemented in all major compilers then the user can do the code pre-processing themselves at configure/build time.

@certik
Member

certik commented Jan 3, 2020

The ideal solution for this would be j3-fortran/fortran_proposals#128 in my opinion. But we'll have to probably wait some time for that.

@aradi
Member

aradi commented Jan 12, 2020

As an alternative to Jin2For, you may also consider the Fypp preprocessor for generating templates. (Disclaimer: I am the main author of Fypp). It has similar loops as Jin2For and additionally also offers macros, so it could be also used for the assert macros (#72). It consists of a single (Python) source file and can be, therefore, easily shipped with the library, so that the build only requires a standard Python (2.6, 2.7 or 3.x) installation.

@milancurcic
Member Author

milancurcic commented Jan 12, 2020

I like fypp a lot. Having the author in the Fortran community is a huge plus IMO.

What do you think about taking a minimal example function and comparing the jin2for and fypp syntax next to each other? For example:

integer function sum(a, b)
  integer, intent(in) :: a
  integer, intent(in) :: b
  sum = a + b
end function sum

Requirements:

  • a and b can be any of real(sp), real(dp), real(qp), integer(int8), integer(int16), integer(int32), integer(int64), as defined in stdlib_experimental_kinds.f90;
  • Result is of whichever is the higher kind between a and b.

The preprocessed source code would result in 49 specific functions.

What would the template look like with jin2for and fypp? What would the invocation look like? Let's compare them side by side.
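For reference, here is a hand-written sketch of what three of the 49 generated specifics might look like. The generic is called `add` here rather than `sum` to avoid shadowing the intrinsic, kinds are taken directly from `iso_fortran_env` instead of stdlib_experimental_kinds.f90, and the `add_<kind>_<kind>` naming is an assumption. Only same-type pairs are shown, since the "higher kind" rule for mixed integer/real pairs is not spelled out above.

```fortran
! Hand-written stand-in for generated code; all names are illustrative.
module add_specifics
  use iso_fortran_env, only: int32, int64, real32, real64
  implicit none
  private
  public :: add

  interface add
    module procedure add_int32_int32
    module procedure add_int32_int64
    module procedure add_real32_real64
  end interface add

contains

  pure function add_int32_int32(a, b) result(res)
    integer(int32), intent(in) :: a
    integer(int32), intent(in) :: b
    integer(int32) :: res
    res = a + b
  end function add_int32_int32

  pure function add_int32_int64(a, b) result(res)
    integer(int32), intent(in) :: a
    integer(int64), intent(in) :: b
    integer(int64) :: res  ! result takes the higher of the two kinds
    res = a + b
  end function add_int32_int64

  pure function add_real32_real64(a, b) result(res)
    real(real32), intent(in) :: a
    real(real64), intent(in) :: b
    real(real64) :: res  ! result takes the higher of the two kinds
    res = a + b
  end function add_real32_real64

end module add_specifics

program demo_add
  use iso_fortran_env, only: int32, int64, real32, real64
  use add_specifics, only: add
  implicit none
  ! Each comparison checks that the result kind follows the rule above.
  print *, kind(add(1_int32, 2_int32)) == int32
  print *, kind(add(1_int32, 2_int64)) == int64
  print *, kind(add(1.0_real32, 2.0_real64)) == real64
end program demo_add
```

A template for either tool would need two nested loops over the kind list plus a rule for picking the result kind, which is the part worth comparing between jin2for and fypp.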

@nshaffer
Contributor

Here's a go at using fypp for the task. I used it in some personal projects a couple years back, but I only really know the basics. This is a pretty naive implementation. Gist

The only snag I ran into was fypp not liking constructions like

#:for i, (k, t) in enumerate(zip(KINDS, TYPES))
...
#:endfor

Not sure if that's a bug or if fypp just doesn't do multi-level tuple unpacking. It's easy to work around in this case, at least.

@aradi
Member

aradi commented Jan 13, 2020

@nshaffer fypp currently does not support multi-level tuple unpacking due to technical reasons. It could be extended if this really made a huge difference in the user experience. (As it is a preprocessor, I try to keep it as simple as possible to prevent people doing something with it, which they should do in Fortran instead 😉)

Your example is very neat. In some cases we may also need to loop over ranks. That would then need a simple macro and one additional loop:

#:def ranksuffix(rank)
#{if rank > 0}#(${":" + ",:" * (rank - 1)}$)#{endif}#
#:enddef

#:set ranks = range(6)
...
#:for rank in ranks
  some_type, ... :: a${ranksuffix(rank)}$
#:endfor

@nshaffer
Contributor

@aradi Cool, thanks for confirming that about tuple unpacking. I think it's not a problem overall, I just let my Python instincts take the reins. The less-fancy version I arrived at is arguably better.

@jvdp1
Member

jvdp1 commented Jan 13, 2020

@aradi @nshaffer Thank you for the examples with fypp.

fypp seems quite flexible, and I like @aradi 's example with loops over ranks (we may actually need that for functions like mean(array), variance(array), ... :) ). So I think this feature has to be considered seriously. Such a feature may be tedious to implement with cpp (and I don't know if it would be possible with jin2for).
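As a sketch of what a loop over ranks would have to produce, here are hand-written rank-1 and rank-2 specifics for a `mean` generic. The module name, the `mean_<rank>_<kind>` naming and the choice of `real64` only are assumptions for illustration; this is not the stdlib implementation.

```fortran
! Hand-written stand-in for rank-generated code; all names are illustrative.
module mean_specifics
  use iso_fortran_env, only: real64
  implicit none
  private
  public :: mean

  ! Generic resolution distinguishes the specifics by argument rank.
  interface mean
    module procedure mean_1_real64
    module procedure mean_2_real64
  end interface mean

contains

  pure function mean_1_real64(a) result(res)
    real(real64), intent(in) :: a(:)
    real(real64) :: res
    res = sum(a) / real(size(a), real64)
  end function mean_1_real64

  pure function mean_2_real64(a) result(res)
    real(real64), intent(in) :: a(:,:)
    real(real64) :: res
    res = sum(a) / real(size(a), real64)
  end function mean_2_real64

end module mean_specifics

program demo_mean
  use iso_fortran_env, only: real64
  use mean_specifics, only: mean
  implicit none
  real(real64) :: v(4), m(2,2)
  v = [1.0_real64, 2.0_real64, 3.0_real64, 4.0_real64]
  m = reshape(v, [2, 2])
  ! Same data, two ranks: both calls go through the generic name.
  print *, mean(v), mean(m)
end program demo_mean
```

The only difference between the two specifics is the dimension suffix on the dummy argument, which is what a rank-suffix macro would generate.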

An additional advantage is that the author @aradi is involved in the Fortran community.

@urbanjost

urbanjost commented Jan 13, 2020 via email

@jvdp1
Member

jvdp1 commented Jan 17, 2020

Here fypp is used to generate loadtxt for all kinds, based on @nshaffer 's example.

Earlier I also did it with cpp and jin2for. Among the three implementations, fypp is the most complete (i.e., implemented for all kinds for loadtxt).

Without any research (I could have spent more time on jin2for to extend it to integers, but it was not straightforward for me), fypp was also the easiest one for me to use. cpp may become tedious for more complex cases (e.g., for loops over ranks).

@milancurcic
Member Author

We've been using fypp for this and it's been working well enough.


8 participants