python preprocessor #791

Closed
jalvesz opened this issue Apr 9, 2024 · 11 comments
Labels
idea Proposition of an idea and opening an issue to discuss it

Comments

@jalvesz
Contributor

jalvesz commented Apr 9, 2024

Motivation

This proposal is meant to start a discussion on a replacement for the current fpm CI for stdlib. The idea is to use a Python script to preprocess stdlib before building with fpm or CMake. While CMake already has a customized fypp preprocessor, such a script could serve as a replacement using a PRE_BUILD action with add_custom_command. Also, the fpm branch currently lacks a flexible way of adding dependencies in the toml file. This proposal tries to remedy these shortcomings.

Prior Art

No response

Additional Information

#790 #787

Proposal

Say a fypp_deployment.py file is at the root of stdlib:

import os
import fypp
import argparse
from joblib import Parallel, delayed

def pre_process_toml(kargs):
    """
    Pre-process the fpm.toml
    """
    from tomlkit import table, dumps
    data = table()
    data.add("name", "stdlib")
    data.add("version", str(kargs.vmajor)+
                    "."+str(kargs.vminor)+
                    "."+str(kargs.vpatch) )
    data.add("license", "MIT")
    data.add("author", "stdlib contributors")
    data.add("maintainer", "@fortran-lang/stdlib")
    data.add("copyright", "2019-2021 stdlib contributors")

    if(kargs.with_blp):
        build = table()
        build.add("link", ["lapack", "blas"] )
        data.add("build", build)

    dev_dependencies = table()
    dev_dependencies.add("test-drive", {"git" : "https://github.com/fortran-lang/test-drive", 
                                        "tag" : "v0.4.0"})
    data.add("dev-dependencies", dev_dependencies)

    preprocess = table()
    preprocess.add("cpp", {} )
    preprocess['cpp'].add("suffixes", [".F90", ".f90", ".fypp"] )
    preprocess['cpp'].add("macros", ["MAXRANK="+str(kargs.maxrank), 
                                 "PROJECT_VERSION_MAJOR="+str(kargs.vmajor),
                                 "PROJECT_VERSION_MINOR="+str(kargs.vminor),
                                 "PROJECT_VERSION_PATCH="+str(kargs.vpatch)] )
    data.add("preprocess", preprocess)

    with open("fpm.toml", "w") as f:
        f.write(dumps(data))

C_PREPROCESSED = (
    "stdlib_linalg_constants" ,
    "stdlib_linalg_blas" ,
    "stdlib_linalg_blas_aux",
    "stdlib_linalg_blas_s",
    "stdlib_linalg_blas_d",
    "stdlib_linalg_blas_q",
    "stdlib_linalg_blas_c",
    "stdlib_linalg_blas_z",
    "stdlib_linalg_blas_w",
    "stdlib_linalg_lapack",
    "stdlib_linalg_lapack_aux",
    "stdlib_linalg_lapack_s",
    "stdlib_linalg_lapack_d",
    "stdlib_linalg_lapack_q",
    "stdlib_linalg_lapack_c",
    "stdlib_linalg_lapack_z",
    "stdlib_linalg_lapack_w"
)

def pre_process_fypp(kargs):
    kwd = []
    kwd.append("-DMAXRANK="+str(kargs.maxrank))
    kwd.append("-DPROJECT_VERSION_MAJOR="+str(kargs.vmajor))
    kwd.append("-DPROJECT_VERSION_MINOR="+str(kargs.vminor))
    kwd.append("-DPROJECT_VERSION_PATCH="+str(kargs.vpatch))
    if kargs.with_qp:
        kwd.append("-DWITH_QP=True")
    if kargs.with_xqp:
        kwd.append("-DWITH_XQP=True")
    
    optparser = fypp.get_option_parser()
    options, leftover = optparser.parse_args(args=kwd)
    options.includes = ['include']
    # options.line_numbering = True
    tool = fypp.Fypp(options)

    # Define the folders to search for *.fypp files
    folders = ['src', 'test', 'example']
    # Process all folders
    fypp_files = [os.path.join(root, file) for folder in folders
              for root, _, files in os.walk(folder)
              for file in files if file.endswith(".fypp")]
    
    def process_f(file):
        source_file = file
        basename = os.path.splitext(source_file)[0]
        sfx = 'f90' if os.path.basename(basename) not in C_PREPROCESSED else 'F90'
        target_file = basename + '.' + sfx
        tool.process_file(source_file, target_file)
    
    Parallel(n_jobs=kargs.njob)(delayed(process_f)(f) for f in fypp_files)
    
    return

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Preprocess stdlib source files.')
    # fypp arguments
    parser.add_argument("--vmajor", type=int, default=0, help="Project Version Major")
    parser.add_argument("--vminor", type=int, default=4, help="Project Version Minor")
    parser.add_argument("--vpatch", type=int, default=0, help="Project Version Patch")

    parser.add_argument("--njob", type=int, default=4, help="Number of parallel jobs for preprocessing")
    parser.add_argument("--maxrank",type=int, default=7, help="Set the maximum allowed rank for arrays")
    # 0/1 switches: type=bool would turn any non-empty string, even "0", into True
    parser.add_argument("--with_qp",type=int, choices=(0, 1), default=0, help="Include WITH_QP in the command")
    parser.add_argument("--with_xqp",type=int, choices=(0, 1), default=0, help="Include WITH_XQP in the command")
    # external libraries arguments
    parser.add_argument("--with_blp",type=int, choices=(0, 1), default=0, help="Link against OpenBLAS")

    
    args = parser.parse_args()

    pre_process_toml(args)
    pre_process_fypp(args)

Example:

python fypp_deployment.py --with_blp 1

Would produce an fpm.toml:

name = "stdlib"
version = "0.4.0"
license = "MIT"
author = "stdlib contributors"
maintainer = "@fortran-lang/stdlib"
copyright = "2019-2021 stdlib contributors"

[build]
link = ["lapack", "blas"]

[dev-dependencies]
[dev-dependencies.test-drive]
git = "https://github.com/fortran-lang/test-drive"
tag = "v0.4.0"

[preprocess]
[preprocess.cpp]
suffixes = [".F90", ".f90", ".fypp"]
macros = ["MAXRANK=7", "PROJECT_VERSION_MAJOR=0", "PROJECT_VERSION_MINOR=4", "PROJECT_VERSION_PATCH=0"]
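
A note on the 0/1 options used in the example above, as a minimal sketch: argparse applies the `type` callable to the raw string, and Python's `bool` treats every non-empty string (including "0") as true, so a small converter is one way to keep the `--with_blp 1` style. The helper name `str2bool` and the accepted spellings are assumptions made for illustration, not part of the proposed script.

```python
import argparse


def str2bool(value):
    """Convert a textual truth value to bool (hypothetical helper)."""
    text = value.strip().lower()
    if text in ("1", "true", "yes", "on"):
        return True
    if text in ("0", "false", "no", "off"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean value, got %r" % value)


def build_parser():
    """Build a parser with a single 0/1 style option."""
    parser = argparse.ArgumentParser(description="Boolean option sketch")
    parser.add_argument("--with_blp", type=str2bool, default=False,
                        help="Link against external BLAS/LAPACK")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    for argv in (["--with_blp", "1"], ["--with_blp", "0"], []):
        print(argv, parser.parse_args(argv).with_blp)
```

An alternative would be `action="store_true"`, at the cost of changing the command line to a bare `--with_blp`.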

Current limitations

  • In this first example the fypp files are being preprocessed into .f90 files at the same location. Should they go somewhere else?
@jalvesz jalvesz added the idea Proposition of an idea and opening an issue to discuss it label Apr 9, 2024
@jalvesz
Contributor Author

jalvesz commented Apr 10, 2024

@jvdp1 @perazz first draft to follow-up on the discussion in #790

@jalvesz
Contributor Author

jalvesz commented Apr 10, 2024

I've updated the script: it now runs in parallel and takes the C-preprocessed files into account by giving them the .F90 suffix.
Running fpm build or fpm test from the root now works, since an fpm.toml is available.

  • Where should processed files end up if this strategy is to be adopted? (ok to leave them next to fypp files or a dedicated subfolder should be provisioned?)
  • When running the tests there are a bunch of files created that end up in the root... maybe a default dump directory should be set for all tests?
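
The suffix rule and the parallel loop described in this comment can be sketched without fypp or joblib installed. This is only an illustration under stated assumptions: `C_PREPROCESSED` is shortened to two names, `process_one` is a hypothetical stand-in for the real call to fypp, and a standard-library thread pool replaces joblib purely to keep the sketch dependency-free (for CPU-bound preprocessing, a process pool such as joblib's default backend is likely the better choice).

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Shortened, illustrative subset of the modules that still need the C preprocessor.
C_PREPROCESSED = ("stdlib_linalg_constants", "stdlib_linalg_blas")


def target_path(source_file, c_preprocessed=C_PREPROCESSED):
    """Return the output path next to the source: .F90 if C-preprocessed, else .f90."""
    stem = os.path.splitext(source_file)[0]
    suffix = ".F90" if os.path.basename(stem) in c_preprocessed else ".f90"
    return stem + suffix


def process_all(files, process_one, njob=4):
    """Call process_one(source, target) for every file using njob worker threads."""
    with ThreadPoolExecutor(max_workers=njob) as pool:
        futures = [pool.submit(process_one, f, target_path(f)) for f in files]
        # result() re-raises any exception raised inside a worker.
        return [future.result() for future in futures]
```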

@perazz
Contributor

perazz commented Apr 11, 2024

I believe a great starting point would be for the preprocessing script to let both CMake and the stdlib-fpm build achieve exactly the same output as they have now. In other words, it would be the single source to modify when changes are introduced to the folder structure (removing the customized commands from fpm-deployment).

@jalvesz
Contributor Author

jalvesz commented Apr 11, 2024

Totally agree! Adding such a script to CMake could look something like:

add_custom_command(
    TARGET stdlib
    PRE_BUILD 
    COMMAND python ${CMAKE_CURRENT_SOURCE_DIR}/fypp_deployment.py)

What I haven't figured out (tried) yet is if we could do something like

fpm build fypp_deployment.py [--options]

To have the fpm command line profit from it.

@jvdp1
Member

jvdp1 commented Apr 12, 2024

I think I misunderstand the aim of this script. So my questions may sound strange.

Totally agree! Adding such a script to CMake could look something like:

Would this replace the call to fypp inside CMake?

What I haven't figured out (tried) yet is if we could do something like

fpm build fypp_deployment.py [--options]

To have the fpm command line profit from it.

Since the script fypp_deployment.py generates the fpm.toml file, how would such a strategy work? I guess fypp_deployment.py should be run first, followed by fpm?

My script could be used as fpm build --compiler fypp_gfortran.py and would use the content of the existing fpm.toml to preprocess the files accordingly. However, fypp_deployment.py has a different aim, right (e.g., to generate the files in the stdlib-fpm branch)?

@jalvesz
Contributor Author

jalvesz commented Apr 12, 2024

Your question is totally valid:

Would this replace the call to fypp inside CMake?

It could. I'm starting to test this approach with a smaller project and it works: I can use the same script to preprocess the fypp files before compiling with fpm or building with CMake. It is up to us to decide whether that would be a good solution. I'm starting to think it is, but I would like to hear your opinion.

What I honestly find extremely satisfying is that on Windows, with Visual Studio (MVS), this PRE_BUILD command is registered, so when I click on rebuild for the project it gets launched. I can then focus on editing the fypp files and have everything running smoothly. With VSCode, I'll just use the fpm CLI with two commands.

Since the script fypp_deployment.py generates the fpm.toml file, how would such a strategy work? I guess fypp_deployment.py should be run first, followed by fpm?

Yes, right now I use it in two steps: first do all the preprocessing, then build. My next target was to incorporate what you have done with your script so that it also covers the building step (combining both ideas). But before going there, I wanted to get some feedback.

One of the limitations that has been discussed with the fpm branch and the fpm.toml is that it is static: there is no way of conditionally linking against OpenBLAS, MKL or other libraries. Other conditional options might be missing as well, for instance the maxrank. Regenerating the manifest file with the script could give fpm the same flexibility as CMake.
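
As a rough illustration of what regenerating the manifest buys, the sketch below emits a `[build]` table only when link libraries are requested and lets the rank be chosen per run. It uses plain string formatting instead of tomlkit so that it runs without extra packages; the function name `manifest_text` and its parameters are hypothetical, and the manifest is cut down to a few keys.

```python
def manifest_text(version="0.4.0", link_libs=(), maxrank=7):
    """Return a reduced fpm.toml as text, with an optional [build] link table."""
    lines = [
        'name = "stdlib"',
        'version = "%s"' % version,
    ]
    if link_libs:
        # Only emitted when the caller asks for external libraries.
        quoted = ", ".join('"%s"' % lib for lib in link_libs)
        lines += ["", "[build]", "link = [%s]" % quoted]
    lines += ["", "[preprocess.cpp]", 'macros = ["MAXRANK=%d"]' % maxrank]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(manifest_text(link_libs=("lapack", "blas"), maxrank=4))
```

Which library names to pass for a given BLAS/LAPACK provider is left to the caller here; the sketch does not try to encode that choice.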

I've made a single script to show the different steps, but as you can see, there are already two steps that could be decoupled: pre_process_toml(args) and pre_process_fypp(args). A third step, process or build, could be added to complete the process.

@jalvesz
Contributor Author

jalvesz commented Apr 13, 2024

UPDATE
Here is an updated version that manages three steps, pre_process_toml, pre_process_fypp and fpm_build, which makes it possible to do something like the following from the root of the main branch of stdlib:

python fypp_deployment.py --with_qp 1 --flag "-O3 -march=native"

And it will: 1. regenerate the fpm.toml, 2. preprocess all the fypp files with quad precision enabled, creating a .f90 or .F90 file right next to each original, 3. build stdlib using the user flags.

For CMake, only pre_process_fypp is needed if we replace the fypp preprocessing there. I'm already thinking about splitting this into 3 modules.

@jvdp1 I tried to adapt what you did for building. I wanted to make it compiler independent, while still allowing the compiler to be predefined... I'm still a bit at a loss with all the possibilities of the fpm CLI, to see how I could make the script callable from it and ensure the 3 steps, instead of doing what I did here, which is to call fpm build from within using subprocess ...

import os
import fypp
import argparse
from joblib import Parallel, delayed

def pre_process_toml(args):
    """
    Pre-process the fpm.toml
    """
    from tomlkit import table, dumps
    data = table()
    data.add("name", "stdlib")
    data.add("version", str(args.vmajor)+
                    "."+str(args.vminor)+
                    "."+str(args.vpatch) )
    data.add("license", "MIT")
    data.add("author", "stdlib contributors")
    data.add("maintainer", "@fortran-lang/stdlib")
    data.add("copyright", "2019-2021 stdlib contributors")

    if(args.with_blp):
        build = table()
        build.add("link", ["lapack", "blas"] )
        data.add("build", build)

    dev_dependencies = table()
    dev_dependencies.add("test-drive", {"git" : "https://github.com/fortran-lang/test-drive", 
                                        "tag" : "v0.4.0"})
    data.add("dev-dependencies", dev_dependencies)

    preprocess = table()
    preprocess.add("cpp", {} )
    preprocess['cpp'].add("suffixes", [".F90", ".f90", ".fypp"] )
    preprocess['cpp'].add("macros", ["MAXRANK="+str(args.maxrank), 
                                 "PROJECT_VERSION_MAJOR="+str(args.vmajor),
                                 "PROJECT_VERSION_MINOR="+str(args.vminor),
                                 "PROJECT_VERSION_PATCH="+str(args.vpatch)] )
    data.add("preprocess", preprocess)

    with open("fpm.toml", "w") as f:
        f.write(dumps(data))
    return

C_PREPROCESSED = (
    "stdlib_linalg_constants" ,
    "stdlib_linalg_blas" ,
    "stdlib_linalg_blas_aux",
    "stdlib_linalg_blas_s",
    "stdlib_linalg_blas_d",
    "stdlib_linalg_blas_q",
    "stdlib_linalg_blas_c",
    "stdlib_linalg_blas_z",
    "stdlib_linalg_blas_w",
    "stdlib_linalg_lapack",
    "stdlib_linalg_lapack_aux",
    "stdlib_linalg_lapack_s",
    "stdlib_linalg_lapack_d",
    "stdlib_linalg_lapack_q",
    "stdlib_linalg_lapack_c",
    "stdlib_linalg_lapack_z",
    "stdlib_linalg_lapack_w"
)

def pre_process_fypp(args):
    kwd = []
    kwd.append("-DMAXRANK="+str(args.maxrank))
    kwd.append("-DPROJECT_VERSION_MAJOR="+str(args.vmajor))
    kwd.append("-DPROJECT_VERSION_MINOR="+str(args.vminor))
    kwd.append("-DPROJECT_VERSION_PATCH="+str(args.vpatch))
    if args.with_qp:
        kwd.append("-DWITH_QP=True")
    if args.with_xqp:
        kwd.append("-DWITH_XQP=True")
    print(kwd)
    optparser = fypp.get_option_parser()
    options, leftover = optparser.parse_args(args=kwd)
    options.includes = ['include']
    # options.line_numbering = True
    tool = fypp.Fypp(options)

    # Define the folders to search for *.fypp files
    folders = ['src', 'test']
    # Process all folders
    fypp_files = [os.path.join(root, file) for folder in folders
              for root, _, files in os.walk(folder)
              for file in files if file.endswith(".fypp")]
    
    def process_f(file):
        source_file = file
        root = os.path.dirname(file)
        basename = os.path.splitext(os.path.basename(source_file))[0]
        sfx = 'f90' if basename not in C_PREPROCESSED else 'F90'
        target_file = root + os.sep + basename + '.' + sfx
        tool.process_file(source_file, target_file)
    
    Parallel(n_jobs=args.njob)(delayed(process_f)(f) for f in fypp_files)
    
    return

def fpm_build(unknown):
    import subprocess
    #==========================================
    # check compilers: fpm reads FPM_FC, FPM_CC and FPM_CXX from the
    # environment inherited by the subprocess, so they are only reported here
    for var in ("FPM_FC", "FPM_CC", "FPM_CXX"):
        if var in os.environ:
            print(var + "=" + os.environ[var])
    #==========================================
    # Filter out the macro definitions (not used yet).
    macros = [arg for arg in unknown if arg.startswith("-D")]
    # Filter out the include paths with -I prefix (not used yet).
    include_paths = [arg for arg in unknown if arg.startswith("-I")]
    # Filter out flags
    flags = ""
    for idx, arg in enumerate(unknown):
        if arg == "--flag" and idx + 1 < len(unknown):
            flags = unknown[idx+1]
    #==========================================
    # build with fpm; an argument list without shell=True keeps the quoted
    # flags together as a single argument on every platform
    cmd = ["fpm", "build"]
    if flags:
        cmd += ["--flag", flags]
    subprocess.run(cmd, check=True)
    return

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Preprocess stdlib source files.')
    # fypp arguments
    parser.add_argument("--vmajor", type=int, default=0, help="Project Version Major")
    parser.add_argument("--vminor", type=int, default=4, help="Project Version Minor")
    parser.add_argument("--vpatch", type=int, default=0, help="Project Version Patch")

    parser.add_argument("--njob", type=int, default=4, help="Number of parallel jobs for preprocessing")
    parser.add_argument("--maxrank",type=int, default=7, help="Set the maximum allowed rank for arrays")
    # 0/1 switches: type=bool would turn any non-empty string, even "0", into True
    parser.add_argument("--with_qp",type=int, choices=(0, 1), default=0, help="Include WITH_QP in the command")
    parser.add_argument("--with_xqp",type=int, choices=(0, 1), default=0, help="Include WITH_XQP in the command")
    # external libraries arguments
    parser.add_argument("--with_blp",type=int, choices=(0, 1), default=0, help="Link against OpenBLAS")

    args, unknown = parser.parse_known_args()
    #==========================================
    # pre process the fpm manifest
    pre_process_toml(args)
    #==========================================
    # pre process the meta programming fypp files
    pre_process_fypp(args)
    #==========================================
    # build using fpm
    fpm_build(unknown)
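
For the build step, one way to hand the leftover arguments to fpm is sketched below. The arguments are passed as a list without `shell=True`, because with a shell on POSIX only the first list item is treated as the command and the remaining items never reach fpm. The names `fpm_command` and `run_fpm` and the `dry_run` switch are hypothetical; only `--flag` is forwarded here.

```python
import subprocess


def fpm_command(unknown, action="build"):
    """Assemble the fpm argument list from arguments argparse did not recognise."""
    cmd = ["fpm", action]
    for idx, arg in enumerate(unknown):
        if arg == "--flag" and idx + 1 < len(unknown):
            # Form: --flag "-O3 -march=native"
            cmd += ["--flag", unknown[idx + 1]]
        elif arg.startswith("--flag="):
            # Form: --flag=-O3
            cmd += ["--flag", arg.split("=", 1)[1]]
    return cmd


def run_fpm(unknown, dry_run=False):
    """Run fpm with the assembled command, or just return it when dry_run is set."""
    cmd = fpm_command(unknown)
    if not dry_run:
        # check=True raises CalledProcessError if fpm exits with an error.
        subprocess.run(cmd, check=True)
    return cmd
```

Returning the command list makes the assembly easy to inspect before fpm is actually invoked.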

@jalvesz
Contributor Author

jalvesz commented Apr 20, 2024

Update: I'm tracking this in the following branch https://github.com/jalvesz/stdlib/tree/deployment

The first target I'm trying to achieve is to replace the fpm-deployment.sh script with the Python script, to generate the stdlib-fpm folder and do the fypp preprocessing on the fly. The script is in the ci folder, but it might have a better place in config?

  • python ci/fypp_deployment.py creates the stdlib-fpm folder and preprocesses on the fly.
  • python ci/fypp_deployment.py --destdir "." does an "in-place" preprocessing, so the fypp files are preprocessed at their location in the root folder.
  • python ci/fypp_deployment.py --build 1 preprocesses and then builds using fpm with the compiler defined in an environment variable, using FPM_FC = os.environ['FPM_FC'] if "FPM_FC" in os.environ else "gfortran"
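
The compiler fallback quoted above can be written a little more compactly with `dict.get`. The sketch below is an assumption about how it might be wrapped, not the code in the branch: `resolve_compiler` and `build_environment` are hypothetical names, and the environment is passed in as a parameter so the behaviour can be checked without touching the real one.

```python
import os


def resolve_compiler(environ=None, default="gfortran"):
    """Return FPM_FC from the environment, or the default when it is not set."""
    environ = os.environ if environ is None else environ
    return environ.get("FPM_FC", default)


def build_environment(environ=None):
    """Return a copy of the environment with FPM_FC always defined."""
    env = dict(os.environ if environ is None else environ)
    env["FPM_FC"] = resolve_compiler(env)
    return env


if __name__ == "__main__":
    print("compiler:", resolve_compiler())
```

The dictionary returned by `build_environment` could be passed as `env=` to `subprocess.run` when launching fpm.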

@jvdp1
Member

jvdp1 commented Apr 21, 2024

The first target I'm trying to achieve is to replace the fpm-deployment.sh script with the Python script, to generate the stdlib-fpm folder and do the fypp preprocessing on the fly. The script is in the ci folder, but it might have a better place in config?

If it is a wider use than just for the CI/CD, then I agree it should be moved to the config directory.

python ci/fypp_deployment.py creates the stdlib-fpm folder and preprocesses on the fly. python ci/fypp_deployment.py --destdir "." does an "in-place" preprocessing, so the fypp files are preprocessed at their location in the root folder.

What is the advantage of preprocessing the files in-place? Generated .f90 files won't be tracked by git and a git clean -f won't work.

@jalvesz
Contributor Author

jalvesz commented Apr 21, 2024

If it is a wider use than just for the CI/CD, then I agree it should be moved to the config directory.

Perfect! I'll move it there then

What is the advantage to preprocess the files in-place?

What I would like to achieve is being able to work on the .fypp files and launch the build process (which should trigger the preprocessing automatically) in a "one-click" style, instead of having to move between folders: basically, to do fpm build at the root folder. I could instead create src\temp and test\temp on the fly and dump the .f90 files there, with this temp subfolder ignored by git.
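
The temp-subfolder idea could look roughly like the sketch below, which maps each .fypp file to a generated file inside a `temp` directory next to it. The function names, the default `subdir` value and the fixed `.f90` suffix are assumptions for illustration; whether fpm or CMake should then pick the sources up from there is a separate question that the sketch does not settle.

```python
import os


def temp_target(source_file, subdir="temp", suffix=".f90"):
    """Map path/to/name.fypp to path/to/<subdir>/name<suffix>."""
    folder, name = os.path.split(source_file)
    stem = os.path.splitext(name)[0]
    return os.path.join(folder, subdir, stem + suffix)


def ensure_parent(path):
    """Create the directory that will hold path if it does not exist yet."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return parent


if __name__ == "__main__":
    print(temp_target(os.path.join("src", "stdlib_ascii.fypp")))
```

A single `temp/` entry in .gitignore would then keep the generated files out of version control.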

@RJaBi

RJaBi commented Apr 25, 2024

I've used this to address my issue in #796. It was really useful to have a python script as the computer I'm working on is fairly restricted (If I install fypp via pip, I can't run it via command line, but can import it into python). The 'standard' fpm-deployment.sh would hence not work for me.

My only real issue was that I had to edit the script to point directly at my fpm as it's not on my path (for similar reasons, I use a pre-compiled binary).
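
One way to avoid editing the script when fpm is not on the PATH is to let the caller supply its location. The sketch below is hypothetical: `find_fpm`, its `explicit` parameter and the `FPM_EXE` environment variable are invented for illustration and are not options of the actual script or of fpm.

```python
import os
import shutil


def find_fpm(explicit=None, environ=None):
    """Return the fpm executable: explicit path, then FPM_EXE, then a PATH lookup."""
    environ = os.environ if environ is None else environ
    # FPM_EXE is a made-up variable name used only in this sketch.
    candidate = explicit or environ.get("FPM_EXE")
    if candidate:
        return candidate
    found = shutil.which("fpm")
    if found is None:
        raise FileNotFoundError("fpm not found; pass its path explicitly")
    return found
```

The returned path would replace the bare "fpm" string in whatever command the script launches.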

I was then able to use it in my FPM project with, for example,

[dependencies]
stdlib = {path = 'temp/stdlib/stdlib-fpm/'}

in my project's fpm.toml

The script also doesn't handle Cray compilers or environment, but that's not unexpected really and FPM doesn't handle them anyway so I switched to the GCC ones.

Thanks!

@jalvesz jalvesz closed this as completed May 11, 2024