allow 'use' of a directory to create nested modules #10946

Closed
mppf opened this issue Sep 4, 2018 · 11 comments

@mppf
Member

mppf commented Sep 4, 2018

This is an alternative to #10909. It includes a proposal pulled out of #8470 which was pulled out of #7847.

The idea is to allow one to use a directory. When that happens, each .chpl file within that directory is also used, but as a submodule of a module named after the directory. This is intended to be similar to Python packages.

The idea is that the compiler can treat a directory as a module, so that all files inside of that directory would be considered nested modules.

E.g. in the example from #8470 (comment) - the user wishes to create

  module SuperLib
     submodule Helper
     submodule Impl

  module NanoLib
     submodule Impl

with the additional restriction that each submodule is stored in its own file.

How does "using" a directory enable this? Here is a sketch of the file structure:

  SuperLib/Main.chpl
    ... // code to go into module SuperLib can go here
    use Helper; // Helper methods can be used without module scoping
    .... Impl.someFunction() ...; // Impl methods still need module decorator

  SuperLib/Helper.chpl
    ... // compiler makes Helper a submodule of SuperLib
        // this happens whether or not Helper relied on the implicit module

  SuperLib/Impl.chpl
    ...

  NanoLib/Impl.chpl
    ...

  Application.chpl
    use SuperLib, NanoLib;
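
To make the mapping concrete, here is a minimal Python sketch of how a tool might turn such a directory into a module description. It is only a model of the proposal, not compiler code: the function name `module_tree` and the returned dictionary shape are hypothetical, and `Main.chpl` is the placeholder file name used in the sketch above.

```python
from pathlib import Path


def module_tree(directory: Path, top_level_file: str = "Main.chpl") -> dict:
    """Describe the module a directory would define under this proposal.

    Assumption: every .chpl file other than `top_level_file` becomes a
    submodule named after the file, and `top_level_file` (if present)
    holds the code that belongs directly in the directory's module.
    """
    submodules = sorted(
        p.stem for p in directory.glob("*.chpl") if p.name != top_level_file
    )
    return {
        "module": directory.name,
        "top_level_code": (directory / top_level_file).is_file(),
        "submodules": submodules,
    }
```

For the SuperLib/ directory above this would report the module SuperLib with submodules Helper and Impl, and for NanoLib/ the module NanoLib with the single submodule Impl.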

Two tricky bits:

  1. There should be somewhere to put top-level module code, say procedures that should be at top-level in module SuperLib. That's what the SuperLib/Main.chpl does in the above.
  2. How does the compiler handle the use Helper in SuperLib/Main.chpl? Since Helper is meant to be a nested sub-module (rather than a top-level one), the compiler needs to know to first look in SuperLib/ for Helper rather than in other paths.
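
The lookup order described in the second point can be sketched as follows. This is a hedged Python model, not the Chapel compiler's logic: `resolve_use`, `is_directory_module`, and the `MARKER` constant are hypothetical names, and the rule "a directory is a module if it contains Main.chpl" is an assumption taken from the sketch in this thread.

```python
from pathlib import Path
from typing import Optional, Sequence

MARKER = "Main.chpl"  # assumed file holding a directory module's top-level code


def is_directory_module(directory: Path) -> bool:
    """A directory counts as a module here if it contains the marker file."""
    return (directory / MARKER).is_file()


def resolve_use(
    name: str, using_file: Path, search_path: Sequence[Path]
) -> Optional[Path]:
    """Find the file or directory that satisfies `use name;` in `using_file`.

    If the using file lives inside a directory module, that directory is
    searched first, so sibling submodules shadow top-level modules.
    """
    roots = list(search_path)
    if is_directory_module(using_file.parent):
        roots.insert(0, using_file.parent)
    for root in roots:
        as_file = root / f"{name}.chpl"
        if as_file.is_file():
            return as_file
        as_dir = root / name
        if is_directory_module(as_dir):
            return as_dir
    return None
```

With this ordering, `use Helper;` inside SuperLib/Main.chpl finds SuperLib/Helper.chpl even when another Helper.chpl exists on the global search path, while the same statement in Application.chpl falls through to the search path.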
@mppf
Member Author

mppf commented Sep 4, 2018

One might imagine confusion if SuperLib/ itself is added to the module search path (say). I'd propose that these directory modules always be required to contain a particular file (say Main.chpl) and that the compiler simply fail with an error if such a file is detected in a directory on the module search path.
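
A check along these lines could look like the following Python sketch. The names `check_search_path` and `ModuleSearchPathError` are hypothetical, and `Main.chpl` is only the placeholder name suggested above.

```python
from pathlib import Path
from typing import Iterable


class ModuleSearchPathError(Exception):
    """Raised when a directory module's interior is on the search path."""


def check_search_path(search_path: Iterable[Path], marker: str = "Main.chpl") -> None:
    """Fail if any search-path entry is itself the inside of a directory module.

    Assumption: the presence of `marker` directly inside a directory means
    that directory is a module, so listing it on the search path would
    expose its submodules as if they were top-level modules.
    """
    for directory in search_path:
        if (directory / marker).is_file():
            raise ModuleSearchPathError(
                f"{directory} is a directory module and cannot be on the "
                f"module search path; add its parent directory instead"
            )
```

The intent is that adding the parent of SuperLib/ to the search path stays legal, while adding SuperLib/ itself is rejected up front.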

@mppf mppf changed the title allow 'use' of a directory to get nested modules allow 'use' of a directory to create nested modules Sep 4, 2018
@BryantLam

BryantLam commented Sep 7, 2018

Just an opinion. I would prefer not to mix language features and file-system constructs if possible. This seems like it could be tricky to understand "where is module ((directoryName))" as a language feature, sort of similar to how file-level modules work today.

This feature does seem like a logical extension of file-level modules though. I like that aspect of its elegance.

Edit 2019-05-21: My opinion on this proposal has changed. What I want to avoid is directly incorporating filesystem constructs into the language (via e.g., include file.chpl statements) that would enable too much flexibility (and thus would somehow be a limiting factor for easily supported/well-supported incremental compilation). I reiterate that I do like that the current file-level modules fit this proposal really well.

@bradcray
Member

bradcray commented Sep 7, 2018

I would prefer not to mix language features and file-system constructs if possible.

This resonates with me. Just to make sure we're all on the same page, the behavior today in which use Foo; causes the compiler to go look for files named Foo.chpl has been considered an undesirable workaround since it was introduced. The historical intention was to scan the module search path for any .chpl files that defined a module Foo and parse those files to satisfy the requirement. The workaround took the approach of "well, we don't have the technology to do that scanning today, so let's at least look at the file Foo.chpl, since the odds are reasonable that it defines a module named Foo" (either using the implicit file-scope module, which would be named Foo, or hoping that the user made Foo.chpl contain module Foo).

Point being: I don't view use Foo as meaning "parse Foo.chpl" and feel nervous about extending that support to refer to a directory named Foo.
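
The difference between the two lookup strategies can be illustrated with a small Python model. This is a rough sketch under loud assumptions: the regular expression is a crude stand-in for real parsing (it ignores comments, strings, and nesting, so nested modules are reported as if they were top-level), and the function names are hypothetical.

```python
import re
from pathlib import Path
from typing import List, Sequence

# Crude approximation of a module declaration; a real compiler would parse.
MODULE_DECL = re.compile(r"^\s*module\s+(\w+)", re.MULTILINE)


def modules_defined_in(path: Path) -> List[str]:
    """Names of modules a file defines, per the simplified model.

    A file with no explicit declaration is treated as defining the
    implicit file-scope module named after the file.
    """
    declared = MODULE_DECL.findall(path.read_text())
    return declared or [path.stem]


def find_by_scanning(name: str, search_path: Sequence[Path]) -> List[Path]:
    """Historical intention: any .chpl file that defines module `name`."""
    return [
        f
        for d in search_path
        for f in sorted(d.glob("*.chpl"))
        if name in modules_defined_in(f)
    ]


def find_by_filename(name: str, search_path: Sequence[Path]) -> List[Path]:
    """Workaround: only consider a file literally named `name`.chpl."""
    return [d / f"{name}.chpl" for d in search_path if (d / f"{name}.chpl").is_file()]
```

A module Foo declared inside Misc.chpl is found by scanning but missed by the filename lookup, which is the gap the workaround accepts.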

@mppf
Member Author

mppf commented May 21, 2019

There have been some discussions of adding such an idea to Rust, see

Note an important difference from what was initially proposed here:

  • .rs files within a directory collectively create that module
  • subdirectories create submodules

Additionally I've read somewhere in the Rust discussions (but don't have a reference) that using a name like Main.chpl might be a bad idea since it makes find harder. Anyway Rust's current status seems to allow something like a directory representing the (internals of) a module when you have both MyModule.rs and also MyModule/. In that event, MyModule.rs can access things in MyModule but can also use declarations to control their visibility. This seems like a reasonable strategy to me because it makes the scoping more obvious.

@BryantLam

BryantLam commented May 24, 2019

I'm on board with this proposal. Some pros/cons:

Pro or Con depending on the user's opinion:

  • Module hierarchy matches filesystem hierarchy
    • To modify the definition of a module and its submodules, the module definition is at a directory level one-higher from its submodules
  • Forces consistency in source codes for easy packaging

Pro:

  • Deceptively straightforward (= easy to learn)
  • Easy for users to incrementally expand their code
    • Start with main file. Add a sibling module to same directory level. Sibling module gets too big, so you create a subdirectory and split some functionality into submodules. The sibling module doesn't have to physically move on the filesystem.
  • Pathing for incremental compilation is easier since you can isolate the source code within a "package boundary" (e.g., a top-level tree)
  • Local module search path is straightforward; Any module only has to look for sibling modules in the same directory
    • There's only exactly one location to look for the sibling module
    • Forward compatible if users really want the Python approach of putting the sibling module in its own directory, e.g., Helper/Helper.chpl, with the primary downside of two search locations instead of exactly one
  • Directory hierarchy is flatter

Con:

  • Facade pattern not addressed: Inline modules not supported because you can only break up a large module via submodules
    • Necessitates re-exporting
  • It looks odd when a project has a lot of modules and submodules. Each dir level could have up to 2x more files/dirs. Example:
# Aside: Annoying that this layout doesn't print cleanly with `tree`.
.
├── MyLibraryApi.chpl           # use Impl, Helper
├── Helper
│   ├── Utils
│   │   └── Logging.chpl
│   └── Utils.chpl
├── Helper.chpl
├── Impl
│   ├── Drivers
│   │   └── more_submodules
│   ├── Drivers.chpl
│   ├── Memory
│   │   └── more_submodules
│   ├── Memory.chpl
│   ├── Runtime
│   │   └── more_submodules
│   └── Runtime.chpl
├── Impl.chpl
├── Tests
│   ├── Test1.chpl
│   └── Test2.chpl
└── Tests.chpl                  # use MyLibraryApi

Compiled with:
chpl MyLibraryApi.chpl --library
chpl Tests.chpl

I think it's implied that any submodule can still get to another submodule (e.g., directly calling Helper.Utils.Logging.log()) without having to use that module.
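
Under the layout above, the correspondence between module paths and files is mechanical. The following Python sketch shows one way to express it; `file_for_module` and `module_for_file` are hypothetical helper names, and the rule "module X.Y lives in X/Y.chpl" is an assumption read off the example tree.

```python
from pathlib import Path


def file_for_module(root: Path, dotted_name: str) -> Path:
    """Map a module path such as Helper.Utils.Logging to its source file.

    Each parent module contributes one directory level; the final
    component names the .chpl file inside that directory.
    """
    parts = dotted_name.split(".")
    return root.joinpath(*parts[:-1], parts[-1] + ".chpl")


def module_for_file(root: Path, source: Path) -> str:
    """Inverse mapping: recover the module path from a file under `root`."""
    relative = source.relative_to(root).with_suffix("")
    return ".".join(relative.parts)
```

For example, `Helper.Utils.Logging` maps to `Helper/Utils/Logging.chpl` under the project root, and `Impl/Drivers.chpl` maps back to `Impl.Drivers`. Because there is exactly one candidate location per module, the lookup needs no search.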

Follow-up question: I might be wrong, but I think since Chapel today only has a global search path for modules, any Impl.* submodule can precisely use Helper.Utils.Logging. But if the user actually wanted to get to a Mason package that was poorly named as Helper, there's either no way to do it and still reach both modules, or it's an ambiguity and the code won't compile at all (which is even worse once people start creating more and more Mason packages, but this collateral damage can be limited by dependency tracking via Mason.toml).

@mppf
Member Author

mppf commented May 24, 2019

Follow-up question: I might be wrong, but I think since Chapel today only has a global search path for modules, any Impl.* submodule can precisely use Helper.Utils.Logging. But if the user actually wanted to get to a Mason package that was poorly named as Helper, there's either no way to do it and still reach both modules, or it's an ambiguity and the code won't compile at all (which is even worse once people start creating more and more Mason packages, but this collateral damage can be limited by dependency tracking via Mason.toml).

I think you understand the situation correctly, but this can be addressed by providing a way to anchor a use at the top level, like use ::Helper;. We wouldn't necessarily pick that syntax though.
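
As an illustration of how an anchored use could disambiguate, here is a small Python model. It is a sketch only: the `::` prefix is the placeholder syntax floated above and not an actual Chapel feature, `resolve_scoped_use` is a hypothetical name, and the innermost-to-outermost search order for unanchored names is an assumption.

```python
from typing import Optional, Set


def resolve_scoped_use(spec: str, scope: str, known: Set[str]) -> Optional[str]:
    """Resolve `use spec;` written inside module `scope`.

    `known` holds fully qualified module names. An unanchored name is
    looked up in each enclosing scope, innermost first; a name prefixed
    with `::` is looked up only at the top level.
    """
    if spec.startswith("::"):
        target = spec[2:]
        return target if target in known else None
    parts = scope.split(".") if scope else []
    while True:
        candidate = ".".join(parts + [spec])
        if candidate in known:
            return candidate
        if not parts:
            return None
        parts.pop()
```

With both a library submodule `MyLibraryApi.Helper` and a top-level package `Helper` present, `use Helper;` inside `MyLibraryApi.Impl` reaches the submodule, and `use ::Helper;` reaches the package.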

@lydia-duncan
Member

Probably not super important right away, but what should be done in this proposal when the directory contains a symlink to another .chpl file?

@mppf
Member Author

mppf commented May 24, 2019

Probably not super important right away, but what should be done in this proposal when the directory contains a symlink to another .chpl file?

I don't expect it to do anything special, but that would allow 2 copies of the same file to be loaded in as modules with different module paths (e.g. A.B.C and X.Y.C). But that doesn't worry me overmuch, because you could also have literally copied and pasted the file. Do you imagine it should do something special for symlinks?

@lydia-duncan
Member

I think it would be nice to recognize when two modules are actually the same, so that if they accidentally would "conflict" we can just ignore the conflict if one is a symlink to the other (or both are symlinks to the same source). I think it would be fair to treat it like a copy+paste, though, and I don't necessarily expect the situation to come up super often; it was just something that occurred to me because we were talking about file system organization.
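
Recognizing that two candidate files are really the same source could be done by comparing resolved paths, as in this Python sketch. The helper name `group_by_source` is hypothetical, and the sketch assumes a filesystem with symlink support; it deliberately treats a copied file as a different source.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def group_by_source(paths: Iterable[Path]) -> Dict[Path, List[Path]]:
    """Group candidate module files by the real file they resolve to.

    Paths that are symlinks to the same target (or a symlink and its
    target) land in one group; byte-identical copies do not.
    """
    groups: Dict[Path, List[Path]] = {}
    for path in paths:
        groups.setdefault(Path(path).resolve(), []).append(Path(path))
    return groups
```

A conflict between two paths in the same group could then be ignored, while paths in different groups would still be reported, matching the copy+paste treatment for everything that is not a symlink.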

@BryantLam

BryantLam commented May 24, 2019

Treating it as a literal copy+paste makes sense to me. I'd like to use the filesystem for its hierarchy, but not rely too much on any particular features of the filesystem. It's a slippery slope towards considerations for hard links and case-insensitive filesystems (paired with bad module naming; the compiler might need to actually warn/enforce a style of module naming but it wouldn't help with e.g. ModuleName vs. MoDuLeName).
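
The kind of naming warning mentioned here could be approximated by grouping module names that differ only by case. This is a hedged Python sketch with a hypothetical function name, not an existing compiler check.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def case_collisions(names: Iterable[str]) -> List[List[str]]:
    """Return groups of distinct module names that differ only by case.

    On a case-insensitive filesystem, the files for such names would map
    to the same path, so a tool might warn about each group.
    """
    groups: Dict[str, set] = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Given the names ModuleName, MoDuLeName, and Helper, only the first two are reported as a colliding pair.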

@mppf
Member Author

mppf commented Mar 26, 2020

PR #15279 implemented something close to the alternative design in #13524.

@mppf mppf closed this as completed Mar 26, 2020