Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve require lookup #2690

Closed
asterite opened this issue May 30, 2016 · 31 comments · Fixed by #2788
Closed

Improve require lookup #2690

asterite opened this issue May 30, 2016 · 31 comments · Fixed by #2788

Comments

@asterite
Copy link
Member

Reference: https://groups.google.com/forum/?fromgroups#!topic/crystal-lang/eJ9Ijb54Tnk and https://groups.google.com/forum/?fromgroups#!topic/crystal-lang/uy-jxSp-b7U

TL;DR: to require a file "specific" inside a shard "foo" one has to do require "foo/foo/specific". It would be nice to just have to do require "foo/specific".

The lookup rules should probably change to this:

  • When doing require "foo", search "foo.cr" and "foo/foo.cr" in the path (this already works like that)
  • When doing require "foo/bar", search "foo/foo/bar.cr" in the path (I think searching "foo/bar.cr" is not needed)

As a separate concern, shards right now puts only the "src" directory inside "libs". Maybe it would be simpler/easier to put the whole repo inside "libs", inside a folder with the shard name. Then the require lookup should change to:

  • When doing require "foo", search "foo.cr", "foo/foo.cr" and "foo/src/foo.cr" in the path
  • When doing require "foo/bar", search "foo/foo/bar.cr" and "foo/src/foo/bar.cr" in the path

This has the additional benefit that if a file inside "src" requires something outside "src" (maybe a "data" directory"), it will be available, otherwise it won't be found. Example

And maybe this last point simplifies shards? I don't know.

Finally, we should maybe rename "libs" to "lib": #321

/cc @ysbaddaden @jhass

@jhass
Copy link
Member

jhass commented May 30, 2016

Tbh. I already dislike the current behavior, it's way too magical and I'm only waiting for conflicts to happen, in non obvious ways. Trying even more files will only make that ever more so likely.

I would still prefer a basically Ruby style system that's easy to follow and understand. For require "x", with x allowed to contain /, the compiler searches all load paths, in order, for load_path_entry/x, then again all load paths, in order, for load_path_entry/x.cr. The first file found wins and aborts the search. In this approach shards of course would need a way to modify the load path, individual shards should be allowed to define which folders they add to the load path.

@asterite
Copy link
Member Author

But the current way, even though with more "magical" logic in the compiler, is simpler for the user. Just add dependencies, do shards install, and that's it. No need to mess with the load path. And no need for shards to modify the load path. And all dependencies can be found relative to a project.

Also in Ruby there's this problem, which we don't have, and we would have if we support modifying the load path, and not making the compiler scope requires to a directory.

In this case, I think more power is less beneficial.

@jhass
Copy link
Member

jhass commented May 30, 2016

My point is that I already don't follow the current logic. Ruby's system is easy to exercise in your mind and thus easy to debug. Yes, the current way is easier for the user, but only as long as it works perfectly. Have a conflict and need to figure where/how/when/why? Welcome to debugging hell following all this logic. Then welcome again for finding a workaround taking all that complex logic into account.

@asterite
Copy link
Member Author

It's explained here. I think it's pretty simple:

  • require "foo" will require "libs/foo/foo.cr", which is the default entry-point of a shard
  • require "./foo" requires relative to the current file

Is there are way to make that simpler?

Next, we need to allow requiring just some files inside a shard. require "foo/bar" should be the most obvious way to do it, and since a shard "foo" is placed in "libs/foo", then that should try to find a file named "libs/foo/foo/bar.cr".

This logic can be explained in one paragraph, without mentioning CRYSTAL_PATH, how shards modifies it (because it doesn't), and users don't even need to know how it all works, they just do shards install "foo" and require "foo" or require "foo/bar".

Anyway, that's just my opinion :-)

@jhass
Copy link
Member

jhass commented May 30, 2016

  • When doing require "foo", search "foo.cr" and "foo/foo.cr" in the path (this already works like that)
  • require "foo" will require "libs/foo/foo.cr", which is the default entry-point of a shard

Alright, just forgot a rule when repeating it the second time. I guess it makes no difference if you forget or don't know 2 of 3 or 1 of 3 instead of 1 of 2. Or none of one because there's only one and you know there has to be one.

With the current rules a foo.cr will already shadow a foo/foo.cr, a foo/foo.cr will shadow a foo/foo/foo.cr, and so on. Adding your rule foo.cr shadows foo/foo.cr and foo/src/foo.cr, foo/foo.cr shadows foo/foo/foo.cr and foo/src/foo/foo.cr (but foo/foo/src/foo.cr, one has to keep in mind). Will foo.cr shadow bar/src/foo.cr? Will foo/foo.cr shadow bar/src/foo/foo.cr? And so on and so on, there's an exponential list of cases one has to consider.

@bcardiff
Copy link
Member

To src or not to src, it should be transparent to the user and they shouldn't care about how shards are expanded in lib folder as long as the usecases require 'foo', require 'foo/bar' , require './foo' are easy to make them work.

Maybe the more complicated story is how to create a shard in the app that you might want to extract later. This last user story might add some additional criteria.

Also, we can make the compiler give some information how each require is resolved...

@asterite
Copy link
Member Author

@jhass I don't follow, there's no way one shard can shadow another shard

  • when you do require "foo", it searches "foo/src/foo.cr" or "foo/src/foo/foo.cr"
  • when you do require "foo/any/thing.cr" it searches "foo/src/foo/any/thing.cr"

As you can see, the first component in the require always scopes the search. So one shard can't pollute another namespace.

Expanding to "foo/src/foo" is only applied for the first component.

@ysbaddaden
Copy link
Contributor

The whole src/foo logic is required in Ruby because Rubygem adds every foo-1.2.3/lib to the require path, and we thus have to namespace the files under foo-1.2.3/lib/foo to avoid conflicts. This isn't required in Crystal, where we only push the folder containing shards to the require path, and not every single shard's src, so there is no need to namespace the files. This also prevents a shard from hijacking another shard's namespace, which I consider a good thing.

I did that for minitest.cr as a matter of habit (it was my first shard), but I'll eventually put everything under src and drop src/minitest; all my other shards already put everything in src.

I think this is simpler and more solid than perpetuating the Rubygem issue. We already don't version the installed shards (because we deal with project specific dependencies, not a central install repository). We already don't push each shard src folder to the require path, so it's kept clean, fast and... doesn't require to namespace files in a subfolder. Let's not support this namespacing, it really means nothing in Crystal. At least I'd like to avoid it altogether.

On another topic: we should indeed change the way shards installs dependencies to extract the shard's root as libs/<name> instead of only extracting the src folder, as we previously discussed; this would be cleaner and allow to have C code, libraries or templates kept out of the src folder. Thus:

  • require "foo" would search:
    1. foo.cr
    2. foo/foo.cr
    3. foo/src/foo.cr
    4. foo/src/foo/foo.cr (for completeness with next example, could be skipped)
  • require "foo/bar" would search:
    1. foo/bar.cr
    2. foo/bar/bar.cr
    3. foo/src/bar.cr
    4. foo/src/bar/bar.cr

Where only the first part, foo, is ever expanded as foo/src (shards requirement) and the last part, foo or bar, is ever expanded as foo/foo.cr or bar/bar.cr (crystal nicety).

@cjgajard
Copy link
Contributor

cjgajard commented May 30, 2016

I like @ysbaddaden idea, but this either don't solves what people in google groups where looking for or change how shards were structured until today.

As an example, using a crystal init structure

├── src
│   ├── foo
│   │   ├── version.cr
│   │   └── bar.cr
│   └── foo.cr

to require src/foo/bar.cr you have to still write foo twice: require "foo/foo/bar"
or restructure everything as

├── src
│   ├── bar
│   │   └── bar.cr
│   ├── version.cr
│   └── foo.cr

to require version and bar separetedly like this:

require "foo/bar"
require "foo/version"

@ysbaddaden
Copy link
Contributor

Indeed, crystal init should be changed to create src/foo.cr and src/version.cr and never expose a src/foo folder to put things into.

@jhass
Copy link
Member

jhass commented May 30, 2016

Doesn't that give up namespace -> filename mapping conventions?

@phoffer
Copy link

phoffer commented May 30, 2016

As someone newer to Crystal, the behavior @asterite has described seems to be the method of least-surprise, for what it's worth. I think this matches up with the behavior described by @ysbaddaden in the previous comment, in terms of file loading logic (if I understand his comment correctly).

Separating load logic from directory structure would make the structure issue more of a developer preference. Personally, I would prefer the nested structure, but that is mainly due to the fact I'm coming from Ruby, whether that is good or bad. The flexibility to allow both structures would probably be well received, especially for people coming from other language backgrounds.

@asterite
Copy link
Member Author

@ysbaddaden @cjgajard I think the last point iv in this comment has a mistake. Shouldn't it be:

  • require "foo" would search:
    1. foo.cr
    2. foo/foo.cr
    3. foo/src/foo.cr
    4. foo/src/foo/foo.cr (for completeness with next example, could be skipped)
  • require "foo/bar" would search:
    1. foo/bar.cr
    2. foo/bar/bar.cr
    3. foo/src/bar.cr
    4. foo/src/foo/bar.cr (was foo/src/bar/bar.cr)

In fact, for the last part I'd say it should be:

  1. foo/bar.cr (for the standard library)
  2. foo/src/foo/bar.cr (for a shard)

@asterite
Copy link
Member Author

That is, a shard should keep the src/foo directory structure. We could actually make it just be src and put everything inside it, but since a shard foo usually defines a module Foo and then everything is inside that module, having src/foo/file.cr makes sense, and so require "foo/file" should search in foo/src/foo/file.cr

@cjgajard
Copy link
Contributor

Personally, I like foo/src/name/name.cr more than foo/src/foo/name.cr because it allows building submodules in a cleaner way. Requiring foo/src/foo/name.cr will lead foo/src/ or foo/src/foo/ to have a "start-point" file for each submodule, but using foo/src/name/name.cr allows to have that start-point file inside the folder of the submodule instead. A good example of this is crystal/src/yaml/yaml.cr

@asterite
Copy link
Member Author

@cjgajard For the standard library crystal/src is already on the path, so doing require "yaml/foo" will first search yaml/foo.cr and then yaml/src/yaml/foo.cr.

foo/src/name/name.cr doesn't happen in shards. Shards are expected to have this structure:

├── src
│   ├── foo
│   │   ├── version.cr
│   │   └── foo.cr
│   └── foo.cr

with more files inside src/foo.

@cjgajard
Copy link
Contributor

cjgajard commented May 30, 2016

@asterite Sorry I was not clear explaning my point, is not about requiring the standard library that's already covered.
Let's keep src/foo stucture (it's better not to break thinks up), but consider

├── src
│   ├── foo
│   │   ├── bar
│   │   │   ├── file1.cr
│   │   │   └── file2.cr
│   │   ├── baz
│   │   │   ├── file3.cr
│   │   │   └── file4.cr
│   │   ├── bar.cr
│   │   └── baz.cr
│   └── foo.cr

vs

├── src
│   ├── foo
│   │   ├── bar
│   │   │   ├── file1.cr
│   │   │   ├── file2.cr
│   │   │   └── bar.cr
│   │   └── baz
│   │       ├── file3.cr
│   │       ├── file4.cr
│   │       └── baz.cr
│   └── foo.cr

EDIT: I would like to set that require "foo/bar" don't require file2.cr, which is not posible with require "foo/bar/*", but it is with require "./file1" inside bar.cr

I think require "foo/bar" should search:
i. foo/bar.cr
ii. foo/bar/bar.cr
iii. foo/src/foo/bar.cr (was foo/src/bar.cr)
iv. foo/src/foo/bar/bar.cr (was foo/src/foo/bar.cr)

For iii you already said requiring foo/src/bar.cr is not needed.

@ysbaddaden
Copy link
Contributor

@asterite dropping the namespace under src is exactly my point. Sure, we namespace our code, but the shard name is already the namespace, and contrary to Ruby we don't have to namespace again. Read my comment again: this is merely a Ruby habit because of the way Rubygems works. Crystal doesn't need it.

The iv. example is because we support bar/bar.cr and it would be nice if it worked inside shards too. For example: require "frost/view" loads libs/frost/src/view/view.cr and require "frost/record" loads libs/frost/src/record/record.cr.

@ysbaddaden
Copy link
Contributor

@cjgajard having to put everything under src/frost would be terrible: the src folder would now contain a single folder frost (thought not as terrible as Java). This is the exact opposite of my point.

@asterite
Copy link
Member Author

@cjgajard I understand what you mean now.

When we were designing how require would work we decided to put shards inside libs/shard_name/ and so we added the rule that require "shard_name" would search shard_name/shard_name.cr in CRYSTAL_PATH so it could be found, because libs was in CRYSTAL_PATH.

We then though "Hm, since we have that rule, we could (ab)use it in the standard library", and so there's no src/ecr.cr, but we have src/ecr/ecr.cr, and because the src directory of the standard library is in the path, it works. But I think we shouldn't abuse that rule.

So, we can make require "foo" search:

  1. foo.cr (to find something in the standard library)
  2. foo/src/foo.cr (to find something in a shard)

We don't search for foo/foo.cr anymore, because that rule was for searching inside a shard given a structure we defined. (we of course fix the standard library source code to adapt for this change, but shards remain the same)

@ysbaddaden We then have to decide how to resolve require "foo/bar". I don't mind having a shard_name directory inside src, but it's true that it feels redundant because the shard name is already namespacing everything. When browsing some code in GitHub it's also a bit painful to go inside that directory and then come out because code is split between src/foo.cr and src/foo/other_files, so having everything inside src is more convenient.

So, we can make require "foo/bar" search:

  1. foo/bar.cr (to find something in the standard library)
  2. foo/src/bar.cr (to find something in a shard)

However, it's a bit odd that then doing require "foo/foo" would search foo/foo.cr and foo/src/foo.cr, basically finding the same as require "foo" for the case of a shard. So maybe keeping the namespace inside the shard isn't a bad idea.

Alternatively, we can make require "foo/bar" search:

  1. foo/bar.cr (to find something in the standard library)
  2. foo/src/bar.cr (to find something in a shard, non-namespaced structure)
  3. foo/src/foo/bar.cr (to find something in a shard, namespaced structure)

I don't think 2 and 3 conflict, it should not be common to have those two files.

We can adopt the non-namespaced structure, making crystal init do that, and eventually drop rule 3.

Just to clarify, require "foo/bar/baz/qux" would search:

  1. foo/bar/baz/qux.cr (to find something in the standard library)
  2. foo/src/bar/baz/qux.cr (to find something in a shard, non-namespaced structure)
  3. foo/src/foo/bar/baz/qux.cr (to find something in a shard, namespaced structure)

That is, only the first component foo is also searched as foo/src or foo/src/foo.

But what should we do? Should we encourage a namespaced structure inside shards, or a non-namespaced structure?

@DougEverly
Copy link
Contributor

Seems require "string" where string is relative path or shard is a bit overloaded. Some ideas

  • require "file" => ./file.cr, then CRYSTAL_PATH/file.cr
  • require_shard "shard" => shard/src/shard.cr
  • require_shard "shard/file" => shard/src/shard/file.cr
  • require "file", from: "shard" => shard/src/shard/file.cr
  • require shard: "shard" => shard/src/shard.cr
  • require shard("shard") => shard/src/shard.cr
  • require shard("shard", "file") => shard/src/shard/file.cr

@jhass
Copy link
Member

jhass commented May 31, 2016

Another thing to keep in mind while we're touching this is #321

@asterite
Copy link
Member Author

@jhass Yes, I'd like to do that too. I think that's just a matter of adding both lib and libs for a while in CRYSTAL_PATH and eventually remove libs, though shards will have to adjust to that.

@DougEverly I'd like to keep require simple. To require relative to the current file one now has to do require "./foo". That avoids a few conflicts and make things more explicit. Then if you need something provided either by the standard library or by a shard (but you maybe don't care from where it comes from) you can do require "some_shard" and that's it. Most of the users will just use that form. If some file inside a shard is needed then you can do require "some_shard/some_file" and that's it. I think adding multiple require forms adds more names and syntax to keep in mind.

@ysbaddaden
Copy link
Contributor

  • @asterite the 3 rules are good! we can decide whether we support the rule 3 (src/name) or not; I'd say no, but as long as the non namespaced rule (2) continues to work, I'm good.
  • @jhass yes, let's change libs to lib, it's merely changing a constant in Shards.

@asterite
Copy link
Member Author

@jhass what's your thought on namespacing vs. not namespacing?

@jhass
Copy link
Member

jhass commented May 31, 2016

Albeit not a super strong one I have a tendency towards doing it, never liked the foo/foo.cr for require "foo" in stdlib either. Generally I like the Rails convention of namespace -> file path conversion. Having to sometimes arbitrarily insert a src into the path breaks that.

@asterite
Copy link
Member Author

@jhass I mean, should we layout things like:

  • src
    • foo
      • other.cr
      • version.cr
    • foo.cr

Or like:

  • src
    • other.cr
    • version.cr
    • foo.cr

@jhass
Copy link
Member

jhass commented May 31, 2016

I understood that, hence my last sentence.

@asterite
Copy link
Member Author

Oh, I misread you. Well, for now we can support both forms and leave that decision for later.

@ozra
Copy link
Contributor

ozra commented Jun 2, 2016

I use both styles.

  • For a concise shard I'd let the proj-dir itself serve as "name-space reflection". Adding an additional dir inside is javaesque, as already noted.
  • When there are many sub-dirs and modules. I prefer putting files inside the subdirs, for file-navigation reasons and reducing the mayhem in the proj-src-dir.

Oh, I always use a "src/"-dir of course!

So, in short: I really like the proposed steps of look up.

@jhass makes a good point regarding the shadowing, which could happen when refactoring, and hitting save in an editor view pointing to old location or so), I think therefore all the locations should be checked for each require, and if found in more than one of the places: error! ("blabla.cr" shadows "blabla/blabla.cr". "blabla.cr" seems newer - resolve the conflict now before continuing. - or something like that.) Then it will be caught on first compile after the mistake has been made. (To get fancy this integrity check could be run in parallel with the compile process to not slow down compilation waiting for IO, so it just starts off with the first founds.)

@ozra
Copy link
Contributor

ozra commented Jun 2, 2016

Oh, and libs to lib - 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants