Inline module syntax is confusing when learning modules #1904

Closed
kornelski opened this issue Apr 8, 2019 · 9 comments

@kornelski
Contributor

https://doc.rust-lang.org/book/ch07-02-modules-and-use-to-control-scope-and-privacy.html

This chapter introduces modules first by showing the inline module syntax. I understand that this syntax is easier to demonstrate in the book than separate files. However, I think it's only convenient for the teacher, not for the student:

The subchapter also starts with inline modules, which makes the examples look contrived.

https://internals.rust-lang.org/t/data-point-about-the-new-module-system-learnability-and-musings-about-language-stability/9770/27

Looks, but doesn't work like, namespace/package syntax in other languages

This syntax has the potential to be more confusing than helpful. The mod foo {…} statement is syntactically very similar to namespace foo {…} in C++ or PHP, but has very different semantics. Readers whose intuition comes from these languages will be utterly confused when mod doesn't work the way they expect.

Rust's modules differ from namespaces in other languages in that a "parent" file names the "child" module and declares its existence ("I have a module X"). In C++, PHP and Go the "child" file itself declares that it's a module ("I'm a module X"). This is a common point of confusion, and the inline syntax muddles it.

File-based example still uses inline syntax

The first example about splitting into separate files mixes inline and file-based modules. The reader has to notice that there are multiple modules and that only one of them has changed.

It can easily be misinterpreted as the C++/PHP/Go-like syntax, which requires a module to declare itself in its own file.

Filename: src/sound.rs

pub mod instrument {
    pub fn clarinet() {
        // Function body code goes here
    }
}

For example, in Go, a very similar syntax could be a valid and idiomatic usage:

Filename: src/sound.go

package instrument;

func Clarinet() {
    // Function body code goes here
}

but it has quite a different meaning from Rust's! This false analogy is quite dangerous, because the reader may think "aha! I get it now!" and totally not get it.

To reiterate, there's a subtle difference in how the syntax splits modules into files in Rust and in other languages.

I'm marking - (red) lines that are moved to an external file. Rust's way:

+// Outside module
+pub mod instrument {
-    // Inside module
-    pub fn clarinet() {
-        
-    }
+}

vs C++/PHP/Go way:

+// Outside module
-pub mod instrument {
-    // Inside module
-    pub fn clarinet() {
-        
-    }
-}

and this chapter uses an unfortunate syntax and examples that don't clarify that point of confusion.

Shows uncommon case, delays most common

Apart from an occasional mod test {}, I don't see many inline modules used in Rust code. Modules are overwhelmingly a way to split a project into separate files. But the reader isn't warned about that, and file separation is mentioned last.


I think it'd be better to explain modules with separate files from the start, so that C++/Go users will see that mod is closer to being #include/import, rather than namespace/package constructs. Neither analogy is very good, but the former is less harmful to understanding than the latter. I'd probably go as far as not showing the inline syntax at all, apart from some advanced chapter or appendix.

@dureuill

I'd probably go as far as not showing the inline syntax at all, apart from some advanced chapter or appendix.

I don't know about this. Inline modules are a somewhat niche case, but also one that appears in all cargo-generated projects through the test module. It is also incredibly convenient to use in playground (which doesn't provide a way to have multiple files AFAIK) when you want to test things about privacy (it was a useful tool for self learning modules through small examples).
I think it deserves at least a mention in a dedicated paragraph (mentioning these two use cases).

Maybe we should be more upfront about the various ways modules are instantiated? Presenting the three ways (inline definition; explicit declaration, implicit definition in a file; explicit declaration, implicit definition as a directory) and their typical use cases (respectively: unit tests; general project organization; ditto, but with submodules), before we even go to how modules control privacy and appear in paths? With that out of the way, we could move forward and say something to the effect of "in the wild, you will generally see modules defined in files, but in this book and other online resources (e.g. playground) it is more convenient to use inline modules to explain how modules work, as it allows us to stay inside of a single file."

Maybe it generally wouldn't hurt to also draw a comparison to similar concepts in other languages (although I don't know if the book does this for other areas and if it is a good practice for the book), especially to emphasize that modules are somewhat different from these concepts. There are a few obstacles I stumbled upon when first learning modules (mostly the "Things to keep in mind about modules" in this message) because they defied my expectations coming from other languages. Some of these, and other bullet points, could maybe be summarized somewhere in the page? Again, I don't know if this kind of "common mistakes" paragraph is a good fit for the book, though.

@kornelski
Contributor Author

I haven't seen people struggle with privacy that much. Rust's pub is also unique, but it seems to be easy to explain that pub in a private module is still inaccessible. People are a bit surprised about pub(crate) syntax, but not about its meaning, so that's fine. Privacy in general is not a big problem — the compiler will tell exactly what is too private. What is too public can be seen in rustdoc.
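A minimal sketch of the privacy points above, using an inline module only so that it fits in one file. All names here (outer, hidden, secret, crate_only, via_parent, demo) are hypothetical and do not come from the book.

```rust
// Hypothetical names throughout; this is an illustration, not the book's example.
mod outer {
    // Private module: it cannot be named from outside `outer`.
    mod hidden {
        // `pub` only helps callers that can already name `hidden`.
        pub fn secret() -> &'static str {
            "secret"
        }
    }

    // Visible anywhere inside this crate, but not exported from the crate.
    pub(crate) fn crate_only() -> &'static str {
        "crate-only"
    }

    // `outer` can reach into its own private child and pass the value on.
    pub fn via_parent() -> &'static str {
        hidden::secret()
    }
}

pub fn demo() -> (&'static str, &'static str) {
    // outer::hidden::secret(); // rejected by the compiler: module `hidden` is private
    (outer::crate_only(), outer::via_parent())
}
```

Uncommenting the call to outer::hidden::secret() is a quick way to see the compiler point at the item that is too private.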

But every time I teach modules, I see people really struggle with mod and its relationship with use. These two constructs appear too similar to package/namespace and require/include/import, but they don't quite behave like that.

My hypothesis is that it's easier to teach files first, and then inline modules. The split into files is the hard part, because it needs understanding that mod is not inside, but outside of the file.

Once you know how modules work, then seeing mod foo { code } is easy to figure out — you just put the file inline. The other way around is not easy to guess, because of the ambiguity where parent file ends and child file starts.
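As a rough illustration of "you just put the file inline", the sketch below shows a file-based layout in comments and its inline equivalent as code. The module name greeting and the function hello are assumptions made up for this example.

```rust
// File-based layout (the common case), shown as comments:
//
//   src/main.rs      contains:  mod greeting;
//   src/greeting.rs  contains:  pub fn hello() -> String { ... }
//
// src/greeting.rs has no `mod greeting` line of its own; the parent declares it.
//
// Inline equivalent: the contents of src/greeting.rs go between the braces.
mod greeting {
    pub fn hello() -> String {
        String::from("hello")
    }
}

pub fn demo() -> String {
    // The call site is the same for both layouts.
    greeting::hello()
}
```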

@carols10cents
Member

pub mod instrument {
    pub fn clarinet() {
...

Please note that this chapter has had revisions and no longer contains the instrument example-- I'd love to hear opinions on the latest revision readable at https://doc.rust-lang.org/nightly/book/ch07-00-managing-growing-projects-with-packages-crates-and-modules.html

Maybe it generally wouldn't hurt to also draw a comparison to similar concepts in other languages (although I don't know if the book does this for other areas and if it is a good practice for the book)

There are a few places we mention other languages, but we're trying not to make too many assumptions about which programming language the reader comes from. This is definitely a challenge.

@kornelski
Contributor Author

kornelski commented Apr 12, 2019

The new revision of the chapter IMHO still has the same overall issues: it still uses inline modules as if it was a normal thing to do, and still leaves splitting into files to last. And the multi-file example still includes mod foo {} in an external file, which may be mistaken for Go/PHP/C++-like declaration model.

This sentence strikes me as particularly odd:

When modules get large, you might want to move their definitions to a separate file

I haven't seen people actually start with an inline module and grow it. To me having the project structure reflected in the filesystem is not merely an unfortunate necessity caused by files getting too big, it's a goal in itself.

The book puts a lot of focus on privacy around modules, but to me this is an entirely different concern, of lower priority even. For example, in JS where the privacy basically doesn't exist, people still split into files as a way to organize the project:

https://github.com/webpack/webpack/blob/master/lib/OptionsApply.js

even when the file is totally unnecessary (and could have been inline):

https://github.com/webpack/webpack/blob/master/lib/MemoryOutputFileSystem.js

So my perspective is: I want files, I have files, and how do I make the module system load my files? It's the opposite way of thinking from the book's approach of starting with modules and privacy boundaries, and maybe splitting them into files later.

@kornelski
Contributor Author

I've grepped all .rs files in all crates from crates.io:

  • stm* crates auto-generate tons of inline modules, majority of them are empty mod RW {} (440216 modules)
  • 38661 inline modules are mod test(s) {}
  • Excluding the tests and STM crates, of the remaining 338271 modules, 6% are inline.

There is a mod tests {} pattern showing. It's probably due to Cargo's template, but unfortunately I don't have data to check how popular it is compared to external test files and test cases outside of a dedicated module.

Inline mod seems to be useful in auto-generated code and macros. So I think it'd be fine to classify this as advanced usage.

But of general usage, 94% of module uses are to split code into files.

@kornelski
Contributor Author

kornelski commented Apr 12, 2019

The new file example is close to the thing I've struggled the most with:

mod front_of_house;

pub use crate::front_of_house::hosting;

because it sort-of implies that you have to have a use statement to make use of a module, and doesn't show that it's not necessary (forbidden even!) because mod also brings the module name into scope (i.e. that fn main() {front_of_house::something()} would work).
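A small sketch of that point, with an inline module standing in for mod front_of_house; plus a separate src/front_of_house.rs so that it fits in one file. The function something and its return value are placeholders, and full_path and short_path are hypothetical names.

```rust
// Inline stand-in for `mod front_of_house;` with the body living in its own file.
mod front_of_house {
    pub mod hosting {
        // Placeholder function; the body is an assumption for illustration.
        pub fn something() -> u32 {
            42
        }
    }
}

// Optional: `use` only shortens the path. It is not what makes the module usable.
use front_of_house::hosting;

pub fn full_path() -> u32 {
    // Works with no `use` at all: `mod` already put `front_of_house` in scope here.
    front_of_house::hosting::something()
}

pub fn short_path() -> u32 {
    // Same function, reached through the shortcut introduced by `use`.
    hosting::something()
}
```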

I didn't get that when I was learning modules. I've only literally "understood" that mod just makes modules (not involving scope), and use brings them into scope, so I kept trying things like:

mod front_of_house; // create the module
use front_of_house; // now make it in scope

and was flabbergasted why the compiler doesn't like it, so I've tried deleting mod:

use front_of_house; 

and that didn't work either, so then I've replaced all use with mod

in lib.rs:

mod front_of_house; 

and because that finally worked, I also tried to keep using the "working" version in backyard.rs:

mod front_of_house; 

and again was frustrated that the compiler can't find it, while the exact same syntax worked just a second ago!

I had mod foo that wouldn't work, but the mod foo {} syntax always works everywhere. That made me feel cheated, as if the book was omitting the hardest part I was struggling with.
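For the backyard.rs attempt above, here is a hedged sketch of what the compiler expects instead. Inline modules stand in for src/front_of_house.rs and src/backyard.rs, both declared once in lib.rs; something, relay and demo are hypothetical names.

```rust
// Inline stand-ins for two files that lib.rs declares with
// `mod front_of_house;` and `mod backyard;`.
mod front_of_house {
    // Placeholder function for illustration.
    pub fn something() -> &'static str {
        "served"
    }
}

mod backyard {
    // Writing `mod front_of_house;` here would declare a *new* child module,
    // `backyard::front_of_house`, and the compiler would look for
    // src/backyard/front_of_house.rs. The module that already exists is
    // reached by path instead (`crate::front_of_house::something()` names
    // the same function when these modules sit at the crate root).
    pub fn relay() -> &'static str {
        super::front_of_house::something()
    }
}

pub fn demo() -> &'static str {
    backyard::relay()
}
```

The design point is that each module is declared exactly once, by its parent, and every other file refers to it by path.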


Now I have the curse of knowledge so I can't really tell if I'd learn modules from this chapter, but I'm not seeing it address the points of confusion I remember I had.

And I remember being really angry at inline modules, because I had no idea how they map to files, and I felt the documentation was about oranges, and I wanted apples.

@martin-t

I agree that modules would be easier to understand if they were explained files-first. I saw that you were unable (and unwilling) to rewrite this part of the book - why is that?

Meanwhile there are a few gradual improvements I think would at least somewhat help:

  • The intro paragraph in ch7.2 should mention mod as well alongside the other keywords. It's equally important but only gets introduced later.
  • The summary in ch7.5 again doesn't mention mod at all. IMO, it should (re)state things explicitly:
    • mod has 2 forms - defining a module inline or declaring that the module is in another file
    • there's no implicit connection between the filesystem and modules - they all need to be declared with mod in one of the already "known" modules
    • mod also brings the module into scope so you don't need use afterwards

@steveklabnik
Member

I'm going to give this a close for the same reason; we are not revisiting this chapter anytime soon, and if we do, I will be coming back to this thread. There's no reason to leave the issue open though.

@kornelski
Contributor Author

Another case where a user misunderstood that chapter, and used inline syntax where it didn't make sense:

https://users.rust-lang.org/t/modules-with-multiple-files/54752
