New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Webpack source code more beginner-friendly #824

Open
gaearon opened this Issue Feb 25, 2015 · 14 comments

Comments

Projects
None yet
@gaearon
Copy link
Contributor

gaearon commented Feb 25, 2015

I understand this is a very controversial topic and I don't want anyone, especially @sokra, to feel offended. Please feel free to close if you consider my request inappropriate.

I was having a Twitter conversation with @markdalgleish about Webpack, and we both share the view that Webpack's source can be hard to read. (example)

The design is outstanding, and this allows each module to be written as a plugin responding to several compiler events, but on the other hand most of the code is written as nested anonymous functions instead of compiler event handlers, with little outside functions. To outsiders looking to tweak or understand Webpack, myself included, this proves to be a problem, and it's, in my opinion, an unnecessary barrier to adoption and outside contributions.

Personally, I'd find it easier to comprehend if there was little to no nesting in the compiler plugins, and anonymous functions were named and extracted. This would also solve the problem of deep conditions and loops. A potential small upside is improving performance / memory usage by avoiding too much closures. We made a similar refactoring in @babel, so maybe @sebmck can shed some light on whether this helped make codebase easier to maintain and understand.

As a style nit, I'd also consider converting from tabs to spaces (not kidding) to improve Github display. Yes, I do believe it's important and pragmatic issue: Github chooses 8 spaces for displaying tabs, whereas the most widespread JavaScript standard, used by React and a ton other libs, is 2 spaces. This aids in preventing alienation for developers taking a peek at how everything works. IMO it's part of the experience and unless we can persuade Github to make “tab size” a project preference, it's worth doing.

I wonder how much of this is intentional, whether you agree with me—or any feedback really.
Thanks for considering this!

@kittens

This comment has been minimized.

Copy link

kittens commented Feb 25, 2015

A potential small upside is improving performance / memory usage by avoiding too much closures. We made a similar refactoring in @babel, so maybe @sebmck can shed some light on whether this helped make codebase easier to maintain and understand.

Not sure if the same paradigm could be applied to webpack very easily. I'm entirely ignorant about the internals so this will be the only thing I say on the matter. It did make it easier to maintain since traversal logic was abstracted out of the transformation logic in a really nice and clean way (thanks for that @gaearon!) part of what made this easy was that a lot of logic was already abstracted away so abstracting out more was pretty seamless.

@gaearon

This comment has been minimized.

Copy link
Contributor

gaearon commented Feb 25, 2015

part of what made this easy was that a lot of logic was already abstracted away so abstracting out more was pretty seamless.

This is how I feel about Webpack. Module boundaries are separated in a very good way in Webpack source, and I wish inter-module functional boundaries were the same.

@briandipalma

This comment has been minimized.

Copy link

briandipalma commented Feb 26, 2015

You just need to append ?ts=4 to a URL and GitHub will display tabs sensibly. Tabs have semantic value in indentation terms but spaces have none. The more useful change would be to refactor the code.

@deepsweet

This comment has been minimized.

Copy link
Contributor

deepsweet commented Feb 26, 2015

+1.

also tools like ESLint and JSCS can be very useful here.

@gaearon

This comment has been minimized.

Copy link
Contributor

gaearon commented Feb 26, 2015

You just need to append ?ts=4 to a URL and GitHub will display tabs sensibly. Tabs have semantic value in indentation terms but spaces have none.

I see where you're coming from and I totally get this perspective. However somebody choosing between Browserify + plugins versus Webpack will likely not append anything to the URL. They'd look in the source, and make up their first impression about whether it is easy or difficult to extend, debug or tweak it. Anything we can do to make that first impression feel more familiar and easy, is worth doing IMO. There are of course downsides, such as screwing up git blame and existing pull requests. So I'm not saying it's an easy decision to take, or that there is a single right way. Perhaps you're right and it's not worth it.

@sokra

This comment has been minimized.

Copy link
Member

sokra commented Feb 27, 2015

I don't want to start a tabs vs. spaces discussion. Complain to Github that they display tabs so badly, no other code editor uses 8 space tabs.

@gaearon I aggree that some files could be easier to read with better code style, (The posted example doesn't look so bad... It only has 140 LOC ;) ) but I hardly find the time to answer all the issues and chat messages... (I already started to wait some time, most issues solve themselve after a while hihi)... If you want to add some comments feel free to send a PR :p

@jhnns

This comment has been minimized.

Copy link
Member

jhnns commented Mar 2, 2015

I have to agree with @gaearon: I also have difficulties to understand the source code. 😁

@niieani

This comment has been minimized.

Copy link
Contributor

niieani commented Mar 25, 2017

Given #3213 was closed in favor of this one, I'm just going to copy and paste my comment over here so it doesn't get lost:

I'm currently writing a couple of loaders and plugins for Webpack. Since the code lacks any comments and the developer docs only cover the basics (if that), I find myself browsing the codebase and stepping through various parts. @TheLarkInn's developer kit helps, but it's a really tedious and a mind-boggling job.

Two main reasons why understanding the code is difficult:

  1. mutability - internal plugins modify the objects all the time so without stepping-through/tracing every single Tapable plugin/callback, you have no any idea about the contents of a given object at any single point in the code; there are no "code-contracts" so to speak -- anything goes.
  2. callback hell. The Tapable plugin system is both a blessing and a curse, but its main problem is you can never see the dataflow. I'm just constantly thinking in the lines of: That goes... where and when, invoked by what... oh, invoked what next?

I'd like to propose a gradual rewrite of all the callback code - either to Promises or a stream library (like RxJS). A stream/observable library might be better suited to tapable's multi-output callback chains, but given that the Promise-enhancing await-async proposal is stage-3 and Node 7 already shipped support for it (behind a flag), either option would make understanding everything so much easier. By gradual rewrite, I really mean gradual - one method at a time. Since it's all callbacks, the code could be in a mixed state for some time, until all of it is rewritten.

Mutability / lack of contracts is a harder problem to tackle. I guess a good start would be either to stop passing/binding this around, or at least ensure this isn't modified outside of the method, and instead have those callbacks actually return the pure changes that need to be applied to the original object. So the actual changes to this would be done AFTER the callback, in the method that called it using something like Object.assign. In other words, have a rule that any tapable plugin can only mutate its own objects, while the callback-caller handles the mutation of its own objects based on the values returned from the callback.

This way it will be much easier to debug and understand what modifications were made by which plugin and at which point in the code. You'd set a breakpoint in the "higher-up" method that invokes the callbacks and simply see what it returns, instead of stepping through ALL the callbacks (which can get really tedious, like when debugging enhanced-resolve where the number of callbacks for a single resolve can be 30 or more).

I'm not saying these are the only options, or that this is exactly what has to be done. Just wanted to start a discussion regarding the Developer Experience.

I think good, self-documenting code is much better than a good documentation. Mainly because when you've got clear code, anybody can write great docs for it! I often see Sean trying to motivate people to write Webpack docs, and I'm always thinking like... I'd love to help out, but can't dig into the more advanced concepts, because the big-picture of the code is just so damn hard to read.

But please treat the above simply as some of my thoughts after digging through the codebase. In a recent Medium post I saw a TypeScript rewrite is planned - I guess that will help with the contract part too.

@jhnns

This comment has been minimized.

Copy link
Member

jhnns commented Mar 25, 2017

I agree with both points, though I don't have a big problem with the "callback hell". Mutability is a bigger issue for me. It's impossible to infer the actual shape of any object from the source code, because they are extended all the time.

@TheLarkInn

This comment has been minimized.

Copy link
Member

TheLarkInn commented Aug 23, 2017

Hmm we could at least start flattening and hoisting the hell out of as many plugins as possible. And that is super easy and awesome way to give new contributors a chance to land some commits. Let's freaking do this. Maybe then @kittens will consider contributing 😏

@niieani

This comment has been minimized.

Copy link
Contributor

niieani commented Aug 23, 2017

@TheLarkInn Isn't mutability the bigger problem though? Hoisting plugins might be okay, in case of single/very short ones, but really, it's only moving things from one file to another, so the usefulness of such a change is really limited. The fact that debugger jumps to a different line, rather than to a different file isn't really helpful, yet it still introduces breaking changes.

The problem isn't that we have hundreds of small plugins, but it is that changes resulting from plugins being executed are non-inspectable (without stepping through each and every one of them, which can take hours).

In other words, given a function that runs plugins -- in its first line of code you might have some state of the app, and in the second line of code (after running a plugin), the state will have magically changed (or not! no way to tell). There's no inspectable object with the list of changes that are to be applied, since side effects are not isolated and plugins are allowed to mutate everything and anything they want.

If instead we had APIs that aggregate/schedule changes, we could not only have better clarity of what is going on, but we would also be able to optimize/batch those changes in a meaningful way.

@TheLarkInn

This comment has been minimized.

Copy link
Member

TheLarkInn commented Aug 23, 2017

I don't disagree with you whatsoever. One of the things mentioned could start happening today like hosting functions out of nested scopes etc. The other takes careful planning and consideration that we're getting closer to. @sokra and I have both discussed rewriting tapable' APIS to be far more friendly.

@TheLarkInn

This comment has been minimized.

Copy link
Member

TheLarkInn commented May 4, 2018

Some updates that I think have potentially improve the readability and friendliness of the source:

  • Adding JSDoc Annotations that will TypeCheck the codebase (yarn type-lint). With VS Code, you can see all of the Type Intellisense as you go as well (just use local TSDK). Increasing ease of codebase understanding
  • We have added Prettier for easier formatting, now no one has to worry about formatting as we also correct it on commit
  • Upgraded Tapable 1.0 which now defines hooks statically (so they can be visible at build time) which is far easier to track down what events trigger what.

Things that could still be done:

  • Untangle the Module Graph traversal code (essentially starting at Compilation#addEntry). For @kittens @gaearon
  • Fully type with JS Doc Annotations (including implicitAny and strictNullChecks)
  • Add your feedback I'm sure and submit PR to help!
  • Separate Graph => Optimize => Render into three larger distinct chunks
  • Organize files easier into Deps, Module Types, Utils, etc.
@montogeek

This comment has been minimized.

Copy link
Member

montogeek commented May 4, 2018

@TheLarkInn Do you know if it is possible to get a graph from TS annotations?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment