\documentclass{article}
\usepackage{common}
\title{Frontend Build Systems}
\begin{document}
Developers write JavaScript; browsers run JavaScript. Fundamentally, no build step is necessary in
frontend development. So why do we have a build step in modern frontend?
As frontend codebases grow larger, and as developer ergonomics become more important, shipping
JavaScript source code directly to the client leads to two primary problems:
\begin{enumerate}
\item \tb{Unsupported Language Features:} Because JavaScript runs in the browser, and because
there are many browsers out there of a variety of versions, each language feature you use
reduces the number of clients that can execute your JavaScript. Furthermore, language extensions
like JSX are not valid JavaScript and will not run in any browser.
\item \tb{Performance:} The browser must request each JavaScript file individually. In a large
codebase, this can result in thousands of HTTP requests to render a single page. In the past,
before HTTP/2, this would also result in thousands of TLS handshakes.
In addition, several sequential network round trips may be needed before all the JavaScript is
loaded. For example, if \tc{index.js} imports \texttt{page.js} and \texttt{page.js} imports
\tc{button.js}, three sequential network round trips are necessary to fully load the JavaScript.
This is called the waterfall problem.
Source files can also be unnecessarily large due to long variable names and whitespace
indentation characters, increasing bandwidth usage and network loading time.
\end{enumerate}
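The waterfall problem can be modeled roughly as follows. Each module is only discovered after its importer has been fetched and parsed, so an import chain of depth three costs three sequential round trips, while a single bundle costs one (the 100ms latency figure is an assumption for illustration):

```javascript
// Rough model of the waterfall problem: modules in an import chain are
// discovered one at a time, so load time grows with chain depth.
const roundTripMs = 100; // assumed network latency per request
const importChain = ["index.js", "page.js", "button.js"];
const waterfallMs = importChain.length * roundTripMs; // sequential fetches
const bundledMs = 1 * roundTripMs; // one request for the whole bundle
console.log(waterfallMs, bundledMs); // 300 100
```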
Frontend build systems process source code and emit one or more JavaScript files optimized for
sending to the browser. The resulting \ti{distributable} is typically illegible to humans.
\section{Build Steps}
Frontend build systems typically consist of three steps: transpilation, bundling, and minification.
Some applications may not require all three steps. For example, smaller codebases may not require
bundling or minification, and development servers may skip bundling and/or minification for
performance. Additional custom steps may also be added.
Some tools implement multiple build steps. Notably, bundlers often implement all three steps, and a
bundler alone may be sufficient to build straightforward applications. Complex applications may
require specialized tools for each build step that provide larger feature sets.
\subsection{Transpilation}
Transpilation solves the problem of unsupported language features by converting JavaScript written
in a modern version of the JavaScript standard to an older version of the JavaScript standard. These
days, ES6/ES2015 is a common target.
Frameworks and tools may also introduce transpilation steps. For example, the JSX syntax must be
transpiled to JavaScript. If a library offers a Babel plugin, that usually means that it requires a
transpilation step. Additionally, languages such as TypeScript, CoffeeScript, and Elm must be
transpiled to JavaScript.
\href{https://wiki.commonjs.org/wiki/Modules}{CommonJS modules} (CJS) must also be transpiled to a
browser-compatible module system. Since browsers gained widespread support for
\href{https://exploringjs.com/es6/ch_modules.html}{ES6 Modules} (ESM) in 2018, transpiling to ESM
has generally been recommended. ESM is also easier to optimize and
\hl{tree-shaking}{tree-shake} since its imports and exports are statically defined.
The transpilers in common use today are Babel, SWC, and TypeScript Compiler.
\begin{enumerate}
\item \href{https://babeljs.io/}{\tb{Babel}} (2014) is the standard transpiler: a slow
single-threaded transpiler written in JavaScript. Many frameworks and libraries that require
transpilation do so via a Babel plugin, requiring Babel to be part of the build process.
However, Babel is hard to debug and can often be confusing.
\item \href{https://swc.rs/}{\tb{SWC}} (2020) is a fast multi-threaded transpiler written in Rust.
It claims to be 20x faster than Babel and is accordingly used by many newer frameworks and build
tools. It supports transpiling TypeScript and JSX. If your application does not require Babel,
SWC is a superior choice.
\item \href{https://github.com/microsoft/TypeScript}{\tb{TypeScript Compiler (tsc)}} also supports
transpiling TypeScript and JSX. It is the reference implementation of TypeScript and the only
fully featured TypeScript type checker. However, it is very slow. A TypeScript application
must still be type-checked with tsc, but for the build step itself, an alternative transpiler
is much more performant.
\end{enumerate}
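To make the syntax-lowering aspect concrete, here is a sketch of how a transpiler might lower ES2020 optional chaining and nullish coalescing to ES2015-compatible code (the exact output varies by tool; this is illustrative, not Babel's literal output):

```javascript
// Modern source using optional chaining (?.) and nullish coalescing (??):
const user = { profile: null };
const nameModern = user?.profile?.name ?? "anonymous";

// An ES2015-compatible lowering a transpiler might emit instead:
var _profile = user == null ? undefined : user.profile;
var _name = _profile == null ? undefined : _profile.name;
var nameLowered = _name != null ? _name : "anonymous";

console.log(nameModern, nameLowered); // "anonymous" "anonymous"
```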
It is also possible to skip the transpilation step if your code is pure JavaScript and uses ES6
Modules.
An alternative solution for a subset of unsupported language features is a polyfill. Polyfills are
executed at runtime and implement any missing language features before executing the main
application logic. However, this adds runtime cost, and some language features cannot be polyfilled.
See \href{https://github.com/zloirock/core-js}{core-js}.
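For example, a polyfill for \tc{Array.prototype.at} (added in ES2022) checks whether the feature exists and, if not, installs an implementation before the application runs:

```javascript
// Polyfill sketch: install Array.prototype.at only if the runtime lacks it.
if (!Array.prototype.at) {
  Object.defineProperty(Array.prototype, "at", {
    value: function (index) {
      const i = Math.trunc(index) || 0;
      const n = i < 0 ? this.length + i : i; // negative indices count from the end
      return n >= 0 && n < this.length ? this[n] : undefined;
    },
    writable: true,
    configurable: true,
  });
}
console.log([10, 20, 30].at(-1)); // 30
```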
All bundlers are also inherently transpilers, as they parse multiple JavaScript source files and
emit a new bundled JavaScript file. When doing so, they can pick which language features to use in
their emitted JavaScript file. Some bundlers are additionally capable of parsing TypeScript and JSX
source files. If your application has straightforward transpilation needs, you may not need a
separate transpiler.
\subsection{Bundling}
Bundling solves the need to make many network requests and the waterfall problem. Bundlers
concatenate multiple JavaScript source files into a single JavaScript output file, called a bundle,
without changing application behavior. The bundle can be efficiently loaded by the browser in a
single round-trip network request.
The bundlers in common use today are Webpack, Parcel, Rollup, esbuild, and Turbopack.
\begin{enumerate}
\item \href{https://webpack.js.org/}{\tb{Webpack}} (2014) gained significant popularity around
2016, later becoming the standard bundler. Unlike the then-incumbent Browserify, which was
commonly used with the Gulp task runner, Webpack pioneered ``loaders'' that transformed source
files upon import, allowing Webpack to orchestrate the entire build pipeline.
Loaders allowed developers to transparently import static assets inside JavaScript files,
combining all source files and static assets into a single dependency graph. With Gulp, each
type of static asset had to be built as a separate task. Webpack also supported
\hl{code-splitting}{code splitting} out of the box, simplifying its setup and configuration.
Webpack is slow and single-threaded, written in JavaScript. It is highly configurable, but its
many configuration options can be confusing.
\item \href{https://rollupjs.org/}{\tb{Rollup}} (2016) capitalized on the widespread browser
support of ES6 Modules and the optimizations it enabled, namely \hl{tree-shaking}{tree shaking}.
It produced far smaller bundle sizes than Webpack, leading Webpack to later adopt similar
optimizations. Rollup is a single-threaded bundler written in JavaScript, only slightly more
performant than Webpack.
\item \href{https://parceljs.org/}{\tb{Parcel}} (2018) is a low-configuration bundler designed to
``just work'' out of the box, providing sensible default configurations for all steps of the
build process and developer tooling needs. It is multithreaded and much faster than Webpack and
Rollup. Parcel 2 uses SWC under the hood.
\item \href{https://esbuild.github.io/}{\tb{Esbuild}} (2020) is a bundler architected for
parallelism and optimal performance, written in Go. It is dozens of times more performant than
Webpack, Rollup, and Parcel. Esbuild implements a basic transpiler as well as a minifier.
However, it is less featureful than the other bundlers, providing a limited plugin API that
cannot directly modify the AST. Instead of modifying source files with an esbuild plugin, the
files can be transformed before being passed to esbuild.
\item \href{https://turbo.build/pack}{\tb{Turbopack}} (2022) is a fast Rust bundler that supports
incremental rebuilds. The project is built by Vercel and led by the creator of Webpack. It is
currently in beta and may be opted into in Next.js.
\end{enumerate}
It is reasonable to skip the bundling step if you have very few modules or very low network
latency (e.g. on localhost). For this reason, several development servers choose not to bundle
during development.
\hypertarget{code-splitting}{
\subsubsection{Code Splitting}
}
By default, a client-side React application is transformed into a single bundle. For large
applications with many pages and features, the bundle can be very large, negating the original
performance benefits of bundling.
Dividing the bundle into several smaller bundles, or \ti{code splitting}, solves this problem. A
common approach is to split each page into a separate bundle. With HTTP/2, shared dependencies may
also be factored out into their own bundles to avoid duplication at little cost. Additionally, large
modules may be split into separate bundles and lazy-loaded on demand.
After code splitting, the filesize of each bundle is greatly reduced, but additional network round
trips are now necessary, potentially re-introducing the waterfall problem. Code splitting is a
tradeoff.
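The lazy-loading half of this tradeoff can be sketched as follows. In a real application the loader would be a dynamic \tc{import("./chart.js")} that fetches a separate bundle; here a counter stands in for the network request to show that the split bundle is only fetched on first use:

```javascript
// Sketch of lazy loading: the loader runs only on first use, then the
// result is cached for subsequent calls.
function lazy(loader) {
  let cached;
  let loaded = false;
  return () => {
    if (!loaded) {
      cached = loader();
      loaded = true;
    }
    return cached;
  };
}

let fetches = 0;
const loadChart = lazy(() => {
  fetches += 1; // stands in for fetching the split bundle over the network
  return { render: () => "chart" };
});

console.log(fetches); // 0 -- nothing fetched until first use
loadChart();
loadChart();
console.log(fetches); // 1 -- fetched once, then served from cache
```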
The filesystem router, popularized by Next.js, optimizes the code splitting tradeoff. Next.js
creates separate bundles per page, only including the code imported by that page in its bundles.
Loading a page preloads all bundles used by that page in parallel. This optimizes bundle size
without re-introducing the waterfall problem. The filesystem router achieves this by creating one
entry point per page (\tc{pages/**/*.jsx}), as opposed to the single entry point of traditional
client-side React apps (\tc{index.jsx}).
\hypertarget{tree-shaking}{
\subsubsection{Tree Shaking}
}
A bundle is composed of multiple modules, each of which contains one or more exports. Often, a given
bundle will only make use of a subset of exports from the modules it imports. The bundler can remove
the unused exports of its modules in a process called \ti{tree shaking}. This optimizes the bundle
size, improving loading and parsing times.
Tree shaking depends on static analysis of the source files, and is thus impeded when static
analysis is made more challenging. Two primary factors influence the efficiency of tree shaking:
\begin{enumerate}
\item \tb{Module System:} ES6 Modules have static exports and imports, while CommonJS modules have
dynamic exports and imports. Bundlers are thus able to be more aggressive and efficient when
tree shaking ES6 Modules.
\item \tb{Side Effects:} The \tc{sideEffects} property of \tc{package.json} declares whether a
module has side effects on import. When side effects are present, unused modules and unused
exports may not be tree shaken due to the limitations of static analysis.
\end{enumerate}
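A small worked example of the first factor (module and function names are illustrative):

```javascript
// Suppose utils.js exports two functions:
//
//   export const formatDate = (d) => d.toISOString().slice(0, 10);
//   export const formatMoney = (n) => "$" + n.toFixed(2);
//
// and the application imports only formatDate. Because ESM imports are
// static, the bundler can prove formatMoney is unreachable and emit a
// bundle equivalent to:
const formatDate = (d) => d.toISOString().slice(0, 10);
console.log(formatDate(new Date("2024-01-15T00:00:00Z"))); // "2024-01-15"
// formatMoney never appears in the output. By contrast, a CJS require()
// returns the whole exports object, whose properties may be accessed
// dynamically, which is why CJS is much harder to tree-shake.
```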
\subsubsection{Static Assets}
Static assets, such as CSS, images, and fonts, are typically added to the distributable in the
bundling step. They may also be optimized for filesize in the minification step.
Prior to Webpack, static assets were built separately from the source code in the build pipeline as
an independent build task. To load the static assets, the application had to reference them by their
final path in the distributable. Thus, it was common to carefully organize assets around a URL
convention (e.g. \tc{/assets/images/banner.jpg} and \tc{/assets/fonts/Inter.woff2}).
Webpack ``loaders'' allowed the importing of static assets from JavaScript, unifying both code and
static assets into a single dependency graph. During bundling, Webpack replaces the static asset
import with its final path inside the distributable. This feature enabled static assets to be
organized with their associated components in the source code and created new possibilities for
static analysis, such as detecting non-existent assets.
It is important to recognize that importing static assets (files that are not JavaScript and do
not transpile to JavaScript) is not part of the JavaScript language. It requires a bundler
configured with support for that asset type. Fortunately, the bundlers that
followed Webpack also adopted the ``loaders'' pattern, making this feature commonplace.
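The effect of an asset loader can be sketched as a build-time rewrite (the content-hashed path below is hypothetical):

```javascript
// The developer writes (with an image loader configured):
//
//   import bannerUrl from "./banner.jpg";
//
// At build time, the bundler copies banner.jpg into the distributable and
// replaces the import with the asset's final path, roughly equivalent to:
const bannerUrl = "/assets/banner.4f2a9c1e.jpg"; // hypothetical hashed path
// The application then uses the URL like any ordinary string:
console.log(`<img src="${bannerUrl}">`);
```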
\subsection{Minification}
Minification resolves the problem of unnecessarily large files. Minifiers reduce the size of a file
without affecting its behavior. For JavaScript code and CSS assets, minifiers can shorten variables,
eliminate whitespace and comments, eliminate dead code, and optimize language feature use. For other
static assets, minifiers can perform file size optimization. Minifiers are typically run on a bundle
at the end of the build process.
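A before-and-after sketch of JavaScript minification (illustrative; real minifiers apply many more transformations):

```javascript
// Before minification:
//
//   function addTax(price, taxRate) {
//     // taxRate is a fraction, e.g. 0.2 for 20% tax
//     return price + price * taxRate;
//   }
//
// After minification -- shortened names, stripped whitespace and
// comments, identical behavior:
function a(n,t){return n+n*t}
console.log(a(100, 0.2)); // 120
```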
Several JavaScript minifiers in common use today are Terser, esbuild, and SWC.
\href{https://terser.org/}{\tb{Terser}} was forked from the unmaintained uglify-es. It is written in
JavaScript and is somewhat slow. \tb{Esbuild} and \tb{SWC}, mentioned previously, implement
minifiers in addition to their other capabilities and are faster than Terser.
Several CSS minifiers in common use today are cssnano, csso, and Lightning CSS.
\href{https://cssnano.github.io/cssnano/}{\tb{Cssnano}} and
\href{https://github.com/css/csso}{\tb{csso}} are pure CSS minifiers written in JavaScript and thus
somewhat slow. \href{https://lightningcss.dev/}{\tb{Lightning CSS}} is written in Rust and claims to
be 100x faster than cssnano. Lightning CSS additionally supports CSS transformation and bundling.
\section{Developer Tooling}
The basic frontend build pipeline described above is sufficient to create an optimized production
distributable. There exist several classes of tools that augment the basic build pipeline and
improve upon developer experience.
\subsection{Meta-Frameworks}
The frontend space is notorious for the challenge of picking the ``right'' packages to use. For
example, of the five bundlers listed above, which should you pick?
Meta-frameworks provide a curated set of already selected packages, including build tools, that
synergize and enable specialized application paradigms. For example,
\href{https://nextjs.org}{\tb{Next.js}} specializes in Server-Side Rendering (SSR) and
\href{https://remix.run}{\tb{Remix}} specializes in progressive enhancement.
Meta-frameworks typically provide a preconfigured build system, removing the need for you to stitch
one together. Their build systems have configurations for both production and development servers.
Like meta-frameworks, build tools like \href{https://vitejs.dev/}{\tb{Vite}} provide preconfigured
build systems for both production and development. Unlike meta-frameworks, they do not force a
specialized application paradigm. They are suitable for generic frontend applications.
\subsection{Sourcemaps}
The distributable emitted by the build pipeline is illegible to most humans. This makes it difficult
to debug any errors that occur, as their tracebacks point to illegible code.
\href{https://developer.chrome.com/blog/sourcemaps/}{Sourcemaps} resolve this problem by mapping
code in the distributable back to its original location in the source code. The browser and triage
tools (e.g. Sentry) use the sourcemaps to restore and display the original source code. In
production, sourcemaps are often hidden from the browser and only uploaded to triage tools to avoid
publicizing the source code.
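A sourcemap is a small JSON file alongside the generated output. Its shape (version 3, the format in common use) looks roughly like this, with the file names below being illustrative:

```javascript
// Simplified shape of a version 3 sourcemap. "mappings" is a compact
// Base64-VLQ-encoded index from positions in the generated file back to
// positions in "sources"; "names" records original identifiers.
const map = JSON.parse(`{
  "version": 3,
  "file": "bundle.min.js",
  "sources": ["src/index.js"],
  "names": ["addTax"],
  "mappings": "AAAA"
}`);
console.log(map.sources[0]); // "src/index.js"
```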
Each step of the build pipeline can emit a sourcemap. If multiple build tools are used to construct
the pipeline, the sourcemaps will form a chain (e.g. \tc{source.js} $\to$ \tc{transpiler.map} $\to$
\tc{bundler.map} $\to$ \tc{minifier.map}). To identify the source code corresponding to the
minified code, the chain of source maps must be traversed.
However, most tools are not capable of interpreting a chain of sourcemaps; they expect at most one
sourcemap per file in the distributable. The chain of sourcemaps must be flattened into a single
sourcemap. Preconfigured build systems will solve this problem (see Vite's
\href{https://github.com/vitejs/vite/blob/feae09fdfab505e58950c915fe5d8dd103d5ffb9/packages/vite/src/node/utils.ts\#L831}{\tc{combineSourcemaps}}
function).
\subsection{Hot Reload}
Development servers often provide a Hot Reload feature, which automatically rebuilds a new bundle on
source code changes and reloads the browser. While greatly superior to rebuilding and reloading
manually, it is still somewhat slow, and all client-side state is lost on reload.
\href{https://webpack.js.org/concepts/hot-module-replacement/}{Hot Module Replacement} improves upon
Hot Reload by replacing changed bundles in the running application, an in-place update. This
preserves the client-side state of unchanged modules and reduces the latency between code change and
updated application.
However, each code change triggers a rebuild of every bundle that includes the changed module, so
rebuild time scales linearly with bundle size. Hence, in large applications, Hot Module Replacement
can become slow due to the growing rebundling cost.
The \href{https://vitejs.dev/guide/why.html}{no-bundle paradigm}, currently championed by Vite,
counters this by skipping bundling in development entirely. Instead, Vite serves ESM modules, each
corresponding to a source file, directly to the browser. In this paradigm, each code change triggers
a single module replacement in the browser. This results in near-constant refresh time relative to
application size. However, with many modules, the initial page load may take longer.
\subsection{Monorepos}
In organizations with multiple teams or multiple applications, the frontend may be split into
multiple JavaScript packages, but retained in a single repository. In such architectures, each
package has its own build step, and together they form a dependency graph of packages. The
applications reside at the roots of the dependency graph.
Monorepo tools orchestrate the building of the dependency graph. They often provide features such as
incremental rebuilds, parallelism, and remote caching. With these features, large codebases can
enjoy the build times of small codebases.
Industry-standard monorepo tools like \href{https://bazel.build/}{Bazel} support a wide range of
languages, complicated build graphs, and hermetic execution. However, JavaScript for
frontend is one of the hardest ecosystems to completely integrate with these tools, and there is
currently little prior art.
Fortunately, there exist several monorepo tools designed specifically for frontend. Unfortunately,
they lack the flexibility and robustness of Bazel et al., most notably hermetic execution.
The frontend-specific monorepo tools in common use today are \href{https://nx.dev/}{\tb{Nx}} and
\href{https://turbo.build/repo}{\tb{Turborepo}}. Nx is more mature and featureful, while Turborepo
is part of the Vercel ecosystem. In the past, \href{https://lerna.js.org/}{\tb{Lerna}} was the
standard tool for linking multiple JavaScript packages together and publishing them to NPM. In 2022,
the Nx team took over Lerna, and Lerna now uses Nx under the hood to power builds.
\section{Trends}
Newer build tools are written in compiled languages and emphasize performance. Frontend builds were
terribly slow circa 2019, but modern tools have greatly sped them up. However, modern tools have smaller
feature sets and are sometimes incompatible with libraries, so legacy codebases often cannot easily
switch to them.
Server-Side Rendering (SSR) has grown in popularity with the rise of Next.js. SSR does not
introduce any fundamental differences to frontend build systems. SSR applications must also serve
JavaScript to the browser, and they thus execute the same build steps.
\end{document}