Thinking in terms of API-driven and documentation-driven design will yield more usable modules than not doing so. You might argue that internals are not that important: "as long as the interface holds, we can put anything we want in the mix!" A usable interface is only one side of the equation, however; it will do little to keep the maintainability of your applications in check. Properly designed module internals help keep our code readable and its intent clear. In this chapter we’ll discuss what it takes to write modules with scalability in mind, but without getting too far ahead of our current requirements. We’ll discuss the CRUST constraints in some more depth, and finally elaborate on how to prune modules as they become larger and more complex over time.
3.1 Growing a Module
Small, single-purpose functions are the lifeblood of clean module design. Purpose-built functions scale well because they introduce little organizational complexity into the module they belong to, even when that module grows to 500 lines of code. Small functions are not necessarily less powerful than large functions, but their power lies in composition.
Suppose that instead of implementing a single function with 100 lines of code we break it up into 3 or more smaller functions. We might later be able to reuse one of those smaller functions somewhere else in our module, or it might prove a useful addition to its public interface.
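As a minimal sketch of that idea, the hypothetical summarize function below is split into three small helpers. None of these names come from the text; they are assumptions made purely for illustration.

```javascript
// Hypothetical example: each helper does one thing, so it can be reused
// elsewhere in the module or promoted to the public interface later on.
function parseLine(line) {
  const [name, quantity, price] = line.split(',');
  return { name: name.trim(), quantity: Number(quantity), price: Number(price) };
}

function lineTotal(item) {
  return item.quantity * item.price;
}

function formatCurrency(amount) {
  return `$${amount.toFixed(2)}`;
}

// The "large" function is now little more than a composition of the small ones.
function summarize(csv) {
  const items = csv.trim().split('\n').map(parseLine);
  const total = items.reduce((sum, item) => sum + lineTotal(item), 0);
  return formatCurrency(total);
}
```

A helper such as formatCurrency is the kind of piece that might later be reused in another part of the module without touching summarize.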
In this chapter we’ll discuss design considerations aimed at reducing complexity at the module level. While most of the concerns we’ll discuss here have an effect on the way we write functions, it is in the next chapter where we’ll be specifically devoting our time to the development of simple functions.
3.1.1 Composability and Scalability
Cleanly composed functions are at the heart of effective module design. Functions are the fundamental unit of our code. We could get away with writing the smallest possible number of functions required, the ones that are invoked by consumers or need to be passed for other interfaces to consume, but that wouldn’t get us much in the way of maintainability.
We could rely solely on intuition to decide what deserves to be its own function and what is better left inlined as part of a larger body of code, but this might leave us with inconsistencies that depend upon our frame of mind, as well as how each member of a team perceives functions are to be sliced. As we’ll see in the next chapter, pairing a few rules of thumb with our own intuition is an effective way of keeping functions simple, limiting their scope.
At the module level, it’s required that we implement features with the API surface in mind. When we plan out new functionality, we have to consider whether the abstraction is right for our consumers, how it might evolve and scale over time, and how narrowly or broadly it can support the use cases of its consumers.
When considering whether the abstraction is right, suppose we have a function that’s a draggable object factory for DOM elements. Draggable objects can be moved around and then dropped in a container, but consumers often have to impose different limitations on the conditions under which the object can be moved, some of which we’ll outline in the following list.
Draggable elements must have a parent with a
Draggable elements mustn’t have a
Dragging must initiate from a child with a
Elements may be dropped into containers with a
Elements may be dropped into containers with at most 6 children
Elements may not be dropped into the container they’re being dragged from
Elements must be sortable in the container they’re dragged from, but they can’t be dropped into other containers
We’ve now spent quite a bit of time thinking about use cases for a drag and drop library, so we’re well equipped to come up with an API that will satisfy most or maybe even every one of these use cases, without dramatically broadening our API surface.
Consider, in contrast, the situation if we were to go off and implement a way of checking off each use case in isolation without taking into account similar use cases, or cases which might arise but are not an immediate need. We would end up with seven different ways of introducing specific restrictions on how elements are dragged and dropped. Since we’ve designed their interfaces in isolation, each of these solutions is likely to be at least slightly different from the rest. Maybe they’re similar enough that each of them is an option flag, but the consumer still can’t help but wonder why we have seven different flags for such similar use cases, and they can’t shake the feeling that we’ve designed the interface poorly. Except there wasn’t much in the way of design: we’ve mostly tacked requirement upon requirement onto our API surface as they came along, never daring to look at the road ahead and envision how the API might evolve in the future. If we had designed them with scalability in mind, we might’ve grouped many similar use cases under the same feature, and would’ve avoided an unnecessarily large API surface in the process.
Going back to the case where we do spend some time thinking ahead, and create a collection of similar requirements and use cases, we should be able to find a common denominator that’s suitable for most use cases. We’ll know when we have the right abstraction because it’ll cater to every requirement we have, and a few we didn’t even have to fulfill but which the abstraction satisfies anyhow. In the case of draggable elements, once we’ve taken all the requirements into account, we might choose to define a few options that impose restrictions based on a few CSS selectors, or we might introduce a callback where the user can determine whether an element can be dragged and another where they can determine whether the element can be dropped. These choices also depend on how heavily the API is going to be used, how flexible we want it to be, and how frequently we intend to make changes to it.
Sometimes we won’t have the opportunity to think ahead, we might not be able to foresee all possible use cases, our forecasts may fail us, or requirements may change, pulling the rug from under our feet. Granted, this never is the ideal situation to find ourselves in, but it is certain we wouldn’t be better off if we hadn’t paid attention to the use cases for our module in aggregate. On the other hand, extra requirements may fit within the bounds of an abstracted solution, provided the new use case is similar enough to what we expected when designing the abstraction.
Abstractions aren’t free, but they can shield portions of code from complexity. Naturally, we could boldly claim an elegant interface such as fn => fn() solves all problems in computing: the consumer only needs to provide the right fn callback. The reality is we wouldn’t be doing anything but offloading the problem onto the consumer, at the cost of implementing the right solution themselves while still consuming our API in the process.
When we’re weighing whether to offer an interface like CSS selectors or callbacks, we’re deciding how much we want to abstract, and how much we want to leave up to the consumer. When we choose to let the user provide CSS selectors, we keep the interface short, but the use cases will be limited as well. Consumers won’t be able, for example, to decide dynamically whether the element is draggable or not beyond what a CSS selector can offer. When we choose to let the user provide callbacks, we make it harder for them to use our interface, since they now have to provide bits and pieces of the implementation themselves, but that expense buys them great flexibility in how to decide what is draggable and what is not.
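The two styles can be compared in a small sketch. The option names handle and canDrag, and the createDragPolicy function itself, are hypothetical and not taken from any real drag and drop library; the only DOM method assumed is Element.matches.

```javascript
// Hypothetical sketch: turns either style of option into a single predicate
// that the library internals could call before starting a drag.
function createDragPolicy(options = {}) {
  const { handle, canDrag } = options;
  if (typeof canDrag === 'function') {
    // Callback style: flexible, but the consumer writes part of the logic.
    return (element, target) => Boolean(canDrag(element, target));
  }
  if (typeof handle === 'string') {
    // Selector style: short to configure, limited to what CSS can express.
    return (element, target) => target.matches(handle);
  }
  // No restrictions configured: everything can be dragged.
  return () => true;
}

// Usage: target is the element where the drag gesture started.
const bySelector = createDragPolicy({ handle: '.handle' });
const byCallback = createDragPolicy({ canDrag: element => !element.locked });
```

With the selector option the consumer writes a single string, while the callback can consult any state it likes, such as the made-up locked property above.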
Like most things in program design, API design is a constant tradeoff between simplicity and flexibility. For each particular case, it is our responsibility to decide how flexible we want the interface to be, but at the expense of simplicity. We can also decide how simple we want an interface to be, but at the expense of flexibility. Going back to jQuery, it’s interesting to note how they always favor simplicity, by allowing you to provide as little information as needed for most of their API methods. Meanwhile, they avoid sacrificing flexibility by offering countless overloads for each of their API methods. The complexity lies in their implementation, balancing arguments by figuring out whether they’re a NodeList, a DOM element, an array, a function, a selector, or something else (not to mention optional parameters) before even starting to fulfill the consumer’s goal when making an API call. Consumers observe some of the complexity at the seams, when sifting through documentation and finding out about all the different ways of accomplishing the same goals. And yet, despite all of jQuery’s internal complexity, code which consumes the jQuery API manages to stay ravishingly simple.
3.1.2 Design for Today
Before we go off and start pondering the best ways of abstracting a feature we need to implement so that it caters to every single requirement that might come in the future, it’s necessary to take a step back and consider simpler alternatives. A simple implementation means we pay smaller upfront costs, but it doesn’t necessarily mean that new requirements will result in breaking changes.
Interfaces don’t need to cater to every conceivable use case from the outset. As we’ve analyzed in chapter 2, sometimes we may get away with first implementing a solution for the simplest or most common use case, and then adding an options parameter through which newer use cases can be configured. As we get to more advanced use cases, we can make decisions as outlined in the previous section, choosing which use cases deserve to be grouped under an abstraction and which are too narrow for an abstraction to be worthwhile.
Similarly, the interface could start off supporting only one way of receiving its inputs, and as use cases evolve we might bake polymorphism into the mix, accepting multiple input types in the same parameter position. Grandiose thinking may lead us to believe that, in order to be great, our interfaces must be able to handle every input type and be highly configurable with dozens of configuration options. This might well be true for the most advanced users of our interface, but if we don’t take the time to let the interface evolve and mature as needed, we might code our interface into a corner that can then only be repaired by writing a different component from the ground up with a better thought out interface, and later replacing references to the old component with the new one.
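To make that evolution concrete, here is a sketch of a hypothetical truncate utility. We assume its first version only accepted a number, as in truncate(text, 20), and that an options object was added later in the same parameter position. The function name, option names, and defaults are all assumptions for illustration.

```javascript
// Hypothetical utility: the second parameter started out as a plain number
// and later grew to also accept an options object, without breaking callers.
function truncate(text, lengthOrOptions = {}) {
  const options = typeof lengthOrOptions === 'number'
    ? { length: lengthOrOptions }
    : lengthOrOptions;
  const { length = 30, suffix = '...' } = options;
  if (text.length <= length) {
    return text;
  }
  // Leave room for the suffix so the result never exceeds the length limit.
  const keep = Math.max(0, length - suffix.length);
  return text.slice(0, keep) + suffix;
}
```

Existing calls such as truncate(title, 20) keep working, while newer callers can pass { length: 20, suffix: '~' } once that use case actually shows up.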
A larger interface is rarely better than a smaller interface which accomplishes the job consumers need it to fulfill. Elegance is of the essence here: if we wish for our interface to remain small but we predict the consumer will eventually need to hook into different pieces of our component’s internal behavior so that they can react accordingly, we’re better off waiting until this requirement materializes than building a solution for a problem we don’t yet have.
Not only will we be focusing development hours on functionality that’s needed today, but we’ll also avoid creating complexity that can be dispensed with for the time being. It might be argued that the ability to react to internal events of a library won’t introduce a lot of complexity. Consider, however, the case where the requirement never materializes. We’d have burdened our component with increased complexity to satisfy functionality we never needed. Worse yet, consider the case where the requirement changes between the moment we’ve implemented a solution and when it’s actually needed. We’d now have functionality we never needed, which clashes with different functionality that we do need.
Suppose we don’t only need hooks to react to events, but we need those hooks to be able to transform internal state — how would the event hooks interface change then? Chances are, someone might’ve found a use for the event listeners we’ve implemented earlier, and so we cannot dispose of them with ease. We might be forced to change the event listener API to support internal state transformations, which would result in a cringeworthy interface that’s bound to frustrate implementers and consumers alike.
Falling into the trap of implementing features consumers don’t yet need might be easy at first, but it’ll cost us dearly in terms of complexity, maintainability, and wasted developer hours. The best code is no code at all. This means fewer bugs, less time spent writing code, less time writing documentation, and less time fielding support requests. Latch onto that mentality and strive to keep functionality to exactly the absolute minimum that’s required.
3.1.3 Abstractions Evolve in Small Steps
It’s important to note that abstractions should evolve naturally, rather than have them force an implementation style upon us. When we’re unsure about whether to bundle a few use cases with an abstraction, the best option is often to wait and see whether more use cases would fall into the abstraction we’re considering. If we wait and the abstraction holds true for more and more use cases, we can go ahead and implement the abstraction. If the abstraction doesn’t hold, then we can be thankful we won’t have to bend the abstraction to fit the new use cases, often breaking the abstraction or causing more grief than the abstraction had originally set out to avoid on our behalf.
In a similar fashion to that of the last section, we should first wait until use cases emerge and then reconsider an abstraction when its benefits become clear. While developing unneeded functionality is little more than a waste of time, leveraging the wrong abstractions will kill or, at best, cripple our component’s interface. While good abstractions are a powerful tool that can reduce the complexity and volume of code we write, subjecting consumers to inappropriate abstractions might increase the amount of code they need to write and will forcibly increase complexity by having users bend to the will of the abstraction, causing frustration and eventual abandonment of the poorly abstracted component.
HTTP libraries are a great example of how the right abstraction for an interface depends entirely on the use cases its consumer has in mind. Plain GET calls can be serviced with callbacks or promises, but streaming requires an event-driven interface which allows the consumer to act as soon as the stream has portions of data ready for consumption. A typical GET request could be serviced by an event-driven interface as well, allowing the implementer to abstract every use case under an event-driven model. To the consumer, this model would feel a bit convoluted for the simplest case, however. Even when we’ve grouped every use case under a convenient abstraction, the consumer shouldn’t have to settle for get('/cats').on('data', gotCats) when their use case doesn’t involve streaming and they could be using a simpler get('/cats', gotCats) interface instead, which wouldn’t need to handle error events separately, either, instead relying on the Node.js convention where the first argument passed to callbacks is an error or null when everything goes smoothly.
An HTTP library that’s primarily focused on streaming might go for the event-driven model in all cases, arguing that convenience methods such as a callback-based interface could be implemented on top of their primitive interface. This is acceptable: we’re focusing on the use case at hand and keeping our API surface as small as possible, while still allowing our library to be wrapped for higher-level consumption. If our library were primarily focused on the experience of leveraging its interface, we might go for the callback or promise based approach. When that library then has to support streaming, it might incorporate an event-driven interface. At this point we’d have to decide whether we’ll expose that kind of interface solely for streaming purposes, or if it’ll be available for commonplace scenarios as well. On the one hand, exposing it solely for the streaming use case keeps the API surface small. On the other, exposing it for every use case results in a more flexible and consistent API, which might be what consumers expect.
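The layering described above can be sketched as follows. This is not a real HTTP client: getStream, get, and the in-memory fakeResponses table are hypothetical stand-ins so that the example runs without a network, and only Node's built-in EventEmitter is assumed.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a network transport; the URL and chunks are made up.
const fakeResponses = { '/cats': ['["tom",', '"felix"]'] };

// Primitive, event-driven interface: suits consumers that want streaming.
function getStream(url) {
  const request = new EventEmitter();
  // Emit on a later tick so the consumer has time to attach listeners.
  setImmediate(() => {
    const chunks = fakeResponses[url];
    if (!chunks) {
      request.emit('error', new Error(`No response for ${url}`));
      return;
    }
    chunks.forEach(chunk => request.emit('data', chunk));
    request.emit('end');
  });
  return request;
}

// Convenience wrapper built on the primitive: an error-first callback
// receives the whole body once, so there are no events to wire up.
function get(url, done) {
  let body = '';
  getStream(url)
    .on('data', chunk => { body += chunk; })
    .on('error', error => done(error))
    .on('end', () => done(null, body));
}
```

A consumer who only wants the full response could call get('/cats', (error, body) => { ... }), while a streaming consumer would use getStream('/cats').on('data', ...) directly.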
Context is of the utmost relevance here. When we’re developing an interface for an open-source or otherwise broadly available library, we might need to listen to a variety of folks who’ll be weighing in on how the API should be designed. Depending on our audience, they may prefer a smaller API surface or a flexible interface. Over time, broadly available libraries tend to favor flexibility over simplicity, as the number of users grows and with them, the number of use cases the library needs to support. When the component is being developed in the context of our day jobs, we might not need to cater to a broad audience. It may well be that we ourselves are the only ones who will be consuming the API, or maybe our team. It might be that we belong to a UI platform team that serves the entire company, which would put us in a situation akin to the open-source case, though.
In any case, when we’re uncertain if our interface will be needing to expose certain surface areas, it’s highly recommended that we don’t expose any of it until we are indeed certain. Keeping API surfaces as small as possible reduces the odds of presenting the consumer with multiple ways of accomplishing the same task. This is often undesirable given that users will undoubtedly become confused and come knocking about which one is the best solution. There are a few answers. When the best solution is always the same, the other offerings probably don’t belong in our public interface. When the best solution depends on the use case, then we should be on the lookout for better abstractions which encapsulate those similar use cases under a single solution. If the use cases are different enough, so should the solutions offered by the interface, in which case consumers shouldn’t be faced with uncertainty: our interface would only offer a single solution for that particular use case.
3.1.4 Move Deliberately and Experiment
You might have heard the "Move Fast and Break Things" mantra from Facebook. It’s dangerous to take this mantra literally in terms of software development, which shouldn’t be hurried nor frequently broken, let alone on purpose. The mantra is meant to be interpreted as an invitation to experiment, where the things we should be breaking are assumptions about how an application architecture should be laid out, how users behave, what advertisers want, and any other assumptions. Moving fast means to quickly hash out prototypes to test our newfound assumptions, to timely seize upon new markets, to avoid engineering slowing to a crawl as teams and requirements grow in size and complexity, and to constantly iterate on our products or codebases.
Taken literally, moving fast and breaking things is a dreadful way to go about software development. Any organization worth its salt would never encourage engineers to write code faster at the expense of product quality. Code should exist mostly because it has to, in order for the products it makes up to exist. The less complex the code we write, provided the product remains the same, the better.
The code that makes up a product should be covered by tests, minimizing the risk of bugs making their way to production. When we take "Move Fast and Break Things" literally, we are tempted to think testing is optional, since it slows us down and we need to move fast. A product that’s not covered by tests will be, ironically, unable to move fast when bugs inevitably arise and wind down engineering speed.
A better mantra might be one that can be taken literally, such as "Move Deliberately and Experiment". This mantra carries the same sentiment as the Facebook mantra of "Move Fast and Break Things", but its true meaning isn’t meant to be decoded or interpreted. Experimentation is a key aspect of software design and development. We should constantly try out and validate new ideas, verifying whether they pose better solutions than the status quo. We could interpret "Move Fast and Break Things" as "A/B test early and A/B test often", and "Move Deliberately and Experiment" can convey this meaning as well.
To move deliberately is to move with cause. Engineering tempo will rarely be guided by the development team’s desire to move faster, but is most often instead bound by release cycles and the complexity in requirements needed to meet those releases. Of course, everyone wants engineering to move fast where possible, but interface design shouldn’t be hurried, regardless of whether the interface we’re dealing with is an architecture, a layer, a component, or a function. Internals aren’t as crucial to get right, for as long as the interface holds, the internals can be later improved for performance or readability gains. This is not to advocate sloppily developed internals, but rather to encourage respectfully and deliberately thought out interface design.
3.2 CRUST Considerations
We’re getting closer to function internals, which will be discussed at length in chapter 4. Before we do so, we need to address a few more concerns on the component level. This section explores how we can keep components simple by following the CRUST principle outlined in chapter 2.
3.2.1 Do Repeat Yourself, Occasionally
The DRY principle (Don’t Repeat Yourself) is one of the best regarded principles in software development, and rightly so. It prompts us to write a loop when we could write a hundred print statements, it makes us create reusable functions so that we don’t end up having to maintain several instances of the same piece of code, and it questions the need for slight permutations of what’s virtually the same piece of code repeated over and over across our codebases.
When taken to the extreme, though, DRY is harmful and hinders development. Our mission to find the right abstractions will be cut short if we are ever vigilant in our quest to suppress any and all repetition. When it comes to finding abstractions, it’s almost always best to pause and reflect on whether we ought to force DRY at this moment, or if we should wait a while and see whether a better pattern emerges.
Being too quick to follow DRY may result in picking the wrong abstraction, costing us time if we realize the mistake early enough, and causing even more damage the longer we let an undesirable abstraction loose.
In a similar fashion, blindly following DRY for even the smallest bit of code is bound to make our code harder to follow or read. Merging two sides of a regular expression that was optimized for readability (a rare sight in the world of regular expressions) will almost certainly make it harder to read and correctly infer its purpose. Is following DRY truly worthwhile in cases like this?
The whole point of DRY is to write concise code, improving readability in turn. When the more concise piece of code results in a program that’s harder to read than what we had, DRY was probably a bad idea, a solution to a problem we didn’t yet have, not in this particular piece of code, not yet anyway. In order to stay sane, it’s necessary to take software development advice with a grain of salt, as we’ll discuss in section 3.3.4.
Most often, DRY is the correct approach, but there are indeed cases when DRY might not be appropriate, such as when it yields trivial gains at the expense of readability or when it hinders our ability to find better abstractions. We can always come back to our piece of code and sculpt pieces away making it more DRY. This is typically easier than trying to decouple bits of code we’ve mistakenly made DRY, which is why sometimes it’s best to wait before we commit to DRY.
3.2.2 Feature Isolation
We’ve discussed interface design at great length, but we haven’t touched on decisions around when to split a module into smaller pieces. In modern application architectures, having certain modules may be required by conventional practices. For instance, a web application made up of different views may require that each view is its own component. This limitation shouldn’t, however, stop us from breaking up the internal implementation of the view into several smaller components. These smaller components might be reused in other views or components, tested on their own, and better isolated than they might have otherwise been if they were tightly coupled to their parent view.
Even when the smaller component isn’t being reused anywhere else, and perhaps not even tested on its own, it’s still worth moving it to a different file. Why? Because we’re removing the complexity that makes up the child component from its parent virtually for free. We’re only paying a cheap indirection cost, where the child component is now referenced as a dependency of its parent instead of being inlined. When we split up the internals of a large component into several children, we’re chopping up its internal complexity and ending up with several simple components. The complexity didn’t dissipate; it’s subtly hidden away in the interrelationships between these child components and their parent. That’s now the biggest concern in the parent module, whereas each of the smaller modules doesn’t need to know much about these relationships.
Chopping up internals doesn’t only work for view components and their children. That said, view components pose a great example that might help us visualize how complexity can remain flat across a component system, regardless of how deep we go, instead of being contained in a large component with little structure and a high level of complexity or coupling. This is akin to looking at the universe on a macroscopic level and then taking a closer look, until we get to the atomic level, and then beyond. Each layer has its own complexities and intricacies waiting to be discovered, but the complexity is spread across the layers rather than clustered on any one particular layer. The spread reduces the amount of complexity we have to observe and deal with on any given layer.
Speaking of layers, it is at this stage of the design process that you might want to consider defining different layers for your application. You might be used to having models, views, and controllers in MVC applications, or maybe you’re accustomed to actions, reducers, and selectors in Redux applications. Maybe you should think of implementing a service layer where all the business logic occurs, or perhaps a persistence layer where all the caching and persistent storage takes place.
When we’re not dealing with modules which we ought to shape in a certain way, like views, but modules that can be composed any which way we choose, like services, we should consider whether new features belong in an existing module or in an entirely new module. When we have a module which wraps a Markdown parsing library adding functionality such as support for emoji expansions, and want an API that can take the resulting HTML and strip out certain tags and attributes, should we add that functionality to the Markdown module or put it in a separate module?
On the one hand, having it in the Markdown module would save us the trouble of importing both modules when we want the sanitization functionality, but on the other hand, there may be quite a few cases where we have HTML that didn’t come from Markdown parsing but which we still want to sanitize. A solution that’s often effective in these cases is putting the HTML sanitization functionality into its own module, but consume it in the Markdown module for convenience. This way, consumers of the Markdown module always get sanitized output, and those who want to sanitize a piece of HTML directly can do so as well. We could always make sanitization opt-in (or better yet, opt-out) for the Markdown module, if the feature isn’t always what’s needed by consumers of that interface.
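The arrangement can be sketched in a single file, with comments marking where each module would live. Both functions are deliberately naive stand-ins written for this illustration: sanitize is not a secure sanitizer, renderMarkdown only understands bold text and a made-up emoji table, and the sanitized option name is an assumption.

```javascript
// sanitize module (stand-in): NOT a secure sanitizer, only shaped like one.
// A real project would delegate to a vetted HTML sanitization library.
function sanitize(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');
}

// markdown module (stand-in): handles **bold** and a tiny emoji table.
const emoji = { ':smile:': '\u{1F604}' };

function renderMarkdown(source, options = {}) {
  // Opt-out: sanitization is on unless the consumer disables it.
  const { sanitized = true } = options;
  const html = source
    .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
    .replace(/:[a-z_]+:/g, code => emoji[code] || code);
  return sanitized ? sanitize(html) : html;
}
```

Consumers of renderMarkdown get sanitized output by default, while code holding HTML from another source can call sanitize directly without pulling in any Markdown functionality.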
It can be tempting to create a utilities.js module where we deposit all of our functionality which doesn’t belong anywhere else. When we move onto a new project, we tend to want some of this functionality once again, so we might copy the relevant parts over to the new project. Here we’d be breaking the DRY principle, because instead of reusing the same bits of code we’re creating a new module that’s a duplicate of what we had. Worse yet, over time we’ll eventually modify one of the utilities.js copies, so they might not contain the same functionality anymore.
The low hanging fruit here would be to create a lib directory instead of a single utilities.js module, and place each independent piece of functionality into its own module. Naturally, some of these pieces of functionality will depend on other utility functions, but we’ll be better off importing those bits from another module than keeping everything in the same file. Each small file makes it obvious what the utility is, what other bits it relies on, and can be tested and documented individually. More importantly, when the utility grows in scope, file size, and complexity, it will remain manageable because we’ve isolated it early. In contrast, if we kept everything in the same file but then one of the utilities grew considerably, we’d have to pull the functionality into a different module, at which point our code might be coupled with other utilities in subtle ways that might make the migration to a multi-module architecture a bit harder than it should be.
Were we to truly embrace a modular architecture, we might go an extra mile after promoting each utility to its own module. After identifying utility modules we’d like to reuse (such as a function used to generate slugs like this-is-a-slug based on an arbitrary string that might have spaces, accents, punctuation, and symbols, besides alphanumeric characters), we could move the module to its own directory, along with documentation and tests, register any dependencies in package.json, and publish it to an npm registry. In doing so, we’d be honoring DRY across projects, and when we update the slugging package while working on our latest project, older projects would also benefit from new functionality and bug fixes.
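A slugging utility of that kind might look like the sketch below. The function name and the exact rules (which characters are kept, how separators collapse) are assumptions for illustration; a published package would likely handle more edge cases.

```javascript
// Hypothetical slugging utility, small enough to live in its own module.
function slugify(text) {
  return text
    .normalize('NFD')                 // split accented letters into base + mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining accent marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // collapse everything else into a dash
    .replace(/^-+|-+$/g, '');         // trim leading and trailing dashes
}
```

Because the function has no dependencies on the rest of a codebase, moving it into its own directory with tests and documentation would be mostly mechanical.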
This approach can be taken as far as we consider necessary: as long as we’d benefit from making a piece of functionality reusable across our projects, we can make it reusable, adding tests and documentation along the way. Note that hypermodularity offers diminishing returns: the more we take modularity to the extreme, the more time we’ll have to spend on documentation and testing. If we intend to release each line of code we develop as its own well-documented and well-tested package, we’ll be spending quite some time on tasks that are not directly related to developing features or fixing bugs. As always, use your own judgement to decide how far to take modular structures.
When a piece of code is not very complex and rather small, it’s usually not worth creating a module for. It might be better kept in a function on the module where it’s consumed, or inlined every time. Such short pieces of code tend to change and branch out, often necessitating slightly different implementations in different portions of our codebase. Given that the amount of code is so small, it’s hardly worth our time to figure out a way to generalize the snippet of code for all or even most use cases. Chances are we’d end up with something more complex than if we just inlined the functionality to begin with.
When a piece of code involves enough complexity to warrant its own module, that doesn’t immediately make it worthwhile to create a package for it. External modules often involve a little bit more maintenance work, in exchange for being reusable across codebases and offering a cleaner interface that’s properly documented. Take into consideration the amount of time you’ll have to spend on extricating the module and on writing documentation, and whether that’s worth the effort. Extricating the module will be challenging if it has dependencies on other parts of the codebase it belongs to, since those would have to be extricated as well. Writing documentation is typically not something we do for every module of a codebase, but we have to document modules when they’re their own package, since we can’t expect other potential consumers to effectively decide whether they’ll be using a package without having read exactly what it does or how to use it.
3.2.3 Trade-offs when Designing Internals
When we’re designing the internals of a module, it’s key to keep our priorities in order: the goal is to do what consumers of this module need. That goal has several aspects to it, so let’s visit them in order of importance.
First off, we need to design the right interface. A complicated interface will frustrate and drive off consumers, making our module irrelevant or, at best, a pain to work with. Having an elegant or fast implementation will be of little help if our reluctant consumers have trouble leveraging the interface in front of them. A programming interface is so much more than beautiful packaging making up for a mediocre present. For consumers, the interface should be all there is. Having a simple, concise, and intuitive interface will, in turn, drive down complexity in code written by consumers. Thus, the number one aspect to our goal is to find the best possible interface that caters to the needs and wants of its consumers.
Second, we need to develop something that works precisely as advertised and documented. An elegant and fast implementation that doesn’t do what it’s supposed to is no good to our consumers. Promising the right interface is great, but it needs to be backed up by an implementation that can deliver on the promises we make through the interface. Only then can consumers trust the code we write.
Third, the implementation should be as simple as possible. The simpler our code is, the easier it will be for us to introduce changes to it without having to rewrite the existing implementation. Note that simple doesn’t necessarily mean terse. For example, a simple implementation might indulge in long but descriptive variable names and a few comments explaining why code is written the way it is. Besides the ability to introduce changes, simple code is easier to follow when debugging errors, when new developers interact with the piece of software, or when the original implementors need to interact with it after a long period of not having to think about it. Implementation simplicity comes in third, only after a proper interface that works as expected.
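To make the difference between simple and terse concrete, here is a minimal sketch. The function names, the placedAt property, and the order-counting scenario are all hypothetical, made up for illustration. Both functions are meant to compute the same result.

```javascript
// Terse: correct, but the reader has to reverse-engineer the intent
// and guess what the 864e5 constant stands for.
const f = (o, t) => o.filter(x => t - x.placedAt < 864e5).length;

// Simple: longer, but each step is named and the constant is explained.
const MILLISECONDS_PER_DAY = 24 * 60 * 60 * 1000;

function countOrdersPlacedInLastDay(orders, now) {
  const recentOrders = orders.filter(order => {
    // Timestamps are assumed to be milliseconds since the epoch.
    const orderAgeInMilliseconds = now - order.placedAt;
    return orderAgeInMilliseconds < MILLISECONDS_PER_DAY;
  });
  return recentOrders.length;
}
```

The second version is several times longer, yet it is the one we’d rather come across while debugging, because nothing in it needs to be decoded before we can reason about it.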
Fourth, the internals should be as performant as possible. Granted, some measure of performance is codified in producing something that works well, as something that’s too slow to be considered reliable would be unacceptable to consumers. Beyond that, performance falls to fourth place in our list of desirable traits. Performance is a feature, to be treated as such, and we should favor simplicity and readability over speed. There are exceptions where performance is of the utmost importance, even at the cost of producing suboptimal interfaces and code that’s not all that easy to read, but in these cases we should at least strive to heavily comment the relevant pieces of code so that it’s abundantly clear why the code had to be written the way it was.
Flexibility, other than that afforded by writing simple code and providing an appropriate interface, has no place in satisfying the needs of our consumers. Trying to anticipate needs is more often than not going to result in more complexity, code, and time spent, with hardly anything to show for it in terms of improving the consumer’s experience.
3.3 Pruning a Module
Much like modern web development, module design is never truly done. In this section we’ll visit a few discussion topics that’ll get you thinking about the long half-life of components, and how we can design and build our components so that they don’t cause us much trouble after we’ve finished actively developing them.
3.3.1 Error Handling, Mitigation, Detection, and Solving
While working on software development we’ll invariably need to spend time analyzing the root cause of subtle bugs which seem impossible to hunt down. Only after spending valuable time will we figure out that the bug was caused by a small difference between the actual program state and what we had taken for granted, and that this small difference snowballed through our application’s logic flow into the serious issue we just had to hunt down.
We can’t prevent this from happening over and over, not entirely. Unexpected bugs will always find their way to the surface. Maybe we don’t control a piece of software which interacts with our own code in an unexpected way, and which works well until it doesn’t anymore because of a problem in the data. Maybe the problem is merely that a validation function isn’t working the way it’s supposed to, allowing some data to flow through the system in a shape that it shouldn’t have. By the time that data causes an error, we’ll spend quite some time until we figure out that the culprit is indeed a bug in our validation function, triggered by a malformed kind of input that was undertested. Since the bug is completely unrelated to the error’s stack trace information, we might spend a few hours hunting down and identifying the issue.
What we can do is mitigate the risk of bugs by writing more predictable code or improving test coverage. We can also become more proficient at debugging.
In the predictable code arena, we must be sure to handle every expected error. When it comes to error handling, we’ll typically bubble the error up the stack and handle it at the top, by logging it to an analytics tracker, to standard output, or to a database. When using a function call we know might throw, like JSON.parse on user input, we should wrap it with try/catch and handle the error, again bubbling it up to the consumer if our inability to proceed with the function logic is final. If we’re dealing with conventional callbacks that have an error argument, let’s handle the error in a guard clause. Whenever we have a promise chain, make sure to add a .catch reaction to the end of the chain that handles any errors occurring in the chain. In the case of async functions, we could use try/catch or, alternatively, we can also add a .catch reaction to the result of invoking the async function. While leveraging streams or other conventional event-based interfaces, make sure to bind an error event handler. Proper error handling should all but eliminate the chance of expected errors crippling our software. Simple code is predictable. Thus, following the suggestions in chapter 4 will aid us in reducing the odds of encountering unexpected errors as well.
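The error handling patterns described above could be sketched roughly as follows. Everything named here is an assumption made for illustration: reportError stands in for whatever logging destination we use, while loadText and fetchText are hypothetical functions supplied by the caller, the former taking an error-first callback and the latter returning a promise. Falling back to empty settings is just one possible way of handling a failure.

```javascript
// Hypothetical stand-in for an analytics tracker, standard output, or a database.
const reportedErrors = [];
function reportError(error) {
  reportedErrors.push(error.message);
}

// A call we know might throw (JSON.parse on user input), wrapped in try/catch.
function parseSettings(text) {
  try {
    return JSON.parse(text);
  } catch (error) {
    // We can't proceed without valid settings, so we bubble the error up
    // with a message that tells the consumer what went wrong.
    throw new Error(`Settings are not valid JSON: ${error.message}`);
  }
}

// Conventional callback with an error argument, handled in a guard clause.
function readSettings(loadText, done) {
  loadText((error, text) => {
    if (error) {
      done(error);
      return;
    }
    let settings;
    try {
      settings = parseSettings(text);
    } catch (parseError) {
      done(parseError);
      return;
    }
    done(null, settings);
  });
}

// Promise chain with a .catch reaction at the end of the chain.
function readSettingsFromPromise(fetchText) {
  return fetchText()
    .then(parseSettings)
    .catch(error => {
      reportError(error);
      return {}; // assumed fallback: empty settings
    });
}

// Async function using try/catch around the awaited call.
async function readSettingsAsync(fetchText) {
  try {
    return parseSettings(await fetchText());
  } catch (error) {
    reportError(error);
    return {}; // assumed fallback: empty settings
  }
}
```

Note how readSettings keeps the call to done outside of the try block, so that an exception thrown by the consumer’s callback isn’t mistaken for a parsing error and reported twice.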
Test coverage can help detect unexpected errors. If we have simple and predictable code, it’s harder for unexpected errors to seep through the seams. Tests can further bridge the gap by enlarging the corpus of expected errors. When we add tests, preventable errors are codified by test cases and fixtures. When tests are comprehensive enough, we might run into unexpected errors in testing and fix them. Once we’ve codified them in a test case, these errors can’t happen again (a test regression) without our test suite failing.
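As a rough illustration of codifying errors in test cases and fixtures, consider the sketch below. The isValidQuantity validator, its fixture list, and the findFailingCases helper are all hypothetical, and a real project would likely rely on a test runner instead of a hand-rolled helper. The regression cases represent malformed inputs that we imagine once slipped through an earlier version of the validator.

```javascript
// Hypothetical validator: a quantity must be a positive integer.
function isValidQuantity(value) {
  return Number.isInteger(value) && value > 0;
}

// Fixtures pairing inputs with the result we expect from the validator.
const quantityCases = [
  { input: 1, expected: true },
  { input: 0, expected: false },
  // Regression cases: inputs assumed to have caused bugs in the past.
  { input: 1.5, expected: false },
  { input: Infinity, expected: false },
  { input: '3', expected: false }
];

// Returns every case where the validator disagrees with the fixture.
function findFailingCases(validate, cases) {
  return cases.filter(({ input, expected }) => validate(input) !== expected);
}

const failures = findFailingCases(isValidQuantity, quantityCases);
```

If someone later simplified the validator to something like value > 0, the three regression cases would fail, flagging the reintroduced bug before it reaches consumers.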
Regardless of how determined we are to develop simple, predictable, and thoroughly tested programs, we’re still bound to run into bugs we hadn’t expected. Tests exist mostly to prevent regressions, preventing us from running once again into bugs we’ve already fixed; and to prevent expected mistakes, errors we think might arise if we were to tweak our code in incorrect ways. Tests can do little to prognosticate and prevent software bugs from happening, however.
This brings us to the inevitability of debugging. Using step-through debugging and inspecting application state as we step through the code leading to a bug is a useful tool, but it will not help us debug our code any faster than we can diagnose exactly what is going on.
In order to become truly effective debuggers, we must understand how the software we depend on works internally. If we don’t understand the internals of something, we’re effectively dealing with a black box where anything can happen from our perspective. This adventure is left as an exercise to the reader, who is better equipped to determine how to obtain a higher understanding of how their dependencies truly work. Reading the documentation might suffice, but that is rarely the case. Perhaps you should opt to download the source code from GitHub and give it a read. Maybe you’re more of a hands-on kind of person and prefer to try your hand at making your own knock-off of a library you depend on, in order to understand how it works. Regardless of the path you take, the next time you run into an unexpected error related to a dependency you’re more intimately familiar with, you’ll have less of a hard time identifying the root cause, since you’ll be aware of the limitations and common pitfalls of what was previously mostly a black box to you. Documentation can only take us so far in understanding how something works under the hood, which is what’s required when tracking down unexpected errors.
3.3.2 Documentation as an Art
It is true that, in the hard times of tracking down and fixing an unexpected error, documentation often plays a diminished role. Documentation is, however, often fundamental when trying to understand how a piece of code works, and its importance shouldn’t be underestimated. Public interface documentation underscores readable code: it gives consumers a guide to draw from for usage examples and advanced configuration options that may aid them when coming up with their own designs, and it gives implementers a reference of exactly what consumers are promised and, hence, ultimately expect.
In this section we’re talking about documentation in its broadest possible sense. We’ve discussed public interface documentation, but tests and code comments are also documentation in their own way. Even variable or function names should be considered a kind of documentation. Tests act as programmatic documentation for the kinds of inputs and outputs we expect from our public interfaces. In the case of integration tests, they describe the minimum acceptable behavior of our application, such as allowing users to log in providing an email and a password. Code comments serve as documentation for implementers to understand why code looks the way it does, areas of improvement, and often refer the reader to links that offer further details on a bug fix that might not look all that elegant at first sight. Descriptive variable names can, cumulatively, save the reader considerable time when explicit names like products are preferred over vague and ambiguous names like data. The same applies to function names, where we should prefer names like aggregateSessionsPerDay over something shorter but unclear.
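The following sketch shows how much naming alone can document. Only the name aggregateSessionsPerDay comes from the text above. Its body, the startedAt property, and the vaguely named calc counterpart are assumptions made up for the sake of comparison, and both functions are meant to behave identically.

```javascript
// Vague names force the reader to study the body to learn what this does.
function calc(data) {
  const res = {};
  for (const d of data) {
    const k = d.startedAt.slice(0, 10);
    res[k] = (res[k] || 0) + 1;
  }
  return res;
}

// The same logic, where names carry the explanation.
function aggregateSessionsPerDay(sessions) {
  const sessionCountByDay = {};
  for (const session of sessions) {
    // startedAt is assumed to be an ISO 8601 timestamp string, so its
    // first ten characters are the 'YYYY-MM-DD' date.
    const day = session.startedAt.slice(0, 10);
    sessionCountByDay[day] = (sessionCountByDay[day] || 0) + 1;
  }
  return sessionCountByDay;
}
```

A call site such as aggregateSessionsPerDay(sessions) tells the reader what goes in and what comes out without opening the function, whereas calc(data) tells them next to nothing.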
Getting into the habit of treating every bit of code and the structure around it (formal documentation, tests, comments) as documentation itself is only logical. Those who will be reading our code in the future — developers looking to further their understanding of how the code works, and implementers doing the same in order to extend or repair a portion of functionality — rely on our ability to convey a concise message on how the interface and its internals work.
Why would we not, then, strive to take advantage of every variable, property, and function name; every component name, every test case, and every bit of formal documentation, to explain precisely what our programs do, how they do it, and why we went for the trade-offs we took?
In this sense, we should consider documentation to be the art of taking every possible opportunity to clearly and deliberately express the intent and reasoning of all of the different aspects of our modules.
The above doesn’t mean to say we should flood consumers and implementers alike until they drown in a tumultuous stream of never-ending documentation. On the contrary, only by being deliberate in our messaging can we strike the right balance and describe the public interface in formal documentation, describe notable usage examples in our test cases, and explain abnormalities in comments.
Following a holistic approach to documentation, where we’re aware of who might be reading what and what should be directed to whom, should result in easy-to-follow prose that’s not ambiguous as to usage or best practices, nor fragmented, nor repetitive. Interface documentation should be limited to its usage, and is rarely the place to discuss design choices, which can be relayed to architecture or design documentation, and later linked in relevant places. Comments are great for explaining why, or linking to a bug fixed in their vicinity, but they aren’t usually the best place to discuss why an interface looks the way it does, and this is better left to architecture documentation or our issue tracker of choice. Dead code should definitely not be kept around in comment blocks, as it does nothing but confuse the reader, and is better kept in feature branches or git stashes, but off the trunk of source control.
Tom Preston-Werner wrote about the notion of README-driven development as a way of designing an interface by first describing it in terms of how it would get used. This is generally more effective than test-driven development (TDD), where we’ll often find ourselves rewriting the same bits of code over and over before we realize we wanted to produce a different API to begin with. The way README-driven design is supposed to work is self-descriptive: we begin by creating a README file and writing our interface’s documentation. We can start with the most common use cases, inputs and desired outputs, as described in section 2.1.2, and grow our interface from there. Doing this in a README file instead of a module leaves us an itsy bit more detached from an eventual implementation, but the essence is the same. The largest difference is that, much like TDD, we’d be committing to writing a README file over and over before we settle on a desirable API. Regardless, both API-first and README-driven design offer significant advantages over diving straight into an implementation.
3.3.3 Removing Code
Naturally, it’s easier to modify a module’s internal implementation than to change its public API, as the effects of doing so would be limited to the module’s internals. Internal changes that don’t affect the API are typically not observable from the outside. The exception to that rule would be when consumers monkey-patch our interface, sometimes becoming able to observe some of our internals. In this case, however, the consumer should be aware of how brittle monkey-patching a module they do not control is, and they did so assuming the risk of breakage.
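To illustrate how monkey-patching can expose internals, consider this minimal sketch. The createCounter module and the consumer’s patch are hypothetical. The internal detail being observed is that increment happens to delegate to the public add method.

```javascript
function createCounter() {
  let total = 0;
  const counter = {
    add(amount) {
      total += amount;
      return total;
    },
    increment() {
      // Internal detail: increment is implemented in terms of add.
      return counter.add(1);
    }
  };
  return counter;
}

// A consumer monkey-patches add in order to record every amount added.
const counter = createCounter();
const observedAmounts = [];
const originalAdd = counter.add;
counter.add = function patchedAdd(amount) {
  observedAmounts.push(amount);
  return originalAdd(amount);
};

counter.increment();
// The patch saw the internal call: observedAmounts now holds a single 1.
```

If we later rewrote increment to update total directly, the public API would behave exactly as before, yet the consumer’s patch would silently stop recording increments. That is the kind of breakage a consumer takes on when patching a module they don’t control.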
In section 3.1.2 we observed that the best code is no code at all, and this has implications when it comes to removing code as well. Code we never write is code we don’t need to worry about deleting. The less code there is, the less code we need to maintain, the fewer potential bugs we are yet to uncover, and the less code we need to read, test, and deliver over mobile networks to speed-hungry humans.
As portions of our programs become stale and unused, it is best to remove them entirely instead of postponing their inevitable fate. Any code we desire to keep around for reference or the possibility of reinstating it in the future can be safely preserved by source control software without the necessity of keeping it around in our codebase. Avoiding commented-out code and removing unused code as soon as possible will keep our codebase cleaner and easier to follow. When there’s dead code, a developer might be uncertain as to whether it is actually in use somewhere else, and reluctant to remove it. As time passes, the theory of broken windows comes into full effect and we’ll soon have a codebase that’s riddled with unused code, where nobody knows why that code is there or how the codebase came to be so unmanageable.
Reusability plays a role in code removal: as more components depend on a module, it becomes less likely that we’ll be able to trivially remove the heavily depended-on piece of code. When a module has no connections to other modules, it can be removed from the codebase, but might still serve a purpose as its own standalone package.
3.3.4 Applying Context
Software development advice is often written in absolute terms, rarely considering context. When you bend a rule to fit your situation, you’re not necessarily disagreeing with the advice; you might just have applied a different context to the same problem. The adviser may have missed that context, or they might have avoided it as it was inconvenient.
However convincing an eloquent piece of advice or tool might seem, always apply your own critical thinking and context first. What might work for large companies at incredible scale, under great load, and with their own unique set of problems, might not be suitable for your personal blogging project. What might seem like a sensible idea for a weekend hack, might not be the best use of a mid-size startup’s time.
Whenever you’re analyzing whether a dependency, tool, or piece of advice fits your needs, always start by reading what there is to be read and consider whether the problem being solved is one you indeed need to solve. Avoid falling into the trap of leveraging advice or tools merely because they became popular or are being hailed by a large actor.
Never overcommit to that which you’re not certain fits your needs, but always experiment. It is by keeping an open mind that we can capture new knowledge, improve our understanding of the world, and innovate. This is aided by critical thinking and hindered by rushing to the newest technology without firsthand experimentation. In any case, rules are meant to be bent, and broken.
Let’s move into the next chapter, where we’ll decipher the art of writing less complex functions.