feat: you don't know performance (#32)
Eomm committed Mar 21, 2023
1 parent 8d1f3ec commit be26f0a
Showing 2 changed files with 234 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -32,6 +32,7 @@ To see the last version of the application, it is published at [https://fastify-
1. [BONUS: What is AST?](https://backend.cafe/what-is-ast) - [📝](./posts/bonus-ast.md)
1. [BONUS: How to build a Pokedex with Platformatic](https://backend.cafe/how-to-build-a-pokedex-with-platformatic) - [📝](./posts/bonus-pokedex.md)
1. [BONUS: Dynamic GraphQL queries with Mercurius](https://backend.cafe/dynamic-graphql-queries-with-mercurius) - [📝](./posts/bonus-mercurius-dynamic.md)
1. [BONUS: The Secrets Behind High Performance](https://backend.cafe/the-secrets-behind-high-performance-with-node-js) - [📝](./posts/bonus-performance-secrets.md)


> Yeah, I know, the `BONUS` articles are more than the `fastify-discord-app-demo` articles
233 changes: 233 additions & 0 deletions posts/bonus-performance-secrets.md
@@ -0,0 +1,233 @@
# The Secrets Behind High Performance With Node.js

I always strive to write actionable articles, but let’s face it: improving code performance can be a bit intimidating.
Not today, my friends!
In this post, I’ll share some lesser-known but highly effective strategies to speed up your codebase — starting tomorrow.
Get ready to learn some serious performance hacks!

One starry evening, I found myself comparing what I used to consider a performant solution with what I actually see
in the Node.js landscape, and I think it's worth sharing.

I used to think that the fastest code was achieved by using:

- Esoteric data structures
- Unknown algorithms
- Bitwise operations
- A master's degree in rocket science

...and I was wrong!

Let me share some of the secrets behind high performance with Node.js.


## Regular expressions are slow

The title says it all: regular expressions are slow. If you see a regular expression in your code,
that code can probably be optimized by removing it.

The mistake I was making was thinking that regular expressions have a complexity of `O(1)`, but they don't.
_I know, I know, I was silly to think that._
The complexity of a regular expression depends on the engine implementation.
In Node.js, the `RegExp` engine is provided by V8 v10.2.x, which implements a
backtracking algorithm with a worst-case complexity of <code>O(2<sup>n</sup>)</code>!

In fact, if you try to run this simple code:

```js
/(a*)*b/.exec('a'.repeat(100))
```

It will take 90 seconds.
If you run the same code with `node --enable-experimental-regexp-engine-on-excessive-backtracks regexp.js` instead,
it will take only **0.01 seconds**!

The V8 flag `--enable-experimental-regexp-engine-on-excessive-backtracks` enables a new experimental non-backtracking RegExp engine.
You can dig deeper into this new algorithm by reading this [V8 article](https://v8.dev/blog/non-backtracking-regexp).

So, the first takeaway is that regular expressions are slow, and you should avoid them as much as possible.
If you can't avoid them, check the output of `node --v8-options` to see whether you can tune the
default RegExp engine.

If you are interested in the topic then you can find out more by reading [this article](https://swtch.com/~rsc/regexp/regexp1.html). It explains the complexity of regular expressions and compares the different algorithms.
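
To observe the growth without waiting 90 seconds, a rough sketch like the following times the same pattern on much shorter inputs. The `timeRegexp` helper and the input sizes are my own assumptions for illustration; the timings depend on your machine and Node.js version, so take them as a trend and not as a benchmark.

```js
// Hypothetical helper: time the problematic pattern on a string of `n` "a" characters.
function timeRegexp (n) {
  const input = 'a'.repeat(n)
  const start = process.hrtime.bigint()
  const match = /(a*)*b/.exec(input) // never matches: the input contains no "b"
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6
  return { n, match, elapsedMs }
}

// Keep the sizes small: with a backtracking engine,
// every extra character can roughly double the work.
const timings = [12, 14, 16, 18, 20].map(timeRegexp)

for (const { n, elapsedMs } of timings) {
  console.log(`n=${n} took ${elapsedMs.toFixed(2)} ms`)
}
```

Running the same file with and without the V8 flag mentioned above should make the difference between the two engines visible as `n` grows.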


## You don't need regular expressions

Sorry to be so repetitive, but I must talk about regular expressions again.
Sometimes a regular expression seems to be the best solution, but it's not.

Let's take a look at the following code:

```js
const utf8MimeTypeRE = /^text\/|^application\/(javascript|json)/
const isUtf8MimeType = utf8MimeTypeRE.test('application/json')
```

The code above is a simple check that verifies whether a MIME type is UTF-8 text by matching it against multiple MIME type patterns.

The code is simple, but it's not performant. It can be optimized by removing the regular expression:

```js
function isUtf8MimeType (value) {
  const len = value.length
  // Cheap length checks first, then prefix checks from the longest prefix to the shortest
  return (
    (len > 21 && value.indexOf('application/javascript') === 0) ||
    (len > 14 && value.indexOf('application/json') === 0) ||
    (len > 5 && value.indexOf('text/') === 0)
  )
}

// A different name avoids redeclaring the function identifier
const isUtf8 = isUtf8MimeType('application/json')
```

The latter code is faster than the former because it doesn't use a regular expression.
What does "faster" mean?
The first version runs at 10,632,205 ops/sec ±1.69% (191 runs sampled), while
the second runs at 1,132,847,245 ops/sec ±0.15% (192 runs sampled).
The second version is about 106 times faster than the first.

This example is not contrived: it is one of [Fastify's pieces of code](https://github.com/fastify/send/pull/39)!
[Uzlopak](https://github.com/Uzlopak) is one of the Fastify team members obsessed with performance.
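
If you want to compare the two approaches on your own machine, a minimal sketch could look like the one below. The `opsPerSecond` helper and the function names are hypothetical, and a plain loop like this has no warm-up or statistics, so it is not expected to reproduce the numbers quoted above; a dedicated benchmark library gives more reliable results.

```js
const utf8MimeTypeRE = /^text\/|^application\/(javascript|json)/

function isUtf8MimeTypeRegExp (value) {
  return utf8MimeTypeRE.test(value)
}

function isUtf8MimeTypeIndexOf (value) {
  const len = value.length
  return (
    (len > 21 && value.indexOf('application/javascript') === 0) ||
    (len > 14 && value.indexOf('application/json') === 0) ||
    (len > 5 && value.indexOf('text/') === 0)
  )
}

// Hypothetical micro-benchmark helper: a naive loop that estimates operations per second.
function opsPerSecond (fn, input, iterations = 1e6) {
  let hits = 0
  const start = process.hrtime.bigint()
  for (let i = 0; i < iterations; i++) {
    if (fn(input)) hits++ // use the result so the call is less likely to be dropped
  }
  const elapsedNs = Number(process.hrtime.bigint() - start)
  return { hits, opsPerSec: Math.round(iterations / (elapsedNs / 1e9)) }
}

// First make sure both versions agree, then measure them.
for (const sample of ['application/json', 'text/html', 'image/png']) {
  console.log(sample, isUtf8MimeTypeRegExp(sample), isUtf8MimeTypeIndexOf(sample))
}
console.log('RegExp :', opsPerSecond(isUtf8MimeTypeRegExp, 'application/json').opsPerSec)
console.log('indexOf:', opsPerSecond(isUtf8MimeTypeIndexOf, 'application/json').opsPerSec)
```

Checking that both versions return the same answers before timing them is the important part: a faster function that behaves differently is not an optimization.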


## Silly code is not so silly

There may be times when you think the source code is silly, but it's not.
Let's normalize an unknown input into a string array. If we can't normalize the input, we throw an error:

```js
function normalizeList (val) {
  if (typeof val === 'string') {
    return [val]
  } else if (val === false) {
    return []
  } else if (Array.isArray(val)) {
    for (let i = 0, il = val.length; i < il; ++i) {
      if (typeof val[i] !== 'string') {
        throw new TypeError('must be array of strings or false')
      }
    }
    return val
  } else {
    throw new TypeError('must be array of strings or false')
  }
}
```

I'm sure this code works, but it may not pass a coding interview: it's too verbose, and there is a smell of duplication... but [it is performant](https://github.com/fastify/send/pull/38)!
The code above is 4 times faster than the nice and clean code below:

```js
function normalizeList (val) {
  const list = [].concat(val || [])
  for (let i = 0; i < list.length; i++) {
    if (typeof list[i] !== 'string') {
      throw new TypeError('must be array of strings or false')
    }
  }
  return list
}
```

So it is important to understand that sometimes a few simple checks are faster than a single line of code.
Let's call it the `exit early` pattern.
This doesn't mean that you should write verbose code, but you should be aware of how to improve performance:

- Conditions that are more likely to be true should be checked first
- Simple checks are faster than forcing a unique code path
- A bit of duplication is better than a bit less performance

Another interesting example is mapping all the supported ASCII characters in a `Map` object.
In Fastify [we did this too](https://github.com/fastify/fast-uri/pull/7/files), all in the name of performance.
This new code helps us reduce complexity by treating uppercase and lowercase characters as the same.
This code would certainly not pass a coding interview either!
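
To give an idea of what such a lookup table can look like, here is a minimal sketch. It is not the code from the linked pull request: the `HEX_VALUES` table and the `hexPairToByte` helper are hypothetical names, and hex digits are just one assumed use case where uppercase and lowercase characters mean the same thing.

```js
// Hypothetical lookup table: every hex digit, in both cases, mapped to its numeric value.
const HEX_VALUES = new Map()
for (let i = 0; i < 16; i++) {
  const digit = i.toString(16)
  HEX_VALUES.set(digit, i)
  HEX_VALUES.set(digit.toUpperCase(), i) // 'a' and 'A' share the same value
}

// Decode two hex characters into a byte, or return -1 when either one is invalid.
function hexPairToByte (high, low) {
  const h = HEX_VALUES.get(high)
  const l = HEX_VALUES.get(low)
  if (h === undefined || l === undefined) return -1
  return h * 16 + l
}

console.log(hexPairToByte('2', 'f')) // prints 47
console.log(hexPairToByte('2', 'F')) // prints 47
console.log(hexPairToByte('2', 'z')) // prints -1
```

The table is verbose and built up front, but each lookup is a single `get` call with no case conversion on the hot path.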

### Performance shortcuts

Earlier, I mentioned that you should write readable code and prefer simple checks. However, it's worth mentioning that
sometimes you may isolate a piece of code that is not readable but is faster.

A perfect example is the `new Function` usage. By writing:

```js
const functionArgs = ['value']
const functionBody = 'return value[0].toUpperCase()'
const fn = new Function(...functionArgs, functionBody)

console.log(fn('Hello')) // prints H
```

You can create a function body at runtime by concatenating strings.

This technique is the secret sauce of Fastify's [`fast-json-stringify`](https://github.com/fastify/fast-json-stringify) module.
Another extreme example is replacing the `String.prototype.startsWith` and `String.prototype.endsWith` methods.
If we want to check that a string starts with the `foo` string, we can generate a function body like the following:


```js
const functionArgs = ['value']
const functionBody = `return value[0] === 'f' && value[1] === 'o' && value[2] === 'o'`
const fn = new Function(...functionArgs, functionBody)

console.log(fn('foo')) // prints true
console.log(fn('bar')) // prints false
```

The generated function is **14x** faster than the native method!
This technique is [likely to be integrated](https://github.com/fastify/help/issues/711) into Fastify's code.

⚠️ Note that generating a function body at runtime can be unsafe if user input is used to generate the function.
Be careful when you use this technique, and benchmark it to make sure it actually improves performance!
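
As a sketch of how the function body could be generated for any prefix, consider the hypothetical `buildStartsWith` helper below. It is my own illustration and not Fastify's implementation. Each character is embedded with `JSON.stringify`, which is meant to keep it inside a string literal, but the prefix should still come from trusted code and not from user input.

```js
// Hypothetical generator: builds a specialized "starts with `prefix`" function.
function buildStartsWith (prefix) {
  if (typeof prefix !== 'string' || prefix.length === 0) {
    throw new TypeError('prefix must be a non-empty string')
  }

  const checks = []
  for (let i = 0; i < prefix.length; i++) {
    // JSON.stringify escapes quotes and backslashes,
    // so each character is emitted as a valid string literal.
    checks.push(`value[${i}] === ${JSON.stringify(prefix[i])}`)
  }

  // For 'foo' the body is: return value[0] === "f" && value[1] === "o" && value[2] === "o"
  return new Function('value', `return ${checks.join(' && ')}`)
}

const startsWithFoo = buildStartsWith('foo')
console.log(startsWithFoo('foobar')) // prints true
console.log(startsWithFoo('bar')) // prints false
```

The function is generated once and reused for every check, so the cost of building and compiling the string is paid only at startup.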

## Know your language and runtime

Node.js is a JavaScript runtime, but it has its own peculiarities that you should know in order to be able to write performant code.
If you deal with the `Error` object, or you have ever implemented a custom error, you should know about [`Error.stackTraceLimit`](https://nodejs.org/api/errors.html#errorstacktracelimit).

The `Error.stackTraceLimit` property sets the maximum number of stack frames collected in the `Error.stack` property.
The default value is `10`, but you can set it to `0` when you don't need the stack trace, and this will improve the performance of error creation by 600x!
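
A minimal sketch of how the limit could be lowered only around a specific piece of code is shown below. The `withStackTraceLimit` helper is hypothetical; `Error.stackTraceLimit` itself is the documented property. Restoring the previous value matters because the setting is global and affects every error created while it is in place.

```js
// Hypothetical helper: run `fn` with a temporary Error.stackTraceLimit,
// then restore the previous value even if `fn` throws.
function withStackTraceLimit (limit, fn) {
  const previous = Error.stackTraceLimit
  Error.stackTraceLimit = limit
  try {
    return fn()
  } finally {
    Error.stackTraceLimit = previous
  }
}

const cheapError = withStackTraceLimit(0, () => new Error('created without frames'))
const fullError = withStackTraceLimit(10, () => new Error('created with frames'))

console.log(cheapError.stack) // only the error name and message, no "at ..." lines
console.log(fullError.stack) // the message followed by up to 10 "at ..." lines
```

The frames are captured when the error is created, so the limit that counts is the one in effect inside the callback.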

This is an example of one of the lesser-known properties of Node.js. But it is not always about tweaking the runtime:
sometimes you need to know the best constructs to use in your code.
[RafaelGSS](https://github.com/RafaelGSS) is a Node.js core contributor, and he has
written a [benchmark](https://github.com/RafaelGSS/nodejs-bench-operations) to compare the performance of
the different operations in JavaScript.
For example:

- Concatenating strings using `['a', 'b', 'c'].join('')` is 100x slower than `a + b + c`!
- The `forEach` loop is ~5% slower than the `for` loop!

The takeaway is to write readable and maintainable code, then optimize it by isolating the hot path
and benchmarking it. The last step is to optimize the hot path, focusing on tiny pieces of code.


## Use the right data structure

The data structure you use in your code can greatly impact your application's performance.

> There is no slow programming language, just bad algorithms and data structures design.

This is what [Michele Riva](https://github.com/micheleriva) said in his talk at [NodeConf EU 2022](https://www.youtube.com/watch?v=42sMkbGLlh4).
That is incredibly true, and you should always consider the data structure you use in your code.

If you are looping an array repeatedly, it is essential to ask yourself if you can use a `Map` or a `Set` instead.
A tree data structure is sometimes the best solution to run performant searches and traversals.
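
As a small illustration of swapping a repeated array scan for a `Set` lookup, here is a sketch with made-up data. The `allowedIds` and `requests` arrays are hypothetical. With four items the difference is irrelevant; it starts to matter when the list is large or the lookup sits on a hot path.

```js
// Hypothetical data: the ids allowed to access a resource and the incoming requests.
const allowedIds = ['u1', 'u7', 'u42', 'u99']
const requests = ['u7', 'u3', 'u42', 'u7']

// Array lookup: every includes() call scans the array from the start.
const allowedWithArray = requests.filter((id) => allowedIds.includes(id))

// Set lookup: build the Set once, then has() typically runs in constant time.
const allowedSet = new Set(allowedIds)
const allowedWithSet = requests.filter((id) => allowedSet.has(id))

console.log(allowedWithArray) // prints [ 'u7', 'u42', 'u7' ]
console.log(allowedWithSet) // prints [ 'u7', 'u42', 'u7' ]
```

Both versions return the same result, so the change is safe to benchmark in isolation.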

I can't suggest a simple rule for you to follow, but you should always think about the data structure you are using
and the algorithm that uses it.
I find it useful to check the [Big O notation cheat sheet](https://www.bigocheatsheet.com/) of the data structure
and to read some books about these topics.


## Summary

In this article, we have seen how to improve the performance of a Node.js application by using and writing
performant code and avoiding unnecessary operations.

Now, think about the application you are working on and try to speed it up by using the techniques we have highlighted in this article.

If you enjoyed this article, comment, share and follow me on [Twitter](https://twitter.com/ManuEomm)!
Remember to follow [Uzlopak](https://github.com/Uzlopak) on GitHub to see his amazing work on Fastify!
