The Statefulness Aspect of NodeJS
Ask NodeJS developers why they choose this environment, and people typically will tell you the following:
- NPM is the biggest package management system.
- It can handle more connections than other languages.
- It uses JavaScript, so front-end developers can also work on the back-end.
I was never sold on these points, which kept me skeptical for a long time, for the following reasons:
- Any modern language will have something similar, and - for me, at least - big numbers are never a selling point. I prefer having fewer modules, all of which are killer, over having a massive database that I have to sift through to find the good stuff.
- True, but this feature matters only in extremely niche situations. In reality, 99% of people don’t run hugely popular websites like Netflix, so you won’t take advantage of it in your everyday projects anyway.
- True, but also not. Going from working on animations and UX to the back-end is a completely different beast, and each requires a completely different mindset. In addition, companies will make the mistake of thinking that they can get two employees for the price of one.
In addition, you won't find a great deal of information or help on the nodejs.org site. Currently, all you get is an example of how “easy” it is to create your own web server, and that's pretty much it. There are no advanced examples and no use cases, which would be a great help in understanding the product.
My hope with this article is that I’ll give you a good foundation to better understand what NodeJS is, so you’ll be able to make an informed decision for your next project.
The above sentence is very mysterious for those who have never used NodeJS. You hear people saying that you can build the back-end of a website (the server side, in this case), desktop apps, and even mobile apps and command line interface (CLI) tools. So it seems that NodeJS can do all sorts of things, but how?
NodeJS is to JavaScript:
- what Cocoa is to Swift,
- what Unity 3D is to C#,
- what .NET is to C#.
Check the documentation, and you’ll see many useful built-in modules - such as File System, HTTP, OS, Net, and more - that will help you in your projects.
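To give a feel for those built-in modules, here is a minimal sketch that uses only `os` and `path`, both of which ship with NodeJS. The `describeMachine` helper and the sample file name are made up for this illustration.

```javascript
// Built-in modules are loaded with require(); there is nothing to install from NPM.
const os = require("os");
const path = require("path");

// describeMachine is a hypothetical helper, not part of NodeJS.
function describeMachine() {
  return {
    platform: os.platform(), // for example "linux", "darwin" or "win32"
    cpuCount: os.cpus().length,
    freeMemoryMb: Math.round(os.freemem() / 1024 / 1024),
  };
}

console.log(describeMachine());
console.log(path.extname("report.pdf")); // .pdf
```

The values printed by `describeMachine` depend on the machine running the script.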
The Networking Part
Sockets are the base of any Internet-ready environment, and NodeJS basically has them built in. This means that you can write any network app that you can dream of; NodeJS is the Web Server itself.
If you know the protocol standard of a particular service, you’ll be able to implement it. For example, NodeJS has the HTTP protocol built in. This allows your app to parse HTTP requests and turns NodeJS into a web server, making it a server and an environment at the same time. But NodeJS can also be much more.
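A minimal sketch of that idea, using the built-in `http` module. The `handleRequest` name and port 3000 are arbitrary choices for this example, and the `listen` call is left commented out so the snippet can be run without keeping a port open.

```javascript
const http = require("http");

// NodeJS parses the raw HTTP request and hands us a request object
// (method, url, headers) plus a response object to write the answer to.
function handleRequest(req, res) {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end(`You asked for ${req.url} using ${req.method}\n`);
}

const webServer = http.createServer(handleRequest);

// Uncomment to start accepting connections on an arbitrary port:
// webServer.listen(3000, () => console.log("Listening on http://localhost:3000"));
```

There is no separate web server program in front of this code; the script itself is the server.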
Other developers actually created some very interesting and fun projects that showcase what you can do if you implement a known protocol on top of the built-in sockets. For example, you can turn NodeJS into a:
- network printer (https://github.com/watson/ipp-printer)
- DNS server (https://www.npmjs.com/package/dnsd)
- POP3 server (https://github.com/akshut/pop3)
Boiling it down: You have access to TCP and UDP sockets, which means you can build anything network-related.
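As a rough sketch of building your own service on top of a raw TCP socket, the example below implements a tiny line-based protocol that I invented for this illustration (`PING` and `TIME` are not a real standard). It uses the built-in `net` module; port 7000 is arbitrary and the `listen` call is commented out.

```javascript
const net = require("net");

// A made-up protocol: the client sends one command per line, we answer with one line.
function handleLine(line) {
  const command = line.trim().toUpperCase();
  if (command === "PING") return "PONG";
  if (command === "TIME") return new Date().toISOString();
  return "ERR unknown command";
}

// Each TCP connection arrives as a socket, which is a duplex stream.
const tcpServer = net.createServer((socket) => {
  socket.setEncoding("utf8");
  let buffer = "";
  socket.on("data", (chunk) => {
    buffer += chunk;
    let newline;
    // TCP delivers a byte stream, so we cut complete lines out of the buffer ourselves.
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline);
      buffer = buffer.slice(newline + 1);
      socket.write(handleLine(line) + "\n");
    }
  });
});

// Uncomment to accept connections on an arbitrary port:
// tcpServer.listen(7000);
```

Real protocols such as DNS or POP3 are far more involved, but they follow the same pattern: read bytes from the socket, interpret them according to the standard, and write a reply.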
Not just networking
NodeJS also offers access to the file system of the machine it's running on. This means that you don’t have to make a network-enabled app. You can use NodeJS to parse, copy, move, and delete files. As with other languages on the market, you can make a CLI app for the terminal world.
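Here is a small sketch of those file operations using the built-in `fs` module. The file names are invented, and everything happens inside a throwaway temporary directory so the example cannot touch real files.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Work inside a temporary directory created just for this demo.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "node-fs-demo-"));
const original = path.join(dir, "notes.txt");
const copied = path.join(dir, "notes-copy.txt");
const moved = path.join(dir, "notes-moved.txt");

fs.writeFileSync(original, "first line\nsecond line\n");
fs.copyFileSync(original, copied); // copy
fs.renameSync(copied, moved);      // move
fs.unlinkSync(original);           // delete

// Parse what is left: count the non-empty lines of the moved file.
const lineCount = fs.readFileSync(moved, "utf8").split("\n").filter(Boolean).length;
console.log(`${path.basename(moved)} has ${lineCount} lines`); // notes-moved.txt has 2 lines

fs.rmSync(dir, { recursive: true }); // clean up the demo directory
```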
The Event Loop
The Main Part: Statefulness
So far, this article has given you a frame of reference for understanding NodeJS. In this next part I’m going to talk about what is, in my mind, the best feature of NodeJS: its stateful environment.
Let's start. A computer program can work in one of two types of environment, stateful or stateless, and the differences are as follows:
Stateful: The program can keep its state in memory for as long as it runs and the system has power, since RAM is volatile (unlike a hard drive, it can’t retain its contents without power).
Stateless: The code has access to memory only for the lifetime of the script, and every time your code finishes executing, you lose what you had in memory. That is why you'll use a database to save the progress of your code so you can get back to it once you run the script again.
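The difference is easiest to see with a counter. In this minimal sketch, `handleVisit` stands in for a request handler (the name is mine); in a stateful process the variable outside the function survives from one call to the next, while in a stateless setup the script would start again from zero on every request.

```javascript
// Module-level state: created once when the process starts and kept in RAM.
let visits = 0;

// Stands in for the function the server calls on every request.
function handleVisit() {
  visits += 1;
  return `You are visitor number ${visits}`;
}

console.log(handleVisit()); // You are visitor number 1
console.log(handleVisit()); // You are visitor number 2
```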
History: How It All Started
I’m not a historian, but this is more or less how back-end web development began. PHP started as a side project to make it easier to build and maintain a web page. Before PHP you would write a web site using CGI, which was an interface to the scripting or compiled languages of your choosing on the system. It all worked by accessing a URL: the server would execute the corresponding code, display the result and die. There was no state.
Later, people started using databases to store data for later use, and for years this was how things were done.
I think this is the reason why developers don’t associate statefulness with web servers. When people think of a server, they still recall the very beginnings of the web.
Even though NodeJS is stateful, it seems that hardly anybody uses this feature, because of this history. There are some NPM packages that use this feature, but they hide it behind the word “magic”, and a developer new to the environment won’t learn about it.
My point in writing this is to raise awareness, and to prove that if used in the right way, it's a powerful aspect that makes writing code more efficient and the code itself faster, since you get direct access to RAM.
Of course people know RAM is fast
Just one more thing to make sure that we are on the same page. Of course people know that RAM is fast, but not having the ability to store data there is a huge disadvantage. That’s why databases like Memcached, Redis and more were built. But NodeJS offers a third option: direct access.
What can we do with statefulness?
Since statefulness allows you to use the RAM in your server, you can use this feature to:
- Store temporary data that you don’t care to keep around
- Prototype a database structure without the database
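For the prototyping case, a plain `Map` can play the role of a table. This is only a sketch: the `articles` "table" and the `insertArticle` and `findByTag` helpers are names I made up, and everything disappears when the process stops.

```javascript
// A "table" is just a Map kept in memory; it is gone when the process stops.
const articles = new Map();
let nextId = 1;

function insertArticle(fields) {
  const article = { id: nextId++, ...fields };
  articles.set(article.id, article);
  return article;
}

function findByTag(tag) {
  return [...articles.values()].filter((article) => article.tags.includes(tag));
}

insertArticle({ title: "Hello NodeJS", tags: ["node", "intro"] });
insertArticle({ title: "Sockets 101", tags: ["node", "network"] });

console.log(findByTag("network").map((article) => article.title)); // [ 'Sockets 101' ]
```

Once the shape of the data settles, the two helpers can be swapped for real database queries without touching the code that calls them.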
Dry theory is nothing compared to a practical example. So, let's wander off in our minds, and picture ourselves working on a blog. This blog displays a list of 100 articles on one page. A bit crazy, but that’s the point.
Normally, our code would do a pretty big query each time someone visits the home page. Not only do you have to get the full post, but you also need the relevant tags, time, author, etc. Our database will have some processing to do.
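In a stateful environment, you could instead run that query once, when the site starts, and keep the result in memory. You would have only one query per restart, and after that you could even kill your database, because the site would still keep displaying all your posts.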
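The blog scenario can be sketched like this. `fakeDatabase` is a stand-in I invented so the example runs on its own; its `queryCount` field only exists to show how often the "database" is actually hit.

```javascript
// fakeDatabase stands in for a real database client (an assumption for this sketch).
const fakeDatabase = {
  queryCount: 0,
  loadPosts() {
    this.queryCount += 1;
    return Array.from({ length: 100 }, (_, i) => ({
      id: i + 1,
      title: `Post ${i + 1}`,
      author: "me",
      tags: ["blog"],
    }));
  },
};

// Runs once when the process starts; the result stays in RAM afterwards.
const posts = fakeDatabase.loadPosts();

// Called on every visit to the home page: no database involved.
function renderHomePage() {
  return posts.map((post) => post.title).join("\n");
}

// Simulate 1000 visitors.
for (let visit = 0; visit < 1000; visit++) renderHomePage();

console.log(fakeDatabase.queryCount); // 1
```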
The caveat, as you can see, is that you need an interface for your data. If you were to edit a blog post straight from the database, you would need to restart the whole site to reload the new information.
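One possible shape for such an interface is a single function that every edit goes through, so the database and the copy in RAM never drift apart. In this sketch both `database` and `cache` are plain Maps standing in for the real thing, and `updatePost` is a hypothetical name.

```javascript
// Hypothetical stand-ins: `database` plays the real database, `cache` is the copy in RAM.
const database = new Map([[1, { id: 1, title: "Old title" }]]);
const cache = new Map(database);

// The single entry point for edits: write to the database first,
// then refresh the in-memory copy so no restart is needed.
function updatePost(id, changes) {
  const updated = { ...database.get(id), ...changes };
  database.set(id, updated);
  cache.set(id, updated);
  return updated;
}

updatePost(1, { title: "New title" });
console.log(cache.get(1).title); // New title
```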
Are databases dead?
Of course not! The idea of databases is to store data, and later access it in a reliable way. One important aspect of a database is handling situations where two actions are performed on the same piece of data at the same time. For example:
- One action is trying to delete the data
- The next action is trying to read the data from that variable
The RAM way of storing data is bare-bones: what you write is what you get, so there is nothing to shield you from situations like this. But it is perfectly fine if you just read data 99.99 percent of the time.
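A tiny sketch of what "nothing to shield you" means in practice. The `store` Map and both helper functions are invented for this example; the point is that the read after the delete fails silently unless you write the guard yourself.

```javascript
const store = new Map([["article:1", { title: "Hello" }]]);

// One action deletes the data...
function deleteArticle(key) {
  store.delete(key);
}

// ...and the next one reads it. Nothing stops us or warns us.
function readTitle(key) {
  const article = store.get(key);         // undefined if it was already deleted
  return article ? article.title : null;  // a guard we have to write ourselves
}

deleteArticle("article:1");
console.log(readTitle("article:1")); // null
```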
If the data is static, why bother keeping it in the database?
Because you'll want to be able to edit it easily. Maybe you'll need to change prices or edit the product description from time to time.
To learn NodeJS, I decided to build a full product from start to finish, called https://simpe.li. At first, I was developing the site as if there were no state. But in the middle of development, I realized that my variables would stay in memory if I declared them outside of the routing function.
When that happened, not only did I have a eureka moment, but I completely changed my approach to storing data. The structure of the site is as follows:
- Server for the main page, with the dashboard
- Server for the public API calls
- Server for the PDF creation
- Server for the database, which is exposed as an API
- Another server for miscellaneous stuff
Let's break down how I use memory on all of these servers:
- The home page has three places where the data is dynamic: the list of templates, the specific template itself, and the price page. When you visit these places, you get the data from an array of objects that is loaded in memory. This data is not unique to any user, of course. The only place where I always query the database is after you log in, since that data is unique to the user, and working out a way to keep only the most frequent users in memory would take too much time at this point. But I believe that this will definitely be possible in the future.
- The API itself doesn’t store anything; it just checks the data prior to saving it.
- The code that makes the PDF uses RAM for storing all the templates. This improves the conversion process, since I’m not reading the template from the hard drive for each conversion. When I start the app, the code first loads all the templates into an array, and once that part is done, it starts the main component.
- This part is the most fun: the server for the database. Since I designed it as an API, all of the apps that want access to the database have no idea how the data is stored. But I know. ;) I store data in three different ways:
- Direct interaction with the database, for unique data, such as user-related data.
- Only in RAM, for data that I don’t care about. A good example would be all my telemetry data for checking the efficiency of the servers: the amount of memory they are using, how many PDFs are being made, and any other type of useful information. This data will live in the server until the next reset. I don’t mind losing it. This way, collecting this information has a minimal impact on server performance.
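The RAM-only telemetry idea could look roughly like the sketch below. It is not the code running on my servers; the `telemetry` object and the `recordPdfCreated` and `snapshot` helpers are names chosen for this illustration, and `process.memoryUsage()` is the built-in NodeJS call for reading memory figures.

```javascript
// Counters live only in this object; a restart wipes them, and that is fine.
const telemetry = { pdfsCreated: 0, startedAt: Date.now() };

// Called by the (hypothetical) PDF code after each conversion.
function recordPdfCreated() {
  telemetry.pdfsCreated += 1;
}

// Builds a report on demand from what is currently in memory.
function snapshot() {
  return {
    pdfsCreated: telemetry.pdfsCreated,
    uptimeSeconds: Math.round((Date.now() - telemetry.startedAt) / 1000),
    heapUsedMb: Math.round(process.memoryUsage().heapUsed / 1024 / 1024),
  };
}

recordPdfCreated();
recordPdfCreated();
console.log(snapshot());
```

Updating a counter is a single in-memory increment, which is why collecting this kind of data costs next to nothing compared with writing a row to a database for every event.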
To Sum It All Up
My hope with this article is that I’ll make you consider this cool aspect of NodeJS in your existing or future project. Using RAM not only increases the speed of a website; it also lowers the cost of a simple site with lots of queries. Imagine you actually have a blog, with your articles, and you use a database that you constantly read from. The more users you get, the more connections you’ll have, and suddenly you’ll have to buy a bigger account for your database to support more connections. If you use RAM to serve your content, more page views won’t equal more connections to your database.
I have a Favor to Ask
I would love to have a list of all the Web Server back-end solutions that are actually stateful. A list like this for sure would help others discover other languages with this feature.
- Python. Thank you to Dmitry Sadovnychyi for the contribution.
If you've enjoyed this article/project, please consider giving it a
Also check out my GitHub account, where I have other articles and apps that you might find interesting.
If you'd like me to help you, I'm available for hire. Contact me at firstname.lastname@example.org.