Activity Feed, Timeline, News Feed, Notification Feed with MongoDB, Node and CRDTs
Latest commit 29a90d2 Aug 20, 2018


Activity Feed Node & Mongo DB

A simple example of how to build a news feed with Node and MongoDB. I created it for the blog post "Scalable News Feeds - MongoDB vs Stream". A presentation about the architecture is on SlideShare.

It uses CRDTs to reduce the need for locks.
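This README doesn't say which CRDT the library uses, so the following is only a sketch of the general idea, not the library's implementation. It shows a last-write-wins element set: every add and remove carries a timestamp, and replicas merge by keeping the newest timestamp per element. Merging is commutative and idempotent, which is why concurrent writers don't need a lock. The `LWWSet` class and its methods are hypothetical names.

```javascript
// Illustrative last-write-wins (LWW) element set; not the library's actual code.
class LWWSet {
	constructor() {
		this.adds = new Map() // element -> latest add timestamp
		this.removes = new Map() // element -> latest remove timestamp
	}

	add(element, time) {
		this.adds.set(element, Math.max(time, this.adds.get(element) ?? -Infinity))
	}

	remove(element, time) {
		this.removes.set(element, Math.max(time, this.removes.get(element) ?? -Infinity))
	}

	// An element is present if its latest add is newer than its latest remove.
	has(element) {
		if (!this.adds.has(element)) return false
		const removedAt = this.removes.get(element) ?? -Infinity
		return this.adds.get(element) > removedAt
	}

	// Merging takes the per-element maximum, so the order of merges does not matter.
	merge(other) {
		for (const [element, time] of other.adds) this.add(element, time)
		for (const [element, time] of other.removes) this.remove(element, time)
		return this
	}

	values() {
		return [...this.adds.keys()].filter(element => this.has(element)).sort()
	}
}

// Two replicas apply operations independently, then merge in either order.
const a = new LWWSet()
const b = new LWWSet()
a.add('like:post:123', 1)
b.remove('like:post:123', 2)
b.add('watch:video:123', 3)

const ab = new LWWSet().merge(a).merge(b)
const ba = new LWWSet().merge(b).merge(a)
```

Both `ab` and `ba` end up with the same contents, regardless of merge order.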

Install

yarn add mongodb-activity-feed

MongoDB & Redis Activity Feed

brew install redis mongodb
brew services start redis
brew services start mongodb

Initialization

Here's a very short example:

import { FeedManager } from 'mongodb-activity-feed'
const fm = new FeedManager(mongoConnection, redisConnection, {
	bull: false,
	firehose: false,
})

And a bit longer one:

import { FeedManager, FayeFirehose } from 'mongodb-activity-feed'
import Redis from 'ioredis'
import mongoose from 'mongoose'

const redis = new Redis('redis://localhost:6379/9')
const mongo = mongoose.connect(
	'mongodb://localhost:27017/mydb',
	{
		autoIndex: true,
		reconnectTries: Number.MAX_VALUE,
		reconnectInterval: 500,
		poolSize: 50,
		bufferMaxEntries: 0,
		keepAlive: 120,
	},
)
const fayeFirehose = new FayeFirehose('http://localhost:8000/faye')

const fm = new FeedManager(mongo, redis, { bull: true, firehose: fayeFirehose })

The bull option determines whether activity fanout runs over a bull queue or synchronously. The firehose option lets you listen to feed changes in realtime using Faye.
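To make the difference between the two fanout modes concrete, here's a toy in-memory sketch. It is not how the library is implemented: the `feeds` and `followers` objects, the array-based `queue`, and the function names are all hypothetical stand-ins for MongoDB and a Redis-backed bull queue.

```javascript
// Toy model: feeds are arrays, followers maps a source feed to the feeds following it.
const feeds = { 'user:nick': [], 'timeline:scott': [], 'timeline:ben': [] }
const followers = { 'user:nick': ['timeline:scott', 'timeline:ben'] }

// Synchronous fanout: the caller waits until every follower feed has the activity.
function fanoutSync(activity, source) {
	feeds[source].push(activity)
	for (const target of followers[source] || []) feeds[target].push(activity)
}

// Queued fanout: the caller only writes to the source feed and enqueues a job.
const queue = []
function fanoutQueued(activity, source) {
	feeds[source].push(activity)
	queue.push({ activity, source })
}

// A worker drains the queue later, outside the request path.
function runWorker() {
	while (queue.length > 0) {
		const { activity, source } = queue.shift()
		for (const target of followers[source] || []) feeds[target].push(activity)
	}
}

fanoutSync({ actor: 'user:nick', verb: 'watch', object: 'video:123' }, 'user:nick')
fanoutQueued({ actor: 'user:nick', verb: 'like', object: 'post:123' }, 'user:nick')
const beforeWorker = feeds['timeline:scott'].length // 1: the queued activity has not arrived yet
runWorker()
const afterWorker = feeds['timeline:scott'].length // 2
```

The queued path returns faster for the writer, at the cost of followers seeing the activity slightly later.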

Timeline MongoDB

Here's a quick tutorial on a simple timeline with mongodb-activity-feed:

const timelineScott = await fm.getOrCreateFeed('timeline', 'scott')
const userNick = await fm.getOrCreateFeed('user', 'nick')
await fm.follow(timelineScott, userNick)
const activity = {
	actor: 'user:nick',
	verb: 'watch',
	object: 'video:123',
}
await fm.addActivity(activity, userNick)
const activities = await fm.readFeed(timelineScott, 0, 10)

Notification System MongoDB

Here's a quick tutorial on a simple notification system with mongodb-activity-feed:

const notificationBen = await fm.getOrCreateFeed('notification', 'ben')
// let's say you want to notify Ben that Nick likes his post
const activity = {
	actor: 'user:nick',
	verb: 'like',
	object: 'post:123',
}
await fm.addActivity(activity, notificationBen)
// group together all activities with the same verb and actor
const aggregationMethod = activity => {
	return activity.verb + '__' + activity.actor
}
const groups = await fm.readFeed(notificationBen, 0, 3, null, aggregationMethod)

Adding an activity

Add an activity like this.

const activity = {
	actor: 'user:nick',
	verb: 'like',
	object: 'post:123',
}
await fm.addActivity(activity, feed)

Removing an activity

Remove an activity:

const activity = {
	actor: 'user:nick',
	verb: 'like',
	object: 'post:123',
}
await fm.removeActivity(activity, feed)

Follow a feed

// follow with a copy limit of 10
const timelineScott = await fm.getOrCreateFeed('timeline', 'scott')
const userNick = await fm.getOrCreateFeed('user', 'nick')
await fm.follow(timelineScott, userNick, 10)

Follow Many Feeds

// follow with a copy limit of 10
const source = await fm.getOrCreateFeed('timeline', 'scott')
const target = await fm.getOrCreateFeed('user', 'nick')
const target2 = await fm.getOrCreateFeed('user', 'john')
// each entry needs a "target" key, so target2 is passed as target: target2
await fm.followMany([{ source, target }, { source, target: target2 }], 10)

Unfollow a feed

const timelineScott = await fm.getOrCreateFeed('timeline', 'scott')
const userNick = await fm.getOrCreateFeed('user', 'nick')
await fm.unfollow(timelineScott, userNick)

Create Many Feeds at Once

const feedReferences = [
	{ group: 'timeline', feedID: 'scott' },
	{ group: 'notification', feedID: 'ben' },
]
const feedMap = await fm.getOrCreateFeeds(feedReferences)

Reading a feed from MongoDB

Basic Read

const notificationAlex = await fm.getOrCreateFeed('notification', 'alex')
await fm.readFeed(notificationAlex, 0, 10)

Ranked Feed

const notificationAlex = await fm.getOrCreateFeed('notification', 'alex')
// assumes that you have a property on your activity called "popularity"
const rankingMethod = (a, b) => {
	return b.popularity - a.popularity
}
const activities = await fm.readFeed(notificationAlex, 0, 3, rankingMethod)
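The ranking method has the same shape as an Array.prototype.sort comparator. How readFeed applies it internally isn't documented here, so this is just a sketch of the ordering that the comparator from the Ranked Feed example produces on some made-up sample data.

```javascript
// Sample activities with the assumed "popularity" property (made-up data).
const sample = [
	{ verb: 'like', object: 'post:1', popularity: 3 },
	{ verb: 'like', object: 'post:2', popularity: 10 },
	{ verb: 'like', object: 'post:3', popularity: 7 },
]

// A negative result places a before b, so higher popularity comes first.
const rankingMethod = (a, b) => b.popularity - a.popularity

// Copy before sorting so the input array is left untouched.
const ranked = [...sample].sort(rankingMethod)
const rankedObjects = ranked.map(activity => activity.object)
```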

Aggregated Feed

const notificationAlex = await fm.getOrCreateFeed('notification', 'alex')
// group together all activities with the same verb and actor
const aggregationMethod = activity => {
	return activity.verb + '__' + activity.actor
}
await fm.readFeed(notificationAlex, 0, 10, null, aggregationMethod)
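The aggregation method returns a group key for each activity. As a rough sketch of what grouping by that key means (the `groupActivities` helper and the shape of its result are hypothetical, not the library's return format):

```javascript
// Group activities by the key returned from the aggregation method.
function groupActivities(activities, aggregationMethod) {
	const groups = new Map()
	for (const activity of activities) {
		const key = aggregationMethod(activity)
		if (!groups.has(key)) groups.set(key, [])
		groups.get(key).push(activity)
	}
	// Map preserves insertion order, so groups appear in first-seen order.
	return [...groups].map(([group, items]) => ({ group, activities: items }))
}

const aggregationMethod = activity => activity.verb + '__' + activity.actor

const groups = groupActivities(
	[
		{ actor: 'user:nick', verb: 'like', object: 'post:1' },
		{ actor: 'user:nick', verb: 'like', object: 'post:2' },
		{ actor: 'user:ben', verb: 'like', object: 'post:1' },
	],
	aggregationMethod,
)
```

Here Nick's two likes land in one group and Ben's like in another, which is the "Nick liked 2 posts" style of notification.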

Activities are unique on the combination of foreign_id and time. If you don't specify a foreign_id, the full activity object is used instead.
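A minimal sketch of that uniqueness rule, assuming the key is simply the foreign_id (or the serialized activity) joined with the time. The `activityKey` and `dedupe` helpers and the key format are hypothetical; the library's real key may differ.

```javascript
// Hypothetical identity key; the library's real key format is not documented here.
function activityKey(activity) {
	// Note: JSON.stringify depends on property order, which is fine for a sketch.
	const id = activity.foreign_id !== undefined ? activity.foreign_id : JSON.stringify(activity)
	return id + '|' + activity.time
}

// Keep the first activity seen for each key.
function dedupe(activities) {
	const seen = new Map()
	for (const activity of activities) {
		const key = activityKey(activity)
		if (!seen.has(key)) seen.set(key, activity)
	}
	return [...seen.values()]
}

const time = '2018-08-20T12:00:00.000Z'
const unique = dedupe([
	{ actor: 'user:nick', verb: 'like', object: 'post:123', foreign_id: 'like:1', time },
	{ actor: 'user:nick', verb: 'like', object: 'post:123', foreign_id: 'like:1', time },
	{ actor: 'user:nick', verb: 'like', object: 'post:456', foreign_id: 'like:2', time },
])
```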

Firehose Configuration

// socket (recommended)
const firehose = new SocketIOFirehose(SOCKET_URL)
// faye
const firehoseFaye = new FayeFirehose(FAYE_URL)
// dummy firehose
const firehoseDummy = new DummyFirehose(message => {})
const fm = new FeedManager(mongo, redis, { firehose: firehose, bull: false })

Pros/Cons

MongoDB is a nice general-purpose database, but it's not a great fit for building activity feeds. Cassandra and Redis will outperform a MongoDB-based solution in most scenarios.

Dedicated activity feed databases like Stream are typically 10x more performant and easier to use.

So in most cases you shouldn't run your activity feed on MongoDB. It only makes sense if your traffic is relatively small and you're not able to use cloud hosted APIs. Unless you really need to run your feeds on-prem you should not use this in prod.

If you do need to run on-prem, I'd recommend the open source Stream-Framework.

Contributing

Pull requests are welcome but be sure to improve test coverage.

Running tests

yarn test

Linting

yarn lint

Prettier

yarn prettier

Benchmarks

These docs aim to make it easy to reproduce these benchmarks. The initial plan is to run them again in 2019 to see how Mongo and Stream have changed.

Benchmark prep (dev mode)

Step 1 - Clone the repo

git clone https://github.com/GetStream/mongodb-activity-feed.git
cd mongodb-activity-feed
yarn install
brew install redis mongodb
brew services start redis
brew services start mongodb

Step 2 - Environment variables

You'll want to configure the following environment variables in a .env file:

STREAM_APP_ID=appid
STREAM_API_KEY=key
STREAM_API_SECRET=secret
MONGODB_CONNECTION=connectionstring
SOCKET_URL=http://localhost:8002
REDIS_HOST=localhost
REDIS_PORT=6379
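How the benchmark scripts read these variables isn't shown here. As a sketch under that caveat, a small helper could turn them into a settings object once the .env file has been loaded into the environment (for example with the dotenv package). The `loadSettings` function, the defaults, and the shape of the result are assumptions for illustration.

```javascript
// Hypothetical helper: build a settings object from environment variables.
// In a real script you would pass process.env after loading the .env file.
function loadSettings(env) {
	if (!env.MONGODB_CONNECTION) {
		throw new Error('MONGODB_CONNECTION is required')
	}
	return {
		mongoUrl: env.MONGODB_CONNECTION,
		socketUrl: env.SOCKET_URL || 'http://localhost:8002',
		redis: {
			host: env.REDIS_HOST || 'localhost',
			// Environment variables are strings, so convert the port to a number.
			port: Number(env.REDIS_PORT || 6379),
		},
	}
}

const settings = loadSettings({
	MONGODB_CONNECTION: 'mongodb://localhost:27017/mydb',
	REDIS_PORT: '6380',
})
```

Failing fast on a missing connection string avoids running a benchmark against the wrong database by accident.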

Step 3 - Start worker and socketio

For dev purposes you can use this setup to start the processes

yarn build
pm2 start process.json

This will start a worker and socket.io cluster.

Step 4 - Benchmark dir

cd dist/benchmark

Benchmark 1 - Read latency

MongoDB

# flush your mongo instance before running this
REPETITIONS=10 CONCURRENCY=5 node read_latency_mongo.js

Stream

REPETITIONS=10 CONCURRENCY=5 node read_latency.js

The blogpost runs the benchmark with 10 repetitions and concurrency set to 5, 10 and 20.

Benchmark 2 - Fanout & realtime latency

MongoDB

# flush your mongo instance before running this
CONCURRENCY=1 node fanout_latency_mongo.js

Stream

CONCURRENCY=1 node fanout_latency.js

The blogpost runs the benchmark with 1, 3 and 10 for the concurrency.

Benchmark 3 - Network Simulation/ Capacity

MongoDB

# flush your mongo instance before running this
MAX_FOLLOWERS=1000 node capacity_mongo.js

Stream

MAX_FOLLOWERS=1000 node capacity.js

The blogpost increases max followers from 1k to 10k and finally 50k.

Benchmark prep (production notes)

  1. SocketIO:

Note that you need to configure the load balancer for Socket.io to be sticky:

https://socket.io/docs/using-multiple-nodes/

  2. Redis:

For optimal performance, be sure to set up Redis to not be persistent.