1. What
2. Why
3. How it Works
4. Client to Server
5. WebSocket Client
6. Tech Stack
7. Infrastructure
8. Data Storage
9. Deploying a Stack
Near real-time event-driven analytics server for my blog site.
- For the love of data 😊
- Reader privacy - tracks views not users
- It's fun!
This project is also a way for me to practise my Go programming skills by building something I actually use.
Architectural diagram can be found here.
The server counts the total hits on a page and determines unique views from the refreshed value sent by the client: a hit sent with refreshed set to false counts as a unique view. If refreshed is not supplied, unique views are calculated the same way as total views. It is up to the client to decide whether a page load is a refresh and should therefore be counted as non-unique.
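The counting rule above can be sketched in a few lines of Go. This is only an illustration of the rule as described, not the actual Lambda code: the Counts type and RecordHit function are hypothetical names, and a nil pointer is used here to stand for "refreshed was not supplied".

```go
package main

import "fmt"

// Counts holds the two counters kept for a single page (hypothetical type).
type Counts struct {
	Total  int
	Unique int
}

// RecordHit applies one hit to c and returns the updated counters.
// refreshed is a pointer so that "not supplied" (nil) can be told apart
// from an explicit false.
func RecordHit(c Counts, refreshed *bool) Counts {
	c.Total++
	// Unique when the client sent refreshed=false, or sent nothing at all
	// (in which case unique views simply track total views).
	if refreshed == nil || !*refreshed {
		c.Unique++
	}
	return c
}

func main() {
	yes, no := true, false
	var c Counts
	c = RecordHit(c, &no)  // first visit
	c = RecordHit(c, &yes) // page reload, not unique
	c = RecordHit(c, nil)  // flag omitted, counted like a total view
	fmt.Printf("total=%d unique=%d\n", c.Total, c.Unique) // total=3 unique=2
}
```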
The WebSocket server also sends back the total views for each article or blog post. This can be displayed on blog pages. (I haven't added it to mine, it requires CSS >.< lol).
The following is the JSON data expected by the server:
{
  "message": "views", // websocket key route; this is mandatory
  "articleId": string,
  "articleTitle": string,
  "previousPage": string, // optional
  "currentPage": string, // optional
  "refreshed": boolean, // optional
  "referrer": string
}
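For reference, here is a minimal sketch of how a payload with this shape could be decoded in Go. The ViewMessage struct and ParseViewMessage function are hypothetical names chosen for illustration and are not taken from the project; the only validation shown is the mandatory message field, and refreshed is a pointer so an omitted value stays distinguishable from false.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ViewMessage mirrors the JSON payload shown above (hypothetical type).
type ViewMessage struct {
	Message      string `json:"message"`
	ArticleID    string `json:"articleId"`
	ArticleTitle string `json:"articleTitle"`
	PreviousPage string `json:"previousPage,omitempty"` // optional
	CurrentPage  string `json:"currentPage,omitempty"`  // optional
	Refreshed    *bool  `json:"refreshed,omitempty"`    // optional; nil when omitted
	Referrer     string `json:"referrer"`
}

// ParseViewMessage decodes raw JSON and rejects payloads without a route key.
func ParseViewMessage(raw []byte) (ViewMessage, error) {
	var m ViewMessage
	if err := json.Unmarshal(raw, &m); err != nil {
		return ViewMessage{}, err
	}
	if m.Message == "" {
		return ViewMessage{}, errors.New("missing mandatory field: message")
	}
	return m, nil
}

func main() {
	raw := []byte(`{"message":"views","articleId":"abc-123","articleTitle":"Hello","refreshed":false,"referrer":"https://example.com"}`)
	m, err := ParseViewMessage(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.ArticleID, m.Refreshed != nil && *m.Refreshed) // abc-123 false
}
```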
You can use any WebSocket client but ultimately, the tracking/tagging is up to you. (Blog post coming soon on how I set this up in my Gatsby blog site).
- Go: for AWS Lambda
- NodeJS/TypeScript & AWS CDK: for Infrastructure as Code
- SNS, SQS, DynamoDB, DynamoDB Streams and API Gateway WebSockets
The infra directory contains the CDK app for the resources that make up the project's entire infrastructure on AWS.
The resources include: API Gateway (WebSocket), Lambda functions, SQS, SNS and DynamoDB.
The API Gateway resources are backed by Lambda functions that process incoming requests via a WebSocket and publish them to SNS.
A number of SQS queues subscribe to the SNS topic and receive notifications based on specific filters: post_views, profile_views and so on.
Each SQS queue has a corresponding Lambda function which processes messages from the queue and sends them off to DynamoDB.
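To make the fan-out step concrete, here is a small in-memory model of filtered delivery written in Go. It does not use the AWS SDK and is not the project's code: Notification, Queue and Publish are hypothetical stand-ins, and matching on a single Type field is an assumption about how the filters are keyed.

```go
package main

import "fmt"

// Notification stands in for a published SNS message. Type plays the role of
// the value a subscription filter matches on (an assumption for this sketch).
type Notification struct {
	Type string
	Body string
}

// Queue stands in for an SQS queue subscribed with one filter value.
type Queue struct {
	Filter   string
	Messages []Notification
}

// Publish delivers n to every queue whose filter matches n.Type and
// returns how many queues received it.
func Publish(queues []*Queue, n Notification) int {
	delivered := 0
	for _, q := range queues {
		if q.Filter == n.Type {
			q.Messages = append(q.Messages, n)
			delivered++
		}
	}
	return delivered
}

func main() {
	posts := &Queue{Filter: "post_views"}
	profile := &Queue{Filter: "profile_views"}
	queues := []*Queue{posts, profile}

	Publish(queues, Notification{Type: "post_views", Body: "article-1"})
	Publish(queues, Notification{Type: "profile_views", Body: "home"})
	Publish(queues, Notification{Type: "post_views", Body: "article-2"})

	fmt.Println(len(posts.Messages), len(profile.Messages)) // 2 1
}
```

Each queue only ever sees the message types it filters for, so the Lambda behind it can stay small and single-purpose.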
The app infrastructure contains four DynamoDB tables:
- HomeAndProfile: stores hit counts of my blog homepage and contacts page
- PostCountWriter: stores hit counts of my blog posts and is listened to by a DynamoDB stream
- PostCountReader: stores the data written to the DynamoDB stream, which is a copy of updates to the PostCountWriter table. It also serves the data back to the WebSocket client via a Lambda
- Referrer: stores referrer data if available
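The writer/reader split between PostCountWriter and PostCountReader can be pictured with a small in-memory model in Go. This is a sketch under stated assumptions, not the project's code and not the DynamoDB API: WriterTable, ReaderTable and StreamRecord are hypothetical types, and the stream is modelled as a plain slice of records.

```go
package main

import "fmt"

// StreamRecord stands in for one stream record (hypothetical shape).
type StreamRecord struct {
	ArticleID string
	Views     int
}

// WriterTable models the write side: every update is also appended to a stream.
type WriterTable struct {
	counts map[string]int
	stream []StreamRecord
}

func NewWriterTable() *WriterTable {
	return &WriterTable{counts: map[string]int{}}
}

// Increment bumps the counter for id and emits the new value to the stream.
func (w *WriterTable) Increment(id string) {
	w.counts[id]++
	w.stream = append(w.stream, StreamRecord{ArticleID: id, Views: w.counts[id]})
}

// Drain returns the pending stream records and clears the stream.
func (w *WriterTable) Drain() []StreamRecord {
	out := w.stream
	w.stream = nil
	return out
}

// ReaderTable models the read side, which only ever sees stream records.
type ReaderTable map[string]int

// Apply copies each record into the read side, in order.
func (r ReaderTable) Apply(records []StreamRecord) {
	for _, rec := range records {
		r[rec.ArticleID] = rec.Views
	}
}

func main() {
	writer := NewWriterTable()
	reader := ReaderTable{}

	writer.Increment("article-1")
	writer.Increment("article-1")
	writer.Increment("article-2")
	reader.Apply(writer.Drain())

	fmt.Println(reader["article-1"], reader["article-2"]) // 2 1
}
```

Keeping reads on a separate copy means serving view counts back to clients never competes with the write path that records hits.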
- Go and NodeJS
- An AWS account
- AWS CLI
- AWS CDK CLI
- Set your AWS profile using the AWS CLI
- Choose your stack namespace. Allowed namespaces are: prod, stage or local. The namespace you set will map to the client domain that can talk to the analytics server on AWS. The client URL is yours to decide. Mine is currently my blog site, so I use prod to deploy and set a prod client for PROD_CLIENT_URL. dev is ideal for deploying and testing locally. A list of environment variables to set can be found in infra/env.template
- From the root of the project run the script: ./build_packages.sh
- Then run ./deploy.sh to deploy to your personal AWS account
- The test script test.sh can be used to run all the unit tests in the project for the Go Lambda functions
My deployed stack serves my blog site, which is built with Gatsby and bootstrapped with an open-source template by Lumen which has pages and posts. I have, however, added React context components and the WebSocket API client to stream data to my analytics server.