Replacing Elasticsearch with Rust and SQLite
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
benches
migrations
rrinlog-server
rrinlog
src
test-assets
.gitignore
.travis.yml
Cargo.lock
Cargo.toml
LICENSE.txt
README.md
rrinlog-server.service
rrinlog.service

README.md

Build Status

rrinlog

rrinlog is my attempt at Replacing Elasticsearch with Rust and SQLite for my nginx access logs, as Elasticsearch is a resource hungry application even at idle. rrinlog's success has been outstanding, with a 100x reduction in memory, 1000x reduction in CPU usage, and 100x reduction in disk usage.

This project contains two binaries:

  • rrinlog is for consuming nginx acces logs and storing them in a SQLite database. This binary may be built on Rust stable.
  • rrinlog-server exposes this SQLite database according to Grafana's JSON API datasource. This binary requires Rust nightly. Current Rust web framework is actix web, but there is a Rocket branch as well

This project currently isn't meant at replacing Elasticsearch for the general populous for the following reasons:

Custon Nginx Access Log

rrinlog ingests a custom nginx access log format:

log_format vhost    '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" "$host"';

Any other format would likely result in parsing errors.

Hardcoded SQL Queries

rrinlog-server let's me know what my top blog articles with the following SQL query:

SELECT referer,
       Count(*) AS views
FROM   logs
WHERE  host = 'comments.nbsoftsolutions.com'
       AND method = 'GET'
       AND path <> '/js/embed.min.js'
       AND epoch >= ?
       AND epoch < ?
       AND referer <> '-'
       AND remote_addr <> ?
GROUP  BY referer
ORDER  BY views DESC

This SQL query is tailored to me and how my blog is setup, so make no mistake that the intended audience with this query is solely me 😄

Limited Endpoints

These hardcoded SQL queries are needed as Grafana doesn't support SQLite as a native datasource. One day it may be supported like Mysql and Postgres, but until that day, rrinlog-server contains only a limited set of visualizations:

  • What are my top blog articles
  • How much outbound web data is leaving the server to other external IPs
  • How many requests are being serviced by other virtual hosts

No GeoIP Capabilities

Elasticsearch has the ability to take an IP address and turn it into a location. This is called GeoIP. I had a Grafana panel showing the top visiting cities, which is novel but not critical to monitor. Migrating from Elasticsearch meant I had to remove the visualization.