prime search engine
inheritance-poms
kinesis-logappender
miru-amza
miru-analytics-plugins-test
miru-analytics-plugins
miru-anomaly-deployable
miru-anomaly-plugins-test
miru-anomaly-plugins
miru-api
miru-bitmaps-roaring5
miru-bitmaps-roaring6
miru-bot-deployable
miru-catwalk-deployable
miru-catwalk-shared
miru-cluster-client
miru-cluster
miru-edge-siphon
miru-logappender
miru-manage-deployable
miru-metric-sampler
miru-plugin-test
miru-plugin
miru-reader-deployable
miru-reco-plugins-test
miru-reco-plugins
miru-reco
miru-service-test
miru-service
miru-siphon-api
miru-siphon-deployable
miru-stream-plugins-test
miru-stream-plugins
miru-stumptown-deployable
miru-stumptown-plugins-test
miru-stumptown-plugins
miru-sync-api
miru-sync-deployable
miru-tools-deployable
miru-ui
miru-wal-client
miru-wal-deployable
miru-wal
miru-writer-deployable
miru-writer
.gitignore
.travis.yml
LICENSE
NOTICE.txt
README.md
pom.xml
release.sh

Miru

What It Is

Miru is a multi-tenant stream engine. Choo choo! No, not a steam engine, a stream engine. What is a stream engine? It is a search engine without all the overhead of scoring. Documents are maintained in insertion order, or "time order" in streams parlance. The single ranking for a document (sometimes referred to as prime ordering) is its position in the index.

Alternative super hero names include Bit Collider and Join-O-Rama.

How It Works

A Miru tenancy comprises a schema and a set of time-ordered partitions. Each partition contains millions of ordered documents, and each document contains tens to thousands of terms that conform to the schema. Each term is a sequence of bits, so at its heart Miru operates via efficient use of huge parallel bitsets.

Every partition is replicated for high availability (HA). Because each discrete partition is backed by a small number of files and a unified delta/append model, Miru supports tens or hundreds of thousands of partitions per node.
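
To make the bitset idea concrete, here is a minimal sketch, not Miru's actual API. It assumes each term owns one bit per document position, so bit `i` is set when the document at insertion position `i` contains that term. The class `TermBitsetSketch`, its methods, and the `author:`/`verb:` term names are all hypothetical. The module names in this repository suggest Miru uses Roaring bitmaps; the sketch swaps in plain `java.util.BitSet` to stay dependency-free.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy index for illustration only; this is not Miru's API. */
final class TermBitsetSketch {
    // One bitset per term; bit i is set when document i contains the term.
    private final Map<String, BitSet> termBits = new HashMap<>();
    private int nextDocId = 0;

    /** Appends a document; its id is its position in insertion ("time") order. */
    int append(String... terms) {
        int docId = nextDocId++;
        for (String term : terms) {
            termBits.computeIfAbsent(term, t -> new BitSet()).set(docId);
        }
        return docId;
    }

    /** Returns a defensive copy so callers can combine bitsets freely. */
    private BitSet bitsFor(String term) {
        BitSet bits = termBits.get(term);
        return bits == null ? new BitSet() : (BitSet) bits.clone();
    }

    /** Documents containing both terms: a bitwise AND, with no scoring step. */
    BitSet and(String a, String b) {
        BitSet result = bitsFor(a);
        result.and(bitsFor(b));
        return result;
    }

    /** Walks the set bits from highest to lowest, i.e. newest document first. */
    static List<Integer> newestFirst(BitSet hits, int limit) {
        List<Integer> ids = new ArrayList<>();
        for (int i = hits.length() - 1; i >= 0 && ids.size() < limit;
                i = hits.previousSetBit(i - 1)) {
            ids.add(i);
        }
        return ids;
    }

    public static void main(String[] args) {
        TermBitsetSketch index = new TermBitsetSketch();
        index.append("author:alice", "verb:comment"); // doc 0
        index.append("author:bob", "verb:comment");   // doc 1
        index.append("author:alice", "verb:create");  // doc 2
        index.append("author:alice", "verb:comment"); // doc 3

        BitSet hits = index.and("author:alice", "verb:comment");
        System.out.println(hits);                 // prints {0, 3}
        System.out.println(newestFirst(hits, 1)); // prints [3]
    }
}
```

Because a document's position is its only rank, the result of a filter is already ordered: reading the set bits from the high end yields the newest matches without any sort.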

Use Cases

  1. Activity/message streams

    Miru is well suited for streams of messages such as content creation, commenting, and status updates. The filter and full-text search APIs let you write data once and present it efficiently in any number of views or facets.

  2. Collaborative filtering and linear regression

    Miru can combine and collide metadata about content, people, or other relatable features to compute recommendations and trends for dynamic queries and filters in realtime or near-realtime.

  3. Metrics and logging

    Like activity streams, chronologically ordered inputs such as health metrics (resource utilization, counts, deltas) and logging (console/application output) fit easily into Miru’s time-ordered facet filters. Entire clusters or networks can be aggregated into a single searchable set, presenting an instantaneous view of application health and error trends.
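
As a rough illustration of the "combine and collide" idea in the second use case, the sketch below ranks items by how many users they share with a seed item, using bitset intersections. This is plain co-occurrence counting chosen for brevity, not the algorithm used by Miru's reco plugins. The class `CoOccurrenceSketch`, the `recommend` method, and the `doc-N` item names are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical co-occurrence sketch; not taken from Miru's reco plugins. */
final class CoOccurrenceSketch {

    /** Builds a bitset with the given user ids set. */
    static BitSet users(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits;
    }

    /** Ranks other items by the number of users they share with the seed item. */
    static List<String> recommend(Map<String, BitSet> usersByItem, String seed, int limit) {
        BitSet seedUsers = usersByItem.getOrDefault(seed, new BitSet());
        Map<String, Integer> overlap = new HashMap<>();
        for (Map.Entry<String, BitSet> entry : usersByItem.entrySet()) {
            if (entry.getKey().equals(seed)) {
                continue;
            }
            // "Collide" the two bitsets: the AND keeps only the shared users.
            BitSet shared = (BitSet) entry.getValue().clone();
            shared.and(seedUsers);
            int count = shared.cardinality();
            if (count > 0) {
                overlap.put(entry.getKey(), count);
            }
        }
        List<String> ranked = new ArrayList<>(overlap.keySet());
        // Highest overlap first; ties broken by item name for a stable order.
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(overlap.get(b), overlap.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        Map<String, BitSet> usersByItem = new HashMap<>();
        usersByItem.put("doc-1", users(0, 1, 2));
        usersByItem.put("doc-2", users(1, 2));
        usersByItem.put("doc-3", users(2));
        usersByItem.put("doc-4", users(5));

        System.out.println(recommend(usersByItem, "doc-1", 2)); // prints [doc-2, doc-3]
    }
}
```

Each candidate costs one bitwise AND and one population count, which is why this style of query can plausibly be answered at request time rather than precomputed.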

Getting Started

Check out Miru over at the wiki.

Licensing

Miru is licensed under the Apache License, Version 2.0. See LICENSE for full license text.