GitHub - tontinton/toshokan: Log search engine on object storages

Introduction

toshokan is a search engine (think Elasticsearch or Splunk) that stores its data on object storage. The most similar project is Quickwit.

It uses:

tantivy - for building and searching the inverted index data structure.
Apache OpenDAL - for an abstraction over object storage services.
PostgreSQL - for storing metadata atomically, which avoids data races.

I've also written a blog post explaining the benefits and drawbacks of using object storage for data-intensive applications.

Architecture

How to use

# Create an index from a config file.
toshokan create example_config.yaml

# Index a newline-delimited JSON file.
toshokan index test ~/hdfs-logs-multitenants-10000.json

# Index JSON records from Kafka.
# Once every --commit-interval, whatever was read from the source is written to a new index file.
toshokan index test kafka://localhost:9092/topic --stream

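# Search the index. In tantivy's query syntax, [60 TO 65} is a range that includes 60 and excludes 65.
toshokan search test "tenant_id:[60 TO 65} AND severity_text:INFO" --limit 1 | jq .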
# {
#   "attributes": {
#     "class": "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace"
#   },
#   "body": "src: /10.10.34.30:33078, dest: /10.10.34.11:50010, bytes: 234, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-202827006_103, offset: 0, srvID: d9ef1b17-4314-4cd8-91eb-095413c3427f, blockid: BP-108841162-10.10.34.11-1440074360971:blk_1074072709_331885, duration: 2571934",
#   "resource": {
#     "service": "datanode/01"
#   },
#   "severity_text": "INFO",
#   "tenant_id": 61,
#   "timestamp": "2016-04-13T06:46:54Z"
# }

# Merge index files for faster searching.
toshokan merge test

# Drop the index.
toshokan drop test
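
To try the commands above without the HDFS sample file, here is a minimal sketch that writes a small newline-delimited JSON file. The field names are copied from the search result shown above; the file name sample-logs.json and all record values are made up for illustration, and the sketch assumes the index schema in example_config.yaml accepts these fields.

```shell
# Write three records, one complete JSON object per line.
# File name and values are hypothetical; only the field names come from the search output above.
cat > sample-logs.json <<'EOF'
{"timestamp":"2016-04-13T06:46:54Z","tenant_id":59,"severity_text":"INFO","body":"made-up record below the range","resource":{"service":"datanode/01"},"attributes":{"class":"example.Below"}}
{"timestamp":"2016-04-13T06:46:55Z","tenant_id":61,"severity_text":"INFO","body":"made-up record inside the range","resource":{"service":"datanode/01"},"attributes":{"class":"example.Inside"}}
{"timestamp":"2016-04-13T06:46:56Z","tenant_id":65,"severity_text":"INFO","body":"made-up record at the excluded upper bound","resource":{"service":"datanode/01"},"attributes":{"class":"example.Upper"}}
EOF

# Print the number of records written (one per line).
wc -l < sample-logs.json

# Then, assuming an index named "test" already exists:
#   toshokan index test sample-logs.json
#   toshokan search test "tenant_id:[60 TO 65} AND severity_text:INFO" | jq .
```

If the range is inclusive of 60 and exclusive of 65, the search would be expected to return only the record with tenant_id 61.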

