Harvest

Portable log aggregation tool for middle-scale system operation/observation.

Harvest provides the hrv command with the following features.

  • Agentless.
  • Portable.
  • Only 1 config file.
  • Fetch various remote/local log data via SSH/exec. ( hrv fetch )
  • Output all fetched logs in the order of timestamp. ( hrv cat )
  • Stream various remote/local logs via SSH/exec. ( hrv stream )
  • Copy remote/local raw logs via SSH/exec. ( hrv cp )

Usage

🐞 Fetch and output remote/local log data

1. Set log URLs (and log type) in config.yml

---
targetSets:
  -
    description: webproxy syslog
    type: syslog
    urls:
      - 'ssh://webproxy.example.com/var/log/syslog*'
    tags:
      - webproxy
      - syslog
  -
    description: webproxy NGINX access log
    type: combinedLog
    urls:
      - 'ssh://webproxy.example.com/var/log/nginx/access_log*'
    tags:
      - webproxy
      - nginx
  -
    description: app log
    type: regexp
    regexp: 'time:([^\t]+)'
    timeFormat: 'Jan 02 15:04:05'
    timeZone: '+0900'
    urls:
      - 'ssh://app-1.example.com/var/log/ltsv.log*'
      - 'ssh://app-2.example.com/var/log/ltsv.log*'
      - 'ssh://app-3.example.com/var/log/ltsv.log*'
    tags:
      - app
  -
    description: db dump log
    type: regexp
    regexp: '"ts":"([^"]+)"'
    timeFormat: '2006-01-02T15:04:05.999-0700'
    urls:
      - 'ssh://db.example.com/var/log/tcpdp/eth0/dump*'
    tags:
      - db
      - query
  -
    description: PostgreSQL log
    type: regexp
    regexp: '^\[?(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \w{3})'
    timeFormat: '2006-01-02 15:04:05 MST'
    multiLine: true
    urls:
      - 'ssh://db.example.com/var/log/postgresql/postgresql*'
    tags:
      - db
      - postgresql
  -
    description: local Apache access log
    type: combinedLog
    urls:
      - 'file:///path/to/httpd/access.log'
    tags:
      - httpd

You can validate the config file with hrv configtest.

$ hrv configtest -c config.yml

2. Fetch target log data via SSH/exec ( hrv fetch )

$ hrv fetch -c config.yml --tag=webproxy,db

3. Output log data ( hrv cat )

$ hrv cat harvest-20181215T2338+900.db --with-timestamp --with-host --with-path | less -R

🐞 Stream remote/local logs

1. Set config.yml

2. Stream target logs via SSH/exec ( hrv stream )

$ hrv stream -c config.yml --with-timestamp --with-host --with-path --with-tag

🐞 Copy remote/local raw logs

1. Set config.yml

2. Copy remote/local raw logs to a local directory via SSH/exec ( hrv cp )

$ hrv cp -c config.yml

Architecture

hrv fetch and hrv cat

(Architecture diagram: hrv fetch and hrv cat)

hrv stream

(Architecture diagram: hrv stream)

Installation

$ brew install k1LoW/tap/harvest

or

$ go get github.com/k1LoW/harvest/cmd/hrv

What is a "middle-scale system"?

  • < 50 instances
  • < 1 million logs per hrv fetch

What if you are operating a large-scale/super-large-scale/hyper-large-scale system?

Consider an agent-based log collector/platform, a service mesh, or a distributed tracing platform instead.

Internal

Requirements

  • UNIX commands
    • date
    • find
    • grep
    • head
    • ls
    • tail
    • xargs
    • zcat
  • sudo
  • SQLite

WANT

  • hrv analyze
  • tag DAG
  • Viewer / Visualizer

References

  • Hayabusa: A Simple and Fast Full-Text Search Engine for Massive System Log Data
    • Keeps things simple by combining commands.
    • Full-Text Search Engine using SQLite FTS.