ryuken73/replay-log

Access Log Aggregator and Log Replay

Log replay : a simple utility that replays an access log

  • setup : set values in setup_env.sh
    LOG_DIR : directory that holds both the source and the target log
    SRC_FILE_NAME : source file to replay from
    DST_FILE_NAME : target file to write to
  • run : sh genLog.sh
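
To make the three settings concrete, here is a minimal sketch of what a replay step could look like. It is not the repository's genLog.sh: the default values, the `replay_log` function, and the `DELAY_SEC` knob are all hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Illustrative defaults only; real values belong in setup_env.sh.
LOG_DIR="${LOG_DIR:-/tmp/replay-log-demo}"        # hypothetical path
SRC_FILE_NAME="${SRC_FILE_NAME:-access.log.src}"  # hypothetical file name
DST_FILE_NAME="${DST_FILE_NAME:-access.log}"      # hypothetical file name

# Append every line of the source file to the target file.
# DELAY_SEC (an assumed knob, not a repository setting) pauses between
# lines so the target file grows the way a live access log would.
replay_log() {
  delay="${DELAY_SEC:-0}"
  while IFS= read -r line; do
    printf '%s\n' "$line" >> "$LOG_DIR/$DST_FILE_NAME"
    [ "$delay" = "0" ] || sleep "$delay"
  done < "$LOG_DIR/$SRC_FILE_NAME"
}
```

Calling `replay_log` after the variables are set copies the source log into the target file line by line, which gives the aggregators below a file to follow.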

Access Log Aggregator#1 : node.js

  • setup : set values in setup_env.sh
    DST_FILE_NAME : source access log (the file generated by genLog.sh)
    ENABLE_FAST_COLLECT : if true, collect only the fields listed in ACCESS_LOG_FAST_FIELDS
    ACCESS_LOG_FAST_FIELDS : fields to collect; limiting them keeps processing fast and resource usage low
    ACCESS_LOG_FAST_FIELD_POSITIONS : positions of the above fast fields in the access log
  • run : sh tail.sh
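
The idea behind the fast-collect settings, keeping only a few fields by position instead of parsing the whole line, can be sketched in shell. This is not the Node.js implementation: `pick_fields` is a hypothetical helper, and the comma-separated position list is an assumed format rather than the documented one for ACCESS_LOG_FAST_FIELD_POSITIONS.

```shell
#!/bin/sh
# Print only the whitespace-separated fields at the given positions.
# $1 is a comma-separated list of 1-based positions (assumed format).
pick_fields() {
  awk -v positions="$1" '
    BEGIN { n = split(positions, pos, ",") }
    {
      out = ""
      for (i = 1; i <= n; i++) {
        # join the selected fields with a single space
        out = out (i > 1 ? " " : "") $(pos[i])
      }
      print out
    }
  '
}
```

For a log line in the common log format, `pick_fields "1,9" < access.log` would print just the client IP and the status code, which is the kind of reduction that keeps per-line work small.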

Access Log Aggregator#2 : awk script

** much faster and uses fewer resources than the node.js script

  • setup : set values in tail_awk.sh
    LOG_DIR : source log directory
    ACCESS_LOG : source access log file
    COL_TIME : column number of the timestamp in the access log
    COL_IP : column number of the client IP in the access log
    COL_STATUS : column number of the HTTP status code in the access log
    AGGR_INTERVAL_SEC : aggregation interval in seconds
    VIZ_SERVER : URL of the server to notify (it must expose an HTTP POST endpoint)
  • run : sh tail_awk.sh
  • results
$ sh tail_awk.sh
TIMESTAMP                      200   300   400   500   rOther
[2021-07-08T15:00:21+09:00]    11915 0     30    0     0
[2021-07-08T15:00:27+09:00]    17539 0     38    0     0
[2021-07-08T15:00:33+09:00]    15175 0     53    0     0
[2021-07-08T15:00:39+09:00]    16942 0     102   0     0
[2021-07-08T15:00:45+09:00]    16579 0     102   0     0
[2021-07-08T15:00:50+09:00]    13173 0     48    0     0
[2021-07-08T15:00:56+09:00]    17538 0     54    0     0
[2021-07-08T15:01:01+09:00]    12286 0     30    0     0
[2021-07-08T15:01:07+09:00]    17347 0     41    0     0
[2021-07-08T15:01:13+09:00]    15211 0     41    0     0
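
The per-status-class counts in the output above can be approximated with a short awk sketch. This is not the repository's tail_awk.sh: `aggregate_status` is a hypothetical function, the default column 9 assumes the common/combined log format, and the per-interval flushing and the POST to VIZ_SERVER are left out, so it produces one summary for a finite input.

```shell
#!/bin/sh
# COL_STATUS mirrors the setting described above; the default of 9 is an
# assumption based on the common/combined log format.
COL_STATUS="${COL_STATUS:-9}"

# Read access-log lines on stdin and print one row of counts per
# status class (2xx, 3xx, 4xx, 5xx, everything else).
aggregate_status() {
  awk -v col="$COL_STATUS" '
    {
      cls = substr($col, 1, 1)      # first digit of the status code
      if (cls == "2")      c200++
      else if (cls == "3") c300++
      else if (cls == "4") c400++
      else if (cls == "5") c500++
      else                 other++
    }
    END {
      printf "%-6s%-6s%-6s%-6s%s\n", "200", "300", "400", "500", "Other"
      printf "%-6d%-6d%-6d%-6d%d\n", c200, c300, c400, c500, other
    }
  '
}
```

Running `aggregate_status < "$LOG_DIR/$ACCESS_LOG"` summarizes a whole file. A continuous version would have to emit and reset the counters every AGGR_INTERVAL_SEC seconds instead of waiting for end of input.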

Resource Usage (when processing 2000~2500 records per second)

  • Aggregation by node.js : uses 3% ~ 5% CPU
  • Aggregation by awk script : uses under 1% CPU