logstash elasticsearch kibana 日志分析系统 docker 镜像构建

本库地址 https://github.com/jixiuf/logstashdocker

官方docker 镜像

https://hub.docker.com/_/logstash/ https://hub.docker.com/_/kibana/ https://hub.docker.com/_/elasticsearch/

docker pull logstash
docker pull kibana
docker pull elasticsearch

生成包含 logstash elasticsearch kibana 的 docker 镜像

直接本目录下

sudo make

或

docker build -t najaplus/logstash:latest  .

运行镜像

sudo  docker run -t -i najaplus/logstash bash

logstash

logstash 主要功能，收集与解析数据，从日志文件或redis 队列中解析出日志数据，经过logstash 一些filter plugin 解析转换为格式化的内容(json) ,存储到某个地方( 可以是redis ,file,elasticseatch), 以便进行数据分析与统计

测试logstash 是否成功安装

在镜像中运行

logstash -e 'input { stdin { } } output { stdout {} }'

此时会进入交互式提示界面，然后输入任意字符如 “hello” 会有如下提示

2016-05-06T17:30:17.219Z 40b8994c31cb hello

logstash 官网快速入门

https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html

logstash demo

logstash_demo 目录下有几个demo

logstash -f logstash1.conf
logstash -f logstash1.conf  --auto-reload

Elasticsearch

主要功能，数据的存储与检索，基于json 格式存储，基于Lucene

elasticsearch 的几个主要概念

Cluster 集群

集群有个名字，防止不同集群间混淆

elasticsearch --cluster.name my_cluster_name --node.name my_node_name

默认名称 elasticsearch

Node 节点

一个cluster 下分多个node

elasticsearch --cluster.name my_cluster_name --node.name my_node_name

默认名称 elasticsearch 如果只启动一个node 相当于

elasticsearch  --cluster.name elasticsearch --node.name elasticsearch

Index

一些具有共性的documents 的集合，可以认为一个 index 就是一个类别,比如我把游戏1的信息存到index1上游戏2的信息存到index2上index 有名字一个cluster 可以有任意多个index

Type

Index 下可以定义多个Type 可以认为是编程语言里的struct/class a type is defined for documents that have a set of common fields.

Document

docuemnt 存储具体和信息，如果Type 当成java里的Class ,则Document 可以认为是这个Class 的实例在 elasticsearch中它实际是一个json

Shards&Replication

架构层面上的东西，可以进行分片，及主从这个可以参考mysql 的数据库分片与主从一个index 内的数据量可能极大，可以将index 分成多片(称为shard) 当定义index 时，可以指定将此index 分成n个shards

Each shard is in itself a fully-functional and independent “index” that can be hosted on any node in the cluster. 可以认为index 与shards是逻辑上的，而 cluster 与node 上架构上的。一个index 分为多个shards(s1,s2,s3) ，一个cluster 分为多个node(n1,n2,n3) 将shard s1,s2,s3 分别放到n1,n2,n3节点上

而Replication 可以认为是从库，存在的意义，

备份，及当某个node 挂了可以failover, 以保证高可用性(hight available)
查询可以在从库上进行
默认情况下一个index 有5个shard, 每个shard 有一个 replica shards,即共有10个shards 通常情况下 replica shard 肯定跟primary shard 不在同一个节点上(这样从库还真正有意义)

启动

elasticsearch 基于Lucene,而Lucene 使用java 编写，所以java jdk 是安装所必须的

elasticsearch
或
elasticsearch --cluster.name my_cluster_name --node.name my_node_name

启动之后9200端口会监听http 请求

检查节点状态

curl 'localhost:9200/_cat/health?v'

deployer@iZ94badqop7Z logstash_demo/demo1 (master) $ curl ‘localhost:9200/_cat/health?v’ epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent 1462637474 00:11:14 elasticsearch yellow 1 1 5 5 0 0 5 0 - 50.0%

获取node 列表

curl 'localhost:9200/_cat/nodes?v'

deployer@iZ94badqop7Z logstash_demo/demo1 $ host ip heap.percent ram.percent load node.role master name 120.24.77.58 120.24.77.58 7 92 0.17 d * zjh

查看集群上有哪个index

curl 'localhost:9200/_cat/indices?v'

health status index pri rep docs.count docs.deleted store.size pri.store.size yellow open logstash-2016.05.07 5 1 6 0 17.3kb 17.3kb

可以看到index 的名字， primary个数，replica个数 ,docuemnts数量，

创建一个index

curl -XPUT 'localhost:9200/customer?pretty'

{ “acknowledged” : true }

删除某个index

curl -XDELETE 'localhost:9200/customer?pretty'

创建某个Type 的Documents

这里在index:customer上创建了一个type 为 external id=1的document 如果id=1的已经存在，则会替换之

curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
    {
    "name": "John Doe"
    }'

{ “_index” : “customer”, “_type” : “external”, “_id” : “1”, “_version” : 1, “_shards” : { “total” : 2, “successful” : 1, “failed” : 0 }, “created” : true }

实际情况上，在创建document 时，不必手动去创建相应的index,执行上述命令，如果没有index:customer,则会自动创建

curl -XPUT ‘localhost:9200/custome2r/external/1?pretty’ -d ’ { “name”: “John Doe” }’

查询某个document

curl -XGET 'localhost:9200/customer/external/1?pretty'

{ “_index” : “customer”, “_type” : “external”, “_id” : “1”, “_version” : 1, “found” : true, “_source” : { “name” : “John Doe” } }

update document

update 实际是先删除后增加

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
  "doc": { "name": "Jane Doe","age":11 }
}'

通过script 修改age 的值 +5 script 文档 https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting.html

curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
  "script" : "ctx._source.age += 5"
}'

目前的版本，script 操作只能会对一个docuemnt ,以后或许会支持类似于sql update 的操作，同时修改多个

delete document

curl -XDELETE 'localhost:9200/customer/external/2?pretty'

批量操作

同时创建id=1,2的 type:external

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
    {"index":{"_id":"1"}}
    {"name": "John Doe" }
    {"index":{"_id":"2"}}
    {"name": "Jane Doe" }
    '

修改一个，同时删除另一个

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'

批量从文件导入

假如有文件 account.json

{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA"}
{"index":{"_id":"18"}}
{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","employer":"Boink","email":"daleadams@boink.com","city":"Orick","state":"MD"}
{"index":{"_id":"20"}}
{"account_number":20,"balance":16418,"firstname":"Elinor","lastname":"Ratliff","age":36,"gender":"M","address":"282 Kings Place","employer":"Scentric","email":"elinorratliff@scentric.com","city":"Ribera","state":"WA"}

curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary "@account.json"

curl 'localhost:9200/_cat/indices?v

curl 'localhost:9200/_cat/indices?v'
health index pri rep docs.count docs.deleted store.size pri.store.size
yellow bank    5   1       1000            0    424.4kb        424.4kb

Search

查所有

# 两种方式， 一种通过参数 ，一种通过request body 发送json内容
curl 'localhost:9200/bank/_search?q=*&pretty'
#或
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
 {
 "query": { "match_all": {} }
 }'

   {
"took" : 63,
"timed_out" : false,
"_shards" : {
  "total" : 5,
  "successful" : 5,
  "failed" : 0
},
"hits" : {
  "total" : 1000,
  "max_score" : 1.0,
  "hits" : [ {
    "_index" : "bank",
    "_type" : "account",
    "_id" : "1",
    "_score" : 1.0, "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
  }, {
    "_index" : "bank",
    "_type" : "account",
    "_id" : "6",
    "_score" : 1.0, "_source" : {"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
  }
  ...
  ]}}

查询语法

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

size 指定返回多少个结果

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
 "size": 1
}'

from and sort 返回结果集的第3，4，5条

from 控制从哪条记录起始(0based) sort：使用balance 降序排列

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
  "from":2,
 "size": 3,
  "sort": { "balance": { "order": "desc" } }
}'

只返回特定的字段 _source

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
  "from":2,
 "size": 3,
  "sort": { "balance": { "order": "desc" } },
  "_source":["account_number","balance"]
}'

      {
  "took" : 10,
  "errors" : true,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 8,
    "max_score" : null,
    "hits" : [ {
      "_index" : "bank",
      "_type" : "account",
      "_id" : "1",
      "_score" : null,
      "_source" : {
        "account_number" : 1,
        "balance" : 39225
      },
      "sort" : [ 39225 ]
    }, {
      "_index" : "bank",
      "_type" : "account",
      "_id" : "13",
      "_score" : null,
      "_source" : {
        "account_number" : 13,
        "balance" : 32838
      },
      "sort" : [ 32838 ]
    }, {
      "_index" : "bank",
      "_type" : "account",
      "_id" : "37",
      "_score" : null,
      "_source" : {
        "account_number" : 37,
        "balance" : 18612
      },
      "sort" : [ 18612 ]
    } ]
  }
}

根据字段查询选定条件的

查 account_number=37的

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match":{"account_number":37} },
"_source":["account_number","balance"]
}'

match_phrase似乎跟match 是一样的(返回结果好像是一样的)

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_phrase":{"account_number":37} },
"_source":["account_number","balance"]
}'

      {
  "took" : 33,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.30685282,
    "hits" : [ {
      "_index" : "bank",
      "_type" : "account",
      "_id" : "37",
      "_score" : 0.30685282,
      "_source" : {
        "account_number" : 37,
        "balance" : 18612
      }
    } ]
  }
}

bool 语法（must,must_not,should）

相当于 and not or

查 gender==M and age==32

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
  {
    "query": {
      "bool": {
          "must": [
                  { "match": { "gender": "M" } },
                  { "match": { "age": "32" } }
          ]
      }
    }
  }'

查 age==31 or age==32

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
  {
    "query": {
      "bool": {
          "should": [
                  { "match": { "age": "31" } },
                  { "match": { "age": "32" } }
          ]
      }
    }
  }'

查 (gender=M and age==32) and( balance!= 32838 and balance!= 18612 )

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
  {
    "query": {
      "bool": {
          "must": [
                  { "match": { "gender": "M" } },
                  { "match": { "age": "32" } }
          ],
          "must_not": [
                  { "match": { "balance" : 32838} },
                  { "match": { "balance" : 18612} }
          ]
      }
    }
  }'

filter 结果集的过滤

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
  {
    "query": {
      "bool": {
          "must": [{ "match": { "gender": "M" } }],
          "filter": {"range":{ "balance":{"gte":10,"lte":20000}}}
      }
      }
  }'

Aggregations(合计) ==sql group by

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
    "size": 0,
    "aggs": {
        "group_by_state_just_a_name": {
            "terms":{"field":"state"}
        }
    }
}'

这里terms 是按field:state 进行统计其数量，以计 {state:”mystate”,count:100} 的形式返回 size=0 意思是说只返回统计结果即下面的 aggregations里的数据,而hits 结果为空

SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC

      {
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 8,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_state" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "il",
        "doc_count" : 1
      }, {
        "key" : "in",
        "doc_count" : 1
      }, {
        "key" : "md",
        "doc_count" : 1
      }, {
        "key" : "ok",
        "doc_count" : 1
      }, {
        "key" : "pa",
        "doc_count" : 1
      }, {
        "key" : "tn",
        "doc_count" : 1
      }, {
        "key" : "va",
        "doc_count" : 1
      }, {
        "key" : "wa",
        "doc_count" : 1
      } ]
    }
  }
}

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}'

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 8,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_state" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "il",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 39225.0
        }
      }, {
        "key" : "in",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 48086.0
        }
      }, {
        "key" : "md",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 4180.0
        }
      }, {
        "key" : "ok",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 18612.0
        }
      }, {
        "key" : "pa",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 40540.0
        }
      }, {
        "key" : "tn",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 5686.0
        }
      }, {
        "key" : "va",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 32838.0
        }
      }, {
        "key" : "wa",
        "doc_count" : 1,
        "average_balance" : {
          "value" : 16418.0
        }
      } ]
    }
  }
}

    curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      }
    }
  }
}'

        curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      }
    }
  }
}'

elasticsearch 启动参数

配置 elasticsearch 的JVM 参数

两个环境变量，尽量不要修改JAVA_OPTS,保持原样, 而修改 ES_JAVA_OPTS ES_HEAP_SIZE 设置 heap 大小(min =max=this) ES_MIN_MEM ES_MAX_MEM (分别设置min max )官方不推荐提前设置好 max-open-files ulimit -n 查看

Virtual memoryedit

Elasticsearch uses a hybrid mmapfs / niofs directory by default to store its indices. The default operating system limits on mmap counts is likely to be too low, which may result in out of memory exceptions. On Linux, you can increase the limits by running the following command as root:

sysctl -w vm.max_map_count=262144

To set this value permanently, update the vm.max_map_count setting in /etc/sysctl.conf.

If you installed Elasticsearch using a package (.deb, .rpm) this setting will be changed automatically. To verify, run sysctl vm.max_map_count.

停用 swap

sudo swapoff -a
或者
config/elasticsearch.yml
中配置 bootstrap.mlockall: true
禁止将内存中的 elasticsearch数据 交换出内存

mlockall

配置

/etc/elasticsearch/elasticsearch.yml 有些参数也可以在配置文件中配置

node.name: zjh cluster.name: my-application path.data: /data/zjh/es/data path.logs: /data/zjh/es/log network.host: 0.0.0.0 http.port: 9200

https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html centos7 上配置成service 一些配置在这两个文件里 /usr/lib/systemd/system/elasticsearch.service /usr/lib/sysctl.d/elasticsearch.conf

kibana

数据的可视化，主要功能从 elasticsearch 中查询数据，将数据以图表的形式展示

kibana 加认证功能。

默认情况下kibana 没有任何认证，如果需要放到外网访问，很不安全。 elasticsearch 官网有个插件 shield 可以实现认证及权限控制等功能。但是需要付费，目前我只需要一个简单认证就可以。可以使用nginx来挡。可以参考 http://jixiuf.github.io/blog/nginx拾遗/

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
docker/kibana		docker/kibana
es_demo		es_demo
logstash_demo		logstash_demo
Dockerfile		Dockerfile
Makefile		Makefile
readme.org		readme.org

jixiuf/logstashdocker

Folders and files

Latest commit

History

Repository files navigation