https://hub.docker.com/_/logstash/ https://hub.docker.com/_/kibana/ https://hub.docker.com/_/elasticsearch/
docker pull logstash
docker pull kibana
docker pull elasticsearch
直接本目录下
sudo make
或
docker build -t najaplus/logstash:latest .
sudo docker run -t -i najaplus/logstash bash
logstash 主要功能,收集与解析数据, 从 日志文件或redis 队列中解析出日志数据,经过logstash 一些filter plugin 解析转换为格式化的内容(json) ,存储到某个地方( 可以是redis ,file,elasticseatch), 以便进行数据分析与统计
在镜像中运行
logstash -e 'input { stdin { } } output { stdout {} }'
此时会进入交互式提示界面 ,然后输入任意字符如 “hello” 会有如下提示
2016-05-06T17:30:17.219Z 40b8994c31cb hello
https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html
logstash_demo 目录下有几个demo
logstash -f logstash1.conf
logstash -f logstash1.conf --auto-reload
主要功能, 数据的存储与检索,基于json 格式存储, 基于Lucene
集群有个名字, 防止不同集群间混淆
elasticsearch --cluster.name my_cluster_name --node.name my_node_name
默认名称 elasticsearch
一个cluster 下分多个node
elasticsearch --cluster.name my_cluster_name --node.name my_node_name
默认名称 elasticsearch 如果只启动一个node 相当于
elasticsearch --cluster.name elasticsearch --node.name elasticsearch
一些具有共性的documents 的集合,可以认为 一个 index 就是一个类别,比如我把游戏1的信息存到index1上 游戏2的信息存 到index2上index 有名字一个cluster 可以有任意多个index
Index 下可以定义多个Type 可以认为是编程语言里的struct/class a type is defined for documents that have a set of common fields.
docuemnt 存储具体和信息,如果Type 当成java里的Class ,则Document 可以认为是这个Class 的实例 在 elasticsearch中它实际是一个json
架构层面上的东西 , 可以进行分片,及主从 这个可以参考mysql 的数据库分片 与主从 一个index 内的数据量可能极大,可以将index 分成多片(称为shard) 当定义index 时 ,可以指定将此index 分成n个shards
Each shard is in itself a fully-functional and independent “index” that can be hosted on any node in the cluster. 可以认为index 与shards是逻辑上的, 而 cluster 与node 上架构上的。 一个index 分为多个shards(s1,s2,s3) ,一个cluster 分为多个node(n1,n2,n3) 将shard s1,s2,s3 分别放到n1,n2,n3节点上
而Replication 可以认为是从库, 存在的意义,
- 备份,及当某个node 挂了 可以failover, 以保证 高可用性(hight available)
- 查询可以在从库上进行
默认情况下 一个index 有5个shard, 每个shard 有一个 replica shards,即共有10个shards 通常情况下 replica shard 肯定跟primary shard 不在同一个节点上(这样从库还真正有意义)
elasticsearch 基于Lucene,而Lucene 使用java 编写,所以java jdk 是安装所必须的
elasticsearch
或
elasticsearch --cluster.name my_cluster_name --node.name my_node_name
启动之后9200端口会监听http 请求
curl 'localhost:9200/_cat/health?v'
deployer@iZ94badqop7Z logstash_demo/demo1 (master) $ curl ‘localhost:9200/_cat/health?v’ epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent 1462637474 00:11:14 elasticsearch yellow 1 1 5 5 0 0 5 0 - 50.0%
curl 'localhost:9200/_cat/nodes?v'
deployer@iZ94badqop7Z logstash_demo/demo1 $ host ip heap.percent ram.percent load node.role master name 120.24.77.58 120.24.77.58 7 92 0.17 d * zjh
curl 'localhost:9200/_cat/indices?v'
health status index pri rep docs.count docs.deleted store.size pri.store.size yellow open logstash-2016.05.07 5 1 6 0 17.3kb 17.3kb
可以看到index 的名字, primary个数 ,replica个数 ,docuemnts数量,
curl -XPUT 'localhost:9200/customer?pretty'
{ “acknowledged” : true }
curl -XDELETE 'localhost:9200/customer?pretty'
这里在index:customer上创建了一个type 为 external id=1的document 如果id=1的已经存在,则会替换之
curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{
"name": "John Doe"
}'
{ “_index” : “customer”, “_type” : “external”, “_id” : “1”, “_version” : 1, “_shards” : { “total” : 2, “successful” : 1, “failed” : 0 }, “created” : true }
实际情况上 ,在创建document 时, 不必手动去创建相应的index,执行上述命令, 如果没有index:customer,则会自动创建
curl -XPUT ‘localhost:9200/custome2r/external/1?pretty’ -d ’ { “name”: “John Doe” }’
curl -XGET 'localhost:9200/customer/external/1?pretty'
{ “_index” : “customer”, “_type” : “external”, “_id” : “1”, “_version” : 1, “found” : true, “_source” : { “name” : “John Doe” } }
update 实际是先删除后增加
curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
"doc": { "name": "Jane Doe","age":11 }
}'
通过script 修改age 的值 +5 script 文档 https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting.html
curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
"script" : "ctx._source.age += 5"
}'
目前的版本,script 操作只能会对一个docuemnt ,以后或许会支持类似于sql update 的操作 ,同时修改多个
curl -XDELETE 'localhost:9200/customer/external/2?pretty'
同时创建id=1,2的 type:external
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'
修改一个, 同时删除另一个
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
假如有文件 account.json
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA"}
{"index":{"_id":"18"}}
{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","employer":"Boink","email":"daleadams@boink.com","city":"Orick","state":"MD"}
{"index":{"_id":"20"}}
{"account_number":20,"balance":16418,"firstname":"Elinor","lastname":"Ratliff","age":36,"gender":"M","address":"282 Kings Place","employer":"Scentric","email":"elinorratliff@scentric.com","city":"Ribera","state":"WA"}
curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary "@account.json"
curl 'localhost:9200/_cat/indices?v
curl 'localhost:9200/_cat/indices?v'
health index pri rep docs.count docs.deleted store.size pri.store.size
yellow bank 5 1 1000 0 424.4kb 424.4kb
# 两种方式, 一种通过参数 ,一种通过request body 发送json内容
curl 'localhost:9200/bank/_search?q=*&pretty'
#或
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} }
}'
{
"took" : 63,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1000,
"max_score" : 1.0,
"hits" : [ {
"_index" : "bank",
"_type" : "account",
"_id" : "1",
"_score" : 1.0, "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
}, {
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : 1.0, "_source" : {"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
}
...
]}}
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"size": 1
}'
from 控制从哪条记录起始(0based) sort:使用balance 降序排列
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"from":2,
"size": 3,
"sort": { "balance": { "order": "desc" } }
}'
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_all": {} },
"from":2,
"size": 3,
"sort": { "balance": { "order": "desc" } },
"_source":["account_number","balance"]
}'
{
"took" : 10,
"errors" : true,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 8,
"max_score" : null,
"hits" : [ {
"_index" : "bank",
"_type" : "account",
"_id" : "1",
"_score" : null,
"_source" : {
"account_number" : 1,
"balance" : 39225
},
"sort" : [ 39225 ]
}, {
"_index" : "bank",
"_type" : "account",
"_id" : "13",
"_score" : null,
"_source" : {
"account_number" : 13,
"balance" : 32838
},
"sort" : [ 32838 ]
}, {
"_index" : "bank",
"_type" : "account",
"_id" : "37",
"_score" : null,
"_source" : {
"account_number" : 37,
"balance" : 18612
},
"sort" : [ 18612 ]
} ]
}
}
查 account_number=37的
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match":{"account_number":37} },
"_source":["account_number","balance"]
}'
match_phrase似乎跟match 是一样的(返回结果好像是一样的)
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": { "match_phrase":{"account_number":37} },
"_source":["account_number","balance"]
}'
{
"took" : 33,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.30685282,
"hits" : [ {
"_index" : "bank",
"_type" : "account",
"_id" : "37",
"_score" : 0.30685282,
"_source" : {
"account_number" : 37,
"balance" : 18612
}
} ]
}
}
相当于 and not or
查 gender==M and age==32
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
"bool": {
"must": [
{ "match": { "gender": "M" } },
{ "match": { "age": "32" } }
]
}
}
}'
查 age==31 or age==32
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
"bool": {
"should": [
{ "match": { "age": "31" } },
{ "match": { "age": "32" } }
]
}
}
}'
查 (gender=M and age==32) and( balance!= 32838 and balance!= 18612 )
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
"bool": {
"must": [
{ "match": { "gender": "M" } },
{ "match": { "age": "32" } }
],
"must_not": [
{ "match": { "balance" : 32838} },
{ "match": { "balance" : 18612} }
]
}
}
}'
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"query": {
"bool": {
"must": [{ "match": { "gender": "M" } }],
"filter": {"range":{ "balance":{"gte":10,"lte":20000}}}
}
}
}'
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state_just_a_name": {
"terms":{"field":"state"}
}
}
}'
这里terms 是按field:state 进行统计其数量, 以计 {state:”mystate”,count:100} 的形式返回 size=0 意思是说只返回 统计结果 即下面的 aggregations里的数据,而hits 结果为空
SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 8,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"group_by_state" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ {
"key" : "il",
"doc_count" : 1
}, {
"key" : "in",
"doc_count" : 1
}, {
"key" : "md",
"doc_count" : 1
}, {
"key" : "ok",
"doc_count" : 1
}, {
"key" : "pa",
"doc_count" : 1
}, {
"key" : "tn",
"doc_count" : 1
}, {
"key" : "va",
"doc_count" : 1
}, {
"key" : "wa",
"doc_count" : 1
} ]
}
}
}
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state",
"order": {
"average_balance": "desc"
}
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}'
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 8,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"group_by_state" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ {
"key" : "il",
"doc_count" : 1,
"average_balance" : {
"value" : 39225.0
}
}, {
"key" : "in",
"doc_count" : 1,
"average_balance" : {
"value" : 48086.0
}
}, {
"key" : "md",
"doc_count" : 1,
"average_balance" : {
"value" : 4180.0
}
}, {
"key" : "ok",
"doc_count" : 1,
"average_balance" : {
"value" : 18612.0
}
}, {
"key" : "pa",
"doc_count" : 1,
"average_balance" : {
"value" : 40540.0
}
}, {
"key" : "tn",
"doc_count" : 1,
"average_balance" : {
"value" : 5686.0
}
}, {
"key" : "va",
"doc_count" : 1,
"average_balance" : {
"value" : 32838.0
}
}, {
"key" : "wa",
"doc_count" : 1,
"average_balance" : {
"value" : 16418.0
}
} ]
}
}
}
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_age": {
"range": {
"field": "age",
"ranges": [
{
"from": 20,
"to": 30
},
{
"from": 30,
"to": 40
},
{
"from": 40,
"to": 50
}
]
}
}
}
}'
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_age": {
"range": {
"field": "age",
"ranges": [
{
"from": 20,
"to": 30
},
{
"from": 30,
"to": 40
},
{
"from": 40,
"to": 50
}
]
}
}
}
}'
配置 elasticsearch 的JVM 参数
两个环境变量, 尽量不要修改JAVA_OPTS,保持原样, 而修改 ES_JAVA_OPTS ES_HEAP_SIZE 设置 heap 大小(min =max=this) ES_MIN_MEM ES_MAX_MEM (分别设置min max )官方不推荐 提前设置好 max-open-files ulimit -n 查看
Virtual memoryedit
Elasticsearch uses a hybrid mmapfs / niofs directory by default to store its indices. The default operating system limits on mmap counts is likely to be too low, which may result in out of memory exceptions. On Linux, you can increase the limits by running the following command as root:
sysctl -w vm.max_map_count=262144
To set this value permanently, update the vm.max_map_count setting in /etc/sysctl.conf.
If you installed Elasticsearch using a package (.deb, .rpm) this setting will be changed automatically. To verify, run sysctl vm.max_map_count.
sudo swapoff -a
或者
config/elasticsearch.yml
中配置 bootstrap.mlockall: true
禁止将内存中的 elasticsearch数据 交换出内存
/etc/elasticsearch/elasticsearch.yml 有些参数也可以在配置文件中配置
node.name: zjh cluster.name: my-application path.data: /data/zjh/es/data path.logs: /data/zjh/es/log network.host: 0.0.0.0 http.port: 9200
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html centos7 上 配置成service 一些配置 在这两个文件里 /usr/lib/systemd/system/elasticsearch.service /usr/lib/sysctl.d/elasticsearch.conf
数据的可视化, 主要功能 从 elasticsearch 中查询数据, 将数据以图表的形式展示
默认情况下kibana 没有任何认证,如果需要放到外网访问, 很不安全。 elasticsearch 官网有个插件 shield 可以实现认证及权限控制等功能。 但是需要付费, 目前我只需要一个简单认证就可以。 可以使用nginx来挡。 可以参考 http://jixiuf.github.io/blog/nginx拾遗/