Added elasticsearch-http() destination #2509
Conversation
Build SUCCESS
this is great, thanks!
Build SUCCESS
Hi, thanks for your feedback!
Yes, you can use templates (macros) in the type and index fields, but it is at your own risk if the resolved template contains disallowed characters. There is no protection in syslog-ng for this case, so always make sure that the resolved template can only contain characters that Elasticsearch allows.
In this solution you can set it to an empty string (""), which makes the field empty, but you cannot remove it from the index request completely.
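A minimal sketch of the advice above (the index value, macro and server URL are hypothetical placeholders):

```
destination d_es_templated {
    elasticsearch-http(
        # Macros are allowed here, but syslog-ng does not validate the
        # expanded value: make sure ${MYAPP} only resolves to characters
        # Elasticsearch accepts in index names.
        index("logs-${MYAPP}")
        # type() can be set to the empty string, but the field cannot be
        # removed from the request entirely.
        type("")
        url("http://my_elastic_server:9200/_bulk")
    );
};
```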
Can't we remove the previous elasticsearch destinations in favour of this one? I mean the best would be if we only had one elasticsearch() and we would change the implementation.
Or can it not be made compatible?
Thanks
On Thu, Jan 24, 2019 at 8:15 AM Kókai Péter wrote (commenting on this pull request, in scl/elasticsearch/plugin.conf):
> @@ -113,3 +113,27 @@ block destination elasticsearch2(
`__VARARGS__`
);
};
+
+block destination elasticsearch-http-bulk(
I would split this elasticsearch-http-bulk into a different file:
scl/elasticsearch/es-http-bulk.conf.
1. @requires is global to a file: when a @requires names a module that is not present, everything after it will not be parsed. And the two elasticsearch destinations need different modules.
2. They solve a similar problem (hence the shared directory) but in a different way (hence a different file).
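The proposed layout could look roughly like this (a sketch, not the actual PR contents; the block body is an assumption about how an http()-based wrapper would be wired):

```
# scl/elasticsearch/es-http-bulk.conf
# Kept separate from plugin.conf because @requires is global to a file:
# if the http module is missing, only this file is skipped, and the
# Java-based destinations in plugin.conf still parse.
@requires http

block destination elasticsearch-http-bulk(
    index("syslog-ng")
    type("doc")
    url("http://localhost:9200/_bulk")
) {
    http(
        url(`url`)
        method("POST")
        # Bulk API: one action metadata line, then the document itself.
        body("{\"index\":{\"_index\":\"`index`\",\"_type\":\"`type`\"}}\n$(format-json --scope rfc5424)")
    );
};
```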
@bazsi Yes, this is my plan, but I want to do it in a different PR. @faxm0dem, Bazsi: in that case the URL will be the following: instead of , and when ES7 or 8 is released, it can be changed to (no type). What do you think about it?
no, don't do that: most ppl use time-based indices, e.g. syslog-${YEAR}.${MONTH}.${DAY}, so what would happen at day break with a batch containing both messages from yesterday and today?
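For reference, a time-based index is just a macro in index(); a minimal sketch (URL is a placeholder), assuming the macros are resolved per message, as the follow-up in the thread suggests:

```
destination d_es_daily {
    elasticsearch-http(
        # ${YEAR}/${MONTH}/${DAY} are resolved per message, so a batch
        # spanning midnight can target two different daily indices.
        index("syslog-${YEAR}.${MONTH}.${DAY}")
        type("syslog")
        url("http://my_elastic_server:9200/_bulk")
    );
};
```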
Hi,
I am not sure I follow; surely http() is capable of doing time-based indexes. Or is it the proposed SCL wrapper that misses that?
Thanks for your clarification.
Bazsi
On Sun, Jan 27, 2019 at 9:06 PM Fabien Wernli wrote:
no, don't do that: most ppl use time-based indices, e.g.
syslog-${YEAR}.${MONTH}.${DAY} so what would happen at day break with a
batch containing both messages from yesterday and today?
So the changes:
Build SUCCESS
I'd love to test this before it gets merged, if that's fine by you.
That would be great! Just don't forget to apply #2519 before using this SCL.
The current elastic-v2 destination allows for load-balancing. EDIT: sorry, I just saw that the load-balancing code has been there since 3.19, so multiple URLs are fine.
I just tried to set a URL string list, but syslog-ng doesn't like it.
The parser doesn't seem happy about the string list:
I just successfully tested using this config:
However, batch_lines doesn't seem to be honored:
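For comparison, a multi-URL setup that should parse (hostnames are placeholders): syslog-ng string lists are whitespace-separated quoted strings, with no commas between them:

```
destination d_es_lb {
    elasticsearch-http(
        index("my_index")
        type("my_type")
        # A string list: quoted URLs separated by whitespace, not commas.
        url("http://es-node1:9200/_bulk"
            "http://es-node2:9200/_bulk")
    );
};
```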
PR updated.
Send messages faster :) Or, if you want to flush before batch_lines() fills up, set the batch_timeout() parameter.
I tried batch_timeout, but it doesn't seem to do anything.
Build SUCCESS
Sorry, the value for batch-timeout() is in milliseconds. So set it to 60000 for 60 seconds.
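The batching options discussed above, in one sketch (values taken from the exchange; URL is a placeholder):

```
destination d_es_batched {
    elasticsearch-http(
        index("my_index")
        type("my_type")
        url("http://my_elastic_server:9200/_bulk")
        # Flush once 1024 messages have accumulated...
        batch-lines(1024)
        # ...or after 60000 ms, i.e. 60 seconds (the unit is milliseconds),
        # whichever happens first.
        batch-timeout(60000)
    );
};
```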
indeed, it works perfectly, thanks! I agree with @bazsi that there should be only one elastic destination.
any numbers you can share? CPU usage, eps etc would be very interesting.
Thanks
On Wed, Feb 6, 2019 at 9:31 AM Fabien Wernli wrote:
indeed, it works perfectly, thanks!
I agree with @bazsi <https://github.com/bazsi> that there should be only
one elastic destination.
However, it's safer to first obsolete the other destinations in 3.20.
So I suggest you add warning messages to all other elastic destinations,
that they're obsolete and will be removed in 3.21. Then we'll "promote"
this one by renaming it to elasticsearch, which will make sense as I'll
have tested it in production ;)
I'll test it using the production workload ASAP and report back here.
@bazsi I managed to test this in production (12 fairly recent PowerEdge machines with 10k-rpm spinning SATA disks running Elasticsearch, and 1 large VM with 8 vCPUs running syslog-ng and loggen). The highest throughput I could get was 30'000 messages per second: loggen
The VM's load was pretty high during the longest tests: between 0.9 and 1.1 normalized. syslog-ng config
Elastic index settings:
As you can see, the index spans all 12 nodes:
Moreover, all documents were written, no lost messages:
It seems the ES cluster isn't saturated, but the syslog-ng node is. Still, 30k eps seems pretty decent, and that's 10 times more than we need in production.
Thanks a lot for these numbers. Really appreciated.
I am somewhat disappointed by them though. I think the primary reasons are:
* your $(format-json) command line is pretty complex; unfortunately value-pairs have performance issues, which I have wanted to fix for some time now. The command line you are using is a very nice, realistic use-case to improve performance on. Still, you are running this on 8 cores, so if this were the bottleneck then the per-thread performance is roughly 3500-4000 msg/sec, which is really disappointing.
* disk buffer also has an impact, but your config partitions the traffic into 12 queues, so the disk buffer performance issues are probably not the bottleneck. Disk-wise, the 30MB/sec does not seem too demanding.
* TLS: shouldn't be an issue; 30MB/sec is not something a recent CPU couldn't handle.
So, with all that said, I am pretty sure this can and should be improved in the future, and your config example will help guide the performance tuning.
Thanks
Bazsi
On Wed, Feb 13, 2019 at 10:05 AM Fabien Wernli wrote:
@bazsi <https://github.com/bazsi> I managed to test this in production
(12 fairly recent PowerEdge machines with 10k spinning SATA disks running
Elasticsearch, 1 large VM with 8VCPU running syslog-ng and loggen).
The highest throughput I could get was 30'000 messages per second:
loggen
$ loggen -r 50000 -I 120 -P -S 127.0.0.1 514 --active-connections=8
...
average rate = 31637.14 msg/sec, count=3846907, time=121.595, (average) msg size=260, bandwidth=8032.87 kB/sec
The VM's load was pretty high during the longest tests: between 0.9 and 1.1 normalized.
The Elasticsearch nodes' load was between 0.1 and 0.4 normalized.
The syslog-ng queue contained no more than 4000 messages and was very low.
syslog-ng config
elasticsearch-http(
workers(12)
batch_lines(1024)
batch_timeout(10000)
timeout(10)
index("test-syslog_ng-elastic-http")
url("https://node221.example.com:9200/_bulk" "https://node222.example.com:9200/_bulk" "https://node223.example.com:9200/_bulk" "https://node05.example.com:9200/_bulk" "https://node08.example.com:9200/_bulk" "https://node27.example.com:9200/_bulk" "https://node53.example.com:9200/_bulk" "https://node54.example.com:9200/_bulk" "https://node55.example.com:9200/_bulk" "https://node83.example.com:9200/_bulk" "https://node84.example.com:9200/_bulk" "https://node85.example.com:9200/_bulk")
template("$(format-json -s all-nv-pairs -x __* -x tmp.* -x SOURCE -x PROGRAM -x MESSAGE -x PID -x HOST_FROM -x HOST -x LEGACY_MSGHDR -p uniqid=$UNIQID --rekey timestamp --add-prefix @ --rekey .classifier.* --add-prefix pdb --rekey .SDATA.auto.* --shift 12 --rekey .SDATA.* --shift 7 --rekey .* --shift 1)")
time-zone("UTC")
type("syslog")
tls (
ca-file('/etc/elasticsearch/coloss/ca.pem')
cert-file('/etc/syslog-ng/coloss-analyzer.crt')
key-file('/etc/syslog-ng/coloss-analyzer.key')
peer-verify(yes)
)
disk-buffer(reliable(no) dir("/var/lib/syslog-ng-disq/") disk-buf-size(53687091200) mem-buf-length(200
);
Elastic index settings:
# GET /test-syslog_ng-elastic-http/_settings
{
"test-syslog_ng-elastic-http" : {
"settings" : {
"index" : {
"creation_date" : "1550046604958",
"number_of_shards" : "12",
"number_of_replicas" : "0",
"uuid" : "mJQlJQerRhmUuEvzgc-flw",
"version" : {
"created" : "6030299"
},
"provided_name" : "test-syslog_ng-elastic-http"
}
}
}
}
As you can see the index spans all 12 nodes:
# GET /_cat/shards | grep test-syslog_ng
test-syslog_ng-elastic-http 8 p STARTED 320103 151.6mb 10.0.238.222 node222
test-syslog_ng-elastic-http 4 p STARTED 320175 151.5mb 10.0.105.59 node08
test-syslog_ng-elastic-http 2 p STARTED 320333 151.2mb 10.0.104.70 node53
test-syslog_ng-elastic-http 9 p STARTED 320612 151.5mb 10.0.108.89 node84
test-syslog_ng-elastic-http 6 p STARTED 320635 151.4mb 10.0.238.221 node221
test-syslog_ng-elastic-http 3 p STARTED 320948 151.6mb 10.0.108.2 node55
test-syslog_ng-elastic-http 11 p STARTED 320125 151.5mb 10.0.108.93 node85
test-syslog_ng-elastic-http 7 p STARTED 321130 151.4mb 10.0.108.88 node83
test-syslog_ng-elastic-http 1 p STARTED 321431 151.6mb 10.0.105.97 node27
test-syslog_ng-elastic-http 10 p STARTED 320441 151.4mb 10.0.238.223 node223
test-syslog_ng-elastic-http 5 p STARTED 320062 151.2mb 10.0.108.1 node54
test-syslog_ng-elastic-http 0 p STARTED 320912 151.4mb 10.0.104.140 node05
Moreover, all documents were written, no lost message:
# GET /test-syslog_ng-elastic-http/_count
{"count":3846907,"_shards":{"total":12,"successful":12,"skipped":0,"failed":0}}
It seems the ES cluster isn't saturated, but the syslog-ng node is.
I'm guessing this is due to the VM's size, and TLS overhead.
Still, 30keps seems pretty decent, and that's 10 times more than we need
in production.
I did an additional test: simulating the failure of one URL. This resulted in duplicated documents in Elasticsearch: the difference between sent messages and indexed messages was a multiple of the batch size.
@bazsi I replaced the template with
EDIT: there was a syntax error in my config. The correct conclusion is: simplifying the template has little influence on performance, but disabling the unreliable disk-buffer significantly increases it: I can easily get 48k msg/s.
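A rough reconstruction of the faster variant (a sketch, not the exact config: a single URL is shown for brevity, names are taken from the quoted config, and omitting disk-buffer() keeps the queue in memory):

```
destination d_es_fast {
    elasticsearch-http(
        workers(12)
        batch-lines(1024)
        index("test-syslog_ng-elastic-http")
        url("https://node221.example.com:9200/_bulk")
        # Simplified template: dump all name-value pairs as JSON.
        template("$(format-json -s all-nv-pairs)")
        # No disk-buffer() block: messages are queued in memory only,
        # trading crash-durability for throughput (30k -> 48k msg/s here).
    );
};
```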
@faxm0dem Fabien, thanks for the tests! With this configuration, the log-iw-size() of the network source should be 12 * 1024 * the number of active TCP connections (8), so log-iw-size(98304) or higher is the optimal value. This is because batching has a side-effect when batch_timeout() is set: a batch won't flush until batch_lines() or batch_timeout() is reached, but every message in a batch decreases the log_iw of the source. And another tip: you could try it without using the syslog protocol; currently there is a performance issue when the syslog protocol is used.
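The sizing rule above as a source sketch (port and transport are placeholders): workers(12) * batch-lines(1024) * 8 connections = 98304 messages can sit in unflushed batches, so the source window must cover at least that many:

```
source s_net {
    network(
        transport("tcp")
        port(514)
        # Window >= workers * batch-lines * active connections
        # (12 * 1024 * 8 = 98304); smaller values let flow control stall
        # the source while messages wait in unflushed destination batches.
        log-iw-size(98304)
    );
};
```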
I updated my comment to add the network source options. I didn't set any. @pzoleex, what do you suggest replacing syslog() with?
I managed to get 57k/s by just simplifying my log statement (removed many parsers and rewrites).
The bottleneck seems to be the Elasticsearch cluster then. Maybe it throttles incoming HTTP requests somehow?
Can you increase the number of workers to a multiple of 12? You have 12 nodes, so increasing the threads to a multiple of that means we would send requests to each node from multiple threads.
Also, I guess elastic nodes need to replicate the indexes somehow, so the best would be if our routing to nodes matched how shards are split in ES. Do you know if there is any best practice for how we should target nodes in an ES cluster for better data locality?
On Wed, Feb 13, 2019 at 10:57 AM Fabien Wernli wrote:
@bazsi <https://github.com/bazsi> I replaced the template with format-json
-s all-nv-pairs and removed the disk-buffer.
The performance is more or less the same.
I'll try with a larger maybe physical node if needed.
Hm... first of all, I wrote this before seeing PZolee's answer.
This is what I meant on routing:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html
This means that the document _id will be used for routing, i.e. it will determine where the document gets indexed. We should be able to route to the proper node accordingly.
the way I understand ES, if
In what version?
It is only available by using
@nbari Hopefully, it will be included in the next release (version 3.21).
WIP flag removed because #2519 is already merged.
@kira-syslogng test this please; |
Build SUCCESS |
This destination is based on the native http destination of syslog-ng
and uses elasticsearch bulk api (https://www.elastic.co/guide/en/elasticsearch/reference/6.5/docs-bulk.html)
Example:
destination d_elasticsearch_http {
elasticsearch-http(index("my_index")
type("my_type")
url("http://my_elastic_server:9200/_bulk"));
};
Signed-off-by: Zoltan Pallagi <pzoleex@gmail.com>