logscale: add logscale() destination #4472
Conversation
Tested with a real LogScale instance.
LGTM otherwise.
scl/logscale/logscale.conf (outdated):

    # https://library.humio.com/falcon-logscale/api-ingest.html#api-ingest-structured-data
    block destination logscale(
        url()
Shouldn't we add a default here? (cloud.humio.com)
Good idea, added.
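With the default applied, the top of the block presumably reads something like this (a sketch; only url() and token() are taken from the diff in this thread):

    block destination logscale(
        url("https://cloud.humio.com")
        token()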
scl/logscale/logscale.conf (outdated):

    headers(
        "Authorization: Bearer `token`"
        "Content-Type: `content_type`"
        "Connection: keep-alive"
Keep-alive is managed by curl (and it is on by default in HTTP/1.1), so I don't think we need this here.
Thanks, removed.
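With the "Connection: keep-alive" entry dropped, the header list would shrink to the two remaining entries (a sketch based on the hunk above; curl negotiates keep-alive on its own for HTTP/1.1):

    headers(
        "Authorization: Bearer `token`"
        "Content-Type: `content_type`"
    )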
scl/logscale/logscale.conf (outdated):

    timezone("")
    attributes("--scope rfc5424 --exclude MESSAGE --exclude DATE --leave-initial-dot")

    batch_lines(5000)
Let's change these to be a bit safer:

    batch_lines(1000)    # 1000 events per post
    batch_bytes(1024kB)
    batch_timeout(1)
    workers(20)
Thanks! I have changed batch_lines(), batch_bytes() and batch_timeout().

Is there any particular reason for setting workers(20)? In the source code there is a hard limit on the maximum number of workers, and we have multiple http() SCLs whose workers add up. We are trying to balance a bit, so that even if a user configures multiple SCL-based drivers, they won't hit the hard limit. Because of this I would keep it at 8, if you don't mind.
If the response time from the HTTP server is 500 ms (round trip), we would get about 2 posts per second per thread, limiting the throughput to around 20 Mb/s or 20k EPS if we had 10 workers; I was going for double that, as a modest-size firewall source alone can run at around that level. HTTP posts are mostly network wait state, so a higher number of workers is really needed to keep throughput up.
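For reference, a quick back-of-the-envelope calculation behind those figures, assuming the batch_lines(1000) suggested above (1000 events per post):

    1 post / 0.5 s round trip      =  2 posts/s per worker
    2 posts/s * 1000 events/post   =  2,000 EPS per worker
    2,000 EPS * 10 workers         =  20,000 EPS total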
These are good points. I think that we should probably separate the concept of a worker and a batch within our HTTP destination. We might be building batches, some of which are being transferred to the server while the others are still waiting for more data. This way threads would only be occupied while the HTTP transaction happens, so the number of threads could be a lot lower than the number of batches being built.
If the total number of threads exceeds the compile-time constant of 256, we get this message:
    msg_warning_once("Unable to allocate a unique thread ID. This can only "
                     "happen if the number of syslog-ng worker threads exceeds the "
                     "compile time constant MAIN_LOOP_MAX_WORKER_THREADS. "
                     "This is not a fatal problem but can be a cause for "
                     "decreased performance. Increase this number and recompile "
                     "or contact the syslog-ng authors",
                     evt_tag_int("max-worker-threads-hard-limit", MAIN_LOOP_MAX_WORKER_THREADS));
This only causes decreased performance in the case of source-related threads, so we might just get rid of this problem by not allocating a thread ID for HTTP worker threads, which never use the thread ID at the moment.
Merged it with workers(8), as I would like to generate a container image from master and this PR already had an approval, but I opened a PR to raise the workers to 20: #4488
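For clarity, the batching-related settings as merged would then read (a sketch combining the reviewer's suggestion with the final workers() value; the actual file may differ):

    batch_lines(1000)
    batch_bytes(1024kB)
    batch_timeout(1)
    workers(8)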
@bazsi I like what you're thinking there. The Vector agent thread-pool concept might be useful: they have a common pool for destinations and a max-outstanding-batch concept.
The logscale() destination feeds LogScale via the Ingest API.

Minimal config:
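A minimal configuration might look like this (a sketch: the token value is a placeholder, and url() falls back to the https://cloud.humio.com default discussed above):

    destination d_logscale {
        logscale(
            token("<ingest-token>")
        );
    };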
Additional options include:
url()
rawstring()
timestamp()
timezone()
attributes()
extra-headers()
content-type()