Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time_Format %s is not handled properly #2388

Closed
LionelCons opened this issue Jul 23, 2020 · 9 comments
Closed

Time_Format %s is not handled properly #2388

LionelCons opened this issue Jul 23, 2020 · 9 comments

Comments

@LionelCons
Copy link
Contributor

Time_Format could be %s which is documented in strptime as "The number of seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC)".

Unfortunately, Fluent Bit does not treat it as the number of seconds since the Epoch and it applies an offset.

Here is the configuration used to reproduce the issue:

[PARSER]
    Name         test
    Format       regex
    Regex        ^msg=audit\((?<time>\d+):(?<counter>\d+)\):\s+(?<message>.*?)\s*$
    Time_Key     time
    Time_Format  %s

[INPUT]
    Name   stdin
    Tag    test

[FILTER]
    Name          parser
    Match         *
    Parser        test
    Key_Name      raw
    Preserve_Key  On
    Reserve_Data  On

[OUTPUT]
    Name   stdout
    Match  *

Then:

-bash-4.2$ echo '{"raw":"msg=audit(1595492168:14751): pid=42558"}' |  fluent-bit -c test.conf 
Fluent Bit v1.5.1
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/07/23 14:07:19] [ info] [engine] started (pid=15929)
[2020/07/23 14:07:19] [ info] [storage] version=1.0.4, initializing...
[2020/07/23 14:07:19] [ info] [storage] in-memory
[2020/07/23 14:07:19] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/07/23 14:07:19] [ info] [sp] stream processor started
[2020/07/23 14:07:19] [ warn] [input:stdin:stdin.0] end of file (stdin closed by remote end)
[0] test: [1595499368.000000000, {"counter"=>"14751", "message"=>"pid=42558", "raw"=>"msg=audit(1595492168:14751): pid=42558"}]
[2020/07/23 14:07:19] [ warn] [engine] service will stop in 5 seconds
[2020/07/23 14:07:24] [ info] [engine] service stopped

You see that the event timestamp is 1595499368 and it differs from 1595492168 by two hours.

@edsiper
Copy link
Member

edsiper commented Jul 25, 2020

Thanks for reporting this issue.

I was able to isolate the problem in the following test case:

/* -*- Mode: C; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */


#define _XOPEN_SOURCE
#include <time.h>

#include <stdio.h>
#include <stdlib.h>

int main()
{
    char *p;
    char *secs = "1595492168";
    char out[32];
    struct tm tm;

    p = strptime(secs, "%s", &tm);

    strftime(out, sizeof(out) - 1, "%s", &tm);
    printf("input      => %s\n", secs);
    printf("strftime() => %s\n", out);
    printf("timegm()   => %lu\n", (long) timegm(&tm));
    printf("mktime()   => %lu\n", (long) mktime(&tm));

    return 0;
}

testing the output we get:

$ ./strptime 
input      => 1595492168
strftime() => 1595492168
timegm()   => 1595470568
mktime()   => 1595492168

In our code, we are using timegm(3) which looks incorrect, it should be mktime(3)

@fujimotos would you please re-validate my findings on this ?, the guilty function is this:

we need to evaluate any side effects of using mktime(3) and add a unit test to avoid this kind of problem

@fujimotos
Copy link
Member

In our code, we are using timegm(3) which looks incorrect, it should be mktime(3)

@eduardo IIUC, if we use mktime() here, it will break the more
normal cases such as:

'1970-01-01 02:00:00 +0100' ('%Y-%m-%d %H:%M:%S %z')

Indeed:

strptime("1970-01-01 02:00:00 +0100", '%Y-%m-%d %H:%M:%S %z', &tm)

// Current logic
ret = flb_parser_tm2time(&tm);
printf("%li", ret); // 3600 (correct)

// Replace timegm with mktime()
ret = flb_parser_tm2time_mktime(&tm);
printf("%li", ret); // -28800 (???)

Talking about an actionable mitigation measure, I think @LionelCons can set
TimeOffset to avoid the conversion problem. For example:

[PARSER]
    Name         test
    Format       regex
    Regex        ^(?<time>.+)$
    Time_Key     time
    Time_Format  %s
    TIme_Offset  +0400

This should resolve the issue described in OP.

@LionelCons
Copy link
Contributor Author

As discussed in #593, I cannot use TIme_Offset because of DST.

In any case, the fact that %s does not return the exact same timestamp as the source string is IMHO a bug.

@LionelCons
Copy link
Contributor Author

LionelCons commented Jul 27, 2020

Hmmm... I think I have found where the problem is.

flb_parser_create tries to guess whether the time format includes the TZ or not. This is at https://github.com/fluent/fluent-bit/blob/master/src/flb_parser.c#L212.

This logic fails to see that %s implicitly contains the TZ (since it is UTC as per its definition).

Adding something like || strstr(p->time_fmt, "%s") seems to fix the bug. It does in my test at least.

@LionelCons
Copy link
Contributor Author

Looking elsewhere in the same file, it seems that the "year detection" at https://github.com/fluent/fluent-bit/blob/master/src/flb_parser.c#L187 is also broken.

The code only checks for strstr(p->time_fmt, "%Y") || strstr(p->time_fmt, "%y") and fails to see that %s also does not need the year hack for dates that do not contain a year.

@edsiper
Copy link
Member

edsiper commented Aug 3, 2020

hmm, the main problem relies on the handling of tm_gmtoff field structure from struct tm by strptime(3), and it's a mess, for short:

"strptime(2) implementation works differently on glibc, bionic, freebsd...etc"

when a %s is specified, some implementations prefers to convert the parsed time internally with localtime_r(3) while others with gmtime_r(3):

and we are not alone on this:

@fujimotos : this is not new :) ! remember a while ago you added a BSD copy as flb_strptime.c. Looking at the code the %s translation is correct. Please check if moving from system strptime(3) to flb_strptime(3) is enough to fix the problem.

@edsiper
Copy link
Member

edsiper commented Aug 3, 2020

note: the year detection as pointed out by @LionelCons is required, so when %s is detected the field time_with_year must be set.

@fujimotos
Copy link
Member

I submit a patch to use our strptime(3) implementation to #2433.
With this, I can confirm that OP issue is resolved.

Before:

$ echo '{"raw":"10000"}' | ./bin/fluent-bit -R test.parser -c test.conf
...
[0] test: [42400.000000000, {"raw"=>"10000"}]

After:

$ echo '{"raw":"10000"}' | ./bin/fluent-bit -R test.parser -c test.conf
...
[0] test: [10000.000000000, {"raw"=>"10000"}]

The year hack issue mentioned by #2388 (comment) is also fixed by #2432.
I hope these patches help.

@fujimotos
Copy link
Member

Already fixed as of 1.6.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants