Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialize struct epoll_event to all zeros #14353

Closed

Conversation

@mine260309
Copy link

mine260309 commented Dec 16, 2019

The struct epoll_event may contain padding bytes, depend on the
architecture:

struct epoll_event
{
      uint32_t events;      /* Epoll events */
        epoll_data_t data;    /* User data variable */
} __EPOLL_PACKED;

On x86-64, it defines __EPOLL_PACKED with packed attribute,
so it has no padding bytes.
On other achitectures, e.g. ppc64le, __EPOLL_PACKED is not defined
so it has 4 padding bytes internally.

With GCC bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992, the
below code snippet leaves the padding bytes un-initialized:

struct epoll_event ev;
ev = (struct epoll_event) {
        .events = EPOLLIN,
        .data.ptr = d,
};

So valgrind will report error:

Syscall param epoll_ctl(event) points to uninitialised byte(s)

Also it leaks information and is thus insecure.

Before GCC bug is fixed, workaround it by initialize the struct with 0
to fix this issue.

Signed-off-by: Lei YU mine260309@gmail.com

The struct epoll_event may contain padding bytes, depend on the
architecture:

    struct epoll_event
    {
          uint32_t events;      /* Epoll events */
            epoll_data_t data;    /* User data variable */
    } __EPOLL_PACKED;

On x86-64, it defines __EPOLL_PACKED with __packed__ attribute,
so it has no padding bytes.
On other achitectures, e.g. ppc64le, __EPOLL_PACKED is not defined
so it has 4 padding bytes internally.

With GCC bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992, the
below code snippet leaves the padding bytes un-initialized:

    struct epoll_event ev;
    ev = (struct epoll_event) {
            .events = EPOLLIN,
            .data.ptr = d,
    };

So valgrind will report error:

 Syscall param epoll_ctl(event) points to uninitialised byte(s)

Also it leaks information and is thus insecure.

Before GCC bug is fixed, workaround it by initialize the struct with 0
to fix this issue.

Signed-off-by: Lei YU <mine260309@gmail.com>
@@ -386,7 +386,7 @@ static int source_io_register(
int enabled,
uint32_t events) {

struct epoll_event ev;
struct epoll_event ev = {0};

This comment has been minimized.

Copy link
@poettering

poettering Dec 16, 2019

Member

hmm, why would this be any different then literally the same initialization a few lines down?

Also, epoll_ctl() doesn't really care about the padding. Isn't this something to fix in valgrind?

This comment has been minimized.

Copy link
@poettering

poettering Dec 16, 2019

Member

also, we generally use the GNU C extension that allows us to write = {} rather than = {0}

This comment has been minimized.

Copy link
@mine260309

mine260309 Dec 16, 2019

Author

valgrind is actually right, because in the epoll_ctl syscall, it uses copy_from_user to copy the whole struct into kernel space, and thus the uninitialized bytes are leaked into kernel.

    copy_from_user(&epds, event, sizeof(struct epoll_event)))

This comment has been minimized.

Copy link
@poettering

poettering Dec 16, 2019

Member

but that's pretty much OK, no? its pading, so noone should care. And the kernel is more privileged anyway, so you can't really say "leak into". If we'd leak the padding into less privileged stuff this would be a problem, but the other direction?

This comment has been minimized.

Copy link
@mine260309

mine260309 Dec 16, 2019

Author

also, we generally use the GNU C extension that allows us to write = {} rather than = {0}

OK, I could update the code to use = {}

but that's pretty much OK, no? its pading, so noone should care. And the kernel is more privileged anyway, so you can't really say "leak into". If we'd leak the padding into less privileged stuff this would be a problem, but the other direction?

I agree it's not really a bug, especially if GCC is doing right we do not need this change after all.

But isn't it better to fix such uninitialized bytes when valgrind helps to identify the issue? (if we accept that GCC is not going to fix it soon, the bug was opened in 2016)

This comment has been minimized.

Copy link
@keszybz

keszybz Dec 17, 2019

Member

why would this be any different then literally the same initialization a few lines down?

I think that this is the biggest issue. Because of some accident of implementation, one initialization has a different effect than the other one. We shouldn't base our work-around something so flimsy.

I'd rather see the declaration moved down to the point of initialization and the cast removed:

struct epoll_event ev = {
         .events = events | (enabled == SD_EVENT_ONESHOT ? EPOLLONESHOT : 0),
          .data.ptr = s,
};

Would this help with your issue?

This comment has been minimized.

Copy link
@mine260309

mine260309 Dec 19, 2019

Author

Nope, the problem is actually in GCC with bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992

It does not matter if we write

struct epoll_event ev;
ev = (struct epoll_event) {...};

or

struct epoll_event ev = {...};

In both cases, GCC generates uninitialized data if the struct has padding bytes.
Example on coliru: http://coliru.stacked-crooked.com/a/a9a907bda006d9cd

This comment has been minimized.

Copy link
@poettering

poettering Dec 23, 2019

Member

but shouldn't valgrind be fixed to not complain about non-initialized padding to be passed to kernel apis where it's known-safe?

I mean, i can see why complaining about that in write() makes sense, but i fail to see what could be wrong here with epoll. valrgind is just generating incorrect warnings.

This comment has been minimized.

Copy link
@mine260309

mine260309 Jan 2, 2020

Author

I agree that it's valid code and it should not cause anything wrong at runtime.
But I also appreciate Valgrind telling that there are non-initialized bytes that are accessed, which potentially may cause issues.

So I think manually initializing the padding to 0 is an acceptable workaround.

@poettering

This comment has been minimized.

Copy link
Member

poettering commented Jan 2, 2020

This is tracked here in valgrind btw:

https://bugs.kde.org/show_bug.cgi?id=415621

I am pretty sure we should treat this as valgrind bug and let them fix that. leaking uninitialized padding to the kernel is not really leaking, it can read all our data anyway hence i doubt we should work-around that.

@poettering

This comment has been minimized.

Copy link
Member

poettering commented Jan 2, 2020

another option might be to use VALGRIND_MAKE_MEM_DEFINED() to make valgrind assume the padding is initialized. We do this for the LOOP_GET_STATUS64 (though in the other direction) already, and this might be ok to (temporarily...) do here too, given that we don't have too many calls into epoll_wait()

@mine260309

This comment has been minimized.

Copy link
Author

mine260309 commented Jan 8, 2020

Thanks @poettering for the information.

@keszybz

This comment has been minimized.

Copy link
Member

keszybz commented Jan 8, 2020

use VALGRIND_MAKE_MEM_DEFINED()

I don't think this is a good idea. The memory isn't initialized, and doing this could obscure other bugs. In particular, the padding could be read by our code (in case of a programming mistake), and then the use of VALGRIND_MAKE_MEM_DEFINED() would hide this fact.

The other two solutions being kicked around (teaching gcc to optionally initialize padding bytes and teaching valgrind that it is OK to pass unitialized padding to the kernel) are both useful, separately and in combination. In particular, I think the first is important for kernel code, and the issue of leakage will not be comprehensively solved without gcc being smarter. But the solution in this PR only works because of some implementation detail, and I don't think we should merge this, sorry!

In fact, we should move the the code around to initialize the structure at declaration time to make the code shorter and more readable. I'll submit a patch to just that. But then the trick in this PR cannot be done, and neither can an explicit call to memset or secure_memset be inserted.

@keszybz keszybz closed this Jan 8, 2020
keszybz added a commit to keszybz/systemd that referenced this pull request Jan 8, 2020
keszybz added a commit to keszybz/systemd that referenced this pull request Jan 8, 2020
geissonator pushed a commit to openbmc/sdbusplus that referenced this pull request Jan 16, 2020
On latest OpenBMC, sdbusplus CI fails due to valgrind check like below
on ppc64le systems:

    ==5290== Syscall param epoll_ctl(event) points to uninitialised byte(s)
    ==5290==    at 0x4F2FB08: epoll_ctl (syscall-template.S:79)
    ==5290==    by 0x493A8F7: UnknownInlinedFun (sd-event.c:961)
    ==5290==    by 0x493A8F7: sd_event_add_time (sd-event.c:1019)
    ==5290==    by 0x190BB3: phosphor::Timer::Timer(sd_event*, std::function<void ()>) (timer.hpp:62)
    ==5290==    by 0x192B93: TimerTest::TimerTest() (timer.cpp:25)
    ==5290==    by 0x193A13: TimerTest_timerExpiresAfter2seconds_Test::TimerTest_timerExpiresAfter2seconds_Test() (timer.cpp:85)
    ==5290==    by 0x19E11F: testing::internal::TestFactoryImpl<TimerTest_timerExpiresAfter2seconds_Test>::CreateTest() (gtest-internal.h:472)
    ==5290==    by 0x4A52D3B: HandleSehExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*> (gtest.cc:2447)
    ==5290==    by 0x4A52D3B: testing::Test* testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test
    * (testing::internal::TestFactoryBase::*)(), char const*) (gtest.cc:2483)
    ==5290==    by 0x4A40BCB: Run (gtest.cc:2693)
    ==5290==    by 0x4A40BCB: testing::TestInfo::Run() (gtest.cc:2676)
    ==5290==    by 0x4A40DC3: Run (gtest.cc:2825)
    ==5290==    by 0x4A40DC3: testing::TestCase::Run() (gtest.cc:2810)
    ==5290==    by 0x4A414AF: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:5216)
    ==5290==    by 0x4A5329B: HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (gtest.cc:2447)
    ==5290==    by 0x4A5329B: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::
    *)(), char const*) (gtest.cc:2483)
    ==5290==    by 0x4A416AF: testing::UnitTest::Run() (gtest.cc:4824)
    ==5290==    by 0x4A90917: RUN_ALL_TESTS (gtest.h:2370)
    ==5290==    by 0x4A90917: main (gmock_main.cc:69)
    ==5290==  Address 0x1fff00eafc is on thread 1's stack
    ==5290==  in frame #0, created by epoll_ctl (syscall-template.S:78)
    ==5290==

The root cause is:
1. GCC has a bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992,
   that the padding bytes are not initialized.
2. In systemd, the code in [libsystemd/sd-event/sd-event.c][1] using
   epoll_event hit the above bug:

       typedef union epoll_data
       {
         void *ptr;
         int fd;
         uint32_t u32;
         uint64_t u64;
       } epoll_data_t;

       struct epoll_event
       {
         uint32_t events;      /* Epoll events */
         epoll_data_t data;    /* User data variable */
       } __EPOLL_PACKED;

   In glibc, on x86, `__EPOLL_PACKED` is defined as `__attribute__
   ((__packed__))`[2], so it's packed and there are no internal padding
   bytes;
   On other architectures (e.g. ppc64le), __EPOLL_PACKED is not defined
   and thus there are 4 internal padding bytes between `events` and
   `data`, that are not initialized.
3. When epoll_ctl() is invoked, in [Linux kernel][3], it does a
   copy_from_user() to copy the whole struct into kernel space. That's
   why Valgrind reports "epoll_ctl(event) points to uninitialised
   byte(s)", only for non-x86 platforms.
4. The timer test in this repo invokes sd_event_add_time() and
   eventually hit the above code.

The GCC bug is not resolved from 2016.
The systemd code is actually correct. There is a [PR][4] sent to systemd
trying to workaround the issue, and it is being discussed.

Before GCC bug is resolved or systemd PR is accepted, suppress the
valgrind error to make it pass the CI.

[1]: https://github.com/systemd/systemd/blob/v242/src/libsystemd/sd-event/sd-event.c#L961
[2]: https://github.com/bminor/glibc/blob/f1a0eb5b6762b315517469da47735c51bde6f4ad/sysdeps/unix/sysv/linux/x86/bits/epoll.h#L29
[3]: https://github.com/torvalds/linux/blob/d1eef1c619749b2a57e514a3fa67d9a516ffa919/fs/eventpoll.c#L2095
[4]: systemd/systemd#14353

Signed-off-by: Lei YU <mine260309@gmail.com>
Change-Id: I3a29fd0e28e5c7b69037a7ef2ef9571f93d80df7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants
You can’t perform that action at this time.