Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python Threading Ticker needs FAKETIME_DONT_FAKE_MONOTONIC #390

Closed
maurerle opened this issue May 5, 2022 · 12 comments
Closed

Python Threading Ticker needs FAKETIME_DONT_FAKE_MONOTONIC #390

maurerle opened this issue May 5, 2022 · 12 comments
Assignees

Comments

@maurerle
Copy link

maurerle commented May 5, 2022

I had the Problem that my background jobs did not execute in a docker container with mocked time through the great libfaketime library.

My MWE (minimum working example) to reproduce this issue looks like this:

import threading

import logging
from datetime import datetime
logging.basicConfig()
logging.getLogger().setLevel('INFO')
def print_something(something):
    ticker = threading.Event()
    while not ticker.wait(2):
        logging.info(f"{something} {datetime.now()}")

#Created the Threads
logging.info("starting")
t = threading.Thread(target=print_something, daemon=True, args=('hello world',))
t.start()
t.join()

This does not print anything.
Yet, if the environment variable FAKETIME_DONT_FAKE_MONOTONIC=1 is set, it works.

Exemplary Dockerfile:

FROM debian:stable-slim as builder
RUN apt-get update                                                       \
    && apt-get install --no-install-recommends -y git build-essential ca-certificates\
    && git clone https://github.com/wolfcw/libfaketime.git /libfaketime  \
    && cd /libfaketime                                                   \
    && make                                                              \
    && make install

FROM python:3.10-slim
COPY --from=builder /usr/local/lib/faketime/libfaketimeMT.so.1 /faketimeMT.so

ENV LD_PRELOAD=/faketimeMT.so
ENV FAKETIME="@2021-01-01 00:00:00"
ENV FAKETIME_DONT_RESET=1
ENV FAKETIME_DONT_FAKE_MONOTONIC=1 # <- without this it does not work

COPY test.py test.py
CMD ["python", "test.py"]

This was an issue I struggled with, so I wanted to document it so that others can find the solution.
Maybe its worth noting that FAKETIME_DONT_FAKE_MONOTONIC is not only needed for Java

@wolfcw
Copy link
Owner

wolfcw commented May 5, 2022

Thanks for reporting and posting a workaround!

Since you clone the repository and compile, have you checked whether "make test" also hangs in the container? If so, compiling with FORCE_MONOTONIC_FIX enabled in the Makefile might help.

@amurzeau
Copy link

amurzeau commented May 9, 2022

I also get this with just:

from threading import Event

event = Event()
print("Waiting")
event.wait(1.0)
print("Done")

I get this, hanging after "waiting":

doc@debian64:~/test$ cat test_sem_clockwait.py
from threading import Event, Lock

event = Event()
print("Waiting")
event.wait(1.0)
print("Done")

doc@debian64:~/test$ time python3 test_sem_clockwait.py
Waiting
Done

real    0m1,042s
user    0m0,030s
sys     0m0,011s
doc@debian64:~/test$ faketime -f '@2021-01-01 00:00:00' python3 test_sem_clockwait.py
Waiting

gdb stacktrace:

(gdb) bt
__futex_abstimed_wait_common64 (futex_word=futex_word@entry=0x55e1d61fbb20, expected=expected@entry=0, clockid=clockid@entry=1, abstime=abstime@entry=0x7fffd671d8b0, private=<optimized out>, cancel=cancel@entry=true) at ../sysdeps/nptl/futex-internal.c:87
87      ../sysdeps/nptl/futex-internal.c: Aucun fichier ou dossier de ce type.
(gdb) bt
#0  __futex_abstimed_wait_common64 (futex_word=futex_word@entry=0x555555b3b370, expected=expected@entry=0, clockid=clockid@entry=1,
    abstime=abstime@entry=0x7fffffffdfb0, private=<optimized out>, cancel=cancel@entry=true) at ../sysdeps/nptl/futex-internal.c:87
#1  0x00007ffff7dfa15b in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x555555b3b370, expected=expected@entry=0, clockid=clockid@entry=1,
    abstime=abstime@entry=0x7fffffffdfb0, private=<optimized out>) at ../sysdeps/nptl/futex-internal.c:123
#2  0x00007ffff7df6740 in do_futex_wait (sem=sem@entry=0x555555b3b370, clockid=clockid@entry=1, abstime=abstime@entry=0x7fffffffdfb0) at sem_waitcommon.c:112
#3  0x00007ffff7df67ee in __new_sem_wait_slow64 (sem=0x555555b3b370, clockid=1, abstime=0x7fffffffdfb0) at sem_waitcommon.c:184
#4  0x00007ffff7df685d in __sem_clockwait (sem=sem@entry=0x555555b3b370, clockid=clockid@entry=1, abstime=abstime@entry=0x7fffffffdfb0) at sem_clockwait.c:46
#5  0x000055555569f187 in PyThread_acquire_lock_timed (lock=0x555555b3b370, microseconds=1000000, intr_flag=1) at ../Python/thread_pthread.h:488

See https://sources.debian.org/src/python3.10/3.10.4-3/Python/thread_pthread.h/#L488

I'm using libfaketime v0.9.10 and python 3.10.4 on Debian sid.

Python 3.10.4 uses sem_clockwait which seems not implemented in libfaketime, I don't find it in https://github.com/wolfcw/libfaketime/blob/v0.9.10/src/libfaketime.c.

Using FAKETIME_DONT_FAKE_MONOTONIC=1 faketime -f '@2021-01-01 00:00:00' python3 test_sem_clockwait.py works (no hang).

@wolfcw wolfcw self-assigned this May 9, 2022
@amurzeau
Copy link

amurzeau commented May 9, 2022

I also get the hang with faketime +0minutes python3 test_sem_clockwait.py (so without time offset).
When compiling a custom program to print clock_gettime, I see that the monotonic and real time became the same with faketime:

doc@debian64:~/debian$ cat faketimetest.c
#include <semaphore.h>
#include <time.h>
#include <unistd.h>
#include <stdio.h>

int main() {
    struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
    printf("gettime CLOCK_REALTIME: %d.%09d\n", ts.tv_sec, ts.tv_nsec);
    
        clock_gettime(CLOCK_MONOTONIC, &ts);
    printf("gettime CLOCK_MONOTONIC: %d.%09d\n", ts.tv_sec, ts.tv_nsec);
}
doc@debian64:~/debian$ gcc -o faketimetest faketimetest.c 
doc@debian64:~/debian$ ./faketimetest 
gettime CLOCK_REALTIME: 1652131778.217399576
gettime CLOCK_MONOTONIC: 31483.274837271
doc@debian64:~/debian$ faketime +0minutes ./faketimetest 
gettime CLOCK_REALTIME: 1652131783.856929605
gettime CLOCK_MONOTONIC: 1652131783.856986596
doc@debian64:~/debian$ FAKETIME_FORCE_MONOTONIC_FIX=1 faketime +0minutes ./faketimetest 
gettime CLOCK_REALTIME: 1652131924.381097987
gettime CLOCK_MONOTONIC: 1652131924.381167182

With sem_clockwait with CLOCK_MONOTONIC, the glibc is calling the kernel directly with a monotonic clock, and as the kernel is expected something in the lines of 31483.274837271 (a small number of seconds), it "hangs".

Python 3.10 does this:

  • The argument is a relative timeout
  • Python converts this relative time into an absolute time using clock_gettime(CLOCK_MONOTONIC, &ts)
  • Then it adds the relative timeout
  • Then calls sem_clockwait with the result (which is around 31483s without faketime, but around 1652131924 with faketime, thus it "hangs")

@wolfcw
Copy link
Owner

wolfcw commented May 11, 2022

Thanks for your detailed analysis, all plausible. Let's hope it's related to the yet missing sem_clockwait() support. I can't reproduce with Python 3.10.4 on Linux, but likely just my glibc is too old on that system.

f706373 experimentally adds support for sem_clockwait() with no testing procedure yet, but eventually you could give it a shot.

@amurzeau
Copy link

amurzeau commented May 14, 2022

I will try the commit now.

@amurzeau
Copy link

I've tested commit f706373 with FAKETIME="+200d" LD_PRELOAD=$PWD/lib/faketime/libfaketime.so.1 python3.10 test_sem_clockwait.py

And it doesn't hang, the delay is the expected one.
When I'm using x2 or x0.5 I get the reversed behavior, where x2 slows down the real delay:

doc@debian64:faketime$ time env FAKETIME="+200d x2" LD_PRELOAD=$PWD/libfaketime.so.1 python3.10 ~/test/test_sem_clockwait.py
Waiting
Done

real    0m2,127s
user    0m0,018s
sys     0m0,022s
doc@debian64:faketime$ time env FAKETIME="+200d x0.5" LD_PRELOAD=$PWD/libfaketime.so.1 python3.10 ~/test/test_sem_clockwait.py
Waiting
Done

real    0m0,527s
user    0m0,048s
sys     0m0,000s

Compared to sleep:

doc@debian64:faketime$ time FAKETIME="+200d x2" LD_PRELOAD=$PWD/libfaketime.so.1 sleep 1

real    0m0,505s
user    0m0,000s
sys     0m0,004s
doc@debian64:faketime$ time FAKETIME="+200d x0.5" LD_PRELOAD=$PWD/libfaketime.so.1 sleep 1

real    0m2,004s
user    0m0,001s
sys     0m0,003s

So it doesn't hang anymore, but the clock speed feature is not totally ok.

@wolfcw
Copy link
Owner

wolfcw commented May 14, 2022

Thanks for testing and pointing this out! This latest commit should address the reversed factor.

@amurzeau
Copy link

amurzeau commented May 15, 2022

Good, I've not found any issues now with 859751e:

doc@debian64:faketime$ time env FAKETIME="+200d" LD_PRELOAD=$PWD/libfaketime.so.1 python3.10 ~/test/test_sem_clockwait.py
Waiting
Done

real    0m1,065s
user    0m0,035s
sys     0m0,013s
doc@debian64:faketime$ time env FAKETIME="+200d x2" LD_PRELOAD=$PWD/libfaketime.so.1 python3.10 ~/test/test_sem_clockwait.py
Waiting
Done

real    0m0,563s
user    0m0,041s
sys     0m0,000s
doc@debian64:faketime$ time env FAKETIME="+200d x0.5" LD_PRELOAD=$PWD/libfaketime.so.1 python3.10 ~/test/test_sem_clockwait.py
Waiting
Done

real    0m2,058s
user    0m0,023s
sys     0m0,019s

@maurerle
Copy link
Author

maurerle commented May 16, 2022

Hi, currently my initial test case does not work.
This broke my build, as currently there isn't any waiting time at all between the calls

leaving away FAKETIME_DONT_FAKE_MONOTONIC resolves this.
Otherwise nothing is outputted or everything is outputted very fast without watining.

@maurerle
Copy link
Author

Somehow this does also not work if

FAKETIME= is set but empty

@wolfcw
Copy link
Owner

wolfcw commented May 16, 2022

The latest commit should fix that the intercepted sem_clockwait() also honors when "don't fake monotonic" is set.

If the issue persists for you, does it happen no matter whether FAKETIME_DONT_FAKE_MONOTONIC is set or not? Does it happen with both negative and positive faketime offsets? We'd then probably need to strace a bit to figure out whether libfaketime uses wrong values or something goes wrong when the real sem_clockwait() is called.

@maurerle
Copy link
Author

Great. That works great now.

Empty faketime works too.
Increasing the multiplier works too.
Thank you very much for the fast fixes!

I will close this issue as nothing else is left to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants