Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revised timer code due to crashes on some iPads. #3667

Open
wants to merge 1 commit into
base: 1.14
from

Conversation

@singalen
Copy link
Contributor

commented Oct 28, 2018

It's mostly a request for help.

Frankly, I still don't see how the mutex a timer_callback() takes can be already taken by the SAME thread. If anyone can see why else the lock guard could abort in timer_callback(), please let me know.

What's done here:

  • Bury dead code,
  • the lock cam be R/W,
  • use atomic counter to avoid locking it,
  • std::unordered_map is faster than a std::ordered_map.

@singalen singalen force-pushed the singalen:1.14 branch from 2e3abbe to c9f5573 Oct 28, 2018

@jyrkive
Copy link
Member

left a comment

I recommend dealing with timer IDs in a different way.

{
std::lock_guard<std::mutex> lock(timers_mutex);
read_lock_guard lock(timers_mutex);

This comment has been minimized.

Copy link
@jyrkive

jyrkive Oct 28, 2018

Member

There is a possibility that the ID is acquired by another thread while you aren't holding either read or write lock.

You need to hold a write lock for the entire duration of this function.

This comment has been minimized.

Copy link
@singalen

singalen Oct 28, 2018

Author Contributor

Thanks, but how can this happen? It's a) only accessed in a single protected block, and b) it's incremented before it's read, so each new thread will acquire a new value.

I'm specifically trying to reduce the time spent in the lock.

This comment has been minimized.

Copy link
@jyrkive

jyrkive Oct 29, 2018

Member

Pretty much nothing in this function takes a notable amount of time. IMHO, not holding a write lock for the whole duration of this function is severe premature optimization.

This comment has been minimized.

Copy link
@singalen

singalen Oct 31, 2018

Author Contributor

Basically, what I'm trying to do is to guess where the crash is coming from and minimize the opportunities. The crash is reported on a quite old hardware, iPad 2. So maybe it's not that premature.

I would agree to "severe" if it made the function twice as long, but it merely adds two lines and two pairs of curly brackets.

On the other hand, the crash is supposed to mean that the mutex is already being held by the same thread. I cannot see how timer_callback() can be called from other functions in this file... except for, maybe, SDL_AddTimer(). That's highly unlikely, but this implementation is testing that hypothesis.

This comment has been minimized.

Copy link
@jyrkive

jyrkive Oct 31, 2018

Member

SDL_AddTimer() is, in no uncertain terms, documented to run the callback in a separate thread: "The callback is run on a separate thread."

I'm afraid reducing the time this function holds the write lock looks just too dangerous to me. I'm opposed to this pull request until/unless it actually turns out to fix crashes on iOS.

This comment has been minimized.

Copy link
@singalen

singalen Nov 19, 2018

Author Contributor

I see your point.
I don't think this can happen because:
The next_timer_id is incremented by 1 in get_next_timer_id() whose body is a guarded block. For every ID generated, any ID generated after it will be greater (or wrapped around size_t).

A danger I can see here is if someone changes the code later and misses the point of locking... but this can happen to any concurrent code.

This comment has been minimized.

Copy link
@jyrkive

jyrkive Nov 19, 2018

Member

Yeah, I understand that. But my view isn't going to change - I'm not going to accept this change to the upstream repository until/unless it's actually shown to fix crashes on iOS.

This comment has been minimized.

Copy link
@singalen

singalen Nov 20, 2018

Author Contributor

Even given that get_next_timer_id() is faithful to its contract and thread safe, and is not going to wrap around in any reasonable time? I really don't understand the technical necessity for a whole-function lock then.

This comment has been minimized.

Copy link
@jyrkive

jyrkive Nov 20, 2018

Member

Yes, regardless of it. I understand that the code should be safe even without holding the lock for the whole function, but I'm not willing to trust it.

This comment has been minimized.

Copy link
@singalen

singalen Nov 22, 2018

Author Contributor

Okay, it will remain in in-progress iOS branch. I will see if the crashes in timer_callback() go, but even that is only an indirect evidence.

@@ -37,73 +40,35 @@ struct timer
};

/** Ids for the timers. */
static size_t next_timer_id = 0;
static std::atomic<size_t> next_timer_id(0);

This comment has been minimized.

Copy link
@jyrkive

jyrkive Oct 28, 2018

Member

Doesn't need to be atomic. timers_mutex can (and should IMO) protect this variable.

This comment has been minimized.

Copy link
@singalen

singalen Oct 28, 2018

Author Contributor

Agree.

This comment has been minimized.

Copy link
@jostephd

jostephd Oct 30, 2018

Member

You might need volatile then?

This comment has been minimized.

Copy link
@jyrkive

jyrkive Oct 30, 2018

Member

@jostephd No, it doesn't need volatile if it's protected by a mutex.

This comment has been minimized.

Copy link
@GregoryLundberg

GregoryLundberg Oct 30, 2018

Contributor

Volatile really shouldn't be used in application code. Its purpose is for kernels and device drivers, to tell the compiler you're reading a device interface which can change on its own, so you cannot make any assumptions about its contents in order to optimize accesses.

It really has nothing to do with threads and can actually make matters worse. Although, I suppose you might be able to use it properly if the volatile value is read-only in all except one thread where it can be read-write.

@@ -133,17 +98,18 @@ size_t add_timer(const uint32_t interval,
DBG_GUI_E << "Adding timer.\n";

timer timer;
size_t next_id;

This comment has been minimized.

Copy link
@jyrkive

jyrkive Oct 28, 2018

Member

Unnecessary. Just use next_timer_id directly like the existing code.

This comment has been minimized.

Copy link
@singalen

singalen Oct 28, 2018

Author Contributor

That's given we need to gold a lock during whole function body, but we don't really need it.

Revised timer code due to crashes on some iPads.
Bury dead code, the lock cam be R/W, use atomic counter to avoid locking it, and std::unordered_map is faster than a std::ordered_map.

@singalen singalen force-pushed the singalen:1.14 branch from c9f5573 to 3ef8ae5 Nov 17, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.