Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper use of clockDateCacher in scaffolding #15

Merged
merged 2 commits into from Jan 27, 2014
Merged

Proper use of clockDateCacher in scaffolding #15

merged 2 commits into from Jan 27, 2014

Conversation

paul-rouse
Copy link

wai-extra-2.0.3.1 fixes the most serious manifestation of failing to call the updater returned by clockDateCacher, but the same issue applies to the logger saved in the foundation in makeFoundation. This can easily be shown by adding '$(logWarn)' to the getHomeR handler in a freshly scaffolded site.

The fix here updates the cache every time the date/time getter is called, and I suppose this may not suit all situations. Ideally we might want to use the same cache as the RequestLogger, so that the update gets done for us, but that would require lots of reorganisation to expose it. Alternatively, we might write another updating thread, and start it, in makeFoundation, but that seems overly complex if the appLogger is used only rarely. Feel free to reject if you think this needs more thought!

@snoyberg
Copy link
Member

I think forking a separate update thread isn't too complex an operation to include here. Would you be able to update the pull request to take that approach?

@paul-rouse
Copy link
Author

Sure - can't do it for a few hours now, but it's easy enough (it's one of the variants I've already been through)

@paul-rouse paul-rouse closed this Jan 26, 2014
@paul-rouse paul-rouse reopened this Jan 26, 2014
@snoyberg
Copy link
Member

Nice update, and great comment as well.

I wonder if it would make sense to have something a bit more complicated which used the garbage collector to automatically reap the update thread if it's not being used, and then automatically restart it if X number of requests come in within a given time period. Pinging @kazu-yamamoto about the idea.

snoyberg added a commit that referenced this pull request Jan 27, 2014
Proper use of clockDateCacher in scaffolding
@snoyberg snoyberg merged commit 64d2e6b into yesodweb:postgres Jan 27, 2014
@paul-rouse
Copy link
Author

OK, the fix is now to use a thread to do the update. I've included a comment pointing out the alternative, since some people (eg me!) might prefer it in some situations, although it is not a big thing.

@paul-rouse
Copy link
Author

Yes, I feel that this is bit of a hack at the moment, and something more
integrated would be nicer!

  • Paul

On 2014-01-27 08:31, Michael Snoyman wrote:

Nice update, and great comment as well.

I wonder if it would make sense to have something a bit more complicated which used the garbage collector to automatically reap the update thread if it's not being used, and then automatically restart it if X number of requests come in within a given time period. Pinging @kazu-yamamoto [1] about the idea.

Reply to this email directly or view it on GitHub [2].

Links:

[1] https://github.com/kazu-yamamoto
[2]
#15 (comment)

@kazu-yamamoto
Copy link

To make code to consume less power, such automatic reaping is necessary. I noticed this issues one year ago since one guy pointed out. @snoyberg do you have any ideas to implement this?

Anyway, I will consider this seriously when I will have spare time.

@snoyberg
Copy link
Member

I have some vague ideas, but I'd need to actually write some code to come up with something worth discussing. I'll try to get to this some time next week.

@snoyberg
Copy link
Member

OK, I've written up an initial stab at this. It's on FP Haskell Center:

https://learning-site-staging.fpcomplete.com/user/snoyberg/auto-update

I didn't end up using the garbage collector at all for this. I went with IORef and atomicModifyIORef exclusively, instead of something nicer like TVar, for performance, since this code will be in a type loop. I believe this code has the right properties we're looking for:

  • There should only be zero or one spawned thread running at a time.
  • There's a manual update count maintained before a thread is spawned; once that threshold is passed, an update thread is forked.
  • That update thread will run as long as someone has requested an update during its last cycle.
  • The update thread, in the success case, only performs two actions: the user supplied "get the next value" action, and updating the IORef with the new value.
  • The getCurrent action, in the success case, performs just one action, to get the current value from the IORef.

Possible improvement: keep track of time between manual updates as well as count to determine whether or not we should spawn a thread.

@kazu-yamamoto What do you think?

@kazu-yamamoto
Copy link

Sorry for the delay. I read your code. Since it is a little bit difficult for me to understand the nested code, I made it flatten. I added comments starting with "FIXME":

https://gist.github.com/kazu-yamamoto/8949722

@snoyberg
Copy link
Member

Thanks, that's certainly clearer.

FIXME: In the case of Spawn, is AutoUpdated possible?

Yes, it's possible with a race condition: two threads demand a value at the same time, both get a Spawn constructor, and both fork threads. We want the second thread to cause the first thread to be killed.

FIXME: the first argument is False. So, the second argument is meaningless, isn't it?

Assertions are only checked for non-optimized builds. By putting assert in there, I'm stating "this code path should never occur in a well behaved program, assuming everything else I wrote works as I expected. However, in case I made a mistake, here's what you should do during an optimized build."

FIXME: Is this for assert?

That's to deal with another race condition. In the race condition described above, suppose thread A is running, and is killed by thread B. Thread B then updates the IORef to refer to thread B. Then thread A's exception handler fires. We don't want to modify the IORef at all, since it refers to thread B already. Solution: only switch back to manual updates if the IORef is pointing at the current thread.

@kazu-yamamoto
Copy link

That's to deal with another race condition. In the race condition described above, suppose thread A is running, and is killed by thread B. Thread B then updates the IORef to refer to thread B. Then thread A's exception handler fires. We don't want to modify the IORef at all, since it refers to thread B already. Solution: only switch back to manual updates if the IORef is pointing at the current thread.

This is unclear to me. Thread B throwTos Replaced. So, doesn't the error handler get Just Replaced? If so, clear is not called.

@kazu-yamamoto
Copy link

What will happen in the following case?

  • Thread A is spawn.
  • Thread B is spawn.
  • Thread A turns ManualUpdated to AutoUpdated.
  • Thread A turns AutoUpdated to ManualUpdated
  • Then, Thread B starts working.

@kazu-yamamoto
Copy link

I just updated the code anyway:

https://gist.github.com/kazu-yamamoto/8949722

@kazu-yamamoto
Copy link

Based on @snoyberg 's code, I created a simpler code. I believe this is free from race conditions thanks to SemiAuto state:

https://gist.github.com/kazu-yamamoto/8951304

@snoyberg what do you think?

@snoyberg
Copy link
Member

SemiAuto is an improvement, but there's a small problem: it's not async exception safe. If an exception is thrown between line 41 and line 47, the SemiAuto state will never be replaced.

The other issue is that onErr no longer accounts for the possibility of the user function throwing an exception. The code I had in previously would reset back to manual if the spawned thread died. I think we need to get that back, otherwise the first time an exception is thrown the current value will never be updated again.

@kazu-yamamoto
Copy link

You are right. How about this?

https://gist.github.com/kazu-yamamoto/8951304

@snoyberg
Copy link
Member

There's still a possibility for an async exception between the call to forkIO and the atomicModifyIORef inside. I think you'll need to mask before the forkIO call and then restore the masking state inside.

I'm also a bit nervous about the handle on line 41. If getCurrent is sent an async exception, it will force dereferencing any currently running update thread. But that update thread itself won't be killed. So it's possible to run into a situation where two separate update threads have been spawned and are stepping on each other's toes. Perhaps that can be addressed by, in spawn, checking if the thread held in the IORef matches the current thread ID and, if not, exiting immediately.

@kazu-yamamoto
Copy link

There's still a possibility for an async exception between the call to forkIO and the atomicModifyIORef inside. I think you'll need to mask before the forkIO call and then restore the masking state inside.

I agree. I added mask_ and forkIOWithUnmask.

I'm also a bit nervous about the handle on line 41. If getCurrent is sent an async exception, it will force dereferencing any currently running update thread. But that update thread itself won't be killed. So it's possible to run into a situation where two separate update threads have been spawned and are stepping on each other's toes. Perhaps that can be addressed by, in spawn, checking if the thread held in the IORef matches the current thread ID and, if not, exiting immediately.

Since current code ensures that the internal atomicModifyIORef' is called without interruptions from asynchronous exeptions and the status is set to AutoUpdated, I don't think multiple threads would run.

@snoyberg what do you think?

https://gist.github.com/kazu-yamamoto/8951304

@snoyberg
Copy link
Member

Imagine if an async exception was thrown between lines 43 and 44 (i.e., just when update gets called). The status will be set to ManualUpdates, even if there's already a thread spawned.

@kazu-yamamoto
Copy link

Things are more complicated than I expected. :-(
I hope this version is safe. I'm not confident that this version is more readable than the original of yours.

https://gist.github.com/kazu-yamamoto/8951304

@snoyberg
Copy link
Member

This still seems a bit problematic. Consider this scenario:

  1. Thread A enters update, and gets a Manual/ManualUpdates result.
  2. Thread B enters update, and gets a Spawn/SemiAuto result.
  3. Thread A calls action, which throws an exception.
  4. Thread A's error handler is triggered, and sets the state back to ManualUpdates.
  5. Thread B spawns a thread which no one knows about.

@kazu-yamamoto
Copy link

Right. I would like to continue this discussion with a repo on github, not with a gist. Should I create a sub-directory on wai? Or another place?

@snoyberg
Copy link
Member

Subdirectory of wai is fine. I originally thought you'd want to include it directly in fast-logger or wai-logger, but wherever's most convenient for you works for me.

@kazu-yamamoto
Copy link

I noticed that I don't understand your whole picture. Date cache is not a single issue. Warp also uses threads to manage timers and Fd caches. How do you stop/restart them?

@snoyberg
Copy link
Member

You're absolutely right, it's much bigger than just date caching. Making it a subfolder of wai does make perfect sense.

@meteficha
Copy link
Member

Actually, coming to this discussion while it's happening it seems more like this should be a completely independent package.

@kazu-yamamoto
Copy link

I have just added the auto-update sub-directory to wai:

https://github.com/yesodweb/wai/tree/master/auto-update

Since I'm busy in this week and the next week, and things are more complicated than I expected, I would like to resume this project in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants