Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests with threads #38

Merged
merged 1 commit into from
Jan 7, 2016
Merged

Conversation

james-ward
Copy link
Contributor

This patch makes the tests against robots including TimerTasks work. Details in patch comments for first patch.

Second patch pylints the changed files - mainly trailing whitespace removal.

@virtuald
Copy link
Member

virtuald commented Jan 4, 2016

Awesome, I'm glad to hear that works.

My primary complaint is that this isn't generic enough. If a user decides to implement some thread that spins in a loop and calls timer.delay(), they'll run into the same issue.

Is there a reason that you can't just do something along the lines of detecting that you're not in the main thread, and do the same kind of waiting? You could use threading.local to store the event, and a WeakSet to store the list of events that get waked up?

@james-ward
Copy link
Contributor Author

The reason that this won't work with more generic threads is that you need to monkey patch the task_fn member of the TimerTask to get it to make the blocking wait call on the Event.
If I could see a way to do that in all threads, we'd have a solution.

@virtuald
Copy link
Member

virtuald commented Jan 4, 2016

But it's calling timer.delay(), which is already a blocking wait. Why not block there?

@virtuald
Copy link
Member

virtuald commented Jan 4, 2016

I suspect the tricky part with a block there is that you would need to release the main lock in order to wait on the event, but perhaps a condition variable would do the trick? Then, instead of storing a list of waiters, you just call notifyAll() and they'll all wake up.

@james-ward
Copy link
Contributor Author

Let me see what I can do with intercepting the call at the timer.delay() point.
A condition might work, but sometimes you don't want to wake them all up because they aren't due to fire in the current timestep.
The switching between the event blocking the main thread and those on the child threads can be tricky, yes.

@virtuald
Copy link
Member

virtuald commented Jan 4, 2016

Yeah, but if they wake up, just go back to sleep. Maybe something like...

self.cond = threading.Condition(self.lock)

.. 

with self.cond:
  if in main thread:
    # increment time first, then notify
    self.cond.notifyAll()
  else:
    wait_until = xx
    while current_time < wait_until:
      self.cond.wait()

@james-ward
Copy link
Contributor Author

Tweaked the previous work to be more generic.
It now works for any thread that isn't the main thread.
I couldn't use Conditions as I still needed a way to force the main thread to wait until all the child threads had finished running (sometimes more than once). What would happen was that one thread would wake and acquire the lock, but the main thread would often reacquire it subsequently before the next thread could.
This method with Events is deterministic.

@@ -59,6 +59,9 @@ def pytest_runtest_setup(self):
"If your robot class has an __init__ function, it must call super().__init__()!"

def pytest_runtest_teardown(self, nextitem):
# Let any child threads run in realtime to allow cancelling if it
# has been implemented.
self._fake_time.freerun_children()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if it hasn't been implemented? Maybe throw an exception the next time they try to delay?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting issue - the necessity to let the threads run free is because TimerTasks have a call made to their cancel() method by the free() method in PIDController.
https://github.com/robotpy/robotpy-wpilib/blob/f489e60a373351904c13efba71e47e1606b74198/wpilib/wpilib/pidcontroller.py#L138-L144

If the thread isn't running at this point, this call will block on cancel()'s call to join(). But it won't necessarily be stopped on the next run through of the loop, so throwing an exception when it calls delay() would likely cause it to throw everytime.

For a thread that doesn't have a terminate mechanism, there is no method to call after the test terminates to stop it - so the reference to the object just gets dropped and the next test begins. If the thread doesn't have a way to stop and join it, I don't think there is much we can do.
What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe do a timed join on the threads, and fail the test with an explanation if the thread doesn't die?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to leave all the code in the wpilib library itself alone for this issue.
A timed call to join inside the pyfrc harness will always fail because it will wait here until the timeout, because the cancel() call on the TimerTask class is made later in the teardown process. If that cancel call was never made, the current builtin tests would pass for the example code.
I'm not sure what action we can take here in the test harness that is generic for all child threads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then do the teardown before the join?

I agree that we probably can't solve this for everyone, but it seems like we'll have a list of a bunch of threads that we can join after whatever teardown is done. If the join takes too long, then we fail the test with an explanation on how they can make sure that the test doesn't fail (provide a free method? register a resource?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thinking about it on the way home I agree with you.
Let's make the tests fail if code spawns a thread but doesn't clean it up nicely at the end - enforce good housekeeping where we can.
I'll try to write some unit tests over this stuff to demonstrate the behaviours. Not sure how easy it will be, but I'll give it a go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reliably detect still-running threads and throw an exception.
A side effect is that the threads throw other exceptions anyway (like trying to access the HAL after it has been torn down), and subsequent tests also fail because the test harness is no longer initialised properly.
The good bit is that you do get an exception spelling out the root cause - unterminated threads.
Is this behaviour acceptable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that seems fine to me, provided that there's a way that a thread can still successfully shutdown.

Now the question becomes -- what is the best way for someone to detect that the test has ended and tear themselves down? I suppose a user could do it with a try... finally block in their test. For things that are inside WPILib, we use the Resource class. It's not public API, but we could make it so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make Resource._add_global_resource() part of the public API. Then any custom threads (or their containing class) just need to have a free() method to clean themselves up, and register with Resource.
We'd need to document this requirement, of course.
Happy with that approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Bonus points if you mention Resource.add_global_resource in the error message when the test fails.

Fixes robotpy#29
Fixes robotpy/robotpy-wpilib#116

Because of the faked calls to Timer.delay any child threads
(used by PID loops, for example) would run their target functions
repeatedly with no delays.
This patch intercepts the delay call and uses a blocking call to wait
for a threading.Event. This makes the threads sleep until woken by
the main thread incrementing the time.
When a child thread requests a positive timestep that means it should
fire in the future, so it is made to sleep and woken if the time
advancement goes past this time. Threads that request negative time
should have fired in the previous increment of the timer, so they
are allowed to continue.

This has the nice effect that the child thread will fire the correct
number of times over the course of the test.
@james-ward
Copy link
Contributor Author

Current commit checks at end of tests for still running threads, and throws if it finds any.

@virtuald
Copy link
Member

virtuald commented Jan 6, 2016

Awesome, I'll review this and try it out tomorrow.

virtuald added a commit that referenced this pull request Jan 7, 2016
@virtuald virtuald merged commit 5bfd70d into robotpy:master Jan 7, 2016
@virtuald
Copy link
Member

virtuald commented Jan 7, 2016

LGTM, thanks.

@james-ward james-ward deleted the tests-with-threads branch January 7, 2016 11:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants