Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CircleCI build #213

Merged
merged 1 commit into from
Mar 7, 2018
Merged

Conversation

thughes
Copy link
Contributor

@thughes thughes commented Mar 5, 2018

As discussed in #212.

Some notes:

  • Open source projects get 4 Linux containers by default (4 parallel builds)
  • Open source projects can request macOS executors. I sent in a request and it looks like they gave me a single one; you may be able to request more if necessary.

Still seeing timing issues with some of the lock measurement timing, especially on macOS VMs. I think something like this would fix it if it's measuring the same thing you were originally intending to measure (time to acquire the lock): thughes@d9536b8. The issue that I've seen is that the boost::this_thread::sleep_for can oversleep (as allowed) on the VMs, so just subtracting the requested sleep time of 250ms in the calculation will not give an accurate result.

@viboes
Copy link
Collaborator

viboes commented Mar 6, 2018

You are right the 250ms in this case wrong as the thread is sleeping for 250ms, so at least we should add 50ms.

However your changes will measure the time for the unlock()+join() not the try_to_lock.

@thughes
Copy link
Contributor Author

thughes commented Mar 6, 2018

Hmm, I don't think so. I'm just doing the calculation in the main thread after the join to avoid concurrent access to t1.

t0 is taken just before the main thread calls unlock and t1 is taken in the second thread immediately after it acquires the lock; it's not waiting for the join.

@viboes
Copy link
Collaborator

viboes commented Mar 6, 2018

The test measured the time take by a successful call to

boost::unique_lock<boost::mutex> lk(m, boost::try_to_lock)

Your measure the time for

   m.unlock();
   t.join();

@thughes
Copy link
Contributor Author

thughes commented Mar 6, 2018

Yes, technically it includes the m.unlock as well as the time for the context switch to the other thread triggered by join.

You are right the 250ms in this case wrong as the thread is sleeping for 250ms, so at least we should add 50ms.

Where does the 50 ms come from?

@viboes
Copy link
Collaborator

viboes commented Mar 6, 2018

I use to give 50ms to the cost of an operation. It is often too much, but this is what I've taken as delta.

@thughes
Copy link
Contributor Author

thughes commented Mar 6, 2018

How about trying to measure the sleep time and subtract that?

time_point t3 = Clock::now();
boost::this_thread::sleep_for(ms(250));
time_point t4 = Clock::now();

It also adds overhead, but it seems like it could be more precise than than a constant 50 ms.

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

Yes, t4-t3could replace the 250ms. We will need then to check for t4-t3+50ms.
This will measure the time for the unlock() plus boost::try_to_lock

@thughes
Copy link
Contributor Author

thughes commented Mar 7, 2018

Could we could move t4 after the m.unlock() in the main thread to try to account for it directly instead of using 50ms?

I'm just trying to avoid having semi-arbitrary constants that vary by both OS and processor in there combined with making the test case rely on another semi-arbitrary threshold (max_diff) to determine success or failure; it seems destined to have problems (or just have so much additional arbitrary time buffer as to not be useful in actually testing anything).

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

I understand your position.

The 50ms is something like a QoI. If an operation takes more than that, there is surely a problem.
On some platforms (e.g. VM) I believe those tests should be warnings.

I don't think moving the measure will change anything. The operation you want to measure should take some time. This is what these tests are measuring. How long do you think is acceptable for a call to a try_lock once the mutex is unlock? It is clear that this depends on how busy the machine is.

A PR that allows to configure at compile time whether these kind of timing checks are errors or warnings would be a good idea.

An alternative idea is to configure at compile time the time acceptable for an operation.

With these PR, we could be able to change the delta acceptable or move those checks to warnings.

Would you be interested in such a PR?

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

Someone in the boostorg group will need to enable the build on CircleCI for it to start building against the boostorg/thread repo: https://circleci.com/gh/boostorg/thread

It seems that we have already do that, as the Boost release are built already on CircleCI.
What I could do?

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

Boost.Geometry is already using CircleCI.
https://circleci.com/gh/boostorg/geometry/tree/master

What should I do from my side?

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

CircleCi doc says to create a .circleci/config.yml file, not a circle.yml file

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

I'll merge it to see if I reaches to setup the account.

@viboes viboes merged commit 8764a5b into boostorg:develop Mar 7, 2018
@thughes
Copy link
Contributor Author

thughes commented Mar 7, 2018

"boostorg/geometry" is a separate project from the perspective of CircleCI (each repo in the "boostorg" is a separate project), so it will have to specifically be enabled on this page with someone who is part of the "boostorg" team/org: https://circleci.com/gh/boostorg/thread.

CircleCi doc says to create a .circleci/config.yml file, not a circle.yml file
Correct, though circle.yml is also allowed as you can see in my repo: https://circleci.com/gh/thughes/thread/tree/feature%2Fadd_circleci

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

Ok. I'll try it.

https://circleci.com/gh/boostorg/workflows/thread

@viboes
Copy link
Collaborator

viboes commented Mar 7, 2018

It works. The duration is very good.

Now it is time to fix the timing issues.

Thanks a lot,
Vicente

@viboes
Copy link
Collaborator

viboes commented Mar 9, 2018

Do you have a PR where you apply only once this change?
I will be interested in.
I'll be out for 1 week.

@viboes
Copy link
Collaborator

viboes commented Mar 9, 2018

BTW, my changes are not enough. I have yet some issues on some tests that I'll look at later on.
I've been forced to re-run 1 job to get a green CI :(

https://circleci.com/workflow-run/ab368099-02d9-4030-a82c-6cc0e3988204

@thughes
Copy link
Contributor Author

thughes commented Mar 20, 2018

Here's an example of the change to subtract the measured sleep time (instead of exactly 250ms) in one of the tests: thughes@8cac50b

@thughes
Copy link
Contributor Author

thughes commented Apr 10, 2018

@viboes ping ^

@viboes
Copy link
Collaborator

viboes commented Apr 11, 2018

I see what you are doing, but I want to measure the time of the tested operation, not the whole test.

@thughes
Copy link
Contributor Author

thughes commented Apr 11, 2018

You could move t4 to after the m.unlock to get closer.

I guess the problem is that since on CircleCI the sleep time is highly variable (50 - 100 ms over requested time) by always subtracting exactly 250 ms, the calculation is even more wrong since that extra sleep time swamps the actual locking time.

@thughes
Copy link
Contributor Author

thughes commented Apr 11, 2018

Actually, I'm a bit confused as to how the two calculations are actually different. In the case where boost::this_thread::sleep_for(ms(250)); actually sleeps for exactly 250 ms, the two calculations are exactly the same: t1 - t0 - 250. There is some overhead from calling Clock::now() twice; perhaps that is non-negligible?

Note that I only moved the location of t1 because in the original code is actually measuring the time for lock + unlock since it measures when the unique_lock is no longer in scope. For the purposes of accurately accounting for thread sleep time, you could ignore this part of the change.

@viboes
Copy link
Collaborator

viboes commented Apr 11, 2018

Oh, I mixed where t1,t0 are stored and where they are used.

Your algo is good.

Please, could you provide a PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants