Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing bezier surface test on OS X #202

Closed
rainman110 opened this issue Jan 19, 2016 · 20 comments
Closed

Failing bezier surface test on OS X #202

rainman110 opened this issue Jan 19, 2016 · 20 comments

Comments

@rainman110
Copy link
Collaborator

The test case test_bezier_surfaces segfaults in OSX. A valgrind analysis shows invalid read in BSplCLib::LocateParameter.

I suspect the already known OCE issue: tpaviot/oce#576 (comment)

The root seems to be some clang specific optimizations: http://tracker.dev.opencascade.org/view.php?id=26042

We could provide these fixes as conda patches.

@rainman110
Copy link
Collaborator Author

Hi @tpaviot, @jf---

I backported the aforementioned patch to oce-0.16. As some header files changed slightly, I also had to change the pythonocc interface files (using the generator).

I can not test this on OS X at the moment. I think this will work though. On Linux and Windows, everything is fine.

The problem was, that clang was optimizing out code like

if (&myRef == NULL)

This is totally valid, as references should never be NULL. In OpenCASCADE, they were using (evil) code like

 myRef = (*((MyRefType*) NULL ));

to create NULL refs. This fix changes some interfaces to use pointers instead of references.

You can build and test oce and pythonocc for yourself with these patches using my conda recipes from https://github.com/DLR-SC/tigl-conda.

@tpaviot If this patch is working, how should we integrate it into pythonocc-core? A new release? Or as conda patches? Please keep in mind, that we need a patched oce-0.16 library.

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

@rainman110 I reproduced the OSX test fail Bezier Surface on my osx machine. The Gabrage Collector pull request fixes the test (it actually passes) on my machine, but not on the travis OSX vm. I'm a bit lost.
No time this week to work on that topic, will adress this issue on the coming week

@rainman110
Copy link
Collaborator Author

@tpaviot What xcode are you using?

The root of this issue has nothing to do with garbage collection but compiler optimizations since XCode 6.3.

If you have an older XCode, you "fixed" this issue locally 😏

I am gonna check this issue today in the evening. If my checks are successful, I will push the fixed oce conda package to our dlr-sc account into a dev channel. Then you could use this oce from the dev channel to run this on travis.

I'll keep you informed!

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

@rainman110 If I understand correctly your work, it's an oce patch, right ? It's a huge patch, I think it's worth submitting it to the oce repository. It's not a good idea to release a new pythonocc-core that would require the dlrsc patched oce to work. dlrsc oce would actually become a fork, we should take care about that. IMO this patch should apply on oce-0.17.1, release oce-0.17.2 and upgrade pythonocc to oce-0.17.2.

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

@rainman110 an old old XCode (my mac machine is an old macbook pro from 2009!)

@rainman110
Copy link
Collaborator Author

an old old XCode (my mac machine is an old macbook pro from 2009!)

@tpaviot This explains, why it's working locally on your machine.

You are right. It is best to integrate it into oce-0.17.2. The problem with this approach is, that we'll have a broken pythonocc until oce 0.17.2 and pythonocc-core 0.17 is released.

The bad thing with this bug is, that it sits at the heard of the NURBS engine - i.e. in the BSpline evaluation routines. Every higher level CAD functionality uses this code.

A pragmatic workaround: Compile the oce conda package with your old xcode. This requires no code changes. Also, the travis builds should work afterwards.

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

I love pragmatic workarounds. Let's do that.

Integrate the patch into oce-0.17.2 may not be possible. We should have to increase the oce ABI number, thus breaking compatiblity into 0.17.x series.

@rainman110
Copy link
Collaborator Author

By the way, this patch will be integrated into OCCT 7.0. What is your release roadmap? Will you target OCCT 6.9.0 with OCE or will you directly jump to OCCT 7.0?

If you target to 6.9.0 first, we could integrate this patch there or provide at least a workaround that prevents optimization of the critical lines.

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

We did not discuss that point yet on oce. My opinion is that we should target OCCT6.9.0. OCCT7.0 is a major refactoring, I'm quite sure that the port to OCCT7.0 will not be so easy.

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

@rainman110 conda is building oce-0.16.1 on my machine (I used the OCE016 receipe you pushed to pythonocc-contrib). What should I do in order to upload the binaries to the dlr-sc repo ?

@rainman110
Copy link
Collaborator Author

@tpaviot Sorry, the recipe from pythonocc-contrib is a bit outdated and suffers from the missing system libs problem we discussed. You will have problems using this package with travis.

It is best to use it with the recipe from the DLR-SC/tigl-conda repo for now. I'll make a PR as soon as possible to integrate my fixes with respect to relocation. But don't use the oce-nullrefs-fix.patch .

@rainman110
Copy link
Collaborator Author

@tpaviot I will build this when I get home using an old xcode.

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

@rainman110 Where should I find the recipe you mention ? DLR-SC/tigl-conda provides binaries I can't find any recipe

@tpaviot
Copy link
Owner

tpaviot commented Jan 20, 2016

@rainman110 Ah, ok, I let you compile your own oce with an old xcode, it will surely be better for both os us

@rainman110
Copy link
Collaborator Author

Just for the next time. Here are the recipes: https://github.com/DLR-SC/tigl-conda

@rainman110
Copy link
Collaborator Author

Hi @tpaviot

I was right, with my fixes the bezier surface test does not fail anymore.

Instead of compiling with an old xcode, I made another small patch that prevents compiler optimizations of the crucial lines. This patch is now part of my conda recipe:
https://github.com/DLR-SC/tigl-conda/blob/master/oce/oce-nullrefs-fix.patch

I uploaded the oce build to anaconda/dlr-sc. Please test it with travis.

Cheers!

@rainman110
Copy link
Collaborator Author

@tpaviot All tests are passing now: https://travis-ci.org/rainman110/pythonocc-core/builds/103700070

If you can reproduce it, we can close this issue.

@tpaviot
Copy link
Owner

tpaviot commented Jan 21, 2016

@tpaviot tpaviot closed this as completed Jan 21, 2016
@rainman110
Copy link
Collaborator Author

Cool! I just made a PR with a full matrix job including mac and linux, clang + gcc, python 2.7 and python 3.5: #203

@tpaviot
Copy link
Owner

tpaviot commented Jan 21, 2016

Yes I see I've been following the build, I will merge as soon as the build has passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants