Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alarm broken on cygwin #17650

Closed
sagetrac-gouezel mannequin opened this issue Jan 19, 2015 · 22 comments
Closed

alarm broken on cygwin #17650

sagetrac-gouezel mannequin opened this issue Jan 19, 2015 · 22 comments

Comments

@sagetrac-gouezel
Copy link
Mannequin

sagetrac-gouezel mannequin commented Jan 19, 2015

On cygwin (32 and 64), the alarm mechanism is broken:

sage: alarm(0.1); sum(xrange(100000000))
4999999950000000

instead of the expected interrupt.

I traced the bug by dichotomy to a ppl component named watchdog. Its role is similar to alarm, involving signals and interrupts. I guess that, due to the peculiarities of cygwin's implementation of signals, watchdog somehow intercepts the python signals and does dot send them back.

watchdog is not an optional component of ppl, so it can not be disabled by a configure flag. However, it is disabled on systems that don't have setitimer. Cygwin has setitimer, but a simple hack of ppl's makefile can hide it, thereby disabling watchdog and fixing the sage alarm mechanism. This is done in the proposed patch.

A better fix would be to dig into the ppl and cygwin signal mechanism, of course...

CC: @vbraun @jpflori

Component: porting: Cygwin

Keywords: ppl

Author: Sebastien Gouezel

Branch/Commit: u/gouezel/alarm_broken_on_cygwin @ 457bfa5

Issue created by migration from https://trac.sagemath.org/ticket/17650

@sagetrac-gouezel sagetrac-gouezel mannequin added this to the sage-6.5 milestone Jan 19, 2015
@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 19, 2015

comment:1

Should it be considered as a PPL bug that the methods are not defined in the header? Or a cython bug that this suffices to break the interrupts mechanism? Or a cygwin bug since it does not seem to be a problem on other platforms?

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 20, 2015

comment:2

The PPL code seems to be correct, I am stuck on this one...

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 20, 2015

Upstream: Reported upstream. No feedback yet.

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 20, 2015

Changed upstream from Reported upstream. No feedback yet. to none

@sagetrac-gouezel

This comment has been minimized.

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 25, 2015

Branch: u/gouezel/alarm_broken_on_cygwin

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 25, 2015

New commits:

1a8ecd2disable ppl watchdog on cygwin

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 25, 2015

Changed keywords from none to ppl

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 25, 2015

Commit: 1a8ecd2

@sagetrac-gouezel

This comment has been minimized.

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 30, 2015

Author: Sebastien Gouezel

@jdemeyer
Copy link

comment:8

What I don't like about this ticket is that it's not clear that the "ppl watchdog" is not needed. What does it do and why does Sage not need it?

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 31, 2015

comment:9

Excellent question. There is essentially no documentation on this ppl watchdog that I could find. It seems to be a classical watchdog mechanism, i.e., it detects if some computation loops forever, and in this case it interrupts it automatically and resets things to a nicer state. In particular, it should not play any role in non-buggy situations.

It is not necessary to PPL (since it is disabled on some platforms, those which don't have setitimer), and I do not see how it could be relevant to sage. So, my impression is that it is safe to disable it (note that the patch only disables it on cygwin, so it will not break anything on officially supported platforms). On cygwin, with the patch, sage -t all does not show any ppl-related failure. By the way, without the patch, sage -t all hangs forever and eats all available memory, since some tests rely on alarm to be interrupted! So, in my opinion, the patch gives a definitive improvement on cygwin.

@jdemeyer
Copy link

comment:11

Replying to @sagetrac-gouezel:

note that the patch only disables it on cygwin, so it will not break anything on officially supported platforms

For me, that's actually a reason to be against this patch. Either we need it and we shouldn't disable it on Cygwin, or we don't need it and we can just disable it everywhere.

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Jan 31, 2015

comment:12

I get your point. Unfortunately, I can't say for sure that watchdog is not helpful to ppl in rare situations (for instance, maybe it uses some algorithms whose termination is not guaranteed, counting on watchdog to exit bad situations), so I would rather keep watchdog wherever possible (i.e., not on cygwin, where setitimer is not fully functional).

I leave the ticket as needs_work until someone more knowledgeable on ppl steps up (or someone who really needs the patch on cygwin pushes for it)

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 13, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

457bfa5 #17650: touch files in ppl spkg_install to avoid reconfiguration

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 13, 2015

Changed commit from 1a8ecd2 to 457bfa5

@jpflori
Copy link

jpflori commented Sep 2, 2015

comment:14

I'm just posting random links potentially related:

According to #10039 comment:49 there used to be a --disable-watchdog configure flag, too bad it disappeared.

I guess we should open a bug on ppl bugtracker (if one exists).

Unfortunately it seems cygwin does not ship ppl anymore, so we cannot look at what they would have done.
It could still be useful to post on the cygwin ml's.

@sebastien: can you provide a minimal C file reproducing the issue?
Maybe something inspired in the aforementioned 2008 post can help.

@jdemeyer
Copy link

jdemeyer commented Sep 2, 2015

comment:15

The post does indeed look related. I don't understand why it was posted on PPL-devel, since it seems to have nothing to do with PPL really.

@sebastien: can you provide a minimal C file reproducing the issue?

That would be great, especially to send to the upstream bug trackers (PPL and/or Cygwin).

@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Oct 11, 2015

comment:16

Replying to @jpflori:

@sebastien: can you provide a minimal C file reproducing the issue?
Maybe something inspired in the aforementioned 2008 post can help.

I just tried, but unfortunately I was not able to reproduce the issue with plain C or C++ files (for instance, taking the file in the link you give and adding ppl headers and initialization does not break the itimer mechanism).

For the record, steps to reproduce the issue using python:

File alarm.pyx:

import signal

def essai(n):
  signal.setitimer(signal.ITIMER_REAL, 0.5 , 0)
  return sum(xrange(n))

File alarme_c.cpp

#include <ppl.hh>

File setup.py

from distutils.core import setup
from Cython.Build import cythonize
from distutils.extension import Extension

setup(
    ext_modules = cythonize(Extension("alarme", ['alarme.pyx', 'alarme_c.cpp'], libraries = ['ppl']))
)

File essai.py

import alarme

print alarme.essai(100000000)

Compile the extension with

python setup.py build_ext --inplace

Then python essai.py results in 4999999950000000, no interruption.

Commenting out the header inclusion in alarme_c.cpp, then one gets Alarm clock instead.

@embray embray self-assigned this Apr 8, 2016
@embray embray removed this from the sage-6.5 milestone Apr 13, 2017
@sagetrac-gouezel
Copy link
Mannequin Author

sagetrac-gouezel mannequin commented Apr 14, 2017

comment:19

duplicate of #21190

@embray
Copy link
Contributor

embray commented Jul 13, 2017

comment:20

Closing tickets in the sage-duplicate/invalid/wontfix module with positive_review (i.e. someone has confirmed they should be closed).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants