pexpect on Solaris via cron (/dev/tty issue) #44

Closed
ksalman opened this Issue Mar 10, 2014 · 22 comments

@ksalman

ksalman commented Mar 10, 2014

I am using pexpect (version 3.1, installed via pip) on OmniOS (a Solaris fork), and it works fine in an interactive shell. But when I run it from cron, it has issues with /dev/tty. I thought this was a Solaris issue that had already been fixed?
Specifically, the error is:

OSError: [Errno 6] No such device or address: '/dev/tty'

Rest of the trace:

p=pexpect.spawn('ssh -oUserKnownHostsFile=/dev/null -oStrictHostKeyChecking=no %s@%s' % ('root', host), timeout=60)
  File "/usr/lib/python2.6/site-packages/pexpect/__init__.py", line 485, in __init__
    self._spawn(command, args)
  File "/usr/lib/python2.6/site-packages/pexpect/__init__.py", line 607, in _spawn
    self.pid, self.child_fd = self.__fork_pty()
  File "/usr/lib/python2.6/site-packages/pexpect/__init__.py", line 668, in __fork_pty
    self.__pty_make_controlling_tty(child_fd)
  File "/usr/lib/python2.6/site-packages/pexpect/__init__.py", line 722, in __pty_make_controlling_tty
    fd = os.open("/dev/tty", os.O_WRONLY)
OSError: [Errno 6] No such device or address: '/dev/tty'
Traceback (most recent call last):
  File "/root/nologify.py", line 25, in <module>
    p.expect('\r\n.+]# ')
  File "/usr/lib/python2.6/site-packages/pexpect/__init__.py", line 1418, in expect
    timeout, searchwindowsize)
  File "/usr/lib/python2.6/site-packages/pexpect/__init__.py", line 1433, in expect_list
    timeout, searchwindowsize)
  File "/usr/lib/python2.6/site-packages/pexpect/__init__.py", line 1521, in expect_loop
    raise EOF(str(err) + '\n' + str(self))
pexpect.EOF: End of File (EOF). Very slow platform.
<pexpect.spawn object at 0x8181d6c>
version: 3.1
command: /usr/bin/ssh
args: ['/usr/bin/ssh', '-oUserKnownHostsFile=/dev/null', '-oStrictHostKeyChecking=no', 'root@host1']
searcher: <pexpect.searcher_re object at 0x8181dec>
buffer (last 100 chars): ''
@takluyver

Member

takluyver commented Mar 10, 2014

The comment on that block says "Verify we now have a controlling tty.". It's part of a method which detaches from the tty of the parent, if there is one, and connects to the pseudoterminal through which pexpect talks to it.
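To make the step under discussion concrete, here is a minimal sketch of the general technique, not pexpect's actual code. It assumes Linux or SVR4-style semantics, where the first terminal a session leader opens without O_NOCTTY becomes its controlling terminal (BSD-derived systems typically need an explicit TIOCSCTTY ioctl instead). The function name `fork_with_controlling_pty` and the exit-status convention are hypothetical, and the final /dev/tty check is deliberately tolerated rather than fatal.

```python
import os
import pty


def fork_with_controlling_pty():
    """Fork a child that adopts a fresh pty as its controlling terminal.

    Returns (pid, master_fd) in the parent. The child exits immediately with
    status 0 if /dev/tty could be reopened, 1 if that verification failed but
    was tolerated, and 2 on any other error.
    """
    master_fd, slave_fd = pty.openpty()
    slave_name = os.ttyname(slave_fd)
    pid = os.fork()
    if pid == 0:
        status = 2
        try:
            os.close(master_fd)
            # Detach from the parent's controlling terminal, if there is one.
            os.setsid()
            # Assumption: opening the slave by name as a session leader
            # makes it the controlling terminal (Linux / SVR4 behaviour).
            os.open(slave_name, os.O_RDWR)
            try:
                # The verification step: can /dev/tty be opened now?
                os.close(os.open("/dev/tty", os.O_WRONLY))
                status = 0
            except OSError:
                status = 1  # tolerated instead of aborting the child
        finally:
            os._exit(status)
    os.close(slave_fd)
    return pid, master_fd


if __name__ == "__main__":
    child_pid, fd = fork_with_controlling_pty()
    _, wait_status = os.waitpid(child_pid, 0)
    os.close(fd)
    print("child exit status:", os.WEXITSTATUS(wait_status))
```

A real spawn would go on to dup the slave onto fds 0-2 and exec the command; the sketch stops at the verification so the two outcomes are easy to tell apart.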

I don't know much about Solaris, and I don't have a Solaris box handy to test on. I'll try to replicate it on Linux with a cron job, though.

@ksalman

ksalman commented Mar 10, 2014

I tried it in a cron job on Linux and it appeared to work

@takluyver

Member

takluyver commented Mar 10, 2014

There's a code path (__fork_pty and __pty_make_controlling_tty) that's only taken under Solaris normally. I'm tweaking the code so that it tests that path on Linux as well, but it looks like it's still working.

@takluyver

Member

takluyver commented Mar 10, 2014

If you comment out that block, lines 722-726, does the rest of the code behave as expected? Maybe that check's unnecessary.

@jquast

Member

jquast commented Mar 10, 2014

I've got a partially completed OpenSolaris VM that I'll finish up soon for the purpose of reproducing this. I haven't used OmniOS, but I'm pretty familiar with Solaris from a past life. I'll take this issue and debug it within the week. It certainly won't replicate on Linux, since this is within the "only when solaris" control structure.

Agree with @takluyver that this is some kind of verification step which has failed. A curious verification -- it failed to open /dev/tty in the first place -- understood, we haven't got one. The child pty should now be a tty (a pseudo one, after openpty & fork) at this step.

I think this may also be reproduced without cron by simply running:

nologify.py < /dev/null > /tmp/log 2>&1

This is the beginning of a good test case for Solaris: it ensures that none of stdin, stdout, or stderr is a controlling tty.

@jquast jquast added the bug label Mar 10, 2014

@jquast jquast self-assigned this Mar 10, 2014

@takluyver

Member

takluyver commented Mar 10, 2014

My understanding is that a process started from a terminal still has that as a controlling tty, even if its std* streams are not connected to it. cron jobs, on the other hand, are presumably started without a tty at all.
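The distinction drawn here can be checked directly. The following is a small sketch (the helper name `has_controlling_tty` is made up for illustration): it tries to open /dev/tty, which refers to the controlling terminal of the calling process regardless of where stdin, stdout, and stderr point.

```python
import os


def has_controlling_tty():
    """Return True if this process can open its controlling terminal."""
    try:
        # O_NOCTTY: never acquire a controlling terminal as a side effect.
        fd = os.open("/dev/tty", os.O_RDWR | os.O_NOCTTY)
    except OSError:
        # Typically ENXIO ("No such device or address") when there is none.
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    print("controlling tty:", has_controlling_tty())
    print("stdin is a tty: ", os.isatty(0))
```

Run from a terminal with `< /dev/null > /tmp/log 2>&1`, one would expect the first line to still report a controlling tty while the second reports that stdin is not one; under cron, both are expected to be false.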

@ksalman

ksalman commented Mar 10, 2014

This works without any issue

/root/nologify.py < /dev/null > /tmp/log 2>&1
@ksalman

ksalman commented Mar 10, 2014

I tried commenting out lines 722-726 as suggested by @takluyver and it works

@takluyver

Member

takluyver commented Mar 10, 2014

Thanks @ksalman . I'm wary about just removing a check that presumably was written for good reason, but it's a useful data point that it seems to work without that. @jquast , when you've got your Solaris VM running, can you dig into it - is there a more robust way to do the check? What's the potential downside of removing it?

@jquast

Member

jquast commented Mar 10, 2014

I'm not at all surprised that it continues to work with that block commented out; I considered the same. At least @ksalman can move on with his daily work :-)

My second thought mirrors @takluyver -- there is probably a good reason for this verification step, it resolved the original (Solaris 8 or 9-era?) bug from 8 years ago without such regression. This verification step was part of the original "native pty fork" commit,
SHA: d57237e

I'll take ownership and begin by reproducing and discovering one of two routes:

  1. we are missing a step to properly re-introduce /dev/tty in the child process for at least opensolaris-forks, or
  2. the verification step is not needed at all.
@jquast

Member

jquast commented Mar 11, 2014

I have been able to reproduce the bug using OmniOS.

vagrant@omnios-vagrant:~$ cat /export/home/vagrant/cron.py
#!/export/home/vagrant/.virtualenvs/pexpect/bin/python
import pexpect

p = pexpect.spawn("/usr/bin/ssh -oUserKnownHostsFile=/dev/null"
                  " -oStrictHostKeyChecking=no vagrant@localhost")
p.expect("[pP]assword:", timeout=3)
p.sendline("vagrant")
p.expect("OmniOS", timeout=3)
p.sendline("export PS1='[my beautiful prompt]# '")
p.expect('\r\n.+]# ')
p.expect('\r\n.+]# ')
print(p.match.group())

vagrant@omnios-vagrant:~$ crontab -l
* * * * * /export/home/vagrant/cron.py > /export/home/vagrant/cron.log 2>&1

vagrant@omnios-vagrant:~$ tail -f cron.log
Traceback (most recent call last):
  File "/export/home/vagrant/cron.py", line 4, in <module>
    p = pexpect.spawn("/usr/bin/ssh -oUserKnownHostsFile=/dev/null"
  File "/export/home/vagrant/.virtualenvs/pexpect/lib/python2.6/site-packages/pexpect/__init__.py", line 485, in __init__
    self._spawn(command, args)
  File "/export/home/vagrant/.virtualenvs/pexpect/lib/python2.6/site-packages/pexpect/__init__.py", line 607, in _spawn
    self.pid, self.child_fd = self.__fork_pty()
  File "/export/home/vagrant/.virtualenvs/pexpect/lib/python2.6/site-packages/pexpect/__init__.py", line 668, in __fork_pty
    self.__pty_make_controlling_tty(child_fd)
  File "/export/home/vagrant/.virtualenvs/pexpect/lib/python2.6/site-packages/pexpect/__init__.py", line 722, in __pty_make_controlling_tty
    fd = os.open("/dev/tty", os.O_WRONLY)
OSError: [Errno 6] No such device or address: '/dev/tty'
Traceback (most recent call last):
  File "/export/home/vagrant/cron.py", line 6, in <module>
    p.expect("[pP]assword:", timeout=3)
  File "/export/home/vagrant/.virtualenvs/pexpect/lib/python2.6/site-packages/pexpect/__init__.py", line 1418, in expect
    timeout, searchwindowsize)
  File "/export/home/vagrant/.virtualenvs/pexpect/lib/python2.6/site-packages/pexpect/__init__.py", line 1433, in expect_list
    timeout, searchwindowsize)
  File "/export/home/vagrant/.virtualenvs/pexpect/lib/python2.6/site-packages/pexpect/__init__.py", line 1521, in expect_loop
    raise EOF(str(err) + '\n' + str(self))
pexpect.EOF: End Of File (EOF). Braindead platform.
<pexpect.spawn object at 0x5fa150>
@jquast

Member

jquast commented Mar 17, 2014

Spent the day on Solaris. There are some differences in pty.openpty on OmniOS.
I still need to create a test case to reproduce @ksalman's issue.

Both ends of the master/slave pair should be ttys, and termios.tcgetattr() should work on either end, but this is not so on OmniOS (platform "sunos5"). The master_fd end of the pty (which is returned as child_fd after fork) is not a tty, whether using the native or the non-native workaround in pexpect.
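A quick way to compare platforms on this point is to probe both ends of a freshly opened pty. This is only an illustrative sketch; the function name `probe_pty_pair` and the shape of the returned dictionary are assumptions made for the example, and no particular output is claimed for any platform.

```python
import os
import pty
import termios


def probe_pty_pair():
    """Report, for each end of a new pty, whether it looks like a tty."""
    master_fd, slave_fd = pty.openpty()
    results = {}
    try:
        for name, fd in (("master", master_fd), ("slave", slave_fd)):
            try:
                termios.tcgetattr(fd)
                attrs_ok = True
            except termios.error:
                # e.g. (22, 'Invalid argument') as seen later in this thread
                attrs_ok = False
            results[name] = {"isatty": os.isatty(fd), "tcgetattr": attrs_ok}
    finally:
        os.close(master_fd)
        os.close(slave_fd)
    return results


if __name__ == "__main__":
    for end, info in probe_pty_pair().items():
        print(end, info)
```

Running this on each platform of interest gives a compact record of which end supports terminal attributes there.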

osx example session, https://gist.github.com/jquast/9594450
omnios example session, https://gist.github.com/jquast/9594459

Made a few workarounds in branch 'solaris-workarounds' to allow sendintr() and sendctrl() to pass additional tests in those cases where tcgetattr fails. setecho() is still failing for this reason.

omnios test results, https://gist.github.com/jquast/9594840

@ksalman

ksalman commented Mar 17, 2014

Maybe this is how it behaves in all Illumos distributions, not just OmniOS. There are a bunch of them: http://wiki.illumos.org/display/illumos/Distributions

@takluyver

Member

takluyver commented Apr 14, 2014

@jquast - I'm getting ready to do a 3.2 release, as there's a couple of bugfixes waiting to go out. Do you think you're going to have something for this in the next few days, or should it wait until 3.3?

@jquast

Member

jquast commented Apr 14, 2014

It will not be ready in time, sadly.

@takluyver

Member

takluyver commented Apr 14, 2014

No worries.

taiyangc pushed a commit to quatanium/pexpect that referenced this issue Apr 18, 2014

@jquast

Member

jquast commented May 25, 2014

Giving this another go with a SmartOS VMware image.

@jquast

Member

jquast commented May 26, 2014

Working on a fix that uses ctypes to call the missing functions. So Noah really did post a patch to Python back in the Python 2.4 release days; it was rejected because it caused a failing buildbot on HP-UX or some such, and Noah couldn't gain access to one.

*.openpty() and pty.fork() are using a legacy SGI and Linux hack-form, but the good form I am soon submitting is not in CPython. I have implemented it using ctypes -- but lo, ctypes is broken on Solaris, and I have submitted a patch for that also: https://bugs.python.org/issue20664

@jquast

Member

jquast commented May 26, 2014

This is now fully understood. It simply is not possible to call tcgetattr/tcsetattr on the master_fd side of a pty pair:

p.sendline('1234') # Should see this twice (once from tty echo and again from cat).
p.setecho(0) # Turn off tty echo
p.setecho(1) # Turn on tty echo

This simply is not possible on Solaris.

Therefore, none of setecho(), getecho(), and waitnoecho() are possible on Solaris.
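Since the limitation concerns only the master side, one portable alternative is to set the echo mode on the slave side, which is a real tty on every platform discussed here. A minimal sketch follows; the helper names `set_echo`, `echo_enabled`, and `open_pty_with_echo` are hypothetical and are not part of pexpect.

```python
import os
import pty
import termios


def set_echo(fd, enabled):
    """Turn the ECHO local-mode flag on or off for the terminal on fd."""
    attrs = termios.tcgetattr(fd)
    if enabled:
        attrs[3] |= termios.ECHO   # index 3 holds the local modes (lflag)
    else:
        attrs[3] &= ~termios.ECHO
    termios.tcsetattr(fd, termios.TCSANOW, attrs)


def echo_enabled(fd):
    """Return True if ECHO is currently set for the terminal on fd."""
    return bool(termios.tcgetattr(fd)[3] & termios.ECHO)


def open_pty_with_echo(echo):
    """Open a pty and apply the echo setting on the slave end only."""
    master_fd, slave_fd = pty.openpty()
    set_echo(slave_fd, echo)
    return master_fd, slave_fd


if __name__ == "__main__":
    master, slave = open_pty_with_echo(False)
    print("echo on slave:", echo_enabled(slave))
    os.close(master)
    os.close(slave)
```

In a spawn-style API, the same call could be made in the child after fork(), before exec, so the parent never needs to touch terminal attributes through master_fd.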

I'm going to push up a branch with more correct svr4_pty_fork and _svr4_openpty methods over the existing __fork_pty, which uses ctypes to achieve absolute assurance that this isn't a python issue .. this won't make it into the mainline, mainly because ctypes + libc is also not working on SmartOS or OpenIndiana in python2.7.5 and python3.5 (see previous comment for bugfix).

Goals:

  • add echo=True to class constructors spawn, spawnu.
  • modify the many uses of setecho/getecho in test cases to use API constructor where feasible.
  • where get/set/waitnoecho() themselves are tested, skip for Solaris.
  • set tty mode as such in child process after fork().
  • in parent, raise an exception when setecho() is called and tcsetattr fails, with a helpful wrapper error that describes it is not possible on this OS. Also document in API.
  • Similarly for tcgetattr in getecho().
  • Also, document only in waitnoecho() (an exception will be thrown and described by getecho()).
  • allow isatty to fail for (at least) solaris in test case.
  • Identify: what version was get/set/waitnoecho() added? I don't imagine it has ever worked, and I also imagine it is breaking HP-UX and AIX, should anybody notice ..
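The "helpful wrapper error" goal in the list above could look roughly like this. It is a sketch under stated assumptions: the exception class `EchoNotSupported` and the free functions `getecho`/`setecho` taking a file descriptor are invented for illustration and do not mirror pexpect's method signatures.

```python
import os
import termios


class EchoNotSupported(Exception):
    """Terminal echo cannot be queried or changed through this descriptor."""


def getecho(fd):
    """Return the ECHO state for fd, or raise EchoNotSupported."""
    try:
        attrs = termios.tcgetattr(fd)
    except termios.error as err:
        raise EchoNotSupported(
            "getecho() may not be called on this platform: %s" % (err,)
        ) from err
    return bool(attrs[3] & termios.ECHO)


def setecho(fd, state):
    """Set the ECHO state for fd, or raise EchoNotSupported."""
    try:
        attrs = termios.tcgetattr(fd)
        if state:
            attrs[3] |= termios.ECHO
        else:
            attrs[3] &= ~termios.ECHO
        termios.tcsetattr(fd, termios.TCSANOW, attrs)
    except termios.error as err:
        raise EchoNotSupported(
            "setecho() may not be called on this platform: %s" % (err,)
        ) from err


if __name__ == "__main__":
    # A pipe is not a terminal, so it stands in for an unsupported master_fd.
    read_fd, write_fd = os.pipe()
    try:
        setecho(read_fd, False)
    except EchoNotSupported as exc:
        print(exc)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Chaining with `from err` keeps the original termios error visible in the traceback, which helps when the platform reports something other than the expected errno.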

Notes: unfortunately, even the ioctl(fd, I_PUSH, "ttcompat"), the "V7, 4BSD and XENIX STREAMS compatibility module" does not allow the master_fd to send or receive terminal attributes.

I'm having a very difficult time citing anything in particular. There was, however, an OpenSolaris bug 6824625 mentioned at https://blogs.oracle.com/weixue/entry/tip_differece_master_pty_regards#comment-1241055415000 but it's been lost in the great washing Oracle has done to erase Solaris documentation.

I do however find many other examples of tc-get/setattr that are portable across many systems -- Usually a program knows it does or does not want to echo or other such tty modes and sets accordingly in the child_fd side. I've only found a gentoo portage complaint about it that wasn't well understood, but did discover something along the lines (paraphrasing) "if I reverse master_fd and slave_fd it works, but I don't know why".

jquast added a commit that referenced this issue May 27, 2014

Issue #44: Resolve (most remaining) Solaris issues
Only one final issue remains, setecho/noecho/waitnoecho still has
calls to tcgetattr(master_fd) which causes a failure on Solaris.
Remaining work:

- catch and decorate these exceptions with "not possible on your
  platform."

- as a workaround, provide echo=True/False to spawn().

- document the inability to hide passwords from password prompts and the
  like somewhere in the documentation. Recommend to use echo=False on
  such platforms, and not to depend on waitnoecho() for such prompts.

Example remaining failing test case:

======================================================================
ERROR: test_expect_echo_exact (test_unicode.UnicodeTests)
Like test_expect_echo(), but using expect_exact().
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/zones/pexpect/pexpect/tests/test_unicode.py", line 48, in test_expect_echo_exact
    self._expect_echo(p)
  File "/zones/pexpect/pexpect/tests/test_unicode.py", line 56, in _expect_echo
    p.setecho(0) # Turn off tty echo
  File "/zones/pexpect/pexpect/pexpect/__init__.py", line 997, in setecho
    attr = termios.tcgetattr(self.child_fd)
error: (22, 'Invalid argument')

jquast added a commit that referenced this issue May 27, 2014

jquast added a commit that referenced this issue May 27, 2014

jquast added a commit that referenced this issue May 27, 2014

jquast added a commit that referenced this issue May 27, 2014

@jquast

Member

jquast commented May 27, 2014

Finished the alternative _svr4_pty_fork(), which works on Linux (travis), MacOS (local), and Solaris (local) -- TODO: testing on OpenBSD, FreeBSD, and cygwin. This work was done on branch issue-44-solaris-support. I also wish to open issues for acquiring and testing AIX and HP-UX (there is an AIX-specific implementation that is untested). Also TODO: a new 'echo' kwarg to spawn() and helpful "Not supported on your platform" exceptions, as previously described.

@jquast

Member

jquast commented Jun 1, 2014

Remaining work in branch issue-44-solaris-support:

  • setwinsize() does not appear to work on Solaris, investigate.
  • testing of this branch on cygwin -- _svr4_openpty() might work there
  • ctypes.util.find_library('c') fails on Solaris, submitted fix http://bugs.python.org/issue20664
  • per above, should be able to work around using libc.
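One possible shape for the libc workaround mentioned in the last two bullets is sketched below. It is an assumption, not the code from the branch: when `ctypes.util.find_library('c')` yields nothing usable, it falls back to `ctypes.CDLL(None)`, which on POSIX systems returns a handle to the running process and so normally exposes the libc symbols already linked into the interpreter. The function name `load_libc` is hypothetical.

```python
import ctypes
import ctypes.util
import os


def load_libc():
    """Return a ctypes handle through which libc functions can be called."""
    name = ctypes.util.find_library("c")
    if name is not None:
        try:
            return ctypes.CDLL(name, use_errno=True)
        except OSError:
            pass  # lookup returned a name that could not be loaded
    # Fallback (assumption): the process's own symbol table includes libc.
    return ctypes.CDLL(None, use_errno=True)


if __name__ == "__main__":
    libc = load_libc()
    # getpid() is a convenient smoke test: compare against Python's view.
    print(libc.getpid() == os.getpid())
```

Whether the fallback is sufficient for calls such as grantpt or unlockpt on a given Solaris derivative would still need to be verified there.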

jquast added a commit that referenced this issue Jun 2, 2014

Solaris support. Fixes issue #44.
1. Adds ``echo=True`` keyword argument to spawn*.
2. Pre-fetch VINTR/VEOF, falling back to CINTR and
   CEOF. Failing that, fallback to (3, 4).
3. use '== pty.CHILD' instead of '== 0' when
   appropriate, such as after fork().
4. use pty.STDIN_FILENO instead of sys.stdout's
   fileno(). You may use any of stdin, stdout, or
   stderr file no when interacting with your tty,
   however after fork as child, we are guaranteed
   that fd 0 is our tty. Make it clear.
5. Explicitly catch IOError for child's call to
   setwinsize(). Interestingly, some platforms do
   not allow changing the window size from master
   (HP-UX, AIX, Solaris), where others do not
   allow from slave (Linux, others?).
6. setecho() is similar, so this is done in both
   the slave and master, ignoring all exceptions
   in either -- on Solaris, only a general
   Exception is raised, not IOError.
7. Use os.closerange(3, max_fd) instead of the
   custom-implemented for loop. python docs also
   claim this is faster, and it is also more brief.
8. Complementary to above, there is no need for
   the "if child_fd > 2: os.close(child_fd)"
   check.
9. No need to check if pid < 0 -- Python naturally
   raises an OSError (e.g. Resource temporarily
   unavailable, too many open files, etc.).
10. Allow re-opening of /dev/tty by child process
    to fail -- this explicitly fixes issue #44.
11. Throw custom "may not be called on this
    platform" exceptions for getecho(), setecho(),
    setwinsize().
12. Remove old comment on, "how do i sent an EOF?"
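Item 2 of the commit message above describes a three-level fallback for the interrupt and end-of-file characters. The following is a rough sketch of that idea rather than the committed code; the function name `control_chars` is made up, and the `getattr` defaults cover builds of the termios module that do not export CINTR or CEOF.

```python
import os
import termios


def control_chars(fd):
    """Return (intr, eof) as integer byte values for the terminal on fd.

    Falls back to termios.CINTR / termios.CEOF when the attributes cannot
    be read, and finally to the conventional values 3 (^C) and 4 (^D).
    """
    try:
        cc = termios.tcgetattr(fd)[6]
        intr, eof = cc[termios.VINTR], cc[termios.VEOF]
        # Entries are usually length-1 bytes objects; normalise to ints.
        intr = intr if isinstance(intr, int) else ord(intr)
        eof = eof if isinstance(eof, int) else ord(eof)
        return intr, eof
    except (termios.error, OSError, ValueError):
        return getattr(termios, "CINTR", 3), getattr(termios, "CEOF", 4)


if __name__ == "__main__":
    # A pipe is not a terminal, so this exercises the fallback path.
    read_fd, write_fd = os.pipe()
    print(control_chars(read_fd))
    os.close(read_fd)
    os.close(write_fd)
```

Fetching these values once up front means later calls such as sendintr() do not need a working tcgetattr on the master side.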
@jquast

Member

jquast commented Jun 24, 2014

Closed by branch issue-44-solaris-try-3, soon to be merged for the next release.
