Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype for Cygwin support module (not ready to merge) #998

Closed
wants to merge 86 commits into from

Conversation

embray
Copy link
Contributor

@embray embray commented Mar 23, 2017

Here is my prototype implementation for a Cygwin support module for psutil (#82).

This has been tested on Python 2.7 and 3.4 and all the tests pass on those versions (Pythons >= 3.5 are not supported on Cygwin yet). There are a few tests that are skipped on Cygwin--mostly for corner cases that don't work on Cygwin the way they do, say, on Linux.

The main TODO items as I see it in order of effort / importance are:

  • Refactoring--there are large chunks of code copied from the Linux and Windows modules to support different features on Cygwin. The copied code has either no, or only minor modifications. I'm thinking of moving such code around so that it can be shared by all modules that use it (to be clear, the full Windows module does not work on Cygwin and shouldn't be compiled there, but bits of it do; likewise for the Linux module).
  • Documentation needs to be updated to reflect the new platform.
  • Fix testing on Python 3.4 for Cygwin on AppVeyor
  • There are still a number of compilation warnings that should probably be cleaned up if possible.

Erik M. Bray added 30 commits March 14, 2017 17:24
Some common utilitiy functions were moved out of _pslinux into _common.

Most of cygwin's Process implementation is directly copied from the
linux Proces, minus a few very Linux-specific parts.  I hope to
eliminate the code duplication later with further refactoring.

Most tests fail due to some key Process methods not having
implementations yet (mostly because their implementations for Linux do
not work on Cygwin).
I left this unimplemented originally just because I wanted to take the
time to understand what I was doing before continuing.  Turns out it can
be implemented almost the same as on Linux, except that the process
start time is in the 22nd field of /proc/<pid>/stat instead of the 21st.
I can't find enough documentation for Cygwin's /proc filesystem, but I
confirmed this in the source.  Other than that the semantics are the
same as on Linux.

So much of this was copied agained from the Linux module, but I will
refactor later.
These are much simpler than on Linux, and effectively the same as the
existing Windows implementations of these functions, though easier to
implement on Cygwin via /proc/meminfo
expect to find.

Currently it is identical to pmem; will return later to seeing what
additional memory info we can provide on Cygwin
Later this should be able to share its implementation with the Linux
module but we'll come back around to that.
Cygwin

Copied almost verbatim from the Linux module.
cygwin module so that the functions in it can be reused there.

This required some tweaks to ntextapi.h, since a few of the
types defined there are already included in the winimpl.h included
with Cygwin.

This includes are own implementation of PyErr_SetFromWindowsErr which
is not defined in Python for Cygwin.

TODO: Because this PyErr_SetFromWindowsErr is lifted, only slightly
modified, from the Python source it should include a copyright notice
to the effect.  This implementation probably does not work quite right
on Python 2 either--it should return an actual WindowsError there.
Copied exactly from the Linux module and seems to work; will refactor
later.
Using the posix implementation directly seems to just work.
The only obvious issue currently is that rather than the human-friendly interface
names returned by this function in the normal Windows implementation, the
interface names are instead some UUID looking things for each interface (the same
as appear in the registry in various places to represent the interface).  This is
fine for now, but not ideal.  Should be able to map these to friendlier names.
renaming it to 'encode', for symmetry with the 'decode' utility.

This utility may also be useful in the Cygwin module, and is otherwise
not Windows-specific, having more to do with Py2/3 compatibility.
This is mostly copied from the Windows module with a few small tweaks, but compiling this
is a bit of a pain and somewhat fragile.  Will have to come back to this for a closer look
later, but for now I just want to get all of the API implemented as best I can.
As with net_if_stats, this is a bit of a mess right now to compile, and copies from the Windows module, but will clean up later.
Once again, mostly copied from the Windows module, with much refactoring needed.
…ited) API.

This will allow methods that return paths to return Cygwin paths instead of native Windows paths.
I have yet to go through and determine all the places this should be used, but there are probably
at least a few.
Whenever calling a C extension method that calls out to the Windows API make sure to use
the Windows PID.  This should fix a lot of issues where the wrong PID was being used.
Most of this is copied from the windows module with a few minor tweaks, necessitating later
refactoring.

This is also the first method to make proper use of the new winpath_to_cygpath function.
… implementations from the windows module (rather than the linux module) but using the winpid.
…module.

This is copied mostly from the Linux module, though it needs to use the utmpx APIs.  This seems to 'work'
insofar as the API is available on cygwin, but it's not clear to me that the information returned is fully
correct or useful.  Will return to this if needed.
…ows module

(The linux module's method of looking in /proc doesn't work on cygwin's more limited /proc namespace)
…itches, mostly

supported by the proc_info function copied from the windows module.
(This is just a thin wrapper around proc_info())
Erik M. Bray added 3 commits November 6, 2017 14:44
…rror

handling from Python wrappers for Windows APIs.

Instead of attempting to reimplement PyErr_SetFromWindowsError, on Cygwin we just
replace those calls with a macro wrapping PyErr_SetFromErrno which uses the
cygwin_internal API to map Windows error codes to POSIX error codes just in the same
way Cygwin would.

This appears to work correctly (for example trying to get info on a non-existent
process returns errno ESRCH (No such process).
their pure Python implementations replaced with more efficient (and less
racy) C implementations.
Although I considered adding a new module for this purpose, for now it
suffices to use the _common module for this (albeit a bit hackishly).

In this case the C wrapper for this function makes sense to put in
arch/windows.
@ghost
Copy link

ghost commented Dec 29, 2017

Just curious if there's been any progress with this. I'm very interested in having psutil available under Cygwin.

@@ -32,10 +32,14 @@
else:
enum = None


PY3 = sys.version_info[0] == 3
Copy link

@joaoe joaoe Jan 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest >= else sometime in the future Python 4 is released and no one knows why psutil is broken in cygwin (e.g., the str/bytes conversions below).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree; thanks for pointing that out. In fact instead of a PY3 variable it'd be better to use PY2 for Python < 3.

@embray
Copy link
Contributor Author

embray commented Jan 3, 2018

@Kssea Yes, I actually have some updates to this that I haven't pushed yet. I think last I left off I had rebased and done some refactoring and fixed some of my TODO items, but some of the tests weren't passing, and then I just got side-tracked.

@giampaolo
Copy link
Owner

It turns out that apparently there's quite some demand for cygwin support:
https://www.google.it/search?q=psutil+cygwin&oq=psutil+cygwin&aqs=chrome..69i57j69i60l2j69i61j0.1753j0j7&sourceid=chrome&ie=UTF-8

@embray
Copy link
Contributor Author

embray commented Jan 3, 2018

Ugh, rebasing this on the current master was quite a mess. It's going to take me a bit to untangle.

@yselkowitz
Copy link

Any chance to finish this port? I just came across something that requires psutil, and can certainly help get it the Cygwin distribution.

@embray
Copy link
Contributor Author

embray commented Apr 25, 2018

In fact one of the main issues that's been slowing this down is that discussed in #1247. I'm working on a separate branch to kind of decouple some of the C code from Python in a way that makes refactoring easier.

That said, I know a lot of people are asking for this, so maybe if I can finish just some basic refactoring first the rest we can worry about later. At this stage the biggest impediment is just getting caught up with master: I started this work more than a year ago, and when I first posted this PR it had all tests passing, but not anymore.

Incidentally I was just working on this last week. It's not abandoned, just, low on my plate and something I poke at every now and then. Which means it gets further behind master, and so on.... :)

@giampaolo
Copy link
Owner

I may be wrong but it appears to me the main and first goal here should be avoiding to duplicate code from _psutil_windows.c, aka do not have _psutil_cygwin.c at all. IMO, setup.py should just do this (aka compile _psutil_windows.c and _psutil_posix.c):

diff --git a/setup.py b/setup.py
index fffefd2..0d55eb6 100755
--- a/setup.py
+++ b/setup.py
@@ -29,6 +29,7 @@ HERE = os.path.abspath(os.path.dirname(__file__))
 sys.path.insert(0, os.path.join(HERE, "psutil"))
 
 from _common import BSD  # NOQA
+from _common import CYGWIN  # NOQA
 from _common import FREEBSD  # NOQA
 from _common import LINUX  # NOQA
 from _common import NETBSD  # NOQA
@@ -103,19 +104,32 @@ def silenced_output(stream_name):
         setattr(sys, stream_name, orig)
 
 
-if WINDOWS:
+if WINDOWS or CYGWIN:
     def get_winver():
-        maj, min = sys.getwindowsversion()[0:2]
-        return '0x0%s' % ((maj * 100) + min)
-
-    if sys.getwindowsversion()[0] < 6:
+        if WINDOWS:
+            maj, min = sys.getwindowsversion()[0:2]
+            return '0x0%s' % ((maj * 100) + min)
+        else:
+            import re
+            winver_re = re.compile(r'CYGWIN_NT-(?P<major>\d+)\.(?P<minor>\d+)')
+            verstr = os.uname()[0]
+            m = winver_re.search(verstr)
+            maj = int(m.group('major'))
+            min = int(m.group('minor'))
+            return '0x0%s' % ((maj * 100) + min)
+
+    if WINDOWS and sys.getwindowsversion()[0] < 6:
         msg = "warning: Windows versions < Vista are no longer supported or "
         msg = "maintained; latest official supported version is psutil 3.4.2; "
         msg += "psutil may still be installed from sources if you have "
         msg += "Visual Studio and may also (kind of) work though"
         warnings.warn(msg, UserWarning)
 
-    macros.append(("PSUTIL_WINDOWS", 1))
+    if WINDOWS:
+        macros.append(("PSUTIL_WINDOWS", 1))
+    else:
+        macros.append(("PSUTIL_CYGWIN", 1))
+
     macros.extend([
         # be nice to mingw, see:
         # http://www.mingw.org/wiki/Use_more_recent_defined_functions

How feasible would it be to restart from scratch and just compile _psutil_windows.c and _psutil_posix.c? With that in place I have a pretty clear idea on how to refactor the py code in order to reuse _pslinux.py.

@embray
Copy link
Contributor Author

embray commented Sep 20, 2018

I'm working on refactoring _psutil_cygwin.c so that there is less duplication. However, it does not make sense to compile _psutil_window.c on Cygwin. I don't think it will even build without significant #ifdefs, and even then much of the module is not applicable to Cygwin.

On the other hand, there is some Cygwin-specific code in _psutil_cygwin.c that really wouldn't make sense to have anywhere else, so it doesn't make sense to get rid of it entirely, with one exception: I have since released a package called pycygwin which includes basically the same Cygwin-specific functionality that I added in _psutil_cygwin.c. If I could make that cygwin-only dependency of psutil then that code could be reused. Otherwise it makes sense to copy it. I should note that for pycygwin I used Cython, whereas the code in _psutil_cygwin.c of course just uses the C-API directly.

@embray
Copy link
Contributor Author

embray commented Sep 21, 2018

I think the biggest problem with this PR, and @giampaolo I think you'll agree, is that it was too ambitious. I tried to get every, or almost every interface working just to see if I could.

But for the purposes of getting something that at least partially works as a first pass, I think I might start over again (although still reuse a lot) with my sights set a bit smaller.

The thing about Cygwin is that it is supposed to be mostly POSIX compatible. There are a few POSIX APIs that are not implemented (except as stubs), and a few narrow areas where it is necessarily not POSIX compatible. There are also a few non-POSIX interfaces borrowed from Linux. So I believe that a lot of ground can be covered by just a little bit of refactoring to _pslinux.py (I would like to add something like a _pslinux_common.py that contains bits that can be used by other Linux-like platforms; alternatively it could just go in the existing _common.py). Given that, I can add a simpler _pscygwin.py that only supports what is possible using bits from the posix and linux modules.

In general one should not use the Windows APIs on Cygwin; the whole point is that it's supposed to abstract away the Windows API. But of course you can do it, and as my prototype demonstrates you can still add a lot of functionality if you dig in. In some small cases it might still be unavoidable. Though I also think a lot of code from psutil, as long as the license allows it, could form the basis for a new features in Cygwin to implement a lot of currently missing interfaces (I'm especially thinking /proc/net in this case).

@giampaolo
Copy link
Owner

giampaolo commented Mar 1, 2019

So, here's some comments in light of the recent discussion occurred in #1342. As I said in there the main goal here should be to avoid code duplication. Since you're going to restart from scratch I propose to proceed step by step. The goal of the first step IMO should be solving the import time issues and get to a point where import psutil just works. That will help us have a clearer view on how to proceed next. As part of the first step I suggest to leave the C alone and just focus on the python/import part. I suggest the following:

  1. define a psutil/_pscygwin.py which will include the Python logic that differs from Linux (no Linux logic should be copied in here)
  2. on Cygwin psutil/__init__.py will import psutil/_pslinux.py:
# __init__.py
if CYGWIN:
    from . import _pslinux as _psplatform
  1. in psutil/_pslinux.py you can work with if CYGWIN: ... in order to exclude linux-specific logic which is loaded on import. More or less:
# _pslinux.py

if CYGWIN:
    import _pscygwin
else:
    from . import _psutil_linux as cext

...

HAS_PRLIMIT = not CYGWIN and hasattr(cext, "linux_prlimit")
if not CYGWIN:
    try:
        set_scputimes_ntuple("/proc")
    except Exception:
        # Don't want to crash at import time.
        traceback.print_exc()
        scputimes = namedtuple('scputimes', 'user system idle')(0.0, 0.0, 0.0)

...

# at the end of the module, replace the APIs which differ
class Process:   # <-- the linux process
    ...

if CYGWIN:
    class Process(_pscygwin.Process):
        pass

virtual_memory = _pscygwin.virtual_memory
net_io_counters = _pscygwin.net_io_counters
...
  • psutil/_pscygwin.py will more or less look like this
import _psutil_cygwin as cext

def virtual_memory:
    ...

def net_io_counters():
    ...

class Process:
    def threads(self):
        return cext.proc_threads(self.pid)
...

The impact on _pslinux.py is affordable because you'll only change a couple of things which are loaded at import time. The rest will remain unchanged as it will keep living in the body functions/methods. These parts will never be invoked on Cygwin because you'll overwrite _pslinux.py namespace with _pscygwin.py definitions.

This IMO is a reasonable approach as long as we're not clear how the different bits should be moved around and refactored.

Once you're done with that the next logical step would be identifying what C functions from _psutil_windows.c. That is more sensitive and less clear to me. We may proceed by reusing _psutil_windows.c and add #ifdefs all around, or we may have a separate _psutil_cygwin.c file. But that's not clear at this point (to me at least) so let's leave it for later and let's set the milestone for step 1 only.

@embray hope this helps and tell me if the above makes sense to you.

@giampaolo
Copy link
Owner

giampaolo commented Mar 1, 2019

Actually, on a second though I think it's better to do the other way around: to have _pscygwin.py import _pslinux.py:

psutil/__init__.py:

if CYGWIN: 
    from . import _pscygwin as _psplatform

psutil/_pslinux.py (just solve the import time issues):

if not CYGWIN:
    from . import _psutil_linux as cext

HAS_PRLIMIT = not CYGWIN and hasattr(cext, "linux_prlimit")
if not CYGWIN:
    try:
        set_scputimes_ntuple("/proc")
    except Exception:
        # Don't want to crash at import time.
        traceback.print_exc()
        scputimes = namedtuple('scputimes', 'user system idle')(0.0, 0.0, 0.0)

psutil/_pscygwin.py (redefine what's needed):

from . import _pslinux
from . import _psutil_posix as cext_posix

net_if_addrs = cext_posix.net_if_addrs
disk_usage = _psposix.disk_usage

class Process(_pslinux.Process):

    def threads(self):
        return cext.proc_threads(self.pid)

     # delete methods which are not supported on cygwin
     del _pslinux.Process.memory_maps
     ...

@embray
Copy link
Contributor Author

embray commented Mar 1, 2019

@giampaolo I'm sorry you took the time to write all this, because I'm way ahead of you on most of that. I have a different approach that I think you'll appreciate that was designed to avoid littering the code with if (not) CYGWIN blocks everywhere, which ends up looking quite a bit cleaner, for now.

I don't believe it's good to use _pslinux.py at all in the Cygwin implementation as there are too many differences between the platforms (Cygwin borrows from Linux in some areas that it likes, such as the layout of the /proc filesystem, but it is not Linux). You'll just wind up with tons of the code in _pslinux.py bracketed by if not CYGWIN. No good.

There is still a tiny bit of duplication for now, but not much, and it can be avoided with some other refactoring ideas I have.

I'm just making sure that the applicable tests pass, and that all other tests for interfaces that are not implemented yet are skipped as appropriate. Please in the meantime don't give it further thought until I post some actual code.

@giampaolo
Copy link
Owner

giampaolo commented Mar 1, 2019

You may have misunderstood what I have in mind. What I propose only adds a couple of if (not) CYGWIN in _pslinux.py in order to avoid crashing at import time, but all the rest is going to live in _pscygwin.py and reused via aliasing or subclassing. As such I don't aim at littering _pslinux.py body functions with if (not) CYGWIN. Judging from what I see in the current PR there are different things in common with Linux:

cpu_count_logical()
cpu_count_physical()
disk_partitions()
pids()
Process.name()
Process.terminal()
Process.cpu_times()
Process.wait()
Process.cpu_affinity()
Process.ppid()
statuses, clock ticks, ...

I am not sure what you have in mind but I am not keen on moving those parts out of _pslinux.py. Linux is the major platform here (and in psutil in general) so it should have precedence over cygwin.

@giampaolo
Copy link
Owner

@embray after re reading your diff I noticed there are some tiny differences. I missed that on my first run looking back at this PR. I read your last message without giving it the proper attention and I was also a bit harsh in dismissing it. I am sorry about that.

@embray
Copy link
Contributor Author

embray commented Mar 5, 2019

No problem. I'm still working on getting the tests in a sane state, but to spare some suspense what I'm doing is I created a _pslinux_common.py module, moved some of the common functionality there (there's not as much yet as you might assume), and both _pscygwin.py and _pslinux.py use it.

@giampaolo
Copy link
Owner

giampaolo commented Mar 5, 2019

Thanks for the update @embray. OK, this time I will shut up and wait to see some code and then we'll elaborate on that. =) FWIW, I Googled a bit and it seems Cygwin is not so "minor" as I thought. It appears there's quite some demand for it so I'm all for getting this in as a major feature. Should we close this PR?

@embray
Copy link
Contributor Author

embray commented Apr 3, 2019

Still making progress on this. I've had various other fires to put out but I'm close to having the new prototype ready. It's been mostly a progress of adding skipIfs to the right tests, as my current approach is to support just bare minimum functionality on Cygwin to start with so that we at least have a stub on which to add additional functionality (much of which can still be preserved from this PR, but added a bit at a time).

@giampaolo
Copy link
Owner

Closing this out as outdated.

@giampaolo giampaolo closed this Apr 10, 2019
Repository owner locked as resolved and limited conversation to collaborators Apr 10, 2019
@embray embray deleted the cygwin branch November 18, 2020 11:21
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants