Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python: add bindings for libflux-idset,hostlist #3341

Merged
merged 9 commits into from
Nov 18, 2020

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Nov 14, 2020

This PR is pretty straightforward. It adds basic IDset and Hostlist Python classes which wrap libflux-idset.so and libflux-hostlist.so.

I was also going to make automated bindings for librlist to make an Rv1 class, however I decided to stop here and make the Rv1 class a separate PR.

@lgtm-com
Copy link

lgtm-com bot commented Nov 14, 2020

This pull request introduces 3 alerts when merging 26b22c2 into 5945183 - view on LGTM.com

new alerts:

  • 2 for __iter__ method returns a non-iterator
  • 1 for Missing call to __del__ during object destruction

@grondo
Copy link
Contributor Author

grondo commented Nov 14, 2020

Force-pushed fixes for the python errors caught by LGTM.

@garlick
Copy link
Member

garlick commented Nov 16, 2020

Just playing with this a little. I can't seem make hostlist.decode() raise an exception on bad input. Instead I get a zero length hostlist. Is that expected?

@grondo
Copy link
Contributor Author

grondo commented Nov 16, 2020

No, and I thought I had tests for that but I must have accidentally dropped them at some point.

Nice catch. I'll work on a fix.

@lgtm-com
Copy link

lgtm-com bot commented Nov 16, 2020

This pull request introduces 1 alert when merging 5132efb into 02ee7a1 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@grondo grondo force-pushed the python-hostlist-idset branch 2 times, most recently from 1069ec8 to bad84d9 Compare November 16, 2020 18:27
@@ -167,6 +201,6 @@ def copy(self):
return Hostlist(handle=self.pimpl.copy())


def decode(string):
def decode(arg):
"""Decode a string in RFC 29 hostlist format to a Hostlist object"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this comment be updated to string or iterable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thanks. I'm actually not sure of Python idioms, if hostlist.decode() just directly calls hostlist.Hostlist() is it even useful to separately provide that function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. can't speak for whats common, but since there is an encode, i think it's nice to have a decode, minimally so people can search for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point! It doesn't do any harm at least...

@grondo
Copy link
Contributor Author

grondo commented Nov 16, 2020

Ok, pushed some fixup commits to address comments and improve interface and tests:

  • Fixed missing exceptions on hostlist_decode(3) failures, added tests
  • Improved Hostlist() constructor, now takes any valid RFC 29 hostlist string or iterable of said strings
  • Improved hostlist.append() to take Hostlist, or iterable or plain string of RFC 29 strings
  • add a few more tests

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! That fixed my problem and offhand I'm not finding anything else.
It's really cool to be able to use these "natively" in python!

I'll give my approval although someone with stronger python-fu like @chu11 probably ought to give a nod also.

@grondo
Copy link
Contributor Author

grondo commented Nov 16, 2020

Thanks. As I noted above in a comment, I'm not sure if the separate hostlist.decode() and idset.decode() functions are the "pythonic" way to do it, or if people expect to just use the class constructors anyway.

I was going to stop here for now and then see if an extra interfaces or minor interface changes popped out of work in Python-based flux utilities like flux-resource.py, and add those at that time.

@grondo
Copy link
Contributor Author

grondo commented Nov 16, 2020

Ok, perhaps I'll squash the fixup commits unless there is any objection?

Copy link
Member

@chu11 chu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but my python-fu is probably not much better than @garlick

@grondo grondo force-pushed the python-hostlist-idset branch 2 times, most recently from 96e9929 to 5099f14 Compare November 16, 2020 20:08
@grondo
Copy link
Contributor Author

grondo commented Nov 16, 2020

Actually, I'm not sure the approach used here (idset and hostlist each under their own ffi instance) will work.

The Rv1 or similar implementation will need an ffi instance that includes both the idset and hostlist instances, and I'm not exactly sure how to do that, but I wonder if what we really want here is to build one cffi source for all of hostlist, idset, and rlist, or even just put these directly into the _core.c ffi?

I really don't know anything about cffi yet, so I wonder if either @trws or @SteVwonder can give a hint on what might be the right approach here. Note that the use case is that an Rv1 class would have methods to return Hostlist or IDset objects directly from Rv1.nodelist or Rv1.ranks methods or properties. (These methods would call the underlying rlist_nodelist(struct rlist *rl) call and then instantiate a Hostlist with something like Hostlist(handle=result)

Right now when I attempt that I get the error:

TypeError: initializer for ctype 'struct hostlist *' appears indeed to be 'struct hostlist *', but the types are different (check that you are not e.g. mixing up different ffi instances)

@grondo
Copy link
Contributor Author

grondo commented Nov 16, 2020

Ah, just after the call I found that using ffi.include() "seems" to work, e.g. in the new _idset_build.py the hostlist and idset FFIs are included via

from _hostlist_build import ffi as hostlist_ffi
from _idset_build import ffi as idset_ffi

ffi = FFI()

ffi.include(hostlist_ffi)
ffi.include(idset_ffi)

Then I am able to instantiate Hostlist and IDset objects in the rlist.py module with simply:

   def nodelist(self):
        return Hostlist(handle=self.pimpl.nodelist())

    def ranks(self):
        return IDset(handle=self.pimpl.ranks())

@grondo
Copy link
Contributor Author

grondo commented Nov 16, 2020

Ok, since there is a path forward for composing FFI objects from multiple sources, I think this one is ok to go in as is. We can add minor tweaks as we come up with users of these new classes.

I added one more commit here. I noticed the asan builder was timing out when it seemed like the build would have completed given more time. Perhaps with the addition of extra tests the 40m timeout was not quite enough, so I've added a commit to increase the timeout to 50m.

@garlick
Copy link
Member

garlick commented Nov 16, 2020

Sounds good! Set MWP?

@trws
Copy link
Member

trws commented Nov 16, 2020 via email

Copy link
Member

@SteVwonder SteVwonder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @grondo for putting this together. LGTM! Two totally optional comments below:

src/bindings/python/flux/hostlist.py Show resolved Hide resolved
src/bindings/python/flux/hostlist.py Show resolved Hide resolved
@grondo
Copy link
Contributor Author

grondo commented Nov 17, 2020

Ok, I've pushed an update that replaces the use of range()[index] with a new rangeslice() function, and also removed the unnecessary expand method as suggested by @SteVwonder.

@codecov-io
Copy link

Codecov Report

Merging #3341 (538173f) into master (0e51db1) will increase coverage by 0.11%.
The diff coverage is 97.98%.

@@            Coverage Diff             @@
##           master    #3341      +/-   ##
==========================================
+ Coverage   81.84%   81.96%   +0.11%     
==========================================
  Files         298      300       +2     
  Lines       46146    46394     +248     
==========================================
+ Hits        37767    38025     +258     
+ Misses       8379     8369      -10     
Impacted Files Coverage Δ
src/bindings/python/flux/idset.py 97.97% <97.97%> (ø)
src/bindings/python/flux/hostlist.py 98.00% <98.00%> (ø)
src/modules/job-info/guest_watch.c 75.00% <0.00%> (-1.15%) ⬇️
src/broker/broker.c 73.97% <0.00%> (+0.11%) ⬆️
src/broker/module.c 75.62% <0.00%> (+0.45%) ⬆️
src/broker/state_machine.c 82.18% <0.00%> (+0.53%) ⬆️
src/modules/job-archive/job-archive.c 59.68% <0.00%> (+0.79%) ⬆️
src/common/libhostlist/hostlist.c 97.74% <0.00%> (+0.90%) ⬆️
... and 2 more

@SteVwonder
Copy link
Member

LGTM! Ready for the MWP label?

@grondo
Copy link
Contributor Author

grondo commented Nov 18, 2020

Ok, sure, thanks!

@grondo
Copy link
Contributor Author

grondo commented Nov 18, 2020

Ah, this will have to be manually merged since I've modified .github/workflows

Let me rebase first.

Problem: There is no Python interface for RFC29 hostlist format,
which is now used in Rv1 in flux-core.

Make a simple Hostlist class which wraps Flux's libhostlist, so that
Python programs can operate on hostlists with exact compatibility with
C code.

Fixes flux-framework#3250
Problem: There is no Python interface for RFC22 Idset encoding.

Add a simple Python IDset class which wraps libidset so that Python
programs can encode/decode and manipulate idsets with full compatibility
with the C library.
Add unit tests for the Python hostlist bindings
Add unit tests for the Python idset bindings.
Problem: The cffi emit_c_code() function does not update the target
if it believes the generated .c file is already up-to-date. This causes
make confusion because the file is never updated if one of its other
prerequisites changes (e.g. Makefile), so the rule will continually
be rerun on each make invocation.

Ensure the target file mtime is updated in the *_build.py scripts by
"touching" the file after ffi.emit_c_code() is complete. This way make
won't unnecessarily re-run the rule to update the target.
Wrap rules in src/bindings/python/_flux/Makefile.am in necessary
automake variables to make the output quiet by default.

As with the rest of the build system `make V=1` will restore the
extra-chatty build.
Problem: mypy complains about the _flux.* generated bindings with the
error "Cannot find implementation or library stub for module".

Add 'ignore_missing_imports' for _flux._hostlist and _flux._idset.
Problem: As we add more tests to the full `make check`, the ASan
github workflow build is taking longer. The current time limit is 40m,
but sometimes this is not quite long enough.

Add another 10 minutes to the asan builder timeout to avoid premature
timeouts of this workflow run.
@SteVwonder
Copy link
Member

All the tests pass, so I'm going to push the button. Thanks @grondo!

@SteVwonder SteVwonder merged commit f1ce3dc into flux-framework:master Nov 18, 2020
@grondo
Copy link
Contributor Author

grondo commented Nov 18, 2020

Thanks!!

@grondo grondo deleted the python-hostlist-idset branch November 18, 2020 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants