-
-
Notifications
You must be signed in to change notification settings - Fork 10
Conversation
- files is a list for uploading, and should include any zipped python env. Things ending in .zip will be treated as archives with a name like the filename minus the .zip. Existing resources in hdfs can also be passed (no upload). - environment is passed as a dictionary. - for now, dask launch command is simply derived from the environment passed, as before. Meantime: changed application and client kill to atexit.
Notes:
|
Fixes #100 |
Explicit environment variables should override what `lang` may set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the interface put forth here looks good to me.
@@ -3,7 +3,7 @@ | |||
|
|||
from .utils import * | |||
from .core import * | |||
from .env import CondaCreator | |||
from .env import CondaCreator, zip_path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add this to the top-level namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like a generally useful function, e.g., for what you are doing with folder zipping. The lines above tend to import everything.
knit/core.py
Outdated
@@ -248,6 +263,7 @@ def preexec_func(): | |||
# preexec_fn not supported on Windows | |||
proc = Popen(args, stdin=PIPE) | |||
self.proc = proc | |||
atexit.register(self.__del__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is what you want to do, as __del__
will be called twice for every object then. Instead the following might be better:
- On
__init__
add theknit
object to aweakref.WeakSet
- Add a
atexit.register
call to a function that will iterate over this set at close and remove any lingering instances.
The only reason there would be lingering objects is if there is a reference cycle, which leads to no __del__
in the cycle being called. These objects will still be in the weakset, allowing the exit handler to finish them up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be enough to set self.app_id = None
, no? del does nothing after that happens.
The error messages you sometimes get come from py4j's watcher thread, which communicates with the java process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Registering the method will hold on to a reference to the object, preventing actual cleanup until exit in all cases.
@@ -423,7 +444,7 @@ def logs(self, shell=False): | |||
def print_logs(self, shell=False): | |||
"""print out a more console-friendly version of logs()""" | |||
for l, v in self.logs(shell).items(): | |||
print('Container ', l, ', id ', v.get('id', 'None'), '\n') | |||
print('\n### Container ', l, ', id ', v.get('id', 'None'), ' ###\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes the printout easier to see, purely visual.
knit/dask_yarn.py
Outdated
|
||
app_id = self.knit.start(command, env=self.env, | ||
bn = os.path.basename(self.env) | ||
pref = bn + '/' + bn[:-4] # like myenv.zip/myenv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this? It seems to me the environment zip file shouldn't unzip to a directory with .zip
in the name. I'd expect unzip myenv.zip
to result in a folder myenv
, meaning pref
should be myenv/bin/python
. This seems to imply that unzip myenv.zip
results in a folder myenv.zip
with a nested folder named myenv
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way you probably should be using os.path.join
and os.path.splitext
instead of +
and [:-4]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The zip files are always expanded inside a directory, so if you pass in my.zip
made from directory my/
, you will get my.zip/my/files
. The top level could be named anything, but it seems pretty clear to me.
I wouldn't use path.join
, since the containers will always run on a unix system, whereas it should be possible to launch applications from windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't expand a zip file in the worker root folder, or expand all of them in a folder and then cd
into that folder? The behavior here is very unexpected (at least to me).
I wouldn't use path.join, since the containers will always run on a unix system, whereas it should be possible to launch applications from windows.
I'd argue that you should always write what's intended, even if the system it's running on will never need to be cross platform. os.path.join
means "join some paths together", while x + '/' + x2...
means concatenate some strings. Only in context can I see that the string concatenation is joining paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't expand a zip file in the worker root folder, or expand all of them in a folder and then cd into that folder?
No, all you do is give a name to YARN, which takes care of doing the unzipping.
if the system it's running on will never need to be cross platform.
I mean that we are testing linux/unix and everything is fine, but when this will be used on windows/unix, then path.join will break. In fact, yarn - the server - could be run on windows, I suppose, but probably no one does that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, ok. I found very little documentation about this. There's a stack-overflow post saying that spark-submit
does the same thing, and there's the yarn docs which indicate how the local cwd is setup for files but not for zip files (first answer in FAQs
, can't link to it directly). I still find this behavior surprising and non-intuitive.
I mean that we are testing linux/unix and everything is fine, but when this will be used on windows/unix, then path.join will break.
Since this is meant to be used on the edge node, I'd expect the resourcemanager and edge node to share the same OS. I've run into enough buggy code doing path processing using string operations instead of path operations that maybe I'm overly wary of this.
@jcrist - is this how it's done? I'm not experienced with weakref. |
knit/core.py
Outdated
@@ -87,6 +90,7 @@ class Knit(object): | |||
|
|||
JAR_FILE = "knit-1.0-SNAPSHOT.jar" | |||
JAVA_APP = "io.continuum.knit.Client" | |||
instances = weakref.WeakSet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably move the instances
set to a global, and the cleanup
class method to a function, but that's just a style thing. If left on the class I think they both should be private (_
prefixed) at least. Otherwise this looks good.
I stuck with the class attribute and method as, in my mind, this is better for possible subclassing - not that I expect much of that. |
ending in .zip will be treated as archives with a name like the filename minus the
.zip. Existing resources in hdfs can also be passed (no upload).
before.
Meantime:
changed application and client kill to atexit.