WIP: Reader that does not need open file handles (#239) #926
Also fixes #875
So XYZ currently fails a single test to do with the StreamIO source of data. This probably requires a little thought because a stream by definition will always remain "open", so it can't work with this open/close style reader.
Pushed some changes for DCD. This isn't working yet because the file handle isn't used at a Python level, so again more thought is needed on how to make manipulations to the file handle work with C extensions (see also XDR).
Can you do the XDR fix first? I set it up to act as close as possible to a filehandle object.
@richardjgowers why do you create the extra super filehandle class here? Wouldn't it be enough to create a decorator

    def with_open_file(self, f):
        with self.open():
            f()

    @with_open_file
    def __iter__(self):
        ...

I might be missing some edge cases with this idea, but I don't see any reason why it shouldn't be possible to do it like that.
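A minimal sketch of the decorator idea above, under stated assumptions: `DemoReader`, `_path`, `_fh` and `first_line` are hypothetical names for illustration and are not taken from the PR, and `open()` is assumed to return a context manager. The decorator here takes only the method and returns a wrapper, which is the usual shape for a method decorator.

```python
import contextlib
import functools
import os
import tempfile


def with_open_file(method):
    """Run `method` with the reader's file open, closing it afterwards."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self.open():
            return method(self, *args, **kwargs)
    return wrapper


class DemoReader:
    """Hypothetical reader, for illustration only."""

    def __init__(self, path):
        self._path = path
        self._fh = None  # no handle is held between calls

    @contextlib.contextmanager
    def open(self):
        # Open on entry, always close on exit (even if the body raises).
        self._fh = open(self._path)
        try:
            yield self._fh
        finally:
            self._fh.close()
            self._fh = None

    @with_open_file
    def first_line(self):
        return self._fh.readline().rstrip("\n")


def demo():
    with tempfile.NamedTemporaryFile("w", suffix=".xyz", delete=False) as tmp:
        tmp.write("3\ncomment\n")
    try:
        reader = DemoReader(tmp.name)
        line = reader.first_line()
        # The handle is gone again once the decorated call returns.
        return line, reader._fh is None
    finally:
        os.remove(tmp.name)
```

One edge case with this shape: if the decorated method is a generator (as `__iter__` often is), the `with` block exits as soon as the generator object is returned, so the file would already be closed before the first frame is read. That case likely needs the `with` inside the generator body instead.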
Why does this change need a function like
@kain88-de I need to save the file handle position somewhere (ie the last I had a read, the current I'll have a look if I can simplify things, the main reason there's more functions is so that the files are only opened once, for example
Good news.
Good point, I hadn't thought about that issue.
@richardjgowers, @kain88-de I'm so happy to see this. :D
@@ -1330,6 +1330,95 @@ def __del__(self):
        self.close()


class NewReader(ProtoReader):
Is the plan to have this class completely replace Reader? If so, is it supposed to take over the Reader name? If not, having "new" in the name is not a smart move; at some point it will become the old one...
It's just a temporary thing while I try and see what's possible... hopefully it will be the new Reader class across everything
cb73fcf to 5535e3b
Added NewReader which doesn't maintain an open filehandle
XYZReader changed to use NewReader
c9503b5 to 05d6470
So I'm taking another swing at this. I've got the XYZReader working so it only opens a filehandle when you use it, and only opens the file a single time when iterating. I'll try to extend this to the XDR stuff next for a more complicated case.
@kain88-de is there a reason that the XTCFile object has to always remain open? I think I'm going to have to change this so that it doesn't keep the file handle open. Can I take out the ability to open different files with the same XTCFile object? It seems to complicate things and isn't how Python's regular file objects work either.
Can you leave the is_open int private and add a property is open that returns a true Python bool?
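A small sketch of what that request could look like in plain Python. The names `DemoFile`, `_is_open` and `is_open` are assumptions made for illustration; the real class is a Cython extension type and may spell this differently.

```python
class DemoFile:
    """Hypothetical stand-in for a file wrapper; not the real XTCFile."""

    def __init__(self):
        # Private integer flag, as a C-level struct member might be stored.
        self._is_open = 0

    def open(self):
        self._is_open = 1

    def close(self):
        self._is_open = 0

    @property
    def is_open(self):
        """Read-only public view of the flag as a real Python bool."""
        return bool(self._is_open)
```

Callers then test `f.is_open` and get `True` or `False`, while the integer stays an implementation detail.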
I copied most file system behavior: once you open a file it stays open until closed explicitly. That allows filesystems to work with buffers on files. A close/open always causes a flush to the filesystem, which is slow and can even stop filesystem operations altogether on large parallel cluster file systems. But your current change for an explicit open is OK.
Sure thing, that can be removed.
@@ -318,7 +322,10 @@ cdef class _XDRFile:
        set_offsets
        """
        if not self._has_offsets:
            # wrap this in try/except to ensure it gets closed?
            self.open(self.fname, self._mode)
Why not

    with self:
        ...

That might work and should close the file when an IOError is thrown.
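To illustrate the `with self:` suggestion, here is a hedged sketch of a context-managed wrapper. `DemoFile`, `_fh` and `failing_read` are hypothetical, and this sketch assumes `__enter__` opens the file, which may not match how `_XDRFile` behaves.

```python
import os
import tempfile


class DemoFile:
    """Hypothetical file wrapper; not the real _XDRFile."""

    def __init__(self, fname, mode="r"):
        self.fname = fname
        self._mode = mode
        self._fh = None

    def __enter__(self):
        self._fh = open(self.fname, self._mode)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and when the body raises.
        self._fh.close()
        self._fh = None
        return False  # do not swallow the exception

    def failing_read(self):
        with self:
            raise IOError("simulated read failure")


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = DemoFile(path)
        try:
            f.failing_read()
        except IOError:
            pass
        # The handle was released even though the body raised.
        return f._fh is None
    finally:
        os.remove(path)
```

Because `__exit__` returns `False`, the `IOError` still propagates to the caller; only the cleanup is guaranteed.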
@@ -308,7 +308,7 @@ def __init__(self, filename, **kwargs):
        # etc.
        # (Also cannot just use seek() or reset() because that would break
        # with urllib2.urlopen() streams)
        self._read_next_timestep()
        self.next()
For Python 3 compatibility this should be next(self).
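As a brief illustration of why `next(self)` is the portable spelling: the built-in `next()` calls `__next__` on Python 3 and `.next()` on Python 2.6+, so one call works on both. `DemoReader` below is a hypothetical class, not code from this PR.

```python
class DemoReader:
    """Hypothetical reader that yields frame indices."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self._frame = -1

    def __iter__(self):
        return self

    def __next__(self):  # Python 3 iterator protocol
        if self._frame + 1 >= self.n_frames:
            raise StopIteration
        self._frame += 1
        return self._frame

    next = __next__  # Python 2 spelling, kept as an alias


reader = DemoReader(2)
first = next(reader)  # dispatches to whichever method the interpreter uses
```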
Adds a new base Reader class which doesn't need to maintain an open file handle. Should fix problems related to too many open file handles.
Explanation
Readers have 4 "front doors" (__getitem__, next, __iter__, rewind), which are publicly accessible methods to (indirectly) manipulate the file handle.
These front doors filter through to various backend methods in the base Reader (_sliced_iter, _full_iter and _goto_frame).
These backend methods then create the open file handle and call things in the individual implementations (e.g. XYZReader._read_frame).
Individual implementations are oblivious to the file handle manipulation above them; they just assume that a handle exists in a known location (i.e. this needs to be unified in API).
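The layering described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `BaseReader`, `LineReader`, `_opened`, `_file` and `open_count` are hypothetical, frames are simplified to single lines, and only two of the four front doors (`__iter__` and `__getitem__`) are shown.

```python
import contextlib
import os
import tempfile


class BaseReader:
    """Hypothetical base class; method names follow the description above."""

    def __init__(self, filename):
        self.filename = filename
        self._file = None    # the "known location" for the handle
        self.open_count = 0  # illustration only: counts how often we open

    @contextlib.contextmanager
    def _opened(self):
        self._file = open(self.filename)
        self.open_count += 1
        try:
            yield
        finally:
            self._file.close()
            self._file = None

    # Front doors: public entry points.
    def __iter__(self):
        return self._full_iter()

    def __getitem__(self, frame):
        return self._goto_frame(frame)

    # Backends: these own the lifetime of the handle.
    def _full_iter(self):
        with self._opened():  # one open for the whole iteration
            while True:
                try:
                    yield self._read_frame()
                except EOFError:
                    return

    def _goto_frame(self, frame):
        with self._opened():
            for _ in range(frame):  # skip the preceding frames
                self._read_frame()
            return self._read_frame()


class LineReader(BaseReader):
    """Format-specific part: it only assumes self._file is usable."""

    def _read_frame(self):
        line = self._file.readline()
        if not line:
            raise EOFError
        return line.strip()


def demo():
    with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
        tmp.write("a\nb\nc\n")
    try:
        reader = LineReader(tmp.name)
        frames = list(reader)
        opens_after_iter = reader.open_count
        second = reader[1]
        return frames, opens_after_iter, second, reader._file is None
    finally:
        os.remove(tmp.name)
```

In this sketch a full iteration opens the file once, a random access opens it once more, and no handle is left open in between.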
EDIT kain88-de: issue #239 (comment)