os.listdir-alike that includes file type #37269

Closed
donut mannequin opened this issue Oct 6, 2002 · 4 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@donut
Mannequin

donut mannequin commented Oct 6, 2002

BPO 619222
Nosy @loewis

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2008-02-05.17:45:52.497>
created_at = <Date 2002-10-06.12:22:50.000>
labels = ['type-feature', 'library']
title = 'os.listdir-alike that includes file type'
updated_at = <Date 2008-02-05.17:45:52.383>
user = 'https://bugs.python.org/donut'

bugs.python.org fields:

activity = <Date 2008-02-05.17:45:52.383>
actor = 'draghuram'
assignee = 'none'
closed = True
closed_date = <Date 2008-02-05.17:45:52.497>
closer = 'draghuram'
components = ['Library (Lib)']
creation = <Date 2002-10-06.12:22:50.000>
creator = 'donut'
dependencies = []
files = []
hgrepos = []
issue_num = 619222
keywords = []
message_count = 4.0
messages = ['61100', '61101', '61102', '62072']
nosy_count = 3.0
nosy_names = ['loewis', 'donut', 'draghuram']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue619222'
versions = []

@donut
Mannequin Author

donut mannequin commented Oct 6, 2002

I propose adding two new functions, say os.listdirtypes
and os.llistdirtypes. These would be similar to
os.listdir except they would return a list of tuples
(filename, filetype). This would have the advantage
that on OSes that support the d_type entry in the
dirent struct, the type could be determined without
extra calls and disk reads. Even on
non-supporting OSes/filesystems, it could emulate it with
a call to stat/lstat in the function, still saving some
of the work of calling stat and interpreting its result in
Python code or using os.path.isX.

Filetype would indicate whether the entry was a file,
directory, link, fifo, etc. This could either be a
char (like ls -l gives) ('-', 'd', 'l', 'p', etc), or
some sort of constant (os.DT_REG, os.DT_DIR, os.DT_LNK,
os.DT_FIFO, etc). Personally I think the string method
is simpler and easier, though some (non-*ix) people may
be confused by '-' being file rather than 'f'. (Of
course, you could change that, but then *ix users would
be confused ;)

listdirtypes would be equivalent to using stat, ie.
symlinks would be followed when determining types, and
llistdirtypes would be like lstat so symlinks would be
indicated as 'l'.
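
For illustration, here is a minimal pure-Python sketch of the semantics described above. It is not the Pyrex module mentioned below and gains none of the d_type speedup; it only shows the proposed return shape. The names listdirtypes and llistdirtypes come from this proposal and do not exist in the os module. The follow_symlinks parameter and the fallback to lstat for dangling symlinks are assumptions made for the sketch.

```python
import os
import stat


def listdirtypes(path, follow_symlinks=True):
    """Return [(name, typechar), ...] using ls -l style type characters.

    Hypothetical helper: emulates the proposed function with one
    stat/lstat call per entry.
    """
    stat_func = os.stat if follow_symlinks else os.lstat
    result = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        try:
            mode = stat_func(full).st_mode
        except OSError:
            # Assumption: a dangling symlink is reported as the link itself.
            mode = os.lstat(full).st_mode
        # stat.filemode() returns e.g. '-rw-r--r--'; the first char is the type.
        result.append((name, stat.filemode(mode)[0]))
    return result


def llistdirtypes(path):
    """Like listdirtypes, but symlinks are reported as 'l' (lstat semantics)."""
    return listdirtypes(path, follow_symlinks=False)
```

Calling llistdirtypes(".") would give a list such as [("setup.py", "-"), ("docs", "d")], depending on the directory contents.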

An app I'm working on right now that reads in a
directory tree on startup got about a 2.2x speedup when
I implemented this as an extension module, and about
1.6x speedup when I tested it without d_type support.
(The module was written using Pyrex, so it's not a
candidate for inclusion itself, but I would be willing
to work on a C implementation if this idea is accepted.)

@donut donut mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Oct 6, 2002
@loewis
Mannequin

loewis mannequin commented Oct 13, 2002

Logged In: YES
user_id=21627

I'm in favour of exposing more information received from
readdir. I'm not sure whether adding new functions is the
right API, perhaps adding a flag to the existing listdir is
sufficient.

I don't think listdir should perform stat calls itself; if
the system has some information available, fine, if it
doesn't, return nothing.

What is the proposed difference between listdirtypes and
llistdirtypes?

On the return type of the "verbose" listdir, I think it
should return structs with named fields, such as d_ino,
d_name, and d_type. Callers can then find out themselves
what information they got, and augment this with information
from stat that they also need. In particular, d_type should
be returned as presented in the system, since it might have
slight semantic difference to what os.stat would tell about
the file.
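
A rough sketch of what such a record with named fields could look like, written against os.scandir (available in Python 3.5 and later, with context-manager support from 3.6). The names DirEntryInfo and listdir_verbose are hypothetical. Note one deliberate gap: Python does not expose the raw d_type value, so the d_type field here is a derived type character, not the value "as presented in the system", and os.scandir may itself fall back to a stat call on filesystems that do not fill in d_type.

```python
import os
from collections import namedtuple

# Hypothetical record type; field names follow the dirent struct.
DirEntryInfo = namedtuple("DirEntryInfo", ["d_name", "d_ino", "d_type"])


def listdir_verbose(path):
    """Return a list of DirEntryInfo records for the entries in path."""
    entries = []
    with os.scandir(path) as it:
        for entry in it:
            # Derived type character, without following symlinks.
            if entry.is_symlink():
                d_type = "l"
            elif entry.is_dir(follow_symlinks=False):
                d_type = "d"
            elif entry.is_file(follow_symlinks=False):
                d_type = "-"
            else:
                d_type = None  # fifo, socket, device, or unknown
            entries.append(DirEntryInfo(entry.name, entry.inode(), d_type))
    return entries
```

Callers can then read fields by name, for example [e.d_name for e in listdir_verbose(".") if e.d_type == "d"].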

This should extend to other systems as well. E.g. on
Windows, it is possible to learn the modification times from
listdir, with no extra overhead.

There should also be a way to use this with os.path.walk.

So, in short, I'm in favour of this idea. Would you
volunteer to write a PEP, and provide the Unix implementation?

@donut
Mannequin Author

donut mannequin commented Oct 13, 2002

Logged In: YES
user_id=65253

Adding a flag to the existing listdir as opposed to adding
more functions would be fine I think.

There are two reasons I suggest adding the stat calls in
listdir. The first is purely practical: even
without a filesystem that supports the d_type field, you can
still get a decent speedup merely by performing the stat
call in C rather than Python.

The second is from a usability point of view. If listdir
would not do the stat for you, your code would always have
to have a separate case to handle filesystems that do not
provide d_type, so it would not really make listdir any easier
to use, whereas if listdir did the stat itself, you could
simplify a huge amount of code out there that always follows
an os.listdir by os.stat or os.path.isX.

Perhaps the d_type field could be returned verbatim, but a
separate field could be added that would be copied from
d_type if d_type held something useful, and otherwise would
be set by a call to stat. That way, if you really wanted to,
you could still see whether the filesystem actually gave you
the d_type.
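
A sketch of that two-field idea, under stated assumptions: the DT_* values below are the Linux/glibc <dirent.h> constants and are defined locally because the os module does not provide them; TypedEntry and resolve_entry are hypothetical names; and the raw d_type is passed in as an argument, since plain Python cannot read it from readdir.

```python
import os
import stat
from collections import namedtuple

# Assumption: Linux/glibc <dirent.h> values, defined here for the sketch.
DT_UNKNOWN, DT_FIFO, DT_CHR, DT_DIR = 0, 1, 2, 4
DT_BLK, DT_REG, DT_LNK, DT_SOCK = 6, 8, 10, 12

# Map the file-type bits of st_mode to the matching DT_* constant.
_MODE_TO_DT = {
    stat.S_IFIFO: DT_FIFO,
    stat.S_IFCHR: DT_CHR,
    stat.S_IFDIR: DT_DIR,
    stat.S_IFBLK: DT_BLK,
    stat.S_IFREG: DT_REG,
    stat.S_IFLNK: DT_LNK,
    stat.S_IFSOCK: DT_SOCK,
}

# d_type is the verbatim value; resolved_type is always filled in if possible.
TypedEntry = namedtuple("TypedEntry", ["name", "d_type", "resolved_type"])


def resolve_entry(dirpath, name, raw_d_type):
    """Keep raw_d_type verbatim and fall back to lstat only when needed."""
    if raw_d_type != DT_UNKNOWN:
        resolved = raw_d_type  # the filesystem already told us the type
    else:
        mode = os.lstat(os.path.join(dirpath, name)).st_mode
        resolved = _MODE_TO_DT.get(stat.S_IFMT(mode), DT_UNKNOWN)
    return TypedEntry(name, raw_d_type, resolved)
```

A caller that only cares about the type reads resolved_type; a caller that wants to know whether the filesystem supplied it compares d_type against DT_UNKNOWN.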

The difference between listdirtypes and llistdirtypes is
just like the difference between os.stat and os.lstat: that
is, in the case of symlinks the first will return the data of
the linked-to file while the second will return the data of
the symlink itself. Again, this is mostly for user
convenience.

As for os.path.walk, a flag could be added to that which
would replace the "names" argument with the same return type
as the new verbose-listdir.
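
As a sketch of how that could look, the function below mimics the old os.path.walk(top, visit, arg) calling convention (a Python 2 API) but passes (name, typechar) tuples in place of bare names. walk_typed is a hypothetical name, and the types are obtained here with one lstat per entry in Python, so this shows only the interface, not the speedup.

```python
import os
import stat


def walk_typed(top, visit, arg):
    """Call visit(arg, dirname, entries) for top and each subdirectory.

    entries is a list of (name, typechar) tuples. Symlinks are not
    followed, so a symlink to a directory is reported as 'l'.
    """
    entries = []
    for name in os.listdir(top):
        mode = os.lstat(os.path.join(top, name)).st_mode
        entries.append((name, stat.filemode(mode)[0]))
    visit(arg, top, entries)
    # Iterate after visit() so the callback can prune entries in place.
    for name, kind in entries:
        if kind == "d":
            walk_typed(os.path.join(top, name), visit, arg)
```

The callback no longer needs its own os.path.isdir checks; it can filter on the second element of each tuple.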

Sure, I'll volunteer. I'll start reading up on the PEP process.

@draghuram
Mannequin

draghuram mannequin commented Feb 5, 2008

No activity for a long time.

@draghuram draghuram mannequin closed this as completed Feb 5, 2008
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022