Exclude columns by index in io.ascii.read() #7451

Open
Gabriel-p opened this issue May 10, 2018 · 10 comments

Comments

@Gabriel-p
Contributor

Currently the read() function allows excluding columns from the output table via the exclude_names parameter.

I'd like to be able to exclude columns by index number also. Can this be done at all? If not, could it be added?

@Gabriel-p Gabriel-p changed the title Exclude columns by index in astropy's io.ascii.read() Exclude columns by index in io.ascii.read() May 10, 2018
@pllim
Member

pllim commented May 11, 2018

Just to understand the use case, why is excluding by name not useful for you?

@pllim pllim removed the table label May 11, 2018
@Gabriel-p
Contributor Author

Gabriel-p commented May 11, 2018

I have code that processes data files in batch mode. Many of these files have either incomplete headers or no header at all, but most (or all) of them share the same column positions.

I understand the importance of a proper header, and I plan to enforce this in future versions of my code. But in the meantime, accessing columns by index is almost the only way.

@pllim
Member

pllim commented May 11, 2018

I thought Table assigns default column names like col_0, col_1, etc? Maybe you can exclude based on those for now until someone has time to implement this?

@Gabriel-p
Contributor Author

It assigns names as col1, col2, etc. (yes, it starts from 1, not 0 as you'd expect).

I already do something like that by creating the column names manually:

col_names = ['col' + str(_ + 1) for _ in col_ids]
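For readers who want to see the workaround end to end, here is a minimal sketch. It assumes the file has no header row, so io.ascii assigns the default names col1, col2, and so on, and it assumes the indices are zero-based. The helper names `default_names` and `read_without_columns` are made up for illustration and are not part of astropy.

```python
def default_names(indices):
    """Map zero-based column indices to io.ascii's default names (col1, col2, ...)."""
    # Default names are one-based, hence the + 1.
    return ['col{}'.format(i + 1) for i in indices]


def read_without_columns(source, indices):
    """Read a header-less table, dropping the columns at the given zero-based indices.

    Hypothetical convenience wrapper: it only translates indices to default
    names and forwards them to the existing exclude_names parameter.
    """
    # Imported lazily so default_names() can be used without astropy installed.
    from astropy.io import ascii
    return ascii.read(source, format='no_header',
                      exclude_names=default_names(indices))
```

With astropy installed, a call such as `read_without_columns(['1 2 3', '4 5 6'], [1])` would be expected to return a table containing only col1 and col3. This only works when the default names are in effect; files with partial headers would need the names checked first.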

@pllim
Member

pllim commented May 11, 2018

Glad to know you have a workaround. You are always welcome to contribute if you want to see this feature implemented sooner rather than later. 😄

@Gabriel-p
Contributor Author

I'd love to but I think my Python skills are nowhere near the level of the guys developing this (great) package.

@hamogu
Member

hamogu commented May 13, 2018

@Gabriel-p Thanks for the suggestion. However, I don't see much benefit, given that it only saves you one line of code (the col_names=... that you showed in the comment).

@taldcroft
Member

I think this could be done in a relatively clean way (from the API perspective) by allowing for exclude_names (and include_names and fill_include_names and fill_exclude_names) to accept integer values along with string values. The integer values would then be an index into the table column names list.

From a quick glance at the pure-Python and fast-C reader code this would not be too messy, but that is just a quick glance. To do this in a consistent way will require making the interface uniform for all those exclude/include options for read and write. So with all the implementation, testing, and docs this would be a moderate effort patch. I would propose to leave this open if someone wants to tackle it.
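To make the proposal concrete, here is a rough sketch of the normalization step it implies: a mixed list of strings and integers is resolved to plain column names before the existing include/exclude logic runs. `resolve_names` is a hypothetical helper, not anything that exists in io.ascii, and the choice of zero-based indexing (with negative indices allowed) is an assumption rather than something decided in this thread.

```python
import operator


def resolve_names(spec, colnames):
    """Resolve a mixed list of names and integer indices into column names.

    Strings are passed through unchanged; anything else is treated as an
    integer index into colnames (zero-based, negative values count from
    the end). Hypothetical helper for illustration only.
    """
    resolved = []
    for item in spec:
        if isinstance(item, str):
            resolved.append(item)
        else:
            # operator.index() accepts Python ints and NumPy integer types,
            # and raises TypeError for floats or other non-integer values.
            resolved.append(colnames[operator.index(item)])
    return resolved
```

For example, `resolve_names(['a', 2, -1], ['a', 'b', 'c', 'd'])` gives `['a', 'c', 'd']`. Applying the same helper to exclude_names, include_names, fill_include_names and fill_exclude_names would be one way to keep the interface uniform across those options, though an out-of-range index currently surfaces as a bare IndexError and a real patch would probably want a clearer message.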

@taldcroft
Member

BTW @Gabriel-p, there is no better way to improve your Python skills than trying to make a patch to astropy. You will learn a lot and get plenty of free feedback on writing better code!

@taldcroft
Member

taldcroft commented May 14, 2018

And you can just start by grepping for exclude_names in every file in io.ascii. That will show you where to think about applying a patch.
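If grep is not at hand, the same search can be sketched in a few lines of Python. The function name `find_occurrences` and the list of file suffixes are assumptions made for this example; the root path would be wherever the io.ascii sources live in your checkout.

```python
from pathlib import Path


def find_occurrences(root, needle='exclude_names', suffixes=('.py', '.pyx', '.c')):
    """Return (path, line number, stripped line) for every line containing needle."""
    hits = []
    for path in sorted(Path(root).rglob('*')):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding='utf-8', errors='replace')
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_occurrences('astropy/io/ascii')` from the top of a source checkout and printing the result should list the places where a patch would need to be considered.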

4 participants