Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved input validation for meta, names and dtype in Table. #3549

Closed
wants to merge 1 commit into from

Conversation

sornars
Copy link

@sornars sornars commented Feb 26, 2015

Addresses #3511. I have 4 failing tests locally but I do not believe them to be related to these changes.

if isinstance(dtype, six.string_types):
raise TypeError('Dtype type {0} not allowed to init Table'
.format(type(dtype)))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like overkill to me. Also there are plenty of things that should be allowable for dtype other than strings.

I think as far the messages themselves it's less useful to tell users what isn't allowed "Type <bla> not allowed" than it is to tell them what they should do "meta argument must be a Mapping", etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, on the messages. I considered using a message informing users what is allowed but I stuck with the same style errors as was used for the rest of the parameter validation in Table. I can change them to something more useful but I think I'd need clarification on what is valid input.

You comments suggests that you may have gotten what the dtypes and names validation is doing in this PR backwards, both are allowing everything except strings as input.

From my understanding everything iterable would be vallid input for dtype and names; the only iterable that could fail are strings. Currently an error about column mismatch is thrown when an attempt to use names and dtypes of different lengths, eg. names='a', dtype='f4'. Theoretically you could pass in a table with matching lengths eg. names='abc' and dtype='fff' which would work but I don't think is intended behaviour as it relies on the user desiring single character column names matched to single character dtypes. There is a check that both names and dtype are iterable in the _check_names_dtype function so there is no need to repeat that check as it is called further down in the constructor. I simply excluded strings as I thought that was the intended behaviour.

What further input validation would you expect to see? The only thing I could think of would be validating that dtypes are valid dtypes; however, that is currently handled by the Column class.

@embray
Copy link
Member

embray commented Feb 27, 2015

@taldcroft can elaborate, but I think #3511 had less to do with type validation and more to do with overall consistency of the inputs.

@taldcroft
Copy link
Member

Thanks for the contribution @sornars. I'm closing since in the end we probably overreacted to the corner case that started this whole thread (which was just a user not appreciating the difference between ('a') and ('a',)). This PR does catch that case, but I think it's too far in the corner.

@taldcroft taldcroft closed this Sep 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants