Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

munge_filename isn't idempotent for filenames containing spaces #2041

Merged
merged 8 commits into from Jan 13, 2015

Conversation

brew
Copy link
Member

@brew brew commented Nov 10, 2014

Applying lib.munge.munge_filename multiple times to its own output doesn't always produce the same result for filenames that originally have spaces.

In the first pass spaces are replaced with hyphens. In subsequent passes, hyphens are removed.

@brew
Copy link
Member Author

brew commented Nov 10, 2014

87fd74c just adds a failing test that demonstrates the problem.

brew added a commit to ckan/ckanext-showcase that referenced this pull request Nov 10, 2014
@davidread
Copy link
Contributor

Is this a problem? The input is English, the output is munged, so why does it need to be idempotent?

@wardi
Copy link
Contributor

wardi commented Nov 10, 2014

+1 for idempotent transformations. We should be able to call this multiple times and get the same result as a single call.

@davidread
Copy link
Contributor

@wardi @brew Why so shy about giving a reason? ;-)

@brew
Copy link
Member Author

brew commented Nov 10, 2014

A reason to fix bad code? I was calling it twice (once from uploader, once from an extension) and I wasted the morning trying to debug what amounted to a missing hyphen in an image filename. Whether I should have been calling it twice is another argument.

@wardi
Copy link
Contributor

wardi commented Nov 10, 2014

idempotence is a great thing for validators because it's often not possible to tell if it will be called more than once. For example if the name transformation is done on a create or update validation step, but then passed back out to the user in a form, when they submit the form again the transformation will be called on the already-transformed data.

I think it's better to use idempotent transformations than, say, implement some kind of tracking to prevent a transformation from being called more than once or have code that fails mysteriously.

@davidread
Copy link
Contributor

@wardi Yes good justification. That makes sense for munge_filename, munge_name and munge_tag. I guess we've just found an edge case for munge_filename that needs fixing up.

However munge_title_to_name is a transformation that is not needed to be idempotent because it is from one field to another, and that's what I was thinking about when I flinched on this.

@brew I'm afraid I take issue at you calling the code 'bad'. Better to be constructive.

@joetsoi joetsoi added the WIP label Nov 18, 2014
@davidread davidread changed the title munge_filename isn't idempotent for filenames containing spaces [WIP] munge_filename isn't idempotent for filenames containing spaces Nov 18, 2014
@brew brew removed the WIP label Nov 20, 2014
@brew brew changed the title [WIP] munge_filename isn't idempotent for filenames containing spaces munge_filename isn't idempotent for filenames containing spaces Nov 21, 2014
@davidread davidread assigned davidread and joetsoi and unassigned davidread Nov 25, 2014
munge = munge_title_to_name(org)
nose_tools.assert_equal(munge, exp)

def test_munge_title_to_name_multiple_pass(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Title to name' is the one identified as not designed to be idempotent. Could you remove the multiple pass test?

@joetsoi joetsoi merged commit ec9b0b8 into ckan:master Jan 13, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants