Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Final changes before end of Phase 1 Development #25

Merged
merged 9 commits into from
Sep 10, 2015

Conversation

tanderegg
Copy link
Collaborator

@sleitner Can you take a look? There is one last pending issue to address, although we may end up skipping it, and it's minor. I'll remove the WIP when we've decided on that, but in the mean time you can go ahead and review the existing commits if you have bandwidth. Thanks!

except ValueError as e:
# Invalid JSON, so don't remove old data
error_message = "Error saving data dictionary: {0}. Data was: {1}".format(e, record)
flash_error(error_message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, but we probably want more explicit instructions, so that people know to avoid special characters and limit the amount of text?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this happen when adding a data dictionary to a new dataset, no matter what I put into the fields. I think this is because the two stage dataset creation puts the data dictionary update logic into a weird state. I will look into it more tomorrow to see if there's a better fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Check out the logs. Ultimately it seems to occur when CKAN attempts some text conversion before sending the data to the Datastore. It chokes on special characters that result from things like new line formatting, and it also seems to happen when there's too much text, which, for some reason, results in what might be a warning that includes special characters. Lots of implied question marks in the above. Happy hunting :)

@sleitner
Copy link
Member

sleitner commented Sep 8, 2015

Other than the error message for broken dictionaries, it looks good to me.

@tanderegg tanderegg changed the title WIP: Final changes before end of Phase 1 Development Final changes before end of Phase 1 Development Sep 9, 2015
@tanderegg
Copy link
Collaborator Author

Latest commit has fixed the remaining outstanding issues, so I've removed the WIP label and we're good to merge. @sleitner could you do a quick review first?


if record:
# Cleanse of errant unicode characters
record = unicodedata.normalize('NFKD', record).encode('ascii', 'ignore')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Does this fix or just patch the problem for release? When ckan reads the data dictionary back from the database, does the formatting remain or does it get stripped out? Do very long entries still get cut off and have some special characters appended?

If not probably worth a comment somewhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It normalizes non-ascii characters to their ascii equivalent, if one exists, so formatting should be kept. As far as I can tell, this fixes the long entry problem as well, I created one with 500 or so words that worked fine. New lines are still not being retained though, but that was existing behavior. Let me know if you know where that is happening.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, but it sounds like this is good to go.

@sleitner
Copy link
Member

Nice work, Tim!

sleitner added a commit that referenced this pull request Sep 10, 2015
Final changes before end of Phase 1 Development
@sleitner sleitner merged commit 6f18ad5 into cfpb:master Sep 10, 2015
@tanderegg
Copy link
Collaborator Author

Thanks! And thanks for writing easy to understand code :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants