Fix UTF-8 representation of non-ASCII characters #252

Merged
simleo merged 3 commits into ResearchObject:master from orviz:non_ascii_chars_support
Mar 25, 2026

@orviz
Contributor

@orviz orviz commented Mar 24, 2026

Closes #251: add ensure_ascii=False to the json.dumps call so that non-ASCII characters are serialized as raw UTF-8 instead of \uXXXX escape sequences.

I have structured this PR in two commits to follow a TDD approach:

  • The first commit adds a failing test case that asserts correct UTF-8 representation.
  • The second commit provides the fix.

You can verify the assert errors by checking out the first commit and running the new test.
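
For readers unfamiliar with the flag, the behavior can be sketched in isolation as follows. This is a minimal illustration using only the standard library, not the actual code or test from this PR; the `metadata` dictionary and its contents are hypothetical.

```python
import json

# Hypothetical metadata fragment (illustrative only, not taken from the PR).
metadata = {"name": "Análisis de señales"}

# Default behavior (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(metadata)
assert escaped == '{"name": "An\\u00e1lisis de se\\u00f1ales"}'

# With ensure_ascii=False the characters are kept as-is in the output string.
raw = json.dumps(metadata, ensure_ascii=False)
assert raw == '{"name": "Análisis de señales"}'

# Both forms are valid JSON and decode to the same data.
assert json.loads(escaped) == json.loads(raw) == metadata
```

Note that the two outputs are equivalent for any JSON parser; the difference only affects how the file reads to a human or to tools that inspect the raw text. When writing the unescaped form to disk, the file should be opened with an explicit `encoding="utf-8"` so the result does not depend on the platform default.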

@simleo
Collaborator

simleo commented Mar 24, 2026

Thanks for this contribution! Before merging, I need you to update author and copyright info. Author info is in CITATION.cff, setup.py and rocrate/__init__.py. Copyright info is in README.md, rocrate/__init__.py and Python file headers. You can see 09bff64 and 04128fa for an example of similar changes.

@orviz
Contributor Author

orviz commented Mar 25, 2026

@simleo I hope the requested information has now been added correctly, including prepending the author name in the author-related files. Thanks!

@simleo
Collaborator

simleo commented Mar 25, 2026

@orviz sorry, I forgot to highlight the fact that authors are in alphabetical order by surname, in all files where they appear. Could you put them in the right order?

orviz force-pushed the non_ascii_chars_support branch from 7e182fe to 8219ee4 on March 25, 2026, 12:24

@orviz
Contributor Author

orviz commented Mar 25, 2026

@simleo done!

simleo merged commit 17a8ce1 into ResearchObject:master on Mar 25, 2026
13 checks passed

Development

Successfully merging this pull request may close these issues.

ASCII escaping in metadata serialization prevents UTF-8 characters (like tildes) from rendering correctly

2 participants