Jira string truncation does not take into account UTF-8 encoding

Jira uses UTF-8 encoding for Unicode characters. This means that some characters count as more than one character toward the length of the string. For example, 🀍 is counted as 2 characters because it takes 4 bytes. ~~With the current spec, it seems possible for some to take up 6 bytes which should be 3 characters? Although I couldn't find any examples of this.~~
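
To make the mismatch concrete, here is a small Python check of the example character. Python's `len()` counts code points, so it reports 1 where Jira is said to count 2. Counting UTF-16 code units happens to give 2 for this character. Whether that is exactly how Jira counts is an assumption; the only thing stated here is "4 bytes, 2 characters".

```python
tile = "🀍"

# len() counts Unicode code points, so our current check sees one character.
code_points = len(tile)

# The UTF-8 encoding of this character is four bytes long.
utf8_bytes = len(tile.encode("utf-8"))

# Assumption: counting UTF-16 code units reproduces Jira's "2 characters".
# Each UTF-16 code unit is two bytes, hence the division.
utf16_units = len(tile.encode("utf-16-le")) // 2

print(code_points, utf8_bytes, utf16_units)  # 1 4 2
```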

We should adjust our string truncation to take this into account. I'm hoping there are some standard string encoding functions we can use for this.
https://github.com/mozilla/jira-bugzilla-integration/blob/ac3dec121052c0894b09bc55321a4ce6db9fe299/jbi/jira/utils.py#L13-L15

Jira string truncation does not take into account UTF-8 encoding #1135

	if max_length > 0 and len(jira_output) > max_length:
	    # Truncate on last word.
	    jira_output = jira_output[:max_length].rsplit(maxsplit=1)[0]
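
One possible direction is a rough sketch like the following, not a tested patch. It assumes Jira counts any character that needs 4 bytes in UTF-8 (code points above U+FFFF) as 2 characters and everything else as 1. `jira_length` and `truncate_for_jira` are hypothetical names, not existing functions in `jbi`. The final step keeps the current "truncate on last word" behaviour.

```python
def jira_length(text: str) -> int:
    """Length of `text` under the assumed Jira counting rule."""
    # Assumption: code points above U+FFFF (4 bytes in UTF-8) count as 2.
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


def truncate_for_jira(text: str, max_length: int) -> str:
    """Hypothetical replacement for the len()-based truncation."""
    if max_length <= 0 or jira_length(text) <= max_length:
        return text

    used = 0  # characters consumed, as Jira is assumed to count them
    cut = 0   # index into `text`, in Python code points
    for ch in text:
        width = 2 if ord(ch) > 0xFFFF else 1
        if used + width > max_length:
            break
        used += width
        cut += 1

    # Truncate on last word, as the current code does.
    words = text[:cut].rsplit(maxsplit=1)
    return words[0] if words else ""
```

With this sketch, `truncate_for_jira("hello 🀍🀍🀍", 9)` cuts after the first tile (8 counted characters) and then drops the trailing word, returning `"hello"`. The existing `len()` check would not truncate this string at all, because Python sees only 9 code points.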
