Non-Latin symbols in channel names #4

Closed
plankter opened this issue Jan 19, 2021 · 6 comments
Labels
bug Something isn't working

Comments

@plankter

plankter commented Jan 19, 2021

I don't know whether this is an issue with tifffile or with the xtiff library, but when channel names contain non-Latin symbols, the to_tiff method crashes:

xtiff/xtiff.py

Line 311 in d0a7b5c

writer.save(data=img, photometric='MINISBLACK', compress=compression, description=description,

Traceback (most recent call last):
  File "/home/anton/bblab/imctools/imctools/converters/mcdfolder2imcfolder.py", line 59, in mcdfolder_to_imcfolder
    imc_writer.write_imc_folder(create_zip=create_zip)
  File "/home/anton/bblab/imctools/imctools/io/imc/imcwriter.py", line 77, in write_imc_folder
    acquisition_data.save_ome_tiff(
  File "/home/anton/bblab/imctools/imctools/data/acquisitiondata.py", line 136, in save_ome_tiff
    to_tiff(
  File "/home/anton/bblab/imctools/venv/lib/python3.8/site-packages/xtiff.py", line 311, in to_tiff
    writer.save(data=img, photometric='MINISBLACK', compress=compression, description=description,
  File "/home/anton/bblab/imctools/venv/lib/python3.8/site-packages/tifffile/tifffile.py", line 1836, in write
    addtag(270, 's', 0, description, writeonce=True)
  File "/home/anton/bblab/imctools/venv/lib/python3.8/site-packages/tifffile/tifffile.py", line 1772, in addtag
    value = bytestr(value, 'ascii') + b'\0'
  File "/home/anton/bblab/imctools/venv/lib/python3.8/site-packages/tifffile/tifffile.py", line 15622, in bytestr
    return s.encode(encoding) if isinstance(s, str) else s
UnicodeEncodeError: 'ascii' codec can't encode character '\u03b5' in position 1241: ordinal not in range(128)
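
The failure can be reproduced without any image data. The sketch below copies the one-line `bytestr` helper shown in the last traceback frame and feeds it a description string containing U+03B5; the channel name `CD3ε` and the `try_encode` wrapper are made up for illustration and are not part of tifffile or xtiff.

```python
def bytestr(s, encoding="ascii"):
    # Copied from the last frame of the traceback: str is encoded, bytes pass through.
    return s.encode(encoding) if isinstance(s, str) else s


def try_encode(description):
    """Return None on success, or the UnicodeEncodeError that was raised."""
    try:
        bytestr(description, "ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


# Hypothetical channel name containing a Greek small epsilon (U+03B5).
error = try_encode('<Channel Name="CD3\u03b5" />')
print(error)
```

Any character outside the 7-bit ASCII range in the description string triggers the same exception, so the crash is not specific to Greek letters.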

@jwindhager
Contributor

Interesting. I'd vaguely guess that this is a problem with tifffile. Happy to take pull requests though, should you find a problem in our code, @plankter.

@plankter
Author

My assumption is that the encoding in this code

xtiff/xtiff.py

Line 301 in d0a7b5c

with BytesIO() as description_buffer:
should be ascii instead of utf-8. With that change it works. The reason is https://github.com/cgohlke/tifffile/blob/75a11f2d781705a64151d3d329e47187e3be82a8/tifffile/tifffile.py#L1264 in tifffile, which states: "The subject of the image. Must be 7-bit ASCII." But I am also confused that it says "Cannot be used with the ImageJ or OME formats."

with BytesIO() as description_buffer:
    ome_xml.write(description_buffer, encoding='ascii', xml_declaration=True)
    description = description_buffer.getvalue().decode('ascii')
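
Why the ASCII variant avoids the crash: assuming `ome_xml` is a standard-library `xml.etree.ElementTree.ElementTree` (the `write(..., encoding=..., xml_declaration=...)` call matches that API, but the thread does not confirm it), the serializer replaces characters that the target encoding cannot represent with numeric character references. The sketch below uses a made-up single-element document and a hypothetical `serialize_ascii` helper to show this.

```python
from io import BytesIO
import xml.etree.ElementTree as ET


def serialize_ascii(tree):
    """Serialize an ElementTree to a str that contains only ASCII characters."""
    with BytesIO() as buffer:
        tree.write(buffer, encoding="ascii", xml_declaration=True)
        return buffer.getvalue().decode("ascii")


# Hypothetical element; real OME-XML is far larger and uses namespaces.
root = ET.Element("Channel", {"Name": "CD3\u03b5"})
description = serialize_ascii(ET.ElementTree(root))
print(description)

# Parsing the text again restores the original character.
round_trip = ET.fromstring(description).get("Name")
```

The epsilon is written as `&#949;`, so the resulting string is pure ASCII while an XML parser still recovers the original channel name.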

@plankter
Author

Example MCD file with non-Latin channel name characters is available at server_homes/thoch/Data/Data_for_Anton/Slide_7_679.zip

@jwindhager
Contributor

But I am also confused that it says "Cannot be used with the ImageJ or OME formats."

To my understanding, this only applies to situations in which the OME-XML is generated by tifffile itself. As xtiff specifies the OME-XML as a description for a "regular" tiff written by tifffile, this shouldn't be an issue.

@jwindhager jwindhager added the bug Something isn't working label Jan 20, 2021
jwindhager added a commit that referenced this issue Jan 20, 2021
@jwindhager
Contributor

with BytesIO() as description_buffer:
    ome_xml.write(description_buffer, encoding='ascii', xml_declaration=True)
    description = description_buffer.getvalue().decode('ascii')

Implemented this in v0.6.4, please upgrade.

@jwindhager
Contributor

@plankter I switched back to UTF-8 in v0.7.6 (just released). While the TIFF standard technically only supports ASCII, OME-XML requires UTF-8, see #8. The error message reported in this issue should no longer occur though, as the encoding is now handled by xtiff.
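
One way a caller can take over the encoding, judging only from the `bytestr` line in the traceback above: that helper encodes `str` values but returns `bytes` unchanged, so a description that is already UTF-8 encoded never reaches the ASCII codec. This is a sketch of that idea, not a description of what xtiff v0.7.6 actually does; `encode_description` and the sample channel name are hypothetical.

```python
def bytestr(s, encoding="ascii"):
    # Copied from the traceback in this issue: only str values are encoded.
    return s.encode(encoding) if isinstance(s, str) else s


def encode_description(description):
    """Hypothetical helper: encode the OME-XML text as UTF-8 up front."""
    return description.encode("utf-8")


# Hypothetical description containing a Greek small epsilon (U+03B5).
raw = encode_description('<Channel Name="CD3\u03b5" />')

# Mirrors the addtag line from the traceback; bytes skip the ASCII encode.
tag_value = bytestr(raw, "ascii") + b"\0"
print(tag_value)
```

The trade-off is the one mentioned above: the stored tag is no longer strict 7-bit ASCII as the TIFF standard expects, but it matches the UTF-8 requirement of OME-XML.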
