Charset in Caption-Abstract #3249

Closed
jancla opened this issue Sep 1, 2020 · 5 comments

jancla commented Sep 1, 2020

Why isn't the Caption-Abstract field used in images uploaded to DW? DW seems to clear the content on upload.

Entering text in the Caption field with the Media Manager works, and the text is visible in DW pages, BUT the text is stored in a strange encoding that other applications cannot read.

Before upload to DW

$ exiftool -Caption-Abstract image-orginal.jpg

gives

Caption-Abstract : Swedish åäö and ÅÄÖ

The same image uploaded to DW and edited in Media Manager taken from /data/media...

$ exiftool -Caption-Abstract image-orginal-dw.jpg

gives

Caption-Abstract : Swedish åäö and ÅÄÖ

Looks like mojibake to me.
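
For readers unfamiliar with the term, here is a minimal Python sketch of how this kind of mojibake arises in general. It is an illustration only, not DokuWiki or exiftool code: it assumes the caption bytes are written in one encoding (UTF-8 or Windows cp1252) and read back with the other.

```python
# Illustration only: the same caption, written in one encoding and read
# back under the assumption of another.
caption = "Swedish åäö and ÅÄÖ"

# Case 1: UTF-8 bytes interpreted as Windows cp1252.
# Every non-ASCII letter is two bytes in UTF-8, so it shows up as two
# unrelated characters.
utf8_bytes = caption.encode("utf-8")
misread_as_cp1252 = utf8_bytes.decode("cp1252")

# Case 2: cp1252 bytes interpreted as UTF-8.
# The single bytes are not valid UTF-8 sequences, so a tolerant reader
# substitutes U+FFFD replacement characters.
cp1252_bytes = caption.encode("cp1252")
misread_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

print(misread_as_cp1252)
print(misread_as_utf8)

# Case 1 is reversible as long as no bytes were lost along the way.
repaired = misread_as_cp1252.encode("cp1252").decode("utf-8")
print(repaired == caption)
```

Case 2 is lossy once the replacement characters have been written back, which is why the encoding needs to be identified before the file is edited again.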

I'm on macOS and DokuWiki Hogfather.

Collaborator

Klap-in commented Sep 1, 2020

The metadata fields available in the image metadata editor are defined in this file: https://xref.dokuwiki.org/reference/dokuwiki/nav.html?conf/mediameta.php.html

I guess that these lines provide the caption.
https://github.com/splitbrain/dokuwiki/blob/a3cafac5ceb903787e591f90795b7927a492c09c/conf/mediameta.php#L42-L47

By using conf/mediameta.local.php you can override the original definition. If your caption is not recognized, you could try to extend the lookup list of the definition.

About the encoding: the steps where the data is retrieved or stored are listed below. I have no suggestions myself; maybe others have better ones.
When the metadata editor is shown, the data is retrieved and cleaned by
https://github.com/splitbrain/dokuwiki/blob/a3cafac5ceb903787e591f90795b7927a492c09c/inc/media.php#L166-L167

The input of the form is saved by
https://github.com/splitbrain/dokuwiki/blob/a3cafac5ceb903787e591f90795b7927a492c09c/inc/media.php#L56-L97
which uses the JpegMeta object, which performs a conversion depending on the type of metadata tag.

Author

jancla commented Sep 1, 2020

I looked at the Caption-Abstract field more carefully...

DokuWiki stores the field in Windows Latin1 encoding.

exiftool -L -Caption-Abstract image-orginal.jpg gives

Caption-Abstract : Swedish åäö and ÅÄÖ

OK, that's a bit disappointing, but it's not a bug.

It is not possible to copy commented photos into the DW media folder and use their metadata without first converting both the filenames and the metadata text fields.

Nowadays, of course, most software uses UTF-8, so there are no interoperability problems.
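
The text-field part of that conversion could be scripted roughly as follows. This is a sketch under stated assumptions, not DokuWiki or JpegMeta code: `decode_caption` and `recode_caption` are hypothetical helper names, cp1252 is assumed as the target because of the Windows Latin1 observation above, and reading or writing the bytes inside the JPEG itself is left to a tool such as exiftool.

```python
def decode_caption(raw: bytes) -> str:
    """Decode caption bytes, preferring UTF-8 and falling back to cp1252.

    Heuristic: non-ASCII cp1252 text is almost never valid UTF-8, so a
    failed strict UTF-8 decode is taken as a sign of a single-byte encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # cp1252 leaves a few byte values undefined; substitute U+FFFD
        # for those instead of raising.
        return raw.decode("cp1252", errors="replace")


def recode_caption(raw: bytes, target: str = "cp1252") -> bytes:
    """Re-encode caption bytes into the target charset (hypothetical helper).

    Characters the target cannot represent are replaced with '?'.
    """
    return decode_caption(raw).encode(target, errors="replace")


if __name__ == "__main__":
    original = "Swedish åäö and ÅÄÖ".encode("utf-8")
    converted = recode_caption(original)  # UTF-8 in, cp1252 out
    print(len(original), len(converted))
```

Because the target is a parameter, the same helper covers the opposite direction (`recode_caption(raw, "utf-8")`) when taking images out of the media folder for use in UTF-8 based tools.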

The problem seems to have popped up before: https://forum.dokuwiki.org/d/10957-exif-ipct-and-utf8-problem

I'll close this since it's not a bug. The charset used could be configurable, though, or at least follow $conf['fnencode'] = 'utf-8';

jancla closed this as completed Sep 1, 2020
phy25 added the upstream label Sep 7, 2020
Collaborator

phy25 commented Sep 7, 2020

This sounds like a feature request to JpegMeta, but that library is hardly maintained...

Author

jancla commented Sep 8, 2020

Yes, I suppose it's a feature request... There are many feature requests...

JpegMeta is old and contains too much low-level code. It's hard to maintain and should be replaced by a PHP EXIF library. That's a lot of work.

Collaborator

Klap-in commented Sep 8, 2020

A similar library is suggested in this thread: #1970 (comment), and some further development can be found at https://github.com/LycheeOrg/php-exif
