Problems processing incorrect LangAlt values #1481

Closed
postscript-dev opened this issue Mar 2, 2021 · 4 comments · Fixed by #1482

@postscript-dev
Collaborator

postscript-dev commented Mar 2, 2021

Describe the bug

The processing problems fall into four categories.

  1. No language value
  2. Empty language value
  3. Mismatched and/or incorrect positioning of quotation marks
  4. Invalid characters in language part

To Reproduce

The problems have been observed on the old-master and the 0.27-maintenance branches (though expected on previous branches also).

  1. No language value
    This causes a segmentation fault when running
exiv2 -M "set Xmp.dc.title lang= test1-1" any_image.jpeg
exiv2 -M "set Xmp.dc.title lang=\" test1-2" any_image.jpeg

This applies to any LangAlt tag and any image.

  2. Empty language value
    This is found when running
exiv2 -M "set Xmp.dc.title lang=\"\" test2" any_image.jpeg

This applies to any LangAlt tag and any image.

  3. Mismatched and/or incorrect positioning of quotation marks
    e.g.
exiv2 -M "set Xmp.dc.title lang=\"\"test3-1" any_image.jpeg

exiv2 -M "set Xmp.dc.title lang=\"test3-2" any_image.jpeg

exiv2 -M "set Xmp.dc.title lang=\"en-UK test3-3" any_image.jpeg
exiv2 -M "set Xmp.dc.title lang=en-US\" test3-4" any_image.jpeg

exiv2 -M "set Xmp.dc.title lang=test3-5\"" any_image.jpeg
exiv2 -M "set Xmp.dc.title lang=test3-6\"\"" any_image.jpeg

This applies to any LangAlt tag and any image.

  4. Invalid characters in language part
    e.g.
exiv2 -M "set Xmp.dc.title lang=en-UK- test4-1" any_image.jpeg
exiv2 -M "set Xmp.dc.title lang=en=UK test4-2" any_image.jpeg

This applies to any LangAlt tag and any image.

Expected behavior

Exiv2 is expected to cope with a LangAlt statement even if it is not well formed - usually by throwing an exception. The examples in 2) and 3) are accepted and returned by the API, giving the impression that they are valid.

Desktop (please complete the following information):

  • OS: Windows 10 (MinGW64 on MinGW/MSYS2) - expected on all
  • Compiler & Version: g++.exe (Rev6, Built by MSYS2 project) 10.2.0
  • Compilation mode and/or compiler flags: Standard CMakeLists.txt values

Additional context

The problem occurs because of issues in the src/value.cpp file, namely the

int LangAltValue::read(const std::string& buf);

function.

  1. No language value
    The segmentation fault is due to not checking the result of
std::string::size_type pos = buf.find_first_of(' ');
  2. Empty language value
    Because the language is not checked for being empty, the value is added to LangAltValue's ValueType and the xmpsdk does not filter it out. I don't think this is allowed under the XMP specification but others may know better. Using the above example,
$ exiv2 -px any_image.jpeg
### <SNIP>
Xmp.dc.title                                 LangAlt     1  lang="" test2
  3. Mismatched and/or incorrect positioning of quotation marks
    Pairs of double quotes are accepted around the outside of the language value.

  4. Invalid characters in language part
    The accepted language value is defined in IETF RFC 3066 and I think that it should only contain A-Z (either case), 0-9 and one '-'.
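
To make the four checks above concrete, here is a minimal, standalone sketch of a stricter parse. It is not the Exiv2 code and not necessarily how the fix is implemented: `parseLangAlt` and `isPlausibleLanguage` are hypothetical names, `std::invalid_argument` stands in for whatever exception Exiv2 would throw, and the language rule is the simplified one suggested in this issue (letters, then optionally one '-' followed by letters or digits) rather than the full RFC 3066 grammar.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// ASCII-only character tests, so the result does not depend on the locale.
static bool isAsciiAlpha(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
static bool isAsciiAlnum(char c) {
    return isAsciiAlpha(c) || (c >= '0' && c <= '9');
}

// Hypothetical helper: letters, optionally followed by one '-' and
// letters/digits. This is the simplified rule from the issue text.
bool isPlausibleLanguage(const std::string& lang) {
    const std::string::size_type dash = lang.find('-');
    const std::string primary = lang.substr(0, dash);  // whole string if no '-'
    if (primary.empty()) return false;
    for (char c : primary)
        if (!isAsciiAlpha(c)) return false;
    if (dash == std::string::npos) return true;
    const std::string sub = lang.substr(dash + 1);
    if (sub.empty()) return false;
    for (char c : sub)
        if (!isAsciiAlnum(c)) return false;  // a second '-' is rejected here
    return true;
}

// Hypothetical stand-in for the parsing step: returns {language, text}.
// Input without a "lang=" prefix is filed under "x-default" (assumed default).
std::pair<std::string, std::string> parseLangAlt(const std::string& buf) {
    const std::string prefix = "lang=";
    if (buf.compare(0, prefix.size(), prefix) != 0)
        return {"x-default", buf};

    // Category 1: find_first_of may return npos, so check before using it.
    const std::string::size_type pos = buf.find_first_of(' ');
    std::string lang = (pos == std::string::npos)
                           ? buf.substr(prefix.size())
                           : buf.substr(prefix.size(), pos - prefix.size());
    const std::string text =
        (pos == std::string::npos) ? std::string() : buf.substr(pos + 1);

    // Category 3: a leading quote must be matched by a trailing quote.
    if (!lang.empty() && lang.front() == '"') {
        if (lang.size() < 2 || lang.back() != '"')
            throw std::invalid_argument("mismatched quotes: " + buf);
        lang = lang.substr(1, lang.size() - 2);
    }

    // Categories 2 and 4: empty or malformed language values are rejected.
    if (!isPlausibleLanguage(lang))
        throw std::invalid_argument("invalid language: " + buf);

    return {lang, text};
}
```

Under these assumptions every example in "To Reproduce" is rejected with an exception, while `lang="en-GB" hello` and `lang=de-DE Hallo Welt` are accepted.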

@postscript-dev
Collaborator Author

I will look at this more tomorrow.

Also, the templates used when posting an issue are missing. They were stored in the master branch and when this was renamed to old-master, it looks like the setting has not transferred.
You can find the templates here.

@clanmills
Collaborator

@postscript-dev If I grant you write access to our repos, can you deal with this?

Please read this to understand what's going on with master/old-master. #1466 (comment)

Here's my situation. I worked 14 hours/day last week and again this week on BMFF support. #1475 (comment)

After that I have to ship Exiv2 v0.27.4 RC1 on 2021-03-31. #1018 (comment)

I was 70 in January and the plan was to retire.

@postscript-dev
Collaborator Author

> @postscript-dev If I grant you write access to our repos, can you deal with this?
>
> Please read this to understand what's going on with master/old-master. #1466 (comment)

Yes, I intended to submit - hopefully today. I posted so that I could reference this in the testing and commit message and didn't intend to burden you with the work.

> Here's my situation. I worked 14 hours/day last week and again this week on BMFF support. #1473
>
> I was 70 in January and the plan was to retire.

Thank you for all your hard work, it is appreciated.

@clanmills
Collaborator

@postscript-dev That's the man, Mr PostScript. Thanks. I'll open the repos now. I know you'll treat it with the respect it deserves. Welcome aboard.

When you submit the PR, I'll review and approve. The PR will merge when the CI is green.

@clanmills clanmills added this to the v0.27.4 milestone Mar 3, 2021
postscript-dev pushed a commit to postscript-dev/exiv2 that referenced this issue Mar 6, 2021
+ Fix segmentation faults in langAlt parse
+ Fix mismatched quotation marks and incorrect values
+ Add Python testing
  + Some tests commented out as quotation marks are filtered, preventing them
    from running.
Closes Exiv2#1481.