Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use-of-uninitialized-value in XMPMeta::RegisterNamespace #1223

Closed
henices opened this issue May 29, 2020 · 35 comments
Closed

use-of-uninitialized-value in XMPMeta::RegisterNamespace #1223

henices opened this issue May 29, 2020 · 35 comments
Assignees
Labels

Comments

@henices
Copy link

henices commented May 29, 2020

Version : commit 356f862
OS : Fedora Linux

How to reproduce:

build exiv2 binary with MemorySanitizer, run following command:

exiv2 ./use-of-uninitialized-value-RegisterNamespace.tiff

==3642951==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7fb8b831729e in XMPMeta::RegisterNamespace(char const*, char const*) /home/henices/tests/exiv2/xmpsdk/src/XMPMeta.cpp:1042:7
    #1 0x7fb8b84151ae in StartNamespaceDeclHandler(void*, char const*, char const*) /home/henices/tests/exiv2/xmpsdk/src/ExpatAdapter.cpp:263:2
    #2 0x7fb8b6e1c4e1  (/lib64/libexpat.so.1+0x54e1)
    #3 0x7fb8b6e1f424  (/lib64/libexpat.so.1+0x8424)
    #4 0x7fb8b6e22752  (/lib64/libexpat.so.1+0xb752)
    #5 0x7fb8b6e2376f  (/lib64/libexpat.so.1+0xc76f)
    #6 0x7fb8b6e20c42  (/lib64/libexpat.so.1+0x9c42)
    #7 0x7fb8b6e2210d  (/lib64/libexpat.so.1+0xb10d)
    #8 0x7fb8b6e25e7f in XML_ParseBuffer (/lib64/libexpat.so.1+0xee7f)
    #9 0x7fb8b841a2ab in ExpatAdapter::ParseBuffer(void const*, unsigned long, bool) /home/henices/tests/exiv2/xmpsdk/src/ExpatAdapter.cpp:135:11
    #10 0x7fb8b82d3919 in ProcessUTF8Portion(XMLParserAdapter*, unsigned char const*, unsigned long, bool) /home/henices/tests/exiv2/xmpsdk/src/XMPMeta-Parse.cpp:1053:39
    #11 0x7fb8b82cdff5 in XMPMeta::ParseFromBuffer(char const*, unsigned long, unsigned long) /home/henices/tests/exiv2/xmpsdk/src/XMPMeta-Parse.cpp:1224:23
    #12 0x7fb8b81d4d6e in WXMPMeta_ParseFromBuffer_1 /home/henices/tests/exiv2/xmpsdk/src/WXMPMeta.cpp:1274:9
    #13 0x7fb8b7c1f5b8 in TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::ParseFromBuffer(char const*, unsigned long, unsigned long) /home/henices/tests/exiv2/xmpsdk/include/client-glue/TXMPMeta.incl_cpp:903:2
    #14 0x7fb8b7c1edac in TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::TXMPMeta(char const*, unsigned long) /home/henices/tests/exiv2/xmpsdk/include/client-glue/TXMPMeta.incl_cpp:169:9
    #15 0x7fb8b7c0b730 in Exiv2::XmpParser::decode(Exiv2::XmpData&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) /home/henices/tests/exiv2/src/xmp.cpp:606:18
    #16 0x7fb8b80c74d0 in Exiv2::Internal::TiffDecoder::decodeXmp(Exiv2::Internal::TiffEntryBase const*) /home/henices/tests/exiv2/src/tiffvisitor_int.cpp:406:17
    #17 0x7fb8b80c353e in Exiv2::Internal::TiffDecoder::decodeTiffEntry(Exiv2::Internal::TiffEntryBase const*) /home/henices/tests/exiv2/src/tiffvisitor_int.cpp:550:13
    #18 0x7fb8b80c2cd4 in Exiv2::Internal::TiffDecoder::visitEntry(Exiv2::Internal::TiffEntry*) /home/henices/tests/exiv2/src/tiffvisitor_int.cpp:314:9
    #19 0x7fb8b7f9bceb in Exiv2::Internal::TiffEntry::doAccept(Exiv2::Internal::TiffVisitor&) /home/henices/tests/exiv2/src/tiffcomposite_int.cpp:895:17
    #20 0x7fb8b7f9b9d6 in Exiv2::Internal::TiffComponent::accept(Exiv2::Internal::TiffVisitor&) /home/henices/tests/exiv2/src/tiffcomposite_int.cpp:890:50
    #21 0x7fb8b7f9cfe9 in Exiv2::Internal::TiffDirectory::doAccept(Exiv2::Internal::TiffVisitor&) /home/henices/tests/exiv2/src/tiffcomposite_int.cpp:918:19
    #22 0x7fb8b7f9b9d6 in Exiv2::Internal::TiffComponent::accept(Exiv2::Internal::TiffVisitor&) /home/henices/tests/exiv2/src/tiffcomposite_int.cpp:890:50
    #23 0x7fb8b8048eec in Exiv2::Internal::TiffParserWorker::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned long, unsigned int, void (Exiv2::Internal::TiffDecoder::* (*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, Exiv2::Internal::IfdId))(Exiv2::Internal::TiffEntryBase const*), Exiv2::Internal::TiffHeaderBase*) /home/henices/tests/exiv2/src/tiffimage_int.cpp:1622:22
    #24 0x7fb8b7a6ccf4 in Exiv2::TiffParser::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned long) /home/henices/tests/exiv2/src/tiffimage.cpp:260:16
    #25 0x7fb8b7a6a9fd in Exiv2::TiffImage::readMetadata() /home/henices/tests/exiv2/src/tiffimage.cpp:187:24
    #26 0x699d46 in Action::Print::printSummary() /home/henices/tests/exiv2/src/actions.cpp:260:16
    #27 0x698d48 in Action::Print::run(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) /home/henices/tests/exiv2/src/actions.cpp:215:62
    #28 0x4a0c5a in main /home/henices/tests/exiv2/src/exiv2.cpp:80:29
    #29 0x7fb8b6e861a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
    #30 0x42369d in _start (/home/henices/tests/exiv2/build_msan/bin/exiv2+0x42369d)

  Uninitialized value was created by a heap allocation
    #0 0x44fa0d in malloc /tmp/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:901:3
    #1 0x7fb8b6e1c71f  (/lib64/libexpat.so.1+0x571f)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/henices/tests/exiv2/xmpsdk/src/XMPMeta.cpp:1042:7 in XMPMeta::RegisterNamespace(char const*, char const*)
Exiting

Please download testcase in https://github.com/henices/pocs/raw/master/use-of-uninitialized-value-RegisterNamespace.tiff

Thanks.

@henices henices added the bug label May 29, 2020
@clanmills clanmills self-assigned this May 29, 2020
@clanmills clanmills added this to the v0.27.3 milestone May 29, 2020
@clanmills
Copy link
Collaborator

Thanks. I've built branch 0.27-maintenance with -DEXIV2_TEAM_USE_SANITIZERS=On on macOS and it doesn't reproduce this. I'll build on Fedora.

I hope to get the fix for this into exiv2 v0.27.3 RC2 which is scheduled for Sunday 2020-05-31.

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@clanmills
Copy link
Collaborator

I'm having a sniff in your file while Fedora's building. Here's what I see:

530 rmills@rmillsmbp:~/Downloads $ exiv2 -pR ~/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff 
STRUCTURE OF TIFF FILE (MM): /Users/rmills/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff
 address |    tag                              |      type |    count |    offset | value
    6402 | 0x0100 ImageWidth                   |     SHORT |        1 |           | 174
    6414 | 0x0101 ImageLength                  |     SHORT |        1 |           | 38
    6426 | 0x0102 BitsPerSample                |     SHORT |        4 |      6586 | 8 8 8 8
    6438 | 0x0103 Compression                  |     SHORT |        1 |           | 5
    6450 | 0x0106 PhotometricInterpretation    |     SHORT |        1 |           | 2
    6462 | 0x0111 StripOffsets                 |      LONG |        1 |           | 8
    6474 | 0x0112 Orientation                  |     SHORT |        1 |           | 1
    6486 | 0x0115 SamplesPerPixel              |     SHORT |        1 |           | 4
    6498 | 0x0116 RowsPerStrip                 |     SHORT |        1 |           | 38
    6510 | 0x0117 StripByteCounts              |      LONG |        1 |           | 6391
    6522 | 0x011c PlanarConfiguration          |     SHORT |        1 |           | 1
    6534 | 0x013d Predictor                    |     SHORT |        1 |           | 2
    6546 | 0x0152 ExtraSamples                 |     SHORT |        1 |           | 1
    6558 | 0x0153 SampleFormat                 |     SHORT |        4 |      6594 | 1 1 1 1
    6570 | 0x02bc XMLPacket                    |      BYTE |      323 |      6602 | <x:xmpmeta xmlns:x="adobe:ns:met ...
END /Users/rmills/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff
531 rmills@rmillsmbp:~/Downloads $ dd bs=1 skip=6602 count=323 if=use-of-uninitialized-value-RegisterNamespace.tiff 
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.1.2">
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about=""
            xmlns:tiff="http://ns.adobe.com/tiff/1.0/">
         <tiff:Compression>5</tiff:Compression>
      </rdf:Description>
   </rdf:RDF>
</x:xmpmeta>
323+0 records in
323+0 records out
323 bytes transferred in 0.001128 secs (286358 bytes/sec)
532 rmills@rmillsmbp:~/Downloads $ exiv2 -vV | grep Core
xmlns=Iptc4xmpCore:http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/
533 rmills@rmillsmbp:~/Downloads $ exiv2 -vV | grep xmptk
534 rmills@rmillsmbp:~/Downloads $ 

So your file want to register the namespace xmptk="XMP Core 5.1.2". I don't much about XMP/xml and namespaces. There are about 60 pre-registered namespaces. I think I can add that to make this "go away", without resolving why it doesn't work!

The current code can handle x:xmptk="XMP Core 4.4.0-Exiv2" in my favourite test file:

549 rmills@rmillsmbp:~/Downloads $ exiv2 -pX http://clanmills.com/Stonehenge.jpg  | xmllint --pretty 1 -
<?xml version="1.0"?>
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="" xmp:Rating="0" xmp:ModifyDate="2015-07-16T20:25:28+01:00">
      <dc:description>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Classic View</rdf:li>
        </rdf:Alt>
      </dc:description>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
550 rmills@rmillsmbp:~/Downloads $ 

@clanmills
Copy link
Collaborator

This doesn't link on Fedora, and I'm not familiar with the platform. I'm going to ask @D4N to reproduce this.

/usr/bin/ld: cannot find /usr/lib64/libasan.so.5.0.0
collect2: error: ld returned 1 exit status
make[2]: *** [src/CMakeFiles/exiv2lib.dir/build.make:706: lib/libexiv2.so.0.27.3.20] Error 1
make[1]: *** [CMakeFiles/Makefile2:520: src/CMakeFiles/exiv2lib.dir/all] Error 2
make: *** [Makefile:147: all] Error 2
[rmills@rmillsmm-fedora build]$ pacman -S libasan
warning: database file for 'core' does not exist
error: you cannot perform this operation unless you are root.
[rmills@rmillsmm-fedora build]$ sudo pacman -S libasan
warning: database file for 'core' does not exist
error: target not found: libasan
[rmills@rmillsmm-fedora build]$ sudo pacman -S libasan-devel
warning: database file for 'core' does not exist
error: target not found: libasan-devel
[rmills@rmillsmm-fedora build]$ ^C
[rmills@rmillsmm-fedora build]$ 

@clanmills
Copy link
Collaborator

@D4N Thanks. I've just read your comment about -fsanitize=memory. I'll try that on macOS.

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@clanmills
Copy link
Collaborator

Thanks @D4N for reminding me about dnf. Presumably short for dnf with me!

I can't build this with the memory sanitizer anywhere (macOS, ubuntu 20.04 or Fedora). Let's assume the report is correct and focus on the code in xmpsdk/src/XMPMeta.cpp

/* class-static */ void
XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
                             XMP_StringPtr	 prefix )
{
	if ( (*namespaceURI == 0) || (*prefix == 0) ) {
		XMP_Throw ( "Empty namespace URI or prefix", kXMPErr_BadParam );
	}
	
	XMP_VarString	nsURI ( namespaceURI );
	XMP_VarString	prfix ( prefix );
	if ( prfix[prfix.size()-1] != ':' ) prfix += ':';
	VerifySimpleXMLName ( prefix, prefix+prfix.size()-1 );	// Exclude the colon.
	
        // Set the new namespace in both maps.
        (*sNamespaceURIToPrefixMap)[nsURI] = prfix;
        (*sNamespacePrefixToURIMap)[prfix] = nsURI;
	
}	// RegisterNamespace

And we know that's operating with this input:

<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.1.2">

@henices Can you add some print statements to the code and get this nailed down? I'm going to inspect a more recent Adobe/XMPsdk to see if the code has been reinforced.

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@clanmills
Copy link
Collaborator

The latest code is here: git clone https://github.com/adobe/XMP-Toolkit-SDK.git And that function has a lock. I think the name space registry is global, so they are making it thread-safe. However the lock's only use by the CPP_DOM_MODEL. The active ingredient is Define()

/* class-static */ bool
XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
							 XMP_StringPtr	 suggestedPrefix,
							 XMP_StringPtr * registeredPrefix,
							 XMP_StringLen * prefixSize )
{

	bool returnValue = sRegisteredNamespaces->Define ( namespaceURI, suggestedPrefix, registeredPrefix, prefixSize );
#if ENABLE_CPP_DOM_MODEL
	const char * prefix = NULL;
	XMP_StringLen len = 0;
	sRegisteredNamespaces->GetPrefix( namespaceURI, &prefix, &len );
	XMP_VarString prefixWithoutColon( prefix, len - 1 );
	{
		XMP_AutoLock aLock( sDefaultNamespacePrefixMapLock, true );
		AdobeXMPCore_Int::INameSpacePrefixMap_I::InsertInDefaultNameSpacePrefixMap( prefixWithoutColon.c_str(), prefixWithoutColon.size(), namespaceURI, AdobeXMPCommon::npos );
	}
#endif
	return returnValue;
}	// RegisterNamespace

And boy-oh-boy, he's been beafed up. There is an issue with the XMPsdk 4.4.0 in Exiv2 concerning namespace uniqueness. I can't remember what it's about - however it looks as though that's being fixed by this code.

bool XMP_NamespaceTable::Define ( XMP_StringPtr _uri, XMP_StringPtr _suggPrefix,
								  XMP_StringPtr * prefixPtr, XMP_StringLen * prefixLen )
{
	XMP_AutoLock tableLock ( &this->lock, kXMP_WriteLock );
	bool prefixMatches = false;

	XMP_Assert ( (_uri != 0) && (*_uri != 0) && (_suggPrefix != 0) && (*_suggPrefix != 0) );

	XMP_VarString	uri ( _uri );
	XMP_VarString	suggPrefix ( _suggPrefix );
	if ( suggPrefix[suggPrefix.size()-1] != ':' ) suggPrefix += ':';
	VerifySimpleXMLName ( _suggPrefix, _suggPrefix+suggPrefix.size()-1 );	// Exclude the colon.

	XMP_StringMapPos uriPos = this->uriToPrefixMap.find ( uri );

	if ( uriPos == this->uriToPrefixMap.end() ) {

		// The URI is not yet registered, make sure we use a unique prefix.

		XMP_VarString uniqPrefix ( suggPrefix );
		int  suffix = 0;
		char buffer [32];	// AUDIT: Plenty of room for the "_%d_" suffix.

		while ( true ) {
			if ( this->prefixToURIMap.find ( uniqPrefix ) == this->prefixToURIMap.end() ) break;
			++suffix;
			snprintf ( buffer, sizeof(buffer), "_%d_:", suffix );	// AUDIT: Using sizeof for snprintf length is safe.
			uniqPrefix = suggPrefix;
			uniqPrefix.erase ( uniqPrefix.size()-1 );	// ! Remove the trailing ':'.
			uniqPrefix += buffer;
		}

		// Add the new namespace to both maps.

		XMP_StringPair newNS ( uri, uniqPrefix );
		uriPos = this->uriToPrefixMap.insert ( this->uriToPrefixMap.end(), newNS );

		newNS.first.swap ( newNS.second );
		(void) this->prefixToURIMap.insert ( this->prefixToURIMap.end(), newNS );

	}

	// Return the actual prefix and see if it matches the suggested prefix.

	if ( prefixPtr != 0 ) *prefixPtr = uriPos->second.c_str();
	if ( prefixLen != 0 ) *prefixLen = (XMP_StringLen)uriPos->second.size();

	prefixMatches = ( uriPos->second == suggPrefix );
	return prefixMatches;

}	// XMP_NamespaceTable::Define

@henices Any additional information you can provide will get this nailed down, otherwise I'm running out of ideas.

@clanmills
Copy link
Collaborator

@D4N - what a smart guy you are! (I think I've told you that before.). I did a vanilla build on ubuntu and used valgrind.

rmills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ valgrind bin/exiv2 /media/psf/Home/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff 
==256195== Memcheck, a memory error detector
==256195== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==256195== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==256195== Command: bin/exiv2 /media/psf/Home/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff
==256195== 
File name       : /media/psf/Home/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff
File size       : 6925 Bytes
MIME type       : image/tiff
Image size      : 174 x 38
Camera make     : 
Camera model    : 
Image timestamp : 
Image number    : 
Exposure time   : 
Aperture        : 
Exposure bias   : 
Flash           : 
Flash bias      : 
Focal length    : 
Subject distance: 
ISO speed       : 
Exposure mode   : 
Metering mode   : 
Macro mode      : 
Image quality   : 
Exif Resolution : 174 x 38
White balance   : 
Thumbnail       : None
Copyright       : 
Exif comment    : 

==256195== 
==256195== HEAP SUMMARY:
==256195==     in use at exit: 0 bytes in 0 blocks
==256195==   total heap usage: 3,456 allocs, 3,456 frees, 220,536 bytes allocated
==256195== 
==256195== All heap blocks were freed -- no leaks are possible
==256195== 
==256195== For lists of detected and suppressed errors, rerun with: -s
==256195== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
rmills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ 

Sadly, it doesn't die. I used valgrind -s bin/exiv2 path...... Same result.

I built with the santizers. bin/exiv2 runs OK on the test file.

I try to run the sanitised exiv2 with valgrind. No output and this curious statement:

==259193==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

Here's the output of ldd:

mills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ ldd bin/exiv2
	linux-vdso.so.1 (0x00007ffcbe5ae000)
	libasan.so.5 => /lib/x86_64-linux-gnu/libasan.so.5 (0x00007f72c5f48000)
	libexiv2.so.27 => /home/rmills/gnu/github/exiv2/0.27-maintenance/build/lib/libexiv2.so.27 (0x00007f72c2ccf000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f72c2aee000)
	libubsan.so.1 => /lib/x86_64-linux-gnu/libubsan.so.1 (0x00007f72c2181000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f72c2166000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f72c1f74000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f72c1f6c000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f72c1f61000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f72c1f3e000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f72c1def000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f72c1dd3000)
	libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f72c1da5000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f72c6d36000)
rmills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ 

bin/exiv2 outputs similar (by reading something in /proc:

rmills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ bin/exiv2 -vVg library
exiv2 0.27.3.20
library=/usr/lib/x86_64-linux-gnu/libexpat.so.1.6.11
library=/usr/lib/x86_64-linux-gnu/libz.so.1.2.11
library=/usr/lib/x86_64-linux-gnu/libm-2.31.so
library=/usr/lib/x86_64-linux-gnu/libpthread-2.31.so
library=/usr/lib/x86_64-linux-gnu/librt-2.31.so
library=/usr/lib/x86_64-linux-gnu/libdl-2.31.so
library=/usr/lib/x86_64-linux-gnu/libc-2.31.so
library=/usr/lib/x86_64-linux-gnu/libgcc_s.so.1
library=/usr/lib/x86_64-linux-gnu/libubsan.so.1.0.0
library=/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
library=/home/rmills/gnu/github/exiv2/0.27-maintenance/build/lib/libexiv2.so.0.27.3.20
library=/usr/lib/x86_64-linux-gnu/libasan.so.5.0.0
library=/usr/lib/x86_64-linux-gnu/ld-2.31.so
rmills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ 

Trying LD_PRELOAD without effect:

rmills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libasan.so.5.0.0 valgrind bin/exiv2 /media/psf/Home/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff 
==262004== Memcheck, a memory error detector
==262004== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==262004== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==262004== Command: bin/exiv2 /media/psf/Home/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff
==262004== 
==262004==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
==262004== 
==262004== HEAP SUMMARY:
==262004==     in use at exit: 0 bytes in 0 blocks
==262004==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==262004== 
==262004== All heap blocks were freed -- no leaks are possible
==262004== 
==262004== For lists of detected and suppressed errors, rerun with: -s
==262004== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
rmills@rmillsmm-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build$ 

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@clanmills
Copy link
Collaborator

I think I know what's wrong here. RegisterNamespace uses two maps namespace->prefix and prefix->namespace. XMPsdk 4.4.0 does not allow two or more prefixes for the same namespace. I suspect that's a bug in XMPsdk. It would be very helpful for @henices to add a couple of debugging statements to his code in XMPMeta.cpp. Something like this, then we would know exactly which of the 70 calls to RegisterNamespace is causing the damage.

#include <iostream.hpp>
...
XMPMeta( namespace, uri )
{
    std::cout << "namespace = " << namespace << " prefix = " << prefix << std:endl
....
}

@clanmills
Copy link
Collaborator

@henices Can you try this patch please.

Caution: This is work-in-progress and must not escape into any product or production environment.

diff --git a/xmpsdk/src/XMPMeta.cpp b/xmpsdk/src/XMPMeta.cpp
index 30ff4a5a..a9ea4956 100644
--- a/xmpsdk/src/XMPMeta.cpp
+++ b/xmpsdk/src/XMPMeta.cpp
@@ -1034,7 +1034,8 @@ XMPMeta::SetGlobalOptions ( XMP_OptionBits /*options*/ )
 // -------------------------------------------------------------------------------------------------
 // RegisterNamespace
 // -----------------
-
+#include <iostream>
+                
 /* class-static */ void
 XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
                              XMP_StringPtr	 prefix )
@@ -1042,7 +1043,39 @@ XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
 	if ( (*namespaceURI == 0) || (*prefix == 0) ) {
 		XMP_Throw ( "Empty namespace URI or prefix", kXMPErr_BadParam );
 	}
-	
+    string Prefix = string(prefix)+":";
+    // std::cerr << "RegisterNamespace " << prefix << ":" << namespaceURI << std::endl;
+	// std::cout << "prefix = " << prefix << " namespaceURI = " << namespaceURI << std::endl;
+    if ( sNamespaceURIToPrefixMap->find(namespaceURI) != sNamespaceURIToPrefixMap->end()
+    ||   sNamespacePrefixToURIMap->find(Prefix)       != sNamespacePrefixToURIMap->end()
+    ) {
+        string undefined  = "undefined";
+        string old_prefix = sNamespaceURIToPrefixMap->find(namespaceURI) == sNamespaceURIToPrefixMap->end()
+                          ? undefined
+                          : (*sNamespaceURIToPrefixMap)[namespaceURI]
+                          ;
+        if ( old_prefix.size() > 1 ) { // remove trailing ':'
+            if ( old_prefix.at(old_prefix.size()-1) == ':' ) {
+                old_prefix=old_prefix.substr(0,old_prefix.size()-1);
+            }
+        }
+        string old_uri    = sNamespacePrefixToURIMap->find(Prefix) == sNamespacePrefixToURIMap->end()
+                          ? undefined
+                          : (*sNamespacePrefixToURIMap)[Prefix]
+                          ;
+        std::cerr << "DUPLICATE " << string(prefix) << "=" << string(namespaceURI)
+                  << " previous " << old_prefix     << "=" << old_uri
+                  << std::endl;
+                  ;
+        if ( old_prefix != prefix || old_uri != namespaceURI ) {
+            string error = "Cannot redefine namespace " + string(prefix) + string(namespaceURI)
+            + string ( " previous") + old_prefix + old_uri
+            ;
+            XMP_Throw (  error.c_str(), kXMPErr_BadParam );
+        }
+        return;
+    }
+
 	XMP_VarString	nsURI ( namespaceURI );
 	XMP_VarString	prfix ( prefix );
 	if ( prfix[prfix.size()-1] != ':' ) prfix += ':';
@@ -1051,6 +1084,8 @@ XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
         // Set the new namespace in both maps.
         (*sNamespaceURIToPrefixMap)[nsURI] = prfix;
         (*sNamespacePrefixToURIMap)[prfix] = nsURI;
+                
+    // std::cerr << "Registered " << (*sNamespaceURIToPrefixMap)[nsURI] << "=" << (*sNamespacePrefixToURIMap)[prfix] << std::endl;
 	
 }	// RegisterNamespace
 

@clanmills
Copy link
Collaborator

I've reproduced this on ubuntu by building with the command:

$ cmake .. '-DCMAKE_CXX_FLAGS=-fsanitize=memory -Wno-deprecated' -DCMAKE_CXX_COMPILER=$(which clang++) -DCMAKE_CXX_STANDARD=11

However, exiv2 dies long before XMPMeta::RegisterNamespace.

781 rmills@rmillsmbp-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build $ bin/exiv2 -g XX /Home/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff 
==65063==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4d0ec0 in std::char_traits<char>::assign(char&, char const&) (/home/rmills/gnu/github/exiv2/0.27-maintenance/build/bin/exiv2+0x4d0ec0)
    #1 0x4d05bf in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_set_length(unsigned long) (/home/rmills/gnu/github/exiv2/0.27-maintenance/build/bin/exiv2+0x4d05bf)
    #2 0x4cf7d5 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>(char const*, char const*, std::forward_iterator_tag) (/home/rmills/gnu/github/exiv2/0.27-maintenance/build/bin/exiv2+0x4cf7d5)
    #3 0x7f8a8ccfe975 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char const*>(char const*, char const*, std::__false_type) (/usr/local/lib/libexiv2.so.27+0x12d975)
    #4 0x7f8a8ccfdf2b in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>(char const*, char const*) (/usr/local/lib/libexiv2.so.27+0x12cf2b)
    #5 0x7f8a8ccfc924 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (/usr/local/lib/libexiv2.so.27+0x12b924)
    #6 0x7f8a8ce71806 in XMPMeta::Initialize() (/usr/local/lib/libexiv2.so.27+0x2a0806)
    #7 0x7f8a8ce4b8bf in WXMPMeta_Initialize_1 (/usr/local/lib/libexiv2.so.27+0x27a8bf)
    #8 0x7f8a8cdacf57 in TXMPMeta<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::Initialize() (/usr/local/lib/libexiv2.so.27+0x1dbf57)
    #9 0x7f8a8cda9317 in Exiv2::XmpParser::initialize(void (*)(void*, bool), void*) (/usr/local/lib/libexiv2.so.27+0x1d8317)
    #10 0x4a19df in main (/home/rmills/gnu/github/exiv2/0.27-maintenance/build/bin/exiv2+0x4a19df)
    #11 0x7f8a8c66f0b2 in __libc_start_main /build/glibc-YYA7BZ/glibc-2.31/csu/../csu/libc-start.c:308:16
    #12 0x42680d in _start (/home/rmills/gnu/github/exiv2/0.27-maintenance/build/bin/exiv2+0x42680d)

SUMMARY: MemorySanitizer: use-of-uninitialized-value (/home/rmills/gnu/github/exiv2/0.27-maintenance/build/bin/exiv2+0x4d0ec0) in std::char_traits<char>::assign(char&, char const&)
Exiting
782 rmills@rmillsmbp-ubuntu:~/gnu/github/exiv2/0.27-maintenance/build $ 

I've tried building with that command on macOS. exiv2 ran normally. I'm not convinced that exiv2 (and exiv2.dylib) are linked to the msan libraries!

634 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance/build $ otool -L bin/exiv2
bin/exiv2:
	@rpath/libexiv2.27.dylib (compatibility version 27.0.0, current version 0.27.3)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 902.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)
636 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance/build $ otool -L lib/libexiv2.0.27.3.20.dylib 
lib/libexiv2.0.27.3.20.dylib:
	@rpath/libexiv2.27.dylib (compatibility version 27.0.0, current version 0.27.3)
	/usr/local/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
	/usr/local/lib/libexpat.1.dylib (compatibility version 8.0.0, current version 8.0.0)
	/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 902.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)
637 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance/build $ 

It's Friday afternoon. Exiv2 v0.27.3 RC2 will be released on schedule on Sunday. I'll continue to investigate this matter. When I know more, I'll have to choose between:

  1. Release v0.27.3 RC3 with the fix
  2. Defer for v0.27.4 or v0.28
  3. Some other idea!

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@clanmills
Copy link
Collaborator

Thank you, @D4N for this explanation. We have to push back on @henices for more information, or close this has "cannot reproduce". He has it running on his machine and can add some print statements to find out exactly where it goes wrong.

@clanmills
Copy link
Collaborator

https://clang.llvm.org/docs/MemorySanitizer.html

MemorySanitizer is known to work on large real-world programs (like Clang/LLVM itself) that can be recompiled from source, including all dependent libraries.

Exiv2 uses 44 libraries on macOS. I don't know if the source is available for that lot (probably is). I'm not volunteering to do this.

556 rmills@rmillsmm-local:~/temp $ env DYLD_PRINT_LIBRARIES=1 exiv2 -g XX ~/Stonehenge.jpg 2>&1 | grep loaded | wc -l
44
557 rmills@rmillsmm-local:~/temp $ env DYLD_PRINT_LIBRARIES=1 exiv2 -g XX ~/Stonehenge.jpg 
dyld: loaded: <2056BDEF-BFFA-39B7-A8DD-7327CF9A49BC> /usr/local/bin/exiv2
dyld: loaded: <25F64DE0-6536-318A-9C2D-F1D70C5D7FDC> /usr/local/lib/libexiv2.27.dylib
dyld: loaded: <AD0805FE-F98B-3E2F-B072-83782B22DAC9> /usr/lib/libc++.1.dylib
dyld: loaded: <DC04B185-E3C9-33AF-B450-EF3ED07FB021> /usr/lib/libSystem.B.dylib
dyld: loaded: <DB120508-3BED-37A8-B439-5235EAB4618A> /usr/lib/libz.1.dylib
dyld: loaded: <C9163264-BA81-3CC6-9B8C-48A9A0709DD5> /usr/lib/libexpat.1.dylib
dyld: loaded: <F58FED71-6CCA-30E8-9A51-13E9B46E568D> /usr/lib/libiconv.2.dylib
dyld: loaded: <A5ECC751-A681-30D8-B33C-D192C15D25C8> /usr/lib/system/libcache.dylib
dyld: loaded: <C321A74A-AA91-3785-BEBF-BEDC6975026C> /usr/lib/system/libcommonCrypto.dylib
dyld: loaded: <652A6012-7E5C-3F4F-9438-86BC094526F3> /usr/lib/system/libcompiler_rt.dylib
dyld: loaded: <40113A69-A81C-3397-ADC6-1D16B9A22C3E> /usr/lib/system/libcopyfile.dylib
dyld: loaded: <5E4B0E50-24DD-3E04-9374-EDA9FFD6257B> /usr/lib/system/libcorecrypto.dylib
dyld: loaded: <201EDBF3-0B36-31BA-A7CB-443CE35C05D4> /usr/lib/system/libdispatch.dylib
dyld: loaded: <7E711A46-5E4D-393C-AEA6-440E2A5CCD0C> /usr/lib/system/libdyld.dylib
dyld: loaded: <52662CAA-DB1F-30A3-BE13-D6274B1A6D7B> /usr/lib/system/libkeymgr.dylib
dyld: loaded: <07CF647B-F9DC-3907-AD98-2F85FCB34A72> /usr/lib/system/liblaunch.dylib
dyld: loaded: <D91DFF00-E22F-3796-8A1C-4C1F5F8FA03C> /usr/lib/system/libmacho.dylib
dyld: loaded: <D3B7D02C-7646-3FB4-8529-B36DCC2419EA> /usr/lib/system/libquarantine.dylib
dyld: loaded: <B5E88D9B-C2BE-3496-BBB2-C996317E18A3> /usr/lib/system/libremovefile.dylib
dyld: loaded: <1170348D-2491-33F1-AA79-E2A05B4A287C> /usr/lib/system/libsystem_asl.dylib
dyld: loaded: <7AFBCAA6-81BE-36C3-8DB0-AAE0A4ACE4C5> /usr/lib/system/libsystem_blocks.dylib
dyld: loaded: <935DDCE9-4ED0-3F79-A05A-A123DDE399CC> /usr/lib/system/libsystem_c.dylib
dyld: loaded: <EA9BC2B1-5001-3463-9FAF-39FF61CAC87C> /usr/lib/system/libsystem_configuration.dylib
dyld: loaded: <3D0A3AA8-8415-37B2-AAE3-66C03BCE8B55> /usr/lib/system/libsystem_coreservices.dylib
dyld: loaded: <6EEC9975-EE3B-3C95-AA5B-030FD10587BC> /usr/lib/system/libsystem_darwin.dylib
dyld: loaded: <0115092A-E61B-317D-8670-41C7C34B1A82> /usr/lib/system/libsystem_dnssd.dylib
dyld: loaded: <AFDB5095-0472-34AC-BA7E-497921BF030A> /usr/lib/system/libsystem_featureflags.dylib
dyld: loaded: <851693E9-C079-3547-AD41-353F8C248BE8> /usr/lib/system/libsystem_info.dylib
dyld: loaded: <1E6FACE8-8EB2-3478-A77D-220617855B30> /usr/lib/system/libsystem_m.dylib
dyld: loaded: <D4BA7DF2-57AC-33B0-B948-A688EE43C799> /usr/lib/system/libsystem_malloc.dylib
dyld: loaded: <6DE86DB0-8CD2-361E-BD6A-A34282B47847> /usr/lib/system/libsystem_networkextension.dylib
dyld: loaded: <7E9E2FC8-DF26-340C-B196-B81B11850C46> /usr/lib/system/libsystem_notify.dylib
dyld: loaded: <20C93D69-6452-3C82-9521-8AE54345C66F> /usr/lib/system/libsystem_sandbox.dylib
dyld: loaded: <E851113D-D5B1-3FB0-9D29-9C7647A71961> /usr/lib/system/libsystem_secinit.dylib
dyld: loaded: <84D09AE3-2DA8-3F6D-ACEC-DC4990B1A2FF> /usr/lib/system/libsystem_kernel.dylib
dyld: loaded: <736920EA-6AE0-3B1B-BBDA-7DCDF0C229DF> /usr/lib/system/libsystem_platform.dylib
dyld: loaded: <77488669-19A3-3993-AD65-CA5377E2475A> /usr/lib/system/libsystem_pthread.dylib
dyld: loaded: <25C3866B-004E-3621-9CD3-B1E9C4D887EB> /usr/lib/system/libsystem_symptoms.dylib
dyld: loaded: <A1ED1D3A-5FAD-3559-A1D6-1BE4E1C5756A> /usr/lib/system/libsystem_trace.dylib
dyld: loaded: <253A12E2-F88F-3838-A666-C5306F833CB8> /usr/lib/system/libunwind.dylib
dyld: loaded: <68D433B6-DCFF-385D-8620-F847FB7D4A5A> /usr/lib/system/libxpc.dylib
dyld: loaded: <720155DE-4837-3A43-A8AD-C00C8659FDF5> /usr/lib/libobjc.A.dylib
dyld: loaded: <771E9263-E832-3985-9477-8F1B2D73B771> /usr/lib/libc++abi.dylib
dyld: loaded: <FF23D4ED-A5AD-3592-9574-48486C7DF85B> /usr/lib/libcharset.1.dylib
557 rmills@rmillsmm-local:~/temp $ 

@D4N
Copy link
Member

D4N commented May 29, 2020 via email

@clanmills
Copy link
Collaborator

I'm going to remove this from milestone v0.27.3. Without input from @henices, we can't make progress.

@clanmills clanmills modified the milestone: v0.27.3 May 29, 2020
@clanmills
Copy link
Collaborator

Saturday. A bright new day and some new ideas.

1 This is caused by the file

2 Build Exiv2 with newer XMPsdk

The challenge here is that I haven't reproduced the crash. So, even if I build this and it works, I haven't discovered anything.

3 Build libc++ with MASN

The call into RegisterNamespace() takes place almost instantly in exiv2. The very first thing all our sample applications do is:

int main(int argc, char* const argv[])
{
    Exiv2::XmpParser::initialize();
    ::atexit(Exiv2::XmpParser::terminate);
...
}

And XmpParser::initialise() is:

bool XmpParser::initialize(XmpParser::XmpLockFct xmpLockFct, void* pLockData)
{
    if (!initialized_) {
        xmpLockFct_ = xmpLockFct;
        pLockData_ = pLockData;
        initialized_ = SXMPMeta::Initialize();
        SXMPMeta::RegisterNamespace("http://ns.adobe.com/lightroom/1.0/", "lr");
        SXMPMeta::RegisterNamespace("http://rs.tdwg.org/dwc/index.htm", "dwc");
....
}

We know the order in which the libraries are being loaded:

dyld: loaded: <2056BDEF-BFFA-39B7-A8DD-7327CF9A49BC> /usr/local/bin/exiv2
dyld: loaded: <25F64DE0-6536-318A-9C2D-F1D70C5D7FDC> /usr/local/lib/libexiv2.27.dylib
dyld: loaded: <AD0805FE-F98B-3E2F-B072-83782B22DAC9> /usr/lib/libc++.1.dylib
dyld: loaded: <DC04B185-E3C9-33AF-B450-EF3ED07FB021> /usr/lib/libSystem.B.dylib
dyld: loaded: <DB120508-3BED-37A8-B439-5235EAB4618A> /usr/lib/libz.1.dylib
dyld: loaded: <C9163264-BA81-3CC6-9B8C-48A9A0709DD5> /usr/lib/libexpat.1.dylib

If I can build libc++ with MSAN, we can probably avoid a false positive from elsewhere. I'm not sure I can build libSystem. We can build libz and libexpat.

However the issue is coming from using RegisterNamespace() with data in the file. By the time we arrive in Image::readMetadata(), we've been opening and reading files and almost certainly have pulled in most of the universe. So building libc++ with MSAN isn't going to help if we use exiv2 as the test harness! So, the brings me to my best idea:

4 Simulate this without libexiv2

In reality, all that exiv2 has done here is:
a) call XMPInitialize()
b) opens and reads the file to get the XMP/xml data
c) calls Exiv2::XmpParser::decode()

sample/xmpparser-test.cpp does this exactly! We know how to find the XMP without exiv2, I did that yesterday (with dd). I could build libc++ and libxmp.a with MSAN and write a simple little program to parse the XMP which I'll pass on the command line. It's quite likely that XMPsdk has a sample program to do this.

We know the crash happens in Exiv2::XmpParser::decode() (from the users stack trace), so before I attempt to build libc++, I'm going to step Exiv2::XmpParser::decode() in the debugger to see what's going on. Never been in that code because it has been running smoothly for at least 12 years.

@clanmills
Copy link
Collaborator

Progress Update. I'm going off to cut the grass now.

@henices Are you willing to work with me and @D4N to solve this? Please understand that we haven't reproduced this, it's occurring in code that we didn't write and has not been touched for about 15 years!

The code operates by parsing the XML/xmp with expat. There's a namespace handler StartNamespaceDeclHandler() who calls XMPMeta::RegisterNamespace(uri,prefix). I've put debugging code in the handler and RegisterNamespace(). The pointers are different between the caller and the callee. How can that be? Some kind of template magic? Perhaps the receiver reads the null byte on one of the strings and that's making MSAN angry.

calling RegisterNamespace adobe:ns:meta/,x uri,prefix = 0x7ffee30cb880,0x7ffee30cb888
prefix = x                                              0x7ffee30cb848 ns = adobe:ns:meta/ 0x7ffee30cb850
calling RegisterNamespace http://www.w3.org/1999/02/22-rdf-syntax-ns#,rdf uri,prefix = 0x7ffee30cb880,0x7ffee30cb888
prefix = rdf 0x7ffee30cb848 ns = http://www.w3.org/1999/02/22-rdf-syntax-ns# 0x7ffee30cb850
calling RegisterNamespace http://ns.adobe.com/tiff/1.0/,tiff uri,prefix = 0x7ffee30cb880,0x7ffee30cb888
prefix = tiff 0x7ffee30cb848 ns = http://ns.adobe.com/tiff/1.0/ 0x7ffee30cb850
diff --git a/xmpsdk/src/ExpatAdapter.cpp b/xmpsdk/src/ExpatAdapter.cpp
index 09117c75..384d5a5e 100644
--- a/xmpsdk/src/ExpatAdapter.cpp
+++ b/xmpsdk/src/ExpatAdapter.cpp
@@ -237,6 +237,7 @@ static void SetQualName ( XMP_StringPtr fullName, XML_Node * node )
 }	// SetQualName
 
 // =================================================================================================
+#include <iostream>
 
 static void StartNamespaceDeclHandler ( void * userData, XMP_StringPtr prefix, XMP_StringPtr uri )
 {
@@ -260,6 +261,10 @@ static void StartNamespaceDeclHandler ( void * userData, XMP_StringPtr prefix, X
 	#endif
 	
 	if ( XMP_LitMatch ( uri, "http://purl.org/dc/1.1/" ) ) uri = "http://purl.org/dc/elements/1.1/";
+    std::cout << "calling RegisterNamespace " << uri << "," << prefix
+              << " " << "uri,prefix = " << &uri
+              << ","  << &prefix << std::endl
+    ;
 	XMPMeta::RegisterNamespace ( uri, prefix );
 
 }	// StartNamespaceDeclHandler
diff --git a/xmpsdk/src/XMPMeta.cpp b/xmpsdk/src/XMPMeta.cpp
index 30ff4a5a..5eeabf31 100644
--- a/xmpsdk/src/XMPMeta.cpp
+++ b/xmpsdk/src/XMPMeta.cpp
@@ -1034,11 +1034,15 @@ XMPMeta::SetGlobalOptions ( XMP_OptionBits /*options*/ )
 // -------------------------------------------------------------------------------------------------
 // RegisterNamespace
 // -----------------
-
+#include <iostream>
 /* class-static */ void
 XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
                              XMP_StringPtr	 prefix )
 {
+    std::cout << "prefix = " << prefix       << " " << &prefix
+              << " ns = "    << namespaceURI << " " << &namespaceURI
+              << std::endl;
+                
 	if ( (*namespaceURI == 0) || (*prefix == 0) ) {
 		XMP_Throw ( "Empty namespace URI or prefix", kXMPErr_BadParam );
 	}

@clanmills
Copy link
Collaborator

Looks like I'm on my own with this. I've turned on the XMPsdk debugging/trace magic (which involved a little tweak concerning xmpOut and xmpCoreOut.

$ cd <exivdir>/build
$ make clean
$ cmake ..  '-DCMAKE_CXX_FLAGS=-DXMP_TraceCTorDTor=1 -DTraceXMPCalls=1' -DEXIV2_BUILD_SAMPLES=Off

And when it runs:

$ bin/exiv2 -g XX ~/Downloads/use-of-uninitialized-value-RegisterNamespace.tiff
...
WXMP calling: WXMPMeta_ParseFromBuffer_1 ( this->xmpRef, buffer, bufferSize, options, &wResult )
Entering WXMPMeta_ParseFromBuffer_1
  Auto lock, count = 1
calling RegisterNamespace adobe:ns:meta/,x uri,prefix = 0x7ffeeb5df7d0,0x7ffeeb5df7d8
prefix = x 0x7ffeeb5df798 ns = adobe:ns:meta/ 0x7ffeeb5df7a0
calling RegisterNamespace http://www.w3.org/1999/02/22-rdf-syntax-ns#,rdf uri,prefix = 0x7ffeeb5df7d0,0x7ffeeb5df7d8
prefix = rdf 0x7ffeeb5df798 ns = http://www.w3.org/1999/02/22-rdf-syntax-ns# 0x7ffeeb5df7a0
calling RegisterNamespace http://ns.adobe.com/tiff/1.0/,tiff uri,prefix = 0x7ffeeb5df7d0,0x7ffeeb5df7d8
prefix = tiff 0x7ffeeb5df798 ns = http://ns.adobe.com/tiff/1.0/ 0x7ffeeb5df7a0
  Auto unlock, count = 0
Exiting WXMPMeta_ParseFromBuffer_1
WXMP back, no error
WXMP calling: WXMPIterator_PropCTor_1 ( xmpObj.GetInternalRef(), "", "", options, &wResult );
...
Entering WXMPMeta_Terminate_1 (no lock)
XMP terminating
Exiting WXMPMeta_Terminate_1

Very puzzling. Not a squeak from XMPsdk about RegisterNamespace. I'll investigate that next.

Current Patch:

diff --git a/xmpsdk/include/XMP_Environment.h b/xmpsdk/include/XMP_Environment.h
index 75a8e7a7..d1bf121f 100644
--- a/xmpsdk/include/XMP_Environment.h
+++ b/xmpsdk/include/XMP_Environment.h
@@ -1,6 +1,12 @@
 #ifndef __XMP_Environment_h__
 #define __XMP_Environment_h__ 1
 
+#if TraceXMPCalls
+#include <stdio.h>
+#define xmpOut     stderr
+#define xmpCoreOut stderr
+#endif
+
 // =================================================================================================
 // XMP_Environment.h - Build environment flags for the XMP toolkit.
 // ================================================================
diff --git a/xmpsdk/src/ExpatAdapter.cpp b/xmpsdk/src/ExpatAdapter.cpp
index 09117c75..384d5a5e 100644
--- a/xmpsdk/src/ExpatAdapter.cpp
+++ b/xmpsdk/src/ExpatAdapter.cpp
@@ -237,6 +237,7 @@ static void SetQualName ( XMP_StringPtr fullName, XML_Node * node )
 }	// SetQualName
 
 // =================================================================================================
+#include <iostream>
 
 static void StartNamespaceDeclHandler ( void * userData, XMP_StringPtr prefix, XMP_StringPtr uri )
 {
@@ -260,6 +261,10 @@ static void StartNamespaceDeclHandler ( void * userData, XMP_StringPtr prefix, X
 	#endif
 	
 	if ( XMP_LitMatch ( uri, "http://purl.org/dc/1.1/" ) ) uri = "http://purl.org/dc/elements/1.1/";
+    std::cout << "calling RegisterNamespace " << uri << "," << prefix
+              << " " << "uri,prefix = " << &uri
+              << ","  << &prefix << std::endl
+    ;
 	XMPMeta::RegisterNamespace ( uri, prefix );
 
 }	// StartNamespaceDeclHandler
diff --git a/xmpsdk/src/XMPCore_Impl.cpp b/xmpsdk/src/XMPCore_Impl.cpp
index e3632b98..6ae5c96e 100644
--- a/xmpsdk/src/XMPCore_Impl.cpp
+++ b/xmpsdk/src/XMPCore_Impl.cpp
@@ -63,7 +63,7 @@ XMP_Mutex sXMPCoreLock;
 int sLockCount = 0;
 
 #if TraceXMPCalls
-	FILE * xmpOut = stderr;
+//	FILE * xmpOut = stderr;
 #endif
 
 void *              voidVoidPtr    = 0;	// Used to backfill null output parameters.
diff --git a/xmpsdk/src/XMPMeta.cpp b/xmpsdk/src/XMPMeta.cpp
index 30ff4a5a..2779aa9f 100644
--- a/xmpsdk/src/XMPMeta.cpp
+++ b/xmpsdk/src/XMPMeta.cpp
@@ -1034,6 +1034,7 @@ XMPMeta::SetGlobalOptions ( XMP_OptionBits /*options*/ )
 // -------------------------------------------------------------------------------------------------
 // RegisterNamespace
 // -----------------
+#include <iostream>
 
 /* class-static */ void
 XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
@@ -1042,6 +1043,9 @@ XMPMeta::RegisterNamespace ( XMP_StringPtr	 namespaceURI,
 	if ( (*namespaceURI == 0) || (*prefix == 0) ) {
 		XMP_Throw ( "Empty namespace URI or prefix", kXMPErr_BadParam );
 	}
+    std::cout << "prefix = " << prefix       << " " << &prefix
+              << " ns = "    << namespaceURI << " " << &namespaceURI
+              << std::endl;
 	
 	XMP_VarString	nsURI ( namespaceURI );
 	XMP_VarString	prfix ( prefix );

@clanmills
Copy link
Collaborator

clanmills commented May 30, 2020

I've realised that this is another incarnation of the underlying issue in #984

The fix for #984 in 'master' is #1093 and in '0.27-maintenance' is #1125. The fix submitted submitted was "good enough" as it closed the CVE/POC as documented. However it didn't get totally to the bottom of the issue. It has something to do with XMPsdk using delete on data that we passed to him from src/properties.cpp line105. MSAN is incorrectly reporting "uninitialised".

As I discovered yesterday, if I throw an XMP_Error() when namespaces are duplicated, it causes trouble in the test suite. There have been big improvements in XMPsdk's namespace handling and it now handles duplication by appending _n_ to the prefix. I should build with a newer XMPsdk and see if the test suite runs smoothly.

For v0.28, we should consider removing xmpsdk from Exiv2 and get conan to build and link xmpsdk. I haven't had success yet with conan on MinGW and Cygwin. More work ahead. #1224

I've run out of time for v0.27.3 RC2 to work on this.

I'm going to revisit #984. I ran out of time in September 2019 while working on Exiv2 v0.27.3 (which was to be released on 2019-09-30). If I can get that "properly" fixed, it's likely the fix for that will also fix this issue. If I'm successful in really fixing #984, I'll give serious thought to releasing Exiv2 v0.27.3 RC3.

@clanmills clanmills removed this from the v0.27.3 milestone May 30, 2020
@henices
Copy link
Author

henices commented Jun 2, 2020

Sorry for the delay I was OOO these days. We can build msan with the following steps

mkdir build_msan
cd build_msan
export CC=/home/henices/clang/bin/clang
export CXX=/home/henices/clang/bin/clang++
cmake .. -G "Unix Makefiles" -DCMAKE_CXX_FLAGS="-fsanitize=memory -fsanitize-memory-track-origins=2 -fno-omit-frame-pointer -stdlib=libc++ -g" -DCMAKE_C_FLAGS="-fsanitize=memory -fsanitize-memory-track-origins=2 -fno-omit-frame-pointer -stdlib=libc++ -g" -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=memory -stdlib=libc++ -L/home/henices/clang/msan/lib -Wl,-rpath,/home/henices/clang/msan/lib" -DCMAKE_MODULE_LINKER_FLAGS="-fsanitize=memory -stdlib=libc++ -L/home/henices/clang/msan/lib -Wl,-rpath,/home/henices/clang/msan/lib"
make -j4

Be aware of the fact, we link with a msan instrument build libc++.so

➜  bin ldd ./exiv2    
        linux-vdso.so.1 (0x00007ffe185b3000)
        libexiv2.so.27 => /home/henices/tests/exiv2/build_msan/lib/libexiv2.so.27 (0x00007fea0ad11000)
        libc++.so.1 => /home/henices/clang/msan/lib/libc++.so.1 (0x00007fea0aadc000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fea0a94e000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fea0a92c000)
        librt.so.1 => /lib64/librt.so.1 (0x00007fea0a921000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fea0a91a000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fea0a8fe000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fea0a735000)
        libz.so.1 => /lib64/libz.so.1 (0x00007fea0a71b000)
        libexpat.so.1 => /lib64/libexpat.so.1 (0x00007fea0a6ed000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fea0c64b000)

@henices
Copy link
Author

henices commented Jun 2, 2020

according the page https://clang.llvm.org/docs/MemorySanitizer.html

MSAN only support os: Linux NetBSD and FreeBSD

@D4N
Copy link
Member

D4N commented Jun 2, 2020 via email

@henices
Copy link
Author

henices commented Jun 2, 2020

@D4N
Copy link
Member

D4N commented Jun 3, 2020 via email

@henices
Copy link
Author

henices commented Jun 3, 2020

Thanks for the links! So does that mean that only your libc++ is instrumented?

On June 2, 2020 5:47:50 AM UTC, henices @.***> wrote: oss-fuzz provide a shell script: https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-clang/checkout_build_install_llvm.sh

Yes, only libc++ is instrumented.

@D4N
Copy link
Member

D4N commented Jun 3, 2020 via email

@henices
Copy link
Author

henices commented Jun 3, 2020

I try to build a msan instrument libexpat.so and then link it to exiv2 binary, the use-of-uninitialized-value error was gone.

so it is a false positive, sorry for the mistake.

@henices henices closed this as completed Jun 3, 2020
@D4N
Copy link
Member

D4N commented Jun 3, 2020 via email

@clanmills
Copy link
Collaborator

Wow! This is incredible. For sure, @D4N has learned some new tricks here.

I'm motivated to return to #984. I think that's caused by the namespace duplication issue which is confronted in more recent XMPsdk. It's worth investigating if the "deep down" reason is solved with the latest Adobe code. That would be a very strong motivator to remove the xmpsdk tree in 0.28 and rely on conan to deliver the library.

Sometimes doing less is ...... more! I don't know the history behind the xmpsdk source tree in Exiv2. It was there when I joined the project in 2008. Andreas said to me "everything concerning xmpsdk is a pain in the butt!".

Great Work, Guys. Thank You.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants