_sources/registries/fdd/fddXML/fdd000259.xml

<?xml version="1.0" encoding="UTF-8"?>
<fdd:FDD id="fdd000259" titleName="Speex Audio Codec, Version 1.2" shortName="SPX_1_2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:fdd="http://www.loc.gov/preservation/digital/formats/schemas/fdd/v1" xsi:schemaLocation="http://www.loc.gov/preservation/digital/formats/schemas/fdd/v1 http://www.loc.gov/preservation/digital/formats/schemas/fdd/v1/fdd-v1-2.xsd">
	<fdd:properties>
		<fdd:gdfrGenreSelection>
			<fdd:gdfrGenre>sound</fdd:gdfrGenre>
		</fdd:gdfrGenreSelection>
		<fdd:formatCategories>
			<fdd:category>encoding</fdd:category>
		</fdd:formatCategories>
		<fdd:gdfrComposition>unitary</fdd:gdfrComposition>
		<fdd:gdfrForm>binary</fdd:gdfrForm>
		<fdd:gdfrBasis>sampled</fdd:gdfrBasis>
		<fdd:updates>
			<fdd:date>2008-02-19</fdd:date>
		</fdd:updates>
		<fdd:draftStatus>Full</fdd:draftStatus>
	</fdd:properties>
	<fdd:identificationAndDescription>
		<fdd:fullName>Speex Audio Codec, Version 1.2</fdd:fullName>
		<fdd:keywords>
			<fdd:keyword>audio formats</fdd:keyword>
			<fdd:keyword>codecs</fdd:keyword>
		</fdd:keywords>
		<fdd:description>Speech codec designed for packet networks and voice over IP (VoIP) applications but not for mobile phones. File-based compression is also supported. The flexible codec is based on Code Excited Linear Prediction (CELP) and supports a wide range of speech quality and bit-rates. The VoIP-oriented design means that Speex is robust to lost but not to corrupted packets. Because Speex is targeted at a wide range of devices, its memory footprint is modest and its complexity, which is variable, may also be modest.</fdd:description>
		<fdd:shortDescription>Speech codec designed for packet networks and voice over IP (VoIP) application but not for mobile phones. File-based compression is also supported. The flexible codec is based on Code Excited Linear Prediction (CELP) and supports a wide range of speech quality and bit-rates.</fdd:shortDescription>
		<fdd:productionPhase>Generally used for final-state, end-user delivery.</fdd:productionPhase>
		<fdd:relationships>
			<fdd:relationship>
				<fdd:typeOfRelationship>Used by</fdd:typeOfRelationship>
				<fdd:relatedTo>
					<fdd:id>fdd000260</fdd:id>
					<fdd:shortName>Ogg_SPX</fdd:shortName>
					<fdd:titleName>Ogg Speex Audio Format</fdd:titleName>
				</fdd:relatedTo>
			</fdd:relationship>
			<fdd:relationship>
				<fdd:typeOfRelationship>Used by</fdd:typeOfRelationship>
				<fdd:relatedTo>
					<fdd:id>fdd000608</fdd:id>
					<fdd:shortName>NSV</fdd:shortName>
					<fdd:titleName>Nullsoft Streaming Video</fdd:titleName>
				</fdd:relatedTo>
			</fdd:relationship>
			<fdd:relationship>
				<fdd:typeOfRelationship>Affinity to</fdd:typeOfRelationship>
				<fdd:relatedTo>
					<fdd:titleName>CELP, Code Excited Linear Prediction</fdd:titleName>
				</fdd:relatedTo>
				<fdd:comment>Not documented at this Web site at this time.</fdd:comment>
			</fdd:relationship>
		</fdd:relationships>
	</fdd:identificationAndDescription>
	<fdd:localUse>
		<fdd:experience>In 2007, consideration was being given to the use of <fddLink id="fdd000260">Ogg_SPX</fddLink> for service copies of oral history recordings for access via the Web.</fdd:experience>
		<fdd:preference>See the Library of Congress Recommended Formats Statement for format preferences for <a href="https://www.loc.gov/preservation/resources/rfs/audio.html">audio works</a>.</fdd:preference>
	</fdd:localUse>
	<fdd:sustainabilityFactors>
		<fdd:disclosure>Fully documented. Developed by <a href="http://www.xiph.org">xiph</a> as an open source and patent-free project.</fdd:disclosure>
		<fdd:documentation>
			<a href="http://www.speex.org/docs/manual/speex-manual.pdf">The Speex Codec Manual</a>, Version 1.2 Beta 2, May 22, 2007.</fdd:documentation>
		<fdd:adoption>See <fddLink id="fdd000026">Ogg</fddLink>.</fdd:adoption>
		<fdd:licensingAndPatents>The <a href="http://www.speex.org/docs/manual/speex-manual.pdf">specification</a> provides the license in Appendix D. It is inspired by the BSD (Berkeley Software Distribution) family of free, near-public-domain software licenses. Paraphrasing appendix D: redistributions of source code or binary versions are free but must retain the copyright notice and other wording; the name of the Xiph.org Foundation or of contributors may not be used to endorse or promote products without specific prior written permission.</fdd:licensingAndPatents>
		<fdd:transparency>Encoding depends upon algorithms and tools to read; requires sophistication to build tools.</fdd:transparency>
		<fdd:selfDocumentation>See <fddLink id="fdd000026">Ogg</fddLink>.</fdd:selfDocumentation>
		<fdd:externalDependencies>None.</fdd:externalDependencies>
		<fdd:techProtection>See <fddLink id="fdd000026">Ogg</fddLink>.</fdd:techProtection>
	</fdd:sustainabilityFactors>
	<fdd:qualityAndFunctionalityFactors>
		<fdd:soundQF>
			<fdd:normalSound>Good support.</fdd:normalSound>
			<fdd:fidelity>This is compression designed for comprehensible speech, not for a rich representation of a full audio spectrum and dynamic range.  Paraphrased from the <a href="http://www.speex.org/docs/manual/speex-manual.pdf">specification</a>: CELP was selected as the encoding technique; it scales well to both low bit-rates (e.g. DoD CELP @ 4.8 kbps) and high bit-rates (e.g. G.728 @ 16 kbps).  Speex is designed for
three different sampling rates: 8 kHz, 16 kHz, and 32 kHz, referred to as narrowband (telephone quality), wideband, and ultra-wideband. The encoding process is controlled most of the time by a quality parameter that ranges from 0 to 10. In constant bit-rate (CBR) operation, the quality parameter is an integer, while for variable bit-rate (VBR), the parameter is a float.  There is also <i>Average Bit Rate</i> that dynamically adjusts VBR quality in order to meet a specific target bit-rate.  The management of bit-rate is important in VoIP, where the maximum must be low enough for the communication channel.</fdd:fidelity>
			<fdd:channels>Provides <i>intensity stereo coding</i>.<footNote id="1"/>
			</fdd:channels>
			<fdd:samples>None.</fdd:samples>
			<fdd:beyondSound>Not investigated at this time.</fdd:beyondSound>
		</fdd:soundQF>
	</fdd:qualityAndFunctionalityFactors>
	<fdd:fileTypeSignifiers>
		<fdd:signifiersGroup>
			<fdd:internetMediaType>
				<fdd:sigValues>
					<fdd:sigValue>audio/x-speex</fdd:sigValue>
				</fdd:sigValues>
				<fdd:note>  For Speex-in-Ogg, from the main part of the <a href="http://www.speex.org/docs/manual/speex-manual.pdf">specification</a>.  However, the recommended practice now seems to use the codecs parameter as described in   <a href="http://www.rfc-editor.org/rfc/rfc5334.txt">RFC 5334</a>.</fdd:note>
			</fdd:internetMediaType>
			<fdd:internetMediaType>
				<fdd:sigValues>
					<fdd:sigValue>audio/ogg; codecs=speex

</fdd:sigValue>
				</fdd:sigValues>
				<fdd:note>From <a href="http://www.rfc-editor.org/rfc/rfc5334.txt">RFC 5334</a>.</fdd:note>
			</fdd:internetMediaType>
			<fdd:internetMediaType>
				<fdd:sigValues>
					<fdd:sigValue>audio/speex</fdd:sigValue>
				</fdd:sigValues>
				<fdd:note>Proposed in various drafts of <i>RTP Payload Format for the Speex Codec</i>. e.g., http://www.ietf.org/internet-drafts/draft-ietf-avt-rtp-speex-05.txt (obsolete and hence not actively linked from this format description).</fdd:note>
			</fdd:internetMediaType>
			<fdd:other>
				<fdd:tag>Other</fdd:tag>
				<fdd:values>
					<fdd:sigValues>
						<fdd:sigValue>Speex   </fdd:sigValue>
					</fdd:sigValues>
					<fdd:note>Ogg Codec Identifier.  An 8-character string, with 3 trailing spaces, used within <fddLink id="fdd000026">Ogg</fddLink> container, at beginning of first header page to identify codec.  See <a href="http://www.rfc-editor.org/rfc/rfc5334.txt">IETF RFC 5334</a>
					</fdd:note>
				</fdd:values>
			</fdd:other>
			<fdd:other>
				<fdd:tag>Microsoft FOURCC</fdd:tag>
				<fdd:values>
					<fdd:sigValues>
						<fdd:sigValue>spex</fdd:sigValue>
					</fdd:sigValues>
					<fdd:note>From <a href="http://wiki.xiph.org/index.php/Oggless#Codec_Identifier:">http://wiki.xiph.org/index.php/Oggless#Codec_Identifier:</a>
					</fdd:note>
				</fdd:values>
			</fdd:other>
			<fdd:other>
				<fdd:tag>Other</fdd:tag>
				<fdd:values>
					<fdd:sigValues>
						<fdd:sigValue>speex</fdd:sigValue>
					</fdd:sigValues>
					<fdd:note>Codec identier, long ID. From <a href="http://wiki.xiph.org/index.php/Oggless#Codec_Identifier:">http://wiki.xiph.org/index.php/Oggless#Codec_Identifier:</a>
					</fdd:note>
				</fdd:values>
			</fdd:other>
		</fdd:signifiersGroup>
	</fdd:fileTypeSignifiers>
	<fdd:formatSpecifications>
		<fdd:urls>
			<fdd:url>
				<fdd:urlReference>
					<link>http://www.speex.org/docs/manual/speex-manual.pdf</link>
					<tag>The Speex Codec Manual, Version 1.2 Beta 2, May 22, 2007</tag>
				</fdd:urlReference>
			</fdd:url>
		</fdd:urls>
	</fdd:formatSpecifications>
	<fdd:usefulReferences>
		<fdd:urls>
			<fdd:url>
				<fdd:urlReference>
					<link>http://wiki.xiph.org/index.php/Main_Page</link>
					<tag>Xiph wiki</tag>
				</fdd:urlReference>
			</fdd:url>
			<fdd:url>
				<fdd:urlReference>
					<link>http://www.ietf.org/rfc/rfc5334.txt</link>
					<tag>IETF RFC 5334</tag>
					<comment>Introduces the use of the codecs parameter to distinguish encodings within the Ogg container format.</comment>
				</fdd:urlReference>
			</fdd:url>
			<fdd:url>
				<fdd:urlReference>
					<link>http://wiki.xiph.org/index.php/Oggless</link>
					<tag>Oggless</tag>
					<comment>Documentation for xiph codecs in containers other than Ogg.</comment>
				</fdd:urlReference>
			</fdd:url>
		</fdd:urls>
	</fdd:usefulReferences>
	<fdd:footNotes>
		<fdd:footNote id="1">
			<fdd:text>
				<i>Intensity stereo</i> as explained in the Wikipedia article <a href="http://en.wikipedia.org/wiki/Joint_%28audio_engineering%29">Joint (audio engineering)</a> (consulted August 24, 2007): &quot;More specifically, the dominance of inter-aural time differences (ITD) for sound localization by humans is only given for lower frequencies. That leaves inter-aural amplitude differences (IAD) as the dominant location indicator for higher frequencies. The idea of intensity stereo coding is to merge the upper spectrum into just one channel (thus reducing overall differences between channels) and to transmit a little side information about how to pan certain frequency regions to recover the IAD cues.&quot;</fdd:text>
		</fdd:footNote>
	</fdd:footNotes>
</fdd:FDD>