This is a W3C Schema "Technical Implementation" of the DDI Conceptual Specification. This schema is intended for use in producing electronic versions of codebooks for quantitative social science data. Please note that the attribute xml-lang in the a.globals group is an error that was persisted to retain backward compatibility. DO NOT USE THIS ATTRIBUTE. If this attribute has been used, transfer the content to xml:lang. Captures version of the element. Indicates version date for the element. Use YYYY-MM-DD, YYYY-MM, or YYYY formats. Used to capture the DDI-Lifecycle type URN for the element. This may be captured during translation from DDI-Lifecycle to DDI-Codebook structure or in preparation for transferring to a DDI-Lifecycle structure. Used to capture the DDI-Codebook type URN for the element. This is used to assign a DDI-Codebook specific URN to the element, according to the format prescribed by the DDI-Codebook standard. Base Element Type Description This type forms the basis for all elements. Every element may contain the attributes defined in the GLOBALS attribute group. Abstract Text Type Description This type forms the basis for all textual elements. Textual elements may contain text or a mix of select elements. This type is abstract and is refined by more specific types which will limit the allowable elements and attributes. Any textual element will be a subset of this type and can be processed as such. Simple Text Type Description This type forms the basis of most textual elements. Elements using this type may have mixed content (text and child elements). The child elements are from the PHRASE, FORM, and xhtml:BlkNoForm.mix (a specific subset of XHTML) content groups. 
Note that elements from the PHRASE and FORM groups must not be used with elements from the xhtml:BlkNoForm.mix group; one can use either elements from xhtml:BlkNoForm.mix or elements from the PHRASE and FORM groups. This type is extended in some cases to include additional attributes. Conceptual Text Type Description This type forms the basis for a textual element which may also provide for a conceptual description (see concept) and a longer description (see txt) of the element. If the concept and/or txt elements are used, then the element should contain no other child elements or text. Note that elements from the PHRASE and FORM groups must not be used with elements from the xhtml:BlkNoForm.mix group; one can use either elements from xhtml:BlkNoForm.mix or elements from the PHRASE and FORM groups. Phrase Type Description This type restricts the simpleTextType to allow for only child elements from the PHRASE content group. It still allows for mixed content (text and child elements). String Type Description This type restricts the base abstractTextType to only allow for text (i.e., no child elements). Integer Type Description This type restricts the base abstractTextType to only allow for an integer as text content. No child elements are allowed. Date Simple Type Description This simple type is a union of the various XML Schema date formats. Using this type, a date can be expressed as a year (YYYY), a year and month (YYYY-MM), a date (YYYY-MM-DD) or a complete date and time (YYYY-MM-DDThh:mm:ss). All of these formats allow for an optional timezone offset to be specified. Date Type Description This type restricts the base abstractTextType to allow for only the union of types defined in dateSimpleType as text content. No child elements are allowed. External Link Description This element permits encoders to provide links from any arbitrary element containing ExtLink as a subelement to electronic resources outside the codebook. 
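The four date forms listed under Date Simple Type earlier in this section (year, year and month, date, date and time, each with an optional timezone offset) can be recognized with a few regular expressions. The following is a minimal Python sketch, not part of the DDI schema: the function name classify_date and the pattern table are hypothetical, and the patterns only approximate the XML Schema lexical forms (for example, they do not check the number of days in a given month or allow negative years).

```python
import re

# Optional timezone: "Z" or a +hh:mm / -hh:mm offset (assumed lexical form).
_TZ = r"(?:Z|[+-]\d{2}:\d{2})?"
_MONTH = r"(?:0[1-9]|1[0-2])"
_DAY = r"(?:0[1-9]|[12]\d|3[01])"

# One pattern per format named in the Date Simple Type description.
_PATTERNS = {
    "year": re.compile(r"\d{4}" + _TZ),
    "year-month": re.compile(r"\d{4}-" + _MONTH + _TZ),
    "date": re.compile(r"\d{4}-" + _MONTH + "-" + _DAY + _TZ),
    "date-time": re.compile(
        r"\d{4}-" + _MONTH + "-" + _DAY
        + r"T(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d(?:\.\d+)?" + _TZ
    ),
}


def classify_date(text):
    """Return the name of the first matching format, or None if none matches."""
    value = text.strip()
    for name, pattern in _PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return None
```

A value such as "10 November 1998" matches none of the patterns, which is one way to flag content that would need conversion before being placed in a "date" attribute.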
Link Description This element permits encoders to provide links from any arbitrary element containing Link as a subelement to other elements in the codebook. Form Type Description This type defines the basis for all elements in the FORM content group. This is derived from the abstractTextType. The content may still be mixed (text and child elements), but the child elements are restricted to be those from the PHRASE and FORM content groups, or the itm and label elements. Further, the possible attributes are restricted. This type is abstract, so specific form elements will further refine this type, but all elements in the FORM content group will conform to this structure and may be processed as such. Division Description Formatting element: marks a subdivision in a text. Emphasis Description Formatting element: marks words or phrases that are emphasized for rhetorical effect. Head Description Formatting element: marks off a heading to a division, list, etc. Highlight Description Formatting element: marks a word or phrase as graphically distinct from the surrounding text, while making no claim for the reasons. List Description Formatting element: contains any sequence of items (entries) organized as a list. Item Description Formatting element: marks entries (items) in a list. Label Description Formatting element: contains the label associated with an item in a list; in glossaries, marks the term being defined. Label Description A short description of the parent element. Attribute "level" indicates the level to which the element applies (variable group, nCube group, variable, etc.). The "vendor" attribute allows for specification of different labels for use with different vendors' software. Attribute "country" allows specification of a different label by country for the same element to which it applies. Attribute "sdatrefs" allows pointing to specific dates, universes, or other information encoded in the study description. 
The attributes "country" and "sdatrefs" are intended to cover instances of comparative data, by retaining consistency in some elements over time and geography, but altering, as appropriate, information pertaining to date, language, and/or location. Example Person (A) Record ]]> Study Procedure Information ]]> Political Involvement and National Goals ]]> Household Variable Section ]]> Sex by Work Experience in 1999 by Income in 1999 ]]> Tenure by Age of Householder ]]> Why No Holiday-No Money ]]> Other Agricultural and Related Occupations ]]> Better ]]> About the same ]]> Inap. ]]> Age by Sex by Poverty Status ]]> SAS Data Definition Statements for ICPSR 6837 ]]> Paragraph Description Marks a paragraph. Abstract Description An unformatted summary describing the purpose, nature, and scope of the data collection, special characteristics of its contents, major subject areas covered, and what questions the PIs attempted to answer when they conducted the study. A listing of major variables in the study is important here. In cases where a codebook contains more than one abstract (for example, one might be supplied by the data producer and another prepared by the data archive where the data are deposited), the "source" and "date" attributes may be used to distinguish the abstract versions. Maps to Dublin Core Description element. Inclusion of this element in the codebook is recommended. The "date" attribute should follow ISO convention of YYYY-MM-DD. The contentType attribute provides forward-compatibility with DDI 3 by describing where the content fits in that structure, or whether it is mixed in terms of what is contained. Example Data on labor force activity for the week prior to the survey are supplied in this collection. Information is available on the employment status, occupation, and industry of persons 15 years old and over. 
Demographic variables such as age, sex, race, marital status, veteran status, household relationship, educational background, and Hispanic origin are included. In addition to providing these core data, the May survey also contains a supplement on work schedules for all applicable persons aged 15 years and older who were employed at the time of the survey. This supplement focuses on shift work, flexible hours, and work at home for both main and second jobs. ]]> Location of Data Collection Description Location where the data collection is currently stored. Use the URI attribute to provide a URN or URL for the storage site or the actual address from which the data may be downloaded. Actions to Minimize Losses Description Summary of actions taken to minimize data loss. Includes information on actions such as follow-up visits, supervisory checks, historical matching, estimation, etc. Example To minimize the number of unresolved cases and reduce the potential nonresponse bias, four follow-up contacts were made with agencies that had not responded by various stages of the data collection process. ]]> Alternative Title Description A title by which the work is commonly referred, or an abbreviation of the title. Data Appraisal Description Information on data appraisal. Unit of Analysis Description Basic unit of analysis or observation that the file describes: individuals, families/households, groups, institutions/organizations, administrative units, etc. The "unit" attribute is included to permit the development of a controlled vocabulary for this element. Example individuals ]]> Analysis Unit Description Provides information regarding whom or what the variable/nCube describes. The element may be repeated only to support multiple language expressions of the content. Example This variable reports election returns at the constituency level. 
]]> Household ]]> Authoring Entity/Primary Investigator Description The person, corporate body, or agency responsible for the work's substantive and intellectual content. Repeat the element for each author, and use "affiliation" attribute if available. Invert first and last name and use commas. Author of data collection (codeBook/stdyDscr/citation/rspStmt/AuthEnty) maps to Dublin Core Creator element. Inclusion of this element in codebook is recommended. The "author" in the Document Description should be the individual(s) or organization(s) directly responsible for the intellectual content of the DDI version, as distinct from the person(s) or organization(s) responsible for the intellectual content of the earlier paper or electronic edition from which the DDI edition may have been derived. Example United States Department of Commerce. Bureau of the Census ]]> Rabier, Jacques-Rene ]]> Availability Status Description Statement of collection availability. An archive may need to indicate that a collection is unavailable because it is embargoed for a period of time, because it has been superseded, because a new edition is imminent, etc. It is anticipated that a controlled vocabulary will be developed for this element. Example This collection is superseded by CENSUS OF POPULATION, 1880 [UNITED STATES]: PUBLIC USE SAMPLE (ICPSR 6460). ]]> Backflow Description Contains a reference to IDs of possible preceding questions. The "qstn" IDREFS may be used to specify the question IDs. Example For responses on a similar topic, see questions 12-15. ]]> ]]> Bibliographic Citation Description Complete bibliographic reference containing all of the standard elements of a citation that can be used to cite the work. The "format" attribute is provided to enable specification of the particular citation style used, e.g., APA, MLA, Chicago, etc. Example Rabier, Jacques-Rene, and Ronald Inglehart. EURO-BAROMETER 11: YEAR OF THE CHILD IN EUROPE, APRIL 1979 [Codebook file]. 
Conducted by Institut Francais D'Opinion Publique (IFOP), Paris, et al. ICPSR ed. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [producer and distributor], 1981. ]]> Geographic Bounding Polygon Description This field allows the creation of multiple polygons to describe in a more detailed manner the geographic area covered by the dataset. It should only be used to define the outer boundaries of a covered area. For example, in the United States, such polygons can be created to define boundaries for Hawaii, Alaska, and the continental United States, but not interior boundaries for the contiguous states. This field is used to refine a coordinate-based search, not to actually map an area. If the boundPoly element is used, then geoBndBox MUST be present, and all points enclosed by the boundPoly MUST be contained within the geoBndBox. Elements westBL, eastBL, southBL, and northBL of the geoBndBox should each be represented in at least one point of the boundPoly description. Example Nevada State ]]> 42.002207 -120.005729004 42.002207 -114.039663 35.9 -114.039663 36.080 -114.544 35.133 -114.542 35.00208499998 -114.63288 35.00208499998 -114.63323 38.999 -120.005729004 42.002207 -120.005729004 ]]> Norway ]]> 80.76416 33.637497 80.76416 10.2 62.48395 4.789583 57.987915 4.789583 57.987915 11.8 61.27794 13.2336 63.19012 13.2336 67.28615 17.24580 68.14297 21.38362 68.14297 25.50054 69.39685 27.38137 68.76991 28.84424 68.76991 31.31021 71.42 31.31021 71.42 33.637497 80.76416 33.637497 ]]> Number of cases / Record Quantity Description Number of cases or observations. Example 1011 ]]> Category Level Statistic Description May include frequencies, percentages, or crosstabulation results. This field can contain one of the following: 1. textual information (e.g., PCDATA), or 2. non-parseable character data (e.g., the statistics), or 3. some other form of external information (table, image, etc.) 
In case 1, the tag can be used to mark up character data; tables can also be included in the actual markup. In cases 2 or 3, the element can be left empty and the "URI" attribute used to refer to the external object containing the information. The attribute "type" indicates the type of statistics presented - frequency, percent, or crosstabulation. If a value of "other" is used for this attribute, the "otherType" attribute should take a value from a controlled vocabulary. This option should only be used when applying a controlled vocabulary to this attribute. Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. Example 256 ]]> Category Value Description The explicit response. Example 9 ]]> Category Level Description Used to describe the levels of the category hierarchy. Note that we do not indicate nesting levels or roll-up structures here. This is done to be able to support ragged hierarchies. A category level may be linked to one or more maps of the variable content. This is done by referencing the IDs of the appropriate geoMap elements in the attribute geoMap. Example ]]> ]]> ]]> Category Description A description of a particular response. The attribute "missing" indicates whether this category contains missing data or not. The attribute "missType" is used to specify the type of missing data, e.g., inap., don't know, no answer, etc. The attribute "country" allows for the denotation of country-specific category values. The "sdatrefs" attribute records the ID values of all elements within the summary data description that apply to this category. The exclusiveness attribute ("excls") should be set to "false" if the category can appear in more than one place in the classification hierarchy. The attribute "catgry" is an IDREF referencing any child categories of this category element. It is used to capture nested hierarchies of categories. 
The attribute "level" is an IDREF referencing the catLevel ID in which this category exists. Example ]]> ]]> ]]> 0 Management, professional and related occupations ]]> 01 Management occupations ]]> 011 Top executives ]]> 012 Financial managers ]]> Category Group Description A description of response categories that might be grouped together. The attribute "missing" indicates whether this category group contains missing data or not. The attribute "missType" is used to specify the type of missing data, e.g., inap., don't know, no answer, etc. The attribute catGrp is used to indicate all the subsidiary category groups which nest underneath the current category group. This allows for the encoding of a hierarchical structure of category groups. The "levelno" attribute allows the addition of a level number, and "levelnm" allows the addition of a level name to the category group. The completeness attribute ("compl") should be set to "false" if the category group is incomplete (not a complete aggregate of all sub-nodes or children). The exclusiveness attribute ("excls") should be set to "false" if the category group can appear in more than one place in the classification hierarchy. Citation Requirement Description Text of requirement that a data collection should be cited properly in articles or other publications that are based on analysis of the data. Example Publications based on ICPSR data collections should acknowledge those sources by means of bibliographic citations. To ensure that such source attributions are captured for social science bibliographic utilities, citations must appear in footnotes or in the reference section of publications. 
]]> Bibliographic Citation Description This element encodes the bibliographic information for the work at the level specified: (1) Document Description, Citation (of Marked-up Document), (2) Document Description, Citation (of Marked-up Document Source), (3) Study Description, Citation (of Study), (4) Study Description, Other Material, and (5) Other Material for the study itself. Bibliographic information includes title information, statement of responsibility, production and distribution information, series and version information, text of a preferred bibliographic citation, and notes (if any). A MARCURI attribute is provided to link to the MARC record for the citation. Cleaning Operations Description Methods used to "clean" the data collection, e.g., consistency checking, wild code checking, etc. The "agency" attribute permits specification of the agency doing the data cleaning. Example Checks for undocumented codes were performed, and data were subsequently revised in consultation with the principal investigator. ]]> Coder Instructions Description Any special instructions to those who converted information from one form to another for a particular variable. This might include the reordering of numeric information into another form or the conversion of textual information into numeric information. Example Use the standard classification tables to present responses to the question: What is your occupation? into numeric codes. ]]> This should be used for materials that are primarily descriptions of the content and use of the study, such as appendices, sampling information, weighting details, methodological and technical details, publications based upon the study content, related studies or collection of studies, etc. This section is intended to include or to link to materials used in the production of the study or useful in the analysis of the study. 
Codebook Description Every element in the DDI DTD/Schema has the following attributes: ID - This uniquely identifies each element. xml-lang - Use of this attribute is deprecated, and it will no longer be supported in the next major version of the DDI specification. For newly created XML documents, please use xml:lang. xml:lang - This attribute specifies the language used in the contents and attribute values of any element in the XML document. Use of ISO (www.iso.org) language codes is recommended. source - This attribute identifies the source that provided information in the element. If the documentation contains two differing sets of information on Sampling Procedure -- one provided by the data producer and one by the archive where the data is deposited -- this information can be distinguished through the use of the source attribute. Note also that the DDI contains a linking mechanism permitting arbitrary links between internal elements (See Link) and from internal elements to external sources (See ExtLink). The top-level element, codeBook, also includes a version attribute to specify the version number of the DDI specification. codeBookAgency - This attribute holds the agency name of the creator or maintainer of the codeBook instance as a whole, and is designed to support forward compatibility with DDI-Lifecycle. The recommended value is the agency name as filed with the DDI Agency ID Registry, with optional additional sub-agency extensions. Cohort Description The element cohort is used when the nCube contains a limited number of categories from a particular variable, as opposed to the full range of categories. The attribute "catRef" is an IDREF to the actual category being used. The attribute "value" indicates the actual value attached to the category that is being used. Example ]]> Date of Collection Description Contains the date(s) when the data were collected. Use the event attribute to specify "start", "end", or "single" for each date entered. 
The ISO standard for dates (YYYY-MM-DD) is recommended for use with the "date" attribute. The "cycle" attribute permits specification of the relevant cycle, wave, or round of data. Maps to Dublin Core Coverage element. Inclusion of this element in the codebook is recommended. Example 10 November 1998 ]]> Mode of Data Collection Description The method used to collect the data; instrumentation characteristics. XHTML formatting may be used in the txt element for forward-compatibility with DDI 3. Example telephone interviews ]]> face-to-face interviews ]]> mail questionnaires ]]> computer-aided telephone interviews (CATI) ]]> Characteristics of Data Collection Situation Description Description of noteworthy aspects of the data collection situation. Includes information on factors such as cooperativeness of respondents, duration of interviews, number of call-backs, etc. Example There were 1,194 respondents who answered questions in face-to-face interviews lasting approximately 75 minutes each. ]]> Extent of Collection Description Summarizes the number of physical files that exist in a collection, recording the number of files that contain data and noting whether the collection contains machine-readable documentation and/or other supplementary files and information such as data dictionaries, data definition statements, or data collection instruments. Example 1 data file + machine-readable documentation (PDF) + SAS data definition statements ]]> Column Specification Completeness of Study Stored Description This item indicates the relationship of the data collected to the amount of data coded and stored in the data collection. Information as to why certain items of collected information were not included in the data file stored by the archive should be provided. Example Because of embargo provisions, data values for some variables have been masked. Users should consult the data definition statements to see which variables are under embargo. 
A new version of the collection will be released by ICPSR after embargoes are lifted. ]]> Concept Description The general subject to which the parent element may be seen as pertaining. This element serves the same purpose as the keywords and topic classification elements, but at the data description level. The "vocab" attribute is provided to indicate the controlled vocabulary, if any, used in the element, e.g., LCSH (Library of Congress Subject Headings), MeSH (Medical Subject Headings), etc. The "vocabURI" attribute specifies the location for the full controlled vocabulary. Example Income ]]> more experience ]]> Income ]]> SF: 311-312 draft horses ]]> Conditions Description Indicates any additional information that will assist the user in understanding the access and use conditions of the data collection. Example The data are available without restriction. Potential users of these datasets are advised, however, to contact the original principal investigator Dr. J. Smith (Institute for Social Research, The University of Michigan, Box 1248, Ann Arbor, MI 48106), about their intended uses of the data. Dr. Smith would also appreciate receiving copies of reports based on the datasets. ]]> Confidentiality Declaration Description This element is used to determine if signing of a confidentiality declaration is needed to access a resource. The "required" attribute is used to aid machine processing of this element, and the default specification is "yes". The "formNo" attribute indicates the number or ID of the form that the user must fill out. The "URI" attribute may be used to provide a URN or URL for online access to a confidentiality declaration form. Example To download this dataset, the user must sign a declaration of confidentiality. ]]> To obtain this dataset, the user must complete a Restricted Data Use Agreement. ]]> Contact Persons Description Names and addresses of individuals responsible for the work. 
Individuals listed as contact persons will be used as resource persons regarding problems or questions raised by the user community. The URI attribute should be used to indicate a URN or URL for the homepage of the contact individual. The email attribute is used to indicate an email address for the contact individual. Example Jane Smith ]]> Control Operations Description Methods to facilitate data control performed by the primary investigator or by the data archive. Specify any special programs used for such operations. The "agency" attribute may be used to refer to the agency that performed the control operation. Example Ten percent of data entry forms were reentered to check for accuracy. ]]> Controlled Vocabulary Used Description Provides a code value, as well as a reference to the code list from which the value is taken. Note that the CodeValue can be restricted to reference an enumeration. Code List ID Description Identifies the code list that the value is taken from. Example TimeMethod ]]> Code List Name Description Identifies the code list that the value is taken from with a human-readable name. Example Time Method]]> Code List Agency Name Description Agency maintaining the code list. Example DDI Alliance]]> Code List Version ID Description Version of the code list. (Default value is 1.0) Example 1.1]]> Code List URN Description Identifies the code list that the value is taken from with a URN. Example urn:ddi-cv:TimeMethod:1.1]]> Code List Scheme URN Description Identifies the code list scheme using a URN. Example http://www.ddialliance.org/Specification/DDI-CV/TimeMethod_1.1_Genericode1.0_DDI-CVProfile1.0.xml]]> Usage Description Defines where in the instance the controlled vocabulary which is identified is utilized. A controlled vocabulary may occur either in the content of an element or in an attribute on an element. 
The usage can either point to a collection of elements using an XPath via the selector element or point to a more specific collection of elements via their identifier using the specificElements element. If the controlled vocabulary occurs in an attribute within the element, the attribute element identifies the specific attribute. When specific elements are specified, an authorized code value may also be provided. If the current value of the element or attribute identified is not in the controlled vocabulary or is not identical to a code value, the authorized code value identifies a valid code value corresponding to the meaning of the content in the element or attribute. Selector Description Identifies a collection of elements in which a controlled vocabulary is used. This is a simplified XPath which must correspond to the actual instance in which it occurs, which is to say that the fully qualified element names here must correspond to those in the instance. This XPath can only identify elements and does not allow for any predicates. The XPath must either be rooted or deep. Example /codeBook/stdyDscr/method/dataColl/timeMeth]]> Specific Elements Description Identifies a collection of specific elements via their identifiers in the refs attribute, which allows for a tokenized list of identifier values which must correspond to identifiers which exist in the instance. The authorizedCodeValue attribute can be used to provide a valid code value corresponding to the meaning of the content in the element or attribute when the identified element or attribute does not use an actual valid value from the controlled vocabulary. Example ]]> Attribute Description Identifies an attribute within the element(s) identified by the selector or specificElements in which the controlled vocabulary is used. 
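The Selector constraints stated earlier in this section (a path of element names only, no predicates, either rooted or deep) lend themselves to a mechanical check. The following Python sketch is illustrative only: the function name is_simple_selector is hypothetical, "rooted" is assumed to mean a path beginning with "/" and "deep" a path beginning with "//", and element names are approximated with an ASCII-oriented pattern that permits an optional namespace prefix.

```python
import re

# Approximation of an XML name: a letter or underscore, then word
# characters, dots, or hyphens (assumption: not the full XML Name grammar).
_NAME = r"[A-Za-z_][\w.\-]*"
# A step is an element name with an optional namespace prefix.
_STEP = rf"(?:{_NAME}:)?{_NAME}"
# Rooted ("/a/b") or deep ("//a/b"); no predicates, attributes, or wildcards.
_SELECTOR = re.compile(rf"//?{_STEP}(?:/{_STEP})*")


def is_simple_selector(path):
    """Return True if path is a rooted or deep chain of element names."""
    return _SELECTOR.fullmatch(path.strip()) is not None
```

Under these assumptions the example selector /codeBook/stdyDscr/method/dataColl/timeMeth is accepted, while a relative path, a path with a predicate such as [@type], or a path ending in an attribute step is rejected. The check is deliberately conservative: it also rejects "//" in the middle of a path.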
The fully qualified attribute name used here must correspond to that in the instance, which is to say that if the attribute is namespace qualified, the prefix used here must match that which is defined in the instance. Example type]]> Copyright Description Copyright statement for the work at the appropriate level. Copyright for data collection (codeBook/stdyDscr/citation/prodStmt/copyright) maps to Dublin Core Rights. Inclusion of this element is recommended. Example Copyright(c) ICPSR, 2000 ]]> Description This is an empty element containing only the attributes listed below. It is used to identify the coordinates of the data item within a logical nCube describing aggregate data. CubeCoord is repeated for each dimension of the nCube giving the coordinate number ("coordNo") and coordinate value ("coordVal"). Coordinate value reference ("coordValRef") is an ID reference to the variable that carries the coordinate value. The attributes provide a complete coordinate location of a cell within the nCube. Example ]]> ]]> ]]> Data Access Description This section describes access conditions and terms of use for the data collection. In cases where access conditions differ across individual files or variables, multiple access conditions can be specified. The access conditions applying to a study, file, variable group, or variable can be indicated by an IDREF attribute on the study, file, variable group, or variable elements called "access". Other Forms of Data Appraisal Description Other issues pertaining to data appraisal. Describe here issues such as response variance, nonresponse rate and testing for bias, interviewer and response bias, confidence levels, question bias, etc. The "type" attribute allows for optional typing of data appraisal processes and the optional use of a controlled vocabulary. 
Example These data files were obtained from the United States House of Representatives, who received them from the Census Bureau accompanied by the following caveats: "The numbers contained herein are not official 1990 decennial Census counts. The numbers represent estimates of the population based on a statistical adjustment method applied to the official 1990 Census figures using a sample survey intended to measure overcount or undercount in the Census results. On July 15, 1991, the Secretary of Commerce decided not to adjust the official 1990 decennial Census counts (see 56 Fed. Reg. 33582, July 22, 1991). In reaching his decision, the Secretary determined that there was not sufficient evidence that the adjustment method accurately distributed the population across and within states. The numbers contained in these tapes, which had to be produced prior to the Secretary's decision, are now known to be biased. Moreover, the tapes do not satisfy standards for the publication of Federal statistics, as established in Statistical Policy Directive No. 2, 1978, Office of Federal Statistical Policy and Standards. Accordingly, the Department of Commerce deems that these numbers cannot be used for any purpose that legally requires use of data from the decennial Census and assumes no responsibility for the accuracy of the data for any purpose whatsoever. The Department will provide no assistance in interpretation or use of these numbers." ]]> Extent of Processing Checks Description Indicate here, at the file level, the types of checks and operations performed on the data file. A controlled vocabulary may be developed for this element in the future. The following examples are based on ICPSR's Extent of Processing scheme: Example The archive produced a codebook for this collection. ]]> Consistency checks were performed by Data Producer/ Principal Investigator. ]]> Consistency checks performed by the archive. 
]]> The archive generated SAS and/or SPSS data definition statements for this collection. ]]> Frequencies were provided by Data Producer/Principal Investigator. ]]> Frequencies provided by the archive. ]]> Missing data codes were standardized by Data Producer/ Principal Investigator. ]]> Missing data codes were standardized by the archive. ]]> The archive performed recodes and/or calculated derived variables. ]]> Data were reformatted by the archive. ]]> Checks for undocumented codes were performed by Data Producer/Principal Investigator. ]]> Checks for undocumented codes were performed by the archive. ]]> Data Collection Methodology Description Information about the methodology employed in a data collection. Sample Frame Description Sample frame describes the sampling frame used for identifying the population from which the sample was taken. For example, a telephone book may be a sample frame for a phone survey. In addition to the name, label and text describing the sample frame, this structure lists who maintains the sample frame, the period for which it is valid, a use statement, the universe covered, the type of unit contained in the frame as well as the number of units available, the reference period of the frame and procedures used to update the frame. Use multiple use statements to provide different uses under different conditions. Repeat elements within the use statement to support multiple languages. Sample Frame Name Description Name of the sample frame. Example City of St. Paul Directory]]> Valid Period Description Defines a time period for the validity of the sampling frame. Enter dates in YYYY-MM-DD format. Example 2009-07-01 2011-06-30 ]]> Reference Period Description Indicates the period of time in which the sampling frame was actually used for the study in question. Use ISO 8601 date/time formats to enter the relevant date(s). Example 2009-06-01 ]]> Frame Unit Description Provides information about the sampling frame unit.
The attribute "isPrimary" is boolean, indicating whether the unit is primary or not. Example Primary listed owners of published phone numbers in the City of St. Paul ]]> Unit Type Description Describes the type of sampling frame unit. The attribute "numberOfUnits" provides the number of units in the sampling frame. Example Primary listed owners of published phone numbers in the City of St. Paul ]]> Target Sample Size Description Provides both the target size of the sample (this is the number in the original sample, not the number of respondents) as well as the formula used for determining the sample size. Sample Size Description This element provides the targeted sample size in integer format. Example 385 ]]> Sample Size Formula Description This element includes the formula that was used to determine the sample size. Example n0 = Z^2 pq / e^2 = (1.96)^2 (.5)(.5) / (.05)^2 = 385 individuals ]]> Instrument Development Description Describe any development work on the data collection instrument. The "type" attribute allows for the optional use of a defined development type, with or without a controlled vocabulary. Example The questionnaire was pre-tested with split-panel tests, as well as an analysis of non-response rates for individual items, and response distributions. ]]> Update Procedure Description Description of how and with what frequency the sample frame is updated. Example Changes are collected as they occur through registration and loss of phone number from the specified geographic area. Data are compiled for the date June 1st of odd numbered years, and published on July 1st for the following two-year period. ]]> Custodian Description Custodian identifies the agency or individual who is responsible for creating or maintaining the sample frame. Attribute affiliation provides the affiliation of the custodian with an agency or organization. Attribute abbr provides an abbreviation for the custodian.
Example DEX Publications ]]> Collector Training Description Describes the training provided to data collectors, including interviewer training, process testing, compliance with standards, etc. This is repeatable for language and to capture different aspects of the training process. The type attribute allows specification of the type of training being described. Example Describe research project, describe population and sample, suggest methods and language for approaching subjects, explain questions and key terms of survey instrument. ]]> Data Collector Description The entity (individual, agency, or institution) responsible for administering the questionnaire or interview or compiling the data. This refers to the entity collecting the data, not to the entity producing the documentation. Attribute "abbr" may be used to list common abbreviations given to agencies, etc. Attribute "affiliation" may be used to record affiliation of the data collector. The role attribute specifies the role of the person in the data collection process. Example Survey Research Center ]]> Variable Description Description Description of variables. Description Identifies a physical storage location for an individual data entry, serving as a link between the physical location and the logical content description of each data item. The attribute "varRef" is an IDREF that points to a discrete variable description. If the data item is located within an nCube (aggregate data), use the attribute "nCubeRef" (IDREF) to point to the appropriate nCube and the element CubeCoord to identify the coordinates of the data item within the nCube.
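As a hedged illustration of the data item description above, the following Python sketch assembles a dataItem for one cell of an nCube, with one CubeCoord per dimension. The element and attribute names (dataItem, nCubeRef, CubeCoord, coordNo, coordVal) are taken from this documentation. The helper build_data_item, the ID "NC1", and the coordinate values are hypothetical, and nesting CubeCoord directly inside dataItem is an assumption that should be checked against the schema.

```python
import xml.etree.ElementTree as ET


def build_data_item(ncube_ref, coords):
    """Build a dataItem element with one CubeCoord per nCube dimension.

    coords maps a coordinate number (coordNo) to its value (coordVal).
    Assumption: CubeCoord elements are direct children of dataItem.
    """
    item = ET.Element("dataItem", {"nCubeRef": ncube_ref})
    for coord_no, coord_val in sorted(coords.items()):
        ET.SubElement(
            item,
            "CubeCoord",
            {"coordNo": str(coord_no), "coordVal": str(coord_val)},
        )
    return item


# Hypothetical cell at coordinates (3, 7) of a two-dimensional nCube "NC1".
item = build_data_item("NC1", {1: 3, 2: 7})
print(ET.tostring(item, encoding="unicode"))
```

Sorting by coordinate number keeps the CubeCoord elements in dimension order, which makes the serialized output easier to compare across cells.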
Kind of Data Description The type of data included in the file: survey data, census/enumeration data, aggregate data, clinical data, event/transaction data, program source code, machine-readable text, administrative records data, experimental data, psychological test, textual data, coded textual, coded documents, time budget diaries, observation data/ratings, process-produced data, etc. This element maps to Dublin Core Type element. The type attribute can be used for forward-compatibility with DDI 3, by providing a type for use of controlled vocabulary, as this is descriptive in DDI 2 and CodeValue in DDI 3. Example survey data ]]> Missing Data Description This element can be used to give general information about missing data, e.g., that missing data have been standardized across the collection, missing data are present because of merging, etc. Example Missing data are represented by blanks. ]]> The codes "-1" and "-2" are used to represent missing data. ]]> Data Sources Description Used to list the book(s), article(s), serial(s), and/or machine-readable data file(s)--if any--that served as the source(s) of the data collection. Example "Voting Scores." CONGRESSIONAL QUARTERLY ALMANAC 33 (1977), 487-498. ]]> United States Internal Revenue Service Quarterly Payroll File ]]> Definition Description Rationale for why the group was constituted in this way. Example The following eight variables were only asked in Ghana. ]]> The following four nCubes form a single presentation table. ]]> Date of Deposit Description The date that the work was deposited with the archive that originally received it. The ISO standard for dates (YYYY-MM-DD) is recommended for use with the "date" attribute. Example January 25, 1999 ]]> Deposit Requirement Description Information regarding user responsibility for informing archives of their use of data through providing citations to the published work or providing copies of the manuscripts. 
Example To provide funding agencies with essential information about use of archival resources and to facilitate the exchange of information about ICPSR participants' research activities, users of ICPSR data are requested to send to ICPSR bibliographic citations for, or copies of, each completed manuscript or thesis abstract. Please indicate in a cover letter which data were used. ]]> Depositor Description The name of the person (or institution) who provided this work to the archive storing it. Example Bureau of Justice Statistics ]]> Derivation Description Used only in the case of a derived variable, this element provides both a description of how the derivation was performed and the command used to generate the derived variable, as well as a specification of the other variables in the study used to generate the derivation. The "var" attribute provides the ID values of the other variables in the study used to generate this derived variable. Major Deviations from the Sample Design Description Information indicating correspondence as well as discrepancies between the sampled units (obtained) and available statistics for the population (age, sex-ratio, marital status, etc.) as a whole. XHTML formatting may be used in this element for forward-compatibility with DDI 3. Example The suitability of Ohio as a research site reflected its similarity to the United States as a whole. The evidence extended by Tuchfarber (1988) shows that Ohio is representative of the United States in several ways: percent urban and rural, percent of the population that is African American, median age, per capita income, percent living below the poverty level, and unemployment rate. Although results generated from an Ohio sample are not empirically generalizable to the United States, they may be suggestive of what might be expected nationally. ]]> Data Fingerprint Description Allows for assigning a hash value (digital fingerprint) to the data or data file. 
Set the attribute flag to "data" when the hash value provides a digital fingerprint to the data contained in the file regardless of the storage format (ASCII, SAS, binary, etc.). One approach to compute a data fingerprint is the Universal Numerical Fingerprint (UNF). Set the attribute flag to "dataFile" if the digital fingerprint is only for the data file in its current storage format. Provide the digital fingerprint in digitalFingerprintValue and identify the algorithm specification used (add version as a separate entry if it is not part of the specification entry). Example UNF:3:DaYlT6QSX9r0D50ye+tXpA== UNF v5.0 Calculation Procedure [http://thedata.org/book/unf-version-5-0]UNF V5 ]]> File Dimensions Description Dimensions of the overall file. Disclaimer Description Information regarding responsibility for uses of the data collection. This element may be repeated to support multiple language expressions of the content. Example The original collector of the data, ICPSR, and the relevant funding agency bear no responsibility for uses of this collection or for interpretations or inferences based upon such uses. ]]> Date of Distribution Description Date that the work was made available for distribution/presentation. The ISO standard for dates (YYYY-MM-DD) is recommended for use with the "date" attribute. If using a text entry in the element content, the element may be repeated to support multiple language expressions. Example January 25, 1999 ]]> Distributor Statement Description Distribution statement for the work at the appropriate level: marked-up document; marked-up document source; study; study description, other material; other material for study. Distributor Description The organization designated by the author or producer to generate copies of the particular work including any necessary editions or revisions. Names and addresses may be specified and other archives may be co-distributors.
A URI attribute is included to provide an URN or URL to the ordering service or download facility on a Web site. Example Ann Arbor, MI: Inter-university Consortium for Political and Social Research ]]> Dimension Description This element defines a variable as a dimension of the nCube, and should be repeated to describe each of the cube's dimensions. The attribute "rank" is used to define the coordinate order (rank="1", rank="2", etc.) The attribute "varRef" is an IDREF that points to the variable that makes up this dimension of the nCube. Document Description Description The Document Description consists of bibliographic information describing the DDI-compliant document itself as a whole. This Document Description can be considered the wrapper or header whose elements uniquely describe the full contents of the compliant DDI file. Since the Document Description section is used to identify the DDI-compliant file within an electronic resource discovery environment, this section should be as complete as possible. The author in the Document Description should be the individual(s) or organization(s) directly responsible for the intellectual content of the DDI version, as distinct from the person(s) or organization(s) responsible for the intellectual content of the earlier paper or electronic edition from which the DDI edition may have been derived. The producer in the Document Description should be the agency or person that prepared the marked-up document. Note that the Document Description section contains a Documentation Source subsection consisting of information about the source of the DDI-compliant file-- that is, the hardcopy or electronic codebook that served as the source for the marked-up codebook. 
These sections allow the creator of the DDI file to produce version, responsibility, and other descriptions relating to both the creation of that DDI file as a separate and reformatted version of source materials (either print or electronic) and the original source materials themselves. Documentation Source Description Citation for the source document. This element encodes the bibliographic information describing the source codebook, including title information, statement of responsibility, production and distribution information, series and version information, text of a preferred bibliographic citation, and notes (if any). Information for this section should be taken directly from the source document whenever possible. If additional information is obtained and entered in the elements within this section, the source of this information should be noted in the source attribute of the particular element tag. A MARCURI attribute is provided to link to the MARC record for this citation. Documentation Status Description Use this field to indicate if the documentation is being presented/distributed before it has been finalized. Some data producers and social science data archives employ data processing strategies that provide for release of data and documentation at various stages of processing. The element may be repeated to support multiple language expressions of the content. Example This marked-up document includes a provisional data dictionary and brief citation only for the purpose of providing basic access to the data file. A complete codebook will be published at a later date. ]]> Derivation Command Description The actual command used to generate the derived variable. The "syntax" attribute is used to indicate the command language employed (e.g., SPSS, SAS, Fortran, etc.). The element may be repeated to support multiple language expressions of the content. Example RECODE V1 TO V3 (0=1) (1=0) (2=-1) INTO DEFENSE WELFAREHEALTH. 
]]> Derivation Description Description A textual description of the way in which this variable was derived. The element may be repeated to support multiple language expressions of the content. Example VAR215.01 "Outcome of first pregnancy" (1988 NSFG=VAR611 PREGOUT1) If R has never been pregnant (VAR203 PREGNUM EQ 0) then OUTCOM01 is blank/inapplicable. Else, OUTCOM01 is transferred from VAR225 OUTCOME for R's 1st pregnancy. ]]> East Bounding Longitude Description The easternmost coordinate delimiting the geographic extent of the dataset. A valid range of values, expressed in decimal degrees (positive east and positive north), is: -180,0 <= East Bounding Longitude Value <= 180,0 Embargo Description Provides information on variables/nCubes which are not currently available because of policies established by the principal investigators and/or data producers. The ISO standard for dates (YYYY-MM-DD) is recommended for use with the "date" attribute. An "event" attribute is provided to specify "notBefore" or "notAfter" ("notBefore" is the default). A "format" attribute is provided to ensure that this information will be machine-processable, and specifies a format for the embargo element. The "format" attribute could be used to specify other conventions for the way that information within the embargo element is set out, if conventions for encoding embargo information were established in the future. This element may be repeated to support multiple language expressions of the content. Example The data associated with this variable/nCube will not become available until September 30, 2001, because of embargo provisions established by the data producers. ]]> Table Entry Estimates of Sampling Error Description Measure of how precisely one can estimate a population value from a given sample. 
Example To assist NES analysts, the PC SUDAAN program was used to compute sampling errors for a wide-ranging example set of proportions estimated from the 1996 NES Pre-election Survey dataset. For each estimate, sampling errors were computed for the total sample and for twenty demographic and political affiliation subclasses of the 1996 NES Pre-election Survey sample. The results of these sampling error computations were then summarized and translated into the general usage sampling error table provided in Table 11. The mean value of deft, the square root of the design effect, was found to be 1.346. The design effect was primarily due to weighting effects (Kish, 1965) and did not vary significantly by subclass size. Therefore the generalized variance table is produced by multiplying the simple random sampling standard error for each proportion and sample size by the average deft for the set of sampling error computations. ]]> Contents of Files Description Abstract or description of the file. A summary describing the purpose, nature, and scope of the data file, special characteristics of its contents, major subject areas covered, and what questions the PIs attempted to answer when they created the file. A listing of major variables in the file is important here. In the case of multi-file collections, this uniquely describes the contents of each file. Example Part 1 contains both edited and constructed variables describing demographic and family relationships, income, disability, employment, health insurance status, and utilization data for all of 1987. ]]> Data Files Description Description Information about the data file(s) that comprises a collection. This section can be repeated for collections with multiple files. The "URI" attribute may be a URN or a URL that can be used to retrieve the file. 
The "sdatrefs" are summary data description references that record the ID values of all elements within the summary data description section of the Study Description that might apply to the file. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data. The "methrefs" are methodology and processing references that record the ID values of all elements within the study methodology and processing section of the Study Description that might apply to the file. These elements include information on data collection and data appraisal (e.g., sampling, sources, weighting, data cleaning, response rates, and sampling error estimates). The "pubrefs" attribute provides a link to publication/citation references and records the ID values of all citations elements within Other Study Description Materials or Other Study-Related Materials that pertain to this file. "Access" records the ID values of all elements in the Data Access section that describe access conditions for this file. Remarks: When a codebook documents two different physical instantiations of a data file, e.g., logical record length (or OSIRIS) and card-image version, the Data File Description should be repeated to describe the two separate files. An ID should be assigned to each file so that in the Variable section the location of each variable on the two files can be distinguished using the unique file IDs. Example ]]> ]]> File Name Description Contains a short title that will be used to distinguish a particular file/part from other files/parts in the data collection. The element may be repeated to support multiple language expressions of the content. Example Second-Generation Children Data ]]> Place of File Production Description Indicates whether the file was produced at an archive or produced elsewhere. 
Example Washington, DC: United States Department of Commerce, Bureau of the Census ]]> Number of Files Description Total number of physical files associated with a collection. Example 5 files ]]> File Structure Description Type of file structure. The attribute "type" is used to indicate hierarchical, rectangular, relational, or nested (the default is rectangular). If the file is rectangular, the next relevant element is File Dimensions. If the "other" value is used for the type attribute, then the otherType attribute should have a value specifying the other type. The otherType attribute should only be used when applying a controlled vocabulary to this attribute. Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. The fileStrcRef attribute allows for multiple data files with different coverage but the same file structure to share a single fileStrc. The file structure is fully described in the first fileTxt within the fileDscr and then the fileStrc in subsequent fileTxt descriptions would reference the first fileStrcRef rather than repeat the details. File-by-File Description Description Provides descriptive information about the data file. A file name and a full bibliographic citation for the file may be entered, as well as a data fingerprint, if available. Information about the physical properties of the data file is also supported. Make sure to fill out topcClass for the study as these can be used by the data file. Note coverage constraints in fileCont. File Citation Description The complex element fileCitation provides for a full bibliographic citation option for each data file described in fileDscr. To support accurate citation of a data file the minimum element set includes: titl, IDNo, authEnty, producer, and prodDate. If a DOI is available for the data file enter this in the IDNo (this element is repeatable).
If a hash value (digital fingerprint) has been created for the data file enter the information regarding its value and algorithm specification in digitalFingerprint. Example ABC News/Washington Post Monthly Poll, December 2010 http://dx.doi.org/10.3886/ICPSR32547.v1 ABC News The Washington Post ABC News 2011 ]]> Type of File Description Types of data files include raw data (ASCII, EBCDIC, etc.) and software-dependent files such as SAS datasets, SPSS export files, etc. If the data are of mixed types (e.g., ASCII and packed decimal), state that here. Note that the element varFormat permits specification of the data format at the variable level. The "charset" attribute allows one to specify the character set used in the file, e.g., US-ASCII, EBCDIC, UNICODE UTF-8, etc. The element may be repeated to support multiple language expressions of the content. Example ASCII data file ]]> Data Format Description Physical format of the data file: Logical record length format, card-image format (i.e., data with multiple records per case), delimited format, free format, etc. The element may be repeated to support multiple language expressions of the content. Example comma-delimited ]]> Forward Progression Description Contains a reference to IDs of possible following questions. The "qstn" IDREFS may be used to specify the question IDs. Example If yes, please ask questions 120-124. ]]> Frequency of Data Collection Description For data collected at more than one point in time, the frequency with which the data were collected. The "freq" attribute is included to permit the development of a controlled vocabulary for this element. Example monthly ]]> quarterly ]]> Funding Agency/Sponsor Description The source(s) of funds for production of the work. If different funding agencies sponsored different stages of the production process, use the "role" attribute to distinguish them. 
Example National Science Foundation ]]> Sun Microsystems ]]> Geographic Bounding Box Description The fundamental geometric description for any dataset that models geography. GeoBndBox is the minimum box, defined by west and east longitudes and north and south latitudes, that includes the largest geographic extent of the dataset's geographic coverage. This element is used in the first pass of a coordinate-based search. If the boundPoly element is included, then the geoBndBox element MUST be included. Example Nevada State ]]> -120.005729004 -114.039663 35.00208499998 42.002207 ]]> Norway ]]> 4.789583 33.637497 57.987915 80.76416 ]]> Geographic Map Description This element is used to point, using a "URI" attribute, to an external map that displays the geography in question. The "levelno" attribute indicates the level of the geographic hierarchy relayed in the map. The "mapformat" attribute indicates the format of the map. Geographic Coverage Description Information on the geographic coverage of the data. Includes the total geographic scope of the data, and any additional levels of geographic coding provided in the variables. Maps to Dublin Core Coverage element. Inclusion of this element in the codebook is recommended. For forward-compatibility, DDI 3 XHTML tags may be used in this element. Example State of California ]]> Geographic Unit Description Lowest level of geographic aggregation covered by the data. Example state ]]> Grant Number Description The grant/contract number of the project that sponsored the effort. If more than one, indicate the appropriate agency using the "agency" attribute. If different funding agencies sponsored different stages of the production process, use the "role" attribute to distinguish the grant numbers. Example J-LEAA-018-77 ]]> G-Ring Latitude Description Latitude (y coordinate) of a point.
Valid range expressed in decimal degrees is as follows: -90,0 to 90,0 degrees (latitude) G-Ring Longitude Description Longitude (x coordinate) of a point. Valid range expressed in decimal degrees is as follows: -180,0 to 180,0 degrees (longitude) Guide to Codebook Description List of terms and definitions used in the documentation. Provided to assist users in using the document correctly. This element was intended to reflect the section in OSIRIS codebooks that assisted users in reading and interpreting a codebook. Each OSIRIS codebook contained a sample codebook page that defined the codebook conventions. The element may be repeated to support multiple language expressions of the content. Holdings Information Description Information concerning either the physical or electronic holdings of the cited work. Attributes include: location--The physical location where a copy is held; callno--The call number for a work at the location specified; and URI--A URN or URL for accessing the electronic copy of the cited work. Example Marked-up Codebook for Current Population Survey, 1999: Annual Demographic File ]]> Codebook for Current Population Survey, 1999: Annual Demographic File ]]> Identification Number Description Unique string or number (producer's or archive's number). An "agency" attribute is supplied. Identification Number of data collection maps to Dublin Core Identifier element. Example 6678 ]]> 2010 ]]> Imputation Description According to the Statistical Terminology glossary maintained by the National Science Foundation, this is "the process by which one estimates missing values for items that a survey respondent failed to provide," and if applicable in this context, it refers to the type of procedure used. When applied to an nCube, imputation takes into consideration all of the dimensions that are part of that nCube. This element may be repeated to support multiple language expressions of the content. 
Example This variable contains values that were derived by substitution. ]]> Range of Invalid Data Values Description Values for a particular variable that represent missing data, not applicable responses, etc. Example 98 DK 99 Inappropriate ]]> Value Item Description The counterpart to Range; used to encode individual values. This is an empty element consisting only of its attributes. The "UNITS" attribute permits the specification of integer/real numbers. The "VALUE" attribute specifies the actual value. Example ]]> Interviewer Instructions Description Specific instructions to the individual conducting an interview. Example Please prompt the respondent if they are reticent to answer this question. ]]> Range Key Description This element permits a listing of the category values and labels. While this information is coded separately in the Category element, there may be some value in having this information in proximity to the range of valid and invalid values. A table is permissible in this element. Example 05 (PSU) Parti Socialiste Unifie et extreme gauche (Lutte Ouvriere) [United Socialists and extreme left (Workers Struggle)] 50 Les Verts [Green Party] 80 (FN) Front National et extreme droite [National Front and extreme right] ]]> Keywords Description Words or phrases that describe salient aspects of a data collection's content. Can be used for building keyword indexes and for classification and retrieval purposes. A controlled vocabulary can be employed. Maps to Dublin Core Subject element. The "vocab" attribute is provided for specification of the controlled vocabulary in use, e.g., LCSH, MeSH, etc. The "vocabURI" attribute specifies the location for the full controlled vocabulary. Example quality of life ]]> family ]]> career goals ]]> Label Description A short description of the parent element. 
In the variable label, the length of this phrase may depend on the statistical analysis system used (e.g., some versions of SAS permit 40-character labels, while some versions of SPSS permit 120 characters), although the DDI itself imposes no restrictions on the number of characters allowed. A "level" attribute is included to permit coding of the level to which the label applies, i.e. record group, variable group, variable, category group, category, nCube group, nCube, or other study-related materials. The "vendor" attribute was provided to allow for specification of different labels for use with different vendors' software. The attribute "country" allows for the denotation of country-specific labels. The "sdatrefs" attribute records the ID values of all elements within the Summary Data Description section of the Study Description that might apply to the label. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data. Location Map Description This element maps individual data entries to one or more physical storage locations. It is used to describe the physical location of aggregate/tabular data in cases where the nCube model is employed. Location Description This is an empty element containing only the attributes listed below. Attributes include "StartPos" (starting position of variable), "EndPos" (ending position of variable), "width" (number of columns the variable occupies), "RecSegNo" (the record segment number, deck or card number the variable is located on), and "fileid", an IDREF link to the fileDscr element for the file that this location is within (this is necessary in cases where the same variable may be coded in two different files, e.g., a logical record length type file and a card image type file). 
Note that if there is no width or ending position, then the starting position should be the ordinal position in the file, and the file would be described as free-format. The attribute "locMap" is an IDREF to the element locMap and serves as a pointer to indicate that the location information for the nCube's cells (aggregate data) is located in that section. Example ]]> ]]> Logical Record Length Description Logical record length, i.e., number of characters of data in the record. Example 27 ]]> Measure Description The element measure indicates the measurement features of the cell content: type of aggregation used, measurement unit, and measurement scale. An origin point is recorded for anchored scales, to be used in determining relative movement along the scale. Additivity indicates whether an aggregate is a stock (like the population at a given point in time) or a flow (like the number of births or deaths over a certain period of time). The non-additive flag is to be used for measures that for logical reasons cannot be aggregated to a higher level - for instance, data that only make sense at a certain level of aggregation, like a classification. Two nCubes may be identical except for their measure - for example, a count of persons by age and percent of persons by age. Measure is an empty element that includes the following attributes: "varRef" is an IDREF; "aggrMeth" indicates the type of aggregation method used, for example 'sum', 'average', 'count'; "measUnit" records the measurement unit, for example 'km', 'miles', etc.; "scale" records unit of scale, for example 'x1', 'x1000'; "origin" records the point of origin for anchored scales; "additivity" records type of additivity such as 'stock', 'flow', 'non-additive'. If a value of "other" is used for the aggrMeth attribute, a term from a controlled vocabulary should be placed in the "otherAggrMeth" attribute, and the complex element controlledVocabUsed should be used to specify the controlled vocabulary.
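The relationship between the Location attributes "StartPos", "EndPos", and "width" described above can be sketched in Python. This is a minimal illustration, not part of the specification: the function name resolve_location is hypothetical, and it assumes column positions are 1-based and inclusive, so that width = EndPos - StartPos + 1. When neither an ending position nor a width is supplied, the starting position is treated as the ordinal position in a free-format file, as the note above states.

```python
def resolve_location(start_pos, end_pos=None, width=None):
    """Return (start, end, width) for a variable location.

    Assumption: positions are 1-based and inclusive, so
    width == end_pos - start_pos + 1.
    Returns (start_pos, None, None) for a free-format file, where
    start_pos is the variable's ordinal position.
    """
    if end_pos is None and width is None:
        # Free-format: only the ordinal position is known.
        return (start_pos, None, None)
    if end_pos is None:
        end_pos = start_pos + width - 1
    elif width is None:
        width = end_pos - start_pos + 1
    elif end_pos - start_pos + 1 != width:
        raise ValueError("StartPos, EndPos, and width are inconsistent")
    return (start_pos, end_pos, width)


# Hypothetical variable occupying columns 55 through 57.
print(resolve_location(55, width=3))
print(resolve_location(55, end_pos=57))
print(resolve_location(4))
```

Raising an error when all three values are given but disagree is a design choice for this sketch; a codebook validator might prefer to report the mismatch and continue.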
Methodology and Processing Description This section describes the methodology and processing involved in a data collection. Coding Instructions Description Describe specific coding instructions used in data processing, cleaning, assessment, or tabulation. Element relatedProcesses allows linking a coding instruction to one or more processes such as dataProcessing, dataAppr, cleanOps, etc. Use the txt element to describe instructions in a human readable form. Example recode undocumented/wild codes to missing, i.e., 0. RECODE V1 TO V100 (10 THROUGH HIGH = 0) ]]> Command Description Provide command code for the coding instruction. The formalLanguage attribute identifies the language of the command code. Example RECODE V1 TO V100 (10 THROUGH HIGH = 0) ]]> Data Processing Description Describes various data processing procedures not captured elsewhere in the documentation, such as topcoding, recoding, suppression, tabulation, etc. The "type" attribute supports better classification of this activity, including the optional use of a controlled vocabulary. Example The income variables in this study (RESP_INC, HHD_INC, and SS_INC) were topcoded to protect confidentiality. ]]> Mathematical Identifier Description Token element containing the smallest unit in the mrow that carries meaning. Mathematical Row Description This element is a wrapper containing the presentation expression mi. It creates a single string without spaces consisting of the individual elements described within it. It can be used to create a single variable by concatenating other variables into a single string. It is used to create linking variables composed of multiple non-contiguous parts, or to define unique strings for various category values of a single variable. nCube Description Describes the logical structure of an n-dimensional array, in which each coordinate intersects with every other dimension at a single point. The nCube has been designed for use in the markup of aggregate data.
Repetition of the following elements is provided to support multi-language content: anlysUnit, embargo, imputation, purpose, respUnit, and security. This element includes the following attributes: The attribute "name" includes a short label for the nCube. Following the rules of many statistical analysis systems such as SAS and SPSS, names are usually up to eight characters long. The "sdatrefs" are summary data description references which record the ID values of all elements within the summary data description section of the Study Description which might apply to the nCube. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data. The "methrefs" are methodology and processing references which record the ID values of all elements within the study methodology and processing section of the Study Description which might apply to the nCube. These elements include information on data collection and data appraisal (e.g., sampling, sources, weighting, data cleaning, response rates, and sampling error estimates). The "pubrefs" attribute provides a link to publication/citation references and records the ID values of all citations elements in Other Study Description Materials or Other Study-Related Materials that pertain to this nCube. The "access" attribute records the ID values of all elements in the Data Access section that describe access conditions for this nCube. The "dmnsQnty" attribute notes the number of dimensions in the nCube. The "cellQnty" attribute indicates the total number of cells in the nCube. nCube Group Description A group of nCubes that may share a common subject, arise from the interpretation of a single question, or are linked by some other factor. This element makes it possible to identify all nCubes derived from a simple presentation table, and to provide the original table title and universe, as well as reference the source. 
Specific nesting patterns can be described using the attribute nCubeGrp. nCube groups are also created this way in order to permit nCubes to belong to multiple groups, including multiple subject groups, without causing overlapping groups. nCubes that are linked by use of the same variable need not be identified by an nCubeGrp element because they are already linked by a common variable element. Note that as a result of the strict sequencing required by XML, all nCube Groups must be marked up before the Variable element is opened. That is, the mark-up author cannot mark up an nCube Group, then mark up its constituent nCubes, then mark up another nCube Group. The "type" attribute refers to the general type of grouping of the nCubes. Specific nCube Groups, included within the 'type' attribute, are: Display: nCubes that are part of the same presentation table. Subject: nCubes that address a common topic or subject, e.g., income, poverty, children. Iteration: nCubes that appear in different sections of the data file measuring a common subject in different ways, e.g., using different universes, units of measurement, etc. Pragmatic: An nCube group without shared properties. Record: nCubes from a single record in a hierarchical file. File: nCubes from a single file in a multifile study. Other: nCubes that do not fit easily into any of the categories listed above, e.g., a group of nCubes whose documentation is in another language. A term from a controlled vocabulary may be placed into the otherType attribute if this value is used. The otherType attribute should only be used when applying a controlled vocabulary, and when the type attribute has been given a value of "other". Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. The "nCube" attribute is used to reference all the IDs of the nCubes belonging to the group.
The "nCubeGrp" attribute is used to reference all the subsidiary nCube groups which nest underneath the current nCubeGrp. This allows for encoding of a hierarchical structure of nCube groups. The attribute "name" provides a name, or short label, for the group. The "sdatrefs" are summary data description references that record the ID values of all elements within the summary data description section of the Study Description that might apply to the group. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data. The "methrefs" are methodology and processing references which record the ID values of all elements within the study methodology and processing section of the Study Description which might apply to the group. These elements include information on data collection and data appraisal (e.g., sampling, sources, weighting, data cleaning, response rates, and sampling error estimates). The "pubrefs" attribute provides a link to publication/citation references and records the ID values of all citations elements within Section codeBook/stdyDscr/othrStdyMat or codeBook/otherMat that pertain to this nCube group. The "access" attribute records the ID values of all elements in codeBook/stdyDscr/dataAccs of the document that describe access conditions for this nCube group. Country Description Indicates the country or countries covered in the file. Attribute "abbr" may be used to list common abbreviations; use of ISO country codes is recommended. Maps to Dublin Core Coverage element. Inclusion of this element is recommended. For forward-compatibility, DDI 3 XHTML tags may be used in this element. Example United Kingdom ]]> North Bounding Latitude Description The northernmost coordinate delimiting the geographic extent of the dataset. 
A valid range of values, expressed in decimal degrees (positive east and positive north), is: -90,0 <= North Bounding Latitude Value <= 90,0 ; North Bounding Latitude Value >= South Bounding Latitude Value Notes and comments Description For clarifying information/annotation regarding the parent element. The attributes for notes permit a controlled vocabulary to be developed ("type" and "subject"), indicate the "level" of the DDI to which the note applies (study, file, variable, etc.), and identify the author of the note ("resp"). The parent attribute is used to support capturing information obtained while preparing files for translation to DDI 3. It provides the ID(s) of the element this note is related to. The sameNote attribute is used to support capturing information obtained while preparing files for translation to DDI 3. If the same note is used multiple times, all the parent IDs can be captured in a single note, and all duplicate notes can point to that note through the sameNote attribute. Example Additional information on derived variables has been added to this marked-up version of the documentation. ]]> This citation was prepared by the archive based on information received from the markup authors. ]]> The source codebook was produced from original hardcopy materials using Optical Character Recognition (OCR). ]]> A machine-readable version of the source codebook was supplied by the Zentralarchiv ]]> This Document Description, or header information, can be used within an electronic resource discovery environment. ]]> Data for 1998 have been added to this version of the data collection. ]]> This citation was sent to ICPSR by the agency depositing the data. ]]> Data on employment and income refer to the preceding year, although demographic data refer to the time of the survey. ]]> Undocumented codes were found in this data collection. Missing data are represented by blanks.
]]> For this collection, which focuses on employment, unemployment, and gender equality, data from EUROBAROMETER 44.3: HEALTH CARE ISSUES AND PUBLIC SECURITY, FEBRUARY-APRIL 1996 (ICPSR 6752) were merged with an oversample. ]]> Data from the Bureau of Labor Statistics used in the analyses for the final report are not provided as part of this collection. ]]> Users should note that this is a beta version of the data. The investigators therefore request that users who encounter any problems with the dataset contact them at the above address. ]]> The number of arrest records for an individual is dependent on the number of arrests an offender had. ]]> Data for all previously-embargoed variables are now available in this version of the file. ]]> There is a restricted version of this file containing confidential information, access to which is controlled by the principal investigator. ]]> This variable group was created for the purpose of combining all derived variables. ]]> This variable group and all other variable groups in this data file were organized according to a schema developed by the adhoc advisory committee. ]]> This nCube Group was created for the purpose of presenting a cross-tabulation between variables "Tenure" and "Age of householder." ]]> Starting with Euro-Barometer 2 the coding of this variable has been standardized following an approximate ordering of each country's political parties along a "left" to "right" continuum in the first digit of the codes. Parties coded 01-39 are generally considered on the "left", those coded 40-49 in the "center", and those coded 60-89 on the "right" of the political spectrum. Parties coded 50-59 cannot be readily located in the traditional meaning of "left" and "right". The second digit of the codes is not significant to the "left-right" ordering. Codes 90-99 contain the response "other party" and various missing data responses. 
Users may modify these codings or part of these codings in order to suit their specific needs. ]]> Codes 90-99 contain the response "other party" and various missing data responses. ]]> The labels for categories 01 and 02 for this variable were inadvertently switched in the first version of this variable and have now been corrected. ]]> This variable was created by recoding location of residence to Census regions. ]]> The labels for categories 01 and 02 in dimension 1 were inadvertently switched in the first version of the cube, and have now been corrected. ]]> This nCube was created to meet the needs of local low income programs in determining eligibility for federal funds. ]]> The variables in this study are identical to earlier waves. ]]> Users should be aware that this questionnaire was modified during the CAI process. ]]> Archive Where Study Originally Stored Description Archive from which the data collection was obtained; the originating archive. Example Zentralarchiv fuer empirische Sozialforschung ]]> Other Identifications/Acknowledgments Description Statements of responsibility not recorded in the title and statement of responsibility areas. Indicate here the persons or bodies connected with the work, or significant persons or bodies connected with previous editions and not already named in the description. For example, the name of the person who edited the marked-up documentation might be cited in codeBook/docDscr/rspStmt/othId, using the "role" and "affiliation" attributes. Other identifications/acknowledgments for data collection (codeBook/stdyDscr/citation/rspStmt/othId) maps to Dublin Core Contributor element. Example Jane Smith ]]> Other References Notes Description Indicates other pertinent references. Can take the form of bibliographic citations. Example Part II of the documentation, the Field Representative's Manual, is provided in hardcopy form only.
]]> Other Study-Related Materials Description This section allows for the inclusion of other materials that are related to the study as identified and labeled by the DTD/Schema users (encoders). The materials may be entered as PCDATA (ASCII text) directly into the document (through use of the "txt" element). This section may also serve as a "container" for other electronic materials such as setup files by providing a brief description of the study-related materials accompanied by the attributes "type" and "level" defining the material further. The "URI" attribute may be used to indicate the location of the other study-related materials. Other Study-Related Materials may include: questionnaires, coding notes, SPSS/SAS/Stata setup files (and others), user manuals, continuity guides, sample computer software programs, glossaries of terms, interviewer/project instructions, maps, database schema, data dictionaries, show cards, coding information, interview schedules, missing values information, frequency files, variable maps, etc. The "level" attribute is used to clarify the relationship of the other materials to components of the study. Suggested values for level include specifications of the item level to which the element applies: e.g., level=data; level=datafile; level=studydsc; level=study. The URI attribute need not be used in every case; it is intended for capturing references to other materials separate from the codebook itself. In Section 5, Other Material is recursively defined. Other Study Description Materials Description Other materials relating to the study description. This section describes other materials that are related to the study description that are primarily descriptions of the content and use of the study, such as appendices, sampling information, weighting details, methodological and technical details, publications based upon the study content, related studies or collections of studies, etc.
This section may point to other materials related to the description of the study through use of the generic citation element, which is available for each element in this section. This maps to Dublin Core Relation element. Note that codeBook/otherMat (Other Study-Related Materials), should be used for materials used in the production of the study or useful in the analysis of the study. The materials in codeBook/otherMat may be entered as PCDATA (ASCII text) directly into the document (through use of the txt element). That section may also serve as a "container" for other electronic materials by providing a brief description of the study-related materials accompanied by the "type" and "level" attributes further defining the materials. Other Study-Related Materials in codeBook/otherMat may include: questionnaires, coding notes, SPSS/SAS/Stata setup files (and others), user manuals, continuity guides, sample computer software programs, glossaries of terms, interviewer/project instructions, maps, database schema, data dictionaries, show cards, coding information, interview schedules, missing values information, frequency files, variable maps, etc. Parallel Title Description Title translated into another language. Example Politbarometer West [Germany], Partial Accumulation, 1977-1995 ]]> Politbarometer, 1977-1995: Partielle Kumulation ]]> Description This is an empty element containing only the attributes listed below. 
Attributes include "type" (type of file structure: rectangular, hierarchical, two-dimensional, relational), "recRef" (IDREF link to the appropriate file or recGrp element within a file), "startPos" (starting position of variable or data item), "endPos" (ending position of variable or data item), "width" (number of columns the variable/data item occupies), "RecSegNo" (the record segment number, deck or card number the variable or data item is located on), and "fileid" (an IDREF link to the fileDscr element for the file that includes this physical location). Remarks: Where the same variable is coded in two different files, e.g., a fixed format file and a relational database file, simply repeat the physLoc element with the alternative location information. Note that if there is no width or ending position, then the starting position should be the ordinal position in the file, and the file would be described as free-format. New attributes will be added as other storage formats are described within the DDI. Example ]]> ]]> Point Description 0-dimensional geometric primitive, representing a position, but not having extent. In this declaration, point is limited to a longitude/latitude coordinate system. Polygon Description The minimum polygon that covers a geographical area, and is delimited by at least 4 points (3 sides), in which the last point coincides with the first point. PostQuestion Text Description Text describing what occurs after the literal question has been asked. Example The next set of questions will ask about your financial situation. ]]> PreQuestion Text Description Text describing a set of conditions under which a question might be asked. Example For those who did not go away on a holiday of four days or more in 1985... ]]> Processing Status Description Processing status of the file. Some data producers and social science data archives employ data processing strategies that provide for release of data and documentation at various stages of processing. 
Example Available from the DDA. Being processed. ]]> The principal investigator notes that the data in Public Use Tape 5 are released prior to final cleaning and editing, in order to provide prompt access to the NMES data by the research and policy community. ]]> Date of Production Description Date when the marked-up document/marked-up document source/data collection/other material(s) were produced (not distributed or archived). The ISO standard for dates (YYYY-MM-DD) is recommended for use with the date attribute. Production date for data collection (codeBook/stdyDscr/citation/prodStmt/prodDate) maps to Dublin Core Date element. Example January 25, 1999 ]]> Place of Production Description Address of the archive or organization that produced the work. Example Ann Arbor, MI: Inter-university Consortium for Political and Social Research ]]> Production Statement Description Production statement for the work at the appropriate level: marked-up document; marked-up document source; study; study description, other material; other material for study. Producer Description The producer is the person or organization with the financial or administrative responsibility for the physical processes whereby the document was brought into existence. Use the "role" attribute to distinguish different stages of involvement in the production process, such as original producer. Producer of data collection (codeBook/stdyDscr/citation/prodStmt/producer) maps to Dublin Core Publisher element. The "producer" in the Document Description should be the agency or person that prepared the marked-up document. Example Inter-university Consortium for Political and Social Research ]]> Star Tribune Minnesota Poll ]]> Machine Readable Data Center ]]> Description Explains the purpose for which a particular nCube was created. Example Meets reporting requirements for the Federal Reserve Board ]]> Question Description The question element may have mixed content. 
The element itself may contain text for the question, with the subelements being used to provide further information about the question. Alternatively, the question element may be empty and only the subelements used. The element has a unique question ID attribute which can be used to link a variable with other variables where the same question has been asked. This would allow searching for all variables that share the same question ID, perhaps because the question was asked several times in a panel design. The "ID" attribute contains a unique identifier for the question. "Var" references the ID(s) of the variable(s) relating to the question. The attribute "seqNo" refers to the sequence number of the question. The attribute "sdatrefs" may be used to reference elements in the summary data description section of the Study Description which might apply to this question. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data. The responseDomainType attribute was added to capture the specific DDI 3 response domain type to facilitate translation between DDI 2 and DDI 3. If this is given a value of "other" then a term from a controlled vocabulary should be put into the "otherResponseDomainType" attribute. Example When you get together with your friends, would you say you discuss political matters frequently, occasionally, or never? ]]> Literal Question Description Text of the actual, literal question asked. Example Why didn't you go away in 1985? ]]> Value Range Description This is the actual range of values. The "UNITS" attribute permits the specification of integer/real numbers. The "min" and "max" attributes specify the lowest and highest values that are part of the range. The "minExclusive" and "maxExclusive" attributes specify values that are immediately outside the range. This is an empty element consisting only of its attributes.
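The difference between the inclusive and exclusive bounds of a value range can be made concrete with a small Python sketch. It is an illustration under stated assumptions rather than DDI tooling: the function name in_range, the dictionary representation of the attributes, and the conversion of every bound to float are hypothetical choices, and the two sample ranges are one reading of the conditions x < 1 and 10 <= x < 20.

```python
def in_range(x, attrs):
    """Test x against the bounds given in attrs.

    attrs maps the attribute names min, max, minExclusive and
    maxExclusive to numeric strings, as they would be read from XML.
    Missing attributes leave that side of the range unbounded.
    """
    if "min" in attrs and x < float(attrs["min"]):
        return False  # below the lowest value that is part of the range
    if "max" in attrs and x > float(attrs["max"]):
        return False  # above the highest value that is part of the range
    if "minExclusive" in attrs and x <= float(attrs["minExclusive"]):
        return False  # at or below a value just outside the range
    if "maxExclusive" in attrs and x >= float(attrs["maxExclusive"]):
        return False  # at or above a value just outside the range
    return True


# Assumed encodings of "x < 1" and "10 <= x < 20".
below_one = {"maxExclusive": "1"}
ten_to_twenty = {"min": "10", "maxExclusive": "20"}

for value in (0.5, 1, 10, 19.9, 20):
    print(value, in_range(value, below_one), in_range(value, ten_to_twenty))
```

Treating a missing attribute as "no bound on that side" is a design choice of the sketch; the description above does not say how an absent bound should be interpreted.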
Example For example, x < 1 or 10 <= x < 20 would be expressed as: ]]> ]]> Dimensions (of record) Description Information about the physical characteristics of the record. The "level" attribute on this element should be set to "record". Record or Record Group Description Used to describe record groupings if the file is hierarchical or relational. The attribute "recGrp" allows a record group to indicate subsidiary record groups which nest underneath; this allows for the encoding of a hierarchical structure of record groups. The attribute "rectype" indicates the type of record, e.g., "A records" or "Household records." The attribute "keyvar" is an IDREF that provides the link to other record types. In a hierarchical study consisting of individual and household records, the "keyvar" on the person record will indicate the household to which it belongs. The attribute "rtypeloc" indicates the starting column location of the record type indicator variable on each record of the data file. The attribute "rtypewidth" specifies the width, for files with many different record types. The attribute "rtypevtype" specifies the type of the indicator variable. The "recidvar" indicates the variable that identifies the record group. Example CPS 1999 Person-Level Record 133 1500 852 ]]> Overall Number of Records Description Overall record count in file. Particularly helpful in instances such as files with multiple cards/decks or records per case. Example 2400 ]]> Records per Case Description Records per case in the file. This element should be used for card-image data or other files in which there are multiple records per case. Example 5 ]]> Related Materials Description Describes materials related to the study description, such as appendices, additional information on sampling found in other documents, etc.
Can take the form of bibliographic citations. This element can contain either PCDATA or a citation or both, and there can be multiple occurrences of both the citation and PCDATA within a single element. May consist of a single URI or a series of URIs comprising a series of citations/references to external materials which can be objects as a whole (journal articles) or parts of objects (chapters or appendices in articles or documents). Example Full details on the research design and procedures, sampling methodology, content areas, and questionnaire design, as well as percentage distributions by respondent's sex, race, region, college plans, and drug use, appear in the annual ISR volumes MONITORING THE FUTURE: QUESTIONNAIRE RESPONSES FROM THE NATION'S HIGH SCHOOL SENIORS. ]]> Current Population Survey, March 1999: Technical Documentation includes an abstract, pertinent information about the file, a glossary, code lists, and a data dictionary. One copy accompanies each file order. When ordered separately, it is available from Marketing Services Office, Customer Service Center, Bureau of the Census, Washington, D.C. 20233. ]]> A more precise explanation regarding the CPS sample design is provided in Technical Paper 40, The Current Population Survey: Design and Methodology. Chapter 5 of this paper provides documentation on the weighting procedures for the CPS both with and without supplement questions. ]]> Related Publications Description Bibliographic and access information about articles and reports based on the data in this collection. Can take the form of bibliographic citations. Example Economic Behavior Program Staff. SURVEYS OF CONSUMER FINANCES. Annual volumes 1960 through 1970. Ann Arbor, MI: Institute for Social Research. ]]> Data from the March Current Population Survey are published most frequently in the Current Population Reports P-20 and P-60 series. These reports are available from the Superintendent of Documents, U.S.
Government Printing Office, Washington, DC 20402. They also are available on the INTERNET at http://www.census.gov. Forthcoming reports will be cited in Census and You, the Monthly Product Announcement (MPA), and the Bureau of the Census Catalog and Guide. ]]> Related Studies Description Information on the relationship of the current data collection to others (e.g., predecessors, successors, other waves or rounds) or to other editions of the same file. This would include the names of additional data collections generated from the same data collection vehicle plus other collections directed at the same general topic. Can take the form of bibliographic citations. Example ICPSR distributes a companion study to this collection titled FEMALE LABOR FORCE PARTICIPATION AND MARITAL INSTABILITY, 1980: [UNITED STATES] (ICPSR 9199). ]]> Type of Research Instrument Description The type of data collection instrument used. "Structured" indicates an instrument in which all respondents are asked the same questions/tests, possibly with precoded answers. If a small portion of such a questionnaire includes open-ended questions, provide appropriate comments. "Semi-structured" indicates that the research instrument contains mainly open-ended questions. "Unstructured" indicates that in-depth interviews were conducted. The "type" attribute is included to permit the development of a controlled vocabulary for this element. Example structured ]]> Response Rate Description The percentage of sample members who provided information. This may include a broader description of stratified response rates, information affecting response rates, etc. Example For 1993, the estimated inclusion rate for TEDS-eligible providers was 91 percent, with the inclusion rate for all treatment providers estimated at 76 percent (including privately and publicly funded providers).
]]> The overall response rate was 82%, although retail firms with an annual sales volume of more than $5,000,000 were somewhat less likely to respond. ]]> Response Unit Description Provides information regarding who provided the information contained within the variable/nCube, e.g., respondent, proxy, interviewer. This element may be repeated only to support multiple language expressions of the content. Example Head of household ]]> Head of household ]]> Restrictions Description Any restrictions on access to or use of the collection such as privacy certification or distribution restrictions should be indicated here. These can be restrictions applied by the author, producer, or disseminator of the data collection. If the data are restricted to only a certain class of user, specify which type. Example In preparing the data file(s) for this collection, the National Center for Health Statistics (NCHS) has removed direct identifiers and characteristics that might lead to identification of data subjects. As an additional precaution NCHS requires, under Section 308(d) of the Public Health Service Act (42 U.S.C. 242m), that data collected by NCHS not be used for any purpose other than statistical analysis and reporting. NCHS further requires that analysts not use the data to learn the identity of any persons or establishments and that the director of NCHS be notified if any identities are inadvertently discovered. ICPSR member institutions and other users ordering data from ICPSR are expected to adhere to these restrictions. ]]> ICPSR obtained these data from the World Bank under the terms of a contract which states that the data are for the sole use of ICPSR and may not be sold or provided to third parties outside of ICPSR membership. Individuals at institutions that are not members of the ICPSR may obtain these data directly from the World Bank. 
]]> Table Row Responsibility Statement Description Responsibility for the creation of the work at the appropriate level: marked-up document; marked-up document source; study; study description, other material; other material for study. Sampling Procedure Description The type of sample and sample design used to select the survey respondents to represent the population. May include reference to the target sample size and the sampling fraction. Example National multistage area probability sample ]]> Simple random sample ]]> Stratified random sample ]]> Quota sample ]]> The 8,450 women interviewed for the NSFG, Cycle IV, were drawn from households in which someone had been interviewed for the National Health Interview Survey (NHIS), between October 1985 and March 1987. ]]> Samples sufficient to produce approximately 2,000 families with completed interviews were drawn in each state. Families containing one or more Medicaid or uninsured persons were oversampled. XHTML content may be used for formatting. ]]> Security Description Provides information regarding levels of access, e.g., public, subscriber, need to know. The ISO standard for dates (YYYY-MM-DD) is recommended for use with the date attribute. Example This variable has been recoded for reasons of confidentiality. Users should contact the archive for information on obtaining access. ]]> Variable(s) within this nCube have been recoded for reasons of confidentiality. Users should contact the archive for information on obtaining access. ]]> Series Information Description Contains a history of the series and a summary of those features that apply to the series as a whole. Example The Current Population Survey (CPS) is a household sample survey conducted monthly by the Census Bureau to provide estimates of employment, unemployment, and other characteristics of the general labor force, estimates of the population as a whole, and estimates of various subgroups in the population. 
The entire non-institutionalized population of the United States is sampled to obtain the respondents for this survey series. ]]> Series Name Description The name of the series to which the work belongs. Example Current Population Survey Series ]]> Series Statement Description Series statement for the work at the appropriate level: marked-up document; marked-up document source; study; study description, other material; other material for study. The URI attribute is provided to point to a central Internet repository of series information. Repeat this field if the study is part of more than one series. Repetition of the internal content should be used to support multiple languages only. Data Set Availability Description Information on availability and storage of the collection. The "media" attribute may be used in combination with any of the subelements. See Location of Data Collection. Software used in Production Description Software used to produce the work. A "version" attribute permits specification of the software version number. The "date" attribute is provided to enable specification of the date (if any) for the software release. The ISO standard for dates (YYYY-MM-DD) is recommended for use with the date attribute. Example MRDC Codebook Authoring Tool ]]> Arbortext Adept Editor ]]> PageMaker ]]> SAS ]]> The SAS transport file was generated by the SAS CPORT procedure. ]]> Sources Statement Description Description of sources used for the data collection. The element is nestable so that the sources statement might encompass a series of discrete source statements, each of which could contain the facts about an individual source. This element maps to Dublin Core Source element. Source Citation Description This complex element allows the inclusion of a standard citation for the sources used in collecting and creating the dataset. Example Tenth Decennial Census of the United States, 1880. Volume I. 
Statistics of the Population of the United States at the Tenth Census. United States Census Bureau Government Printing Office 1883 ]]> South Bounding Latitude Description The southernmost coordinate delimiting the geographic extent of the dataset. A valid range of values, expressed in decimal degrees (positive east and positive north), is: -90.0 <= South Bounding Latitude Value <= 90.0; South Bounding Latitude Value <= North Bounding Latitude Value. Special Permissions Description This element is used to determine if any special permissions are required to access a resource. The "required" attribute is used to aid machine processing of this element, and the default specification is "yes". The "formNo" attribute indicates the number or ID of the form that the user must fill out. The "URI" attribute may be used to provide a URN or URL for online access to a special permissions form. Example The user must apply for special permission to use this dataset locally and must complete a confidentiality form. ]]> Characteristics of Source Noted Description Assessment of characteristics and quality of source material. May not be relevant to survey data. This element may be repeated to support multiple language expressions of the content. Documentation and Access to Sources Description Level of documentation of the original sources. May not be relevant to survey data. This element may be repeated to support multiple language expressions of the content. Origins of Sources Description For historical materials, information about the origin(s) of the sources and the rules followed in establishing the sources should be specified. May not be relevant to survey data. This element may be repeated to support multiple language expressions of the content. Standard Categories Description Standard category codes used in the variable, like industry codes, employment codes, or social class codes.
The attribute "date" is provided to indicate the version of the code in place at the time of the study. The attribute "URI" is provided to indicate a URN or URL that can be used to obtain an electronic list of the category codes. Example U. S. Census of Population and Housing, Classified Index of Industries and Occupations ]]> Class of the Study Description Generally used to give the data archive's class or study status number, which indicates the processing status of the study. May also be used as a text field to describe processing status. This element may be repeated to support multiple language expressions of the content. Example ICPSR Class II ]]> DDA Class C ]]> Available from the DDA. Being processed. ]]> Study Description Description The Study Description consists of information about the data collection, study, or compilation that the DDI-compliant documentation file describes. This section includes information about how the study should be cited, who collected or compiled the data, who distributes the data, keywords about the content of the data, summary (abstract) of the content of the data, data collection methods and processing, etc. Note that some content of the Study Description's Citation -- e.g., Responsibility Statement -- may be identical to that of the Documentation Citation. This is usually the case when the producer of a data collection also produced the print or electronic codebook for that data collection. Study Development Description Describe the process of study development as a series of development activities. These activities can be typed using a controlled vocabulary. Describe the activity, listing participants with their role and affiliation, resources used (sources of information), and the outcome of the development activity. Example This would allow you to provide inputs for a number of development activities you wanted to capture using separate entry screens and tagged storage of developmentActivity using the type attribute. 
For example, if there was an activity related to data availability, the developmentActivity might be as follows: A number of potential sources were evaluated for content, consistency and quality John Doe Study S Collected in 1970 using unknown sampling method Information incomplete missing X province Due to quality issues this was determined not to be a viable source of data for the study ]]> This generic structure would allow you to designate additional design activities, etc. Study Authorization Description Provides structured information on the agency that authorized the study, the date of authorization, and an authorization statement. Example Human Subjects Office Statement of authorization issued by OUHS on 2010-11-04 ]]> Authorizing Agency Description Name of the agent or agency that authorized the study. The "affiliation" attribute indicates the institutional affiliation of the authorizing agent or agency. The "abbr" attribute holds the abbreviation of the authorizing agent's or agency's name. Example Office for Use of Human Subjects ]]> Authorization Statement Description The text of the authorization. Use XHTML to capture significant structure in the document. Example Required documentation covering the study purpose, disclosure information, questionnaire content, and consent statements was delivered to the OUHS on 2010-10-01 and was reviewed by the compliance officer. Statement of authorization for the described study was issued on 2010-11-04 ]]> Study Scope Description This section contains information about the data collection's scope across several dimensions, including substantive content, geography, and time. Quality Statement Description This structure consists of two parts, standardsCompliance and otherQualityStatements. In standardsCompliance list all specific standards complied with during the execution of this study. Note the standard name and producer and how the study complied with the standard.
Enter any additional quality statements in otherQualityStatements. Standards Compliance Description This section lists all specific standards complied with during the execution of this study. Specify the standard(s)' name(s) and producer(s) and describe how the study complied with each standard in complianceDescription. Enter any additional quality statements in otherQualityStatement. Example Data Documentation Initiative DDI Alliance Study metadata was created in compliance with the Data Documentation Initiative (DDI) standard ]]> Standard Description Describes a standard with which the study complies. Standard Name Description Contains the name of the standard with which the study complies. The "date" attribute specifies the date when the standard was published, the "version" attribute includes the specific version of the standard with which the study is compliant, and the "URI" attribute includes the URI for the actual standard. Example Data Documentation Initiative ]]> Post Evaluation Procedures Description Use this section to describe evaluation procedures not addressed in data evaluation processes. These may include issues such as the timing of the study, sequencing issues, cost/budget issues, relevance, and institutional or legal arrangements. The completionDate attribute holds the date the evaluation was completed. The optional type attribute identifies the type of evaluation, with or without the use of a controlled vocabulary. Example United Nations Statistical Division In-depth review of pre-collection and collection procedures The following steps were highly effective in increasing response rates, and should be repeated in the next collection cycle... ]]> Evaluator Type Description The evaluator element identifies persons or organizations involved in the evaluation. The affiliation attribute contains the affiliation of the individual or organization. The abbr attribute holds an abbreviation for the individual or organization.
The role attribute indicates the role played by the individual or organization in the evaluation process. Example United Nations Statistical Division ]]> Evaluation Process Description Describes the evaluation process followed. Evaluation Outcomes Description Describe the outcomes of the evaluation. Example The following steps were highly effective in increasing response rates, and should be repeated in the next collection cycle... ]]> Study Budget Description Describe the budget of the project in as much detail as needed. Use XHTML structure elements to identify discrete pieces of information in a way that facilitates direct transfer of information on the study budget between DDI 2 and DDI 3 structures. Example The budget for the study covers a 5 year award period distributed between direct and indirect costs including: Staff, ... ]]> Subtitle Description A secondary title used to amplify or state certain limitations on the main title. It may repeat information already in the main title. Example Monitoring the Future: A Continuing Study of American Youth, 1995 ]]> A Continuing Study of American Youth, 1995 ]]> Census of Population, 1950 [United States]: Public Use Microdata Sample ]]> Public Use Microdata Sample ]]> Subject Information Description Subject information describing the data collection's intellectual content. Summary Data Description Description Information about the time period and geographic coverage of the study and the unit of analysis. Summary Statistics Description One or more statistical measures that describe the responses to a particular variable and may include one or more standard summaries, e.g., minimum and maximum values, median, mode, etc. The attribute "wgtd" indicates whether the statistics are weighted or not. The "weight" attribute is an IDREF(S) to the weight element(s) in the study description. The attribute "type" denotes the type of statistics being shown: mean, median, mode, valid cases, invalid cases, minimum, maximum, or standard deviation.
If a value of "other" is used here, a value taken from a controlled vocabulary should be put in the "otherType" attribute. This option should only be used when applying a controlled vocabulary to this attribute. Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. Example 0 ]]> 9 ]]> 4 ]]> Table Table Body Table Group Table Head Time Method Description The time method or time dimension of the data collection. The "method" attribute is included to permit the development of a controlled vocabulary for this element. For forward-compatibility, DDI 3 XHTML tags may be used in this element. Example panel survey ]]> cross-section ]]> trend study ]]> time-series ]]> Time Period Covered Description The time period to which the data refer. This item reflects the time period covered by the data, not the dates of coding or making documents machine-readable or the dates the data were collected. Also known as span. Use the event attribute to specify "start", "end", or "single" for each date entered. The ISO standard for dates (YYYY-MM-DD) is recommended for use with the "date" attribute. The "cycle" attribute permits specification of the relevant cycle, wave, or round of data. Maps to Dublin Core Coverage element. Inclusion of this element is recommended. Example May 1, 1998 ]]> May 31, 1998 ]]> Title Description Full authoritative title for the work at the appropriate level: marked-up document; marked-up document source; study; other material(s) related to study description; other material(s) related to study. The study title will in most cases be identical to the title for the marked-up document. A full title should indicate the geographic scope of the data collection as well as the time period covered. Title of data collection (codeBook/stdyDscr/citation/titlStmt/titl) maps to Dublin Core Title element. This element is required in the Study Description citation. 
Example Domestic Violence Experience in Omaha, Nebraska, 1986-1987 ]]> Census of Population, 1950 [United States]: Public Use Microdata Sample ]]> Monitoring the Future: A Continuing Study of American Youth, 1995 ]]> Title Statement Description Title statement for the work at the appropriate level: marked-up document; marked-up document source; study; study description, other materials; other materials for study. Topic Classification Description The classification field indicates the broad substantive topic(s) that the data cover. Library of Congress subject terms may be used here. The "vocab" attribute is provided for specification of the controlled vocabulary in use, e.g., LCSH, MeSH, etc. The "vocabURI" attribute specifies the location for the full controlled vocabulary. Maps to Dublin Core Subject element. Inclusion of this element in the codebook is recommended. Example Public opinion -- California -- Statistics ]]> Elections -- California ]]> Total Responses Description The number of responses to this variable. This element might be used if the number of responses does not match added case counts. It may also be used to sum the frequencies for variable categories. Example 1,056 ]]> There are only 725 responses to this question since it was not asked in Tanzania. ]]> Descriptive Text Description Lengthier description of the parent element. The attribute "level" indicates the level to which the element applies. The attribute "sdatrefs" allows pointing to specific dates, universes, or other information encoded in the study description. Example The following five variables refer to respondent attitudes toward national environmental policies: air pollution, urban sprawl, noise abatement, carbon dioxide emissions, and nuclear waste. ]]> The following four nCubes are grouped to present a cross tabulation of the variables Sex, Work experience in 1999, and Income in 1999. ]]> Total population for the agency for the year reported. 
]]> When the respondent indicated his political party preference, his response was coded on a scale of 1-99 with parties with a left-wing orientation coded on the low end of the scale and parties with a right-wing orientation coded on the high end of the scale. Categories 90-99 were reserved for miscellaneous responses. ]]> Inap., question not asked in Ireland, Northern Ireland, and Luxembourg. ]]> Detailed poverty status for age cohorts over a period of five years, to be used in determining program eligibility ]]> This is a PDF version of the original questionnaire provided by the principal investigator. ]]> Glossary of Terms. Below are terms that may prove useful in working with the technical documentation for this study. ]]> This is a PDF version of the original questionnaire provided by the principal investigator. ]]> List of Undocumented Codes Description Values whose meaning is unknown. Example Responses for categories 9 and 10 are unavailable. ]]> Universe Description The group of persons or other elements that are the object of research and to which any analytic results refer. Age, nationality, and residence commonly help to delineate a given universe, but any of a number of factors may be involved, such as sex, race, income, veteran status, criminal convictions, etc. The universe may consist of elements other than persons, such as housing units, court cases, deaths, countries, etc. In general, it should be possible to tell from the description of the universe whether a given individual or element (hypothetical or real) is a member of the population under study. A "level" attribute is included to permit coding of the level to which universe applies, i.e., the study level, the file level (if different from study), the record group, the variable group, the nCube group, the variable, or the nCube level. The "clusion" attribute provides for specification of groups included (I) in or excluded (E) from the universe.
If all the variables/nCubes described in the data documentation relate to the same population, e.g., the same set of survey respondents, this element would be unnecessary at data description level. In this case, universe can be fully described at the study level. For forward-compatibility, DDI 3 XHTML tags may be used in this element. This element may be repeated only to support multiple language expressions of the content. Example Individuals 15-19 years of age. ]]> Individuals younger than 15 and older than 19 years of age. ]]> Use Statement Description Information on terms of use for the data collection. This element may be repeated only to support multiple language expressions of the content. Range of Valid Data Values Description Values for a particular variable that represent legitimate responses. Example ]]> ]]> Variable Description This element describes all of the features of a single variable in a social science data file. The following elements are repeatable to support multi-language content: anlysUnit, embargo, imputation, respUnit, security, TotlResp. It includes the following attributes: The attribute "name" usually contains the so-called "short label" for the variable, limited to eight characters in many statistical analysis systems such as SAS or SPSS. The attribute "wgt" indicates whether the variable is a weight. The attribute "wgt-var" references the weight variable(s) for this variable. The attribute "qstn" is a reference to the question ID for the variable. The attribute "files" is the IDREF identifying the file(s) to which the variable belongs. The attribute "vendor" is the origin of the proprietary format and includes SAS, SPSS, ANSI, and ISO. The attribute "dcml" refers to the number of decimal points in the variable. The attribute "intrvl" indicates the interval type; options are discrete or continuous. The "rectype" attribute refers to the record type to which the variable belongs. 
The "sdatrefs" are summary data description references which record the ID values of all elements within the summary data description section of the Study Description which might apply to the variable. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data. The "methrefs" are methodology and processing references which record the ID values of all elements within the study methodology and processing section of the Study Description which might apply to the variable. These elements include information on data collection and data appraisal (e.g., sampling, sources, weighting, data cleaning, response rates, and sampling error estimates). The "pubrefs" attribute provides a link to publication/citation references and records the ID values of all citation elements within Other Study Description Materials or Other Study-Related Materials that pertain to this variable. The attribute "access" records the ID values of all elements in the Data Access section that describe access conditions for this variable. The "aggrMeth" attribute indicates the type of aggregation method used, for example 'sum', 'average', 'count'. If a value of "other" is given, a term from a controlled vocabulary should be used in the "otherAggrMeth" attribute. The "otherAggrMeth" attribute holds a value from a controlled vocabulary when the aggrMeth attribute has a value of "other". This option should only be used when applying a controlled vocabulary to this attribute. Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. The attribute "measUnit" records the measurement unit, for example 'km', 'miles', etc. The "scale" attribute records unit of scale, for example 'x1', 'x1000', etc. The attribute "origin" records the point of origin for anchored scales.
The "nature" attribute records the nature of the variable, whether it is 'nominal', 'ordinal', 'interval', 'ratio', or 'percent'. If the 'other' value is used, a value from a controlled vocabulary should be put into the otherNature attribute. The "otherNature" attribute should be used when the nature attribute has a value of "other". This option should only be used when applying a controlled vocabulary to this attribute. Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. The attribute "additivity" records type of additivity, such as 'stock', 'flow', 'non-additive'. When the "other" value is used, a value from a controlled vocabulary should be put into the "otherAdditivity" attribute. The "otherAdditivity" attribute is used only when the "additivity" attribute has a value of "other". This option should only be used when applying a controlled vocabulary to this attribute. Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. The attribute "temporal" indicates whether the variable relays time-related information. The "geog" attribute indicates whether the variable relays geographic information. The attribute "geoVocab" records the coding scheme used in the variable. The attribute "catQnty" records the number of categories found in the variable, and is used primarily for aggregate data files for verifying cell counts in nCubes. The "representationType" attribute was added to capture the specific DDI 3 representation type to facilitate translation between DDI 2 and DDI 3. If the "other" value is used, a term from a controlled vocabulary may be supplied in the otherRepresentationType attribute. The "otherRepresentationType" attribute should be used when the representationType attribute has a value of "other". This option should only be used when applying a controlled vocabulary to this attribute. 
Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. Variable Format Description The technical format of the variable in question. Attributes for this element include: "type," which indicates if the variable is character or numeric; "formatname," which in some cases may provide the name of the particular, proprietary format actually used; "schema," which identifies the vendor or standards body that defined the format (acceptable choices are SAS, SPSS, IBM, ANSI, ISO, XML-data or other); "category," which describes what kind of data the format represents, and includes date, time, currency, or "other" conceptual possibilities; and "URI," which supplies a network identifier for the format definition. If the "other" value is used for the schema attribute, a value from a controlled vocabulary must be used with the "otherSchema" attribute, and the complex element controlledVocabUsed should be used to identify the controlled vocabulary to which the selected term belongs. For the category attribute, a value from a controlled vocabulary may be provided if the "other" value is chosen. In this case, the term from the controlled vocabulary should be placed in the "othercategory" attribute, and the controlledVocabUsed element should also be filled in. Example The number in this variable is stored in the form 'ddmmmyy' in SAS format. ]]> 19541022 ]]> Variable Group Description A group of variables that may share a common subject, arise from the interpretation of a single question, or are linked by some other factor. Variable groups are created this way in order to permit variables to belong to multiple groups, including multiple subject groups such as a group of variables on sex and income, or to a subject and a multiple response group, without causing overlapping groups. 
Variables that are linked by use of the same question need not be identified by a Variable Group element because they are linked by a common unique question identifier in the Variable element. Note that as a result of the strict sequencing required by XML, all Variable Groups must be marked up before the Variable element is opened. That is, the mark-up author cannot mark up a Variable Group, then mark up its constituent variables, then mark up another Variable Group. The "type" attribute refers to the general type of grouping of the variables, e.g., subject, multiple response. Use the value of "other" if the value is to come from an external controlled vocabulary, and place the term into the otherType attribute. The "otherType" attribute is used when the "type" attribute has a value of "other". This option should only be used when applying a controlled vocabulary to this attribute. Use the complex element controlledVocabUsed to identify the controlled vocabulary to which the selected term belongs. Specific variable groups, included within the "type" attribute, are: Section: Questions which derive from the same section of the questionnaire, e.g., all variables located in Section C. Multiple response: Questions where the respondent has the opportunity to select more than one answer from a variety of choices, e.g., what newspapers have you read in the past month (with the respondent able to select up to five choices). Grid: Sub-questions of an introductory or main question but which do not constitute a multiple response group, e.g., I am going to read you some events in the news lately and you tell me for each one whether you are very interested in the event, fairly interested in the event, or not interested in the event. Display: Questions which appear on the same interview screen (CAI) together or are presented to the interviewer or respondent as a group.
Repetition: The same variable (or group of variables) which is repeated for different groups of respondents or for the same respondent at a different time. Subject: Questions which address a common topic or subject, e.g., income, poverty, children. Version: Variables, often appearing in pairs, which represent different aspects of the same question, e.g., pairs of variables (or groups) which are adjusted/unadjusted for inflation or season or whatever, pairs of variables with/without missing data imputed, and versions of the same basic question. Iteration: Questions that appear in different sections of the data file measuring a common subject in different ways, e.g., a set of variables which report the progression of respondent income over the life course. Analysis: Variables combined into the same index, e.g., the components of a calculation, such as the numerator and the denominator of an economic statistic. Pragmatic: A variable group without shared properties. Record: Variable from a single record in a hierarchical file. File: Variable from a single file in a multifile study. Randomized: Variables generated by CAI surveys produced by one or more random number variables together with a response variable, e.g., random variable X which could equal 1 or 2 (at random) which in turn would control whether Q.23 is worded "men" or "women", e.g., would you favor helping [men/women] laid off from a factory obtain training for a new job? Other: Variables which do not fit easily into any of the categories listed above, e.g., a group of variables whose documentation is in another language. The "var" attribute is used to reference all the constituent variable IDs in the group. The "varGrp" attribute is used to reference all the subsidiary variable groups which nest underneath the current varGrp. This allows for encoding of a hierarchical structure of variable groups. The attribute "name" provides a name, or short label, for the group.
The "sdatrefs" are summary data description references that record the ID values of all elements within the summary data description section of the Study Description that might apply to the group. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data. The "methrefs" are methodology and processing references which record the ID values of all elements within the study methodology and processing section of the Study Description which might apply to the group. These elements include information on data collection and data appraisal (e.g., sampling, sources, weighting, data cleaning, response rates, and sampling error estimates). The "pubrefs" attribute provides a link to publication/citation references and records the ID values of all citation elements within codeBook/stdyDscr/othrStdyMat or codeBook/otherMat that pertain to this variable group. The "access" attribute records the ID values of all elements in codeBook/stdyDscr/dataAccs of the document that describe access conditions for this variable group. The attribute "nCube" was included in 2.0 and subsequent versions in ERROR. DO NOT USE THIS ATTRIBUTE. It is retained only for purposes of backward-compatibility. Overall Variable Count Description Number of variables. Example 27 ]]> Version Responsibility Statement Description The organization or person responsible for the version of the work. Example Zentralarchiv fuer Empirische Sozialforschung ]]> Inter-university Consortium for Political and Social Research ]]> Zentralarchiv fuer Empirische Sozialforschung ]]> Zentralarchiv fuer Empirische Sozialforschung ]]> Version Statement Description Version statement for the work at the appropriate level: marked-up document; marked-up document source; study; study description, other material; other material for study. A version statement may also be included for a data file, a variable, or an nCube.
Example Second version ]]> Version Description Also known as release or edition. If there have been substantive changes in the data/documentation since their creation, this statement should be used at the appropriate level. The ISO standard for dates (YYYY-MM-DD) is recommended for use with the "date" attribute. Example Second ICPSR Edition ]]> Second version of V25 ]]> Second version of N25 ]]> Weighting Description The use of sampling procedures may make it necessary to apply weights to produce accurate statistical results. Describe here the criteria for using weights in analysis of a collection. If a weighting formula or coefficient was developed, provide this formula, define its elements, and indicate how the formula is applied to data. Example The 1996 NES dataset includes two final person-level analysis weights which incorporate sampling, nonresponse, and post-stratification factors. One weight (variable #4) is for longitudinal micro-level analysis using the 1996 NES Panel. The other weight (variable #3) is for analysis of the 1996 NES combined sample (Panel component cases plus Cross-section supplement cases). In addition, a Time Series Weight (variable #5) which corrects for Panel attrition was constructed. This weight should be used in analyses which compare the 1996 NES to earlier unweighted National Election Study data collections. ]]> West Bounding Longitude Description The westernmost coordinate delimiting the geographic extent of the dataset. A valid range of values, expressed in decimal degrees (positive east and positive north), is: -180.0 <= West Bounding Longitude Value <= 180.0