# Chapter 1: Introduction to Statistics

### Section 1.2: Data Classification

### Section Objectives:
##### Objective 1: How to distinguish between qualitative data and quantitative data
##### Objective 2: How to classify data with respect to the four levels of measurement

##### Objective 1: How to distinguish between qualitative data and quantitative data

##### Definitions: Types of Data

* **Qualitative data** is data that consists of attributes, labels, or nonnumerical entries (e.g. birthstone, eye color, pizza topping).
* **Quantitative data** is data that consists of numerical measurements or counts (e.g. age, temperature, weight).
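
One rough way to mirror this distinction in code is to check whether an entry is stored as a number. The sketch below is an illustration only: the helper name `data_type` is hypothetical, and it assumes that measurements and counts are stored as `int` or `float` and that labels are stored as `str`. A type check cannot detect numeric-looking labels (a ZIP code, for example, is still qualitative), so the result is a first guess rather than a rule.

```python
from numbers import Real

def data_type(entry):
    """Return 'quantitative' for numeric entries and 'qualitative' otherwise."""
    # bool is a subclass of int in Python, so it is excluded explicitly
    if isinstance(entry, Real) and not isinstance(entry, bool):
        return "quantitative"
    return "qualitative"

# Illustrative entries (values are made up for this sketch)
examples = {"eye color": "brown", "age": 27, "pizza topping": "mushroom", "weight": 68.5}
for name, entry in examples.items():
    print(f"{name}: {data_type(entry)}")
```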

##### Example:

The basic prices of several vehicles are shown in the table. Which data are qualitative data and which are quantitative data?

|Model|Suggested retail price|
|:-----:|:----------------------:|
|Focus Sedan|\$15,995|
|Fusion|\$19,270|
|Mustang|\$20,995|
|Edge|\$26,920|
|Flex|\$28,495|
|Escape Hybrid|\$32,260|
|Expedition|\$35,085|
|F-450|\$44,145|

##### Solution:

|Model (qualitative data)|Suggested retail price (quantitative data)|
|:-:|:-:|
|Focus Sedan|\$15,995|
|Fusion|\$19,270|
|Mustang|\$20,995|
|Edge|\$26,920|
|Flex|\$28,495|
|Escape Hybrid|\$32,260|
|Expedition|\$35,085|
|F-450|\$44,145|

* The names of the vehicle models are nonnumerical entries $\rightarrow$ the left column consists of qualitative data
* The suggested retail prices are numerical entries $\rightarrow$ the right column consists of quantitative data
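
The same split shows up when trying to compute with each column. The following sketch assumes the table is stored as a list of `(model, price)` pairs with prices already converted to whole dollars (the variable names are chosen here for illustration). The prices can be averaged, while no comparable arithmetic makes sense for the model names.

```python
from statistics import mean

# Data copied from the table above; prices in whole dollars
vehicles = [
    ("Focus Sedan", 15995), ("Fusion", 19270), ("Mustang", 20995), ("Edge", 26920),
    ("Flex", 28495), ("Escape Hybrid", 32260), ("Expedition", 35085), ("F-450", 44145),
]

models = [model for model, _ in vehicles]   # qualitative: labels only
prices = [price for _, price in vehicles]   # quantitative: numerical entries

# Arithmetic is meaningful for the quantitative column
average_price = mean(prices)
print(f"Average suggested retail price: ${average_price:,.2f}")

# For the qualitative column, about all we can do is list or count the labels
print(f"Number of models: {len(models)}")
```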

##### Objective 2: How to classify data with respect to the four levels of measurement

##### Four Levels of Measurement

1. Nominal
2. Ordinal
3. Interval
4. Ratio

##### Levels of Measurement

1. **Nominal level of measurement** has the following characteristics:
    * Qualitative data only
    * Categorized using names, labels, or qualities
    * No mathematical computations can be made

2. **Ordinal level of measurement** has the following characteristics:
    * Qualitative or quantitative data
    * Data can be arranged in order, or ranked
    * Differences between data entries are not meaningful
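
As a rough illustration of what each of these two levels permits, the sketch below uses made-up data: a list of eye colors (nominal) and a list of survey ratings (ordinal). The names `RATING_ORDER`, `eye_colors`, and `ratings` are hypothetical. Nominal entries can be counted but not ordered; ordinal entries can also be sorted, but subtracting one rating from another would not mean anything.

```python
from collections import Counter

# Nominal: labels only; entries can be counted but not ordered or averaged
eye_colors = ["brown", "blue", "brown", "green", "brown"]
counts = Counter(eye_colors)
most_common_color = counts.most_common(1)[0][0]

# Ordinal: labels with a meaningful order, given by position in RATING_ORDER
RATING_ORDER = ["poor", "fair", "good", "excellent"]
ratings = ["good", "poor", "excellent", "fair", "good"]
ranked = sorted(ratings, key=RATING_ORDER.index)

print(most_common_color)
print(ranked)
```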

##### Example:

Two data sets are shown. Which data set consists of data at the nominal level? Which data set consists of data at the ordinal level?

<!-- <table> -->
<!--     <tr><th>Table 1</th><th>Table 2</th></tr> -->
<!-- <tr><td> -->
        
|Top Five TV Programs (from 5/4/09 to 5/10/09)|
|:-|
|1. American Idol-Wednesday|
|2. American Idol-Tuesday|
|3. Dancing with the Stars|
|4. NCIS|
|5. The Mentalist|

<!-- </td><td> -->
        
|Network Affiliates in Pittsburgh, PA|
|:-:|
|WTAE (ABC)|
|WPXI (NBC)|
|KDKA (CBS)|
|WPGH (FOX)|
        
<!-- </td></tr></table> -->

##### Solution:

<!-- <table>
    <tr><th>Table 1</th><th>Table 2</th></tr>
    <tr><td> -->
        
|$\overbrace{\text{Top Five TV Programs (from 5/4/09 to 5/10/09)}}^{\Large{\text{Ordinal}}}$|
|:-|
|1. American Idol-Wednesday|
|2. American Idol-Tuesday|
|3. Dancing with the Stars|
|4. NCIS|
|5. The Mentalist|

<!-- </td><td> -->
        
|$\overbrace{\text{Network Affiliates in Pittsburgh, PA}}^{\Large{\text{Nominal}}}$|
|:-:|
|WTAE (ABC)|
|WPXI (NBC)|
|KDKA (CBS)|
|WPGH (FOX)|
        
<!-- </td></tr></table> -->

* Table 1 lists the ranks of five TV programs; the data can be ordered; the difference between ranks is not meaningful $\rightarrow$ the table data falls under the ordinal level of measurement.
* Table 2 lists the call letters of each network affiliate; call letters are names of network affiliates $\rightarrow$ the table data falls under the nominal level of measurement.

##### Levels of Measurement

3. **Interval level of measurement** has the following characteristics:
    * Quantitative data
    * Data can be ordered
    * Differences between data entries are meaningful
    * Zero represents a position on a scale (not an **inherent zero**; i.e. zero does not imply "none")
4. **Ratio level of measurement** has the following characteristics:
    * Quantitative data
    * Data can be ordered
    * Differences between data entries are meaningful
    * Zero entry *is* an **inherent zero** (i.e. zero implies "none")
    * A meaningful ratio of two data values can be formed
    * One data value can be expressed as a multiple of another
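
The four levels can be read as a sequence of yes/no questions: can the data be ordered, are differences meaningful, and is there an inherent zero? The sketch below encodes that reading. The function name `level_of_measurement` and its boolean parameters are hypothetical, and the example variables in the comments (eye color, a ranking, temperature in degrees Celsius, weight) are illustrations, not a complete rule for every data set.

```python
def level_of_measurement(can_be_ordered, differences_meaningful, inherent_zero):
    """Return the highest level of measurement consistent with the answers given."""
    if not can_be_ordered:
        return "nominal"
    if not differences_meaningful:
        return "ordinal"
    if not inherent_zero:
        return "interval"
    return "ratio"

# Eye color: labels with no order
print(level_of_measurement(False, False, False))
# A ranking: ordered, but gaps between ranks carry no meaning
print(level_of_measurement(True, False, False))
# Temperature in degrees Celsius: differences are meaningful, zero is not "none"
print(level_of_measurement(True, True, False))
# Weight: zero means "none", so ratios are meaningful
print(level_of_measurement(True, True, True))
```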

##### Example:

Two data sets are shown. Which data set consists of data at the interval level of measurement? Which data set consists of data at the ratio level of measurement?

<!-- <table>
    <tr><th>Table 1</th><th>Table 2</th></tr>
<tr><td width="20%"> -->
        
|New York Yankees' World Series Victories (Years)|
|:-:|
|1923|
|1927|
|1928|
|1932|
|1936|
|1937|
|1938|
|1939|
|1941|
|1943|
|1947|
|1949|
|1950|
|1951|
|1952|
|1953|
|1956|
|1958|
|1961|
|1962|
|1977|
|1978|
|1996|
|1998|
|1999|
|2000|
|2009|
    

<!-- </td><td width="25%"> -->
        
|2009 American League Home Run Totals (by Team)||
|:-:|:-:|
|Team|Home Runs|
|Baltimore|160|
|Boston|212|
|Chicago|184|
|Cleveland|161|
|Detroit|183|
|Kansas City|144|
|Los Angeles|173|
|Minnesota|172|
|New York|244|
|Oakland|135|
|Seattle|160|
|Tampa Bay|199|
|Texas|224|
|Toronto|209|
        
<!-- </td></tr></table> -->

##### Solution:

<!-- <table>
    <tr><th>Table 1</th><th>Table 2</th></tr>
<tr><td width="20%"> -->
        
|$\overbrace{\text{New York Yankees' World Series Victories (Years)}}^{\Large{\text{Interval}}}$|
|:-:|
|1923|
|1927|
|1928|
|1932|
|1936|
|1937|
|1938|
|1939|
|1941|
|1943|
|1947|
|1949|
|1950|
|1951|
|1952|
|1953|
|1956|
|1958|
|1961|
|1962|
|1977|
|1978|
|1996|
|1998|
|1999|
|2000|
|2009|
    

<!-- </td><td width="25%"> -->
        
|$\overbrace{\text{2009 American League Home Run Totals (by Team)}}^{\Large{\text{Ratio}}}$||
|:-:|:-:|
|Team|Home Runs|
|Baltimore|160|
|Boston|212|
|Chicago|184|
|Cleveland|161|
|Detroit|183|
|Kansas City|144|
|Los Angeles|173|
|Minnesota|172|
|New York|244|
|Oakland|135|
|Seattle|160|
|Tampa Bay|199|
|Texas|224|
|Toronto|209|
        
<!-- </td></tr></table> -->

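* Table 1 contains quantitative data; the difference between two years is meaningful, but a year of zero does not mean "none," and a ratio of two years has no meaning $\rightarrow$ the data is at the interval level of measurement.
* Table 2 contains quantitative data; differences between totals are meaningful, and a total of zero *does* mean "none" (no home runs), so one total can be expressed as a multiple of another $\rightarrow$ the data is at the ratio level of measurement.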
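
To make the contrast concrete, the sketch below takes a few entries from each table (the subset and the variable names are chosen here for illustration). Subtracting two years gives a meaningful gap, while dividing one year by another can be computed but means nothing. For home run totals, both the difference and the ratio are meaningful.

```python
# Two entries from Table 1 (interval level)
earlier_win, later_win = 2000, 2009
years_between_wins = later_win - earlier_win   # meaningful: a gap in years
# later_win / earlier_win can be computed, but the result has no interpretation

# Two entries from Table 2 (ratio level)
home_runs = {"New York": 244, "Oakland": 135}
difference = home_runs["New York"] - home_runs["Oakland"]   # meaningful
ratio = home_runs["New York"] / home_runs["Oakland"]        # also meaningful

print(f"Years between the two wins: {years_between_wins}")
print(f"New York hit {difference} more home runs than Oakland")
print(f"New York hit about {ratio:.2f} times as many home runs as Oakland")
```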

##### End of Section