# Lesson 10: Feature Types & Visualization

Before making an exploratory plot, or any plot for that matter, it's a good idea to examine each feature and decide on its type, which we call the feature type. (Sometimes we refer to a feature as a variable and to its type as the variable type.)

**Nominal**
A feature that represents "named" categories, where the categories do not have a natural ordering, is called nominal. An example is political party affiliation (Democrat, Republican, Green, Other).

**Ordinal**
Measurements that represent ordered categories are called ordinal. Examples of ordinal features are T-shirt size (small, medium, large) and Likert-scale response (disagree, neutral, agree). It is important to note that with an ordinal feature, the difference between, say, small and medium need not be the same as the difference between medium and large. We can order the categories, but the differences between consecutive categories may not even be quantifiable. Think of the number of stars in a restaurant review and what one star means in comparison to two stars.
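
As a rough illustration of the nominal/ordinal distinction, the sketch below stores one feature of each kind as a pandas `Categorical`. The two small series are invented for this example (they are not part of the lesson's data), and using `Categorical` here is one possible encoding, not the only one.

```python
import pandas as pd

# Hypothetical responses, invented for illustration only.
party = pd.Series(["Green", "Democrat", "Republican", "Other", "Democrat"])
shirt = pd.Series(["medium", "small", "large", "small", "medium"])

# Nominal: the categories have names but no declared order.
party_cat = pd.Categorical(party)

# Ordinal: the category order is stated explicitly.
shirt_cat = pd.Categorical(
    shirt, categories=["small", "medium", "large"], ordered=True
)

print(party_cat.ordered)  # False
print(shirt_cat.ordered)  # True

# Order-based operations are meaningful only for the ordinal feature.
print(shirt_cat.min(), shirt_cat.max())  # small large
sorted_sizes = pd.Series(shirt_cat).sort_values().tolist()
print(sorted_sizes)
```

Note that the ordering says nothing about how far apart the categories are; it only lets us sort and compare them.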

Ordinal and nominal data are *subtypes* of categorical data. Another name for categorical data is qualitative. In contrast, we also have quantitative features.

**Quantitative**
Data that represent numeric measurements or quantities are called quantitative. Examples include height measured to the nearest cm, price reported in USD, and distance measured to the nearest kilometer. Quantitative features can be further divided into discrete, meaning that the feature can take only certain separate values (often just a few), and continuous, meaning that the quantity could in principle be measured to arbitrary precision. The number of siblings in a family takes on a discrete set of values (such as 0, 1, 2). In contrast, height can theoretically be reported to any number of decimal places, so we consider it continuous. There is no hard and fast rule to determine whether a quantity is discrete or continuous. In some cases it is a judgment call, and at other times we may purposefully treat a continuous feature as discrete.
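
One way to purposefully treat a continuous feature as discrete is to bin it. The sketch below uses `pd.cut` on a handful of heights; the values, bin edges, and labels are all invented for illustration and are not taken from the lesson's data.

```python
import pandas as pd

# Hypothetical heights in cm, invented for illustration only.
heights = pd.Series([152.4, 160.0, 171.3, 178.9, 185.2, 190.5])

# Bin the continuous feature into three ranges.
# The edges and labels are arbitrary choices for this sketch.
height_group = pd.cut(
    heights,
    bins=[0, 165, 180, 250],
    labels=["short", "medium", "tall"],
)

# Count how many observations fall in each bin, in bin order.
print(height_group.value_counts(sort=False))
```

The result of `pd.cut` is an ordered categorical, so binning turns a quantitative feature into an ordinal one.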

In [None]:
import pandas as pd
import numpy as np

In [None]:
dogs = pd.read_csv('data/akc.csv')
dogs

<img src="images/dog_table.png" width="700" height="700">

In [None]:
print(dogs.breed)

In [None]:
dogs.info()

In [None]:
dogs['breed'].value_counts()

In [None]:
dogs.group.value_counts()

In [None]:
dogs[dogs['group'] == 'toy']

In [None]:
np.mean(dogs[dogs['group'] == 'toy']['weight'])
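
The filter-then-average step can also be written with `.loc`, and `groupby` extends it to every group at once. The sketch below is a minimal example on a made-up stand-in table: `dogs_demo`, its breed names, and its weights are hypothetical, and only the column names `group` and `weight` are borrowed from the cells above.

```python
import pandas as pd

# Hypothetical stand-in for the dogs table; all values are invented.
dogs_demo = pd.DataFrame({
    "breed": ["A", "B", "C", "D"],
    "group": ["toy", "toy", "herding", "herding"],
    "weight": [3.0, 5.0, 20.0, 30.0],
})

# Mean weight for one group, selecting rows and column in a single .loc call.
toy_mean = dogs_demo.loc[dogs_demo["group"] == "toy", "weight"].mean()
print(toy_mean)

# Mean weight for every group at once.
mean_by_group = dogs_demo.groupby("group")["weight"].mean()
print(mean_by_group)
```

Summarizing a quantitative feature (`weight`) within the levels of a nominal feature (`group`) is a common first step before plotting the two together.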