In this chapter, you will be introduced to the `text`, `numeric`, `temporal`, and `boolean` data types in `PostgreSQL`. After completing this chapter, you will be able to choose the appropriate `data type` for each table column based on the `data values` to be stored.

## 1. Introduction to PostgreSQL data types

#### Data categories in PostgreSQL.
The main categories are:
- `text`
- `numeric`
- `temporal`
- `boolean`
- **`Others`:** `geometric, binary, monetary.`

#### Examples.
**EX 1. Representing a birthday.**
- Cathy: May 3rd, 2006
- **`Possible representations`.** Available types:

> **text:** `"May 3, 2006", "5/3/2006"`

> **date:** `2006-05-03`

**EX 2. Tracking payment status.**
- Did the attending member pay?
- **`Possible representations`.** Available types:

> **text:** `"Yes"/"No"` or `"Y" / "N"`

> **boolean:** `TRUE / FALSE`

$\Rightarrow$ **`The specific types provide a restriction on values.`**

**EX 3. Trip distance.**
- Mark flew 326 miles for the `client meeting`.
- **`Possible representations`.** Available types:

> **text:** `"326 miles"` or simply `"326"`

> **numeric:** `326`

$\Rightarrow$ **`The specific types provide a restriction on values.`**
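
The restriction described in the three examples can be seen directly in a small sketch. The table `type_demo` and its column names are hypothetical and exist only for illustration; the commented-out statements show inputs that a typed column would reject.

```sql
-- Hypothetical demo table; names are illustrative only
CREATE TEMP TABLE type_demo (
    birthday   DATE,      -- EX 1: a calendar date
    has_paid   BOOLEAN,   -- EX 2: a yes/no value
    trip_miles INTEGER    -- EX 3: a whole number
);

-- Values that match each column's type are accepted
INSERT INTO type_demo (birthday, has_paid, trip_miles)
VALUES ('2006-05-03', TRUE, 326);

-- Each of these would be rejected with an error if uncommented:
-- INSERT INTO type_demo (birthday)   VALUES ('sometime in May');
-- INSERT INTO type_demo (has_paid)   VALUES ('maybe');
-- INSERT INTO type_demo (trip_miles) VALUES ('326 miles');
```

Had all three columns been declared as `TEXT`, every one of the rejected values would have been stored without complaint.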

### EXERCISEs

#### Exercise 1.1. Matching data representations and categories
Choosing the `proper data type` for representing the data to be stored in a `database` is an important aspect of `database` development. This choice has **`ramifications`** for how the data can be used, how much space is required to store the data, and how resilient the database is to the introduction of erroneous data. This exercise will allow you to identify the best data category for different data values.

**Question.** For each of the `data descriptions`, `drag` the `description` to the most appropriate `PostgreSQL` data category.

**SOLUTION.**

| Numeric | Boolean | Temporal |
|:-|:-|:-|
| The `height` of a customer for a custom clothing retail business | `Whether or not` a customer is vegan | A customer's `wedding anniversary` |
| The `amount` of coffee ordered from a distributor | `Whether` a type of plastic is recyclable `or not` | The `length` of a movie |

#### Exercise 1.2. Choosing data types at table creation
It is best to specify the `data type` for a column when the table is initially created. **`The type can be changed later`**. *However, this change may have unintended consequences if the column contains previously populated values*. 
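
As a rough illustration of such an unintended consequence, the sketch below changes a populated column from `NUMERIC` to `INTEGER`. The table `retype_demo` and its values are hypothetical.

```sql
-- Hypothetical demo table with values already stored
CREATE TEMP TABLE retype_demo (
    jobs_supported NUMERIC
);
INSERT INTO retype_demo (jobs_supported) VALUES (12), (7.5);

-- Change the column type; USING states how existing values are converted
ALTER TABLE retype_demo
    ALTER COLUMN jobs_supported TYPE INTEGER
    USING jobs_supported::INTEGER;

-- The fractional value 7.5 is no longer stored exactly
SELECT jobs_supported FROM retype_demo;
```

The statement succeeds, but the fractional part of the existing value is lost in the conversion, which is easy to overlook on a large table.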

In this exercise, you will specify data types based on the category of data to which the values belong.

We will now return to the `Small Business Administration (SBA)` database example from [Chapter 1](). `SBA` loans are provided for specific business projects. You will complete the definition of the `project` table using the most appropriate column types.

#### Instructions
Complete the **`CREATE TABLE`** `command` for the `project` table using the correct data type from the following choices: `TEXT`, `BOOLEAN`, and `NUMERIC`.

**SOLUTION.**

                -- Create the project table
                CREATE TABLE project (
                                    -- Unique identifier for projects
                                    id SERIAL PRIMARY KEY,
                                    -- Whether or not project is franchise opportunity
                                    is_franchise BOOLEAN DEFAULT FALSE,
                                    -- Franchise name if project is franchise opportunity
                                    franchise_name TEXT DEFAULT NULL,
                                    -- State where project will reside
                                    project_state TEXT,
                                    -- County in state where project will reside
                                    project_county TEXT,
                                    -- District number where project will reside
                                    congressional_district NUMERIC,
                                    -- Amount of jobs projected to be created
                                    jobs_supported NUMERIC
                                );

**Comment!** Good job! By providing the appropriate `data type` for each column, the data in this table will be more useful. Such efforts will allow actions such as calculating an estimate of the total `number of jobs` created by `SBA projects` and enabling searches to know what `franchise opportunities` have been supported by `SBA projects`.

## 2. Defining text columns

### Using `text` in `PostgreSQL`
For example,

                CREATE TABLE book(
                                    isbn CHAR(13) NOT NULL,
                                    author_first_name VARCHAR(50) NOT NULL,
                                    author_last_name VARCHAR(50) NOT NULL,
                                    content  TEXT  NOT NULL
                                  );

The `text` data types that can be used are `TEXT`, `VARCHAR(N)`, and `CHAR(N)`.

#### 1. The `TEXT` data type
- Strings of variable length
- Strings of unlimited length
- Good for `text-based values` of `unknown length`

#### 2. The `VARCHAR` data type
- Strings of variable length
- Unlimited length unless a maximum is given
- A restriction can be imposed on column values. `VARCHAR(N)` means:

> `N` is the maximum number of characters stored.

> The column can store a `string` with fewer than `N` characters.

> Inserting a `string` longer than `N` characters raises an error.

- `VARCHAR` **without** `N` is equivalent to `TEXT`.

#### 3. The `CHAR` data type
- A `CHAR(N)` value **consists of exactly** `N` characters.
- Shorter `strings` are right-padded with spaces.
- `CHAR` (without `N`) is equivalent to `CHAR(1)`.
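
The differences between the three types can be observed with a short sketch. The table `text_demo` and its columns are hypothetical; the same two-character string is stored in each column and its stored size is compared.

```sql
-- Hypothetical demo table with one column per text type
CREATE TEMP TABLE text_demo (
    code  CHAR(5),
    label VARCHAR(5),
    body  TEXT
);

INSERT INTO text_demo (code, label, body)
VALUES ('ab', 'ab', 'ab');

-- CHAR(5) pads 'ab' with spaces to five characters;
-- VARCHAR(5) and TEXT store the two characters unchanged
SELECT octet_length(code)  AS code_bytes,   -- 5
       octet_length(label) AS label_bytes,  -- 2
       octet_length(body)  AS body_bytes    -- 2
FROM text_demo;

-- Would fail if uncommented: the string is longer than 5 characters
-- INSERT INTO text_demo (label) VALUES ('abcdefgh');
```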

### EXERCISEs.
#### Exercise 2.1. Matching text types
While there are a small number of `text data types` in `PostgreSQL`, each type is best utilized for representing character sequences with different characteristics. These differences are subtle in the case of using a `VARCHAR(N)` column versus a `CHAR(N)` column. However, there are benefits to be realized by making the correct choice. In this exercise, you will have an opportunity to reason about which text-based data type to use when representing different types of data values.

For each of the data descriptions, identify the most appropriate data type for representing the described data by dragging the description to the appropriate `PostgreSQL` `text data type`.

**SOLUTION.**

| **`CHAR(N)`** | **`VARCHAR(N)`** | **`TEXT`** |
|:-|:-|:-|
| The `nine-digit employee identification number (EIN)` assigned to a business by the `Internal Revenue Service` | The `100 character maximum length name` for the community programs hosted by the city library | A column to store `searchable text of court transcripts` |
| A `two-character code` used to distinguish the storage locations for products in a warehouse | The `75 character maximum length title` of a podcast listed in a publisher's podcast offering table | A column to represent the `content of email` for an email service provider |

**Comment!** 
- You are really showing your ability to distinguish between the different `text data types`. 
- If you know that your data values will not exceed a certain length, in `PostgreSQL` it is better to specify the column as `VARCHAR(N)` rather than `CHAR(N)`, because `CHAR(N)` columns carry extra storage and computational costs from padding.

#### Exercise 2.2 SBA appeals table
In managing the `SBA database`, it would be helpful for applicants that were denied a loan to have an electronic appeal process that would allow the rejected loan application to be reconsidered. The focus of this exercise will be on designing a table to store both the appeal and the accompanying text describing the justification for reconsideration.

#### Instructions
Create a table named `appeal` which includes a unique identifier, `id`, as well as a column named `content` that can store as much text as the applicant needs to make their case.

**SOLUTION.**

            -- Create the appeal table
            CREATE TABLE appeal (
                                -- Specify the unique identifier column
                                id SERIAL PRIMARY KEY,
                                -- Define a column for holding the text of the appeals
                                content TEXT NOT NULL
                                );                    
**Comment!** Tremendous job! You have now created a new table making it possible for applicants to appeal rejection decisions. Allowing the `content` column to have an `unrestricted length` makes it possible for `appeals` of significantly different lengths to be stored in the `database`.

## 3. Defining numeric data columns

### Numeric data with discrete values.
- `Using:` **`SMALLINT, INTEGER, BIGINT, SERIAL`** and **`BIGSERIAL`**
- **Example.**

                CREATE TABLE people.employee(
                                            id CHAR(13) NOT NULL,
                                            first_name VARCHAR(50) NOT NULL,
                                            last_name VARCHAR(50) NOT NULL,
                                            num_sales INTEGER
                                          );

- **Detail of integer types.**

| **`TYPEs`** | **`Description`** | **`Range`** |
|:-|:-|:-|
| `SMALLINT` |***small*** `range` integer | `-32768` to `32767` |
| `INTEGER` | `typical choice` for integer | `-2147483648` to `2147483647` |
| `BIGINT` | ***large*** `range` integer | `-9223372036854775808` to `9223372036854775807`|
| `SERIAL` | `auto-increment` integer | `1` to `2147483647` |
| `BIGSERIAL` | ***large*** `auto-increment` integer | `1` to `9223372036854775807` |
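
The ranges in the table can be probed with a few casts, and the auto-increment behavior of `SERIAL` with a small table. This is a minimal sketch; `serial_demo` is a hypothetical table name.

```sql
-- The largest SMALLINT value is accepted
SELECT 32767::SMALLINT AS largest_smallint;

-- One more does not fit in SMALLINT but fits in INTEGER
SELECT 32768::INTEGER AS fits_in_integer;

-- Would fail with an out-of-range error if uncommented:
-- SELECT 32768::SMALLINT;

-- A SERIAL column fills itself in from a sequence, starting at 1
CREATE TEMP TABLE serial_demo (
    id   SERIAL PRIMARY KEY,
    note TEXT
);
INSERT INTO serial_demo (note) VALUES ('first'), ('second');
SELECT id, note FROM serial_demo ORDER BY id;
```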

### Numeric data with continuous values.
- `Using:` **`DECIMAL, REAL`** and **`DOUBLE PRECISION`**
- **Example.**

                CREATE TABLE people.employee(
                                              id SERIAL PRIMARY KEY,
                                              first_name VARCHAR(50) NOT NULL,
                                              last_name VARCHAR(50) NOT NULL,
                                              num_sales INTEGER,
                                              salary DECIMAL(8, 2) NOT NULL
                                             );
- **Detail of types.**

| **`TYPEs`** | **`Description`** | **`Range`** |
|:-|:-|:-|
| `DECIMAL(n, m)` | User-`specified precision`: `n` is the `precision` (total number of digits) and `m` is the `scale` (digits after the decimal point). | Up to `131072` digits *before* the decimal point (`d.p`) and up to `16383` digits *after* the `d.p` |
| `REAL` | `Variable precision` | `6` decimal digits precision |
| `DOUBLE PRECISION` | `Variable precision` | `15` decimal digits precision |
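
The practical difference between the exact `DECIMAL` type and the approximate floating-point types can be sketched with a few standalone queries. The literal values are chosen only for illustration.

```sql
-- DECIMAL arithmetic is exact; DOUBLE PRECISION is approximate
SELECT 0.1::DECIMAL + 0.2::DECIMAL = 0.3::DECIMAL  AS decimal_equal,  -- true
       0.1::DOUBLE PRECISION + 0.2::DOUBLE PRECISION
           = 0.3::DOUBLE PRECISION                 AS double_equal;   -- false

-- DECIMAL(8, 2) rounds extra fractional digits to the declared scale
SELECT 123456.789::DECIMAL(8, 2) AS rounded_salary;  -- 123456.79

-- Would fail if uncommented: only 8 - 2 = 6 digits are allowed
-- before the decimal point
-- SELECT 1234567.89::DECIMAL(8, 2);
```

This is why a column such as `salary` is usually declared as `DECIMAL` rather than `REAL` or `DOUBLE PRECISION`.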

### EXERCISEs.

#### Exercise 3.1. Using integer types
The ***`Small Business Administration through its local Small Business Development Centers (SBDC)`*** provides advising and technical support for entrepreneurs and small businesses. Given the importance of having an online presence, the `SBA` is starting an initiative to help small businesses with e-commerce website development. You have been brought in to develop a database to store data on these development efforts. In this exercise, you will have the opportunity to use your knowledge of integer data types in `PostgreSQL` for defining a table for `SBDC clients`.

#### Instructions
Complete the definition of the `client` table using the most appropriate `integer type` to support the range of possible data values for the column.

**SOLUTION.**

                    -- Create the client table
                    CREATE TABLE client (
                                        -- Unique identifier column
                                        id SERIAL PRIMARY KEY,
                                        -- Name of the company
                                        name VARCHAR(50),
                                        -- Specify a text data type for variable length urls
                                        site_url VARCHAR(50),
                                        -- Number of employees (max of 1500 for small business)
                                        num_employees SMALLINT,
                                        -- Number of customers
                                        num_customers INTEGER
                                        );
**Comments.** 
- By choosing the `integer type` that is most appropriate to the data values that will populate the column, your data tables will use disk storage more efficiently.
- For example, each data value in a **`SMALLINT`** column uses less storage than a data value in an **`INTEGER`** column.
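
The storage difference can be checked with the built-in `pg_column_size` function, which reports the number of bytes used to store a value. A minimal sketch:

```sql
-- Bytes needed to store the same value in each integer type
SELECT pg_column_size(1::SMALLINT) AS smallint_bytes,  -- 2
       pg_column_size(1::INTEGER)  AS integer_bytes,   -- 4
       pg_column_size(1::BIGINT)   AS bigint_bytes;    -- 8
```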

#### Exercise 3.2. Supporting an `SBA` marketing campaign
The `SBA` has recently seen applications for their loan offerings decline. You have been hired to build a database to support marketing campaigns to increase `SBA loan` applications. 

In this exercise, you will define a campaign table to track campaign characteristics and results. Descriptions for the fields of this table are as follows:

- an `id` column to assign a unique identifier to each campaign
- a `name` column restricted to 50 characters in length
- a `budget` column that is restricted to monetary values less than `$100,000`
- a `num_days` column to indicate the length in days of the campaign (typically 180 days or less)
- a `goal_amount` column to track the target number of applications
- a `num_applications` column to track the number of applications received

#### Instructions
Define the `campaign` table including the required columns (`id`, `name`, `budget`, `num_days`, `goal_amount`, `num_applications`) and the most appropriate `data type` specification for each.

**SOLUTION.**

                        -- Create the campaign table
                        CREATE TABLE campaign (
                                              -- Unique identifier column
                                              id SERIAL PRIMARY KEY,
                                              -- Campaign name column
                                              name VARCHAR(50),
                                              -- The campaign's budget
                                              budget NUMERIC(7, 2),
                                              -- The duration of campaign in days
                                              num_days SMALLINT DEFAULT 30,
                                              -- The number of new applications desired
                                              goal_amount INTEGER DEFAULT 100,
                                              -- The number of received applications
                                              num_applications INTEGER DEFAULT 0
                                            );
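
To see how the column definitions behave, the sketch below inserts one row and relies on the defaults for the remaining columns. It assumes the `campaign` table above has already been created; the campaign names and amounts are hypothetical.

```sql
-- Assumes the campaign table defined above exists.
-- 99999.99 is the largest budget that NUMERIC(7, 2) can hold.
INSERT INTO campaign (name, budget)
VALUES ('Spring outreach', 99999.99);

-- num_days, goal_amount and num_applications fall back to their defaults
SELECT name, budget, num_days, goal_amount, num_applications
FROM campaign;

-- Would fail if uncommented: 100000.00 needs 8 digits in total
-- INSERT INTO campaign (name, budget) VALUES ('Too expensive', 100000.00);
```

Declaring `budget` as `NUMERIC(7, 2)` is what enforces the "less than `$100,000`" requirement without a separate constraint.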

## 4. Defining boolean and temporal data columns

### Boolean & Temporal data.
Look at the following example,

                        CREATE TABLE book(
                                           id SERIAL PRIMARY KEY,
                                           author_first_name VARCHAR(50) NOT NULL,
                                           author_last_name VARCHAR(50) NOT NULL,
                                           content TEXT NOT NULL,
                                           originally_published DATE NOT NULL,
                                           out_of_print BOOLEAN DEFAULT FALSE
                                            );
**Features / Properties of Boolean.**
- There are 3 possible values: `True` state, `False` state and `NULL` for the unknown state.
- Common for representing `yes-no` scenarios.
- Can be defined with the `keyword:` `BOOL` or `BOOLEAN`; for example

                        CREATE TABLE weather(
                                              is_rain BOOL DEFAULT FALSE,
                                              is_cold BOOLEAN DEFAULT TRUE
                                                );
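
A short sketch of how boolean values are read and how the unknown state behaves. The queries are standalone and use only literal values.

```sql
-- Several spellings are accepted as boolean input
SELECT 'yes'::BOOLEAN AS from_yes,   -- true
       'n'::BOOLEAN   AS from_n,     -- false
       '1'::BOOLEAN   AS from_one;   -- true

-- NULL stands for "unknown"; comparing it with TRUE is also unknown
SELECT (NULL::BOOLEAN = TRUE) IS NULL AS result_is_unknown;  -- true
```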
                                                
**Temporal data types.**         

| `Types` | `Descriptions` | `Format` |
|:-|:-|:-| 
| `TIMESTAMP`| represent a `date` and `time` | `2010-09-21 15:47:16` | 
| `DATE` | represent a `date` only | `2020-09-26` |
| `TIME` | represent a `time` only | `21:05:42` |
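
The relationship between the three types can be sketched by casting one timestamp literal to the other two, and by doing simple date arithmetic. The literal values are taken from the table above or chosen for illustration.

```sql
-- One TIMESTAMP value, and its date-only and time-only parts
SELECT TIMESTAMP '2010-09-21 15:47:16'               AS ts_value,
       CAST(TIMESTAMP '2010-09-21 15:47:16' AS DATE) AS date_only,
       CAST(TIMESTAMP '2010-09-21 15:47:16' AS TIME) AS time_only;

-- Subtracting two dates gives the number of days between them
SELECT DATE '2020-09-26' - DATE '2020-09-21' AS days_between;  -- 5

-- Would fail if uncommented: February never has 30 days
-- SELECT DATE '2020-02-30';
```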

### EXERCISEs
#### Exercise 4.1. Revisiting the appeals table
The `SBA database` now contains a table for storing `loan applicant appeal requests`. 

The table contains only a **`PRIMARY KEY`** and a text column for the `appeal`. It is useful to track additional information such as when the `appeal` was received, *whether or not the decision was reversed after the appeal, and the date when the appeal was reconsidered*. You will define a new version of the appeals table to capture this information.

#### Instructions
- Add a new column, `received_on`, which captures the `date and time` when the `appeal` was received.
- Add an `approved_on_appeal` column to indicate if the `loan decision` was changed due to the `appeal`. (This field will default to **`NULL`** to indicate that no new decision has yet been reached.)
- Add a `reviewed` column which stores the date when the `appeal` was reviewed.

**SOLUTION.**

                        CREATE TABLE appeal (
                                            id SERIAL PRIMARY KEY,
                                            content TEXT NOT NULL,
                                            -- Add received_on column
                                            received_on TIMESTAMP DEFAULT CURRENT_TIMESTAMP,

                                            -- Add approved_on_appeal column
                                            approved_on_appeal BOOLEAN DEFAULT NULL,

                                            -- Add reviewed column
                                            reviewed DATE
                                            );
**Comment.**
This revamped `appeals` table will provide additional information to properly keep track of `appeal timing` and `decisions`.
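
A possible way the new columns would be used is sketched below. It assumes the revised `appeal` table above has been created; the appeal text and the `id` value are hypothetical.

```sql
-- Assumes the revised appeal table defined above exists.
-- received_on is filled in by its default; approved_on_appeal stays NULL.
INSERT INTO appeal (content)
VALUES ('Hypothetical appeal text from an applicant.');

-- Later, record the outcome and the review date for one appeal
UPDATE appeal
SET approved_on_appeal = TRUE,
    reviewed = CURRENT_DATE
WHERE id = 1;   -- hypothetical id of the appeal being reviewed

SELECT id, received_on, approved_on_appeal, reviewed
FROM appeal;
```

Appeals that have not yet been reviewed can be found with `WHERE approved_on_appeal IS NULL`, which is the reason for leaving the default as `NULL` rather than `FALSE`.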

#### Exercise 4.2. Boolean defaults
A **`BOOLEAN`** column defaulting to false is not always desired. In some situations, a default value of true is preferred. Imagine a user management system built for a website. 

The `default` behavior is to authorize a newly registered user to access the website. This access remains available unless the user exhibits poor community behavior. Including an approved column on a user table in this database enables such a process. This column is true by default in this scenario.

**Question.** For each of the **`BOOLEAN`** data values described below, choose the best default value for the column when defining a table including this column.

**SOLUTION.**

| `Default: TRUE` | `Default: FALSE` |
|:-|:-|
| The `poisonous` column of the table in a database of `exotic plants` | The `is_closed` column of the `course` table in the `university` course catalog database indicating **if the course is full** |
| | The `is_recylcle` column of the `environmental` database with a table of `common household materials`.|

**Comments.** Noting that:
- It is best to assume a `plant is poisonous` *until it is `known` to be safe*; so `default = TRUE`.
- `Materials` should be considered `unrecyclable` *until the `status` is explicitly changed*; so `default = FALSE`.
- *Students should be allowed to enroll in a course* until the `class is full`; hence `default = FALSE`.
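
The first case can be sketched as follows. The table `plant_demo` and the plant names are hypothetical.

```sql
-- Hypothetical demo table: plants are treated as poisonous by default
CREATE TEMP TABLE plant_demo (
    id        SERIAL PRIMARY KEY,
    name      TEXT NOT NULL,
    poisonous BOOLEAN DEFAULT TRUE
);

-- No value given: the default TRUE is stored
INSERT INTO plant_demo (name) VALUES ('unclassified fern');

-- An explicit value overrides the default
INSERT INTO plant_demo (name, poisonous) VALUES ('basil', FALSE);

SELECT name, poisonous FROM plant_demo ORDER BY id;
```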

#### Choosing data type representations
In this chapter, you have explored many of the data types available for use in `PostgreSQL` databases including text, numeric, temporal, and boolean data types. 

Continuing with our `SBA` data example, let's gain some experience determining which data types to use to best represent the data that you are interested in storing. 

For example, if you wanted to understand the `monthly payment rates` of a `borrower`, it would be helpful to have the `loan amount` and `interest rate` represented as numeric values to aid in such a calculation.

**`monthly payment rates`**

| id |
|:-|
|...|

**`borrower`**

| id |
|:-|
|...|

$\Rightarrow$ Your manager has asked you to create a new `loan` table that requires specifying the correct `data type` and properties to use for the table columns.

#### Instructions
- Complete the definition of the `loan` table, including an `approval_date` column to represent the date when a loan is initially approved.
- Set the precision for the decimal-valued `gross_approval` column to allow loan amounts up to `$5,000,000`.
- Provide a data type to best represent the length (in months) for loan repayment using `term_in_months`.
- Define the data type for the column `revolver_status` to be represented by values of `true` and `false`.

**SOLUTION.**

            -- Create the loan table
            CREATE TABLE loan(
                            borrower_id INTEGER REFERENCES borrower(id),
                            bank_id INTEGER REFERENCES bank(id),
                            
                            -- 'approval_date': the loan approval date
                            approval_date DATE NOT NULL DEFAULT CURRENT_DATE,  
                            
                            -- 'gross_approval': amounts up to $5,000,000.00 (9 digits total, 2 after the decimal point)
                            gross_approval DECIMAL(9, 2) NOT NULL, 
                            
                            -- 'term_in_months': total # of months for repayment
                            term_in_months SMALLINT NOT NULL, 
                            
                            -- 'revolver_status': TRUE for revolving line of credit
                            revolver_status BOOLEAN NOT NULL DEFAULT FALSE,  
                            
                            initial_interest_rate DECIMAL(4, 2) NOT NULL
                            );
                            
**Comment.** You just demonstrated your ability to specify `data types` and their properties for columns in a new database. Take note of how `decisions` for defining column `data types` are informed by careful evaluation of the `data` values which will be stored in each `column`.
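
Because the loan amount and interest rate are stored as numeric values, the monthly payment mentioned earlier can be estimated directly in a query. The sketch below assumes the `loan` table above exists, that `initial_interest_rate` is an annual percentage, and that the loan is a fixed-rate, fully amortizing loan; none of these assumptions is stated in the source. With monthly rate $r$, principal $P$ and $n$ months, the payment is $P \cdot r / (1 - (1 + r)^{-n})$.

```sql
-- Estimated monthly payment per loan (assumptions: annual percentage rate,
-- fixed rate, fully amortizing; rows with a zero rate are skipped)
SELECT borrower_id,
       gross_approval,
       ROUND(
           gross_approval * (initial_interest_rate / 100 / 12)
           / (1 - POWER(1 + initial_interest_rate / 100 / 12,
                        (-term_in_months)::NUMERIC)),
           2
       ) AS est_monthly_payment
FROM loan
WHERE initial_interest_rate > 0;
```

Had `gross_approval` or `initial_interest_rate` been stored as `TEXT`, every row would need to be cast and validated before this arithmetic could run.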