# Type conversion with a CASE clause

One of the `parking_violation` attributes included for each record is the vehicle's location with respect to the street address of the violation. An `'F'` value in the `violation_in_front_of_or_opposite` column indicates the vehicle was in front of the recorded address. An `'O'` value indicates the vehicle was on the opposite side of the street. The column stores these values using the `TEXT` type. The same information could be captured using a `BOOLEAN` (true/false) value, which uses less memory.

In this exercise, you will convert `violation_in_front_of_or_opposite` to a `BOOLEAN` column named `is_violation_in_front` using a `CASE` clause. This column is `true` for records that occur in front of the recorded address and `false` for records that occur opposite of the recorded address.

```
SELECT
  CASE WHEN
          -- Use true when column value is 'F'
          violation_in_front_of_or_opposite = 'F' THEN true
       WHEN
          -- Use false when column value is 'O'
          violation_in_front_of_or_opposite = 'O' THEN false
       ELSE
          NULL
  END AS is_violation_in_front
FROM
  parking_violation;
```
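
To see how the `CASE` expression treats each possible input, the following sketch applies the same logic to a few literal rows rather than to the `parking_violation` table. The `sample` alias, the `position_code` column name, and the `'X'` and `NULL` rows are made up for illustration; they show that any value other than `'F'` or `'O'` falls through to the `ELSE` branch and becomes `NULL`.

```sql
-- Hypothetical sample rows; not taken from the parking_violation table
SELECT
  position_code,
  CASE WHEN position_code = 'F' THEN true
       WHEN position_code = 'O' THEN false
       -- Unexpected or missing values end up here
       ELSE NULL
  END AS is_violation_in_front
FROM (
  VALUES ('F'), ('O'), ('X'), (NULL)
) AS sample(position_code);
```

Keeping the explicit `ELSE NULL` makes it visible that unrecognized codes are not silently mapped to `false`.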

# Applying aggregate functions to converted values

As demonstrated in the video exercise, converting a column's value from `TEXT` to a number allows calculations to be performed using aggregate functions. The `summons_number` column is of type `TEXT` in the `parking_violation` dataset. The maximum (using `MAX(summons_number)`) and minimum (using `MIN(summons_number)`) of the `TEXT` representation of `summons_number` can be calculated. However, if you want to know the size of the range (max - min) of `summons_number` values, the calculation is not possible because subtraction is not defined for `TEXT` types. Converting `summons_number` to a `BIGINT` first resolves this problem.

In this exercise, you will calculate the size of the range of `summons_number` values as the difference between the maximum and minimum `summons_number`.

```
SELECT
  -- Define the range_size from the max and min summons number
  MAX(summons_number::BIGINT) - MIN(summons_number::BIGINT) AS range_size
FROM
  parking_violation;
```
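
The cast matters for more than the subtraction: `MAX()` and `MIN()` on `TEXT` compare values character by character, which can differ from numeric order when the strings have different lengths. The sketch below uses three short, hypothetical values (real summons numbers are much longer, and the `sample` alias is an assumption) to contrast the two orderings.

```sql
-- Hypothetical sample values stored as text
SELECT
  MAX(n)                            AS text_max,     -- compared as strings
  MIN(n)                            AS text_min,
  MAX(n::BIGINT)                    AS numeric_max,  -- compared as numbers
  MIN(n::BIGINT)                    AS numeric_min,
  MAX(n::BIGINT) - MIN(n::BIGINT)   AS range_size
FROM (
  VALUES ('9'), ('10'), ('100')
) AS sample(n);
```

As text, `'9'` sorts after `'100'`, so the text maximum is `'9'`, while the numeric maximum is 100 and the numeric range is 91. If every value in a column has the same number of digits, the two orderings agree, but casting first avoids relying on that.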

# Cleaning invalid dates

The `date_first_observed` column in the `parking_violation` dataset represents the date when the parking violation was first observed by the individual recording the violation. Unfortunately, not all `date_first_observed` values were recorded properly. Some records contain a `'0'` value for this column. A `'0'` value cannot be interpreted as a `DATE` automatically as its meaning in this context is ambiguous. The column values require cleaning to enable conversion to a proper `DATE` column.

In this exercise, you will convert the `date_first_observed` value of records with a `'0'` `date_first_observed` value into `NULL` values using the `NULLIF()` function, so that the field can be represented as a proper date.

```
SELECT
  -- Convert date_first_observed into DATE
  DATE(NULLIF(date_first_observed, '0')) AS date_first_observed
FROM
  parking_violation;
```
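
The effect of `NULLIF()` can be checked on literal values before running it against the table. This is a minimal sketch: the `sample` alias and the `raw_value` column are hypothetical, and the `'20170614'` value assumes the column stores dates in a `YYYYMMDD`-style text format.

```sql
-- Hypothetical sample rows: one parseable date and one '0' placeholder
SELECT
  raw_value,
  -- NULLIF returns NULL when both arguments are equal, otherwise the first argument
  NULLIF(raw_value, '0')        AS cleaned_value,
  -- A NULL input converts to a NULL date instead of raising an error
  DATE(NULLIF(raw_value, '0'))  AS date_first_observed
FROM (
  VALUES ('20170614'), ('0')
) AS sample(raw_value);
```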

# Converting and displaying dates

The `parking_violation` dataset with which we have been working has two date columns where dates are represented in different formats: `issue_date` and `date_first_observed`. This is the case because these columns were imported into the database table as `TEXT` types. Using the `DATE` formatting approaches covered in the video exercise, it is possible to convert the dates from `TEXT` values into proper `DATE` columns and then output the dates in a consistent format.

In this exercise, you will use `DATE()` to convert `issue_date` and `date_first_observed` into `DATE` types and `TO_CHAR()` to display each value in the YYYYMMDD format.

```
SELECT
  summons_number,
  -- Convert issue_date to a DATE
  DATE(issue_date) AS issue_date,
  -- Convert date_first_observed to a DATE
  DATE(date_first_observed) AS date_first_observed
FROM
  parking_violation;
```

```
SELECT
  summons_number,
  -- Display issue_date using the YYYYMMDD format
  TO_CHAR(issue_date, 'YYYYMMDD') AS issue_date,
  -- Display date_first_observed using the YYYYMMDD format
  TO_CHAR(date_first_observed, 'YYYYMMDD') AS date_first_observed
FROM (
  SELECT
    summons_number,
    DATE(issue_date) AS issue_date,
    DATE(date_first_observed) AS date_first_observed
  FROM
    parking_violation
) sub;
```
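
`DATE()` relies on PostgreSQL's own parsing of date strings. When the text format of a column is known in advance, `TO_DATE()` with an explicit format string is an alternative that states the expected layout directly. The sketch below is not the exercise's method; the `MM/DD/YYYY` and `YYYYMMDD` input formats and the sample values are assumptions chosen for illustration.

```sql
-- Hypothetical sample row with two differently formatted date strings
SELECT
  -- Parse with an explicit input format, then display as YYYYMMDD
  TO_CHAR(TO_DATE(issue_date, 'MM/DD/YYYY'), 'YYYYMMDD')        AS issue_date,
  TO_CHAR(TO_DATE(date_first_observed, 'YYYYMMDD'), 'YYYYMMDD') AS date_first_observed
FROM (
  VALUES ('06/14/2017', '20170614')
) AS sample(issue_date, date_first_observed);
```

Both output columns use the same display format even though the inputs were written differently.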

# Extracting hours from a time value

Your team has been tasked with generating a summary report to better understand the hour of the day when most parking violations are occurring. The `violation_time` field has been imported into the database using strings consisting of the hour (in 12-hour format), the minutes, and AM/PM designation for each violation. An example time stored in this field is `'1225AM'`. Note the lack of a colon and space in this format.

Use the `TO_TIMESTAMP()` function and the proper format string to convert `violation_time` into a `TIMESTAMP`, extract the hour from the `TIME` component of this `TIMESTAMP`, and provide a count of all parking violations by the hour in which they were issued. The conversion to a `TIME` value in the subquery is used because `violation_time` values do not include date information.

```
SELECT
  -- Populate column with violation_time hours
  EXTRACT(HOUR FROM violation_time) AS hour,
  COUNT(*)
FROM (
    SELECT
      TO_TIMESTAMP(violation_time, 'HH12MIPM')::TIME AS violation_time
    FROM
      parking_violation
    WHERE
      violation_time IS NOT NULL
) sub
GROUP BY
  hour
ORDER BY
  hour;
```
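
The format string can be tried on individual values first. In the sketch below, `'1225AM'` is the example from the text; the other two values and the `sample` alias are hypothetical. `HH12` reads the 12-hour clock value, `MI` the minutes, and `PM` matches either meridiem indicator.

```sql
-- '1225AM' comes from the text above; the other rows are made up
SELECT
  raw_time,
  TO_TIMESTAMP(raw_time, 'HH12MIPM')::TIME                     AS violation_time,
  EXTRACT(HOUR FROM TO_TIMESTAMP(raw_time, 'HH12MIPM')::TIME)  AS hour
FROM (
  VALUES ('1225AM'), ('0130PM'), ('1200PM')
) AS sample(raw_time);
```

Since 12:25 AM falls in the first hour of the day, violations recorded shortly after midnight are grouped under hour 0 rather than hour 12.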

# A parking violation report by day of the month

Having heard anecdotal evidence that parking tickets are more likely to be given out at the end of the month than earlier in the month, you have been tasked with preparing data to get a sense of the distribution of tickets by day of the month. The date on which each violation occurred is included in the `parking_violation` dataset, but it is currently stored as a string. Although this presents an obstacle to producing the required data, you feel confident in your ability to get the data into the format that you need.

In this exercise, you will convert the strings representing the `issue_date` into proper PostgreSQL `DATE` values. From this representation of the data, you will extract the day of the month required to produce the distribution of violations by day of the month.

```
SELECT
  -- Create issue_day from the day value of issue_date
  EXTRACT(DAY FROM issue_date) AS issue_day,
  -- Include the count of violations for each day
  COUNT(*)
FROM (
  SELECT
    -- Convert issue_date to a DATE value
    DATE(issue_date) AS issue_date
  FROM
    parking_violation
) sub
GROUP BY
  issue_day
ORDER BY
  issue_day;
```
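
Because months differ in length, day 28 is the last day of some months and several days before the end of others. One possible refinement, not part of the exercise, is to also compute how many days remain until the end of the month. The sketch below uses hypothetical sample dates; the `sample` alias and the `days_before_month_end` column name are assumptions.

```sql
-- Hypothetical sample dates, already typed as DATE
SELECT
  d                    AS issue_date,
  EXTRACT(DAY FROM d)  AS issue_day,
  -- Last day of the month: first day of the month, plus one month, minus one day
  (DATE_TRUNC('month', d) + INTERVAL '1 month' - INTERVAL '1 day')::DATE - d
                       AS days_before_month_end
FROM (
  VALUES (DATE '2017-06-14'), (DATE '2017-06-30'), (DATE '2017-02-28')
) AS sample(d);
```

Subtracting one `DATE` from another yields an integer number of days, so a value of 0 marks a ticket issued on the final day of its month.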

# Risky parking behavior

The `parking_violation` table contains many parking violation details. However, it's unclear what causes an individual to violate parking restrictions. One hypothesis is that violators attempt to park in restricted areas just before the parking restrictions end. You have been asked to investigate this phenomenon. You first need to contend with the fact that times in the `parking_violation` table are represented as strings.

In this exercise, you will convert `violation_time` and `to_hours_in_effect` to `TIMESTAMP` values for violations that took place in locations with partial-day restrictions, calculate the interval between the `violation_time` and `to_hours_in_effect` for these records, and identify the records where the `violation_time` is less than 1 hour before `to_hours_in_effect`.

```
SELECT
  summons_number,
  -- Create column for hours between to_hours_in_effect and violation_time
  EXTRACT(HOUR FROM to_hours_in_effect - violation_time) AS hours,
  -- Create column for minutes between to_hours_in_effect and violation_time
  EXTRACT(MINUTE FROM to_hours_in_effect - violation_time) AS minutes
FROM (
  SELECT
    summons_number,
    TO_TIMESTAMP(violation_time, 'HH12MIPM')::TIME AS violation_time,
    TO_TIMESTAMP(to_hours_in_effect, 'HH12MIPM')::TIME AS to_hours_in_effect
  FROM
    parking_violation
  WHERE
    NOT (from_hours_in_effect = '1200AM' AND to_hours_in_effect = '1159PM')
) sub;
```

The next query reads from `time_differences`, which is not defined in this section; it is assumed to hold the `summons_number`, `hours`, and `minutes` columns produced by the query above.

```
SELECT
  -- Return the count of records
  COUNT(*)
FROM
  time_differences
WHERE
  -- Include records with an hours value of 0
  hours = 0 AND
  -- Include records with a minutes value of at most 59
  minutes <= 59;
```
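
The two steps can also be combined into one statement. The sketch below assumes that `time_differences` contains the output of the first query and defines it as a common table expression; the `sample` rows are hypothetical and stand in for the `parking_violation` table so the statement runs on its own.

```sql
WITH sample(summons_number, violation_time, from_hours_in_effect, to_hours_in_effect) AS (
  -- Hypothetical rows standing in for parking_violation
  VALUES
    (1, '0630PM', '0800AM', '0700PM'),  -- 30 minutes before the restriction ends
    (2, '0200PM', '0800AM', '0700PM'),  -- 5 hours before the restriction ends
    (3, '0730PM', '0800AM', '0700PM'),  -- 30 minutes after the restriction ends
    (4, '1015AM', '1200AM', '1159PM')   -- all-day restriction, excluded below
),
time_differences AS (
  SELECT
    summons_number,
    EXTRACT(HOUR FROM to_hours_in_effect - violation_time)   AS hours,
    EXTRACT(MINUTE FROM to_hours_in_effect - violation_time) AS minutes
  FROM (
    SELECT
      summons_number,
      TO_TIMESTAMP(violation_time, 'HH12MIPM')::TIME     AS violation_time,
      TO_TIMESTAMP(to_hours_in_effect, 'HH12MIPM')::TIME AS to_hours_in_effect
    FROM
      sample
    WHERE
      NOT (from_hours_in_effect = '1200AM' AND to_hours_in_effect = '1159PM')
  ) sub
)
-- Inspect the intermediate values before applying the final filter
SELECT
  summons_number,
  hours,
  minutes
FROM
  time_differences
ORDER BY
  summons_number;
```

Row 3 is included to show what happens when a violation is recorded after the restriction has ended. PostgreSQL reports the fields of a negative interval as negative numbers, so that row has an `hours` value of 0 and a negative `minutes` value and would still satisfy `hours = 0 AND minutes <= 59`. If only violations before the end time should be counted, adding a condition such as `minutes >= 0` is one way to exclude such rows.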