# DBT Tests

### Introduction

Another benefit of DBT is that it lets us test that our data meets certain expectations.  For example, we can use tests to ensure that a specific column contains no null values.  In this lesson, we'll see how DBT allows us to write tests that ensure data quality.

### Schema Testing Options

With DBT, we can either write custom tests of our data or use the tests that DBT provides out of the box.  Let's start with the assertions that DBT provides for us:

* `unique`: Assert that all values in a column are unique

* `not_null`: Assert that no values in a column are null

* `accepted_values`: Assert that all values in a column are one of the accepted values

* `relationships`: Assert that every foreign key value in a column maps to a primary key value in another model
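
To make the four helpers concrete, here is a minimal sketch of how each might be declared in a yaml file.  The column names `film_id`, `rating` and `category_id`, and the list of accepted ratings, are assumptions made up for illustration, so substitute the columns that actually exist in your models.

```yaml
version: 2

models:
  - name: dim_films
    columns:
      - name: film_id            # hypothetical primary key column
        tests:
          - unique
          - not_null
      - name: rating             # hypothetical column with a fixed set of values
        tests:
          - accepted_values:
              values: ['G', 'PG', 'PG-13', 'R']
      - name: category_id        # hypothetical foreign key column
        tests:
          - relationships:
              to: ref('dim_categories')   # the model holding the primary key
              field: category_id          # the primary key column in that model
```

Notice that `unique` and `not_null` take no arguments, while `accepted_values` and `relationships` each need a nested mapping that describes what to check against.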

### Adding a test

Ok, now let's see how we can add tests to our DBT repository.  For example, let's start by adding tests for some of the tables in our `mart` folder.

<img src="./adding_test_file.png" width="30%">

> We can name the file anything we want, so long as it ends with `.yml`, and we can place it anywhere within our models directory.  But the recommended practice is to have one yaml file per modeling folder, and to place it in the folder with the relevant `.sql` files.

Ok, so now that we have our yaml file, let's use it to make an assertion about our data.  We can work on the dimension films data (`dim_films`), which looks like the following:

> <img src="./dim_films.png" width="80%">

Let's begin by asserting that every film has a corresponding `year`.  We do so by adding the following to our `films_mart.yml` file.

```yaml
version: 2

models:
  - name: dim_films
    columns:
      - name: year
        tests:
          - not_null
```

So the above will assert that in the `dim_films` table, the `year` column will not have any null values.  Now let's go through the syntax.

The `version: 2` line indicates the version of the yaml file format, not the version of DBT itself.  Then, because we can declare tests for multiple models within the same file, we add the `models` key.  Under it, we give the `name` of the model, then under `columns` the name of each column we want to test, and finally the `tests` for that column.

### YAML Syntax

Before moving on, it's worth making sure we understand some of the yaml syntax.  Starting with `models`, notice that we indent two spaces on the next line.  This indicates that what follows belongs to `models`, here the model `dim_films`.  The key is plural because its value is a list of models, and each element of a list is preceded by a `-`.  So, for example, if we wanted to add tests for another model, our yaml would look like the following:

```yaml
version: 2

models: 
  - name: dim_films
    columns:
      - name: year
        tests:
          - not_null
  - name: dim_categories
```
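
One way to check our reading of the indentation is to write the same data in yaml's inline style, where a list is wrapped in `[...]` and a set of keys is wrapped in `{...}`.  The sketch below should be equivalent to the file above.  It is only meant to expose the structure, since DBT projects conventionally use the indented style.

```yaml
version: 2

# `models` holds a list with two elements.
# The first model has a `name` and a list of `columns`;
# the second model only has a `name` so far.
models:
  [
    {name: dim_films, columns: [{name: year, tests: [not_null]}]},
    {name: dim_categories}
  ]
```

Reading it this way, each `-` in the indented version starts a new element of a list, and the keys indented beneath it belong to that element.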

Getting the syntax correct can be pretty tricky, but luckily DBT will give us some hints if we get it wrong.  For example, let's purposely make a mistake by not indenting the `columns` key.  If we run `dbt test`, we'll see the following compilation error:

> <img src="./dbt_compile_error.png" width="40%">

Here, our tests are not even run because DBT is unable to interpret our yaml.  Notice that DBT points to two possible locations for the error: the `- name: dim_films` line, or perhaps the `columns` key directly after it.  If you're unable to spot which one is the problem, it's best to remove some of the yaml until it compiles, and then add it back piece by piece.  For example, the next step might be to update the yaml to the following:

```yaml
version: 2

models: 
  - name: dim_films
```

Then we can run `dbt test` again to see if our yaml compiles.  If it does, we can keep adding onto our code, this time by specifying the `columns` key.

```yaml
version: 2

models: 
  - name: dim_films
    columns: 
```

And if we run `dbt test`, we get another error.

> <img src="./invalid_cols.png" width="30%">

This time, DBT is telling us `Invalid models config given`.  And if we look at the bottom, it says that under `columns`, `None is not a type of array`.  In other words, DBT expects `columns` to hold a list, and we left it empty.  Let's add back in our `year` column.

```yaml
version: 2

models: 
  - name: dim_films
    columns:
      - name: year
```

And now, our code will at least compile.

### Creating a test

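* We can name the test file anything and place it anywhere under our models directory, but the recommended practice is to store it close to the models it tests.

* We do need to materialize the model (for example, with `dbt run`) before we can test it, since the tests query the table or view in the database.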

<img src="./dbt_test.png" width="100%">

Then we run `dbt test` to run the tests.

If we click into the compiled SQL for a test, we can see the query DBT uses to confirm that the column is not null.

* We can then look at the values that the query returns.

<img src="./update_tests.png" width="60%">

Then we can view the errors with a query like `select * from validation_errors`.

* Also see the `relationships` test that DBT provides.

### Data Tests

* Data tests are custom tests that we write as SQL queries.  A data test passes if its query returns zero rows, and fails if it returns one or more rows.

> <img src="./data_test.png" width="60%">

> <img src="./having_not.png" width="60%">

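Then we can run `dbt test --data` to run only the data tests, or `dbt test` to run everything.  In newer DBT releases the `--data` flag was replaced by a selector, so the equivalent command is `dbt test --select test_type:singular`.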
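
Data tests are plain `.sql` files, and DBT finds them through the `test-paths` setting in `dbt_project.yml`.  A minimal sketch of that setting, assuming the project keeps its data tests in a folder called `tests` (the folder name is an assumption, so check the value in your own project file):

```yaml
# dbt_project.yml (fragment)
# `test-paths` lists the folders where DBT looks for data tests.
# The folder name `tests` is an assumption; use whatever your project defines.
test-paths: ["tests"]
```

Any `.sql` file placed in one of those folders is treated as a data test, following the zero-rows-means-pass rule described above.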