# Unit Testing for Data Science In Python

1. Unit testing basics
2. Intermediate unit testing
3. Test Organization and Execution
4. Testing Models, Plots and Much More


<br/>
<br/>

## 1. Unit Testing Basics

- TODO
    - Basic test execution with pytest
    - Understanding the test results

<br/>
<br/>


* Life cycle of a function  

![life_cycle_of_a_function](https://raw.githubusercontent.com/SSinyu/PlayGround/master/fig/life_cycle_of_a_function.PNG)  

Example) The area and price of each house are recorded as a tab-separated string  

|area  price|
|----|
|"2,081\t314,942\n"|
|"1,059\t186,606\n"|
|"\t293,410\n"|
|"1,463238,765\n"|  

<br/>  

`row_to_list`  
-> `row_to_list("2,081\t314,942\n")`    return: ["2,081", "314,942"]  
-> `row_to_list("\t293,410\n")`         return: None

In [1]:
from preprocessing_helpers import row_to_list

print(row_to_list("2,081\t314,942\n"))
print(row_to_list("1,059\t186,606\n"))
print(row_to_list("\t293,410\n"))
print(row_to_list("1,463238,765\n"))

['2,081', '314,942']
['1,059', '186,606']
['', '293,410']
None
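
The output above shows that the current `row_to_list` returns `['', '293,410']` for a row with a missing area, although the intended result is `None`. The real `preprocessing_helpers.row_to_list` is not shown in these notes, so the following is only a minimal sketch of an implementation that would match the intended behavior; the body is an assumption.

```python
def row_to_list(row):
    """Hypothetical sketch: return [area, price] for a valid row, else None."""
    # Drop the trailing newline, then split on the tab separator
    parts = row.rstrip("\n").split("\t")
    # A valid row has exactly two non-empty fields
    if len(parts) != 2 or "" in parts:
        return None
    return parts


print(row_to_list("2,081\t314,942\n"))   # ['2,081', '314,942']
print(row_to_list("\t293,410\n"))        # None (missing area)
print(row_to_list("1,463238,765\n"))     # None (missing tab)
```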


* Running unit tests with pytest

In [2]:
%%writefile test_row_to_list.py
import pytest
from row_to_list import row_to_list

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == ["2,081", "314,942"]

def test_for_missing_area():
    assert row_to_list("\t293,410\n") is None

def test_for_missing_tab():
    assert row_to_list("1,463238,765\n") is None

Overwriting test_row_to_list.py


In [3]:
!pytest test_row_to_list.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 3 items

test_row_to_list.py .F.                                                  [100%]

____________________________ test_for_missing_area _____________________________

    def test_for_missing_area():
>       assert row_to_list("\t293,410\n") is None
E       AssertionError: assert ['', '293,410'] is None
E        +  where ['', '293,410'] = row_to_list('\t293,410\n')

test_row_to_list.py:8: AssertionError
FAILED test_row_to_list.py::test_for_missing_area - AssertionError: assert ['', '293,41

* Reading a unit test script is a quick way to understand what a function does

In [4]:
!cat test_mystery_function.py

import numpy as np
import pytest

from mystery_function import mystery_function

def test_on_clean_data():
    assert np.array_equal(mystery_function("example_clean_data.txt", num_columns=2), np.array([[2081.0, 314942.0], [1059.0, 186606.0]]))


<br/>
<br/>
<br/>
<br/>
  
## 2. Intermediate unit testing

- TODO
    - Additional pytest features
    - Deciding how much to test
    - Test Driven Development (TDD)

<br/>
<br/>

* `assert {boolean_expression}, {message}`

In [5]:
assert 1 == 2, "One is not equal to two"

AssertionError: One is not equal to two

In [6]:
%%writefile test_row_to_list.py
import pytest
from row_to_list import row_to_list

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == ["2,081", "314,942"]

def test_for_missing_area():
    assert row_to_list("\t293,410\n") is None # actually -> ["", "293,410"]

def test_for_missing_tab():
    assert row_to_list("1,463238,765\n") is None

Overwriting test_row_to_list.py


In [7]:
!pytest test_row_to_list.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 3 items

test_row_to_list.py .F.                                                  [100%]

____________________________ test_for_missing_area _____________________________

    def test_for_missing_area():
>       assert row_to_list("\t293,410\n") is None # actually -> ["", "293,410"]
E       AssertionError: assert ['', '293,410'] is None
E        +  where ['', '293,410'] = row_to_list('\t293,410\n')

test_row_to_list.py:8: AssertionError
FAILED test_row_to_list.py::test_for_missing_

In [8]:
%%writefile test_row_to_list.py
import pytest
from row_to_list import row_to_list

test_for_missiong_area_error_message = "row_to_list('\\t293,410\\n') returned ['', '293,410'] instead of None"

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == ["2,081", "314,942"]

def test_for_missing_area():
    assert row_to_list("\t293,410\n") is None, test_for_missiong_area_error_message

def test_for_missing_tab():
    assert row_to_list("1,463238,765\n") is None

Overwriting test_row_to_list.py


In [9]:
!pytest test_row_to_list.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 3 items

test_row_to_list.py .F.                                                  [100%]

____________________________ test_for_missing_area _____________________________

    def test_for_missing_area():
>       assert row_to_list("\t293,410\n") is None, test_for_missiong_area_error_message
E       AssertionError: row_to_list('\t293,410\n') returned ['', '293,410'] instead of None
E       assert ['', '293,410'] is None
E        +  where ['', '293,410'] = row_to_list('\t293,410\n')

test_row_to_list.py

* Python represents real numbers as floating point, so exact equality checks can fail

In [10]:
print(.1 + .1 + .1 == .3)
print(.1 + .1 + .1)

False
0.30000000000000004


In [11]:
assert .1 + .1 + .1 == .3

AssertionError: 

In [12]:
import pytest

assert .1 + .1 + .1 == pytest.approx(.3)

* `pytest.approx` works with several data types

In [13]:
import numpy as np

assert (0.1 + 0.2, 0.2 + 0.4) == pytest.approx((0.3, 0.6))
assert {'a': 0.1 + 0.2, 'b': 0.2 + 0.4} == pytest.approx({'a': 0.3, 'b': 0.6})
assert np.array([0.1, 0.2]) + np.array([0.2, 0.4]) == pytest.approx(np.array([0.3, 0.6])) 

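* By default `pytest.approx` uses a relative tolerance of 1e-6, so values are compared to roughly six significant digits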

In [14]:
assert pytest.approx(6.1234567) == 6.123456
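
The default tolerance can be overridden when a looser or stricter comparison is needed. The sketch below uses the `rel` and `abs` keyword arguments of `pytest.approx`; the numbers are illustrative only.

```python
import pytest

# Default tolerance is relative (1e-6 of the expected value)
assert 6.1234567 == pytest.approx(6.123456)

# A difference in the third decimal place is outside the default tolerance
assert 6.12 != pytest.approx(6.123456)

# An absolute tolerance can be given explicitly
assert 6.12 == pytest.approx(6.123456, abs=1e-2)

# Or a looser relative tolerance
assert 100.5 == pytest.approx(100.0, rel=1e-2)
```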

* Using multiple assert statements in one test

In [15]:
from train import split_into_training_and_testing_sets

def test_on_six_rows():
    example_argument = np.array([
        [2081.0, 314942.0], 
        [1059.0, 186606.0],
        [1148.0, 206186.0], 
        [1506.0, 248419.0],
        [1210.0, 214114.0], 
        [1697.0, 277794.0]
    ])
    
    expected_training_array_num_rows = 4
    expected_testing_array_num_rows = 2
    
    actual = split_into_training_and_testing_sets(example_argument) 
    # Returns 2-tuple of arrays (training_set, testing_set)
    # Training set contains 75% randomly selected rows of array
    
    assert actual[0].shape[0] == expected_training_array_num_rows, \
            "The actual number of rows in the training array is not {}".format(expected_training_array_num_rows)
    
    assert actual[1].shape[0] == expected_testing_array_num_rows, \
            "The actual number of rows in the testing array is not {}".format(expected_testing_array_num_rows)

* When a specific exception is expected, use `pytest.raises()`

In [16]:
example_argument = np.array([2081, 314942, 1059, 186606, 1148, 206186])

split_into_training_and_testing_sets(example_argument)

ValueError: Argument data_array must be two dimensional. Got 1 dimensional array instead!

In [17]:
example_argument = np.array([2081, 314942, 1059, 186606, 1148, 206186])

with pytest.raises(ValueError):
    split_into_training_and_testing_sets(example_argument)

In [18]:
with pytest.raises(ValueError):
    pass

Failed: DID NOT RAISE <class 'ValueError'>

In [19]:
%%writefile test_valueerror_on_one_dimensional_argument.py

import pytest
import numpy as np
from train import split_into_training_and_testing_sets

def test_valueerror_on_one_dimensional_argument():
    example_argument = np.array([2081, 314942, 1059, 186606, 1148, 206186])
    
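    with pytest.raises(ValueError) as exception_info:
        split_into_training_and_testing_sets(example_argument)

    # Check the message after the with block: statements placed after the
    # raising call inside the block are never executed
    assert exception_info.match(
        "Argument data_array must be two dimensional. "
        "Got 1 dimensional array instead!"
    )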

Overwriting test_valueerror_on_one_dimensional_argument.py


In [20]:
!pytest test_valueerror_on_one_dimensional_argument.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 1 item

test_valueerror_on_one_dimensional_argument.py .                         [100%]
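
One detail worth noting with `pytest.raises`: statements placed after the raising call inside the `with` block never run, so checks on the exception must come after the block. The sketch below demonstrates this and also shows the `match=` argument, which searches the exception message with a regular expression. `check_two_dimensional` is a hypothetical stand-in for the dimension check inside the real function.

```python
import pytest


def check_two_dimensional(ndim):
    """Hypothetical stand-in for the dimension check inside the real function."""
    if ndim != 2:
        raise ValueError(
            "Argument data_array must be two dimensional. "
            "Got {0} dimensional array instead!".format(ndim)
        )


def test_valueerror_message():
    reached = []
    with pytest.raises(ValueError, match="must be two dimensional") as exception_info:
        check_two_dimensional(1)
        reached.append("after raise")  # never executed: the exception leaves the block first
    # Assertions about the exception belong here, outside the with block
    assert reached == []
    assert "Got 1 dimensional" in str(exception_info.value)


test_valueerror_message()
```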



* How many cases should be tested?

Example) Split the input data into training and testing sets at a 3:1 ratio
```
example_argument = np.array([
        [2081.0, 314942.0], 
        [1059.0, 186606.0],
        [1148.0, 206186.0], 
        [1506.0, 248419.0],
        [1210.0, 214114.0], 
        [1697.0, 277794.0]
    ])

train, test = split_into_training_and_testing_sets(example_argument)  
```

|n of input rows|n of train rows|n of test rows|
|:----|:----|:----|
|8|6|2|
|10|7|3|
|23|17|6|
|...|...|...|
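
The row counts in the table are consistent with taking `int(0.75 * n)` rows for training. The real `split_into_training_and_testing_sets` is not shown in these notes, so the sketch below is an assumption throughout: the body, the `seed` parameter, and the wording of the second error message are all hypothetical.

```python
import numpy as np


def split_into_training_and_testing_sets(data_array, seed=None):
    """Hypothetical sketch: random 75% / 25% row split of a 2-D array."""
    if data_array.ndim != 2:
        raise ValueError(
            "Argument data_array must be two dimensional. "
            "Got {0} dimensional array instead!".format(data_array.ndim)
        )
    num_rows = data_array.shape[0]
    if num_rows < 2:
        # Assumed message: a single row cannot be split into two sets
        raise ValueError("Argument data_array must have at least 2 rows")
    num_training = int(0.75 * num_rows)
    # Shuffle row indices, then cut them at the training size
    permuted = np.random.default_rng(seed).permutation(num_rows)
    return data_array[permuted[:num_training]], data_array[permuted[num_training:]]


for n in (8, 10, 23):
    data = np.arange(2 * n, dtype=float).reshape(n, 2)
    train, test = split_into_training_and_testing_sets(data, seed=0)
    print(n, train.shape[0], test.shape[0])
```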

<br/>

* Test three types of arguments
    * Bad arguments
    * Special arguments
    * Normal arguments

In [21]:
# Bad arguments
example_argument1 = np.array([2081, 314942, 1059, 186606, 1148, 206186])
with pytest.raises(ValueError):
    split_into_training_and_testing_sets(example_argument1)

example_argument2 = np.array([[845.0, 31036.0]])
with pytest.raises(ValueError):
    split_into_training_and_testing_sets(example_argument2)

![argument_type](https://raw.githubusercontent.com/SSinyu/PlayGround/master/fig/argument_type.PNG)

Example)  

`row_to_list("2,081\t314,942\n")`    return: ["2,081", "314,942"]  

* Boundary Value
  
![boundary_value](https://raw.githubusercontent.com/SSinyu/PlayGround/master/fig/boundary_value.PNG)

In [22]:
import pytest
from preprocessing_helpers import row_to_list

def test_on_no_tab_no_missing_value():    # (0, 0) boundary value
    # Assign actual to the return value for the argument "123\n"
    actual = row_to_list("123\n")
    assert actual is None, "Expected: None, Actual: {0}".format(actual)
    
def test_on_two_tabs_no_missing_value():    # (2, 0) boundary value
    actual = row_to_list("123\t4,567\t89\n")
    # Complete the assert statement
    assert actual is None, "Expected: None, Actual: {0}".format(actual)
    
def test_on_one_tab_with_missing_value():    # (1, 1) boundary value
    actual = row_to_list("\t4,567\n")
    # Format the failure message
    assert actual is None, "Expected: None, Actual: {0}".format(actual)
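
The three boundary tests share the same shape, so they can also be expressed as one table-driven test with `pytest.mark.parametrize`. This is a sketch: `row_to_list` here is a hypothetical stand-in with the intended behavior, and the fourth case (a normal argument) is added only for contrast.

```python
import pytest


def row_to_list(row):
    """Hypothetical stand-in with the intended behavior."""
    parts = row.rstrip("\n").split("\t")
    if len(parts) != 2 or "" in parts:
        return None
    return parts


BOUNDARY_CASES = [
    ("123\n", None),                      # (0 tabs, 0 missing values)
    ("123\t4,567\t89\n", None),           # (2 tabs, 0 missing values)
    ("\t4,567\n", None),                  # (1 tab, 1 missing value)
    ("123\t4,567\n", ["123", "4,567"]),   # (1 tab, 0 missing values): normal argument
]


@pytest.mark.parametrize("row, expected", BOUNDARY_CASES)
def test_row_to_list(row, expected):
    actual = row_to_list(row)
    assert actual == expected, "Expected: {0}, Actual: {1}".format(expected, actual)
```

When collected by pytest, each tuple in `BOUNDARY_CASES` is reported as a separate test.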

<br/>
<br/>

* Test Driven Development (TDD)
    * Write the unit tests before implementing the function  
    -> Unit tests are no longer postponed  
    -> The required behavior is defined clearly up front, which helps the implementation

<br/>  

Example) Convert a string with thousands separators (,) to an integer (`convert_to_int`)


-> `convert_to_int("2,081")`    return: 2081  


In [37]:
%%writefile test_convert_to_int.py

import pytest
from preprocessing_helpers import convert_to_int

def test_with_no_comma():
    ...

def test_with_one_comma():
    ...

def test_with_two_commas():
    ...
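
Following the TDD order, the test bodies are filled in first and the function is written afterwards to make them pass. The sketch below is one possible outcome: the test inputs and the implementation of `convert_to_int` are assumptions, including the choice to return `None` for strings whose commas are missing or misplaced.

```python
import re


def convert_to_int(integer_string_with_commas):
    """Hypothetical sketch: "2,081" -> 2081; malformed strings -> None."""
    # Accept 1-3 leading digits followed by zero or more ",ddd" groups
    if re.fullmatch(r"\d{1,3}(,\d{3})*", integer_string_with_commas) is None:
        return None
    return int(integer_string_with_commas.replace(",", ""))


def test_with_no_comma():
    assert convert_to_int("756") == 756


def test_with_one_comma():
    assert convert_to_int("2,081") == 2081


def test_with_two_commas():
    assert convert_to_int("1,034,891") == 1034891
```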

<br/>
<br/>
<br/>
<br/>
  
## 3. Test Organization and Execution

- TODO
    - Organizing unit tests systematically
    - Running unit tests selectively
    - Skipping tests for some functionality

<br/>

* Project structure example
    * Files whose names start with "test_" are recognized as test modules
  
```
    src/
    ├── data/
    │   ├── __init__.py
    │   └── preprocessing_helpers.py
    │
    ├── features/
    │   ├── __init__.py
    │   └── as_numpy.py
    │
    ├── models/
    │   ├── __init__.py
    │   └── train.py
    │
    │
    tests/
    ├── data/
    │   ├── __init__.py
    │   └── test_preprocessing_helpers.py
    │
    ├── features/
    │   ├── __init__.py
    │   └── test_as_numpy.py
    │
    └── models/
        ├── __init__.py
        └── test_train.py   

```

* Use classes to group the tests of each function
    * Classes whose names start with "Test" are recognized as test classes
    * Functions whose names start with "test_" are recognized as test functions

In [None]:
%%writefile test_preprocessing_helpers.py
import pytest
from data.preprocessing_helpers import row_to_list, convert_to_int

def test_on_no_tab_no_missing_value():
    ...

def test_on_two_tabs_no_missing_value():
    ...

def test_with_no_comma():
    ...

def test_with_one_comma():
    ...

...

# (X) Avoid: tests for different functions are mixed in one flat module

In [None]:
%%writefile test_preprocessing_helpers.py
import pytest
from data.preprocessing_helpers import row_to_list, convert_to_int

class TestRowToList:
    def test_on_no_tab_no_missing_value(self):
        ...

    def test_on_two_tabs_no_missing_value(self):
        ...

    ...

class TestConvertToInt:
    def test_with_no_comma(self):
        ...

    def test_with_one_comma(self):
        ...

    ...

# (O) Prefer: one test class per function under test

Example 1) Run only `test_preprocessing_helpers.py`

-> `pytest data/test_preprocessing_helpers.py`

```
    tests/
    ├── data/
    │   ├── __init__.py
    │   └── test_preprocessing_helpers.py
    │               ├── TestRowToList (class)
    │               │        ├── test_on_normal_argument_1 (function)
    │               │        ├── test_on_normal_argument_2 (function)
    │               │        └── ...
    │               └── TestConvertToInt (class)
    │                        └── ...
    ├── features/
    │   ├── __init__.py
    │   └── test_as_numpy.py
    │
    └── models/
        ├── __init__.py
        └── test_train.py   

```
<br/>
<br/>

Example 2) Run only `TestConvertToInt (class)` or `test_on_string_with_one_comma (function)`

-> `pytest data/test_preprocessing_helpers.py::TestConvertToInt`  
-> `pytest data/test_preprocessing_helpers.py::TestConvertToInt::test_on_string_with_one_comma`

```
    tests/
    ├── data/
    │   ├── __init__.py
    │   └── test_preprocessing_helpers.py
    │               ├── TestRowToList (class)
    │               └── TestConvertToInt (class)
    │                        ├── test_on_string_with_one_comma (function)
    │                        └── ...
    ├── features/
    │   ├── __init__.py
    │   └── test_as_numpy.py
    │
    └── models/
        ├── __init__.py
        └── test_train.py   

```
<br/>
<br/>

Example 3) Run `TestSplitIntoTrainingAndTestingSets (class)` using a keyword expression

-> `pytest -k "TestSplitIntoTrainingAndTestingSets"`  
-> `pytest -k "TestSplit"`


```
    tests/
    ├── data/
    │   ├── __init__.py
    │   └── test_preprocessing_helpers.py
    │
    ├── features/
    │   ├── __init__.py
    │   └── test_as_numpy.py
    │
    └── models/
        ├── __init__.py
        └── test_train.py   
                    └── TestSplitIntoTrainingAndTestingSets (class)

```
<br/>
<br/>

Example 4) Use a keyword expression to run everything in `TestSplitIntoTrainingAndTestingSets (class)` except `test_on_one_row`

-> `pytest -k "TestSplit and not test_on_one_row"`  


```
    tests/
    ├── data/
    │   ├── __init__.py
    │   └── test_preprocessing_helpers.py
    │
    ├── features/
    │   ├── __init__.py
    │   └── test_as_numpy.py
    │
    └── models/
        ├── __init__.py
        └── test_train.py   
                    └── TestSplitIntoTrainingAndTestingSets (class)
                                    ├── test_on_empty_row (function)
                                    ├── test_on_one_row (function)
                                    └── test_on_two_row (function)

```
<br/>
<br/>

* Marking tests that are expected to fail (e.g. when using TDD)

In [None]:
import pytest

class TestTrainModel:
    @pytest.mark.xfail
    def test_on_linear_data(self):
        ...

* Skipping tests under specific conditions (`skipif` skips the test when its condition is true)

In [None]:
import sys
import pytest

class TestConvertToInt:
    # skipif skips when the condition is True, so skip on Python older than 3.7
    @pytest.mark.skipif(sys.version_info < (3, 7), reason="requires python 3.7 or higher")
    def test_with_no_comma(self):
        ...
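
Because `skipif` skips the test when its condition evaluates to true, the condition should describe the environments in which the test cannot run. `sys.version_info` compares like a tuple, which the following sketch illustrates; `should_skip` is a hypothetical helper used only to show the comparison with fixed version tuples.

```python
import sys

# sys.version_info compares element by element, like a tuple
print(sys.version_info >= (2, 7))   # True on any Python 3 interpreter


def should_skip(version, minimum=(3, 7)):
    """Return True when `version` is older than `minimum` (hypothetical helper)."""
    return tuple(version[:2]) < minimum


print(should_skip((3, 6, 9)))    # True: 3.6 is older than 3.7
print(should_skip((3, 8, 13)))   # False: 3.8 satisfies the requirement
```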

In [23]:
%%writefile test_tmp.py
import pytest
import sys

class TestTrainModel:
    @pytest.mark.xfail(reason="not implemented")
    def test_on_linear_data(self):
        assert False


class TestConvertToInt:
    @pytest.mark.skipif(sys.version_info >= (2, 7), reason="requires python 3.7 or higher")  # always true on Python 3, so this test is always skipped
    def test_with_no_comma(self):
        pass

Overwriting test_tmp.py


In [24]:
!pytest test_tmp.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 2 items

test_tmp.py xs                                                           [100%]



<br/>
<br/>

* Showing why tests were skipped or expected to fail
    * `pytest -rs` (reasons for skipped tests)
    * `pytest -rx` (reasons for xfailed tests)

In [25]:
!pytest -rs test_tmp.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 2 items

test_tmp.py xs                                                           [100%]

SKIPPED [1] test_tmp.py:11: requires python 3.7 or higher


In [26]:
!pytest -rx test_tmp.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 2 items

test_tmp.py xs                                                           [100%]

XFAIL test_tmp.py::TestTrainModel::test_on_linear_data - not implemented


In [27]:
!pytest -rsx test_tmp.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 2 items

test_tmp.py xs                                                           [100%]

SKIPPED [1] test_tmp.py:11: requires python 3.7 or higher
XFAIL test_tmp.py::TestTrainModel::test_on_linear_data - not implemented


<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
  
## 4. Testing Models, Plots and Much More


<br/>
<br/>

* When a function takes a file as input, the test has to create a temporary file and remove it after the test  
-> use `@pytest.fixture`

In [28]:
%%writefile test_tmp.py

import os, pytest
import numpy as np
from as_numpy import get_data_as_numpy_array


@pytest.fixture
def clean_data_file():
    file_path = "clean_data_file.txt"
    with open(file_path, "w") as f:
        f.write("201\t305671\n7892\t298140\n501\t738293\n")   # setup
    yield file_path
    os.remove(file_path)                                      # teardown

    
def test_on_clean_file(clean_data_file):
    expected = np.array([
        [201.0, 305671.0], 
        [7892.0, 298140.0], 
        [501.0, 738293.0]
    ])
    actual = get_data_as_numpy_array(clean_data_file, 2)
    assert actual == pytest.approx(expected), "Expected: {0}, Actual: {1}".format(expected, actual) 

Overwriting test_tmp.py


In [29]:
!pytest test_tmp.py

platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
Matplotlib: 3.5.2
Freetype: 2.6.1
rootdir: /home/ubuntu/ssinyu/FDB
plugins: mock-3.10.0, mpl-0.16.1
collected 1 item

test_tmp.py .                                                            [100%]



<br/>

* Using the temporary directory fixture `tmpdir`
    1. setup `tmpdir`
    2. setup `tmpdir/{file}`
    3. test
    4. teardown `tmpdir/{file}`
    5. teardown `tmpdir`

In [None]:
import pytest

@pytest.fixture
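def raw_and_clean_data_file(tmpdir):
    raw_data_file_path = tmpdir.join("raw.txt")
    clean_data_file_path = tmpdir.join("clean.txt")
    with open(raw_data_file_path, "w") as f:
        # two rows, because the test that uses this fixture checks two output lines
        f.write("1,801\t201,411\n2,002\t333,209\n")
    yield raw_data_file_path, clean_data_file_path

    # no explicit teardown needed: pytest manages the temporary directory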
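
pytest also provides a `tmp_path` fixture, which hands the test a `pathlib.Path` instead of the older path object returned by `tmpdir`. The sketch below shows the same fixture written with `tmp_path`; the helper `make_raw_and_clean_paths` is a hypothetical name, split out only so that the file-writing logic can be exercised on its own.

```python
from pathlib import Path

import pytest


def make_raw_and_clean_paths(directory: Path):
    """Write a one-row raw file and return (raw_path, clean_path)."""
    raw_path = directory / "raw.txt"
    clean_path = directory / "clean.txt"   # not created here; the code under test writes it
    raw_path.write_text("1,801\t201,411\n")
    return raw_path, clean_path


@pytest.fixture
def raw_and_clean_data_file(tmp_path):
    # tmp_path is a fresh directory per test, managed by pytest
    yield make_raw_and_clean_paths(tmp_path)
```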

<br/>   

* Testing a function independently of the functions it depends on (mocking)  
ex) the output of `A()` is used inside `B()`

In [None]:
def convert_to_int(comma_separated_integer_string):
    ...
    # assume this function contains a bug
    ...

def preprocess(raw_path, clean_path):
    with open(raw_path, "r") as input_file:
        rows = input_file.readlines()
    
    with open(clean_path, "w") as output_file:
        for row in rows:
            ...
            result = convert_to_int(row)
            ...
            output_file.write()


# ==> replace the dependency with a mock that returns known-correct values
def convert_to_int_bug_free(comma_separated_integer_string):
    return_values = {
        "1,801": 1801, "201,411": 201411, "2,002": 2002, 
        "333,209": 333209, "1990": None, "782,911": 782911, 
        "1,285": 1285, "389129": None
    }
    return return_values[comma_separated_integer_string]


from unittest.mock import call
from data.preprocessing_helpers import preprocess
class TestPreprocess(object):
    def test_on_raw_data(self, raw_and_clean_data_file, mocker):
        raw_path, clean_path = raw_and_clean_data_file
        convert_to_int_mock = mocker.patch(
            "data.preprocessing_helpers.convert_to_int",
            side_effect=convert_to_int_bug_free
        )
        preprocess(raw_path, clean_path)

        with open(clean_path, "r") as f:
            lines = f.readlines()
        first_line = lines[0]
        assert first_line == "1801\t201411\n"
        second_line = lines[1]
        assert second_line == "2002\t333209\n"  
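
Besides replacing a dependency, a mock records how it was called, which is what `call` is typically used for. The sketch below is self-contained and uses `unittest.mock.patch.object` from the standard library rather than the `mocker` fixture; the `helpers` namespace, `buggy_convert_to_int`, and `preprocess_row` are hypothetical stand-ins, not the functions from these notes.

```python
from types import SimpleNamespace
from unittest.mock import call, patch


def buggy_convert_to_int(text):
    raise RuntimeError("pretend this dependency is broken")


# Hypothetical stand-in for the module that holds the dependency
helpers = SimpleNamespace(convert_to_int=buggy_convert_to_int)


def preprocess_row(row):
    """Convert one tab-separated row to integers via helpers.convert_to_int."""
    fields = row.rstrip("\n").split("\t")
    return [helpers.convert_to_int(field) for field in fields]


def test_preprocess_row_with_mock():
    known_values = {"1,801": 1801, "201,411": 201411}
    with patch.object(helpers, "convert_to_int",
                      side_effect=lambda text: known_values[text]) as convert_mock:
        assert preprocess_row("1,801\t201,411\n") == [1801, 201411]
    # The mock records every call, so the inputs it received can be checked too
    assert convert_mock.call_args_list == [call("1,801"), call("201,411")]


test_preprocess_row_with_mock()
```

With the mock in place, a failure in this test points at `preprocess_row` itself rather than at the broken dependency.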


<br/>

* Testing models
    * Feed the model data that it fits perfectly and check the output value

In [30]:
import numpy as np
import pytest
from models.train import model_test

def test_on_perfect_fit():
    test_argument = np.array([[1, 3], [2, 5], [3, 7]])
    expected = 1
    actual = model_test(test_argument, slope=2, intercept=1)
    assert actual == pytest.approx(expected), "Expected: {0}, Actual: {1}".format(expected, actual)
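
`model_test` itself is not shown in these notes. Since the expected value for perfectly fitting data is 1, it presumably returns the coefficient of determination $R^2$ of the line on the testing set. The sketch below is written under that assumption; the body is hypothetical.

```python
import numpy as np


def model_test(testing_set, slope, intercept):
    """Hypothetical sketch: R^2 of the line y = slope * x + intercept on testing_set."""
    x, y = testing_set[:, 0], testing_set[:, 1]
    predicted = slope * x + intercept
    residual_sum_of_squares = np.sum((y - predicted) ** 2)
    total_sum_of_squares = np.sum((y - np.mean(y)) ** 2)
    return 1 - residual_sum_of_squares / total_sum_of_squares


# Points that lie exactly on y = 2x + 1 give a perfect score
print(model_test(np.array([[1, 3], [2, 5], [3, 7]]), slope=2, intercept=1))   # 1.0
# A flat line through the mean explains none of the variance
print(model_test(np.array([[1, 1], [2, 2], [3, 3]]), slope=0, intercept=2))   # 0.0
```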

* Testing plots  

* Basic structure
```
    src/
    ├── data/
    ├── features/
    ├── models/
    ├── visualization/
    │        └── plots.py
    │              └── get_plot_for_best_fit_line (function)
    tests/
    ├── data/
    ├── features/
    ├── models/
    └── visualization/
             └── test_plots.py
```

ex) $y=5x-2$

In [31]:
from visualization.plots import get_plot_for_best_fit_line
import pytest
import numpy as np

class TestGetPlotForBestFitLine(object):
    @pytest.mark.mpl_image_compare
    def test_plot_for_almost_linear_data(self):
        slope = 5.0
        intercept = -2.0
        x_array = np.array([1.0, 2.0, 3.0])
        y_array = np.array([3.0, 8.0, 11.0])
        title = "Test plot for almost linear data"
        return get_plot_for_best_fit_line(slope, intercept, x_array, y_array, title)

In [None]:
!pytest -k "test_plot_for_almost_linear_data" --mpl-generate-path visualization/baseline

* Running the command above creates a PNG file named after the test function in visualization/baseline/
  
```
    src/
    ├── data/
    ├── features/
    ├── models/
    ├── visualization/
    │        └── plots.py
    │              └── get_plot_for_best_fit_line (function)
    tests/
    ├── data/
    ├── features/
    ├── models/
    └── visualization/
             ├── test_plots.py
             └── baseline/
                    └── test_plot_for_almost_linear_data.png
             
```

In [None]:
@pytest.mark.mpl_image_compare
def test_plot_for_almost_linear_data():
    ...


!pytest -k "test_plot_for_almost_linear_data" --mpl

* Regions where the baseline image and the test image differ appear in white

![plot_diff](https://raw.githubusercontent.com/SSinyu/PlayGround/master/fig/plot_diff.PNG)
