# Day 10: In class assignment: Unit and Integration Testing

### <p style='text-align: right;'> &#9989; Put your name here.</p>
<p style='text-align: right;'> &#9989; Put your group member names here.</p>

<img alt="Code review icon" src="https://static.thenounproject.com/png/101170-200.png">

Image From: https://static.thenounproject.com/

# ___Learning objectives___

At the end of the exercise, you should be able to:
- Explain the difference between unit tests and integration tests.
- Write tests for small, isolated functions.
- Collaborate to integrate components and validate the entire pipeline.

----
<a id="preclass"></a>

# Unit tests vs integration tests

In the preclass assignment we learned that unit tests make sure that **components of the code base behave correctly in isolation**.  They are best used with small, well-defined pieces of code that have controlled inputs and expected outputs.

A complementary tool is **"integration testing"**, which verifies that multiple components work together as intended. Integration tests assess how well individual pieces interact within the overall system, often using more realistic inputs and workflows.
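
To make the distinction concrete, here is a minimal sketch. The helpers `clean` and `count_words` are hypothetical and are not part of today's pipeline. The first test class checks one function on its own; the second checks that the output of one function is usable as the input of the next.

```python
import unittest


def clean(text: str) -> str:
    """Hypothetical helper: strip surrounding whitespace and lowercase."""
    return text.strip().lower()


def count_words(text: str) -> int:
    """Hypothetical helper: count whitespace-separated words."""
    return len(text.split())


class TestCleanUnit(unittest.TestCase):
    """Unit test: one function, a small controlled input, a known output."""

    def test_strips_and_lowercases(self):
        self.assertEqual(clean("  Hello World  "), "hello world")


class TestCleanThenCountIntegration(unittest.TestCase):
    """Integration test: the output of one function feeds the next."""

    def test_clean_output_feeds_count_words(self):
        self.assertEqual(count_words(clean("  Go Green, Go White!  ")), 4)
```

Saved in a file whose name starts with `test`, these classes would be picked up by `python -m unittest`.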

----
<a id="preclass"></a>

# Outline

Today you will be working in groups to **implement a simple data processing pipeline**, while assessing it using both unit tests and integration tests.  This pipeline will:
1. Load HTML text from a source.  
2. Extract candidate email addresses using regex.  
3. Filter for `msu.edu` addresses.  
4. Format a final list of MSU email handles for output.

The code that you are writing will mostly be very simple!  The goal is to gain experience writing unit tests for a project and think about how to **design tests that are efficient yet effective**.

## GitHub repository

Each step in the pipeline will be implemented by a function that is contained in its own `.py` file.  All files will be hosted in a GitHub repository.  Each function will also have a corresponding test file that follows one of the naming conventions from the preclass assignment.

## Tasks

The main tasks in the pipeline are as follows:

1. **Load HTML** 
   ```python
   def load_html(source: str) -> str:
       """
       Accepts a url (string) and returns the source html of a webpage.

       [save this in html.py]
       """
   ```
2. **Extract All Email Addresses**  
   ```python
   def extract_emails(html: str) -> list[str]:
       """
       Parses html (string) and returns a list of email addresses.

       [save this in extract.py]
       """
   ```
3. **Filter for msu.edu Addresses**  
   ```python
   def filter_edu(emails: list[str]) -> list[str]:
       """
       Filters a list of email addresses to keep only those that end in msu.edu

       [save this in edufinder.py]
       """
   ```
4. **Format handles**  
   ```python
   def combine_handles(emails: list[str]) -> str:
       """
       Combine only the handles of the email addresses into a comma separated string

       [save this in combine.py]
       """
   ```
5. **Pipeline**
   ```python
   import sys

   # Local modules, one per pipeline step.
   # Note: html.py shares its name with Python's standard-library html module
   # and takes precedence over it when the script is run from this directory.
   from html import load_html
   from extract import extract_emails
   from edufinder import filter_edu
   from combine import combine_handles


   def msu_handle_finder(source: str) -> str:
       """
       This main function executes the entire pipeline.

       [save this in finder.py]
       """
       html = load_html(source)
       emails = extract_emails(html)
       edu_emails = filter_edu(emails)
       return combine_handles(edu_emails)


   if __name__ == '__main__':
       # The url is read from the first command-line argument.
       url = sys.argv[1]
       handles = msu_handle_finder(url)

       if len(handles) > 0:
           print('MSU handles found!')
           print(handles)
       else:
           print('No MSU handles found')
   ```
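
As a rough illustration of how small steps 2 through 4 can be, here is one possible sketch. It rests on several assumptions that your team may decide differently: the regex is a deliberately simple pattern rather than a full email validator, "ends in msu.edu" is read as the exact domain `@msu.edu` (so subdomains are excluded), and handles are joined with a bare comma. `load_html` is left out because it needs network access.

```python
import re

# Assumption: a simple pattern that covers common addresses.
# It is not a complete validator for every legal email address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html: str) -> list[str]:
    """Return every substring of html that looks like an email address."""
    return EMAIL_PATTERN.findall(html)


def filter_edu(emails: list[str]) -> list[str]:
    """Keep addresses whose domain is exactly msu.edu (case-insensitive)."""
    return [email for email in emails if email.lower().endswith("@msu.edu")]


def combine_handles(emails: list[str]) -> str:
    """Join the part before the '@' of each address with commas."""
    return ",".join(email.split("@")[0] for email in emails)
```

Each choice above is something a unit test can pin down, for example whether `sparty@cmse.msu.edu` should pass the filter.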

## Team Roles

The work for each team can be split into **five main roles**:

**Role 1:** ___Team Leader___.  
- create GitHub repo  
- invite members to the repo
- oversee the writing of functions and unit tests
- write an integration test that ensures the entire pipeline works together

**Role 2:** ___The Scraper___.
- write `load_html`
- write unit tests for `extract_emails`

**Role 3:** ___The Extractor___.
- write `extract_emails`
- write unit tests for `filter_edu`

**Role 4:** ___The Filterer___.
- write `filter_edu`
- write unit tests for `combine_handles`

**Role 5:** ___The Jackal___.
- write `combine_handles`
- write unit tests for `load_html`

Assign roles for your team now.  If you have fewer than five team members, divide the extra work among yourselves.

---
<a id="unittest"></a>
# ___Step 1: Writing code___

&#9989; <font color=blue>**DO THIS:**</font> Write your assigned functions and **be sure to take note of the suggested filenames** to make integration easier at the end.

**If you are the team lead**, take this time to set up the repo and invite your team members.

---
<a id="unittest"></a>
# ___Step 2: Writing tests___

&#9989; <font color=blue>**DO THIS:**</font> Write your assigned **test functions** and **be sure to follow the naming convention chosen by your team**.  

For example, the test file for `html.py` should be either `html_test.py` or `test_html.py`.

**If you are the team lead**, take this time to write a simple integration test that passes if the pipeline runs without any errors.

---
<a id="unittest"></a>
# ___Step 3: Committing and pushing___

&#9989; <font color=blue>**DO THIS:**</font> Commit and push your code and test functions to the main repo.  **You might need to pull first**, but merging should be easy since everyone is working in separate files.

After the repo is completed, **everyone should pull a copy** to their local machines.

---
<a id="unittest"></a>
# ___Step 4: Run `unittest`___

&#9989; <font color=blue>**DO THIS:**</font> Use `python -m unittest discover` to run all of the tests in the repo.  **Even if your integration test passed**, double-check by running `finder.py` on two URLs: one that you know contains an MSU handle and one that you know does not.
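
One detail worth knowing: by default, `unittest` discovery only collects files matching the pattern `test*.py`, so files named in the `*_test.py` style are skipped unless a pattern is given. The sketch below demonstrates this with two throwaway test files (the file names are hypothetical) written to a temporary directory.

```python
import tempfile
import unittest
from pathlib import Path

# Source for a tiny test module containing a single passing test.
TEST_SOURCE = (
    "import unittest\n"
    "\n"
    "class TestExample(unittest.TestCase):\n"
    "    def test_passes(self):\n"
    "        self.assertTrue(True)\n"
)


def count_discovered(directory: str, pattern: str) -> int:
    """Count the test methods that unittest discovery finds in directory."""
    loader = unittest.TestLoader()
    suite = loader.discover(start_dir=directory, pattern=pattern)
    return suite.countTestCases()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, one per naming convention.
    Path(tmp, "test_alpha.py").write_text(TEST_SOURCE)
    Path(tmp, "beta_test.py").write_text(TEST_SOURCE)

    default_count = count_discovered(tmp, "test*.py")  # unittest's default pattern
    suffix_count = count_discovered(tmp, "*_test.py")

print(default_count, suffix_count)
```

Each pattern finds only one of the two files. On the command line, the equivalent for the suffix convention is `python -m unittest discover -p "*_test.py"`.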

🗒️ **Task:** Record the results of the testing below.

✏️ **Answer:**

---
<a id="unittest"></a>
# ___Step 5: Make edits to code and test suite___

**Did your code run perfectly the first time?**  Of course not.  

&#9989; <font color=blue>**DO THIS:**</font> As a group: 

1) Discuss what changes need to be made to the code and make a plan for executing them.
2) Discuss what changes should be made to the test functions.  Was there an error in your integration that should have been caught by a unit test?  Remember that unit tests should ensure that the inputs and outputs of your function have the proper types.  Make a plan for adding or changing tests as well.
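
As an example of the kind of test that catches integration problems early, the sketch below checks the input and output types of `combine_handles`. The function body here is only a stand-in so the tests can run, and the expectation that `None` raises a `TypeError` is an assumption your team may or may not share.

```python
import unittest


def combine_handles(emails: list[str]) -> str:
    """Stand-in implementation so the tests below can run."""
    return ",".join(email.split("@")[0] for email in emails)


class TestCombineHandlesContract(unittest.TestCase):
    """Checks the types that the next stage of the pipeline relies on."""

    def test_returns_str(self):
        self.assertIsInstance(combine_handles(["sparty@msu.edu"]), str)

    def test_empty_list_returns_empty_str(self):
        # finder.py calls len(handles), so an empty input should give ""
        # rather than None.
        self.assertEqual(combine_handles([]), "")

    def test_rejects_non_list_input(self):
        # Assumption: the team agrees that passing None is an error.
        with self.assertRaises(TypeError):
            combine_handles(None)
```

If one function returns `None` where the next expects a list, tests like these fail in the unit suite instead of surfacing later in the pipeline.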

🗒️ **Task:** Record the details of your plan below.

✏️ **Answer:**

---
<a id="unittest"></a>
# ___Step 6: Additional integration tests___

&#9989; <font color=blue>**DO THIS:**</font> Add an integration test that checks two specific URLs, each with a known number of MSU email handles.  Add it to your repository and ensure your code passes.

---
<a id="unittest"></a>
# ___Step 7: Discussion questions___

1. How does writing tests change the experience of collaboration?  Are there aspects that become easier?  Aspects that become harder?
2. If your code passes all tests today and sits unchanged, will it still pass the same tests in the future?  Why or why not?

✏️ **Answer:**

---
## (Optional) `pytest`

There are many (many!) other unit testing frameworks out there. Fortunately, most of them work nicely together.  One of the best is `pytest`.
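
For comparison with `unittest`, a `pytest`-style test is just a function whose name starts with `test_` and that uses plain `assert` statements; no test class or import of `pytest` is required. The sketch below uses a stand-in version of `filter_edu` so that it runs on its own.

```python
def filter_edu(emails: list[str]) -> list[str]:
    """Stand-in implementation so the tests below can run."""
    return [email for email in emails if email.lower().endswith("@msu.edu")]


def test_filter_edu_keeps_msu_addresses():
    assert filter_edu(["sparty@msu.edu", "bucky@wisc.edu"]) == ["sparty@msu.edu"]


def test_filter_edu_handles_empty_list():
    assert filter_edu([]) == []
```

By default `pytest` collects files named `test_*.py` or `*_test.py`, and it also runs existing `unittest.TestCase` classes, so the tests written earlier do not need to be rewritten.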

&#9989; **<font color=blue>DO THIS:</font>** Install `pytest` in your `cmse802` environment.

```python
!pip install pytest
```

&#9989; **<font color=blue>DO THIS:</font>** In your project directory, simply type `pytest` and press enter.

🗒️ **Task:** Record the output of this command below.

✏️ **Answer:**

&#9989; **<font color=blue>DO THIS:</font>** Re-run with `pytest -v`

🗒️ **Task:** Record the output of this command below.  Are there any differences in the information given by `pytest` and `unittest`?

✏️ **Answer:**

## Congratulations, you're done!

Submit this assignment by uploading your notebook to the course Desire2Learn web page.  Go to the "In-Class Assignments" folder, find the appropriate submission link, and upload everything there. Make sure your name is on it!

---

© 2024 Michigan State University. This material was created for the Department of Computational Mathematics, Science and Engineering (CMSE) at Michigan State University.