# 11. Data validation and verification

#### <u>Data validation</u>

<b>Data validation</b> is the process of checking whether the data provided as input to a program conforms to the data requirements. (Important definition)

There are 4 common types of data validation check:

<table>
<tr>
    <th>Type</th>
    <th>Definition</th>
</tr>
<tr>
    <td>Range check</td>
    <td>Check that limits the input to a particular range of values</td>
</tr>
<tr>
    <td>Format check</td>
    <td>Check that ensures that an input matches the required data type with a given format</td>
</tr>
<tr>
    <td>Length check</td>
    <td>Check that limits an input to a certain length or range of lengths</td>
</tr>
<tr>
    <td>Presence check</td>
    <td>Check that a required input is supplied</td>
</tr>
</table>

#### <u>Range check</u>

It is a check that limits an input to a particular range of values.

Here is an example of code that takes in a test score as an integer between 0 and 100 inclusive and prints out the grade.

In [6]:
score = None
grade = ''

# Keep asking until a score within range is entered
while score is None:
    score = int(input("Enter score: "))

    # Range check
    if score < 0 or score > 100:
        print("The score entered is out of range. It should be between 0 and 100 inclusive.")
        score = None

# Convert the validated score to a grade
if score >= 70:
    grade = 'Distinction'
elif score >= 60:
    grade = 'Merit'
elif score >= 50:
    grade = 'Pass'
else:
    grade = 'Fail'

print("Grade: " + grade)

Grade: Fail


#### <u>Format check</u>

It is a check that ensures that an input matches a required data type with a given format.

For instance, a form may require a date to be entered in DD/MM/YYYY format. A format check confirms that the input consists of a two-digit day, a two-digit month and a four-digit year, separated by slashes. Checking that the month is between 01 and 12, and that the day is between 01 and 28, 29, 30 or 31 (depending on the month and year), would fall under the range check described above.
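
The DD/MM/YYYY format check described above could be sketched with a regular expression, as shown below. The names `DATE_PATTERN` and `has_date_format` are made up for this illustration, and the sketch only checks the format. It does not check whether the day and month are within range.

```python
import re

# Two digits, a slash, two digits, a slash, four digits.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
DATE_PATTERN = re.compile(r"[0-9]{2}/[0-9]{2}/[0-9]{4}")

def has_date_format(text):
    """Return True if text looks like DD/MM/YYYY (format only, no range check)."""
    return DATE_PATTERN.fullmatch(text) is not None

print(has_date_format("05/11/2024"))  # True
print(has_date_format("5/11/2024"))   # False: the day has only one digit
print(has_date_format("99/99/2024"))  # True: the format is right, a range check would reject it
```

The last call shows why both checks are needed: the input has the right format but is still not a real date.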

Consider the earlier code that converts a test score to a grade. If someone enters characters other than digits, the call to int() fails and the program crashes. As such, we need a format check to handle such inputs.

In [7]:
score = None
grade = ''

# Keep asking until a valid score is entered
while score is None:
    score = input("Enter score: ")

    # Format check: the input must consist of digits only
    if not score.isdigit():
        print("You have entered an invalid score. It must be an integer between 0 and 100 inclusive.")
        score = None
        continue

    # Safe to convert now that the format has been checked
    score = int(score)

    # Range check
    if score < 0 or score > 100:
        print("The score entered is out of range. It should be between 0 and 100 inclusive.")
        score = None

# Convert the validated score to a grade
if score >= 70:
    grade = 'Distinction'
elif score >= 60:
    grade = 'Merit'
elif score >= 50:
    grade = 'Pass'
else:
    grade = 'Fail'

print("Grade: " + grade)

You have entered an invalid score. It must be an integer between 0 and 100 inclusive.
Grade: Distinction
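
An alternative way to do the format check is to attempt the conversion and catch the error, instead of inspecting the characters first. The sketch below assumes a hypothetical helper named `parse_score`. It is not part of the original notes. One reason to consider this approach is that `str.isdigit()` rejects input with surrounding spaces and accepts a few characters, such as the superscript `²`, that `int()` cannot convert.

```python
def parse_score(text):
    """Return the score as an int, or None if text is not a valid score."""
    # Format check: let int() decide whether the text is an integer
    try:
        score = int(text)  # int() ignores surrounding spaces
    except ValueError:
        return None

    # Range check
    if score < 0 or score > 100:
        return None
    return score

print(parse_score("85"))    # 85
print(parse_score(" 42 "))  # 42
print(parse_score("abc"))   # None
print(parse_score("101"))   # None
```

Because the helper returns None for any invalid input, it could be called inside the same kind of while loop used in the cell above.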


#### <u>Length check</u>

It is a check that limits an input to a certain length or range of lengths.

Example: A password input that checks whether the password entered is at least 8 characters long.

In [8]:
password = None

while password is None:
    password = input("Create a password: ")

    # Length check
    if len(password) < 8:
        print("Your password needs to be at least 8 characters long.")
        password = None

print("Password accepted.")

Your password needs to be at least 8 characters long.
Password accepted.


#### <u>Presence check</u>

It is a check that ensures that a required input is supplied.

Example: For the password input above, we will display an error message if no password is supplied.

In [None]:
password = None

while password is None:
    password = input("Create a password: ")

    # Presence check
    if password == '':
        print("You have not entered anything.")
        password = None
        
    # Length check
    elif len(password) < 8:
        print("Your password needs to be at least 8 characters long.")
        password = None

print("Password accepted.")
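
The presence and length checks can also be gathered into one function that returns an error message, which makes them easy to reuse and test without typing input. This is only a sketch: the function name `check_password` and the upper limit `max_length=64` are assumptions added here to illustrate a range of lengths. They do not come from the notes.

```python
def check_password(password, min_length=8, max_length=64):
    """Return an error message, or None if the password passes both checks.

    max_length=64 is an illustrative upper limit, not a value from the notes.
    """
    # Presence check
    if password == "":
        return "You have not entered anything."

    # Length check (a range of lengths)
    if len(password) < min_length:
        return f"Your password needs to be at least {min_length} characters long."
    if len(password) > max_length:
        return f"Your password needs to be at most {max_length} characters long."

    return None

print(check_password(""))              # You have not entered anything.
print(check_password("abc"))           # Your password needs to be at least 8 characters long.
print(check_password("correcthorse"))  # None
```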

#### <u>Data verification</u>

<b>Data verification</b> is the process of confirming that the data entered is what was intended to be entered. (Important definition)

There are 3 common methods of data verification:

<table>
<tr>
    <th>Type</th>
    <th>Definition</th>
</tr>
<tr>
    <td>Check digit</td>
    <td>A digit, calculated from the rest of the data using a formula, that allows errors in the data to be detected</td>
</tr>
<tr>
    <td>Double entry</td>
    <td>A process that asks the user to enter the data twice, so that typos or mismatches between the two entries can be detected</td>
</tr>
<tr>
    <td>Proofreading data</td>
    <td>Example 1: Asking the user to do a visual check of the data before submitting<br>
    Example 2: Comparing the entered data with existing data in a database</td>
</tr>
</table>
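
The first two methods in the table can be sketched in code. The notes do not name a specific check digit formula, so the example below uses the ISBN-13 scheme purely as an illustration: the first 12 digits are weighted alternately by 1 and 3, and the check digit brings the total up to a multiple of 10. All function names here are hypothetical.

```python
def isbn13_check_digit(first12):
    """Compute the ISBN-13 check digit from the first 12 digits (given as a string)."""
    total = 0
    for i, ch in enumerate(first12):
        weight = 1 if i % 2 == 0 else 3  # weights alternate 1, 3, 1, 3, ...
        total += int(ch) * weight
    return (10 - total % 10) % 10

def is_valid_isbn13(isbn):
    """Check digit verification: the last digit must match the formula."""
    if len(isbn) != 13 or not isbn.isdecimal():
        return False
    return isbn13_check_digit(isbn[:12]) == int(isbn[12])

def entries_match(first_entry, second_entry):
    """Double entry verification: both entries must be identical."""
    return first_entry == second_entry

print(isbn13_check_digit("978030640615"))  # 7
print(is_valid_isbn13("9780306406157"))    # True
print(is_valid_isbn13("9780306406158"))    # False: the last digit was mistyped
print(entries_match("hunter22", "hunter22"))  # True
print(entries_match("hunter22", "hunter2"))   # False: the two entries differ
```

In an interactive program, `entries_match` would typically be given the results of two separate `input()` calls, for example when asking a user to type a new password twice.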