# Learning Objectives

- [ ]  2.4.1 Explain the difference between data validation and data verification.
- [ ]  2.4.2 Understand data validation techniques such as:
    - range check
    - format check
    - length check
    - presence check
    - check digit
- [ ]  2.4.3 Identify, explain and correct syntax, logic and runtime errors.
- [ ]  2.4.4 Design appropriate test cases using normal, abnormal and extreme data for testing and debugging programs.


# 6 Program Testing and Data Validation


# 6.1 Errors

When writing programs, it is common to encounter errors and bugs. To eliminate them, programmers rely on debugging and testing techniques, as well as commonly accepted good programming practices.

Programs may not perform as expected for a number of reasons. In general, errors made while writing a program include:

- Syntax errors
- Static semantic errors
- Logic errors
- Arithmetic errors

Some of the errors above also fall under the category of **runtime errors**, i.e. errors that occur during the execution of a program.

## 6.1.1 Syntax Error

A syntax error is a structural error in a program: the code violates the grammar, or rules, of the programming language. In general these are easy to debug, since the interpreter or compiler that is translating your code will typically output a diagnostic message that identifies where the error is and what it is. Typically, only code that is free of syntax errors can be executed.

Consider the following examples of a syntax error (in Python):


In [1]:
if A >>> B:
    return("A is greater than B")

SyntaxError: invalid syntax (<ipython-input-1-7111fb4bb4b0>, line 1)

In this case, the syntax error comes from the operator `>>>`, which does not exist in Python (`>` should have been used in place of `>>>`).

In [None]:
x = int('123'    # SyntaxError: the closing parenthesis is missing

## 6.1.2 Static Semantic Errors

Loosely speaking, a static semantic error occurs when code that is syntactically correct still fails because program statements are used improperly.

Typically, we rely on testing to determine if such errors exist, and may also rely on exception handling if any form of external input is required (e.g., data to be specified by a user or read from a file).

Consider the following example of static semantic error (in Python):

In [None]:
#Example 6.1.2.1
#The `TypeError` occurs when a function is called on a value of an inappropriate type.
s = 'my string'
s = s - 2

While the above code is syntactically correct, it will cause a runtime error (i.e., an error that causes your program to stop unexpectedly when run, for example by crashing or freezing) because it contains a static semantic error: the `-` operator cannot be applied to a string (first operand) and an integer (second operand). The subtraction operator must only be used with numeric operands (e.g., `int` and/or `float`).

In [None]:
#Example 6.1.2.2
#The `NameError` will occur if an unknown variable/object is used, i.e. using an object before creating it.
print(my_invisible_var)

In [None]:
#Example 6.1.2.3
#The `IndexError` occurs if you try to access an item by index which is outside the range of the list.
arr = [1,2,3]
arr[3]

In [None]:
#Example 6.1.2.4
#The `ValueError` occurs when a function is called on a value of the correct type, but with an inappropriate value.
#For example, when given a string, the `int()` function expects that string to represent a whole number.
x = int('a')

In [None]:
#Example 6.1.2.5
#Functions and variables of an object are collectively called `attributes`. When you try to access a non-existent attribute, e.g. due to a typo, an AttributeError will occur.
s = list(range(9))
s.sort1()

In IPython or Jupyter, you can view the full list of Error classes in Python by using the wildcard query `*Error?`

In [None]:
*Error?

## 6.1.3 Logical Errors

Logic errors are perhaps the most difficult kind of error to resolve. They allow a program to run, but produce an unintended or undesired result. They are generally difficult to debug because no error messages are output by the compiler/interpreter. To find and correct such errors, we must typically trace the code manually (this will be explained in greater detail later in Section 6.3).

Consider the following example of a semantic/logic error (in Python):


In [None]:
from math import pi
radius = 2  # example value so that the cell runs
area = pi * (radius**3)  # logic error: the exponent should be 2

The formula used here is incorrect: the area of a circle is $\pi r^2$, not $\pi r^3$. However, this cannot be determined by the compiler/interpreter, because the statement is perfectly legal code.

## 6.1.4 Arithmetic Errors

Finally, we may also encounter arithmetic errors, which arise from illegal or imprecise mathematical operations. As this type of error is not flagged when the code is translated by the compiler/interpreter, it only shows up at runtime, either as a runtime error or as an incorrect result.
 
Some examples of such errors include:

- A division by 0 error.
- An overflow error; i.e., an arithmetic expression produces a value that is too large to fit into the finite number of bits available, which may lead to wrong/negative values being stored.
- An underflow error; i.e., an arithmetic expression produces a value that is too small (too close to zero) to fit into the finite number of bits available, which may lead to zero being stored.
- Approximation errors; i.e., the precise value of an arithmetic expression may not be stored due to the limitations of the floating point representation of decimal values.

In [None]:
a = 10**0.5
print(a**2)
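The arithmetic errors listed above can be observed directly. The following is a minimal sketch, assuming standard Python floats (64-bit double precision); `math.isclose` is shown as one common way of comparing floats with a tolerance, not as the only remedy. Note that Python integers have arbitrary precision, so overflow is only demonstrated here with floats.

```python
import math

# Approximation error: 10**0.5 cannot be stored exactly,
# so squaring it does not give back exactly 10.
a = 10 ** 0.5
print(a ** 2 == 10)              # False
print(math.isclose(a ** 2, 10))  # True: compare with a tolerance instead

# Division by zero raises an exception at runtime.
try:
    1 / 0
except ZeroDivisionError as e:
    print(repr(e))

# Underflow: a result too close to zero is silently stored as 0.0.
tiny = 1e-200 * 1e-200
print(tiny)                      # 0.0

# Overflow: a float result that is too large raises OverflowError.
try:
    2.0 ** 10000
except OverflowError as e:
    print(repr(e))
```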

# 6.2 Error Handling

Exception handling is important and common in Python code. This is because Python encourages the principle of `"Ask for forgiveness, not permission"`: attempt the operation first, and handle the exception if it fails.

To handle exceptions, use a `try/except` statement.
* The `try` block contains code that might raise an exception.
* If that exception occurs, the remaining code in the `try` block will be **skipped**, and the code in the `except` block is run. 
* If no error occurs, the code in the `except` block doesn't run.

Syntax is:

>```python
>try:
>   <some_code>
>except:
>   <some_code>
>```

**Exercise:**

Use `try/except` to make sure the statement `result = 100/0` doesn't cause the program to fail.
* Print out `No division by zero` when the exception occurs.

In [None]:
#YOUR_CODE_HERE

**Exercise:**

1. Run the code below.
2. Change `x = int('a')` to `x = int('999')` and try again.

In [None]:
try:
    print('point 1')
    x = int('a')
#    x = int('999')
    print("point 2")
except ValueError:
    print("point 3")

An `except` statement without any exception specified will catch all errors. This is a bad practice as it may catch unexpected errors and hide programming mistakes. Thus, you should specify the error you're expecting in your code.

**Example:**

In [None]:
try:
    x = 10 / 0
except:
    print('An error occurred')
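For contrast, here is a sketch of the same division with the expected exception named explicitly, followed by a case where a bare `except` hides a programming mistake. The misspelt name `reslt` is a hypothetical typo, added only for illustration.

```python
# Catch only the error we expect from the division.
try:
    x = 10 / 0
except ZeroDivisionError:
    x = None
    print('Cannot divide by zero')

# A bare except also swallows unrelated mistakes.
# The NameError caused by the typo below is reported with the same
# vague message, so the real bug goes unnoticed.
try:
    result = 10 / 2
    print(reslt)          # typo: should be `result`
except:
    message = 'An error occurred'
    print(message)
```

With `except ZeroDivisionError:` in the second block instead, the `NameError` would propagate and point straight at the misspelt name.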

A `try` statement can have multiple different `except` blocks to handle different exceptions.

Multiple exceptions can also be put into a single except block using parentheses, to have the except block handle all of them.

**Exercise:**

1. Run the code below.
2. Comment/uncomment the `y` assignment statement(s) and try again to see different exceptions.

In [None]:
try:
    x = 10
#   y = int('abc')
#   y = x / 0
    y = x + "hello"
    print('No exception')
except ZeroDivisionError:
    print("Divided by zero")
except (ValueError, TypeError):
    print("ValueError or TypeError occurred")

You can print out the exception object, e.g. to a log file, to find out more information about the error.

**Exercise:**

1. Run the code below.
2. Comment/uncomment the `y` assignment statement(s) and try again to see different exceptions.

In [None]:
try:
    x = 10
#   y = int('abc')
#   y = x / 0
    y = x + "hello"
    print('No exception')
except Exception as e:
    # by default, the print function will convert the object passed into a string object
    print(str(e))
    print(repr(e))

The `traceback` module provides methods for formatting and printing exceptions and their calling stacks, which is helpful in identifying the cause of error.

**Example**

In [None]:
import traceback

try:
    x = 10
#   y = int('abc')
#   y = x / 0
    y = x + "hello"
    print('No exception')
except Exception:
    traceback.print_exc()

The `else` code block is placed after the `except` block(s) of a `try/except` statement.

If no error occurs, i.e. the `except` code block is not executed, the `else` code block will run.

The `finally` code block is placed at the bottom of a `try/except/else` statement.

Code within a `finally` block always runs, regardless of whether an exception occurs.

This is a good place to put code that always needs to run, e.g. to clean up or release resources.

**Exercise**

Comment/uncomment the statement `print(1 / 0)` and examine the printouts.

In [None]:
try:
    print("try")
    print(1 / 0)
except ZeroDivisionError:    # execute if exception
    print("except")
else:     # execute if no exception
    print("else")
finally:  # always execute
    print("finally")

#### Example
Close the file regardless of whether the file operation is successful.

In [None]:
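f = None
try:
    f = open('abc', 'wb')
    f.write('abc')    # Error occurs: a file opened in binary mode expects bytes, not str
except Exception as e:
    print(repr(e))
finally:
    print('Close file')
    # f is still None if open() itself failed, so check before closing
    if f is not None:
        f.close()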

# 6.3 Program Testing

In order to ensure that any code we write will function correctly under all possible circumstances, we must employ testing. This is especially pertinent to logic and arithmetic errors, but is also applicable to static semantic errors (though it typically has no bearing on syntax errors since testing for that is handled by the interpreter/compiler).

In order to ensure that the program under scrutiny functions as intended, appropriate test cases must be specified. Test cases fall into 3 categories:

- Normal (valid) cases
- Boundary (extreme) cases
- Abnormal (erroneous) cases

It should be noted that testing encompasses a more complex framework for larger projects that involve many programmers. For the purposes of these notes, we will restrict testing to individual algorithms and smaller snippets of code.


## 6.3.1 Normal Test Cases
Normal data values correspond to data that would normally be input into the program. With such test cases, the program should accept the test case, process it, and output a result that is checked to ensure that it is the same as the expected result.


## 6.3.2 Boundary Test Cases
Boundary data values correspond to values that are chosen to be the absolute limits of the normal range. Extreme values are used in testing to make sure that all normal values will be accepted and processed correctly.

## 6.3.3 Abnormal Test Cases
Abnormal data values correspond to values that should not normally be accepted by the system - the values are invalid. The program should reject any abnormal values. Abnormal values are used in testing to ensure that invalid data does not cause a runtime error.
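
The three categories can be illustrated with a small table of test cases. This is a sketch only: the function `is_valid_mark` and its accepted range of 0 to 100 are hypothetical, chosen purely for illustration.

```python
def is_valid_mark(mark):
    """Return True if mark is an integer from 0 to 100 inclusive (assumed range)."""
    # bool is a subclass of int in Python, so exclude it explicitly
    if not isinstance(mark, int) or isinstance(mark, bool):
        return False
    return 0 <= mark <= 100

# (category, input value, expected result)
test_cases = [
    ('normal',   55,    True),
    ('normal',   73,    True),
    ('boundary', 0,     True),
    ('boundary', 100,   True),
    ('abnormal', -1,    False),
    ('abnormal', 101,   False),
    ('abnormal', 'abc', False),
    ('abnormal', None,  False),
]

failures = 0
for category, value, expected in test_cases:
    actual = is_valid_mark(value)
    if actual == expected:
        status = 'PASS'
    else:
        status = 'FAIL'
        failures += 1
    print(f'{status}: {category:<8} input={value!r} expected={expected} actual={actual}')

print('Failures:', failures)
```

Each test case records the expected result before the code is run, so a mismatch points to an error in the program rather than in the tester's memory of what the output should be.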

# 6.4 Data Validation

**Data validation** is the process of confirming that input data conforms to the required specifications in terms of presence, length, range, format, etc. It is typically carried out by automated means (i.e. by the program itself) at the time of data entry.

Validation is automated checking that ensures input data is acceptable/reasonable, i.e. that the data has the correct type and format. It does not, however, guarantee that the data is accurate.

An example of validation would be to check data strings corresponding to NRIC or car registration plate numbers.

Data validation techniques include:
- range check : check if the input data is within the acceptable range/region.
- format check : check if the characters in each position of an input match the correct layout; this applies to code numbers which may have a complex format.
- length check : check if the input data has a specified number/range of characters.
- presence check : check for empty fields.
- check digit : check for simple errors in the input of a series of digits such as a single mistyped digit or some permutations of two successive digits.
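
The techniques above could be sketched as small Python functions. All function names and the layout used in `format_check` (3 uppercase letters followed by 4 digits) are assumptions made for illustration. For the check digit, the Luhn algorithm is used as one well-known scheme; these notes do not prescribe a particular one.

```python
import re

def presence_check(value):
    """Presence check: the field must not be empty or only whitespace."""
    return value.strip() != ''

def length_check(value, min_len, max_len):
    """Length check: the number of characters must be between min_len and max_len."""
    return min_len <= len(value) <= max_len

def range_check(value, low, high):
    """Range check: a numeric value must be between low and high inclusive."""
    return low <= value <= high

def format_check(value):
    """Format check for an assumed layout: 3 uppercase letters then 4 digits."""
    return re.fullmatch(r'[A-Z]{3}[0-9]{4}', value) is not None

def luhn_check(digits):
    """Check digit: validate a string of digits using the Luhn algorithm."""
    if re.fullmatch(r'[0-9]+', digits) is None:
        return False
    total = 0
    # Start from the rightmost digit and double every second digit.
    for position, char in enumerate(reversed(digits)):
        d = int(char)
        if position % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(presence_check('   '))          # False
print(length_check('hello', 6, 16))   # False
print(range_check(101, 0, 100))       # False
print(format_check('ABC1234'))        # True
print(luhn_check('79927398713'))      # True
print(luhn_check('79927398710'))      # False (last digit mistyped)
print(luhn_check('79927398731'))      # False (last two digits swapped)
```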

Data validation differs from **data verification**, which is the process of ensuring that transferred data matches the source data. Verification typically involves automated or manual (human) inspection and is performed after the data has been entered into the system.

An example of verification takes place during a typical account registration process, whereby the inputs of email and password are typically asked twice to ensure accuracy.
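
The double-entry approach in this registration example might look like the sketch below. The function names `entries_match` and `verify_registration` are hypothetical. Note that the code only compares the two entries with each other; it does not check whether the email or password is well formed, which would be validation.

```python
def entries_match(first_entry, second_entry):
    """Double-entry verification: both entries must be identical."""
    return first_entry == second_entry

def verify_registration(email, email_again, password, password_again):
    """Return the names of the fields whose two entries do not match."""
    mismatched = []
    if not entries_match(email, email_again):
        mismatched.append('email')
    if not entries_match(password, password_again):
        mismatched.append('password')
    return mismatched

# Both entries agree, so nothing is reported.
print(verify_registration('a@example.com', 'a@example.com', 'Secr3t$', 'Secr3t$'))  # []

# The second email entry contains a typing mistake.
print(verify_registration('a@example.com', 'a@exmaple.com', 'Secr3t$', 'Secr3t$'))  # ['email']
```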

## Example

Write Python code to check the validity of a password entered by a user.

Validation rules:

- Password cannot be an empty string.
- At least 1 lowercase letter from `[a-z]` and 1 uppercase letter from `[A-Z]`.
- At least 1 digit from `[0-9]`.
- At least 1 character from `[$#@]`.
- Minimum length 6 characters.
- Maximum length 16 characters.
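
One possible solution is sketched below. The function name `validate_password` and the wording of the messages are assumptions; returning a list of violated rules (rather than a single `True`/`False`) is a design choice that lets the program tell the user everything that needs fixing in one go.

```python
def validate_password(password):
    """Return a list of rule violations; an empty list means the password is valid."""
    errors = []

    # Presence check
    if password == '':
        errors.append('Password cannot be empty.')
        return errors    # the remaining checks are meaningless for an empty string

    # Format checks: required kinds of characters
    if not any('a' <= c <= 'z' for c in password):
        errors.append('Needs at least 1 lowercase letter [a-z].')
    if not any('A' <= c <= 'Z' for c in password):
        errors.append('Needs at least 1 uppercase letter [A-Z].')
    if not any('0' <= c <= '9' for c in password):
        errors.append('Needs at least 1 digit [0-9].')
    if not any(c in '$#@' for c in password):
        errors.append('Needs at least 1 character from [$#@].')

    # Length checks
    if len(password) < 6:
        errors.append('Minimum length is 6 characters.')
    if len(password) > 16:
        errors.append('Maximum length is 16 characters.')

    return errors

print(validate_password('Abc$12'))   # []
print(validate_password(''))         # ['Password cannot be empty.']
for problem in validate_password('abc'):
    print(problem)
```

In a notebook, the password could be read with `input()` and passed to `validate_password` in a loop until the returned list is empty.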