# Learning Objectives

- [ ]  2.4.1 Explain the difference between data validation and data verification.
- [ ]  2.4.2 Understand data validation techniques such as:
    - range check
    - format check
    - length check
    - presence check
    - check digit
- [ ]  2.4.3 Identify, explain and correct syntax, logic and runtime errors.
- [ ]  2.4.4 Design appropriate test cases using normal, abnormal and extreme data for testing and debugging programs.


# D30.1 Program Testing

To ensure that the code we write functions correctly under all expected circumstances, we must test it. Testing is especially pertinent to logic and arithmetic errors, and is also applicable to static semantic errors. It typically has no bearing on syntax errors, since those are detected by the interpreter/compiler.

In order to ensure that the program under scrutiny functions as intended, appropriate test cases must be specified. Test cases fall into 3 categories:

- Normal (valid) cases. Normal data values correspond to data that would normally be input into the program. With such test cases, the program should accept the test case, process it, and output a result that is checked to ensure that it is the same as the expected result.
- Boundary (extreme) cases. Boundary data values correspond to values that are chosen to be the absolute limits of the normal range. Extreme values are used in testing to make sure that all normal values will be accepted and processed correctly.
- Abnormal (erroneous) cases. Abnormal data values correspond to values that should not normally be accepted by the system - the values are invalid. The program should reject any abnormal values. Abnormal values are used in testing to ensure that invalid data does not cause a runtime error.
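
As a minimal sketch of the three categories, suppose the code under test is a function that accepts whole-number marks from 0 to 100 inclusive. The function name `is_valid_mark` and the range 0 to 100 are assumptions made purely for illustration.

```python
def is_valid_mark(mark):
    """Hypothetical function under test: accept integer marks from 0 to 100."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(mark, bool) or not isinstance(mark, int):
        return False
    return 0 <= mark <= 100


# Each test case pairs an input with the result we expect from the function.
test_cases = [
    # (category, input, expected result)
    ("normal", 55, True),
    ("boundary", 0, True),
    ("boundary", 100, True),
    ("abnormal", -1, False),
    ("abnormal", 101, False),
    ("abnormal", "fifty", False),
]

for category, value, expected in test_cases:
    actual = is_valid_mark(value)
    outcome = "PASS" if actual == expected else "FAIL"
    print(f"{category:9} {value!r:8} expected={expected} actual={actual} {outcome}")
```

The abnormal cases include both out-of-range numbers and a value of the wrong type, since either kind of input could otherwise slip through or cause a runtime error.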

It should be noted that testing encompasses a more complex framework for larger projects that involve many programmers. For the purposes of these notes, we will restrict testing to individual algorithms and smaller snippets of code.


# D30.2 Data Validation

**Data validation** is the process of confirming that input data conforms to the required specifications (presence, existence, accuracy, length, range, format, etc.). It is typically carried out by computerized means (checks built into the program) and performed at the time of data entry.

Validation corresponds to automated checking that ensures input data is acceptable/reasonable, i.e., it ensures that the data has the correct type and format; it does not, however, guarantee that the data is accurate.

An example of validation would be to check data strings corresponding to NRIC or car registration plate numbers.

Data validation techniques include:
- range check : check if the input data is within the acceptable range/region.
- format check : check if the characters in each position of an input match the correct layout; this applies to code numbers which may have a complex format.
- length check : check if the input data has a specified number/range of characters.
- presence check : check for empty fields.
- check digit : check for simple errors in the input of a series of digits such as a single mistyped digit or some permutations of two successive digits.
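
The techniques above can be sketched as small Python functions. This is only an illustration under stated assumptions: the function names are hypothetical, the format rule (2 uppercase letters followed by 4 digits) is invented for the example, and the check digit uses the Luhn scheme, which is one well-known scheme chosen here for illustration and is not prescribed by these notes.

```python
import string


def presence_check(text):
    """Presence check: the field must not be empty or only whitespace."""
    return text.strip() != ""


def length_check(text, min_len, max_len):
    """Length check: the number of characters must lie in [min_len, max_len]."""
    return min_len <= len(text) <= max_len


def range_check(value, low, high):
    """Range check: the value must lie in [low, high]."""
    return low <= value <= high


def format_check(code):
    """Format check for a hypothetical code: 2 uppercase letters then 4 digits."""
    return (
        len(code) == 6
        and all(ch in string.ascii_uppercase for ch in code[:2])
        and all(ch in string.digits for ch in code[2:])
    )


def luhn_check(digits):
    """Check digit (Luhn scheme, assumed for illustration): last digit is the check digit."""
    if digits == "" or not all(ch in string.digits for ch in digits):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for position, ch in enumerate(reversed(digits)):
        d = int(ch)
        if position % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(presence_check("   "))          # False
print(length_check("hello", 1, 5))    # True
print(range_check(101, 0, 100))       # False
print(format_check("AB1234"))         # True
print(luhn_check("79927398713"))      # True
print(luhn_check("79927398731"))      # False (last two digits swapped)
```

Each function returns a Boolean, so several checks can be combined with `and` when one input must pass more than one rule.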

#### Example

Write Python code to check the validity of a password entered by a user.

Validation rules:

- Password cannot be an empty string.
- At least 1 lowercase letter between `[a-z]` and 1 uppercase letter between `[A-Z]`.
- At least 1 number between `[0-9]`.
- At least 1 character from `[$#@]`.
- Minimum length 6 characters.
- Maximum length 16 characters.

In [None]:
#YOUR_CODE_HERE
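
One possible sketch of such a check is shown below. The function name `is_valid_password` is hypothetical, and the sketch assumes that characters outside the listed sets are permitted as long as every listed rule is satisfied, since the rules do not say otherwise.

```python
import string


def is_valid_password(password):
    """Return True if the password satisfies every rule listed above."""
    if password == "":                    # presence check
        return False
    if not 6 <= len(password) <= 16:      # length check
        return False
    has_lower = any(ch in string.ascii_lowercase for ch in password)
    has_upper = any(ch in string.ascii_uppercase for ch in password)
    has_digit = any(ch in string.digits for ch in password)
    has_symbol = any(ch in "$#@" for ch in password)
    return has_lower and has_upper and has_digit and has_symbol


print(is_valid_password("Abc1$x"))    # True
print(is_valid_password("Abc1$"))     # False (only 5 characters)
print(is_valid_password("abcdef1$"))  # False (no uppercase letter)
```

The two example lengths, 5 and 6, sit on either side of the minimum length, which makes them useful boundary test cases.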

#### Exercise D30.1 ISBN 

The International Standard Book Number (ISBN) is a numeric commercial book identifier which is intended to be unique. Publishers purchase ISBNs from an affiliate of the International ISBN Agency.

Write code that
- takes in a string input,
- returns `True` if the string is a valid ISBN-10 or ISBN-13 number.

Each line in the text file `ISBN_EXERCISE.TXT`, under the folder `resources`, contains an ISBN-10 or ISBN-13 number, which can be valid or invalid.

Write a program to:
- print out the valid ISBNs in the file.
- print out the number of valid ISBNs in the file.

In [None]:
# YOUR ANSWER HERE

#### Exercise D30.2 2021/NJC/PROMO/P2/Q3 (Modified)
A picture element, or a **pixel**, is one of the small squares that make up an image on a computer screen. As such, every digital image can be thought of as being made up of pixels. The following is an example of an image of size 10 by 10 pixels. The first 10 refers to the number of rows in the image and the second 10 refers to the number of columns in the image.

<center>
<img src="./img/exercise9-pixel.png"><br>
</center>

A pixel holds the information of a color, and colors are often represented as a string starting with a hash character `#` followed by 3 pairs of hexadecimal digits. We will call this the hexadecimal representation of the pixel. For example, the color white is `#FFFFFF` and the color silver is `#C0C0C0`.

##### Task 1
Write program code with the following specification:
- input a hexadecimal number as a string
- validate the input
- calculate the denary value of the hexadecimal number input
- output the denary value.

In [None]:
# YOUR ANSWER HERE

Each pair of hexadecimal digits represents the intensity of one of the 3 primary colors (red, green and blue) that make up the color of the pixel. As such, instead of the hexadecimal representation, each pixel can be represented by a list of 3 denary values giving the intensities of red, green and blue respectively, e.g., the color `#FFC080` can be represented as the list `[255,192,128]`. We will call this representation of the pixel the `RGB representation`.
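
The arithmetic behind this example can be traced with a short sketch. It takes the color `#FFC080` (digits F, F, C, 0, 8, 0) and works out each pair by place value; the variable names are illustrative only, and no input validation is attempted here.

```python
colour = "#FFC080"

# Drop the leading '#' and split the remaining six digits into three pairs.
pairs = [colour[1:3], colour[3:5], colour[5:7]]

# Each pair is a two-digit base-16 number: value = 16 * first digit + second digit.
hex_digits = "0123456789ABCDEF"
rgb = [16 * hex_digits.index(p[0]) + hex_digits.index(p[1]) for p in pairs]

print(pairs)  # ['FF', 'C0', '80']
print(rgb)    # [255, 192, 128]
```

For instance, `C0` is 16 × 12 + 0 = 192. Python's built-in `int(pair, 16)` gives the same values.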

##### Task 2
A 16 by 13 pixels image is stored in hexadecimal form in the text file `TASK3_2.CSV`, where each line represents a row in the image.
Write program code with the following specification:
- read the hexadecimal values from the file `TASK3_2.CSV` and store them in a suitable format,
- using the program code in Task 1, convert the image from its hexadecimal representation into the RGB representation,
- write the new representation into the file `MY_RGB_IMAGE.TXT`.

Conversely, pixels are sometimes given in their RGB representation instead, and graphic designers often prefer hexadecimal values.

In [None]:
# YOUR ANSWER HERE

##### Task 3
A 16 by 13 pixels image is stored in its RGB representation in the text file `TASK3_3.CSV` in the `resources` folder, where each line represents a row in the image.
Write additional program code with the following specification:
- read the RGB values from the file `TASK3_3.CSV` and store them in a suitable format,
- convert the image from its RGB representation into the hexadecimal representation,
- write the new representation into the file `MY_HEX_IMAGE.TXT`.

In [None]:
# YOUR ANSWER HERE

# D30.3 Data Verification

**Data verification** is a process in which different types of data are checked for accuracy and inconsistencies after data migration is done.

Data validation differs from data verification, which is the process of ensuring that transferred data matches the source data. Verification typically involves automated or manual (human) inspection and is performed after data has been entered into the system.

Here are some examples of data verification techniques.

## D30.3.1 During Data Entry
- *Visual Check*. The simplest form of data verification. The copied data is compared with the original data by inspecting both by eye.
- *Double Entry*. The user keys in the data value, after which the software blanks out the first entry and asks the user to key in the value a second time. The two entries are then compared to check that they match. An example of double entry verification takes place during a typical account registration process, where the email and password are typically asked for twice to ensure accuracy.
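
A minimal sketch of double entry verification is given below. The function names are hypothetical, and `read_entry` is an assumed hook (defaulting to `input`) that lets the loop be exercised without a keyboard.

```python
def double_entry_matches(first_entry, second_entry):
    """Double entry verification: accept only if both entries are identical."""
    return first_entry == second_entry


def register_email(read_entry=input):
    """Keep asking for an email twice until the two entries match.

    read_entry is a hypothetical hook (defaults to input) so that the
    function can be exercised without a keyboard.
    """
    while True:
        first = read_entry("Enter email: ")
        second = read_entry("Re-enter email: ")
        if double_entry_matches(first, second):
            return first
        print("Entries do not match. Please try again.")
```

Calling `register_email()` with no arguments prompts the user at the keyboard. Note that a match only shows the two entries agree with each other; it does not show that the email address itself is correct.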

## D30.3.2 During Data Transfer
The contents of the original data and of the copy made during transfer are compared byte by byte to confirm that they match exactly.
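
A byte-by-byte comparison of two files can be sketched as follows. The function name `files_match` and the chunk size are assumptions for illustration; the files are read in binary mode, a chunk at a time, so that large files need not be loaded into memory in full.

```python
def files_match(original_path, copy_path, chunk_size=4096):
    """Return True if both files contain exactly the same bytes."""
    with open(original_path, "rb") as original, open(copy_path, "rb") as copy:
        while True:
            a = original.read(chunk_size)
            b = copy.read(chunk_size)
            if a != b:
                # Covers both differing bytes and files of different lengths.
                return False
            if a == b"":
                # Both files ended at the same point with no mismatch.
                return True
```

Python's standard library offers a similar comparison through `filecmp.cmp(path_a, path_b, shallow=False)`.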