# 02_02 - Data Validation and Weighted Sum Check Digit

## Understanding Goals

At the end of this chapter, you should understand:
- What is data validation?
- What is a weighted sum check digit?

# Section 3: Data Validation

## _3.1 Data integrity, privacy and security_

- Data Privacy: a requirement for data to be available only to authorised users.
- Data Security: a requirement for data to be recoverable if lost or corrupted.
- Data Integrity: a requirement for data to be accurate and up to date.

## _3.2 Data Verification vs Data Validation_

**Data verification** is the process of checking a copy of data to make sure that it is exactly equal to the original copy of the data.

For example, a login process that matches the entered username and encrypted password against the ones stored in the database is a data verification process.

In the chapter on Networks, we will also learn about Parity Check and Checksum as data verification methods.
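As a rough illustration of verification, the sketch below compares a copy of some data against the original by comparing their SHA-256 digests. The function name `is_exact_copy` and the sample values are hypothetical and chosen only for this example. Comparing the bytes directly would work just as well when both copies are at hand.

```python
import hashlib

def is_exact_copy(original: bytes, copy: bytes) -> bool:
    """Verification: report whether the copy matches the original."""
    # Two inputs with the same SHA-256 digest are, for practical purposes, identical
    return hashlib.sha256(original).digest() == hashlib.sha256(copy).digest()

original = b"S1234567D"
print(is_exact_copy(original, b"S1234567D"))  # True
print(is_exact_copy(original, b"S1234576D"))  # False (two digits swapped)
```

Note that verification says nothing about whether the original data was sensible in the first place. That is the job of validation.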

**Data validation** is the process of making sure that data is valid (clean, correct and useful). It checks that the data is sensible and meaningful, not that it is an exact copy of some original.

<table class="table table-bordered">
    <!-- Header Row -->
    <tr>
        <th style="width:20%; text-align:left">Data Validation Checks</th>
        <th style="width:80%; text-align:left">Description</th>
    </tr>
    <tr>
        <td style="text-align:left">Presence Check</td>
        <td style="text-align:left">Check that a required field has not been left empty.</td>
    </tr>
    <tr>
        <td style="text-align:left">Type Check</td>
        <td style="text-align:left">Check if input is of the correct data type.</td>
    </tr>
    <tr>
        <td style="text-align:left">Range Check</td>
        <td style="text-align:left">Check if input is within the acceptable range/region.</td>
    </tr>
    <tr>
        <td style="text-align:left">Length Check</td>
        <td style="text-align:left">Check if input has the required number of characters.</td>
    </tr>
    <tr>
        <td style="text-align:left">Layout/Format Check</td>
        <td style="text-align:left">Check if the character at each position of an input matches the required layout. This is especially useful for code numbers with a complex layout.</td>
    </tr>
    <tr>
        <td style="text-align:left">Restricted Value Check</td>
        <td style="text-align:left">Check that the input is one of a predefined set of allowed values.</td>
    </tr>
</table>
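Each check in the table can be sketched as a small Python function. All function names and the sample inputs below are hypothetical and chosen only for illustration. The type check assumes the expected type is a whole number, and the format check assumes a layout of one capital letter, seven digits and one capital letter.

```python
import re

def presence_check(value: str) -> bool:
    # Presence check: the field must not be empty (or only spaces)
    return value.strip() != ""

def type_check(value: str) -> bool:
    # Type check: here, the input must be a whole number
    return value.isdecimal()

def range_check(value: int, low: int, high: int) -> bool:
    # Range check: the value must lie between low and high inclusive
    return low <= value <= high

def length_check(value: str, length: int) -> bool:
    # Length check: the input must have exactly `length` characters
    return len(value) == length

def format_check(value: str) -> bool:
    # Format check: capital letter, 7 digits, capital letter
    return re.fullmatch(r"[A-Z][0-9]{7}[A-Z]", value) is not None

def restricted_value_check(value: str, allowed: set) -> bool:
    # Restricted value check: the input must be one of the allowed values
    return value in allowed

print(presence_check(""))                                 # False
print(type_check("17"))                                   # True
print(range_check(17, 0, 120))                            # True
print(length_check("S1234567D", 9))                       # True
print(format_check("S12345D"))                            # False
print(restricted_value_check("S", {"S", "T", "F", "G"}))  # True
```

In practice several checks are applied to the same input one after another, for example a presence check first, then a length check, then a format check.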

## _3.3 Check Digit / Check Code_

A check digit (or check code) is a widely used data validation method. For example, NRIC numbers, vehicle registration numbers and ISBNs all use a weighted modulus computation to calculate the final check digit or letter.
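To show the general idea of a weighted modulus computation, here is a minimal sketch for the 10-digit ISBN scheme. It assumes the ISBN-10 rule: the ten characters are weighted 10 down to 1, the check character `X` stands for 10, and the weighted sum must be divisible by 11. The function name `isbn10_is_valid` is hypothetical.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Weighted modulus check for a 10-character ISBN (hyphens are ignored)."""
    chars = isbn.replace("-", "")
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            value = 10                    # 'X' means 10, allowed only as the check character
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += value * (10 - position)  # weights 10, 9, ..., 1
    return total % 11 == 0

print(isbn10_is_valid("0-306-40615-2"))  # True  (weighted sum is 132 = 12 × 11)
print(isbn10_is_valid("0-306-40615-3"))  # False (weighted sum is 133)
```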

### ~ Example ~

The last letter of the NRIC is called a check code. It is used to check whether the NRIC number is valid.

The algorithm follows the rules below. We will use the NRIC number `S1234567D` as an example:

1) Multiply each digit by its corresponding weight below and add the products together:

<table class="table table-bordered">
    <tr>
        <th style="width:10%; text-align:left">Digit</th>
        <th style="width:5%; text-align:center">1</th>
        <th style="width:5%; text-align:center">2</th>
        <th style="width:5%; text-align:center">3</th>
        <th style="width:5%; text-align:center">4</th>
        <th style="width:5%; text-align:center">5</th>
        <th style="width:5%; text-align:center">6</th>
        <th style="width:5%; text-align:center">7</th>
    </tr>
    <tr>
        <th style="text-align:left">Weight</th>
        <td style="text-align:center">2</td>
        <td style="text-align:center">7</td>
        <td style="text-align:center">6</td>
        <td style="text-align:center">5</td>
        <td style="text-align:center">4</td>
        <td style="text-align:center">3</td>
        <td style="text-align:center">2</td>
    </tr>
</table>

`1 × 2 + 2 × 7 + 3 × 6 + 4 × 5 + 5 × 4 + 6 × 3 + 7 × 2 = 106`

2) If the NRIC starts with `T` or `G`, add `4` to the total. The example starts with `S`, so the total stays at `106`.

3) Divide the total by `11` and take the remainder.

`106 % 11 = 7`

4) Look up the check code for the remainder in one of the tables below. The table to use depends on the IC type (the first letter of the NRIC):

If the IC starts with `S` or `T`:

<table class="table table-bordered">
    <tr>
        <th style="width:10%; text-align:left">Remainder</th>
        <th style="width:5%; text-align:center">0</th>
        <th style="width:5%; text-align:center">1</th>
        <th style="width:5%; text-align:center">2</th>
        <th style="width:5%; text-align:center">3</th>
        <th style="width:5%; text-align:center">4</th>
        <th style="width:5%; text-align:center">5</th>
        <th style="width:5%; text-align:center">6</th>
        <th style="width:5%; text-align:center">7</th>
        <th style="width:5%; text-align:center">8</th>
        <th style="width:5%; text-align:center">9</th>
        <th style="width:5%; text-align:center">10</th>
    </tr>
    <tr>
        <th style="text-align:left">Check Code</th>
        <td style="text-align:center">J</td>
        <td style="text-align:center">Z</td>
        <td style="text-align:center">I</td>
        <td style="text-align:center">H</td>
        <td style="text-align:center">G</td>
        <td style="text-align:center">F</td>
        <td style="text-align:center">E</td>
        <td style="text-align:center">D</td>
        <td style="text-align:center">C</td>
        <td style="text-align:center">B</td>
        <td style="text-align:center">A</td>
    </tr>
</table>

If the IC starts with `F` or `G`:  

<table class="table table-bordered">
    <tr>
        <th style="width:10%; text-align:left">Remainder</th>
        <th style="width:5%; text-align:center">0</th>
        <th style="width:5%; text-align:center">1</th>
        <th style="width:5%; text-align:center">2</th>
        <th style="width:5%; text-align:center">3</th>
        <th style="width:5%; text-align:center">4</th>
        <th style="width:5%; text-align:center">5</th>
        <th style="width:5%; text-align:center">6</th>
        <th style="width:5%; text-align:center">7</th>
        <th style="width:5%; text-align:center">8</th>
        <th style="width:5%; text-align:center">9</th>
        <th style="width:5%; text-align:center">10</th>
    </tr>
    <tr>
        <th style="text-align:left">Check Code</th>
        <td style="text-align:center">X</td>
        <td style="text-align:center">W</td>
        <td style="text-align:center">U</td>
        <td style="text-align:center">T</td>
        <td style="text-align:center">R</td>
        <td style="text-align:center">Q</td>
        <td style="text-align:center">P</td>
        <td style="text-align:center">N</td>
        <td style="text-align:center">M</td>
        <td style="text-align:center">L</td>
        <td style="text-align:center">K</td>
    </tr>
</table>

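The last letter of the NRIC must match the check code obtained from the remainder for the NRIC to be valid. For `S1234567D`, the remainder `7` maps to `D` in the `S`/`T` table, which matches the last letter, so this NRIC is valid.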
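The four steps above can be sketched as a single function that works out the expected check code. This is only an illustration: the names `WEIGHTS`, `CODES_ST`, `CODES_FG` and `expected_check_code` are hypothetical, and the function assumes the prefix is one of `S`, `T`, `F`, `G` and that exactly seven digits are supplied.

```python
WEIGHTS = [2, 7, 6, 5, 4, 3, 2]
# Check codes indexed by remainder 0 to 10, copied from the two tables above
CODES_ST = "JZIHGFEDCBA"
CODES_FG = "XWUTRQPNMLK"

def expected_check_code(prefix: str, digits: str) -> str:
    # Step 1: multiply each digit by its weight and add the products
    total = sum(int(d) * w for d, w in zip(digits, WEIGHTS))
    # Step 2: add 4 for NRICs starting with T or G
    if prefix in ("T", "G"):
        total += 4
    # Step 3: remainder after dividing by 11
    remainder = total % 11
    # Step 4: look up the check code in the table for this prefix
    codes = CODES_ST if prefix in ("S", "T") else CODES_FG
    return codes[remainder]

print(expected_check_code("S", "1234567"))  # D (106 % 11 = 7)
print(expected_check_code("T", "1234567"))  # J (110 % 11 = 0)
print(expected_check_code("F", "1234567"))  # N (106 % 11 = 7)
```

Storing each table as an 11-character string works because the remainder is always between 0 and 10, so it can be used directly as an index.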
        
Write a program to validate an NRIC input. If the NRIC is valid, print `NRIC is valid`; otherwise, print `NRIC is invalid`.

```python
# Your Codes Here
def check_nric(nric):
    # Weight for each digit, keyed by its index in the NRIC string (1 to 7)
    weightage = {1:2, 2:7, 3:6, 4:5, 5:4, 6:3, 7:2}
    # Check code for each remainder, one table per pair of first letters
    start_ST = {0:'J', 1:'Z', 2:'I', 3:'H', 4:'G', 5:'F', 6:'E', 7:'D', 8:'C', 9:'B', 10:'A'}
    start_FG = {0:'X', 1:'W', 2:'U', 3:'T', 4:'R', 5:'Q', 6:'P', 7:'N', 8:'M', 9:'L', 10:'K'}

    # Length check: a letter, seven digits and a check code
    if len(nric) != 9:
        print('NRIC is invalid')
        return

    total = 0
    try:
        for i in range(1, 8):
            total += int(nric[i]) * weightage[i]
    except ValueError:
        # Type check failed: one of the seven characters is not a digit
        print('NRIC is invalid')
        return

    valid = False
    if nric[0] == 'T' or nric[0] == 'G':
        total += 4
    rem = total % 11
    if nric[0] == 'S' or nric[0] == 'T':
        valid = (start_ST[rem] == nric[-1])
    elif nric[0] == 'F' or nric[0] == 'G':
        valid = (start_FG[rem] == nric[-1])

    if valid:
        print('NRIC is valid')
    else:
        print('NRIC is invalid')

nric = 'S1234567D'
check_nric(nric)
nric = input('Enter NRIC to check if valid: ')
check_nric(nric)
```

Output for the sample `S1234567D`: `NRIC is valid`
