## 5 Input Validation

### 5.1 Why Is Validation Needed?
For a problem to be solved, its inputs, outputs and processes need to be defined clearly. While the programmer often has control over the processes used and outputs produced by the code, the supplied inputs can come from many possible sources, and the programmer often has no control over what is supplied. This means that the supplied inputs may not actually meet the requirements for valid or acceptable input data as defined by the problem.

### 5.2 Recovering from Invalid Input
**Learning Outcomes**
> 2.4.6 Justify the use of data validation and identify the appropriate action to take when invalid data is encountered: asking for input again (for interactive input) or exiting the program (for non-interactive input).
#### 5.2.1 Asking for Input Again
If the program is meant to be interactive, i.e. the input is entered by a user, we can ask for the data to be re-entered. This option makes the most sense when the data might change on the next attempt.

With this option, the program keeps asking the user to re-enter the data until valid input is entered.
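
The re-prompting pattern can be sketched as a small reusable function. This is only an illustration: the names `ask_until_valid`, `is_valid` and `read` are hypothetical and not part of the syllabus, and `read` defaults to the built-in `input()` so that the function can also be tried without a keyboard.

```python
def ask_until_valid(prompt, is_valid, read=input):
    """Keep asking until is_valid(answer) returns True, then return the answer."""
    answer = read(prompt)
    while not is_valid(answer):
        print("Invalid input, please try again.")
        answer = read(prompt)
    return answer


def is_five_letters(text):
    """Example rule (hypothetical): the text must have exactly 5 characters."""
    return len(text) == 5

# Interactive use:
# word = ask_until_valid("Enter a 5-letter word: ", is_five_letters)
```

Keeping the rule (`is_valid`) separate from the loop means the same loop can be reused for every check in this chapter.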

#### 5.2.2 Exiting the Program
In a non-interactive program, e.g. a batch program or a program that reads data from files, there is nobody to ask again. If the data is not valid, we can skip the rest of the program and exit.
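
A minimal sketch of the exit option, assuming the data is a list of text lines that should each hold a mark from 0 to 100. The function names `parse_marks` and `run_batch` are hypothetical, and `sys.exit(1)` is just one common way to stop a program and signal failure.

```python
import sys


def parse_marks(lines):
    """Return a list of integer marks, or None if any line is invalid."""
    marks = []
    for line in lines:
        try:
            value = int(line.strip())
        except ValueError:
            return None          # not a whole number
        if not 0 <= value <= 100:
            return None          # outside the allowed range
        marks.append(value)
    return marks


def run_batch(lines):
    """Process the data, or exit straight away if it is invalid or empty."""
    marks = parse_marks(lines)
    if not marks:
        print("Invalid data found, exiting.")
        sys.exit(1)              # skip the rest of the program
    print("Average:", sum(marks) / len(marks))

# Typical use with a file (file name is an assumption):
# with open("marks.txt") as f:
#     run_batch(f.readlines())
```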


### 5.3 Common Validation Checks
**Learning Outcomes**
> 2.4.7 Validate input for acceptance by performing:
> - length check
> - range check
> - presence check
> - format check
> - existence check (i.e. check for whether input data is already in the system)
> - calculation of a check digit

#### 5.3.1 Length Checks

Length checks are used to check:
- the number of elements in a list
- the number of characters in a string

In general, length checks make use of the `len()` function to ensure that the input data has exactly the required length (string) or size (list).

In [1]:
# Length Check

user_input = input("Enter a 5-letter word: ")

while len(user_input) != 5:
    print("Invalid input, the number of characters must be exactly 5.")
    user_input = input("Enter a 5-letter word: ")

print("You have entered {}".format(user_input))

Enter a 5-letter word:  j


Invalid input, the number of characters must be exactly 5.


Enter a 5-letter word:  12345


You have entered 12345
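
The example above checks the length of a string. As a sketch of the list case, the hypothetical function `parse_scores` below splits a line of text into items and uses `len()` to check how many there are; the default of 3 items is an assumption for illustration.

```python
def parse_scores(text, expected_count=3):
    """Split text into items; return None unless there are exactly expected_count."""
    scores = text.split()
    if len(scores) != expected_count:
        return None
    return scores

# Example use:
# scores = parse_scores(input("Enter 3 scores separated by spaces: "))
# if scores is None:
#     print("Invalid input, exactly 3 scores are required.")
```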


#### 5.3.2 Range Checks

Range checks are used to limit the input to a particular range of (numeric) values.

In general, range checks use the following 4 relational operators:
> `>`, `>=`, `<`, `<=`

In [None]:
# Range Check

marks = int(input("Enter your exam marks [0..100]: "))

while marks < 0 or marks > 100:
    print("Invalid entry, marks must be between 0 and 100, inclusive")
    marks = int(input("Enter your exam marks [0..100]: "))

if marks >= 50:
    print("You passed.")
else:
    print("Sign up for SSP")
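
Python also allows the two comparisons of a range check to be chained into one expression. The sketch below wraps this in a hypothetical helper, `is_valid_mark`; the limits 0 and 100 come from the example above.

```python
def is_valid_mark(marks, lowest=0, highest=100):
    """Return True if marks lies between lowest and highest, inclusive."""
    return lowest <= marks <= highest

# Example use inside the validation loop:
# while not is_valid_mark(marks):
#     marks = int(input("Enter your exam marks [0..100]: "))
```

The chained form `lowest <= marks <= highest` means the same as `marks >= lowest and marks <= highest`.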

#### 5.3.3 Presence Checks

Presence checks are used to ensure that all the required input is provided.

In general, a presence check compares the input against an empty string.

In [16]:
# Presence Check
usrname = input("Username: ")
while not usrname:
    print("Username is required.")
    usrname = input("Username: ")

print("Hello,", usrname)

Username:  


Username is required.


Username:  s


Hello, s
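
One gap in a plain empty-string test is that input made up only of spaces still counts as present. A possible refinement, shown here as a sketch with the hypothetical helper `is_present`, is to remove surrounding whitespace with `.strip()` before checking.

```python
def is_present(text):
    """Return True if text contains at least one non-whitespace character."""
    return text.strip() != ""

# Example use:
# usrname = input("Username: ")
# while not is_present(usrname):
#     print("Username is required.")
#     usrname = input("Username: ")
```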


#### 5.3.4 Format Checks
Some input has more complex requirements, usually that it follows a particular pattern.
Format checks often require you to decompose the input data into smaller parts, then run further checks on those smaller parts.

Built-in string methods, length checks and range checks may be used to validate the smaller parts.

In [34]:
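# Format Check
# A Singapore NRIC is made up of 3 parts: {prefix + body + check digit}
# where the prefix is a letter, the body is a 7-digit number and the check digit is a letter.
# e.g. S1234567D
nric = input("NRIC: ")

if len(nric) != 9:
    # Length check on the whole input
    print("Invalid NRIC: must be exactly 9 characters.")
elif not nric[0].isalpha() or not nric[8].isalpha():
    # The prefix and the check digit must both be letters
    print("Invalid NRIC: first and last characters must be letters.")
elif not nric[1:8].isdigit():
    # The 7 characters in the middle must all be digits
    print("Invalid NRIC: characters 2 to 8 must be digits.")
else:
    print("Valid NRIC format.")


NRIC:  s1234567f


Valid NRIC format.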
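
As an alternative to decomposing the input by hand, the same pattern can be described with a regular expression from Python's standard `re` module. This is a sketch that goes beyond the checks listed above: the pattern and the helper name `has_nric_format` are assumptions for illustration, and it checks only the shape of the input, not whether the check letter is correct.

```python
import re

# One letter, exactly seven digits, one letter.
NRIC_PATTERN = re.compile(r"[A-Za-z][0-9]{7}[A-Za-z]")


def has_nric_format(nric):
    """Return True if the whole string matches letter + 7 digits + letter."""
    return NRIC_PATTERN.fullmatch(nric) is not None

# Example use:
# if has_nric_format(input("NRIC: ")):
#     print("Valid NRIC format.")
```

`fullmatch()` is used so that the whole string must match; extra characters before or after the pattern make the check fail.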


#### 5.3.5 Existence Checks

For some problems, input is valid only if it is (or alternatively, is not) in an existing collection or repository of data. This is known as an existence check.

In [35]:
# Existence Check
email_addr = {"robin"   : "pang_hee_tee_robin@sst.edu.sg",
              "jovita"  : "jovita_tang@sst.edu.sg",
              "raymond" : "chng_soon_hsien@sst.edu.sg",
              "aurelius": "aurelius_yeo@sst.edu.sg" }

name = input("Enter the first name of CP+ teacher: ").lower()
while True:
    if name in email_addr:
        print("{}'s email: {}".format(name.upper(), email_addr[name]))
        break
    else:
        print("{} is not a CP+ teacher!".format(name.upper()))
        name = input("Enter the first name of CP+ teacher: ").lower()


Enter the first name of CP+ teacher:  aurelius_yeo@sst.edu.sg


AURELIUS_YEO@SST.EDU.SG is not a CP+ teacher!


Enter the first name of CP+ teacher:  yeo


YEO is not a CP+ teacher!


Enter the first name of CP+ teacher:  aurelius


AURELIUS's email: aurelius_yeo@sst.edu.sg
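
The example above accepts input only if it is already in the collection. The opposite case, where input is valid only if it is not yet in the system, can be sketched as follows. The set `taken_usernames` and the helper `is_available` are hypothetical names used for illustration.

```python
# Usernames that already exist in the system (sample data).
taken_usernames = {"robin", "jovita", "raymond", "aurelius"}


def is_available(username, taken):
    """Return True if username (ignoring case) is not already in taken."""
    return username.lower() not in taken

# Example use:
# new_name = input("Choose a username: ")
# while not is_available(new_name, taken_usernames):
#     print("{} is already taken.".format(new_name))
#     new_name = input("Choose a username: ")
```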


In [18]:
# Existence Check extension.
# Allow the user to enter either the first name or the surname.
# How would you implement it?
email_addr = {"robin pang"   : "pang_hee_tee_robin@sst.edu.sg",
              "jovita tang"  : "jovita_tang@sst.edu.sg",
              "raymond chng" : "chng_soon_hsien@sst.edu.sg",
              "aurelius yeo" : "aurelius_yeo@sst.edu.sg" }

name = input("Enter the first name or surname of CP+ teacher: ").lower()

# Look through each full name for a first name or surname that matches the input.
for full_name in email_addr:
    if name in full_name.split():
        print(f"{name.upper()}'s email: {email_addr[full_name]}")
        break
else:
    # The loop finished without a break, so no teacher matched.
    print(f"{name.upper()} is not a CP+ teacher!")

Enter the first name or surname of CP+ teacher:  raymond


RAYMOND's email: chng_soon_hsien@sst.edu.sg

#### 5.3.6 Check Digit Calculation


In [30]:
# Check Digit
# A popular standard for check digits used in product barcodes is the Universal Product Code (UPC-A)
# standard. Figure 5.12 shows an example of a 12-digit UPC-A product code. The 12th digit is the check digit.
#
# To calculate the check digit, the first 11 digits need to be processed using the following algorithm:
# 1. Add the digits in the odd-numbered positions (i.e., 1st, 3rd … 11th).
# 2. Multiply the result by three.
# 3. Add the digits in the even-numbered positions (i.e., 2nd, 4th … 10th) to the result.
# 4. If the last digit of the result is 0, the check digit should be 0.
#    Otherwise, subtract the last digit of the result from 10.
#    The check digit should be the same as the resulting answer.

# Input
upc = input("Enter first 11 digits of UPC-A: ")

# Process
digits = [int(ch) for ch in upc]
odd_sum = sum(digits[0::2])    # 1st, 3rd ... 11th digits
even_sum = sum(digits[1::2])   # 2nd, 4th ... 10th digits
total = 3 * odd_sum + even_sum

last_digit = total % 10
if last_digit == 0:
    check_digit = 0
else:
    check_digit = 10 - last_digit

# Output
print(check_digit)


Enter first 11 digits of UPC-A:  05555555555


0
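
Once the check digit can be calculated, a full 12-digit code can be validated by comparing its last digit with the calculated one. The sketch below follows the four steps listed above; the function names `upc_check_digit` and `is_valid_upc` are hypothetical.

```python
def upc_check_digit(first_11):
    """Calculate the UPC-A check digit from a string of the first 11 digits."""
    digits = [int(ch) for ch in first_11]
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    # (10 - last digit) % 10 gives 0 when the last digit of total is already 0.
    return (10 - total % 10) % 10


def is_valid_upc(code):
    """Return True if code is 12 digits and its 12th digit is the correct check digit."""
    if len(code) != 12:
        return False                      # length check
    if not all(ch in "0123456789" for ch in code):
        return False                      # format check: digits only
    return upc_check_digit(code[:11]) == int(code[11])

# Example use:
# print(is_valid_upc(input("Enter a 12-digit UPC-A code: ")))
```

Running the length and format checks first ensures that the conversion to `int` inside `upc_check_digit` cannot fail.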


### [Plus] Data Type Validation
Direct type casting can cause a run-time error that ends the program abruptly (a crash). Checking that the input has the correct data type before casting may therefore be required.

#### Positive Integer (Most common)
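Using `.isdigit()` allows the program to check whether the entered string is made up only of digits. If it returns `True`, the input can then be cast to `int`. Note that `.isdigit()` also accepts `0`, so strictly speaking this checks for a non-negative integer.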

In [15]:
# Positive int
n = input("Enter a positive integer: ")
while not n.isdigit():
    print("Not a positive integer, please re-enter.")
    n = input("Enter a positive integer: ")
n = int(n)

Enter a positive integer:  -0


Not a positive integer, please re-enter.


Enter a positive integer:  0


#### Negative Integer, Float
These two kinds of input are harder to validate. To do so, you will have to use the `try` and `except` keywords.

**`try` & `except`** syntax:  
`try:`  
`    # code that can potentially crash your program`  
`    # code that runs only if the line above succeeds`  
`except ExceptionType:`  
`    # code to execute instead of crashing your program`  

The exception raised when a cast fails is `ValueError`.  
Full list of built-in exceptions: https://docs.python.org/3/library/exceptions.html#bltin-exceptions

In [32]:
# Try/except

while True:
    try:
        cash = float(input("Cost $"))
        break
    except ValueError:
        print("Not a valid number.")

gst = cash*0.09
svc = cash*0.1
print("GST(9%) :{:>10.2f}".format(gst))
print("SVC(10%):{:>10.2f}".format(svc))
print("Total   :{:>10.2f}".format(cash+gst+svc))

Cost $ 1.2.3


Not a valid number.


Cost $ 3


GST(9%) :      0.27
SVC(10%):      0.30
Total   :      3.57
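
The same `try` and `except` approach works for integers that may be negative, which `.isdigit()` rejects. In this sketch the helper name `to_int` is hypothetical, and returning `None` for invalid text is one possible design choice.

```python
def to_int(text):
    """Return text converted to an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None

# Example use:
# temperature = to_int(input("Temperature: "))
# while temperature is None:
#     print("Not a valid integer.")
#     temperature = to_int(input("Temperature: "))
```

The loop compares against `None` rather than testing `not temperature`, because `0` is a valid integer that would otherwise be treated as invalid.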
