In this exercise, you will replicate the HKID check digit formula.
You will be given a list of HKIDs and have to tell whether each check digit is correct.
Details about the formula can be found here:
    https://computerterminal.blogspot.com/2013/02/hong-kong-id-formula-hkid-number-check.html
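
As a quick orientation, the arithmetic can be worked by hand for AB987654(3). This is only a sketch, assuming the scheme used by the functions further down: letters map to A=10 ... Z=35, and the characters before the bracket are weighted 9, 8, ..., 2.

```python
# Worked example for the prefix "AB987654" (check digit excluded).
# Assumption: letters map to A=10 ... Z=35 and the weights run 9, 8, ..., 2.
values = [10, 11, 9, 8, 7, 6, 5, 4]   # A, B, then the six digits
weights = range(9, 1, -1)             # 9, 8, 7, 6, 5, 4, 3, 2

weighted_sum = sum(v * w for v, w in zip(values, weights))
remainder = weighted_sum % 11

print(weighted_sum)    # 371
print(remainder)       # 8
print(11 - remainder)  # 3, matching the check digit in AB987654(3)
```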

In [1]:
# only AB987654(3) and E364912(5) are correct HKIDs
hkid_list = ['AB987654(3)', 'AB987654(9)', 'E364912(5)', 'E364912(9)', 'E364912(5)    ', 'ABCD1234', 'HELLO']

Functions

In [2]:
import re
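# use a regular expression to check whether a string looks like a HKID,
# i.e. 1-2 letters followed by 6 digits, then a check digit (0-9 or 'A') in brackets

def is_hkid(hkid):
    # raw string so that \( and \) reach the regex engine unchanged;
    # strip() + fullmatch() tolerates surrounding whitespace but rejects any other extra text
    x = re.fullmatch(r"[A-Z]{1,2}[0-9]{6}\([A0-9]\)", str(hkid).strip())
    return x is not None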

In [3]:
for hkid in hkid_list:
    print(is_hkid(hkid))

True
True
True
True
True
False
False


In [4]:
# helper for hkid_to_sum(): maps A-Z to 10-35; a space (the padding for single-letter HKIDs) maps to 36
def letter_to_number(letter):
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    if letter == ' ':
        return 36
    else:
        return letters.index(letter) + 10

In [5]:
def hkid_to_sum(hkid):
    hkid_without_bracket = hkid.split("(")[0]
    total = 0  # avoid shadowing the built-in sum()
    for index, char in enumerate(hkid_without_bracket):
        # for the 8-character padded input the weights run from 9 down to 2
        weight = 9 - index
        # try to convert to an integer; if that fails, it is a letter or the leading space
        try:
            value = int(char)
        except ValueError:
            value = letter_to_number(char)
        total += value * weight
    return total
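
The single-letter case is easy to get wrong, so here is the same sum worked out for E364912(5). This is a sketch that assumes, as letter_to_number() does above, that the missing first letter counts as a space worth 36; char_value is a hypothetical helper used only in this example.

```python
# Worked example for the single-letter prefix "E364912".
# Assumption: the missing first letter is padded with a space worth 36.
padded = " E364912"
weights = range(9, 1, -1)  # 9 down to 2, one weight per character

def char_value(ch):
    # hypothetical helper for this example only
    if ch == " ":
        return 36
    if ch.isdigit():
        return int(ch)
    return ord(ch) - ord("A") + 10

total = sum(char_value(ch) * w for ch, w in zip(padded, weights))
print(total)       # 556
print(total % 11)  # 6, so the check digit is 11 - 6 = 5
```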

In [6]:
def get_last_digit_of_hkid(hkid):
    return hkid.split("(")[1].split(")")[0]

In [7]:
def check_digit(total):
    # the parameter is named total so that it does not shadow the built-in sum()
    mod = total % 11
    if mod >= 2:
        return str(11 - mod)
    elif mod == 1:
        return 'A'
    else:
        return '0'
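
To make the three branches concrete, the sketch below lists the check character for every possible remainder. check_char is a hypothetical name used only here. One way to read the result: if 'A' is counted as 10, adding the check character to the weighted sum always gives a multiple of 11.

```python
# Remainder -> check character, following the three branches above.
# check_char is a hypothetical name used only in this sketch.
def check_char(remainder):
    if remainder == 0:
        return "0"
    if remainder == 1:
        return "A"
    return str(11 - remainder)

for r in range(11):
    char = check_char(r)
    # 'A' stands for the value 10
    value = 10 if char == "A" else int(char)
    # remainder, check character, and (remainder + value) % 11, which is always 0
    print(r, char, (r + value) % 11)
```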

In [8]:
# assume another step has already checked that the HKID has a valid format,
# e.g. only A123456(7) or AB123456(7) or A123456(A)

def add_leading_space(hkid):
    # remove any leading/trailing whitespace
    hkid = hkid.strip()
    # a single-letter HKID such as A123456(7) is 10 characters long;
    # pad it with a leading space so that it lines up with two-letter HKIDs
    if len(hkid) == 10:
        return " " + hkid
    else:
        return hkid

In [9]:
# main function
def hkid_to_check_digit(hkid):
    return check_digit(hkid_to_sum(add_leading_space(hkid)))

Testing

In [10]:
for hkid in hkid_list:
    if is_hkid(hkid):
        if hkid_to_check_digit(hkid) == get_last_digit_of_hkid(hkid):
            print(hkid + ' has a correct check digit')
        else:
            print(hkid + ' has an incorrect check digit, the correct one should be: ' + hkid_to_check_digit(hkid))
            
    else:
        print(hkid + ' is not a HKID.')

AB987654(3) has a correct check digit
AB987654(9) has an incorrect check digit, the correct one should be: 3
E364912(5) has a correct check digit
E364912(9) has an incorrect check digit, the correct one should be: 5
E364912(5)     has a correct check digit
ABCD1234 is not a HKID.
HELLO is not a HKID.
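
For comparison, the whole check can also be written as a single function. The sketch below is not part of the exercise solution: is_valid_hkid and HKID_PATTERN are hypothetical names, and it assumes the same letter values and weights as the functions above. Instead of computing the expected check digit, it tests whether the weighted sum plus the check character (with 'A' counted as 10) is a multiple of 11.

```python
import re

# Hypothetical single-function variant; not part of the notebook above.
HKID_PATTERN = re.compile(r"([A-Z]{1,2})([0-9]{6})\(([0-9A])\)")

def is_valid_hkid(hkid):
    match = HKID_PATTERN.fullmatch(hkid.strip())
    if match is None:
        return False
    letters, digits, check = match.groups()
    # Pad a single-letter prefix with a space, which counts as 36.
    chars = letters.rjust(2) + digits
    total = 0
    for weight, ch in zip(range(9, 1, -1), chars):
        if ch == " ":
            value = 36
        elif ch.isdigit():
            value = int(ch)
        else:
            value = ord(ch) - ord("A") + 10
        total += value * weight
    # The check character counts as 10 when it is 'A'.
    check_value = 10 if check == "A" else int(check)
    return (total + check_value) % 11 == 0

for candidate in ["AB987654(3)", "AB987654(9)", "E364912(5)", "HELLO"]:
    print(candidate, is_valid_hkid(candidate))
```

Unlike the loop above, this variant returns False both for malformed strings and for well-formed strings with a wrong check digit, so it cannot report what the correct digit should have been.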


Expected output:
    AB987654(3) has a correct check digit
    AB987654(9) has an incorrect check digit, the correct one should be: 3
    E364912(5) has a correct check digit
    E364912(9) has an incorrect check digit, the correct one should be: 5
    E364912(5)     has a correct check digit
    ABCD1234 is not a HKID.
    HELLO is not a HKID.