# Introduction to Regular Expressions (Regex)

Regular expressions (regex) are patterns that describe sets of strings, which makes them useful for validating, searching, and reformatting text.
In Python, the `re` module provides regex support.

In this tutorial, we will cover regex for:
- Indonesian NIK (National Identification Number)
- Mobile and Home Phone Numbers
- Vehicle Plate Numbers

In [1]:
import re

## NIK (Indonesian National Identification Number) Validation

The Indonesian NIK consists of 16 digits that encode the holder's region, date of birth, and a serial number.
The regex pattern below checks that structure and rejects strings with the wrong length or out-of-range fields.

### Explanation:
- `^(1[1-9]|21|[37][1-6]|5[1-3]|6[1-5]|[89][12])` → The first two digits represent the province code.
  - Example: `11` for Aceh, `31` for Jakarta.
- `\d{2}\d{2}` → The next four digits are the city code and the district code, two digits each.
- `([04][1-9]|[1256][0-9]|[37][01])` → The day of birth: `01` to `31`, or `41` to `71` for females (40 is added to the day).
- `(0[1-9]|1[0-2])` → The birth month (01 to 12).
- `\d{2}` → The birth year (last two digits of the year).
- `\d{4}$` → A unique identifier number.

In [2]:
NIK_REGEX = re.compile(r"^(1[1-9]|21|[37][1-6]|5[1-3]|6[1-5]|[89][12])\d{2}\d{2}([04][1-9]|[1256][0-9]|[37][01])(0[1-9]|1[0-2])\d{2}\d{4}$")
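
The field layout described above can also be used to pull the parts of a NIK apart. The following is a minimal sketch, not part of the original notebook: it rewrites the same pattern with named groups, and the names `NIK_FIELDS`, `decode_nik`, and the dictionary keys are illustrative assumptions.

```python
import re

# Same structure as NIK_REGEX, rewritten with named groups.
# The group names are illustrative, not an official field naming.
NIK_FIELDS = re.compile(
    r"^(?P<province>1[1-9]|21|[37][1-6]|5[1-3]|6[1-5]|[89][12])"
    r"(?P<city>\d{2})(?P<district>\d{2})"
    r"(?P<day>[04][1-9]|[1256][0-9]|[37][01])"
    r"(?P<month>0[1-9]|1[0-2])"
    r"(?P<year>\d{2})"
    r"(?P<serial>\d{4})$"
)


def decode_nik(nik):
    """Return the NIK fields as a dict, or None if the string does not match."""
    m = NIK_FIELDS.match(nik)
    if m is None:
        return None
    day = int(m.group("day"))
    is_female = day > 40  # for females, 40 is added to the day of birth
    return {
        "province": m.group("province"),
        "city": m.group("city"),
        "district": m.group("district"),
        "gender": "female" if is_female else "male",
        "birth_day": day - 40 if is_female else day,
        "birth_month": int(m.group("month")),
        "birth_year_2digit": m.group("year"),
        "serial": m.group("serial"),
    }


info = decode_nik("1530105808258884")
print(info)
```

The year is kept as a two-digit string because the NIK alone does not say which century it belongs to.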


## Mobile Phone Number Validation

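Indonesian mobile numbers can be written with the country code (`+62` or `62`), with a leading `0`, or both. The pattern below then expects an `8` followed by 8 to 11 more digits, so 9 to 12 digits counting from the `8`.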

### Explanation:
- `^(\+62|62)?` → The country code is optional (`+62` or `62`).
- `[\s-]?` → Allows an optional space or dash separator.
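- `0?8[1-9]{1}\d{1}` → An optional leading `0`, then `8` and two more digits, the first of which cannot be `0`. Together they form the provider prefix (for example `0812` or `812`). The `{1}` quantifiers are redundant but harmless.
- `[\s-]?\d{4}[\s-]?\d{2,5}$` → A block of four digits and a final block of two to five digits, each optionally preceded by a space or dash.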

In [3]:
MOBILE_PHONE_REGEX = re.compile(r"^(\+62|62)?[\s-]?0?8[1-9]{1}\d{1}[\s-]?\d{4}[\s-]?\d{2,5}$")
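
To see what the pattern accepts and rejects, it helps to run it over a handful of inputs. The sketch below repeats the pattern so it runs on its own; the sample numbers are made up for illustration and are not taken from the notebook.

```python
import re

# Copy of the mobile pattern so this block is self-contained.
MOBILE_PHONE_REGEX = re.compile(
    r"^(\+62|62)?[\s-]?0?8[1-9]{1}\d{1}[\s-]?\d{4}[\s-]?\d{2,5}$"
)

# Made-up sample inputs.
samples = [
    "081234567890",          # local format with a leading 0
    "+62 812-3456-7890",     # country code with separators
    "6281234567890",         # country code without "+"
    "+62081234567890",       # country code and leading 0 together
    "0801234567",            # digit after the 8 must be 1-9
    "0812345",               # too short
    "0812-3456-7890 ext 1",  # trailing text after the number
]

results = {s: bool(MOBILE_PHONE_REGEX.match(s)) for s in samples}
for s, ok in results.items():
    print(f"{s!r:26} {'valid' if ok else 'invalid'}")
```

Note that the fourth sample passes: because both the country code and the leading `0` are optional and independent, the pattern as written accepts them together.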

## Home Phone Number Validation

Indonesian landline numbers start with an area code such as `021`, optionally preceded by the country code (`+62` or `62`).

### Explanation:
- `^(\+62|62)?` → The country code is optional.
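- `[\s-]?0?([2-7]|9)\d(\d)?` → The area code: an optional separator, an optional leading `0`, a first digit of 2-7 or 9, then one or two more digits (Jakarta: `021`, Surabaya: `031`, etc.).
- `[\s-]?[2-9](\d){6,7}$` → The rest of the phone number (7-8 digits, not starting with `0` or `1`), anchored to the end of the string so that trailing characters are rejected.

In [4]:
HOME_PHONE_REGEX = re.compile(r"^(\+62|62)?[\s-]?0?([2-7]|9)\d(\d)?[\s-]?[2-9](\d){6,7}$")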
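
For validation, a pattern has to cover the whole input, not just its beginning. The sketch below uses a deliberately simplified, hypothetical pattern (`UNANCHORED`, not one of the notebook's patterns) to show how `match()` and `fullmatch()` differ when the end anchor `$` is missing.

```python
import re

# Hypothetical, simplified pattern for illustration:
# it has a start anchor but no end anchor.
UNANCHORED = re.compile(r"^\d{3}-\d{7}")

value = "021-4792421 and some trailing text"

# match() only requires the pattern to match at the start of the string.
prefix_ok = UNANCHORED.match(value) is not None

# fullmatch() requires the pattern to cover the entire string.
whole_ok = UNANCHORED.fullmatch(value) is not None

print(prefix_ok, whole_ok)
```

Either ending the pattern with `$` or calling `fullmatch()` closes this gap; which one to use is a matter of style.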

## Vehicle Plate Number Validation

The pattern below is a basic check for Indonesian vehicle plate numbers: a region prefix, a number, and a letter suffix.

### Explanation:
- `^[A-Z]{1,2}` → The first 1-2 letters represent the region.
- `\s{0,1}` → Allows an optional space.
- `\d{0,4}` → The middle section consists of up to 4 digits. Because the lower bound is 0, the pattern also accepts a plate with no digits at all.
- `\s{0,1}` → Another optional space.
- `[A-Z]{0,3}$` → The suffix consists of up to 3 optional letters.

In [5]:
PLATE_NUMBER_REGEX = re.compile(r"^[A-Z]{1,2}\s{0,1}\d{0,4}\s{0,1}[A-Z]{0,3}$")
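
The three sections of a plate can be captured separately. This is a minimal sketch that adds named groups to the same pattern; `PLATE_PARTS`, `split_plate`, and the group names are illustrative assumptions, and `\s?` is used as an equivalent spelling of `\s{0,1}`.

```python
import re

# Same pattern as PLATE_NUMBER_REGEX with named groups added.
PLATE_PARTS = re.compile(
    r"^(?P<region>[A-Z]{1,2})\s?(?P<number>\d{0,4})\s?(?P<suffix>[A-Z]{0,3})$"
)


def split_plate(plate):
    """Return the plate sections as a dict, or None if the plate does not match."""
    m = PLATE_PARTS.match(plate)
    return m.groupdict() if m else None


print(split_plate("F 1523 XYZ"))  # spaces between sections
print(split_plate("B1234CD"))     # no spaces
print(split_plate("AB"))          # matches with empty number and suffix
print(split_plate("f 1523 xyz"))  # lowercase letters are not accepted
```

The third call shows how permissive the optional sections are. If lowercase input should be accepted, one option is to call `plate.upper()` before matching.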

## Example Variables

Let's test our regex patterns with some sample data.

In [6]:
# Example variables
nik = "1530105808258884"
phone = "+62-861-06103981"
home = "021-4792421"
plate_number = "F 1523 XYZ"

## Checking NIK

We call `match()` on the compiled pattern to check whether the input follows the NIK format.

In [7]:
# Check NIK
if NIK_REGEX.match(nik):
    print("Valid NIK")
else:
    print("Invalid NIK")

Valid NIK
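
A single passing value says little about what the pattern rejects. The sketch below repeats the NIK pattern so it runs on its own and tries a few variations of the sample value. The variations are made up for illustration.

```python
import re

# Copy of the NIK pattern so this block is self-contained.
NIK_REGEX = re.compile(
    r"^(1[1-9]|21|[37][1-6]|5[1-3]|6[1-5]|[89][12])\d{2}\d{2}"
    r"([04][1-9]|[1256][0-9]|[37][01])(0[1-9]|1[0-2])\d{2}\d{4}$"
)

# Made-up variations of the sample NIK.
samples = {
    "1530105808258884": "original sample",
    "153010580825888": "only 15 digits",
    "0030105808258884": "province code 00",
    "1530105813258884": "month 13",
    "1530103208258884": "day 32",
    "1530103102258884": "31 February",
}

results = {nik: bool(NIK_REGEX.match(nik)) for nik in samples}
for nik, note in samples.items():
    print(f"{nik:<17} {note:<18} {'valid' if results[nik] else 'invalid'}")
```

The last sample passes even though 31 February does not exist: the pattern checks each field's range separately and does not validate the date as a whole. A stricter check would need date logic beyond the regex.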


## Checking Phone & Plate Numbers

In the same way, we check each remaining value against its pattern: mobile phone, home phone, and plate number.

In [8]:
# Check Phone Number
if MOBILE_PHONE_REGEX.match(phone):
    print("Valid Phone Number")
else:
    print("Invalid Phone Number")

Valid Phone Number


In [9]:
# Check Home Phone Number
if HOME_PHONE_REGEX.match(home):
    print("Valid Home Number")
else:
    print("Invalid Home Number")

Valid Home Number


In [10]:
# Check Plate Number
if PLATE_NUMBER_REGEX.match(plate_number):
    print("Valid Plate Number")
else:
    print("Invalid Plate Number")

Valid Plate Number
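
The checks above all follow the same shape, so they can be folded into one helper. This is a minimal sketch: the `check` function is a hypothetical helper, not part of the notebook, and only two of the patterns are repeated here to keep the block short.

```python
import re

# Copies of two of the patterns so this block is self-contained.
MOBILE_PHONE_REGEX = re.compile(
    r"^(\+62|62)?[\s-]?0?8[1-9]{1}\d{1}[\s-]?\d{4}[\s-]?\d{2,5}$"
)
PLATE_NUMBER_REGEX = re.compile(r"^[A-Z]{1,2}\s{0,1}\d{0,4}\s{0,1}[A-Z]{0,3}$")


def check(label, pattern, value):
    """Return 'Valid <label>' or 'Invalid <label>' for a single value."""
    status = "Valid" if pattern.match(value) else "Invalid"
    return f"{status} {label}"


checks = [
    ("Phone Number", MOBILE_PHONE_REGEX, "+62-861-06103981"),
    ("Plate Number", PLATE_NUMBER_REGEX, "F 1523 XYZ"),
    ("Plate Number", PLATE_NUMBER_REGEX, "1523 XYZ"),  # missing region prefix
]

results = [check(label, pattern, value) for label, pattern, value in checks]
for line in results:
    print(line)
```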
