## **How QR Codes Work**
QR codes are **two-dimensional barcodes** that encode data using black and white squares. The main components of a QR code are:

1. **Finder Patterns** (Large squares in three of the four corners)
2. **Timing Patterns** (Alternating black/white modules that let a scanner locate rows and columns)
3. **Alignment Patterns** (Smaller squares for distortion correction)
4. **Data Area** (Stores encoded information)
5. **Error Correction** (Reed-Solomon codewords that allow recovery from damage)

Each QR code consists of a **grid of black and white modules**, where black represents **1** and white represents **0**.
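To make the grid idea concrete, here is a minimal sketch that builds an empty version-1 grid and stamps the three finder patterns onto it. The helper names (`qr_size`, `finder_pattern`, `place_finders`) are illustrative assumptions and are not used elsewhere in this notebook; the side length formula `17 + 4 * version` and the 7x7 finder layout come from the QR code standard.

```julia
# Side length, in modules, of a QR code of the given version (1-40).
qr_size(version::Integer) = 17 + 4 * version

# 7x7 finder pattern as a Bool grid: true = dark module (1), false = light (0).
function finder_pattern()
    p = falses(7, 7)
    for r in 1:7, c in 1:7
        d = max(abs(r - 4), abs(c - 4))  # ring index: 0 = centre, 3 = outer border
        p[r, c] = d != 2                 # only the ring at distance 2 is light
    end
    return p
end

# Hypothetical helper: place the three finder patterns on an otherwise empty grid.
function place_finders(version::Integer)
    n = qr_size(version)
    grid = falses(n, n)
    fp = finder_pattern()
    grid[1:7, 1:7] .= fp        # top-left
    grid[1:7, n-6:n] .= fp      # top-right
    grid[n-6:n, 1:7] .= fp      # bottom-left
    return grid
end

demo_grid = place_finders(1)
for r in 1:size(demo_grid, 1)
    # Two characters per module keeps the printout roughly square.
    println(join(demo_grid[r, c] ? "##" : "  " for c in 1:size(demo_grid, 2)))
end
```

The remaining fixed patterns (timing, alignment, format information) and the data modules would be drawn onto the same Boolean grid.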

---

## **Steps to Generate a QR Code**
1. **Analyze Unicode characters**
2. **Create data segment**
3. **Fit to version number**
4. **Concatenate segments, add padding, make codewords**
5. **Split blocks, add ECC, interleave**
6. **Draw fixed patterns**
7. **Draw codewords and remainder**
8. **Try applying each mask**
9. **Find penalty patterns**
10. **Calculate penalty points, select best mask**

---

## **Implementing QR Code in Julia**

In [2]:
using Plots

### **1. Analyze Unicode characters**

In [4]:
text = "Hello, world! 123"
num_codepoints = length(text)  # Count number of characters
println("Number of code points in the input text string: $num_codepoints")

Number of code points in the input text string: 17
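For ASCII input the number of code points equals the number of bytes, but that stops being true once the text contains other characters, which matters later because Byte mode stores bytes rather than code points. A small sketch of the difference, using Julia's built-in `length`, `ncodeunits`, and `codeunits` (the variable names are only for illustration):

```julia
ascii_text = "Hello, world! 123"
greek_text = "κα"   # two code points, each two bytes in UTF-8

# length counts code points; ncodeunits counts UTF-8 bytes.
println(length(ascii_text), " code points, ", ncodeunits(ascii_text), " bytes")
println(length(greek_text), " code points, ", ncodeunits(greek_text), " bytes")

# Iterating a String yields Chars; codeunits yields the raw bytes.
println(collect(codeunits(greek_text)))
```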


In [255]:
function analyze_unicode(text)
    println("Index\tChar\tCP hex\tNM\tAM\tBM\tKM")

    nm_results = Bool[]  # Stores true/false for Numeric Mode (NM)
    am_results = Bool[]  # Stores true/false for Alphanumeric Mode (AM)
    km_results = Bool[]  # Stores true/false for Kanji Mode (KM)
    
    for (i, c) in enumerate(text)
        code_point = UInt32(c)
        cp_hex = uppercase(string(code_point, base=16))  # Unicode hex

        # Numeric Mode (0-9)
        nm = '0' <= c <= '9'
        push!(nm_results, nm)

        # Alphanumeric Mode (0-9, A-Z, space, and the symbols $ % * + - . / :)
        am = nm || ('A' <= c <= 'Z') || c in " \$%*+-./:"
        push!(am_results, am)

        # Byte Mode (always applicable in UTF-8)
        bm = true  

        # Kanji Mode (approximation: Unicode blocks that roughly cover JIS X 0208, not an exact Shift JIS lookup)
        km = (
            (0x4E00 <= code_point <= 0x9FAF) ||  # Common CJK Kanji (JIS X 0208 & X 0212)
            (0x3400 <= code_point <= 0x4DBF) ||  # Rare Kanji
            (0x3000 <= code_point <= 0x303F) ||  # CJK Symbols & Punctuation
            (0x3040 <= code_point <= 0x30FF) ||  # Hiragana & Katakana
            (0xFF00 <= code_point <= 0xFFEF) ||  # Full-width ASCII & Kana
            (0x0370 <= code_point <= 0x03FF) ||  # Greek Letters (α, β, γ, etc.)
            (0x0400 <= code_point <= 0x04FF) ||  # Cyrillic Letters (А, Б, В, etc.)
            (0x2500 <= code_point <= 0x257F) ||  # Box Drawing Characters
            (code_point in [0x2606, 0x2605, 0x266A])  # Special symbols like ☆, ★, ♪
        )
        push!(km_results, km)

        println("$i\t$c\tU+$cp_hex\t$(nm ? "Yes" : "No")\t$(am ? "Yes" : "No")\t$(bm ? "Yes" : "No")\t$(km ? "Yes" : "No")")
    end
    return nm_results, am_results, km_results
end


analyze_unicode (generic function with 1 method)

In [259]:
function determine_encoding(text)
    # Check if all characters can be encoded in each mode

    nm, am, km = analyze_unicode(text)
    
    can_numeric = all(nm)
    can_alphanumeric = all(am)
    can_kanji = all(km)
    can_byte = true     # UTF-8 can encode everything (always true)

    println("Can every character be encoded in:")
    println("Mode\t\tEncodable")
    println("Numeric\t\t$(can_numeric ? "Yes" : "No")")
    println("Alphanumeric\t$(can_alphanumeric ? "Yes" : "No")")
    println("Byte\t\tYes")  # Always possible
    println("Kanji\t\t$(can_kanji ? "Yes" : "No")")

    # Determine best encoding mode
    if can_numeric
        return "Numeric"
    elseif can_alphanumeric
        return "Alphanumeric"
    elseif can_kanji
        return "Kanji"
    else
        return "Byte"  # Always possible
    end
end

determine_encoding (generic function with 1 method)
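Once a mode is chosen, the segment that carries the data begins with a 4-bit mode indicator followed by a character count field whose width depends on the mode and the version range. The sketch below tabulates those values as defined by the QR code standard; the names `MODE_INDICATOR`, `COUNT_BITS`, and `segment_header` are assumptions made for this example, and the mode strings match the ones returned by `determine_encoding`.

```julia
# 4-bit mode indicators defined by the QR code standard.
const MODE_INDICATOR = Dict(
    "Numeric"      => "0001",
    "Alphanumeric" => "0010",
    "Byte"         => "0100",
    "Kanji"        => "1000",
)

# Width of the character count field for versions 1-9, 10-26, and 27-40.
const COUNT_BITS = Dict(
    "Numeric"      => (10, 12, 14),
    "Alphanumeric" => (9, 11, 13),
    "Byte"         => (8, 16, 16),
    "Kanji"        => (8, 10, 12),
)

# Hypothetical helper: header bits for a segment holding `count` characters
# (bytes, in Byte mode) at the given version.
function segment_header(mode::AbstractString, count::Integer, version::Integer)
    1 <= version <= 40 || throw(ArgumentError("version must be in 1:40"))
    range_index = version <= 9 ? 1 : version <= 26 ? 2 : 3
    width = COUNT_BITS[mode][range_index]
    0 <= count < 2^width || throw(ArgumentError("count does not fit in $width bits"))
    return MODE_INDICATOR[mode] * string(count, base=2, pad=width)
end

println(segment_header("Byte", 17, 1))
```

For a 17-byte Byte-mode segment at version 1, the header is the indicator `0100` followed by the count 17 written in 8 bits.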

### **2. Create data segment**

In [271]:
# Encode based on mode
function encode_text(text)
    mode = determine_encoding(text)
    
    total_bits = 0
    bit_strings = String[]  # Store binary strings
    
    if mode == "Numeric"
        println("Index\tChar\tBits")  # Print the table header once
        index = 1  # Track character index (1-based, as in analyze_unicode)
        for i in 1:3:length(text)
            num_str = text[i:min(i+2, end)]  # Group 1-3 digits
            num_value = parse(Int, num_str)
            
            # Determine bit length
            bit_length = if length(num_str) == 3
                10
            elseif length(num_str) == 2
                7
            else
                4
            end

            binary_value = lpad(string(num_value, base=2), bit_length, '0')  # Convert to binary
            push!(bit_strings, binary_value)  # Store binary string

            # Print debug output: every digit in a group shares the group's bits
            for j in 0:length(num_str)-1
                println("$(index + j)\t$(num_str[j+1])\t$(binary_value)")
            end

            total_bits += bit_length
            index += length(num_str)
        end
    
    elseif mode == "Alphanumeric"
        # Map each of the 45 alphanumeric characters to its value 0-44
        char_map = Dict(zip(collect("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ \$%*+-./:"), 0:44))
        println("Index\tChar\tBits")  # Print the table header once
        index = 1  # Track character index (1-based, as in analyze_unicode)

        for i in 1:2:length(text)
            char1 = text[i]
            val1 = char_map[char1]  # Get mapped value

            if i + 1 <= length(text)  # Pair of characters
                char2 = text[i+1]
                val2 = char_map[char2]
                num_value = val1 * 45 + val2
                bit_length = 11
            else  # Single character
                num_value = val1
                bit_length = 6
            end

            binary_value = lpad(string(num_value, base=2), bit_length, '0')
            push!(bit_strings, binary_value)  # Store binary string

            # Print debug output: both characters of a pair share the pair's bits
            println("$(index)\t$char1\t$binary_value")
            if i + 1 <= length(text)
                println("$(index+1)\t$char2\t$binary_value")
            end

            total_bits += bit_length
            index += (bit_length == 11) ? 2 : 1
        end
        
    
    elseif mode == "Kanji"
        # TODO: Properly implement Shift JIS mapping here
        # Placeholder: emit each Unicode code point in binary. lpad only pads,
        # so code points above 0xFF produce more than 8 bits, and total_bits
        # (8 per character) does not match the length of the emitted bit string.
        # Real Kanji mode packs each Shift JIS value into 13 bits.
        println("Index\tChar\tValues (hex)\tBits")        
        for (i, c) in enumerate(text)
            code_point = UInt32(c)  # Get Unicode code point
            hex_value = uppercase(string(code_point, base=16))  # Convert to hex
            binary_value = lpad(string(code_point, base=2), 8, '0')  # Binary, padded to at least 8 bits
            push!(bit_strings, binary_value)  # Store binary string
        
            println("$i\t$c\t$hex_value\t\t$binary_value")
            total_bits += 8  # Placeholder count (see note above)
        end     

    else  # Byte Mode: each character contributes its UTF-8 bytes, 8 bits per byte
        println("Index\tChar\tValues (hex)\tBits")        
        for (i, c) in enumerate(text)
            bytes = codeunits(string(c))  # UTF-8 bytes of this character
            hex_value = join((uppercase(string(b, base=16, pad=2)) for b in bytes), " ")  # Convert to hex
            binary_value = join(string(b, base=2, pad=8) for b in bytes)  # 8 bits per byte
            push!(bit_strings, binary_value)  # Store binary string
        
            println("$i\t$c\t$hex_value\t\t$binary_value")
            total_bits += 8 * length(bytes)
        end
    end
    
    return mode, join(bit_strings), total_bits
end

encode_text (generic function with 1 method)
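As a cross-check on the grouping rules used in the Numeric and Alphanumeric branches, the following sketch encodes two short inputs without any debug printing. `numeric_bits`, `alphanumeric_bits`, and `ALNUM_VALUE` are hypothetical names introduced for this example only; the sketch omits the mode indicator and character count and covers just the data bits.

```julia
# Numeric mode: groups of 3/2/1 digits become 10/7/4 bits.
function numeric_bits(digits::AbstractString)
    all(isdigit, digits) || throw(ArgumentError("input must contain only 0-9"))
    widths = (4, 7, 10)  # bit width indexed by group length
    groups = Iterators.partition(collect(digits), 3)
    return join(string(parse(Int, join(g)), base=2, pad=widths[length(g)]) for g in groups)
end

# Value 0-44 of each character in the alphanumeric set.
const ALNUM_VALUE = Dict(c => i - 1 for (i, c) in
    enumerate("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ \$%*+-./:"))

# Alphanumeric mode: a pair becomes 11 bits (45*a + b), a trailing single becomes 6 bits.
function alphanumeric_bits(s::AbstractString)
    vals = Int[ALNUM_VALUE[c] for c in s]  # KeyError if c is outside the 45-character set
    pieces = String[]
    for group in Iterators.partition(vals, 2)
        if length(group) == 2
            push!(pieces, string(45 * group[1] + group[2], base=2, pad=11))
        else
            push!(pieces, string(group[1], base=2, pad=6))
        end
    end
    return join(pieces)
end

println(numeric_bits("01234567"))     # groups: 012 | 345 | 67
println(alphanumeric_bits("AC-42"))   # groups: AC | -4 | 2
```

For `"01234567"` the groups are 12, 345, and 67, written in 10, 10, and 7 bits. For `"AC-42"` the pairs evaluate to 10·45 + 12 = 462 and 41·45 + 4 = 1849, followed by the single value 2 in 6 bits.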

In [275]:
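# `text` was redefined in a cell that is not shown; this string is reconstructed from the output table below.
text = "「魔法少女まどか☆マギカ」って、　ИАИ　ｄｅｓｕ　κα？"
encode_text(text)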

Index	Char	CP hex	NM	AM	BM	KM
1	「	U+300C	No	No	Yes	Yes
2	魔	U+9B54	No	No	Yes	Yes
3	法	U+6CD5	No	No	Yes	Yes
4	少	U+5C11	No	No	Yes	Yes
5	女	U+5973	No	No	Yes	Yes
6	ま	U+307E	No	No	Yes	Yes
7	ど	U+3069	No	No	Yes	Yes
8	か	U+304B	No	No	Yes	Yes
9	☆	U+2606	No	No	Yes	Yes
10	マ	U+30DE	No	No	Yes	Yes
11	ギ	U+30AE	No	No	Yes	Yes
12	カ	U+30AB	No	No	Yes	Yes
13	」	U+300D	No	No	Yes	Yes
14	っ	U+3063	No	No	Yes	Yes
15	て	U+3066	No	No	Yes	Yes
16	、	U+3001	No	No	Yes	Yes
17	　	U+3000	No	No	Yes	Yes
18	И	U+418	No	No	Yes	Yes
19	А	U+410	No	No	Yes	Yes
20	И	U+418	No	No	Yes	Yes
21	　	U+3000	No	No	Yes	Yes
22	ｄ	U+FF44	No	No	Yes	Yes
23	ｅ	U+FF45	No	No	Yes	Yes
24	ｓ	U+FF53	No	No	Yes	Yes
25	ｕ	U+FF55	No	No	Yes	Yes
26	　	U+3000	No	No	Yes	Yes
27	κ	U+3BA	No	No	Yes	Yes
28	α	U+3B1	No	No	Yes	Yes
29	？	U+FF1F	No	No	Yes	Yes
Can every character be encoded in:
Mode		Encodable
Numeric		No
Alphanumeric	No
Byte		Yes
Kanji		Yes
Index	Char	Values (hex)	Bits
1	「	300C		11000000001100
2	魔	9B54		1001101101010100
3	法	6CD5		110110011010101
4	少	5C11		1011100000100

("Kanji", "11000000001100100110110101010011011001101010110111000001000110110010111001111000001111110110000011010011100000100101110011000000110110000110111101100001010111011000010101011110000000011011100000110001111000001100110110000000000011100000000000010000011000100000100001000001100011000000000000111111110100010011111111010001011111111101010011111111110101010111000000000000111011101011101100011111111100011111", 232)
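The Kanji branch above is a placeholder, so the bit string it returns is not valid Kanji-mode data. As a hedged sketch of the missing packing step: the standard takes a double-byte Shift JIS value, subtracts an offset that depends on its range, and compresses the result into 13 bits. The function name `kanji_bits` is an assumption, and it takes the Shift JIS value directly, because the Unicode to Shift JIS lookup needs a conversion table that is not part of this notebook.

```julia
# Pack one double-byte Shift JIS value into the 13-bit Kanji-mode form.
function kanji_bits(sjis::Integer)
    offset = if 0x8140 <= sjis <= 0x9FFC
        0x8140
    elseif 0xE040 <= sjis <= 0xEBBF
        0xC140
    else
        throw(ArgumentError("value is not encodable in Kanji mode"))
    end
    v = Int(sjis) - offset
    packed = (v >> 8) * 0xC0 + (v & 0xFF)  # high byte * 0xC0 + low byte
    return string(packed, base=2, pad=13)
end

println(kanji_bits(0x935F))  # Shift JIS value for 点
println(kanji_bits(0xE4AA))  # Shift JIS value for 茗
```

With 13 bits per character, a 29-character segment in which every character has a Shift JIS mapping would carry 29 × 13 = 377 data bits, rather than the 232 reported by the placeholder.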