# Roman numerals

> For a number written in Roman numerals to be considered valid there are basic rules which must be followed. Even though the rules allow some numbers to be expressed in more than one way there is always a "best" way of writing a particular number.
>
> For example, it would appear that there are at least six ways of writing the number sixteen:
>
> IIIIIIIIIIIIIIII  
> VIIIIIIIIIII  
> VVIIIIII  
> XIIIIII  
> VVVI  
> XVI
>
> However, according to the rules only XIIIIII and XVI are valid, and the last example is considered to be the most efficient, as it uses the least number of numerals.
>
> The 11K text file, roman.txt (right click and 'Save Link/Target As...'), contains one thousand numbers written in valid, but not necessarily minimal, Roman numerals; see [About... Roman Numerals](https://projecteuler.net/about=roman_numerals) for the definitive rules for this problem.
>
> Find the number of characters saved by writing each of these in their minimal form.
>
> Note: You can assume that all the Roman numerals in the file contain no more than four consecutive identical units.


I’ll approach this problem by parsing each Roman numeral string into an integer, converting that integer back into a minimal Roman numeral string, and comparing the total length of the original strings with the total length of the minimal ones.

For parsing the strings, I'll iterate over the character indices of the string (without assuming that each character is one byte wide, even though these all are). At each index I add the value of the current ‘digit’ to a running total, or subtract it when the digit is smaller than the one that follows it. Observe that the running total can go negative along the way, even though the final total is always positive.

In [1]:
function parseroman(s)
    R = Dict('M'=>1000, 'D'=>500, 'C'=>100, 'L'=>50, 'X'=>10, 'V'=>5, 'I'=>1)
    value = zero(Int)
    for i in eachindex(s)
        if i < lastindex(s) && R[s[i]] < R[s[nextind(s, i)]]
            value -= R[s[i]]
        else
            value += R[s[i]]
        end
    end
    value
end

parseroman (generic function with 1 method)
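
To make the point about the running total concrete, here is a small sketch that records the total after each character. `traceroman` is a hypothetical helper introduced only for this illustration; it repeats the loop from `parseroman` and is not needed for the solution.

```julia
# Hypothetical helper (not part of the solution): same loop as parseroman,
# but it returns the running total after each character.
function traceroman(s)
    R = Dict('M'=>1000, 'D'=>500, 'C'=>100, 'L'=>50, 'X'=>10, 'V'=>5, 'I'=>1)
    value = 0
    totals = Int[]
    for i in eachindex(s)
        if i < lastindex(s) && R[s[i]] < R[s[nextind(s, i)]]
            value -= R[s[i]]   # smaller digit before a larger one: subtract
        else
            value += R[s[i]]
        end
        push!(totals, value)
    end
    totals
end

traceroman("IV")       # [-1, 4]
traceroman("MCMXCIV")  # [1000, 900, 1900, 1890, 1990, 1989, 1994]
```

For `IV` the total is briefly -1 before the `V` brings it to 4.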

For turning numbers back into Roman numeral strings, I’ll break the number down into the number of 1000s it contains, the number of 500s, and so on. Because I pull out the larger denominations first, `D`, the number of 500s, can only be 0 or 1 (as can `L` and `V`), while `C`, `X`, and `I` can each be at most 4. I can then build the string up from there, left to right, starting with `M` ‘M’s. The helper `f` then handles the 5s and 1s at each of the three powers of 10: 10<sup>2</sup>, 10<sup>1</sup>, and 10<sup>0</sup>. If there are four ones, we write the subtractive form, placing one unit before the five or the ten (for example `IV` or `IX`); otherwise we write the five, if any, followed by the ones. Julia strings are immutable, so we store our substrings in an array and join up the pieces at the end.

In [2]:
function toroman(n)
    M, n = divrem(n, 1000)
    D, n = divrem(n, 500)
    C, n = divrem(n, 100)
    L, n = divrem(n, 50)
    X, n = divrem(n, 10)
    V, I = divrem(n, 5)

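    # Render one decimal place: `a` fives and `b` ones, where `c` holds the
    # ten, five, and one symbols for that place.
    function f(a, b, c)
        d = String[]
        if b == 4
            # Four ones become the subtractive form: one before ten if a five
            # is also present (e.g. IX), otherwise one before five (e.g. IV).
            push!(d, string(c[3], c[a == 1 ? 1 : 2]))
            a, b = 0, 0
        end
        push!(d, c[2]^a, c[3]^b)
        d
    end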

    digits = [ 'M'^M ]
    append!(digits, f(D, C, ['M', 'D', 'C']))
    append!(digits, f(L, X, ['C', 'L', 'X']))
    append!(digits, f(V, I, ['X', 'V', 'I']))

    join(digits)
end

toroman (generic function with 1 method)
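
As a worked example of the breakdown step, the sketch below runs the same `divrem` chain and returns the counts instead of a string. `decompose` is a hypothetical name used only for this illustration.

```julia
# Hypothetical helper for illustration only: the same divrem chain as toroman,
# returned as a named tuple instead of being turned into a string.
function decompose(n)
    M, n = divrem(n, 1000)
    D, n = divrem(n, 500)
    C, n = divrem(n, 100)
    L, n = divrem(n, 50)
    X, n = divrem(n, 10)
    V, I = divrem(n, 5)
    (M=M, D=D, C=C, L=L, X=X, V=V, I=I)
end

decompose(1994)  # (M = 1, D = 1, C = 4, L = 1, X = 4, V = 0, I = 4)
decompose(16)    # (M = 0, D = 0, C = 0, L = 0, X = 1, V = 1, I = 1)
```

For 1994, `C` and `X` are both 4 with `D` and `L` equal to 1, so `f` emits `CM` and `XC`; `I` is 4 with `V` equal to 0, so it emits `IV`, giving `MCMXCIV`.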

To get the answer to this problem, I’ll iterate through the lines of the supplied file, recording the length of each. Julia understands composing functions with the ∘ operator, so I can round-trip each Roman numeral with `(toroman ∘ parseroman)(…)` instead of `toroman(parseroman(…))`. Nice. The answer is the difference between the lengths of the given numerals and the ones I produced.

In [3]:
given = 0
computed = 0
for r in eachline("p089_roman.txt")
    given += length(r)
    computed += length((toroman ∘ parseroman)(r))
end
println("numbers expressed with $given characters, redone with $computed characters")
given - computed

numbers expressed with 8850 characters, redone with 8107 characters


743
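
For readers who have not met `∘` before, here is a minimal self-contained illustration using only Base functions rather than the notebook's own. The name `digitcount` is made up for this example.

```julia
# f ∘ g builds a new function that applies g first, then f.
digitcount = length ∘ string   # `digitcount` is a name made up for this example

@assert digitcount(1994) == length(string(1994)) == 4

# A composed function can be passed anywhere a function is expected.
@assert sum(length ∘ string, [7, 42, 1994]) == 7
```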

## Implementing a Custom Type

What if we could use the system `parse` and `convert` functions? We can, if we write our own methods for them on a custom `Roman` type.

In [4]:
primitive type Roman <: Integer 64 end
Roman(x::Int64) = reinterpret(Roman, x)
Int64(x::Roman) = reinterpret(Int64, x)
Base.convert(::Type{Roman}, x) = Roman(x)
Base.convert(::Type{Int64}, x::Roman) = Int64(x)

import Base: +
+(a::Roman, b::Roman) = Roman(Int64(a) + Int64(b))
+(a::Roman, b::Int64) = Roman(Int64(a) + b)
+(a::Int64, b::Roman) = Roman(a + Int64(b))

import Base: -
-(a::Roman, b::Roman) = Roman(Int64(a) - Int64(b))
-(a::Roman, b::Int64) = Roman(Int64(a) - b)
-(a::Int64, b::Roman) = Roman(a - Int64(b))

import Base: <
<(a::Roman, b::Roman) = isless(Int64(a), Int64(b))

function Base.parse(::Type{Roman}, c::AbstractChar)
    R = Dict('M'=>1000, 'D'=>500, 'C'=>100, 'L'=>50, 'X'=>10, 'V'=>5, 'I'=>1)
    haskey(R, uppercase(c)) || throw(DomainError(c, "not a Roman numeral"))
    reinterpret(Roman, R[uppercase(c)])
end

function Base.parse(::Type{Roman}, s::AbstractString)
    value = Roman(0)  # accumulate as a Roman so the variable keeps one type
    for i in eachindex(s)
        digit = parse(Roman, s[i])
        # A digit smaller than its successor is subtracted (e.g. the I in IV).
        if i < lastindex(s) && digit < parse(Roman, s[nextind(s, i)])
            value -= digit
        else
            value += digit
        end
    end
    value
end

function Base.String(x::Roman)
    n = Int64(x)
    M, n = divrem(n, 1000)
    D, n = divrem(n, 500)
    C, n = divrem(n, 100)
    L, n = divrem(n, 50)
    X, n = divrem(n, 10)
    V, I = divrem(n, 5)

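    # Render one decimal place: `a` fives and `b` ones, where `c` holds the
    # ten, five, and one symbols for that place.
    function f(a, b, c)
        d = String[]
        if b == 4
            # Four ones become the subtractive form: one before ten if a five
            # is also present (e.g. IX), otherwise one before five (e.g. IV).
            push!(d, string(c[3], c[a == 1 ? 1 : 2]))
            a, b = 0, 0
        end
        push!(d, c[2]^a, c[3]^b)
        d
    end

    digits = [ 'M'^M ]
    append!(digits, f(D, C, ['M', 'D', 'C']))
    append!(digits, f(L, X, ['C', 'L', 'X']))
    append!(digits, f(V, I, ['X', 'V', 'I']))

    join(digits)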
end

Base.length(x::Roman) = length(String(x))
Base.show(io::IO, x::Roman) = print(io, String(x))


In [5]:
@assert sum(length, parse(Roman, r) for r in eachline("p089_roman.txt")) == 8107 "total size of Roman Numerals is wrong"