# Problem 89
## [Roman numerals](https://projecteuler.net/problem=89)

<p>For a number written in Roman numerals to be considered valid there are basic rules which must be followed. Even though the rules allow some numbers to be expressed in more than one way there is always a "best" way of writing a particular number.</p>
<p>For example, it would appear that there are at least six ways of writing the number sixteen:</p>
<p class="margin_left monospace">IIIIIIIIIIIIIIII<br />
VIIIIIIIIIII<br />
VVIIIIII<br />
XIIIIII<br />
VVVI<br />
XVI</p>
<p>However, according to the rules only <span class="monospace">XIIIIII</span> and <span class="monospace">XVI</span> are valid, and the last example is considered to be the most efficient, as it uses the least number of numerals.</p>
<p>The 11K text file, <a href="project/resources/p089_roman.txt">roman.txt</a> (right click and 'Save Link/Target As...'), contains one thousand numbers written in valid, but not necessarily minimal, Roman numerals; see <a href="about=roman_numerals">About... Roman Numerals</a> for the definitive rules for this problem.</p>
<p>Find the number of characters saved by writing each of these in their minimal form.</p>
<p class="smaller">Note: You can assume that all the Roman numerals in the file contain no more than four consecutive identical units.</p>
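
As a quick illustration of what "characters saved" means, the sketch below compares the two valid spellings of sixteen from the statement above. The variable names are illustrative only and are not used elsewhere in the notebook.

```python
# The two valid ways of writing sixteen quoted in the problem statement.
valid_forms = ["XIIIIII", "XVI"]

# The minimal form is the one that uses the fewest numerals.
minimal_form = min(valid_forms, key=len)

# Characters saved by rewriting the longer valid form in minimal form.
saved = len("XIIIIII") - len(minimal_form)
print(minimal_form, saved)
```

The answer to the problem is this difference summed over every numeral in the file.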


In [1]:
# Read the numerals, one per line, from the downloaded resource file.
with open("p089_roman.txt", "r") as f:
    romans = f.read().splitlines()

In [3]:
chars = {
    "M": 1000,
    "D": 500,
    "C": 100,
    "L": 50,
    "X": 10,
    "V": 5,
    "I": 1
}

In [4]:
letters = ["M", "D", "C", "L", "X", "V", "I"]

In [5]:
romans[0]

'MMMMDCLXXII'

In [6]:
subs = {
    "IX": 9,
    "IV": 4,
    "XC": 90,
    "XL": 40,
    "CD": 400,
    "CM": 900
}

In [29]:
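def parse(n):
    """Return the integer value of the Roman numeral string n."""
    s = 0
    # A valid numeral contains each subtractive pair at most once:
    # add its value and strip it from the string.
    for k, v in subs.items():
        if k in n:
            s += v
            n = n.replace(k, "", 1)
    # Whatever is left is purely additive.
    for c in n:
        s += chars[c]
    return s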
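
As a cross-check on `parse`, here is a minimal sketch of a different technique: a single left-to-right pass that subtracts a symbol whenever a larger symbol follows it. The names `VALUES` and `parse_single_pass` are hypothetical and not part of the original notebook, and the sketch assumes the input is a valid numeral.

```python
# Hypothetical alternative to parse(); assumes a valid Roman numeral.
VALUES = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}


def parse_single_pass(numeral):
    """Sum symbol values, subtracting any symbol that precedes a larger one."""
    total = 0
    for i, ch in enumerate(numeral):
        value = VALUES[ch]
        # A smaller symbol directly before a larger one is subtractive (e.g. IX).
        if i + 1 < len(numeral) and value < VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


print(parse_single_pass("MMMMDCLXXII"))
```

This version needs no table of subtractive pairs and never modifies the string, which can make it easier to reason about.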

In [45]:
units = {
    1000: "M",
    900: "CM",
    500: "D",
    400: "CD",
    100: "C",
    90: "XC",
    50: "L",
    40: "XL",
    10: "X",
    9: "IX",
    5: "V",
    4: "IV",
    1: "I"
}

In [46]:
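def romanize(n):
    """Return the minimal Roman numeral for the positive integer n."""
    s = ""
    # units is ordered from largest to smallest value, so one greedy pass
    # is enough; no outer loop is needed.
    for k, v in units.items():
        while n >= k:
            s += v
            n -= k
    return s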


In [47]:
romanize(4314)

'MMMMCCCXIV'

In [50]:
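def solution():
    """Return the total number of characters saved across all numerals."""
    c = 0
    for r in romans:
        n = parse(r)
        minimal = romanize(n)
        # Only non-minimal numerals contribute; print them for inspection.
        if len(minimal) < len(r):
            print(r, n, minimal)
            c += len(r) - len(minimal)
    return c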

In [51]:
solution()

MMMDLXVIIII 3569 MMMDLXIX
MMCCCLXXXXIX 2399 MMCCCXCIX
MDCCCXXIIII 1824 MDCCCXXIV
MMMMDCCCCI 4901 MMMMCMI
MCCLXXVIIII 1279 MCCLXXIX
MMMMCCXXXXI 4241 MMMMCCXLI
MMMDCCCCXXXIV 3934 MMMCMXXXIV
CDXVIIII 419 CDXIX
MMMMCCCLXXXXVI 4396 MMMMCCCXCVI
MMMDCCCVIIII 3809 MMMDCCCIX
DCCLXXXIIII 784 DCCLXXXIV
MDCCCCXXXII 1932 MCMXXXII
MMMMCMLXXXXVIII 4998 MMMMCMXCVIII
MMDCCCLXXXIIII 2884 MMDCCCLXXXIV
MMCCCCXXXXV 2445 MMCDXLV
MMMMDLXXXVIIII 4589 MMMMDLXXXIX
MMDCCCCLXXVI 2976 MMCMLXXVI
MCCCCLXX 1470 MCDLXX
MMCDLVIIII 2459 MMCDLIX
MMMMCCCCXIX 4419 MMMMCDXIX
MMMMCDLXXXIIII 4484 MMMMCDLXXXIV
MMMCCCCXXX 3430 MMMCDXXX
CLXXXXI 191 CXCI
DLXXXX 590 DXC
MMMMDLXXXXVIII 4598 MMMMDXCVIII
DCCCCXLIII 943 CMXLIII
MMMMCCCCXI 4411 MMMMCDXI
MMCCCCXI 2411 MMCDXI
MMMMDXXXXII 4542 MMMMDXLII
MMMCCCCXV 3415 MMMCDXV
MMDCCCCXVI 2916 MMCMXVI
CCLIIII 254 CCLIV
MMDCCLXXXVIIII 2789 MMDCCLXXXIX
MMCCLXIIII 2264 MMCCLXIV
CMLXXXXI 991 CMXCI
MXVIIII 1019 MXIX
MCCCCLXVIII 1468 MCDLXVIII
MMMDCCLXXIIII 3774 MMMDCCLXXIV
MMMMCCCLXXXXVII 4397 M

743
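
The result can also be cross-checked without converting to integers at all. The sketch below is an alternative technique, not the method used above: it rewrites the additive runs directly as subtractive pairs. It assumes the only way a numeral in the file departs from minimal form is a run of four identical units (optionally preceded by `D`, `L` or `V`), which matches the note in the problem statement. The names `REPLACEMENTS`, `shorten` and `chars_saved` are hypothetical.

```python
# Hypothetical cross-check: shorten numerals by direct string replacement.
# Longer patterns come first so that DCCCC becomes CM rather than DCD.
REPLACEMENTS = [
    ("DCCCC", "CM"),
    ("LXXXX", "XC"),
    ("VIIII", "IX"),
    ("CCCC", "CD"),
    ("XXXX", "XL"),
    ("IIII", "IV"),
]


def shorten(numeral):
    """Rewrite additive runs of four units as subtractive pairs."""
    for long_form, short_form in REPLACEMENTS:
        numeral = numeral.replace(long_form, short_form)
    return numeral


def chars_saved(numerals):
    """Total characters saved by shortening every numeral in the iterable."""
    return sum(len(r) - len(shorten(r)) for r in numerals)


print(shorten("MMMMDCCCCI"), chars_saved(["MMMDLXVIIII", "XVI"]))
```

If the assumption holds for the whole file, `chars_saved(romans)` should agree with the total returned by `solution()`.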