### Run-length encoding

Implement run-length encoding and decoding.

Run-length encoding (RLE) is a simple form of data compression in which runs
(sequences of the same value repeated consecutively) are replaced by a single
copy of the value and a count.

For example, the 53 characters of the original string below can be represented with only 13.

```
"WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB"  ->  "12WB12W3B24WB"
```

RLE allows the original data to be perfectly reconstructed from
the compressed data, which makes it a lossless form of data compression.

```
"AABCCCDEEEE"  ->  "2AB3CD4E"  ->  "AABCCCDEEEE"
```
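
The decoding direction of the round trip above could be sketched as follows. This is a minimal sketch, not part of the original notebook: the name `rle_decode` is hypothetical, and it assumes the original data contains no digit characters, so every digit in the encoded string belongs to a count.

```python
import re


def rle_decode(encoded: str) -> str:
    """Expand a run-length encoded string such as '2AB3CD4E' (hypothetical helper)."""
    # Each token is an optional decimal count followed by one non-digit character.
    # A missing count means the character occurs exactly once.
    return ''.join(
        char * int(count or 1)
        for count, char in re.findall(r'(\d*)(\D)', encoded)
    )


print(rle_decode('2AB3CD4E'))  # AABCCCDEEEE
```

Because the encoder writes a run of length 1 without a count, the decoder has to treat the count as optional; that is what the `(\d*)` group and the `or 1` fallback are for.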


### Final algorithm

In [55]:
def rle(sen: str) -> str:
    """Return the run-length encoding of sen, ignoring spaces."""
    sen = sen.replace(' ', '')  # remove spaces from the string
    if not sen:  # nothing to encode; also avoids an IndexError on sen[0]
        return ''
    encoded = ''
    count = 0
    p = sen[0]  # character of the current run
    for el in sen:
        if el == p:
            count += 1
        else:
            # the run of p has ended: emit it, leaving out a count of 1
            encoded += p if count == 1 else str(count) + p
            count = 1
        p = el
    # emit the final run
    encoded += p if count == 1 else str(count) + p
    return encoded


print(rle('AABBBCCCC'))  # 2A3B4C
print(rle('WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWB'))  # 12WB12W3B24WB
print(rle('⏰⚽⚽⚽⭐⭐⏰'))  # ⏰3⚽2⭐⏰
print(rle('AABCCCD EEEEG'))  # 2AB3CD4EG

2A3B4C
12WB12W3B24WB
⏰3⚽2⭐⏰
2AB3CD4EG
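
For comparison, the same run detection can be sketched with `itertools.groupby` from the standard library, which groups consecutive equal elements. This is an alternative sketch rather than the notebook's own method; the name `rle_groupby` is hypothetical, and like the function above it drops spaces and omits a count of 1.

```python
from itertools import groupby


def rle_groupby(text: str) -> str:
    """Run-length encode text using itertools.groupby (hypothetical alternative)."""
    text = text.replace(' ', '')  # same convention as above: spaces are dropped
    parts = []
    for char, run in groupby(text):
        n = sum(1 for _ in run)  # length of this run of consecutive equal characters
        parts.append(char if n == 1 else f'{n}{char}')
    return ''.join(parts)


print(rle_groupby('AABCCCD EEEEG'))  # 2AB3CD4EG
```

Collecting the pieces in a list and joining once avoids repeated string concatenation, and an empty input simply yields an empty string.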


### Experimentation

In [38]:
sen = "AABCCCD EEEE"
split = [letter for char in sen.split() for letter in char]
print(split)
occ = {letter : split.count(letter) for letter in split}
print(occ)
#enc = [letter, split.count(letter) for letter in split]
#print(enc)
# enumerate string:
splat = [(i,letter) for i,letter in enumerate(sen)] # takes whitespace char as well !!!
print(splat)
acc = {i:letter for i,letter in enumerate(sen)}
print(acc)
rle = []
i,j = 0,0
letter = sen[0]
print(len(sen))
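# first attempt, kept as recorded: j starts at 0, is never reset when a run ends,
# and the final run is never appended, so the counts below are wrong
for i in range(len(sen)-1):
    if sen[i+1] == sen[i]:
        j += 1
    else:
        app= str(j) + sen[i]
        rle.append(app)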
print(rle)

['A', 'A', 'B', 'C', 'C', 'C', 'D', 'E', 'E', 'E', 'E']
{'A': 2, 'B': 1, 'C': 3, 'D': 1, 'E': 4}
[(0, 'A'), (1, 'A'), (2, 'B'), (3, 'C'), (4, 'C'), (5, 'C'), (6, 'D'), (7, ' '), (8, 'E'), (9, 'E'), (10, 'E'), (11, 'E')]
{0: 'A', 1: 'A', 2: 'B', 3: 'C', 4: 'C', 5: 'C', 6: 'D', 7: ' ', 8: 'E', 9: 'E', 10: 'E', 11: 'E'}
12
['1A', '1B', '3C', '3D', '3 ']


In [24]:
sen = '⏰⚽⚽⚽⭐⭐⏰'#"AABCCCD EEEE"
split = [letter for char in sen.split() for letter in char]
print(split)
occ = {letter : split.count(letter) for letter in split}
print(occ)
#enc = [letter, split.count(letter) for letter in split]
#print(enc)
# enumerate string:
rl = ''
for key,val in occ.items():
    if val == 1:
        rl += str(key)  # don't show a count of 1
    else:
        rl += str(val) + str(key)
print(rl)
splat = [(i,letter) for i,letter in enumerate(sen)] # takes whitespace char as well !!!
print(splat)
acc = {i:letter for i,letter in enumerate(sen)}
print(acc)
rle = []
i,j = 0,0
letter = sen[0]
print(len(sen))
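# same first attempt as before: j starts at 0, is never reset when a run ends,
# and the final run is never appended, so the counts below are wrong
for i in range(len(sen)-1):
    if sen[i+1] == sen[i]:
        j += 1
    else:
        app= str(j) + sen[i]
        rle.append(app)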
print(rle)

['⏰', '⚽', '⚽', '⚽', '⭐', '⭐', '⏰']
{'⏰': 2, '⚽': 3, '⭐': 2}
2⏰3⚽2⭐
[(0, '⏰'), (1, '⚽'), (2, '⚽'), (3, '⚽'), (4, '⭐'), (5, '⭐'), (6, '⏰')]
{0: '⏰', 1: '⚽', 2: '⚽', 3: '⚽', 4: '⭐', 5: '⭐', 6: '⏰'}
7
['0⏰', '2⚽', '3⭐']


In [25]:
# counting occurrences loses the left-to-right sequence: the counts are global, not per run

sen = '⏰⚽⚽⚽⭐⭐⏰'#"AABCCCD EEEE"
split = [letter for char in sen.split() for letter in char]
print(split)
occ = {letter : split.count(letter) for letter in split}
print(occ)
#enc = [letter, split.count(letter) for letter in split]
#print(enc)
# enumerate string:
rl = ''
for key,val in occ.items():
    if val == 1:
        rl += str(key)  # don't show a count of 1
    else:
        rl += str(val) + str(key)
print(rl)

['⏰', '⚽', '⚽', '⚽', '⭐', '⭐', '⏰']
{'⏰': 2, '⚽': 3, '⭐': 2}
2⏰3⚽2⭐
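
The output above merges the two separate ⏰ runs into a single `2⏰`. The difference between global counts and per-run counts can be illustrated with a small sketch; the input string and the variable names here are hypothetical, and `collections.Counter` and `itertools.groupby` are used only for illustration.

```python
from collections import Counter
from itertools import groupby

text = 'ABBBCCA'  # hypothetical input where 'A' appears in two separate runs

# Global counts merge both 'A' runs into one entry and lose their positions.
global_counts = dict(Counter(text))

# Per-run counts keep the left-to-right order and treat each run separately.
run_counts = [(char, len(list(run))) for char, run in groupby(text)]

print(global_counts)  # {'A': 2, 'B': 3, 'C': 2}
print(run_counts)     # [('A', 1), ('B', 3), ('C', 2), ('A', 1)]
```

Only the per-run view carries enough information to reconstruct the original string, which is why the encoder has to walk the string in order.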


In [45]:
sen = '⏰⚽⚽⚽⭐ ⭐⏰'
sen.replace(' ','')

'⏰⚽⚽⚽⭐⭐⏰'

In [54]:
# iterate over the string directly, without a list comprehension

sen = '⏰⚽⚽⚽⭐⭐⏰⏰'
rle = ''
count = 0
p = sen[0]
for el in sen:
    #print(el)
    if el == p:
        count += 1
        #p = el
    else:
        part = str(count) + p
        rle += part
        print(rle)
        count = 1
        #p = el
    p = el
rle += str(count) + el
print(rle)

1⏰
1⏰3⚽
1⏰3⚽2⭐
1⏰3⚽2⭐2⏰
