## ASCII Art Compression


**HINT 1:** Doing something like this will technically meet the requirements of this challenge:

In [26]:
# json.dumps(encodeString(text))

However, I hope you can find a more efficient compression algorithm than that!

**HINT 2:** When you write out a list of tuples, there are many instances of "), (" plus lots of extra quotes and brackets. That is a lot of characters to spend where perhaps a single comma would suffice...

**HINT 3:** If you're looking for a longer challenge, you can look into writing bytes to a file. This is absolutely not necessary, however!

In [27]:
import os
import ast

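def encodeString(stringVal):
    # Run-length encode: collapse each run of repeated characters into (char, count).
    encodedList = []
    if not stringVal:
        return encodedList  # empty input: avoid emitting a bogus (None, 0) pair
    prevChar = None
    count = 0
    for char in stringVal:
        if prevChar != char and prevChar is not None:
            encodedList.append((prevChar, count))
            count = 0
        prevChar = char
        count = count + 1
    encodedList.append((prevChar, count))  # flush the final run
    return encodedList

def decodeString(encodedList):
    # Expand each (char, count) pair back into count copies of char.
    return ''.join(char * count for char, count in encodedList)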

In [28]:
def encodeFile(filename, newFilename):
    # Read the whole file, run-length encode it, and write the list's text form.
    # The with statement closes both files, so no explicit close() is needed.
    with open(filename, 'r') as f1, open(newFilename, 'w') as f2:
        f2.write(str(encodeString(f1.read())))

def decodeFile(filename):
    with open(filename, 'r') as f:
        text = f.read()

    # Parse the stored list literal back into a list of (char, count) tuples
    data_list = ast.literal_eval(text)

    return decodeString(data_list)



In [29]:
print(f'Original file size: {os.path.getsize("10_04_challenge_art.txt")}')

encodeFile('10_04_challenge_art.txt', '10_04_challenge_art_encoded.txt')

print(f'New file size: {os.path.getsize("10_04_challenge_art_encoded.txt")}')

Original file size: 2152
New file size: 2041


In [30]:
print(decodeFile('10_04_challenge_art_encoded.txt'))





                               %%%%%%%%%%%%%%%%%%%
                        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
                    %%%%%%%%                         %%%%%%%%
                %%%%%%%                                   %%%%%%
              %%%%%%                                         %%%%%%
           %%%%%%                                               %%%%%
          %%%%%                                                   %%%%%
        %%%%%                                                       %%%%%
       %%%%                 %%%%%              %%%%%                  %%%%
      %%%%                 %%%%%%%            %%%%%%%                  %%%%
     %%%%                  %%%%%%%            %%%%%%%                   %%%%
    %%%%                   %%%%%%%            %%%%%%%                    %%%%
    %%%%                    %%%%%              %%%%%                     %%%%
   %%%%                                                                   %%%%
   %%%%      

We can see that our file size has dropped from __2152__ to __2041__ bytes, which is not much, but it is still compressed.

Let's see how we can compress it even more.
Currently our encoded text looks like this: `[('A', 2), ('B', 31) ...`. Between each count and the next character there are 5 delimiter characters, `), ('`, and these take up a lot of space.
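To make that overhead concrete, here is a small sketch that measures it on one line of art. The helper `rle_pairs` is a hypothetical stand-in for `encodeString`, built on `itertools.groupby`, and the sample line is made up for illustration.

```python
from itertools import groupby

def rle_pairs(text):
    # Hypothetical stand-in for encodeString: one (char, count) tuple per run.
    return [(char, len(list(group))) for char, group in groupby(text)]

# Illustrative sample: 31 spaces, 19 percent signs, and a newline.
sample = " " * 31 + "%" * 19 + "\n"
as_text = str(rle_pairs(sample))

print(as_text)
print(len(sample), "characters in,", len(as_text), "characters out")
```

Each run costs roughly ten characters in this text form regardless of its length, so short runs actually grow when encoded. That is one plausible reason the encoded file is only slightly smaller than the original.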

So, if we write it more compactly, for example `A|2~B|31~...`, each delimiter (`|` and `~`) is a single character. If our code handles this format correctly, the file compresses even more (to something like __974__ bytes).
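One possible way to read and write the `A|2~B|31~...` layout is sketched below. The function names `encode_compact` and `decode_compact` are hypothetical, not part of the notebook. The decoder walks the string by position instead of splitting on `~`, on the assumption that the art itself may contain `|`, `~`, or digits.

```python
def encode_compact(text):
    # Emit one "char|count~" record per run of identical characters.
    parts = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1
        parts.append(f"{text[i]}|{j - i}~")
        i = j
    return "".join(parts)

def decode_compact(encoded):
    # Each record is: one character, a '|', the count digits, then '~'.
    out = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        end = encoded.index("~", i + 2)  # skip the char and the '|' after it
        out.append(char * int(encoded[i + 2:end]))
        i = end + 1
    return "".join(out)

sample = "AA" + "B" * 31 + "~~||\n"
assert decode_compact(encode_compact(sample)) == sample
```

Because the first character of every record is taken literally, a run of `~` or `|` characters still round-trips without any escaping.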

If you want to compress it even more, converting the data into __bytes__ is another option. A single byte can store any integer from 0 to 255, so rather than using 3 bytes to store `123` as text, you can store it in one byte. Used correctly, this compresses the file a lot (to something like __438__ bytes).
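A minimal sketch of the byte-based idea follows, with the hypothetical names `encode_bytes` and `decode_bytes`. It assumes the art is pure ASCII, stores each run as two bytes (character code, then count), and splits runs longer than 255 so that every count fits in one byte. This is one possible layout, not necessarily the one that produced the figure quoted above.

```python
def encode_bytes(text):
    # Assumption: the input is ASCII, so every character fits in one byte.
    data = text.encode("ascii")
    out = bytearray()
    i = 0
    while i < len(data):
        j = i
        # Cap each run at 255 so the count fits in a single byte.
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        out += bytes((data[i], j - i))
        i = j
    return bytes(out)

def decode_bytes(blob):
    # Read (char code, count) byte pairs and expand each run.
    out = bytearray()
    for k in range(0, len(blob), 2):
        out += bytes([blob[k]]) * blob[k + 1]
    return out.decode("ascii")

sample = " " * 31 + "%" * 19 + "\n"
assert decode_bytes(encode_bytes(sample)) == sample
```

To store the result on disk, the file would need to be opened in binary mode (`'wb'` for writing, `'rb'` for reading), since the output is a `bytes` object and not a string.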

> These small additional savings become really helpful when storing large amounts of data.

In [31]:
# haha car go vroom vroom boom boom brrrrrrr......