## ASCII Art Compression

Use the `encodeString` and `decodeString` functions from the Chapter 4 challenge, provided below.

Read in the ASCII art text file `10_04_challenge_art.txt` and write it back to a new file that is smaller than the original.
For reference, the original `10_04_challenge_art.txt` has a file size of 2.757 kB (2,757 ASCII characters).

- Any compression is great!
- Is there any way you could get this file to 1 kB?
- Less than 1 kB?

After compressing the file, make sure to check your work by opening and decoding it again!



In [None]:
import os

def encodeString(stringVal):
    encodedList = []
    prevChar = None
    count = 0
    for char in stringVal:
        if prevChar != char and prevChar is not None:
            encodedList.append((prevChar, count))
            count = 0
        prevChar = char
        count = count + 1
    encodedList.append((prevChar, count))
    return encodedList

def decodeString(encodedList):
    decodedStr = ''
    for item in encodedList:
        decodedStr = decodedStr + item[0] * item[1]
    return decodedStr

## Explanation of encodeString and decodeString

This code provides two functions: one for encoding a string using run-length encoding (`encodeString`) and one for decoding a run-length encoded list back to a string (`decodeString`).

Let's break it down:

1. **encodeString function**:

    - The function `encodeString` takes a string `stringVal` as its argument.

    - It initializes an empty list `encodedList`, which will store tuples of characters and their respective counts.

    - It also initializes two variables `prevChar` (to keep track of the previous character) and `count` (to count consecutive appearances of a character).

    - The function then iterates through each character of `stringVal`.

    - If the current character (`char`) differs from the previous character (`prevChar`), and `prevChar` is not `None` (that is, this is not the first character), it appends the tuple `(prevChar, count)` to `encodedList` and resets the count to 0.

    - On every iteration, `prevChar` is set to the current character and the count is incremented.

    - After the loop completes, it appends the last character and its count to `encodedList`.

    - The function returns `encodedList`, which is a list of tuples where each tuple consists of a character and its consecutive count.

    For example: `encodeString("aaabbc")` would produce `[('a', 3), ('b', 2), ('c', 1)]`.

2. **decodeString function**:

    - The function `decodeString` takes a list `encodedList` as its argument. This list is expected to have the format produced by `encodeString`, i.e., a list of tuples with characters and counts.

    - It initializes an empty string `decodedStr`, which will store the decoded string.

    - The function then iterates over each tuple in `encodedList`.

    - For each tuple, it multiplies the character by its count and appends the result to `decodedStr`.

    - The function returns `decodedStr`.

    For example: `decodeString([('a', 3), ('b', 2), ('c', 1)])` would produce the string "aaabbc".

In essence, these two functions demonstrate a simple form of lossless data compression called run-length encoding (RLE). RLE is particularly efficient for sequences with long runs of repeated data elements. In this case, the `encodeString` function compresses a string by encoding runs of repeated characters, and the `decodeString` function decompresses the encoded data back into its original string form.
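As a quick sanity check of the two functions described above, the following sketch round-trips a short string. The sample strings are arbitrary and chosen only for illustration; the function bodies are copied from the cell above so the block runs on its own.

```python
def encodeString(stringVal):
    encodedList = []
    prevChar = None
    count = 0
    for char in stringVal:
        if prevChar != char and prevChar is not None:
            encodedList.append((prevChar, count))
            count = 0
        prevChar = char
        count = count + 1
    encodedList.append((prevChar, count))
    return encodedList

def decodeString(encodedList):
    decodedStr = ''
    for item in encodedList:
        decodedStr = decodedStr + item[0] * item[1]
    return decodedStr

sample = "aaabbc"
encoded = encodeString(sample)
print(encoded)  # [('a', 3), ('b', 2), ('c', 1)]

# Decoding the encoded list should give back the original string.
assert decodeString(encoded) == sample

# A string with no repeated characters gains nothing: one tuple per character.
no_runs = encodeString("abc")
print(no_runs)  # [('a', 1), ('b', 1), ('c', 1)]
```

One edge case worth knowing: for an empty string, `encodeString` returns `[(None, 0)]`, and passing that to `decodeString` raises a `TypeError` because `None * 0` is not defined. The art file is not empty, so this does not affect the challenge.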


[Python Tutor link for encodeString](https://pythontutor.com/render.html#code=def%20encodeString%28stringVal%29%3A%0A%20%20%20%20encodedList%20%3D%20%5B%5D%0A%20%20%20%20prevChar%20%3D%20None%0A%20%20%20%20count%20%3D%200%0A%20%20%20%20for%20char%20in%20stringVal%3A%0A%20%20%20%20%20%20%20%20if%20prevChar%20!%3D%20char%20and%20prevChar%20is%20not%20None%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20encodedList.append%28%28prevChar,%20count%29%29%0A%20%20%20%20%20%20%20%20%20%20%20%20count%20%3D%200%0A%20%20%20%20%20%20%20%20prevChar%20%3D%20char%0A%20%20%20%20%20%20%20%20count%20%3D%20count%20%2B%201%0A%20%20%20%20encodedList.append%28%28prevChar,%20count%29%29%0A%20%20%20%20return%20encodedList%0A%20%20%20%20%0AencodeString%28%22aaabbc%22%29&cumulative=false&curInstr=34&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

[Python Tutor link for decodeString](https://pythontutor.com/render.html#code=def%20decodeString%28encodedList%29%3A%0A%20%20%20%20decodedStr%20%3D%20''%0A%20%20%20%20for%20item%20in%20encodedList%3A%0A%20%20%20%20%20%20%20%20decodedStr%20%3D%20decodedStr%20%2B%20item%5B0%5D%20*%20item%5B1%5D%0A%20%20%20%20return%20decodedStr%0A%20%20%20%20%0AdecodeString%28%5B%28'a',%203%29,%20%28'b',%202%29,%20%28'c',%201%29%5D%29&cumulative=false&curInstr=12&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

In [15]:
import os
import json
import zlib

def encodeString(stringVal):
    encodedList = []
    prevChar = None
    count = 0
    for char in stringVal:
        if prevChar != char and prevChar is not None:
            encodedList.append((prevChar, count))
            count = 0
        prevChar = char
        count = count + 1
    encodedList.append((prevChar, count))
    return encodedList

def decodeString(encodedList):
    decodedStr = ''
    for item in encodedList:
        decodedStr = decodedStr + item[0] * item[1]
    return decodedStr



art_file = '../Exercise Files/exercise_files/10_04_challenge_art.txt'
encoded_art = 'encoded-art.json'

def encodeFile(filename, newFilename):
    # read the whole art file into a string
    with open(filename, 'r') as file:
        e_file = file.read()
    # run-length encode the text with encodeString
    data = encodeString(e_file)
    # convert the tuples to lists in data
    data_as_list = [list(item) for item in data]
    # write data_as_list to newFilename as JSON
    # (the original hard-coded 'encoded-art.json' here and ignored newFilename)
    with open(newFilename, 'w') as outfile:
        json.dump(data_as_list, outfile)



print(f'Original file size: {os.path.getsize(art_file)}')

encodeFile(art_file, encoded_art)

print(f'New file size: {os.path.getsize(encoded_art)}')



def decodeFile(filename):
    # read the encoded json file, assign the read data to a d_file, a list
    with open(filename, 'r') as file:
        d_file = json.load(file)
    # convert list from json back to list of tuples
    loaded_data = [tuple(item) for item in d_file]
    # pass loaded_data into decodeString
    result = decodeString(loaded_data)
    print(result)
    



decodeFile(encoded_art)


Original file size: 2757
New file size: 2441

                                                                                
                                                                                
                               %%%%%%%%%%%%%%%%%%%                              
                        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%                       
                    %%%%%%%%                         %%%%%%%%                   
                %%%%%%%                                   %%%%%%                
              %%%%%%                                         %%%%%%             
           %%%%%%                                               %%%%%           
          %%%%%                                                   %%%%%         
        %%%%%                                                       %%%%%       
       %%%%                 %%%%%              %%%%%                  %%%%      
      %%%%                 %%%%%%%            %%%%%%%          
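The reduction above is modest (2757 to 2441 bytes). One likely contributor is the text overhead JSON adds to every run: brackets, quotes, and separators. The sketch below compares `json.dumps` with its default separators against the compact `separators=(",", ":")` form. The `runs` list is a hand-written stand-in modeled on one row of the art, not data read from the actual file.

```python
import json

# Hypothetical sample: the runs for one row of the art, written by hand.
runs = [[" ", 31], ["%", 19], [" ", 30], ["\n", 1]]

# Number of characters the row occupies in the original text.
row_length = sum(count for _, count in runs)

default_text = json.dumps(runs)                         # ", " and ": " separators
compact_text = json.dumps(runs, separators=(",", ":"))  # no padding spaces

print(f"row: {row_length} characters")
print(f"default JSON: {len(default_text)} characters")
print(f"compact JSON: {len(compact_text)} characters")

# Both forms parse back to the same list, so nothing is lost.
assert json.loads(default_text) == json.loads(compact_text) == runs
```

The saving per run is small, but it applies to every run in the file. Rows made of many short runs benefit least from run-length encoding, because each run costs several characters of JSON regardless of its length.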

In [None]:
print(f'Original file size: {os.path.getsize("10_04_challenge_art.txt")}')

encodeFile('10_04_challenge_art.txt', '10_04_challenge_art_encoded.txt')

print(f'New file size: {os.path.getsize("10_04_challenge_art_encoded.txt")}')


In [None]:
decodeFile('10_04_challenge_art_encoded.txt')

In [19]:
import os
import json
import zlib

def encodeString(stringVal):
    encodedList = []
    prevChar = None
    count = 0
    for char in stringVal:
        if prevChar != char and prevChar is not None:
            encodedList.append((prevChar, count))
            count = 0
        prevChar = char
        count = count + 1
    encodedList.append((prevChar, count))
    return encodedList

def decodeString(encodedList):
    decodedStr = ''
    for item in encodedList:
        decodedStr = decodedStr + item[0] * item[1]
    return decodedStr


art_file = '../Exercise Files/exercise_files/10_04_challenge_art.txt'
encoded_art = 'encoded-art.json'

def encodeFile(filename, newFilename):
    # open filename, use file.read() to read it
    with open(filename, 'r') as file:
        e_file = file.read()
    # run-length encode the file contents with encodeString
    data = encodeString(e_file)
    # convert the tuples to lists in data
    data_as_list = [list(item) for item in data]
   
    # Compress the data with the zlib library
    compressed_data = zlib.compress(json.dumps(data_as_list).encode('utf-8'))
    with open(newFilename, 'wb') as outfile:  # Notice 'wb' for binary write mode
        outfile.write(compressed_data)

print(f'Original file size: {os.path.getsize(art_file)}')

encodeFile(art_file, encoded_art)

print(f'New file size: {os.path.getsize(encoded_art)}')


def decodeFile(filename):
    # read the compressed bytes from the file (binary mode)
    with open(filename, 'rb') as file:
        compressed_data = file.read()
    
    # Decompress the data before loading from JSON
    decompressed_data = zlib.decompress(compressed_data).decode('utf-8')
    
    d_file = json.loads(decompressed_data)
    # convert back to a tuple
    loaded_data = [tuple(item) for item in d_file]
    result = decodeString(loaded_data)
    print(result)
    

decodeFile(encoded_art)


Original file size: 2757
New file size: 309

                                                                                
                                                                                
                               %%%%%%%%%%%%%%%%%%%                              
                        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%                       
                    %%%%%%%%                         %%%%%%%%                   
                %%%%%%%                                   %%%%%%                
              %%%%%%                                         %%%%%%             
           %%%%%%                                               %%%%%           
          %%%%%                                                   %%%%%         
        %%%%%                                                       %%%%%       
       %%%%                 %%%%%              %%%%%                  %%%%      
      %%%%                 %%%%%%%            %%%%%%%           

Further compression can be added to the code. One simple method is to use the `zlib` module, which is part of the Python standard library. `zlib` provides functions for compression and decompression using the DEFLATE algorithm.

Here's how you could modify your code to incorporate `zlib` compression:

1. **Before writing the encoded data to the output file**: Serialize it to a JSON string, then compress the resulting bytes using `zlib.compress()`.

2. **After reading the data from the file**: Decompress the bytes using `zlib.decompress()`, then parse the JSON.

Let's modify your code:

```python
import os
import json
import zlib

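def encodeString(stringVal):
    # same as before
    encodedList = []
    prevChar = None
    count = 0
    for char in stringVal:
        if prevChar != char and prevChar is not None:
            encodedList.append((prevChar, count))
            count = 0
        prevChar = char
        count = count + 1
    encodedList.append((prevChar, count))
    return encodedList

def decodeString(encodedList):
    # same as before
    decodedStr = ''
    for item in encodedList:
        decodedStr = decodedStr + item[0] * item[1]
    return decodedStr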

art_file = '../Exercise Files/exercise_files/10_04_challenge_art.txt'
encoded_art = 'encoded-art.json'

def encodeFile(filename, newFilename):
    with open(filename, 'r') as file:
        e_file = file.read()
    data = encodeString(e_file)
    data_as_list = [list(item) for item in data]
    
    # Convert the list to string and compress before writing
    compressed_data = zlib.compress(json.dumps(data_as_list).encode('utf-8'))
    with open(newFilename, 'wb') as outfile:  # Notice 'wb' for binary write mode
        outfile.write(compressed_data)

encodeFile(art_file, encoded_art)

def decodeFile(filename):
    with open(filename, 'rb') as file:  # Notice 'rb' for binary read mode
        compressed_data = file.read()
    
    # Decompress the data before loading from JSON
    decompressed_data = zlib.decompress(compressed_data).decode('utf-8')
    d_file = json.loads(decompressed_data)
    loaded_data = [tuple(item) for item in d_file]
    result = decodeString(loaded_data)
    print(result)

decodeFile(encoded_art)
```

Key changes:

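- The data is serialized to JSON and compressed after encoding and before writing to the file. The output is no longer plain JSON but zlib-compressed bytes, even though the filename still ends in `.json`.
- The data is decompressed after reading from the file and before JSON parsing and decoding.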
- Since compressed data is binary, the file read and write modes have been updated to binary modes (`'rb'` and `'wb'`, respectively).
  
In the run above, the combination of run-length encoding and `zlib` compression reduced the artwork file from 2,757 bytes to 309 bytes. How much a given file shrinks depends on how many repeated sequences it contains.
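The challenge asks you to check your work by decoding the file again. Printing the art and inspecting it by eye works, but an equality check is stricter. The following is a minimal sketch of such a check, under some assumptions: `compress_text` and `decompress_text` are hypothetical helper names (not part of the code above) that return values instead of printing, and the `art` string is a small stand-in for the real file so the block runs without it.

```python
import json
import os
import tempfile
import zlib

def encodeString(stringVal):
    encodedList = []
    prevChar = None
    count = 0
    for char in stringVal:
        if prevChar != char and prevChar is not None:
            encodedList.append((prevChar, count))
            count = 0
        prevChar = char
        count = count + 1
    encodedList.append((prevChar, count))
    return encodedList

def decodeString(encodedList):
    decodedStr = ''
    for item in encodedList:
        decodedStr = decodedStr + item[0] * item[1]
    return decodedStr

def compress_text(text):
    # Hypothetical helper: RLE-encode, serialize to JSON, then zlib-compress.
    runs = encodeString(text)
    return zlib.compress(json.dumps(runs).encode('utf-8'))

def decompress_text(blob):
    # Hypothetical helper: reverse the three steps of compress_text.
    runs = json.loads(zlib.decompress(blob).decode('utf-8'))
    return decodeString(runs)

# Stand-in for the art file: two blank rows and three rows with a bar of '%'.
art = (" " * 80 + "\n") * 2 + (" " * 30 + "%" * 20 + " " * 30 + "\n") * 3

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "encoded-art.bin")
    with open(path, "wb") as f:      # binary mode, as in encodeFile above
        f.write(compress_text(art))
    with open(path, "rb") as f:
        restored = decompress_text(f.read())
    compressed_size = os.path.getsize(path)

# The decoded text must match the original exactly, not just look similar.
assert restored == art
print(f"original: {len(art)} characters, compressed: {compressed_size} bytes")
```

To apply the same check to the real file, read `10_04_challenge_art.txt` into `art` instead of building the stand-in string.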