# WordArt

Written by John Wallin

This is a short code that changes text into art.  The basic idea is to use
an image as a mask for text, so the image is showing through the text. 
It allows you to experiment with graphical representations of text.
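
The masking idea can be sketched on a tiny scale without any image library. This is only an illustration: the `mask` grid, the sample `text`, and the `fill_mask` helper are all made up for this example and are not part of the project code, which reads the mask from an image instead.

```python
# Toy mask: 1 means "show a character here", 0 means "leave a blank".
mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
text = "MARLEYWASDEAD"


def fill_mask(mask, text):
    """Walk the mask row by row, consuming one character per 'on' cell."""
    rows = []
    position = 0  # index of the next unused character in the text
    for mask_row in mask:
        line = ""
        for cell in mask_row:
            if cell:
                line += text[position]
                position += 1
            else:
                line += " "
        rows.append(line)
    return rows


rows = fill_mask(mask, text)
print("\n".join(rows))
```

Each row of the mask becomes one line of text, and the story keeps flowing from one "on" cell to the next, skipping over the blank cells.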

The text used in this project is downloaded from Project Gutenberg, 
but you could use other sources as well.  The art templates
could come from anywhere, but for the simple examples I have
tried, a black-and-white stencil created in your favorite
painting or drawing program works well.

For the graphics, simple is better. Blocks of black and white areas
are the best option.

This was a pretty easy code to write. The main time sink for me
was looking up how some of the library functions worked.  The logic 
behind the code is very straightforward.  

For the output of the code, I decided to write the images as PowerPoint
slides. This may seem a bit strange, but PowerPoint is a relatively universal
format that most people can use. The target of this project
is to create large poster-sized art projects.  Since my home
printer can't make 20x16 prints, I will need to bring the output
to some place like Staples or FedEx printing.  PowerPoint files
work very well at these kinds of commercial sites.  Of course,
OpenOffice and similar software should be able to read these files as well.

Four main libraries are used for the project:
- The python-pptx library to create PowerPoint slides in Python.
- The Python Imaging Library (PIL, installed as Pillow) to read and manipulate the images.
- The requests library to download data from Project Gutenberg.
- The regular expression library (re) to help clean up weird unicode characters in the text.

Share and enjoy!






In [1]:
# If a library is missing, you might need to install it.
# Note that PIL is installed under the package name Pillow.
#pip install python-pptx
#pip install requests
#pip install Pillow


# Python Imaging Library (Pillow) routines
# (ImageGrab was imported here originally, but nothing in this notebook uses it)
import PIL.ImageOps
import PIL.Image


# The python presentation library - this lets us create PowerPoint files
from pptx import Presentation
from pptx.util import Inches
from pptx.util import Pt

# download requests
import requests

# regular expressions
import re
#import time



In [2]:
def clean_text(text):

    # replace carriage returns and newlines with spaces
    text = re.sub(r"\r", " ", text)
    text = re.sub(r"\n", " ", text)
    
    # eliminate asterisks, underscores, and double quotes
    text = re.sub(r"\*", "", text)
    text = re.sub(r"_", "", text)
    text = re.sub('"', '', text)
    
    # handle the odd character sequences that appear when UTF-8
    # punctuation (curly quotes, dashes) is decoded as Latin-1
    text = text.replace("â\x80\x9d", "")     # right double quote
    text = text.replace("â\x80\x9c", "")     # left double quote
    text = text.replace("â\x80\x99", "''")   # right single quote / apostrophe
    text = text.replace("â\x80\x94", "")     # long dash
    text = text.replace("â\x80\x98", "")     # left single quote
    
    # convert all multiple spaces into a single space
    text = re.sub(r"\s+", " ", text)


    return text
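
The odd strings matched in `clean_text` look like UTF-8 punctuation that was decoded as Latin-1 (ISO-8859-1). That is my reading rather than something the notebook states; a likely cause is that `requests` falls back to ISO-8859-1 for text responses that declare no charset. A minimal sketch of the round trip, using only built-in string methods:

```python
# A right double quotation mark, written as an escape to keep this ASCII-only.
curly_quote = "\u201d"

# Encode as UTF-8 (three bytes), then decode those bytes as Latin-1.
mojibake = curly_quote.encode("utf-8").decode("latin-1")
print(repr(mojibake))

# Reversing the round trip recovers the original character.
repaired = mojibake.encode("latin-1").decode("utf-8")
print(repaired == curly_quote)
```

If this assumption holds, setting the response encoding to UTF-8 before reading `data.text` would avoid the problem at the source, but the replacements in `clean_text` would then need to target the real quote and dash characters instead.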

In [3]:

def create_paragraphs(img, font_size, slide_width, slide_height, 
                     slide_left_margin, slide_top_margin, 
                     threshold, invertBW, text):
    
    # set up the margins for the text
    text_width = slide_width -  2*slide_left_margin
    text_height= slide_height - 2*slide_top_margin
    
    # define the vertical and horizontal scale
    # the 60 and 120 were found experimentally, but 
    # someone who is better at typesetting can probably
    # explain why this works a bit better than I could
    vscale = 60 / font_size
    hscale = 120 / font_size


    # resize the image based on the available text field
    # we will associate each pixel with a character 
    nw = int(text_width * hscale)
    nh = int(text_height * vscale)
    img3 = img.resize((nw, nh))


    # convert the image to a gray scale and invert it if desired
    gray = img3.convert("L")
    if invertBW:
        gray = PIL.ImageOps.invert(gray)
    
    # turn the gray scale pixels into a list
    a = list(gray.getdata())

    # set the text counter to 0 - this is associated
    # with the actual character location in the original book
    text_counter = 0

    # set the current character counter to 0 - this is
    # associated with pixel location
    pixel_counter = 0

    # set up the paragraph list
    plist = []

    # loop over rows - each row will be a paragraph
    for i in range(nh):
        fT = ""  # set the string for the paragraph to null
        for j in range(nw):  # loop over the image columns
            
            # get the current pixel value and determine if it is at or
            # above the threshold
            c = a[pixel_counter]
            add_character = (c >= threshold)
            
            # if the pixel is at or above the threshold, grab the next
            # character from the story and add it to the current line
            if add_character:
                fT = fT + text[text_counter]
                text_counter = text_counter + 1
            else:
                # if this is below the threshold, add a blank
                fT = fT + " "

            # update the character counter associated with pixel locations
            pixel_counter = pixel_counter + 1

        # add the string to the list of paragraphs
        plist.append(fT)
    return plist
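
The constants 60 and 120 were found experimentally. One plausible explanation, offered here as an assumption and not as the author's reasoning, is that there are 72 points per inch, a Courier character advances about 0.6 of the font size, and single-spaced lines sit about 1.2 times the font size apart. The sketch below checks that arithmetic and works out the character grid for the settings used later in this notebook. The names `grid_size`, `COURIER_ADVANCE`, and `LINE_SPACING` are hypothetical.

```python
POINTS_PER_INCH = 72
COURIER_ADVANCE = 0.6   # assumed: character width as a fraction of font size
LINE_SPACING = 1.2      # assumed: line height as a fraction of font size


def grid_size(font_size, slide_width, slide_height, left_margin, top_margin):
    """Return (columns, rows) using the same formulas as create_paragraphs."""
    text_width = slide_width - 2 * left_margin
    text_height = slide_height - 2 * top_margin
    hscale = 120 / font_size   # characters per inch
    vscale = 60 / font_size    # lines per inch
    return int(text_width * hscale), int(text_height * vscale)


font_size = 6
chars_per_inch = POINTS_PER_INCH / (COURIER_ADVANCE * font_size)
lines_per_inch = POINTS_PER_INCH / (LINE_SPACING * font_size)
print(round(chars_per_inch, 6), 120 / font_size)
print(round(lines_per_inch, 6), 60 / font_size)

columns, rows = grid_size(font_size, 20, 30, 1, 1)
print(columns, rows, columns * rows)
```

For a 20 x 30 inch slide with 1 inch margins and a 6 point font, this gives a grid of 360 columns by 280 rows, so the book needs to supply at most 100,800 characters.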

In [4]:
def create_powerpoint(output_name, plist, slide_width, slide_height, 
                        slide_left_margin, slide_top_margin):

    
    # create a presentation and set the height and width
    prs=Presentation()
    prs.slide_width = Inches(slide_width)
    prs.slide_height =  Inches(slide_height)

    # set the slide layout to a blank slide
    # 6 = blank slide in this library
    lyt=prs.slide_layouts[6] 


    # calculate the bounds for the textbox
    text_width = slide_width - 2*slide_left_margin
    text_height= slide_height - 2*slide_top_margin
    
    left = Inches(slide_left_margin)
    top = Inches(slide_top_margin)
    width = Inches(text_width)
    height = Inches(text_height)
    
    # create the slide and the textbox
    slide=prs.slides.add_slide(lyt)  
    text_box=slide.shapes.add_textbox(left, top, width, height)
    tb=text_box.text_frame

    # add the paragraphs to the textbox
    # we could add runs of text with different fonts and colors if desired
    for pp in plist:
        pgr = tb.add_paragraph()
        pgr.text = pp  # each image row becomes its own paragraph

    # set the paragraphs to the correct font size and font
    # We use Courier because it is a monospaced font.  Each character
    # is the same width.
    # Note: font_size is not a parameter; it is read from the notebook's
    # global scope, so it must be defined before this function is called.
    for paragraph in tb.paragraphs:
        # this loop also covers the empty paragraph a new text box starts with
        paragraph.font.size = Pt(font_size)
        paragraph.font.name = 'Courier'  # use a monospace font

    # save the file
    prs.save(output_name) 

In [5]:

# Set up the data for the books.  All the book data
# used in this project is downloaded from Project Gutenberg.
# 
# Each entry has the following fields:
# the title of the book
# the author
# the URL of the UTF-8 text of the book
# the opening words of the story (used to skip the Gutenberg header)

book_data = [
    {
        "title":"A CHRISTMAS CAROL IN PROSE BEING A Ghost Story of Christmas",
        "author":"Charles Dickens",
        "url":"https://www.gutenberg.org/cache/epub/46/pg46.txt",
        "firstLine":"MARLEY was dead: to begin with"
    
    },
    {
        "title":"Alice’s Adventures in Wonderland",
        "author":"Lewis Carroll",
        "url":"https://www.gutenberg.org/files/11/11-0.txt",
        "firstLine":"Alice was beginning to get very tired"
    },
    {
        "title":"War of the Worlds",
        "author":"H. G. Wells",
        "url":"https://www.gutenberg.org/cache/epub/36/pg36.txt",
        "firstLine":"No one would have believed"
    },
    {
        "title":"Candide",
        "author":"Voltaire",
        "url":"https://www.gutenberg.org/cache/epub/19942/pg19942.txt",
        "firstLine":"In a castle of Westphalia"
    },
    {
        "title":"Sense and Sensibility",
        "author":"Jane Austen",
        "url":"https://www.gutenberg.org/files/161/161-0.txt",
        "firstLine":"The family of Dashwood had long been"
    },
    {
        "title":"Pride and Prejudice",
        "author":"Jane Austen",
        "url":"https://www.gutenberg.org/files/1342/1342-0.txt",
        "firstLine":"It is a truth universally acknowledged"
    }
]





In [6]:
# this could include other directories or be modified to download files
# from the web
image_names = [
    "merrychristmas.png",
    "scrooge.png",
    "cat.png",
    "alice.jpeg",
    "mad-hatter.png"
]

In [7]:

# list the books and images that are available, so the user
# can pick a book number and an image number in the next cell.

print("This is the list of books currently configured for download: ")
print()
for i, b in enumerate(book_data):
    print(i, b["title"])
print()

print("-----\n")
print("This is the list of images currently configured available to use: ")
print()
for i, image_name in enumerate(image_names):
    print(i, image_name)

print()

This is the list of books currently configured for download: 

0 A CHRISTMAS CAROL IN PROSE BEING A Ghost Story of Christmas
1 Alice’s Adventures in Wonderland
2 War of the Worlds
3 Candide
4 Sense and Sensibility
5 Pride and Prejudice

-----

This is the list of images currently configured available to use: 

0 merrychristmas.png
1 scrooge.png
2 cat.png
3 alice.jpeg
4 mad-hatter.png



In [8]:
# configure the data choices and output

book_number = 1
image_number= 4
output_name = "wordart.pptx"

book = book_data[book_number]
image_name  = image_names[image_number]

# set the font size, threshold between black and white, and the invert color flag
font_size = 6
threshhold = 128
invertBW = True

# set the slide width, height, and margins in inches
# note: this project is designed to create poster-sized
# art, so big generally looks better
slide_width = 20
slide_height = 30
slide_left_margin = 1
slide_top_margin = 1

# load the book data
data = requests.get( book["url"])
istart = data.text.find(book["firstLine"])
text = data.text[istart:]

# clean the text
text = clean_text(text)


# load the image
img = PIL.Image.open(image_name)

# create the paragraphs
plist = create_paragraphs(img, font_size, slide_width, slide_height, 
                         slide_left_margin, slide_top_margin, 
                          threshhold, invertBW, text )

# create the powerpoint slide
create_powerpoint(output_name, plist, slide_width, slide_height, 
                        slide_left_margin, slide_top_margin)


FileNotFoundError: [Errno 2] No such file or directory: 'mad-hatter.png'
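
The FileNotFoundError above means the chosen image was not in the working directory when the cell ran. A small check before the download step can catch this early, along with the case where `firstLine` is not found in the downloaded text (in which case `find` returns -1 and the slice keeps only the last character). This is a sketch under my own assumptions; `missing_images` and `story_start` are hypothetical helpers, not part of the notebook.

```python
from pathlib import Path


def missing_images(names, folder="."):
    """Return the image names that do not exist as files in the folder."""
    return [name for name in names if not (Path(folder) / name).is_file()]


def story_start(raw_text, first_line):
    """Return the index where the story begins, or raise if it is not found."""
    index = raw_text.find(first_line)
    if index == -1:
        raise ValueError(f"could not find the opening words: {first_line!r}")
    return index


# Example usage with the names from this notebook.
not_found = missing_images(["mad-hatter.png"])
if not_found:
    print("Missing image files:", not_found)

sample = "Gutenberg header text. Alice was beginning to get very tired"
print(sample[story_start(sample, "Alice was beginning"):])
```

Running `missing_images(image_names)` before picking `image_number` would show which of the configured images still need to be created or copied next to the notebook.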