In [1]:
%autosave 0

Autosave disabled


My code requires a number of secret API keys to run. They are stored in my env file.

The imports are basic. I need regex (re) to clean text data, requests to interact with APIs, and openai to interact with OpenAI's API.

In [2]:
import re
import requests
import openai
import unicodedata

from env import API_KEY, OPENAI_API_KEY, MEME_KEY
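The contents of the env file are not shown here. As a minimal sketch of one way such a module could be written, assuming the keys live in environment variables with the same names (the `load_key` helper and that storage choice are hypothetical, not necessarily the setup used in this notebook):

```python
import os


def load_key(name):
    """Return the secret stored in the environment variable `name`."""
    value = os.environ.get(name)
    if not value:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Hypothetical usage inside env.py:
# API_KEY = load_key("API_KEY")
# OPENAI_API_KEY = load_key("OPENAI_API_KEY")
# MEME_KEY = load_key("MEME_KEY")
```

Keeping the keys out of the notebook itself means the notebook can be shared without exposing them.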

I need to set openai's api key variable in order to properly use the openai library.

In [3]:
openai.api_key = OPENAI_API_KEY

Declaring the url for the news API.

In [4]:
url = f"https://newsapi.org/v2/top-headlines?country=us&apiKey={API_KEY}"

Making a request of the news API and verifying I get an acceptable response.

In [5]:
response = requests.get(url)
response

<Response [200]>

Checking out the contents, I can see a number of useful fields:
- Title
- Description
- Content

In [6]:
response.json()

{'status': 'ok',
 'totalResults': 35,
 'articles': [{'source': {'id': 'the-wall-street-journal',
    'name': 'The Wall Street Journal'},
   'author': 'Roque Ruiz, Summer Said, Katy Stech Ferek, Dov Lieber, Emma Brown, Omar Abdel-Baqui, Sabrina Siddiqui, Shoshanna Soloman, Margherita Stancati, Dion Nissenbaum, Ari Flanzraich, Sadie Gurman, Saleh Al-Batati, Anas Baba, Vivian Salama',
   'title': 'Israel-Hamas War Live Updates: Israel Says Blast Struck Jabalia Refugee Camp, Killing Hamas Commander - The Wall Street Journal',
   'description': 'The latest news on the escalating conflict in Gaza between Israel and Hamas.',
   'url': 'https://www.wsj.com/livecoverage/israel-hamas-gaza-war-latest',
   'urlToImage': 'https://images.wsj.net/im-878580/social',
   'publishedAt': '2023-10-31T20:18:00Z',
   'content': 'Israel said its military continued to widen its operations and reported the deaths of two soldiers, after Prime Minister Benjamin Netanyahu defended the escalation.Israeli operations

Using a simple for loop, I can iterate through the articles in the response and extract the pieces of information I find most useful. A regex strips the trailing source name from each title.

In [7]:
titles = []
descriptions = []
contents = []

# Capture the headline and drop a trailing " - Publisher" suffix of up to 25 characters.
regexp = r"^(.*?)\s\-\s.{0,25}$"

for article in response.json()['articles']:
    
    titles.append(re.search(regexp, article['title']).groups()[0])
    
    descriptions.append(article['description'])
    
    contents.append(article['content'])
    
len(titles), len(descriptions), len(contents)    

(20, 20, 20)
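The loop above works for this batch, but if a title ever arrives without a " - Publisher" suffix, `re.search` returns `None` and the `.groups()` call raises an `AttributeError`. A minimal sketch of a more forgiving version, using the same pattern (the `strip_source` name is hypothetical, and the empty-string fallback for a missing title is an assumption):

```python
import re

# Same pattern as above: lazily capture the headline before a short trailing suffix.
SOURCE_SUFFIX = re.compile(r"^(.*?)\s-\s.{0,25}$")


def strip_source(title):
    """Return the title without a trailing ' - Publisher' suffix."""
    if not title:
        return ""
    match = SOURCE_SUFFIX.search(title)
    # Fall back to the untouched title when there is no suffix to remove.
    return match.group(1) if match else title
```

The loop could then call `titles.append(strip_source(article['title']))` without changing the results shown here.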

The titles are concise and descriptive.

In [8]:
titles

['Israel-Hamas War Live Updates: Israel Says Blast Struck Jabalia Refugee Camp, Killing Hamas Commander',
 'Sources - Commanders agree to trade Chase Young to 49ers',
 'Matthew Perry Remembered by ‘Odd Couple’ Co-Star Thomas Lennon: He ‘Was Always Trying to Get Better’',
 "Max streaming service attaches Matthew Perry memorial to 'Friends'",
 'Apple just started rolling out new AirTag firmware',
 'Senate confirms Jack Lew as US ambassador to Israel following vocal GOP opposition over Iran deal',
 'U.S. Orders Grand Canyon University to Pay Record $37.7 Million Fine',
 "Bud Light parent Anheuser-Busch InBev's sales tumble further in US",
 'Who won James Harden trade? Grading and debating 76ers-Clippers deal',
 'Robert De Niro Testifies Against Former Employee: “The Whole Case Is Nonsense!”',
 'Dinosaur-killing impact did its dirty work with dust',
 'Sam Bankman-Fried defense rests in criminal trial, closing arguments kick off Wednesday',
 "FBI Director Wray warns terror threat to America

The descriptions can be hit or miss. The first description fails to offer any insight into the Israel-Hamas conflict. Other descriptions lend additional context to headlines, such as the one regarding Grand Canyon University where it explains why the school was fined the record amount.

In [9]:
descriptions

['The latest news on the escalating conflict in Gaza between Israel and Hamas.',
 'The 49ers, who have had pass rush struggles of late, acquired defensive end Chase Young from the Commanders in exchange for a third-round draft pick.',
 'Matthew Perry\'s "The Odd Couple" and "17 Again" co-star Thomas Lennon pays tribute to the late actor.',
 'Warner Bros. Discovery-owned Max has included a memorial to the late actor Matthew Perry to its streaming of the sitcom "Friends." Perry died over the weekend.',
 'Lest you think a space black MacBook Pro was the biggest news of the week, Apple is back with an...',
 'The Senate on Tuesday confirmed former Treasury Secretary\xa0Jack\xa0Lew\xa0as the new US ambassador to Israel, despite stiff opposition from Senate Republicans over his involvement in the Iran nuclear deal during the Obama administration.',
 'Education Department accuses school of misleading some students about tuition costs',
 'Anheuser-Busch InBev reported a 17.1% decline in sales i
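The description of the Jack Lew story contains non-breaking spaces, which show up as `\xa0` in the output. One possible cleanup, sketched with the `unicodedata` module imported earlier, is NFKC normalization, which folds a non-breaking space into a regular space. The `normalize_text` name is hypothetical, and NFKC also rewrites other compatibility characters, so it may change more than just the spaces:

```python
import unicodedata


def normalize_text(text):
    """Fold compatibility characters such as non-breaking spaces into plain equivalents."""
    if text is None:
        return ""
    return unicodedata.normalize("NFKC", text)
```

Applying it would look like `descriptions = [normalize_text(d) for d in descriptions]`.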

I'm going to feed one of these headlines to ChatGPT and see if it can describe a funny meme.

In [10]:
model = 'gpt-3.5-turbo'

This model prompt is based on the initial response I got from ChatGPT. When I asked it in the browser to make a meme based on a headline, it informed me that it cannot directly generate a meme. Instead, it's able to provide the following pieces of information: a description of the image, the top text, and the bottom text. I decided to format all future prompts with the same structure that was suggested by ChatGPT.

In [11]:
user_content = f"""Please create a meme about the following headline: {titles[17]}.

                   Format your response like this:
                   
                   Image:
                   Text at the top:
                   Text at the bottom:
                """

Making the API call and collecting the response.

In [12]:
response = openai.ChatCompletion.create(model=model,
                                        messages=[{'role': 'user', 'content': user_content}])
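The triple-quoted f-string in the prompt cell carries the cell's indentation into the prompt text. I have not tested whether that changes the model's output, but as a sketch, the same prompt and messages list can be built without the leading spaces (the `build_messages` name is hypothetical):

```python
def build_messages(headline):
    """Build the chat messages list for a single meme request."""
    prompt = (
        f"Please create a meme about the following headline: {headline}.\n\n"
        "Format your response like this:\n\n"
        "Image:\n"
        "Text at the top:\n"
        "Text at the bottom:\n"
    )
    # A single user message, matching the structure passed to the API above.
    return [{"role": "user", "content": prompt}]
```

The result could be passed as `messages=build_messages(titles[17])` in the call above.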

Response objects from openai tend to be complex. Thanks to previous projects, I already know how to unpack the response object to retrieve the contents.

In [13]:
response

<OpenAIObject chat.completion id=chatcmpl-8GDPml2wfgvfLNyeZxiPqoelbYUpR at 0x108020ae0> JSON: {
  "id": "chatcmpl-8GDPml2wfgvfLNyeZxiPqoelbYUpR",
  "object": "chat.completion",
  "created": 1698875506,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Image: A picture of Donald Trump with a confident expression, standing on a podium.\n\nText at the top: \"When you dominate the South Carolina GOP primary\"\n\nText at the bottom: \"But Haley is like, 'Hey, at least I got second place!'\""
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 58,
    "completion_tokens": 54,
    "total_tokens": 112
  }
}
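The path to the message text can be wrapped in a small helper. This is a minimal sketch that works on a plain dict with the same shape as the output above; the `first_message` name is hypothetical, and it assumes the response object supports dict-style access:

```python
def first_message(response):
    """Return (content, finish_reason) for the first choice in a chat response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    choice = choices[0]
    # finish_reason is "stop" in the output above; other values may mean a cut-off reply.
    return choice["message"]["content"], choice.get("finish_reason")
```

Returning the finish reason alongside the content makes it easy to notice a reply that did not end normally before parsing it.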

Using my past code, I get the response. The text is a little messy, but the content matches the format I supplied in the prompt.

In [14]:
response_content = response['choices'][0]['message']['content']
response_content

'Image: A picture of Donald Trump with a confident expression, standing on a podium.\n\nText at the top: "When you dominate the South Carolina GOP primary"\n\nText at the bottom: "But Haley is like, \'Hey, at least I got second place!\'"'

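With a little regex, I can clean up some of the undesirable characters. The pattern keeps only letters, digits, apostrophes, colons, periods, and whitespace, so the double quotation marks, commas, and exclamation points are removed. With the double quotes gone, the apostrophes no longer need backslash escapes when the string is displayed.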

In [15]:
clean_content = re.sub(r"[^a-zA-Z0-9':\s\.]", "", response_content)
clean_content

"Image: A picture of Donald Trump with a confident expression standing on a podium.\n\nText at the top: When you dominate the South Carolina GOP primary\n\nText at the bottom: But Haley is like 'Hey at least I got second place'"

As in past code, I can separate the response into three parts by splitting on the double newline characters.

In [16]:
image, top, bottom = clean_content.split('\n\n')
image, top, bottom

('Image: A picture of Donald Trump with a confident expression standing on a podium.',
 'Text at the top: When you dominate the South Carolina GOP primary',
 "Text at the bottom: But Haley is like 'Hey at least I got second place'")

Sometimes there are additional newline characters I need to weed out.

In [17]:
image = re.sub("\n", "", image)
image

'Image: A picture of Donald Trump with a confident expression standing on a podium.'
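The cleaning, splitting, and newline removal steps can be combined into one helper. This is a sketch under the assumption that extra blank lines or stray newlines are the only irregularities; the `clean_and_split` name is hypothetical, and unlike the cell above it replaces stray newlines with a space rather than deleting them:

```python
import re


def clean_and_split(text):
    """Strip unwanted punctuation, then split on blank lines into non-empty parts."""
    cleaned = re.sub(r"[^a-zA-Z0-9':\s\.]", "", text)
    # Split wherever two newlines are separated only by optional whitespace.
    parts = re.split(r"\n\s*\n", cleaned)
    # Collapse leftover whitespace runs (including stray newlines) to single spaces.
    return [" ".join(part.split()) for part in parts if part.strip()]
```

Unpacking with `image, top, bottom = clean_and_split(response_content)` still raises a `ValueError` if the model returns more or fewer than three parts, which is a useful signal that the reply did not follow the requested format.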

I can define a basic function to capture everything after the first colon and space in each piece. This strips the labels (Image, Text at the top, Text at the bottom) and leaves only the content, which I can use to generate memes.

In [18]:
def extract(text):
    
    regexp = r"^.*?:\s(.*)"
    
    return re.search(regexp, text).groups()[0]

Testing my function.

In [19]:
extract(top)

'When you dominate the South Carolina GOP primary'
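`extract` relies on `re.search` finding a colon followed by whitespace; for a piece without one, the search returns `None` and the `.groups()` call raises an `AttributeError`. As a sketch of a more defensive alternative, the three labels from the prompt can be looked up by name in the full response text. The `parse_meme` function and `LABELS` mapping are hypothetical names:

```python
import re

# Labels requested in the prompt, keyed by the short names used in this sketch.
LABELS = {
    "image": "Image",
    "top": "Text at the top",
    "bottom": "Text at the bottom",
}


def parse_meme(text):
    """Return a dict with the text following each label, or None if a label is missing."""
    parsed = {}
    for key, label in LABELS.items():
        pattern = rf"^{re.escape(label)}:[ \t]*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        parsed[key] = match.group(1).strip() if match else None
    return parsed
```

A missing label shows up as `None` in the result instead of an exception, so the caller can decide whether to retry the request.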

My function works for all three pieces of information.

In [20]:
ex_image = extract(image)
ex_top = extract(top)
ex_bottom = extract(bottom)
ex_image, ex_top, ex_bottom

('A picture of Donald Trump with a confident expression standing on a podium.',
 'When you dominate the South Carolina GOP primary',
 "But Haley is like 'Hey at least I got second place'")

The phrase "Donald Trump" causes my prompt to be disallowed by the openai safety measures. I substitute "a man made of oranges" for his name to bypass the filter.

In [21]:
mod_image = re.sub(r"Donald Trump", "a man made of oranges", ex_image)
mod_image

'A picture of a man made of oranges with a confident expression standing on a podium.'

I can ship off the prompt to DALL-E. It returns a url, from which I can request the image itself.

In [40]:
gen_image = openai.Image.create(prompt=mod_image, n=1, size="512x512")
image_url = gen_image['data'][0]['url']

The image url is a complete mess. Good thing I can access it programmatically.

In [41]:
image_url

'https://oaidalleapiprodscus.blob.core.windows.net/private/org-fnMsz8paaqVdLmnQawrkIEqr/user-VJwsZ9FQz2cBdKVVJFIYFkF4/img-NBM4byFGnOFyxsHvvJMAkicD.png?st=2023-11-01T20%3A35%3A05Z&se=2023-11-01T22%3A35%3A05Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-11-01T18%3A34%3A14Z&ske=2023-11-02T18%3A34%3A14Z&sks=b&skv=2021-08-06&sig=iKY5s3GQ4SIAs5vSlDLB7iqLj7FqCjgV31VznjMCUys%3D'

I can make a follow-up request of the image url and save the image to a file. I need to specify the filename and make sure I am writing to the file in binary with 'wb'.

In [42]:
img_data = requests.get(image_url).content

with open('really_orange_man.jpeg', 'wb') as f:
    f.write(img_data)
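The image URL above ends in `.png` and includes `rsct=image/png`, which suggests the downloaded bytes are PNG data even though the file is saved with a `.jpeg` extension. A small sketch for picking an extension from the file's leading bytes, using the standard PNG and JPEG signatures (the `guess_extension` name and the `.bin` fallback are hypothetical):

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def guess_extension(data):
    """Guess a file extension from the first bytes of image data."""
    if data.startswith(PNG_SIGNATURE):
        return ".png"
    if data.startswith(JPEG_SIGNATURE):
        return ".jpeg"
    # Unknown format: use a neutral extension rather than guessing.
    return ".bin"
```

The filename could then be built as `'really_orange_man' + guess_extension(img_data)`, keeping in mind that later cells refer to the file by name.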

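The following code is copied from the meme generation endpoint testing website. It doesn't work: the post request returns a 500 response, because the value in the files dictionary is a string containing the text of an open() call rather than an actual open file object.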

In [23]:
upload_url = "https://ronreiter-meme-generator.p.rapidapi.com/images"

host = "ronreiter-meme-generator.p.rapidapi.com"

files = {'image': "open('really_orange_man.jpeg', 'rb')"}

headers = {
    'X-RapidAPI-Key': MEME_KEY,
    'X-RapidAPI-Host': host
}

response = requests.post(upload_url, files=files, headers=headers)
response

<Response [500]>

If I actually open the file and pass the file object in the files dictionary, I'm able to post my image to the API.

In [46]:
with open('really_orange_man.jpeg', 'rb') as f:
    files = {'image': f}
    
    response = requests.post(upload_url, files=files, headers=headers)
    response

I get a response object from my post request. It tells me my request was successful and it supplies the name of the uploaded image. I can use the name of the image to create memes using the image in future API calls.

In [47]:
response.json()

{'status': 'success', 'name': 'really-orange-man'}

I can make a request of the upload url to verify my image was uploaded successfully.

In [42]:
response = requests.get(upload_url, headers=headers)
response

<Response [200]>

There are a lot of images available. Mine are added at the bottom, which seems to be because the list is sorted with uppercase names ahead of lowercase ones and my filenames are lowercase.

In [43]:
response.json()

['10-Guy',
 '1950s-Middle-Finger',
 '1990s-First-World-Problems',
 '1st-World-Canadian-Problems',
 '2nd-Term-Obama',
 'Aaaaand-Its-Gone',
 'Ace-Primo',
 'Actual-Advice-Mallard',
 'Adalia-Rose',
 'Admiral-Ackbar-Relationship-Expert',
 'Advice-Dog',
 'Advice-Doge',
 'Advice-God',
 'Advice-Peeta',
 'Advice-Tam',
 'Advice-Yoda',
 'Afraid-To-Ask-Andy',
 'Afraid-To-Ask-Andy-Closeup',
 'Aint-Nobody-Got-Time-For-That',
 'Alan-Greenspan',
 'Alarm-Clock',
 'Albert-Cagestein',
 'Albert-Einstein-1',
 'Alien-Meeting-Suggestion',
 'Alright-Gentlemen-We-Need-A-New-Idea',
 'Always-Has-Been',
 'Alyssa-Silent-Hill',
 'Am-I-The-Only-One-Around-Here',
 'American-Chopper-Argument',
 'Ancient-Aliens',
 'And-everybody-loses-their-minds',
 'And-then-I-said-Obama',
 'Angry-Asian',
 'Angry-Baby',
 'Angry-Birds-Pig',
 'Angry-Bride',
 'Angry-Chef-Gordon-Ramsay',
 'Angry-Chicken-Boss',
 'Angry-Dumbledore',
 'Angry-Koala',
 'Angry-Rant-Randy',
 'Angry-Toddler',
 'Annoying-Childhood-Friend',
 'Annoying-Facebook-Girl
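The ordering above is consistent with a plain case-sensitive sort, where digits come before uppercase letters and uppercase letters come before lowercase ones. I can't confirm that the API sorts this way, but the effect is easy to reproduce locally. In this sketch, 'Zebra-Example' is a made-up name used only for illustration:

```python
names = ["really-orange-man", "Zebra-Example", "10-Guy", "Advice-Dog"]

# Default string sorting compares code points: digits < uppercase < lowercase.
case_sensitive = sorted(names)

# casefold() compares names without regard to case.
case_insensitive = sorted(names, key=str.casefold)

print(case_sensitive)
print(case_insensitive)
```

With the default sort the lowercase name lands last; with `str.casefold` it sorts alphabetically among the others.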

I can generate memes using existing images at the API.

In [26]:
gen_url = "https://ronreiter-meme-generator.p.rapidapi.com/meme"

querystring = {"top": ex_top, "bottom": ex_bottom, "meme": "Trump-Bill-Signing",
               "font_size": "30", "font": "Impact"}

response = requests.get(gen_url, headers=headers, params=querystring)

response

<Response [200]>
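The `params` dictionary is turned into a URL query string before the request is sent. To see roughly what that looks like, the standard library's `urlencode` can build a comparable URL. The `build_meme_url` name is hypothetical, and the exact encoding that requests produces may differ in detail:

```python
from urllib.parse import urlencode


def build_meme_url(base_url, top, bottom, meme, font_size=30, font="Impact"):
    """Return base_url with the meme parameters appended as an encoded query string."""
    query = urlencode({
        "top": top,
        "bottom": bottom,
        "meme": meme,
        "font_size": font_size,
        "font": font,
    })
    return f"{base_url}?{query}"
```

Spaces and apostrophes in the top and bottom text are percent-encoded or replaced, so the captions survive the trip through the URL.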

Checking out the headers that accompany the response from the API.

In [30]:
response.headers

{'Date': 'Wed, 01 Nov 2023 21:52:41 GMT', 'Content-Type': 'image/jpeg', 'Content-Length': '144940', 'Connection': 'keep-alive', 'NEL': '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}', 'CF-Cache-Status': 'DYNAMIC', 'alt-svc': 'h3=":443"; ma=86400', 'CF-RAY': '81f75dbf69b829bc-IAD', 'Report-To': '{"endpoints":[{"url":"https:\\/\\/a.nel.cloudflare.com\\/report\\/v3?s=X1aE5WAbALBKKGb2hJUub%2Fp3H3TwU%2FPtdZCcfffuDMrA0jZH8jNY8fdGKrh2ymu97s7W7A%2BkDMRblKkwWCgqQ9Thtxy7fieGw2oA%2FVOmOAz8qC2Rg0ouBMX4HJbxwA%3D%3D"}],"group":"cf-nel","max_age":604800}', 'X-RateLimit-requests-Limit': '1000', 'X-RateLimit-requests-Remaining': '990', 'X-RateLimit-requests-Reset': '2590380', 'Server': 'RapidAPI-1.2.8', 'X-RapidAPI-Version': '1.2.8', 'X-RapidAPI-Region': 'AWS - us-east-1'}
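The `X-RateLimit-requests-*` headers report how much of the request quota is left. A minimal sketch for pulling them into numbers, using the header names exactly as they appear above (the `rate_limit_status` name is hypothetical, and the unit of the reset value is not stated in the response, so it is returned as-is):

```python
def rate_limit_status(headers):
    """Summarize the request quota reported in the response headers."""
    limit = int(headers["X-RateLimit-requests-Limit"])
    remaining = int(headers["X-RateLimit-requests-Remaining"])
    reset = int(headers["X-RateLimit-requests-Reset"])
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "reset": reset,
    }
```

Calling `rate_limit_status(response.headers)` after each request is one way to avoid running into the limit while experimenting.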

I can write the content of the response (the meme) to my filesystem.

In [32]:
with open('first_gen.jpeg', 'wb') as f:
    f.write(response.content)