# Generate Podcast Synopsis for Various Genres

In [1]:
%load_ext autoreload
%autoreload 2

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

## Set Up Azure OpenAI

In [2]:
import os
import openai

# Set up Azure OpenAI
openai.api_type = "azure"
openai.api_base = 'https://azure-openai-test21.openai.azure.com/'
openai.api_version = "2023-03-15-preview"
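# Read the key from the environment instead of hardcoding a secret in the notebook.
# AZURE_OPENAI_API_KEY is an assumed variable name; use whatever your environment defines.
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")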
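
Credentials are safer read from the environment than written into a notebook. The following is a minimal sketch, assuming two hypothetical environment variable names (`AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_API_KEY`) and a hypothetical helper `load_azure_settings`; neither is required by Azure or by the `openai` package.

```python
import os


def load_azure_settings(env=os.environ):
    """Return (api_base, api_key) from the environment, failing early if unset."""
    # Both variable names are assumptions made for this sketch.
    required = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["AZURE_OPENAI_ENDPOINT"], env["AZURE_OPENAI_API_KEY"]
```

With both variables set, the result can be assigned directly: `openai.api_base, openai.api_key = load_azure_settings()`.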


## Deploy a Model

In [3]:
# ID of the desired model
desired_model = 'gpt-35-turbo' # suitable for text generation
desired_capability = 'completion'

# List existing deployments and look for one that uses the desired model
deployment_id = None
result = openai.Deployment.list()

for deployment in result.data:
    if deployment["status"] != "succeeded":
        continue
    
    model = openai.Model.retrieve(deployment["model"])

    # check if desired_model is deployed, and if it has 'completion' capability
    if model["id"] == desired_model and model['capabilities'][desired_capability]:
        deployment_id = deployment["id"]
        
# if no model deployed, deploy one
if not deployment_id:
    print('No deployment with status: succeeded found.')

    # Deploy the model
    print(f'Creating a new deployment with model: {desired_model}')
    result = openai.Deployment.create(model=desired_model, scale_settings={"scale_type":"standard"})
    deployment_id = result["id"]
    print(f'Successfully created {desired_model} that supports text {desired_capability} with id: {deployment_id}.')
else:
    print(f'Found a succeeded deployment of "{desired_model}" that supports text {desired_capability} with id: {deployment_id}.')

Found a succeeded deployment of "gpt-35-turbo" that supports text completion with id: gpt-35-turbo.


## Text chunks generator

In [4]:
# A generator that splits a text into smaller chunks of roughly n tokens, preferably ending at the end of a sentence
def chunk_generator(text, n, tokenizer):
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Find the nearest end of sentence within a range of 0.5 * n and 1.5 * n tokens
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            # Decode the tokens and check for full stop or newline
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # If no end of sentence found, use n tokens as the chunk size
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j
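
To see how the sentence-boundary search behaves, the generator can be run on a short string with a toy tokenizer. This is only an illustrative sketch: `CharTokenizer` is a hypothetical stand-in that maps one character to one token (a real `tiktoken` encoding behaves differently), and `chunk_generator` is repeated from the cell above so that the snippet runs on its own.

```python
class CharTokenizer:
    """Hypothetical stand-in for a tiktoken encoding: one token per character."""

    def encode(self, text):
        return [ord(c) for c in text]

    def decode(self, tokens):
        return "".join(chr(t) for t in tokens)


def chunk_generator(text, n, tokenizer):
    # Same logic as the notebook cell above, repeated for a self-contained demo.
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Search backwards from 1.5 * n tokens for a chunk ending in "." or "\n"
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # No sentence end found in the window: fall back to n tokens
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j


tokenizer = CharTokenizer()
text = "One two. Three four. Five six seven."
chunks = [tokenizer.decode(c) for c in chunk_generator(text, 10, tokenizer)]
print(chunks)
```

With `n = 10` this prints `['One two.', ' Three four.', ' Five six ', 'seven.']`: the first two chunks end on a full stop, the third falls back to exactly `n` tokens because no sentence end lies in its search window, and joining the chunks reproduces the original text.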


## Request API

In [5]:
def request_api(document, prompt_postfix, max_tokens):
    prompt = prompt_postfix.replace('<document>', document)
    #print(f'>>> prompt : {prompt}')

    response = openai.Completion.create(
        deployment_id=deployment_id,
        prompt=prompt,
        temperature=0.5,
        max_tokens=max_tokens,
        top_p=1,
        frequency_penalty=1,
        presence_penalty=1,
        stop='###',
    )

    return response['choices'][0]['text']

## Generate Synopsis

In [6]:
def get_synopsis(content, prompt_postfix):
    import tiktoken

    synopsis_chunks = []
    n = 2000 # target chunk size in tokens
    max_tokens = 1000 # max tokens for each response

    # p50k_base is used here only to size the chunks; gpt-35-turbo itself uses cl100k_base
    tokenizer = tiktoken.get_encoding('p50k_base')

    # Generate chunks
    chunks = chunk_generator(content, n, tokenizer)

    # Decode each chunk of tokens back into text
    text_chunks = [tokenizer.decode(chunk) for chunk in chunks]

    # Request a synopsis for each chunk
    for chunk in text_chunks:
        synopsis_chunks.append(request_api(chunk, prompt_postfix, max_tokens))
        #print(chunk)
        #print('>>> synopsis: \n' + synopsis_chunks[-1])

    # Join the per-chunk synopses into a single string
    synopsis = ' '.join(synopsis_chunks)

    return synopsis
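
Because `get_synopsis` joins one synopsis per chunk, a long transcript yields several synopses placed back to back. One possible refinement, not part of the original notebook, is a second pass that condenses them into one. The sketch below assumes a hypothetical helper `combine_synopses` and takes the summarising function as an argument so that it can be tried without calling the API.

```python
def combine_synopses(chunk_synopses, summarise):
    """Merge per-chunk synopses; call `summarise` only when there is more than one."""
    # Drop empty results and surrounding whitespace
    parts = [s.strip() for s in chunk_synopses if s and s.strip()]
    if len(parts) <= 1:
        return parts[0] if parts else ""
    # `summarise` is any callable mapping text to a shorter text (an assumption of this sketch)
    return summarise("\n".join(parts))
```

In the notebook, `summarise` could wrap `request_api`, for example `lambda text: request_api(text, prompt_postfix, 1000)`.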

### Genre : Comedy

With no further information provided in the prompt, the response is rather formal and dry for the genre of comedy.

In [7]:
fname = "../data/comedy-booking-online-transcript.txt"

with open(fname, 'r') as f:
    content = f.readlines()

# convert list to str
content = ' '.join(content) 

In [8]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nSummarise the transcript of a podcast above into a synopsis. 
  \nSynopsis : 
"""

synopsis = get_synopsis(content, prompt_postfix)
print(synopsis)

Michael McKintyre talks about his experience with booking tickets online. He finds the process of registering and confirming emails to be quite difficult, especially when he is bombarded by companies that want to have his email address for promotional purposes. He also discusses the challenges of selecting passwords, answering security questions and dealing with booking fees. Michael's humorous take on these issues provides a commentary on how frustrating it can be for people who are not tech-savvy or do not enjoy using technology.
  



Adding more information to the prompt may help achieve a more desirable result:
- genre
- desired style

Be creative and explicit about the desired outcome when designing the prompt. 
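
One way to make the genre and style explicit is to assemble the prompt postfix from those two pieces. The helper below is a minimal sketch; `build_prompt_postfix` and its parameters are hypothetical names, and the wording of the instruction is only an example. It keeps the `<document>` placeholder and the `###` separator that `request_api` relies on.

```python
def build_prompt_postfix(genre, style_hint=""):
    """Build a prompt postfix that names the genre and, optionally, the desired style."""
    instruction = f"Create a synopsis of the {genre} podcast transcript above."
    if style_hint:
        instruction += f" {style_hint.strip()}"
    # <document> is replaced with the transcript chunk before the request is sent
    return f" <document>\n###\n{instruction}\nSynopsis : \n"


prompt_postfix = build_prompt_postfix(
    "stand-up comedy",
    "Capture the audience's curiosity and heighten anticipation.",
)
print(prompt_postfix)
```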

In [9]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nCreate a synopsis to capture audience curiosity and heighten anticipation. This is a stand-up comedy. 
  \nSynopsis : 
"""
synopsis = get_synopsis(content, prompt_postfix)
print(synopsis)

Michael McKintyre is back with his latest stand-up comedy show and this time he's taking on the world of online booking. From passwords to security questions, Michael hilariously reflects on how we all struggle with the challenges that come with buying tickets or products online. With witty observations about our daily lives, Michael’s show will have you laughing out loud as he takes us through his own personal journey of trying to book a ticket for The Who concert.
  



### Genre : Informational

In [11]:
fname = "../data/ft-interview-transcription.txt"

with open(fname, 'r', errors='replace') as f:
    content = f.readlines()

# convert list to str
content = ' '.join(content) 

In [12]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nGenerate a short synopsis from the transcription of an interview.  
  \nSynopsis : 
"""

synopsis = get_synopsis(content, prompt_postfix)
print(synopsis)

Silicon Valley Bank's collapse was caused by bad decisions and a rapid increase in interest rates. The bank had two weird features on its balance sheet; it was almost entirely funded by business depositors, who demand more interest when interest rates go up, and they bought Treasury bonds which have fixed interests. When the price of money went up but the assets' price did not, Silicon Valley Bank watched its profits disappear leading to depositors pulling their funds out and hence a run on the bank. However, this is not another 2008 financial crisis because SVB looks like an outlier with poor balance sheet management that other banks do not share to such a degree. Furthermore, unlike 2008 where there were all these terrible loans undercapitalised on housing that melted down causing credit event(s), there is no credit event per se here as we learned from last time around that even world's largest financial institutions were inadequately capitalised while this time around big banks are 

In [14]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nGenerate a short synopsis from the transcription of an interview, such that it triggers curiosity and includes a thought-provoking question. Add "Let's find out!" at the end.  
  \nSynopsis : 
"""

synopsis = get_synopsis(content, prompt_postfix)
print(synopsis)

The collapse of Silicon Valley Bank has caused panic in the financial world, with many worried that this is the start of another 2008-style crisis. However, Robert Armstrong, US Financial Commentator for The FT argues that there are key differences between what happened then and what's happening now. In this episode of Behind the Money, he explains why we shouldn't be panicking about bank runs just yet... Let's find out!<|im_end|> The collapse of Silicon Valley Bank's high-yield bond fund has raised concerns about the broader US banking system. In this episode, Robert Armstrong explains how banks make money and why it is risky. He also provides insights into what entrepreneurs should be asking their banks in the future. The key takeaway from this interview is that bank failures cannot be vanished entirely by regulation; there will always be risks associated with investing in them. So, where does this leave us? Let's find out!
 
<|im_end|>
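
The output above contains literal `<|im_end|>` markers, which appear to be chat-format end-of-message tokens emitted by `gpt-35-turbo` when it is called through the completion endpoint. A minimal sketch of stripping them in post-processing follows; `clean_completion` is a hypothetical helper, and the marker list is an assumption about which tokens may leak into the text.

```python
def clean_completion(text, markers=("<|im_end|>", "<|im_start|>")):
    """Remove chat-format markers and collapse the whitespace left behind."""
    for marker in markers:
        text = text.replace(marker, " ")
    # Note: this also flattens newlines into single spaces
    return " ".join(text.split())


raw = "Let's find out!<|im_end|> The collapse of a bank.\n \n<|im_end|>"
print(clean_completion(raw))
```

An alternative worth trying is to pass the marker as an additional stop sequence, for example `stop=['###', '<|im_end|>']`, so that generation halts before it is emitted.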
