# Generate Podcast Synopsis

In [18]:
%load_ext autoreload
%autoreload 2

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"



## Load Data

In [19]:
fname = "../data/ft-interview-transcription.txt"

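# Read the transcript; encoding='utf-8' is an assumption, based on the
# "â€™" mojibake that appears when a UTF-8 file is decoded as cp1252
with open(fname, 'r', encoding='utf-8', errors='replace') as f:
    content = f.readlines()

# Join the list of lines into a single string
content = ' '.join(content)
#print(content)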

## Set Up Azure OpenAI

In [21]:
import os
import openai

# Set up Azure OpenAI
openai.api_type = "azure"
openai.api_base = 'https://azure-openai-test21.openai.azure.com/'
openai.api_version = "2023-03-15-preview"
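# Read the key from the environment instead of hard-coding a secret
# (the variable name AZURE_OPENAI_API_KEY is an assumption)
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")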


## Deploy a Model

In [22]:
# id of desired_model
desired_model = 'gpt-35-turbo' # suitable for text generation
desired_capability = 'completion'

# list the deployments available on this Azure OpenAI resource
deployment_id = None
result = openai.Deployment.list()

for deployment in result.data:
    if deployment["status"] != "succeeded":
        continue
    
    model = openai.Model.retrieve(deployment["model"])

    # check if desired_model is deployed and has the 'completion' capability
    if model["id"] == desired_model and model['capabilities'][desired_capability]:
        deployment_id = deployment["id"]
        break  # stop at the first matching deployment
        
# if no matching deployment exists, create one
if not deployment_id:
    print(f'No succeeded deployment of "{desired_model}" found.')

    # Deploy the model
    print(f'Creating a new deployment with model: {desired_model}')
    result = openai.Deployment.create(model=desired_model, scale_settings={"scale_type":"standard"})
    deployment_id = result["id"]
    print(f'Successfully created {desired_model} that supports text {desired_capability} with id: {deployment_id}.')
else:
    print(f'Found a succeeded deployment of "{desired_model}" that supports text {desired_capability} with id: {deployment_id}.')

Found a succeeded deployment of "gpt-35-turbo" that supports text completion with id: gpt-35-turbo.


## Text chunks generator

In [23]:
# A generator that splits a text into chunks of roughly n tokens, preferably ending at the end of a sentence
def chunk_generator(text, n, tokenizer):
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Find the nearest end of sentence within a range of 0.5 * n and 1.5 * n tokens
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            # Decode the tokens and check for full stop or newline
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # If no end of sentence found, use n tokens as the chunk size
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j
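
The sentence-boundary behaviour is easier to see on a small input. The sketch below repeats the generator so it runs on its own and swaps in a hypothetical `CharTokenizer` (one token per character) for the tiktoken encoding; the sample text and `n = 40` are illustrative assumptions, not values from the notebook.

```python
class CharTokenizer:
    """Hypothetical stand-in for a tiktoken encoding: one token per character."""

    def encode(self, text):
        return [ord(ch) for ch in text]

    def decode(self, tokens):
        return "".join(chr(t) for t in tokens)


def chunk_generator(text, n, tokenizer):
    # Same logic as the notebook's generator, repeated so this cell runs on its own
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Search backwards from 1.5 * n tokens for a chunk ending in "." or "\n"
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # No sentence end found in range: fall back to a chunk of n tokens
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j


tokenizer = CharTokenizer()
text = (
    "Banks borrow short and lend long. Rising rates squeeze that margin. "
    "Deposits can leave quickly. Bonds bought earlier lose value."
)
n = 40
chunks = [tokenizer.decode(c) for c in chunk_generator(text, n, tokenizer)]

# Print each chunk with its length to see where the splits fall
for chunk in chunks:
    print(len(chunk), repr(chunk))
```

The chunks always concatenate back to the original text, and no chunk exceeds `int(1.5 * n)` tokens, so `n` is a target size, not a hard cap.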


## Request API

In [24]:
def request_api(document, prompt_postfix, max_tokens):
    # Substitute the document into the prompt template
    prompt = prompt_postfix.replace('<document>', document)
    #print(f'>>> prompt : {prompt}')

    response = openai.Completion.create(
        deployment_id=deployment_id,
        prompt=prompt,
        temperature=0,
        max_tokens=max_tokens,
        top_p=1,
        frequency_penalty=1,
        presence_penalty=1,
        stop='###',
    )

    return response['choices'][0]['text']
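
Some outputs in this notebook end with chat markers such as `<|im_sep|>` or `<|im_end|>`, which the model can emit when called through the completion endpoint. A minimal sketch of how the prompt is assembled and how the returned text could be post-processed follows; `build_prompt` and `clean_completion` are hypothetical helpers, not part of the `openai` library, and the sample strings are made up for illustration.

```python
# Markers observed at the end of some outputs in this notebook
SPECIAL_MARKERS = ("<|im_end|>", "<|im_sep|>")


def build_prompt(document, prompt_postfix):
    """Insert the document into the template at the <document> placeholder."""
    return prompt_postfix.replace("<document>", document)


def clean_completion(text, markers=SPECIAL_MARKERS):
    """Hypothetical helper: cut at the first special marker, then trim whitespace."""
    cut = len(text)
    for marker in markers:
        pos = text.find(marker)
        if pos != -1:
            cut = min(cut, pos)
    return text[:cut].strip()


# Illustrative template and response text (assumptions, not real API output)
template = " <document>\n###\nSummarise the text above.\nSynopsis : "
prompt = build_prompt("Some transcript text.", template)
raw = ' 1. "First tag line."\n2. "Second tag line."<|im_sep|>'

print(prompt)
print(clean_completion(raw))
```

An alternative would be to pass these markers as additional `stop` sequences in the request, which stops generation earlier instead of trimming afterwards.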

## Generate Synopsis

In [25]:
def get_synopsis(content, prompt_postfix):
    import tiktoken

    synopsis_chunks = []
    n = 2000 # target chunk size in tokens
    max_tokens = 1000 # max tokens for response

    # Note: gpt-35-turbo uses the cl100k_base encoding, so p50k_base
    # token counts are only approximate for this model
    tokenizer = tiktoken.get_encoding('p50k_base')

    # Generate chunks
    chunks = chunk_generator(content, n, tokenizer)

    # Decode each chunk of tokens back into text
    text_chunks = [tokenizer.decode(chunk) for chunk in chunks]

    # Request a synopsis for each chunk
    for chunk in text_chunks:
        synopsis_chunks.append(request_api(chunk, prompt_postfix, max_tokens))
        #print(chunk)
        #print('>>> synopsis: \n' + synopsis_chunks[-1])

    # Join the per-chunk synopses
    synopsis = ' '.join(synopsis_chunks)

    return synopsis

In [26]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nSummarise the transcript of a podcast above into a synopsis. 
  \nSynopsis : 
"""
#print(prompt_postfix)

synopsis = get_synopsis(content, prompt_postfix)

print(synopsis)

Silicon Valley Bank (SVB) collapsed last week, the biggest bank failure since 2008. The collapse was caused by bad decisions at the bank and a rapid increase in interest rates. SVB's balance sheet had two weird features: on the deposit side they were almost entirely funded by business depositors who demand more interest when rates go up; on the lending side, all of these small companies that are Silicon Valley Bank’s core customers got huge amounts of money in and deposited it at Silicon Valley Bank. So their deposits quadrupled in a couple of years, or at least tripled in a couple of years. They had so much money they couldn't even loan it out as fast as it was coming in so bought Treasury bonds which have fixed interest rates - this meant that while costs for getting money went up with rising interest rates, profits from giving out loans did not rise because those assets' prices remained fixed.
 
The government has stepped into to ensure SVB's customers would still get their money 

## Translate Synopsis

In [27]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nTranslate synopsis into Mandarin.  
  \nTranslation : 
"""
#print(prompt_postfix)

In [28]:
max_tokens = 1000
translation = request_api(synopsis, prompt_postfix, max_tokens)
print(translation)

美国小型银行First NBC Bank Holding Company上周倒闭，这是自2008年以来最大的银行破产。该崩溃是由于该银行的错误决策和利率迅速上升造成的。SVB的资产负债表有两个奇怪的特点：在存款方面，他们几乎完全由商业存款人提供资金，当利率上涨时要求更高的利息；在放贷方面，所有这些作为硅谷银行核心客户群体中小企业得到了巨额资金，并将其存入硅谷银行。因此，在几年内他们的存款增加了四倍或至少三倍。他们有那么多钱，甚至无法像它进来一样快地发放贷款，所以购买了固定收益率国库券-这意味着尽管随着利率上涨而获取资金成本增加但给出贷款带来收益并没有增长因为那些资产价格保持不变。
政府已经介入确保SVB客户仍然能够拿回自己的钱, 但人们对我们是否足够安全感到担忧. 然而, Robert Armstrong认为只要我们不惊慌就应该没问题, 因为大多数单个银行都是健康且如果您账户里少于$250k美元，则受到美国政府保险覆盖。

Armstrong还指出现在与2008年期间发生情况之间存在三个区别:首先阅读建议SVB看起来像一个异常值; 这里并没有必然发生信用事件; 大型银行这次真正非常安全.

一家小型美国 First NBC Bank Holding Company 的破产引起了关注其他美国 银 行 健 康 的 忧 虑 。 但 是 ，《 金 融 时 报》 （FT）首席财务记者罗伯特·阿姆斯特朗（Robert Armstrong）认为 ，虽然监管和成熟度转换 - 放置长期货物需要短期货物 - 存 在风险 , 但很可能不会再有另一家主要美国 银 行 沉 浸 在 深 刻 的 麻 烦 中 ，因为它们比2008年前更具备充足资本 。Armstrong 还建议企业家询问他们是否有充足的资本 ; 各种各样类型 的 存 款 客 户 ; 平衡表 上 不同 类 型 的 贷 款 .<|im_sep|>


美国金融评论家罗伯特·阿姆斯特朗（Robert Armstrong）在《金融时报》上就硅谷银行的崩溃进行了讨论，并且说明这不是2008年金融危机的复制。他解释说，导致SVB倒闭的原因有两方面：一是银行内部出现了不当决定; 二是利率急剧上升。此外，Rob 还概述了银行如何运作、SVB具体出现什么问题以及是否存在更大的体系性原因。此外，他还详细说明了为什么我们在监管方面存在双标准制度, 以及道德弗兰克法案(Dodd-Frank Act) 的退减对此情况会不会造成影响。最后,  Robert Armstrong 向听众保证, 只要人民不惊慌失措, 情况应当一切安好, 250k 美元之下的储户由美国政府承保.
Robert Armstrong 金融专家提出了 SVB 最新崩盘及其对相关监管带来的影响。 他表明由于 2008 年危机之后所施加的法律法规使得如今的银行已然比 2008 年时更加强壮。 此外， 也告诫投者将存款存入需要留意看看能耐性情况。 最后 ， Robert Armstrong 预测随之考勒将使得 bank equity capital 更加昂���耗时曲緩效应将随之考勒使效能受到影响.

## Generate Tag Lines

In [29]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nGenerate 2 to 3 tag lines based on the podcast synopsis above.
"""
#print(prompt_postfix)

In [30]:
max_tokens = 500
tag_lines = request_api(synopsis, prompt_postfix, max_tokens)
print(tag_lines)

  
1. "Is the US banking system secure enough? The Financial Times' Robert Armstrong weighs in."
2. "The collapse of First NBC Bank Holding Company raises concerns about other banks in America."
3. "Entrepreneurs, are you asking your bank these three questions?"<|im_sep|>


## Generate Search Engine Optimised (SEO) Keywords

In [31]:
# Prompt postfix
prompt_postfix = """ <document>
  \n###
  \nGenerate 5 search engine optimised keywords based on text above.  
"""
#print(prompt_postfix)

In [32]:
def get_keywords(content, prompt_postfix):
    import tiktoken

    keywords_chunks = []
    n = 2000 # target chunk size in tokens
    max_tokens = 100 # max tokens for response

    tokenizer = tiktoken.get_encoding('p50k_base')

    # Generate chunks
    chunks = chunk_generator(content, n, tokenizer)

    # Decode each chunk of tokens back into text
    text_chunks = [tokenizer.decode(chunk) for chunk in chunks]

    # Request keywords for each chunk
    for chunk in text_chunks:
        keywords_chunks.append(request_api(chunk, prompt_postfix, max_tokens))

    # Join the per-chunk keywords
    keywords = ' '.join(keywords_chunks)
    return keywords
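
`get_keywords` follows the same pattern as `get_synopsis` and differs only in `max_tokens`, so the per-chunk loop could be shared. The sketch below shows one possible shape for such a helper; `run_over_chunks` and `fake_request` are hypothetical names, and `fake_request` stands in for `request_api` so the example runs without network access.

```python
def run_over_chunks(text_chunks, prompt_postfix, max_tokens, request_fn):
    """Hypothetical helper: call request_fn once per chunk and join the results."""
    results = [request_fn(chunk, prompt_postfix, max_tokens) for chunk in text_chunks]
    return " ".join(results)


def fake_request(document, prompt_postfix, max_tokens):
    # Stand-in for request_api: reports what it was given instead of calling the API
    return f"[{len(document)} chars, max {max_tokens}]"


demo = run_over_chunks(["first chunk", "second"], "<document>\n###\n", 100, fake_request)
print(demo)  # [11 chars, max 100] [6 chars, max 100]
```

Passing the request function as an argument also makes the chunk loop testable without an Azure OpenAI deployment.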

In [33]:
keywords = get_keywords(synopsis, prompt_postfix)
print(keywords)

 
1. US banks: are they safe?
2. First NBC Bank Holding Company collapse
3. Silicon Valley Bank failure
4. Banking regulation and maturity transformation risks 
5. Entrepreneurs' banking advice<|im_end|>
