Note:
This project has been set up for podcast episodes with a run time of 25 to 30 minutes. The bottleneck is the Whisper transcription time, which determines how long a GPU must remain available in the Google Colab environment.

Transcribing longer podcasts is possible by splitting the audio into chunks and transcribing each chunk separately.
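
As a rough sketch of the chunking idea, the helper below computes the start and end offsets of each chunk for an episode of a given length. The name `chunk_boundaries` and the 25-minute default are assumptions made for illustration and are not part of this notebook. Cutting the audio at these offsets (for example with `ffmpeg`) and transcribing each piece is not shown.

```python
from typing import List, Tuple


def chunk_boundaries(total_seconds: int, chunk_seconds: int = 25 * 60) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in seconds that cover the whole episode.

    Hypothetical helper: the 25-minute default mirrors the episode length
    this notebook was set up for, it is not a Whisper requirement.
    """
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds and chunk_seconds must be positive")

    boundaries = []
    start = 0
    while start < total_seconds:
        # The last chunk is shorter when the episode length is not a multiple
        end = min(start + chunk_seconds, total_seconds)
        boundaries.append((start, end))
        start = end
    return boundaries


# A 60-minute episode split into 25-minute chunks
print(chunk_boundaries(60 * 60))
```

Splitting at fixed offsets can cut a sentence in half, so in practice a small overlap between chunks or a cut at a silent passage may give cleaner transcripts.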

In [None]:
!pip install feedparser

Collecting feedparser
  Downloading feedparser-6.0.10-py3-none-any.whl (81 kB)
Collecting sgmllib3k (from feedparser)
  Downloading sgmllib3k-1.0.0.tar.gz (5.8 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: sgmllib3k
  Building wheel for sgmllib3k (setup.py) ... done
  Created wheel for sgmllib3k: filename=sgmllib3k-1.0.0-py3-none-any.whl size=6048 sha256=cef970910644acdb16edb5d13baf04e618599fc832708b5a848a9552f13120d4
  Stored in directory: /root/.cache/pip/wheels/f0/69/93/a47e9d621be168e9e33c7ce60524393c0b92ae83cf6c6e89c5
Successfully built sgmllib3k
Installing collected packages: sgmllib3k, feedparser
Successfully installed feedparser-6.0.10 sgmllib3k-1.0.0


## Step 1 - Podcast Extraction

Enter the RSS feed URL of your selected podcast in the cell below.


---



In [None]:
import feedparser
podcast_feed_url = "https://feeds.simplecast.com/G0yLKQmm"
podcast_feed = feedparser.parse(podcast_feed_url)

In [None]:
print ("The number of podcast entries is ", len(podcast_feed.entries))

The number of podcast entries is  171


Let's get the URL of the most recent episode from the feed, download the corresponding MP3 file, and save it on Google Colab as `podcast_episode.mp3`.
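
The selection step can be sketched as a small function that is easy to test without a network connection. The name `find_audio_url` and the example links are made up for illustration. The function only assumes that each link behaves like a dictionary with optional `type` and `href` keys, which is how the feed links are accessed in this notebook.

```python
from typing import Iterable, Mapping, Optional


def find_audio_url(links: Iterable[Mapping], mime_type: str = "audio/mpeg") -> Optional[str]:
    """Return the href of the first link with the given MIME type, or None."""
    for link in links:
        # .get() avoids a KeyError for links that carry no 'type' field
        if link.get("type") == mime_type:
            return link.get("href")
    return None


# Made-up data shaped like the `links` list of a feed entry
example_links = [
    {"type": "text/html", "href": "https://example.com/episode-page"},
    {"type": "audio/mpeg", "href": "https://example.com/episode.mp3"},
]
print(find_audio_url(example_links))
```

Returning `None` when no audio link exists makes it possible to skip the download instead of failing on an undefined variable.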

In [None]:
# Find the first MP3 link of the most recent episode
episode_url = None
for item in podcast_feed.entries[0].links:
  if item.get('type') == 'audio/mpeg':
    episode_url = item.href
    break
!wget -O 'podcast_episode.mp3' '{episode_url}'

--2023-08-20 16:47:11--  https://pdcn.co/e/cdn.simplecast.com/audio/43fdc339-85ec-4947-a05d-66132f42ff72/episodes/606cfbf0-5c2a-4c0b-9959-9eea641eeb49/audio/41cec9ca-0e53-4abd-adaa-d2a765f2a7e2/default_tc.mp3?aid=rss_feed
Resolving pdcn.co (pdcn.co)... 54.183.48.91, 54.177.168.223
Connecting to pdcn.co (pdcn.co)|54.183.48.91|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn.simplecast.com/audio/43fdc339-85ec-4947-a05d-66132f42ff72/episodes/606cfbf0-5c2a-4c0b-9959-9eea641eeb49/audio/41cec9ca-0e53-4abd-adaa-d2a765f2a7e2/default_tc.mp3?aid=rss_feed [following]
--2023-08-20 16:47:12--  https://cdn.simplecast.com/audio/43fdc339-85ec-4947-a05d-66132f42ff72/episodes/606cfbf0-5c2a-4c0b-9959-9eea641eeb49/audio/41cec9ca-0e53-4abd-adaa-d2a765f2a7e2/default_tc.mp3?aid=rss_feed
Resolving cdn.simplecast.com (cdn.simplecast.com)... 54.230.112.81, 54.230.112.70, 54.230.112.96, ...
Connecting to cdn.simplecast.com (cdn.simplecast.com)|54.230.112.81|:443... conne

## Step 2 - Transcribe the Audio file

I will use [Whisper](https://github.com/openai/whisper) as the speech-to-text model. OpenAI has open-sourced this model, so we can download it and use it directly. We first install the whisper package and then use the `medium` model to transcribe the downloaded podcast.

Please note that some of the cells below may take up to a minute to run, because they download a model of about 1.5 GB and then load it into GPU memory.

In [None]:
!pip install git+https://github.com/openai/whisper.git  -q

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Building wheel for openai-whisper (pyproject.toml) ... done


In [None]:
%%time

import pathlib
import whisper
# Download the model only once and cache it under /content/podcast/
model_path = pathlib.Path("/content/podcast/medium.pt")
if model_path.exists():
  print ("Model has been downloaded, no re-download necessary")
else:
  print ("Starting download of Whisper Model")
  whisper._download(whisper._MODELS["medium"], '/content/podcast/', False)

Starting download of Whisper Model


100%|██████████████████████████████████████| 1.42G/1.42G [00:15<00:00, 101MiB/s]


CPU times: user 7.14 s, sys: 3.98 s, total: 11.1 s
Wall time: 20.9 s


In [None]:
# Load model from saved location
model = whisper.load_model('medium', device='cuda', download_root='/content/podcast/')

Now we need to pass in the location of our downloaded podcast file to get the transcript.

**NOTE**:
- This step can take a while; the time grows with the length of the podcast episode.
- This notebook was created with the GPU runtime enabled, which already speeds things up. On the free version of Google Colab, the notebook will most likely be assigned a T4 GPU, on which transcription takes roughly a fifth (20%) of the episode's run time.
- With a paid version of Google Colab, you can choose a faster GPU, such as a V100 or A100, to speed things up.
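
The rule of thumb above can be turned into a quick back-of-the-envelope estimate. This is only a sketch: the function name is hypothetical, and the 0.2 factor is the rough T4 figure quoted in the note, not a measured constant.

```python
def estimate_transcription_minutes(episode_minutes: float, time_fraction: float = 0.2) -> float:
    """Estimate the transcription wall time as a fraction of the episode length.

    time_fraction=0.2 is the rough T4 figure from the note above (an assumption,
    actual times vary with the GPU, the model size and the audio).
    """
    if episode_minutes < 0 or time_fraction <= 0:
        raise ValueError("episode_minutes must be >= 0 and time_fraction > 0")
    return episode_minutes * time_fraction


for minutes in (25, 30, 60):
    estimate = estimate_transcription_minutes(minutes)
    print(f"{minutes} min episode -> about {estimate:.1f} min of transcription")
```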

In [None]:
%%time
result = model.transcribe("/content/podcast_episode.mp3")

CPU times: user 4min 11s, sys: 1.25 s, total: 4min 12s
Wall time: 4min 18s


In [None]:
# Check the transcription happened correctly by peeking into the first 1000 characters
podcast_transcript = result['text']
result['text'][:1000]

" I'm Brian Scordato and this is the Idea to Startup Podcast, a podcast that's been called quote, easily 10 times more useful than my MBA, which probably says more about higher education than our pod, but it was a nice review. We're going to start sending the pod along with some deeper content each week, so if you're a power listener of Idea to Startup, head to gettacklebox.beehive.com or the link in the show notes. Beehive is spelled a bit wildly, so it's gettacklebox.beehive.com. On to it. Today we're going to help you find a differentiator sturdy enough to support a business. If this feels a little daunting, good. It's probably the most important decision you'll make, and since it's so important, entrepreneurs tend to shy away from it. Humans are wired to avoid discomfort, and what is less comfortable than making a decision is making this seemingly permanent, the thing you'll lean on to make or break your business. But remember from last week, to be a good entrepreneur, you've got t

To avoid losing the Python variable that holds the transcription if the Colab notebook shuts down, and to allow faster testing of the subsequent sections, I have hard-coded the podcast transcript in a variable below.
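
An alternative to pasting the transcript into a cell is to write it to a text file and read it back later. The sketch below uses only the standard library. The helper names `save_transcript` and `load_transcript` are hypothetical and not part of this notebook.

```python
from pathlib import Path


def save_transcript(text: str, path: str) -> Path:
    """Write the transcript to a UTF-8 text file, creating parent folders if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


def load_transcript(path: str) -> str:
    """Read a previously saved transcript back into a string."""
    return Path(path).read_text(encoding="utf-8")
```

For example, `save_transcript(result['text'], '/content/podcast/transcript.txt')` would store the Whisper output (the file name is an assumption). Files under `/content/` live on the Colab VM, so a location that outlives the session, such as a mounted Google Drive folder, would be needed for the file to survive a shutdown.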


In [None]:
podcast_transcript

" I'm Brian Scordato and this is the Idea to Startup Podcast, a podcast that's been called quote, easily 10 times more useful than my MBA, which probably says more about higher education than our pod, but it was a nice review. We're going to start sending the pod along with some deeper content each week, so if you're a power listener of Idea to Startup, head to gettacklebox.beehive.com or the link in the show notes. Beehive is spelled a bit wildly, so it's gettacklebox.beehive.com. On to it. Today we're going to help you find a differentiator sturdy enough to support a business. If this feels a little daunting, good. It's probably the most important decision you'll make, and since it's so important, entrepreneurs tend to shy away from it. Humans are wired to avoid discomfort, and what is less comfortable than making a decision is making this seemingly permanent, the thing you'll lean on to make or break your business. But remember from last week, to be a good entrepreneur, you've got t

In [None]:
podcast_transcript = """
 I'm Brian Scordato and this is the Idea to Startup Podcast, a podcast that's been called quote, easily 10 times more useful than my MBA, which probably says more about higher education than our pod, but it was a nice review. We're going to start sending the pod along with some deeper content each week, so if you're a power listener of Idea to Startup, head to gettacklebox.beehive.com or the link in the show notes. Beehive is spelled a bit wildly, so it's gettacklebox.beehive.com. On to it. Today we're going to help you find a differentiator sturdy enough to support a business. If this feels a little daunting, good. It's probably the most important decision you'll make, and since it's so important, entrepreneurs tend to shy away from it. Humans are wired to avoid discomfort, and what is less comfortable than making a decision is making this seemingly permanent, the thing you'll lean on to make or break your business. But remember from last week, to be a good entrepreneur, you've got to be a buffalo. Run into the storm. Work to rewire your brain to get excited when you start to get that pit in your stomach. That means you're on to something. The uncomfortable path is the one that leads you to places most other people never go, which is where all the interesting stuff in life tends to be. Stop uncomfortable for opportunity in your mind, and you're going to be alright. So today we'll do that, to find your differentiator. We're going to start by getting everyone on the same page about what a differentiator is, and more importantly, what it'll help you do. Then we'll talk about bagels. Finally, we'll give you a framework to find your differentiator. Nice little summer episode. And yes, fine. Before we start, let's close the loop. I did watch Interstellar. It was good. I enjoyed it. You all win. 
I'm not sure it's email someone from a podcast who is really only using it as a literary structural device to yell at them for not liking it good, but I'm not going to tell you how to eat your Cheerios. I did also get some emails about my failure to launch jokes, saying that that is a great movie I shouldn't slander either, but I have to draw the line somewhere, and I'm drawing it here. A movie that got a 23% on Rotten Tomatoes and has reviews like this one from Alberto that described it as, quote, a very weak film in which very few things are good is not breaking into the roughly 17 minutes of leisure time I get each day while running a business and trying to get an eight month old to eat smushed up peaches while Ruby pouts in the corner with a you used to be cool man look on her face that shatters my heart. It just isn't in the cards. Anyway, let's talk differentiators. The term itself is broad and surprisingly hard to wrap your arms around, which is coincidentally exactly how one reviewer described failure to launch. Sorry, I'm done. If you're early on in your startup journey and I ask you what your differentiator is, you're probably going to struggle to answer. Your thoughts might go right to product. What are you going to build that will be sufficiently different from your competitors? Maybe it's a feature. Maybe it's a bunch of features that together make you different. Or you might think I'm tricking you and you know that I love customer stuff, so you might move there. Maybe your differentiator is focusing on a customer that competitors can't, don't, or won't focus on for whatever reason. Maybe you think it'll be a servicy differentiator like how Zappos says customer support is theirs. Or maybe it'll be something completely vague like, quote, we'll pay attention to the details or quote, we'll work harder than anyone. If you aren't clear on your differentiator, then you don't have one. And I don't blame you because the term differentiator isn't clear. 
People use it to describe lots of things and without knowing in the parlance of last week's episode that differentiator's job to be done is hard to have one, which means you won't benefit from it. So let's start there. What is the, quote, job of a differentiator? As a founder, what are you hiring one to do? The answer is that they help you make decisions fast. Differentiators are about speed. Decisions are oxygen for your startup and they're exceedingly hard to make for entrepreneurs. Not because we're indecisive, although a lot of us are. More because we never have enough information to make a truly informed decision and that scenario takes a while to get used to. It's like buying a car only knowing that it, quote, has some wheels and might work. There's the emotional component too. Decisions feel weightier when it feels like your self-worth as an entrepreneur is on the line. You're buying that car you know nothing about and you're spending your life savings on it and your friends will judge you based on how it drives. Or at least it feels that way. So a differentiator, one thing that supersedes everything else, is really important because it can act as your organizing principle. It serves as the source of truth amongst all the incomplete information you have. You pick a differentiator, then every decision you make is in support of that thing. This might not make a ton of sense in the abstract, so here's a quick example. Let's say you're starting a jeans brand. When you go to create a landing page in social ads and emails and everything else, you'll hit a decision wall. If in your mind you see yourself as a jeans company that makes organic and stretchy jeans in the USA and every pair you sell, you also donate a pair to a homeless shelter and you're going to grow through small grassroots influencers on TikTok, you're going to be paralyzed. If there's no hierarchy, no organizing principle, you don't know what to include or to leave out. The story gets overwhelming. 
It'll become an exercise in the attention pie. If you don't remember this mental device from our good friend Joey Cofone, the attention pie is the idea that each new message you create dilutes all the others. The more things you have, the more clutter the pie, the less importance any one thing gets. The harder it is for your customer to know what you do and make a decision about you. But if you have a differentiator, one thing you believe anchors the company, one thing truly different that's going to break through, you can organize around it. So maybe you're making jeans and your differentiator is that you make high end jeans that are a custom fit for women under five foot one, because no other jeans company is focusing on that customer. Then your landing page becomes easy. Every bit of it supports that differentiator. Maybe your sizing is different and the pockets are in different shapes and you work with women under five foot one on TikTok and on and on and on. And you can just put those things on the website. But each of the subsequent decisions supports the big one. We make high quality jeans for women under five foot one. It's clear why you exist, who you're for, what your secret is. It's actionable. Think about how much easier it is to make those landing pages and social ads. Every decision waterfalls from that core differentiator. Where should you market? Who should you partner with? What materials should you source? These decisions become fast because you have a purpose for each support the core message. Best of all, since you are so different from your competitors, you can use their strengths against them like some sort of entrepreneurship jujitsu move. They can't possibly compete because while their jeans might be beautiful, the inseam doesn't flatter people under five one and the pockets don't sit like yours. And all the things they're set up to do are actually harmful for building something specific to your customer. 
A clear differentiator, one your customer cares about, one obviously different from competitive offerings when you keep making decisions to support becomes a moat. Because to copy you, competitors might need to do 10 things as well as you, which just isn't happening. There's a great book by a real jerk called Leading with the Heart. It's written by Coach K, the old Duke basketball coach who was a real cheater. He also lost his last game to UNC for the record. Anyway, the book is grudgingly fantastic. I read it as a sort of keep your enemies closer thing. One of the core themes is that courage plus confidence leads to decision making. Slow decision making comes from either a lack of confidence or a lack of courage. A clear differentiator shifts this equation in your favor and lets you move fast. You're confident in what matters and this makes the courageous part much easier. Plus you're a buffalo. You'll be good. I visualized the structure of a great company like a pyramid with your differentiator, something meaningfully obviously uniquely different sitting at the top. Then a latticework of thousands of decisions are the blocks that support it. Today we'll help you think through this and figure out your differentiator. And we'll start by talking about the best bagel in New York City, which just so happens to come from a shop in the suburbs of Connecticut. We'll sort through that. I go throw up in the bathroom because I complimented my sworn enemy, Coach K. Then we listen to some smooth jazz. If you've got a startup idea and a full time job and want to test out the former before you leave the latter, come and work with us. Apply at GetTackleBox.com. Over 400 startups have tested and built ideas through our program and those businesses are now collectively worth over a billion dollars. Our program helps you prioritize and execute and our members and me and the team keep you accountable and give you feedback along the way. Come build with us at GetTackleBox.com. 
Back to it. The best bagels in New York City are in Fairfield County, Connecticut. This is true, somehow. Pop-Up Bagels, a bagel shop that started during the pandemic when a guy was messing around with sourdough recipes and then decided to try his hand at bagels, has won Brooklyn Bagel Fest's Best Bagel Two Years Running. I have them just about every Sunday. They're great. The fastest growing bagel shop anyone has ever seen has raised millions of dollars from people like Paul Rudd, Michael Strahan, and Michael Phelps. The guy who runs it, Adam Goldberg, at least as of fairly recently, which is the last I heard, still works a full time job selling flood mitigation systems. So if you're scoring at home, a guy who had never made bagels before now makes the best bagels in New York City, a city world renowned for bagels, and he does it from the suburbs of Connecticut. A business that traditionally doesn't scale because of the thin margins, a bagel shop, is scaling aggressively. A space that everyone assumed was saturated, there's a bagel shop in every town in America, or at least the Northeast, apparently isn't. So what the heck is going on? A differentiator? I've followed Pop-Up Bagels pretty closely because I love bagels and I love unexpected stuff and I love figuring out why something is growing disproportionately fast to everything else that looks like it. Early on, I saw an interview where Goldberg was talking about why he thought there was opportunity. I'm paraphrasing, but he said something about the magic of a hot bagel, but that most bagels you get aren't hot. And that is true. When you go to every bagel shop in the country, you'll see a bunch of wire bins with bagels that have been cooked some time that day sitting in there. Every once in a while, you win the bagel lotto and you get a fresh piping hot everything bagel with steam coming out. Most of the time, you get one that's lukewarm or cold. 
One of the bagel shops I used to live near in Union Square in New York City was called David's and they'd hang signs that said hot when they added a new batch of hot bagels to one of those wire bins and I always picked the type of bagel exclusively based on that. I hate pumpernickel bagels, but a fresh one out of the oven is better than a cold sesame eight days a week. Goldberg realized this. And this was his differentiator. He saw the gulf between a hot bagel and a not hot one as not a trivial thing, but as a totally new business opportunity. At Pop-Up Bagels, you would always get a piping hot bagel. Now again, I don't know this guy. We've never talked. I don't even know if he did this on purpose, but Pop-Up Bagels is set up as if he did. The core differentiator of the thing that matters is that every bagel you ever get from Pop-Up is hot. And every other part of the business supports this differentiator. During the early days, before he had permanent space, he'd borrow the kitchens of restaurants or rent kitchens to host his pop-ups. Since he needed the bagels to be hot when people picked them up, he had customers pre-order a few days before and select a pickup time so that the logistics would work. This way, he could plan out the waves of hot bagels and perfectly predict inventory. This decision, starting with hot bagels, makes all the subsequent ones easy. For example, most bagel shops let you walk in, order your bagel, have the people work in there, put scion cream cheese on it, then you go to the cashier, pay $5 and you're on your way. But this would be logistically impossible if your goal was to give everyone piping hot bagels. It'd take too much time. So Pop-Up can't do it. And since they can't do it, they can make another decision. They don't need spaces with storefronts. They just need industrial kitchen space. Then, if you're pre-ordering hot bagels already, it makes no sense to order just one bagel, especially if you can't get cream cheese on it. 
So it's dozens only. And two containers of quote, schmears, a selection of cream cheeses that come in little cardboard containers. This model, Pop-Up shops, bagels by the dozen, pre-order and pick up at a set time so your bagels are always fresh out of the oven, worked. Word spread. During the early days, Pop-Up got so busy that you had to book your bagel slot a week early. This led to a subscription offering where you could subscribe and have a standing appointment for your dozen hot bagels, say, every Saturday at 10 a.m. From a pricing perspective, if you have a real differentiator, you should be able to overcharge for it. And Pop-Up Bagels does. They charge $42 for a dozen bagels and two schmears. For reference, the average cost of a dozen bagels in New York City is about $15. The margin comes from the value and the lack of competition. Sure, there are other bagel shops, but not places that guarantee hot bagels and let you grab and go. No payment, no waiting on people getting bacon, egg and cheeses in front of you, on and on. The willingness to overpay shows the value over the alternative. Finally, from a bagel quality perspective, people overpaying gives some financial wiggle room. The New York Times describes Pop-Up Bagels' product as smaller, airier and crispier than a traditional New York bagel, with a texture similar to a baguette. Goldberg said the dough is double-proofed, which adds flavor and creates a softer interior and more robust crust. When I asked my friend, a bread baker, what the heck double-proofed meant, he basically said it's a better way to make bread and it's not really a secret. It just takes more time and effort. And bagels are cheap, so people crank them out quicker and tend to not do that. It's a volume business, not quality. So the waterfall happens. Since hot bagels are something people will pay extra for, Pop-Up can spend more money and time on each bagel and make them higher quality. So they win awards. 
And the whole thing reinforces itself. As Pop-Up has grown, Goldberg has stuck to the formula. There are now six kitchens that you can schedule pickup orders from. The location I go to appears to be run by a bunch of high school kids. You pull up in your car at the required time, they walk to the window and ask you for your name, then come back a minute later with your bag of piping hot bagels. You drive home and you eat. For parents, the overwhelming demographic ordering bagels each weekend in Fairfield County and the other pickup location, the Hamptons, this is an extraordinary experience. Not having to get out of the car, having a bunch of hot delicious food kids are going to eat, having a routine, these are all things parents value. And the locations are filled with pretty affluent folks able to pay $42 for bagels. The differentiator matches with the customer need, the customer's process, and it's sticky. Goldberg realized that hot bagels matter a whole lot and basically nothing else from a traditional bagel shop matters at all, at least for his customer. So when you break it down, it's not actually all that surprising that the best bagels in New York City come from the suburbs of Connecticut. They're playing a fundamentally different game than anyone else. One that's much easier to win. A differentiator generator. I have no idea if pop-up bagels is going to work at scale, but no startup can ever guarantee their differentiator is going to work forever. The goal is to get escape velocity, to get enough momentum to move into the next stage of the business, to build a little cult of early customers in the good way, not the creepy way, then decide on a new strategy if growth requires it. But the differentiator gives you options, gives you speed. Now for your differentiator, how do you come up with one? More realistically, how do you decide which of the things you're doing is worth being the focal point of your business? 
I've got a list of five things that'll hopefully jar something loose for you. First, your differentiator needs to be aggressively, diametrically opposed to the competition. The best way to do this is to start with a tight customer segment and realize why their needs aren't being met and start from first principles to figure out a way that meets them exactly where they are. This does two things. First, it lets you do that competitor jujitsu move where you use their strengths against them. It might be easy to say, well, hey, the local bagel shops can get apps that allow people to pre-order. And sure, they could, but they could never go all in on this strategy because of all the legacy and sunk costs they have. They've got locations with stations for bacon, egg, and cheese and a method for getting bagels out. They have cashiers, they have staff. If Pop-Up Bagels does 10 things that support the big differentiator of pre-ordering and picking up hot bagels, maybe existing bagel shops could do each of those 10 things at 40 to 50% as well as Pop-Up. Add that up and you get a terrible product. Being aggressively different from competition creates a moat. Second, being different creates word of mouth and when something is meaningfully different, it's eminently shareable. Best of all, what people share is going to be consistent. Before I tried Pop-Up Bagels, I had four or five friends tell me basically the same thing. Quote, you pre-order your bagels and pick them up so they're right out of the oven when you get there. It's amazing. Everyone said the bagels were great, but they said they were great because they were hot. Real differentiators travel. My grandfather and father both have a saying they repeat like their parents. To be a difference, a difference has to make a difference. The thing that separates you from however your customer solves their problem now needs to be seriously different. 
A good sign here for you is lots of people being highly skeptical of your differentiator. Ideally, you'd like 95% of people you meet to say what you're doing is a terrible idea and 5% to want to run through traffic to get it. It's a bad sign if everyone is in agreement that your differentiator is a good idea. That means it's too safe. Push farther away from the competition. It should make people feel a bit uncomfortable. Ideally, one of your friends is going to pull you aside and say something like, hey, don't risk too much on this or hey, I'm saying this as a friend. This isn't a good idea. All the obvious ideas are already taken. That type of reaction is a good sign. Most great differentiators remove or ignore 95% of what competitors do and pick 5% and focus on it for a specific customer. That is where the value is. No one is ever going to pay 40 bucks for a dozen bagels until they do. Second, people will overpay for your differentiator. I happily overpay for hot bagels. I overpay for shirts made for tall lanky guys. I overpay for stuff that solves a specific problem for me and is significantly better than all of my other options. A great test for a viable differentiator is to charge a huge margin for it. Do it early. Do it with tests. You can't do it without what you've got. Third, most differentiators come after some sort of shift. There's a question founders get a lot. Why now? What's happened that makes this opportunity viable today when it wasn't a year or two ago? For pop-up bagels, the pandemic changed people's behavior around ordering and picking up food. In general, we've seen the rise of ghost kitchens or working kitchens that didn't have a traditional storefront or seating atmosphere. Also, apps for ordering and reserving time slots are now white labelable and cheap. You don't need a developer to build you one. You just need 50 bucks a month and a Stripe plugin and you're off and running. 
Tech barriers, cooking barriers, and mental barriers all broke down leading up to the rise of pop-up. What is this for you? Fourth, great differentiators smush stuff together. If you step back, pop-up bagels looks a little bit more like a SaaS business than a bagel shop. Monthly subscribers, predictable demand, one product delivered to many without customization. Bringing a SaaS subscription simplified product approach to bagels was brilliant. This smushing generally comes from the founder's experience. What's the secret you know about your customer mixed with something you learned from another industry? And finally, the hardest part of any differentiator, sticking to it. What blows me away about pop-up bagels isn't that it got started during the pandemic by a guy that had never cooked bagels before or that it gained this amount of traction. By the way, Goldberg is 47, so all you 32 year old saying you're too old to start something or learn something new, come on. What really blows me away is that the guy who started it has been so disciplined. Can you imagine how many people have said something like, oh, you have to open up a sit down restaurant and oh, you want margin, you got to serve mimosas or you got to freeze these things and get them into Whole Foods. And some of the people suggesting those things might have been like Michael Phelps and Paul Rudd. How many people told him early on that this was a fad, that people wouldn't continue to get his bagels and what he was doing was nuts? One of the hardest parts of a differentiator is sticking with it. Humans love sabotaging themselves as soon as something good starts to happen. Apparently we've been doing it since 27 AD when Patronius, a stoic I think, said this quote which I found on a random article. 
I was to learn later in life that we tend to meet any new situation by reorganizing and what a wonderful method it can be for creating the illusion of progress while producing confusion, inefficiency, and demoralization. Founders take time to show themselves. Give them that time. Then once you have them, you want to lean into them for a while without changing course and reorganizing constantly. So many founders have a brand new differentiator every week. You don't give them a chance to develop. And differentiators are hard to predict. Like most things in the startup world, the magic happens after you get in with customers and thrash around a bit, not before. Thrash, test, then lean in. That is the plan. We'll end with one of many quotes from Interstellar I didn't quite understand but I'll pretend I did because it sounded deep. Quote, we used to look up at the sky and wonder our place in the stars. Now we just look down and worry about our place in the dirt. Indeed, McConaughey. Indeed. This was the Idea to Startup podcast brought to you by Tacklebox. If you have a startup idea and a full-time job, head to gettacklebox.com and apply. We'll get back to you in 72 hours. Have a great week.
"""

## Step 3 - Creating a summary of the podcast

As part of the information extraction, I want to first create a summary of the podcast. I want this to be concise while still conveying the gist of the episode and trying to catch the attention of the user. I have used the OpenAI `gpt-3.5-turbo` model to generate this summary by passing in the generated transcript. I am asking the LLM to go through the entire transcript we provide and summarize it for us.

Required libraries: `openai` and `tiktoken`. The `openai` library is the Python package that lets us make calls to the API. The `tiktoken` library lets us count the tokens in our transcript, which gives an indication of cost and tells us whether we need to switch to a model with a larger context window. While we could call the API directly, it is much easier to work with the Python library provided by OpenAI.

In [None]:
!pip install openai
!pip install tiktoken

Collecting openai
  Downloading openai-0.27.8-py3-none-any.whl (73 kB)
Installing collected packages: openai
Successfully installed openai-0.27.8


In [None]:
import openai
from getpass import getpass

openai.api_key = getpass('Enter the OpenAI API Key in the cell  ')

Enter the OpenAI API Key in the cell  ··········


In [None]:
# we can confirm that the API key works by listing all the OpenAI models
models = openai.Model.list()
for model in models["data"]:
  print (model["root"])

text-davinci-001
text-search-curie-query-001
gpt-3.5-turbo
davinci
gpt-3.5-turbo-0613
babbage
text-babbage-001
curie-instruct-beta
text-davinci-003
davinci-similarity
code-davinci-edit-001
text-similarity-curie-001
ada-code-search-text
text-search-ada-query-001
gpt-3.5-turbo-16k-0613
babbage-search-query
ada-similarity
text-curie-001
gpt-3.5-turbo-16k
text-search-ada-doc-001
text-search-babbage-query-001
code-search-ada-code-001
curie-search-document
text-search-davinci-query-001
text-search-curie-doc-001
babbage-search-document
babbage-code-search-text
text-embedding-ada-002
davinci-instruct-beta
davinci-search-query
text-similarity-babbage-001
text-davinci-002
code-search-babbage-text-001
text-search-davinci-doc-001
code-search-ada-text-001
ada-search-query
text-similarity-ada-001
ada-code-search-code
whisper-1
text-davinci-edit-001
davinci-search-document
curie-search-query
babbage-similarity
ada
ada-search-document
text-ada-001
text-similarity-davinci-001
curie-similarity
babbage-c

**Context Window**

It's important to understand the concept of a context window: the maximum amount of text that can be used in one API call to the `gpt-3.5-turbo` model. It covers not only the input text sent to the model but also the output response. Keep in mind that it is measured in tokens, not words. The two are often treated as analogous, but they are not the same, because one word may be broken down into multiple tokens.

We use the tiktoken package to determine the number of tokens in your text.

In [None]:
import tiktoken
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
print ("Number of tokens in input prompt ", len(enc.encode(podcast_transcript)))

Number of tokens in input prompt  5360


As you can see, the above podcast episode comes to 5360 tokens, which is higher than the 4096 tokens accepted by the default [GPT-3.5-turbo model](https://platform.openai.com/docs/models/gpt-3-5). This means I have to use the larger `gpt-3.5-turbo-16k` model, which has a context size of 16,384 tokens.
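
The check above can be turned into a small helper that picks a model from the token count. This is a minimal sketch rather than part of the original notebook: the helper name `pick_model` and the 500-token allowance for the reply are assumptions, while the context sizes are the 4096 and 16,384 figures quoted above.

```python
# Context sizes quoted in the text above (prompt and reply share this budget).
CONTEXT_WINDOWS = {
    "gpt-3.5-turbo": 4096,
    "gpt-3.5-turbo-16k": 16384,
}

def pick_model(prompt_tokens, reply_tokens=500):
    """Return the smallest model whose context window fits prompt plus reply.

    reply_tokens is an assumed allowance for the generated summary.
    """
    needed = prompt_tokens + reply_tokens
    # Try the models from the smallest window to the largest.
    for model, window in sorted(CONTEXT_WINDOWS.items(), key=lambda kv: kv[1]):
        if needed <= window:
            return model
    raise ValueError(f"{needed} tokens exceed every available context window")

# 5360 is the transcript size measured above.
print(pick_model(5360))
```

Reserving room for the reply matters because the input and the output count against the same window.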

In [None]:
instructPrompt = """
please give a summary of the following for a newsletter. make it sound interesting and catchy.
"""

request = instructPrompt + podcast_transcript

In [None]:
chatOutput = openai.ChatCompletion.create(model="gpt-3.5-turbo-16k",
                                            messages=[{"role": "system", "content": "You are a helpful assistant."},
                                                      {"role": "user", "content": request}
                                                      ]
                                            )



In [None]:
podcastSummary = chatOutput.choices[0].message.content
podcastSummary

'Introducing the Idea to Startup Podcast, a podcast that has been hailed as "easily 10 times more useful than an MBA." And now, they\'re taking things to the next level by sending out deeper content each week. But before we dive into today\'s episode, let\'s close the loop on a few things. Yes, the host did watch Interstellar, and surprisingly, enjoyed it. But the real excitement lies in finding your differentiator, the key to success in building a business. This episode will guide you through the process of identifying your differentiator, with the help of some unexpected inspiration: bagels. Discover how a bagel shop in the suburbs of Connecticut has managed to win awards for the best bagels in New York City, by focusing on one simple differentiator: hot bagels. Learn how this unique selling point helped the shop stand out from the competition and create a loyal customer base. The podcast will provide a framework for finding your own differentiator, one that is aggressively different

## Step 4 - Using `functions` to extract additional information to provide additional context on the podcast episode

We can provide additional context to the user about a certain episode if we are able to identify the guest/key note speaker and add a summary of their background and experience.

We can easily find information about the guest using Wikipedia or Google, but first we need to extract the name of the podcast guest. Since we want to pass the extracted name to a subsequent function, we need to ensure that the output we receive from the API is as structured as possible.

To achieve this, I am going to make use of the `function calling` capability of the OpenAI API.


- Typically the guest is introduced early in a podcast episode, so it's not necessary to use the entire transcript to extract this information.
- We pass in only the first 5000 characters, which saves tokens and lets us use the standard (non-16k) model.
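
A plain character slice can cut the transcript in the middle of a sentence. As a minimal sketch, and only as an optional refinement that the notebook itself does not use, the cut can be moved back to the last sentence boundary. The helper name `truncate_at_sentence` is hypothetical.

```python
def truncate_at_sentence(text, max_chars=5000):
    """Return at most max_chars characters, cut at the last sentence end if one exists."""
    if len(text) <= max_chars:
        return text
    head = text[:max_chars]
    # Look for the last sentence-ending punctuation that is followed by a space.
    cut = max(head.rfind(". "), head.rfind("? "), head.rfind("! "))
    # Fall back to a hard cut when no sentence boundary is found.
    return head if cut == -1 else head[:cut + 1]

print(truncate_at_sentence("One. Two. Three.", max_chars=12))
```

In the notebook this would replace `podcast_transcript[:5000]` with `truncate_at_sentence(podcast_transcript)`.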

In [None]:
request = podcast_transcript[:5000]
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
print ("Number of tokens in input prompt ", len(enc.encode(request)))

Number of tokens in input prompt  1109


In [None]:
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": request}],
    functions=[
    {
        "name": "get_podcast_guest_information",
        "description": "please extract and summarise some information about the guest mentioned in the podcast",
        "parameters": {
            "type": "object",
            "properties": {
                "guest_name": {
                    "type": "string",
                    "description": "please extract the guest's name",
                },
                "unit": {"type": "string"},
            },
            "required": ["guest_name"],
        },
    }
    ],
    function_call={"name": "get_podcast_guest_information"}
    )



We can directly see how the output from the API is formatted by checking the response object `completion`

In [None]:
completion

<OpenAIObject chat.completion id=chatcmpl-7pfzJBK5XokMOfnKHzBfqpUGHGzZL at 0x7afcd08a3e70> JSON: {
  "id": "chatcmpl-7pfzJBK5XokMOfnKHzBfqpUGHGzZL",
  "object": "chat.completion",
  "created": 1692550485,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "get_podcast_guest_information",
          "arguments": "{\n  \"guest_name\": \"Brian Scordato\"\n}"
        }
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1188,
    "completion_tokens": 13,
    "total_tokens": 1201
  }
}

In [None]:
import json

podcast_guest = ""
response_message = completion["choices"][0]["message"]
if response_message.get("function_call"):
  function_name = response_message["function_call"]["name"]
  function_args = json.loads(response_message["function_call"]["arguments"])
  podcast_guest=function_args.get("guest_name")

print ("Podcast Guest is ", podcast_guest)

Podcast Guest is  Brian Scordato
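
The parsing step above assumes that the `arguments` string is always valid JSON. The model can occasionally return malformed arguments, so a more defensive version may be useful. The following is a sketch under that assumption: the helper name `extract_guest_name` is hypothetical, and the sample message mirrors the response printed above.

```python
import json

def extract_guest_name(response_message):
    """Return the guest name from a function-call message, or None."""
    function_call = response_message.get("function_call")
    if not function_call:
        return None
    try:
        arguments = json.loads(function_call["arguments"])
    except (KeyError, TypeError, json.JSONDecodeError):
        # The arguments were missing or were not valid JSON.
        return None
    name = arguments.get("guest_name") if isinstance(arguments, dict) else None
    # Treat a blank or non-string value the same as a missing one.
    return name.strip() if isinstance(name, str) and name.strip() else None

# Message shaped like the response printed above.
sample = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_podcast_guest_information",
        "arguments": "{\n  \"guest_name\": \"Brian Scordato\"\n}",
    },
}
print(extract_guest_name(sample))
```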


In the following step, we install the wikipedia python library and then query Wikipedia to find more information about the podcast guest. We use the extracted information as the input to the call.

In [None]:
!pip install wikipedia

Collecting wikipedia
  Downloading wikipedia-1.4.0.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py) ... done
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11680 sha256=bd522d76c0c656fed67922eaf801fcb4e4d166bfc7bffe216f5958941223ea20
  Stored in directory: /root/.cache/pip/wheels/5e/b6/c5/93f3dec388ae76edc830cb42901bb0232504dfc0df02fc50de
Successfully built wikipedia
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0


In [None]:
import wikipedia
# Use a name other than `input` so the Python built-in is not shadowed
guest_page = wikipedia.page(podcast_guest, auto_suggest=False)

PageError: ignored

In [None]:
podcast_guest_info = guest_page.summary
print (podcast_guest_info)

### Extensions

1. Guest extraction may sometimes be partial or fail altogether. We can still attempt to find more information about the podcast guest by extracting additional details, such as their organization or title.
2. Depending on the podcast and the generated transcript, the extraction may also be incorrect, so we need to include error handling for these conditions.
3. Wikipedia is not necessarily the best resource for pulling information about the podcast guest.

As an optimization, I will try using a Google search API, or the paid GPT-4 now that it can search for real-time data.
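
For the error handling mentioned in the second extension, one possible shape is shown below. It is a sketch, not the notebook's implementation: the helper names `fetch_wikipedia_summary` and `describe_guest` and the fallback messages are assumptions. The two exceptions are the ones the `wikipedia` package raises for a missing page and for an ambiguous title.

```python
def fetch_wikipedia_summary(name):
    """Look up a page summary on Wikipedia; return None if no usable page exists."""
    import wikipedia  # imported lazily so describe_guest has no hard dependency

    try:
        return wikipedia.page(name, auto_suggest=False).summary
    except (wikipedia.exceptions.PageError,
            wikipedia.exceptions.DisambiguationError):
        return None

def describe_guest(name, fetch_summary=fetch_wikipedia_summary):
    """Return a summary for the guest, or a fallback message when lookup fails."""
    if not name:
        return "No guest was identified for this episode."
    summary = fetch_summary(name)
    if summary is None:
        return f"No background information found for {name}."
    return summary
```

Calling `describe_guest(podcast_guest)` then returns a readable message instead of stopping the notebook with a `PageError`. Passing a different `fetch_summary` callable makes it possible to swap Wikipedia for another source later.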

#### Extension Solution 1

Let's consider the case where the podcast guest's name alone is not enough, or has not been extracted well or completely. One way to resolve this would be to extract additional information:

- Podcast Guest Organization
- Podcast Guest Title

We can also provide more context by including a larger portion of the transcript (the first 10k characters), since the guest's organization, title and similar details are likely to be covered during their introduction.



In [None]:
request = podcast_transcript[:10000]
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
print ("Number of tokens in input prompt ", len(enc.encode(request)))

Number of tokens in input prompt  2153


In [None]:
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": request}],
    functions=[
    {
        "name": "get_podcast_guest_information",
        "description": "Get information on the podcast guest using their full name and the name of the organization they are part of to search for them on Wikipedia or Google",
        "parameters": {
            "type": "object",
            "properties": {
                "guest_name": {
                    "type": "string",
                    "description": "The full name of the guest who is speaking in the podcast",
                },
                "guest_organization": {
                    "type": "string",
                    "description": "The full name of the organization that the podcast guest belongs to or runs",
                },
                "guest_title": {
                    "type": "string",
                    "description": "The title, designation or role of the podcast guest in their organization",
                },
            },
            "required": ["guest_name"],
        },
    }
    ],
    function_call={"name": "get_podcast_guest_information"}
)

In [None]:
import json

podcast_guest = ""
podcast_guest_org = ""
podcast_guest_title = ""
response_message = completion["choices"][0]["message"]
if response_message.get("function_call"):
  function_name = response_message["function_call"]["name"]
  function_args = json.loads(response_message["function_call"]["arguments"])
  podcast_guest=function_args.get("guest_name")
  podcast_guest_org=function_args.get("guest_organization")
  podcast_guest_title=function_args.get("guest_title")

In [None]:
print (podcast_guest)
print (podcast_guest_org)
print (podcast_guest_title)

Brian Scordato
None
None


In [None]:
if podcast_guest_org is None:
  podcast_guest_org = ""
if podcast_guest_title is None:
  podcast_guest_title = ""

In [None]:
# Use a name other than `input` so the Python built-in is not shadowed
guest_page = wikipedia.page(podcast_guest + " " + podcast_guest_org + " " + podcast_guest_title, auto_suggest=True)

In [None]:
guest_page.summary

'Rodeo: Four Dance Episodes (also stylized as Rōdē,ō: Four Dance Episodes) is a one-act ballet choreographed by Justin Peck to "Four Dance Episodes" from Copland\'s Rodeo. The ballet premiered on February 4, 2015, at the David H. Koch Theater, danced by the New York City Ballet.'
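
The summary returned above describes a ballet, not the podcast guest, so a result obtained with `auto_suggest=True` is worth validating before it is shown to a user. The sketch below illustrates two small safeguards under stated assumptions: `build_search_query` and `page_mentions_guest` are hypothetical helpers, the name check is only a rough heuristic, and the sample text is shortened from the output above.

```python
def build_search_query(name, organization="", title=""):
    """Join only the non-empty fields so the query has no stray spaces."""
    parts = [name, organization, title]
    return " ".join(p.strip() for p in parts if p and p.strip())

def page_mentions_guest(page_text, guest_name):
    """Heuristic check: every word of the guest's name appears in the page text."""
    text = page_text.lower()
    words = guest_name.lower().split()
    return bool(words) and all(word in text for word in words)

# Shortened from the summary printed above.
summary = ('Rodeo: Four Dance Episodes is a one-act ballet choreographed by '
           'Justin Peck to "Four Dance Episodes" from Copland\'s Rodeo.')
print(build_search_query("Brian Scordato", None, None))
print(page_mentions_guest(summary, "Brian Scordato"))
```

When `page_mentions_guest` returns `False`, the page can be discarded and the guest reported without background information.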

## Step 5: Extract the highlights of the podcast

We've provided the user with a summary of the podcast and more information about the guest on this episode. What if we could also give them a peek into the conversation?

In this step, I try to extract some key moments in the podcast. These are typically interesting insights from the guest or critical questions that the host might have put forward.

In [None]:
instructPrompt = """
please extract the highlights and important moments in the podcast episode in the format highlight 1: ... , highlight 2: ..., etc
"""

request = instructPrompt + podcast_transcript

In [None]:

chatOutput = openai.ChatCompletion.create(model="gpt-3.5-turbo-16k",
                                            messages=[{"role": "system", "content": "You are a helpful assistant."},
                                                      {"role": "user", "content": request}
                                                      ]
                                            )


In [None]:
chatOutput.choices[0].message.content

"In this podcast episode, the host discusses the importance of finding a differentiator for your business and provides a framework to help you discover yours. Here are the highlights:\n\n1. Differentiators are about speed: The purpose of a differentiator is to help you make decisions fast. It serves as the source of truth and guides your decision-making process.\n\n2. Differentiators should be aggressively, diametrically opposed to the competition: Your differentiator needs to be significantly different from what your competitors offer. This allows you to use their strengths against them and create a moat.\n\n3. People will overpay for your differentiator: A great test for a viable differentiator is whether people are willing to pay a premium for it. If they see the value in your unique offering, they will be willing to pay more.\n\n4. Differentiators often come after a shift: Look for opportunities that have emerged due to changes in customer behavior, technology, or other factors. Wh

In [None]:
podcastHighlights = chatOutput.choices[0].message.content
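
The prompt asks for the format `highlight 1: ...`, but the response above uses a numbered list (`1. ...`) instead. If the highlights are needed as separate items, a tolerant parser can accept both shapes. This is a minimal sketch: the pattern and the helper name `parse_highlights` are assumptions, and the sample text is a shortened, hypothetical response.

```python
import re

# Matches lines such as "1. text", "2) text" or "highlight 3: text".
HIGHLIGHT_PATTERN = re.compile(
    r"^[ \t]*(?:highlight[ \t]*)?(\d+)[ \t]*[:.)][ \t]*(.+)$",
    re.IGNORECASE | re.MULTILINE,
)

def parse_highlights(text):
    """Return the highlight bodies in the order they appear in the text."""
    return [body.strip() for _, body in HIGHLIGHT_PATTERN.findall(text)]

sample = (
    "Here are the highlights:\n\n"
    "1. Differentiators are about speed: they help you decide fast.\n\n"
    "2. People will overpay for your differentiator.\n"
    "highlight 3: Stick with it."
)
for item in parse_highlights(sample):
    print(item)
```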

### Extensions

Additional pieces of information that one might choose to extract include:
- the key topics discussed in the episode
- the timestamp of each highlight, so that a user can navigate directly to the point in the podcast where that discussion happens
- chapters of the podcast, along with their titles


# Using RSS feed to get the podcast details

In this part, we look at how to build the back-end and front-end services that deliver these podcast details.

Here I will combine all the information extraction steps done previously into an on-demand cloud function. The goal is to have this as our back-end service: it processes an RSS feed provided by the user, performs the necessary steps, and returns the final output with all the extracted information.


First, let's encapsulate the podcast retrieval and transcription steps (Steps 1 and 2 of the previous section) into a function and run it locally. Once this is done we will make the necessary changes to convert this to a cloud function.

In [None]:
!pip install feedparser
!pip install git+https://github.com/openai/whisper.git  -q
!pip install requests

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [None]:
def get_transcribe_podcast(rss_url, local_path):
  print ("Starting Podcast Transcription Function")
  print ("Feed URL: ", rss_url)
  print ("Local Path:", local_path)

  # Read from the RSS Feed URL
  import feedparser
  intelligence_feed = feedparser.parse(rss_url)
  for item in intelligence_feed.entries[0].links:
    if (item['type'] == 'audio/mpeg'):
      episode_url = item.href
  episode_name = "podcast_episode.mp3"
  print ("RSS URL read and episode URL: ", episode_url)

  # Download the podcast episode by parsing the RSS feed
  from pathlib import Path
  p = Path(local_path)
  p.mkdir(exist_ok=True)

  print ("Downloading the podcast episode")
  import requests
  with requests.get(episode_url, stream=True) as r:
    r.raise_for_status()
    episode_path = p.joinpath(episode_name)
    with open(episode_path, 'wb') as f:
      for chunk in r.iter_content(chunk_size=8192):
        f.write(chunk)

  print ("Podcast Episode downloaded")

  # Load the Whisper model
  import os
  import whisper
  print ("Download and Load the Whisper model")
  model = whisper.load_model("medium")
  print (model.device)

  # Perform the transcription
  print ("Starting podcast transcription")
  # Use the path the file was written to, so a missing trailing slash in local_path cannot break it
  result = model.transcribe(str(episode_path))

  # Return the transcribed text
  print ("Podcast transcription completed, returning results...")
  return result

In [None]:
output = get_transcribe_podcast("https://anchor.fm/s/e24424dc/podcast/rss", "/content/podcast/")

Starting Podcast Transcription Function
Feed URL:  https://anchor.fm/s/e24424dc/podcast/rss
Local Path: /content/podcast/
RSS URL read and episode URL:  https://anchor.fm/s/e24424dc/podcast/play/74477327/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2023-7-10%2F99fa81e5-cb3f-3fc7-ef2d-284c95ad079c.mp3
Downloading the podcast episode
Podcast Episode downloaded
Download and Load the Whisper model


100%|█████████████████████████████████████| 1.42G/1.42G [00:20<00:00, 73.7MiB/s]


cuda:0
Starting podcast transcription
Podcast transcription completed, returning results...
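
The function above assumes that the latest entry always has an `audio/mpeg` link. If it does not, `episode_url` is never assigned and the function fails with a `NameError`. A more defensive lookup could look like the sketch below. The helper name `find_audio_url` is hypothetical, and accepting any `audio/` type rather than only `audio/mpeg` is an assumption. It also returns the first match, whereas the loop above keeps the last one.

```python
def find_audio_url(links):
    """Return the href of the first audio link, or None if there is none."""
    for link in links:
        # .get avoids a KeyError when a link has no "type" field.
        link_type = link.get("type") or ""
        if link_type.startswith("audio/") and link.get("href"):
            return link["href"]
    return None

# Hypothetical links in the shape feedparser exposes under entry.links.
links = [
    {"type": "text/html", "href": "https://example.com/episode"},
    {"type": "audio/mpeg", "href": "https://example.com/episode.mp3"},
]
print(find_audio_url(links))
```

The caller can then raise a clear error, or skip the episode, when `None` is returned.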


Let's check the transcription to make sure that our function worked.

In [None]:
output['text'][:1000]

" Hi guys, thank you for tuning in to the Really Good Podcast. My name is Bobbi Althoff and I'm here today with my guest. Can you introduce yourself please? My name is Tyga. That's your government name? My government name? No. What's your real name? Michael. You know what it means? Closest to God. It means what? Closest to God. Michael. Michael does? Yeah. And what does Tyga mean? Thank you God always. That's why I spell Tyga. That makes sense. When did you come up with that name? When I was like maybe 13, 14. How old are you now? 33. So a few years ago. A few years ago. Yeah. 13, 15, I don't know. My math is kind of off. So you're a singer? Rapper, artist. A rapper? Don't let the animal touch your drink. What is he eating? Corn. Do you like corn? It's okay. It does nothing for the body. That's true. Do you think it does anything for his body? I mean the way he's killing it, it's just so interesting. It's a porcupine sitting here. I think sex is so awkward. What are we doing here today

## Step 1 - Create a cloud transcription function

I will now change the function so that it can run in the cloud. To achieve this I have used [Modal Labs](https://modal.com/), a service that lets you run any Python function on demand in the cloud. The same function that runs locally can run in the cloud with almost zero effort. Additionally, Modal supports GPUs, which is important here given the transcription step. Finally, you only pay for compute while your function is actually running.

Of course, this is not the only way to run our function in the cloud. A more traditional approach would be to encapsulate the code in a Docker container and use a cloud provider such as Azure, GCP or AWS to run it. I chose to showcase this method because I found it easy and approachable for anyone without in-depth knowledge of containers, Kubernetes, cloud infrastructure and so on.

Next, install the `modal` package with `pip install modal`.

In [None]:
!pip install modal

Collecting modal
  Downloading modal-0.51.3085-py3-none-any.whl (1.2 kB)
Collecting modal-client==0.51.3085 (from modal)
  Downloading modal_client-0.51.3085-py3-none-any.whl (284 kB)
Collecting aiostream (from modal-client==0.51.3085->modal)
  Downloading aiostream-0.4.5-py3-none-any.whl (35 kB)
Collecting asgiref (from modal-client==0.51.3085->modal)
  Downloading asgiref-3.7.2-py3-none-any.whl (24 kB)
Collecting fastapi (from modal-client==0.51.3085->modal)
  Downloading fastapi-0.101.1-py3-none-any.whl (65 kB)
Collecting grpclib==0.4.3 (from modal-client==0.51.3085->modal)
  Downloading grpclib-0.4.3.tar.gz (62 kB)

In [None]:
!modal token new --source corise > authenticationURL.txt

In [None]:
import getpass
import subprocess

def set_modal_token():
  token_id = getpass.getpass('Please enter your Modal token ID in the cell: ')
  token_secret = getpass.getpass('Please enter your Modal token secret in the cell:  ')

  # Pass the arguments as a list: no shell is needed, and the notebook-only "!" prefix is dropped
  subprocess.run(["modal", "token", "set", "--token-id", token_id, "--token-secret", token_secret], check=True)

In [None]:
set_modal_token()

Please enter your Modal token ID in the cell: ··········
Please enter your Modal token secret in the cell:  ··········


I am saving the function to a Python file so that it can be run from the command line.

We add the `%%writefile` line before the function definition and specify the filename where we want to save the Python script.

In [None]:
%%writefile /content/podcast/podcast_backend.py
import modal

def download_whisper():
  # Load the Whisper model
  import os
  import whisper
  print ("Download the Whisper model")

  # Perform download only once and save to Container storage
  whisper._download(whisper._MODELS["medium"], '/content/podcast/', False)


stub = modal.Stub("corise-podcast-project")
corise_image = modal.Image.debian_slim().pip_install("feedparser",
                                                     "https://github.com/openai/whisper/archive/9f70a352f9f8630ab3aa0d06af5cb9532bd8c21d.tar.gz",
                                                     "requests",
                                                     "ffmpeg").apt_install("ffmpeg").run_function(download_whisper)

@stub.function(image=corise_image, gpu="any")
def get_transcribe_podcast(rss_url, local_path):
  print ("Starting Podcast Transcription Function")
  print ("Feed URL: ", rss_url)
  print ("Local Path:", local_path)

  # Read from the RSS Feed URL
  import feedparser
  intelligence_feed = feedparser.parse(rss_url)
  for item in intelligence_feed.entries[0].links:
    if (item['type'] == 'audio/mpeg'):
      episode_url = item.href
  episode_name = "podcast_episode.mp3"
  print ("RSS URL read and episode URL: ", episode_url)

  # Download the podcast episode by parsing the RSS feed
  from pathlib import Path
  p = Path(local_path)
  p.mkdir(exist_ok=True)

  print ("Downloading the podcast episode")
  import requests
  with requests.get(episode_url, stream=True) as r:
    r.raise_for_status()
    episode_path = p.joinpath(episode_name)
    with open(episode_path, 'wb') as f:
      for chunk in r.iter_content(chunk_size=8192):
        f.write(chunk)

  print ("Podcast Episode downloaded")

  # Load the Whisper model
  import os
  import whisper

  # Load model from saved location
  print ("Load the Whisper model")
  model = whisper.load_model('medium', device='cuda', download_root='/content/podcast/')

  # Perform the transcription
  print ("Starting podcast transcription")
  # Use the path the file was written to, so a missing trailing slash in local_path cannot break it
  result = model.transcribe(str(episode_path))

  # Return the transcribed text
  print ("Podcast transcription completed, returning results...")
  return result

@stub.local_entrypoint()
def main(url, path):
  output = get_transcribe_podcast.call(url, path)
  print (output['text'])

Overwriting /content/podcast/podcast_backend.py


We invoke the function from the command line, which starts the remote execution in the cloud environment. Note that we have requested a GPU, since it speeds up the transcription.

In [None]:
!modal run /content/podcast/podcast_backend.py --url https://access.acast.com/rss/d556eb54-6160-4c85-95f4-47d9f5216c49 --path /content/podcast/

Streaming output truncated to the last 5000 lines.
Creating objects...
├── Creating get_transcribe_podcast...
Preparing to unpack .../025-libdav1d4_0.7.1-3_amd64.deb ...
Unpacking libdav1d4:amd64 (0.7.1-3) ...
Selecting previously unselected package libglib2.0-0:amd64.
Preparing to unpack .../026-libglib2.0-0_2.66.8-1_amd64.deb ...

## Step 2 - Create a cloud information extraction function

In the previous step we encapsulated only the transcription function. In this step we create functions for all the information extraction steps and deploy our end-to-end back-end pipeline.

In [None]:
%%writefile /content/podcast/podcast_backend.py
import modal

def download_whisper():
  # Load the Whisper model
  import os
  import whisper
  print ("Download the Whisper model")

  # Perform download only once and save to Container storage
  whisper._download(whisper._MODELS["medium"], '/content/podcast/', False)


stub = modal.Stub("corise-podcast-project")
corise_image = modal.Image.debian_slim().pip_install("feedparser",
                                                     "https://github.com/openai/whisper/archive/9f70a352f9f8630ab3aa0d06af5cb9532bd8c21d.tar.gz",
                                                     "requests",
                                                     "ffmpeg",
                                                     "openai",
                                                     "tiktoken",
                                                     "wikipedia",
                                                     "ffmpeg-python").apt_install("ffmpeg").run_function(download_whisper)

@stub.function(image=corise_image, gpu="any", timeout=600)
def get_transcribe_podcast(rss_url, local_path):
  print ("Starting Podcast Transcription Function")
  print ("Feed URL: ", rss_url)
  print ("Local Path:", local_path)

  # Read from the RSS Feed URL
  import feedparser
  intelligence_feed = feedparser.parse(rss_url)
  podcast_title = intelligence_feed['feed']['title']
  episode_title = intelligence_feed.entries[0]['title']
  episode_image = intelligence_feed['feed']['image'].href
  for item in intelligence_feed.entries[0].links:
    if (item['type'] == 'audio/mpeg'):
      episode_url = item.href
  episode_name = "podcast_episode.mp3"
  print ("RSS URL read and episode URL: ", episode_url)

  # Download the podcast episode by parsing the RSS feed
  from pathlib import Path
  p = Path(local_path)
  p.mkdir(exist_ok=True)

  print ("Downloading the podcast episode")
  import requests
  with requests.get(episode_url, stream=True) as r:
    r.raise_for_status()
    episode_path = p.joinpath(episode_name)
    with open(episode_path, 'wb') as f:
      for chunk in r.iter_content(chunk_size=8192):
        f.write(chunk)

  print ("Podcast Episode downloaded")

  # Load the Whisper model
  import os
  import whisper

  # Load model from saved location
  print ("Load the Whisper model")
  model = whisper.load_model('medium', device='cuda', download_root='/content/podcast/')

  # Perform the transcription
  print ("Starting podcast transcription")
  # Use the path the file was written to, so a missing trailing slash in local_path cannot break it
  result = model.transcribe(str(episode_path))

  # Return the transcribed text
  print ("Podcast transcription completed, returning results...")
  output = {}
  output['podcast_title'] = podcast_title
  output['episode_title'] = episode_title
  output['episode_image'] = episode_image
  output['episode_transcript'] = result['text']
  return output

@stub.function(image=corise_image, secret=modal.Secret.from_name("my-openai-secret"))
def get_podcast_summary(podcast_transcript):
  import openai
  instructPrompt = """
  please give a summary of the following for a newsletter. make it sound interesting and catchy.
  """
  request = instructPrompt + podcast_transcript
  chatOutput = openai.ChatCompletion.create(model="gpt-3.5-turbo-16k",
                                              messages=[{"role": "system", "content": "You are a helpful assistant."},
                                                        {"role": "user", "content": request}
                                                        ]
                                              )
  podcastSummary = chatOutput.choices[0].message.content
  return podcastSummary



@stub.function(image=corise_image, secret=modal.Secret.from_name("my-openai-secret"))
def get_podcast_guest(podcast_transcript):
  import openai
  import wikipedia
  import json
  request = podcast_transcript[:5000]
  completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": request}],
    functions=[
    {
        "name": "get_podcast_guest_information",
        "description": "please extract and summarise some information about the guest mentioned in the podcast",
        "parameters": {
            "type": "object",
            "properties": {
                "guest_name": {
                    "type": "string",
                    "description": "please extract the guest's name",
                },
                "unit": {"type": "string"},
            },
            "required": ["guest_name"],
        },
    }
    ],
    function_call={"name": "get_podcast_guest_information"}
    )

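  # Default to an empty name so a missing function call or argument is handled uniformly
  podcast_guest = ""
  response_message = completion["choices"][0]["message"]
  if response_message.get("function_call"):
    function_args = json.loads(response_message["function_call"]["arguments"])
    podcast_guest = function_args.get("guest_name") or ""

  if not podcast_guest:
    return {"name": "No guest", "summary": ""}

  # The front-end reads podcast_guest['name'] and podcast_guest['summary'], so return a dict.
  # Using Wikipedia for the summary is an assumption, based on the otherwise unused import above.
  try:
    guest_summary = wikipedia.summary(podcast_guest, sentences=3)
  except wikipedia.exceptions.WikipediaException:
    guest_summary = ""

  return {"name": podcast_guest, "summary": guest_summary}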

@stub.function(image=corise_image, secret=modal.Secret.from_name("my-openai-secret"))
def get_podcast_highlights(podcast_transcript):
  import openai
  instructPrompt = """
  Please extract the highlights and important moments in the podcast episode in the format highlight 1: ..., highlight 2: ..., etc.
  """

  request = instructPrompt + podcast_transcript
  chatOutput = openai.ChatCompletion.create(model="gpt-3.5-turbo-16k",
                                              messages=[{"role": "system", "content": "You are a helpful assistant."},
                                                        {"role": "user", "content": request}
                                                        ]
                                              )
  podcastHighlights = chatOutput.choices[0].message.content
  return podcastHighlights

@stub.function(image=corise_image, secret=modal.Secret.from_name("my-openai-secret"), timeout=1200)
def process_podcast(url, path):
  output = {}
  # `.call(...)` is deprecated in Modal and has been renamed to `.remote(...)`
  podcast_details = get_transcribe_podcast.remote(url, path)
  podcast_summary = get_podcast_summary.remote(podcast_details['episode_transcript'])
  podcast_guest = get_podcast_guest.remote(podcast_details['episode_transcript'])
  podcast_highlights = get_podcast_highlights.remote(podcast_details['episode_transcript'])
  output['podcast_details'] = podcast_details
  output['podcast_summary'] = podcast_summary
  output['podcast_guest'] = podcast_guest
  output['podcast_highlights'] = podcast_highlights
  return output

@stub.local_entrypoint()
def test_method(url, path):
  # `.call(...)` is deprecated in Modal and has been renamed to `.remote(...)`
  podcast_details = get_transcribe_podcast.remote(url, path)
  print("Podcast Summary: ", get_podcast_summary.remote(podcast_details['episode_transcript']))
  print("Podcast Guest Information: ", get_podcast_guest.remote(podcast_details['episode_transcript']))
  print("Podcast Highlights: ", get_podcast_highlights.remote(podcast_details['episode_transcript']))

Overwriting /content/podcast/podcast_backend.py


Now that everything is in place, let's run the integrated pipeline through the `local_entrypoint` to check that the entire information extraction works end to end.

In [None]:
!modal run /content/podcast/podcast_backend.py --url https://anchor.fm/s/e24424dc/podcast/rss --path /content/podcast/

Streaming output truncated to the last 5000 lines.
Creating objects...
├── Creating get_transcribe_podcast...
Get:22 http://deb.debian.org/debian bullseye/main amd64 libxcb-shm0 amd64 1.14-3 [101 kB]
Get:23 http://deb.debian.org/debian bullseye/main amd64 libxrender1 amd64 1:0.9.10-1 [33.0 kB]
Get:24 http://deb.debian.org/debian bullseye/main amd64 libcairo2 amd64 1.16.0-5 [694 kB]
Get:25 http://deb.debian.org/debian bullseye/main amd64 

In [None]:
!modal deploy /content/podcast/podcast_backend.py

Creating objects...
├── Creating get_transcribe_podcast...
└── Creating mount /content/podcast/podcast_backend.py: Uploaded 0/0 inspected
├── 🔨 Created mount /content/podcast/podcast_backend.py
├── Creating download_whisper...
└──

In [None]:
# Call the deployed function from another Python session.
# `f.call(...)` is deprecated in Modal and has been renamed to `f.remote(...)`.
import modal
f = modal.Function.lookup("corise-podcast-project", "process_podcast")
output = f.remote('https://feeds.megaphone.fm/MLN2155636147', '/content/podcast/')
# output = f.remote('https://feeds.libsyn.com/478017/rss', '/content/podcast/')


In [None]:
import json
with open("/content/podcast/podcast-3.json", "w") as outfile:
  json.dump(output, outfile)

### Extension

Transcription is the slowest step in the pipeline, and there are several ways to speed it up:

- There is a very fast [implementation](https://github.com/sanchit-gandhi/whisper-jax) of Whisper using JAX, which could be a drop-in replacement.
- An alternative approach is to split the audio into chunks at detected silences and then transcribe the chunks in parallel using multiple Modal GPU containers. Modal provides a very nice [example](https://github.com/modal-labs/modal-examples/tree/main/06_gpu_and_ml/openai_whisper/pod_transcriber) of how to achieve this.
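
The chunking idea in the second bullet can be sketched in a few lines. The function below is a minimal illustration and not the code used in the Modal example: it assumes the silence intervals (in seconds) have already been detected, for instance with a tool such as ffmpeg's `silencedetect` filter, and it only plans where to cut. The names `plan_chunks` and `min_chunk` are hypothetical.

```python
from typing import List, Tuple


def plan_chunks(
    silences: List[Tuple[float, float]],
    duration: float,
    min_chunk: float = 30.0,
) -> List[Tuple[float, float]]:
    """Cut [0, duration] at silence midpoints so every chunk lasts at least min_chunk seconds."""
    chunks = []
    start = 0.0
    for silence_start, silence_end in sorted(silences):
        cut = (silence_start + silence_end) / 2.0
        # Cut only if the current chunk is long enough and enough audio remains after the cut.
        if cut - start >= min_chunk and duration - cut >= min_chunk:
            chunks.append((start, cut))
            start = cut
    # Whatever is left becomes the final chunk.
    chunks.append((start, duration))
    return chunks


# Hypothetical silence intervals for a 100-second clip.
example_silences = [(10.0, 11.0), (35.0, 37.0), (50.0, 51.0), (80.0, 82.0)]
print(plan_chunks(example_silences, duration=100.0, min_chunk=30.0))
# prints [(0.0, 36.0), (36.0, 100.0)]
```

Each `(start, end)` pair could then be transcribed in its own container, with the partial transcripts concatenated in order afterwards.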

# Part 3 - Deploying the front-end application

In the final part of this project, I create a front-end for the podcast summarizer application.

To keep things simple, I chose to build the front-end as a Streamlit application.

In [None]:
%%writefile /content/podcast/podcast_frontend.py
import modal
import streamlit as st
import json
import os

def main():
    st.title("Newsletter Dashboard")

    available_podcast_info = create_dict_from_json_files('.')

    # Left section - Input fields
    st.sidebar.header("Podcast RSS Feeds")

    # Dropdown box
    st.sidebar.subheader("Available Podcasts Feeds")
    selected_podcast = st.sidebar.selectbox("Select Podcast", options=available_podcast_info.keys())

    if selected_podcast:

        podcast_info = available_podcast_info[selected_podcast]

        # Right section - Newsletter content
        st.header("Newsletter Content")

        # Title and Image side by side
        col1, col2 = st.columns([3, 1])

        with col1:
            # Display the podcast title
            st.subheader("Episode Title")
            st.write(podcast_info['podcast_details']['episode_title'])

        with col2:
            # Display the podcast cover image
            st.image(podcast_info['podcast_details']['episode_image'], caption="Podcast Cover", width=100, use_column_width=True)

        # Display the podcast guest
        st.subheader("Podcast Guest")
        st.write(podcast_info['podcast_guest']['name'])

        # Button to view guest details
        if st.button("View Guest Details"):
            st.subheader("Podcast Guest Details")
            st.write(podcast_info["podcast_guest"]['summary'])

        # Display the podcast summary with "Read More" button
        st.subheader("Podcast Episode Summary")
        if st.button("Read More"):
            st.write(podcast_info['podcast_summary'])

        # Button to view key highlights
        if st.button("View Key Highlights"):
            st.subheader("Key Highlights")
            key_highlights = podcast_info['podcast_highlights']
            for moment in key_highlights.split('\n'):
                st.markdown(moment)

    # User Input box
    st.sidebar.subheader("Add and Process New Podcast Feed")
    url = st.sidebar.text_input("Link to RSS Feed")

    process_button = st.sidebar.button("Process Podcast Feed")
    st.sidebar.markdown("**Note**: Podcast processing can take up to 5 minutes, please be patient.")

    if process_button:

        # Call the function to process the URLs and retrieve podcast guest information
        podcast_info = process_podcast_info(url)

        # Right section - Newsletter content
        st.header("Newsletter Content")

        # Display the podcast title
        st.subheader("Episode Title")
        st.write(podcast_info['podcast_details']['episode_title'])

        # Display the podcast summary and the cover image in a side-by-side layout
        col1, col2 = st.columns([7, 3])

        with col1:
            # Display the podcast episode summary
            st.subheader("Podcast Episode Summary")
            st.write(podcast_info['podcast_summary'])

        with col2:
            st.image(podcast_info['podcast_details']['episode_image'], caption="Podcast Cover", width=300, use_column_width=True)

        # Display the podcast guest and their details in a side-by-side layout
        col3, col4 = st.columns([3, 7])

        with col3:
            st.subheader("Podcast Guest")
            st.write(podcast_info['podcast_guest']['name'])

        with col4:
            st.subheader("Podcast Guest Details")
            st.write(podcast_info["podcast_guest"]['summary'])

        # Display the five key moments
        st.subheader("Key Moments")
        key_moments = podcast_info['podcast_highlights']
        for moment in key_moments.split('\n'):
            st.markdown(
                f"<p style='margin-bottom: 5px;'>{moment}</p>", unsafe_allow_html=True)

def create_dict_from_json_files(folder_path):
    json_files = [f for f in os.listdir(folder_path) if f.endswith('.json')]
    data_dict = {}

    for file_name in json_files:
        file_path = os.path.join(folder_path, file_name)
        with open(file_path, 'r') as file:
            podcast_info = json.load(file)
            podcast_name = podcast_info['podcast_details']['podcast_title']
            # Process the file data as needed
            data_dict[podcast_name] = podcast_info

    return data_dict

def process_podcast_info(url):
    f = modal.Function.lookup("corise-podcast-project", "process_podcast")
    # `f.call(...)` is deprecated in Modal and has been renamed to `f.remote(...)`
    output = f.remote(url, '/content/podcast/')
    return output

if __name__ == '__main__':
    main()


Writing /content/podcast/podcast_frontend.py


In [None]:
%%writefile /content/podcast/podcast_frontend_correct.py
import streamlit as st
import modal
import json
import os



def main():
    st.title("Newsletter Dashboard")

    available_podcast_info = create_dict_from_json_files('.')

    # Left section - Input fields
    st.sidebar.header("Podcast RSS Feeds")

    # Dropdown box
    st.sidebar.subheader("Available Podcasts Feeds")
    selected_podcast = st.sidebar.selectbox("Select Podcast", options=available_podcast_info.keys())

    if selected_podcast:

        podcast_info = available_podcast_info[selected_podcast]

        # Right section - Newsletter content
        st.header("Newsletter Content")

        # Display the podcast title
        st.subheader("Episode Title")
        st.write(podcast_info['podcast_details']['episode_title'])

        # Display the podcast summary and the cover image in a side-by-side layout
        col1, col2 = st.columns([7, 3])

        with col1:
            # Display the podcast episode summary
            st.subheader("Podcast Episode Summary")
            st.write(podcast_info['podcast_summary'])

        with col2:
            st.image(podcast_info['podcast_details']['episode_image'], caption="Podcast Cover", width=300, use_column_width=True)

        # Display the podcast guest and their details in a side-by-side layout
        col3, col4 = st.columns([3, 7])

        with col3:
            st.subheader("Podcast Guest")
            st.write(podcast_info['podcast_guest']['name'])

        with col4:
            st.subheader("Podcast Guest Details")
            st.write(podcast_info["podcast_guest"]['summary'])

        # Display the five key moments
        st.subheader("Key Moments")
        key_moments = podcast_info['podcast_highlights']
        for moment in key_moments.split('\n'):
            st.markdown(
                f"<p style='margin-bottom: 5px;'>{moment}</p>", unsafe_allow_html=True)

    # User Input box
    st.sidebar.subheader("Add and Process New Podcast Feed")
    url = st.sidebar.text_input("Link to RSS Feed")

    process_button = st.sidebar.button("Process Podcast Feed")
    st.sidebar.markdown("**Note**: Podcast processing can take up to 5 minutes, please be patient.")

    if process_button:

        # Call the function to process the URLs and retrieve podcast guest information
        podcast_info = process_podcast_info(url)

        # Right section - Newsletter content
        st.header("Newsletter Content")

        # Display the podcast title
        st.subheader("Episode Title")
        st.write(podcast_info['podcast_details']['episode_title'])

        # Display the podcast summary and the cover image in a side-by-side layout
        col1, col2 = st.columns([7, 3])

        with col1:
            # Display the podcast episode summary
            st.subheader("Podcast Episode Summary")
            st.write(podcast_info['podcast_summary'])

        with col2:
            st.image(podcast_info['podcast_details']['episode_image'], caption="Podcast Cover", width=300, use_column_width=True)

        # Display the podcast guest and their details in a side-by-side layout
        col3, col4 = st.columns([3, 7])

        with col3:
            st.subheader("Podcast Guest")
            st.write(podcast_info['podcast_guest']['name'])

        with col4:
            st.subheader("Podcast Guest Details")
            st.write(podcast_info["podcast_guest"]['summary'])

        # Display the five key moments
        st.subheader("Key Moments")
        key_moments = podcast_info['podcast_highlights']
        for moment in key_moments.split('\n'):
            st.markdown(
                f"<p style='margin-bottom: 5px;'>{moment}</p>", unsafe_allow_html=True)

def create_dict_from_json_files(folder_path):
    json_files = [f for f in os.listdir(folder_path) if f.endswith('.json')]
    data_dict = {}

    for file_name in json_files:
        file_path = os.path.join(folder_path, file_name)
        with open(file_path, 'r') as file:
            podcast_info = json.load(file)
            podcast_name = podcast_info['podcast_details']['podcast_title']
            # Process the file data as needed
            data_dict[podcast_name] = podcast_info

    return data_dict

def process_podcast_info(url):
    f = modal.Function.lookup("corise-podcast-project", "process_podcast")
    # `f.call(...)` is deprecated in Modal and has been renamed to `f.remote(...)`
    output = f.remote(url, '/content/podcast/')
    return output

if __name__ == '__main__':
    main()

Writing /content/podcast/podcast_frontend_correct.py


In [None]:
from google.colab import files

# Download the file locally
files.download('/content/podcast/podcast_frontend_correct.py')

In [None]:
%%writefile /content/podcast/requirements.txt
streamlit
modal

Writing /content/podcast/requirements.txt


In [None]:
from google.colab import files

# Download the file locally
files.download('/content/podcast/requirements.txt')

Finally, I want to pre-populate the Streamlit app with some pre-processed podcasts.

In [None]:
from google.colab import files

# Download the file locally
files.download('/content/podcast/podcast-1.json')
files.download('/content/podcast/podcast-2.json')
files.download('/content/podcast/podcast-3.json')

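
Because the app loads every `.json` file it finds and indexes into it directly, a file with a missing or differently shaped field would only surface as a `KeyError` or `TypeError` when the page renders. A small check like the following sketch could be run before uploading the files. The list of expected keys is taken from the lookups in `podcast_frontend.py`; the helper names `EXPECTED_PATHS`, `missing_paths`, and `check_file` are hypothetical.

```python
import json

# Keys the front-end reads; nested keys are written as tuples.
EXPECTED_PATHS = [
    ("podcast_details", "podcast_title"),
    ("podcast_details", "episode_title"),
    ("podcast_details", "episode_image"),
    ("podcast_summary",),
    ("podcast_guest", "name"),
    ("podcast_guest", "summary"),
    ("podcast_highlights",),
]


def missing_paths(podcast_info):
    """Return the dotted key paths that the front-end reads but podcast_info lacks."""
    missing = []
    for path in EXPECTED_PATHS:
        node = podcast_info
        for key in path:
            # A non-dict node (for example a plain string) cannot hold the nested key.
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


def check_file(file_path):
    """Load one JSON file and report its missing key paths."""
    with open(file_path, "r") as f:
        return missing_paths(json.load(f))
```

For example, `check_file('/content/podcast/podcast-3.json')` returns an empty list when the file has every field the dashboard needs.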