# Working with RSS Feeds Lab

Complete the following set of exercises to solidify your knowledge of parsing RSS feeds and extracting information from them.

In [29]:
%pip install feedparser
import feedparser
import pandas as pd

Note: you may need to restart the kernel to use updated packages.


### 1. Use feedparser to parse the following RSS feed URL.

In [3]:
url = 'https://rss.art19.com/apology-line'

In [7]:
apol = feedparser.parse(url)

### 2. Obtain a list of components (keys) that are available for this feed.

In [8]:
apol.keys()

dict_keys(['bozo', 'entries', 'feed', 'headers', 'updated', 'updated_parsed', 'href', 'status', 'encoding', 'version', 'namespaces'])

### 3. Obtain a list of components (keys) that are available for the *feed* component of this RSS feed.

In [9]:
apol['feed'].keys()

dict_keys(['title', 'title_detail', 'subtitle', 'subtitle_detail', 'authors', 'author', 'author_detail', 'rights', 'rights_detail', 'generator_detail', 'generator', 'links', 'link', 'publisher_detail', 'summary', 'summary_detail', 'language', 'itunes_explicit', 'tags', 'itunes_type', 'image'])

### 4. Extract and print the feed title, subtitle, author, and link.

In [10]:
print(apol['feed']['title'])
print(apol['feed']['subtitle'])
print(apol['feed']['author'])
print(apol['feed']['link'])

The Apology Line
<p>If you could call a number and say you’re sorry, and no one would know…what would you apologize for? For fifteen years, you could call a number in Manhattan and do just that. This is the story of the line, and the man at the other end who became consumed by his own creation. He was known as “Mr. Apology.” As thousands of callers flooded the line, confessing to everything from shoplifting to infidelity, drug dealing to murder, Mr. Apology realized he couldn’t just listen. He had to do something, even if it meant risking everything. From Wondery the makers of Dr. Death and The Shrink Next Door, comes a story about empathy, deception and obsession. Marissa Bridge, who knew Mr. Apology better than anyone, hosts this six episode series.</p><p>All episodes are available now. You can binge the series ad-free on Wondery+ or on Amazon Music with a Prime membership or Amazon Music Unlimited subscription. </p>
Wondery
https://wondery.com/shows/the-apology-line/?utm_source=rss


### 5. Count the number of entries that are contained in this RSS feed.

In [11]:
len(apol['entries'])

6

### 6. Obtain a list of components (keys) available for an entry.

*Hint: Remember to index first before requesting the keys*

In [15]:
apol['entries'][0].keys()

dict_keys(['title', 'title_detail', 'summary', 'summary_detail', 'itunes_title', 'itunes_episodetype', 'content', 'id', 'guidislink', 'published', 'published_parsed', 'itunes_explicit', 'image', 'tags', 'itunes_duration', 'links'])

### 7. Extract a list of entry titles.

In [27]:
titles = [entry.title for entry in apol.entries]
print(titles)

['Listen Now - Think Twice: Michael Jackson', 'Listen Now: Suspect "Five Shots in the Dark"', 'Where to find Episodes 2-7 of The Apology Line', 'Introducing: The Generation Why', 'Who’s Sorry Now? | 1', 'Introducing: The Apology Line']


### 8. Calculate the percentage of "Four short links" entry titles.
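
No answer cell is recorded for this exercise. A minimal sketch is shown below, assuming a case-insensitive substring match counts as a "Four short links" title. The helper name `title_percentage` is hypothetical, and the `titles` list is copied from the output of exercise 7 so the cell runs without refetching the feed.

```python
# Titles copied from the output of exercise 7.
titles = [
    'Listen Now - Think Twice: Michael Jackson',
    'Listen Now: Suspect "Five Shots in the Dark"',
    'Where to find Episodes 2-7 of The Apology Line',
    'Introducing: The Generation Why',
    'Who’s Sorry Now? | 1',
    'Introducing: The Apology Line',
]


def title_percentage(titles, phrase):
    """Return the percentage of titles containing phrase (case-insensitive)."""
    if not titles:
        return 0.0  # avoid dividing by zero on an empty feed
    matches = sum(phrase.lower() in title.lower() for title in titles)
    return 100 * matches / len(titles)


four_short_pct = title_percentage(titles, 'Four short links')
introducing_pct = title_percentage(titles, 'Introducing')  # extra check with a phrase that does occur

print(four_short_pct)
print(round(introducing_pct, 1))
```

None of the six titles printed in exercise 7 contains the phrase, so for this feed the result is 0%. The second call is only there to show the helper returning a non-zero value.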

### 9. Create a Pandas data frame from the feed's entries.

In [36]:
data = pd.DataFrame(apol.entries)

In [34]:
data

,title,title_detail,summary,summary_detail,itunes_title,itunes_episodetype,content,id,guidislink,published,published_parsed,itunes_explicit,image,tags,itunes_duration,links,itunes_episode
0,Listen Now - Think Twice: Michael Jackson,"{'type': 'text/plain', 'language': None, 'base...",<p>More than a decade since Michael Jackson’s ...,"{'type': 'text/html', 'language': None, 'base'...",Listen Now - Think Twice: Michael Jackson,trailer,"[{'type': 'text/plain', 'language': None, 'bas...",gid://art19-episode-locator/V0/RB0GUh28sKfRDnX...,False,"Mon, 31 Jul 2023 08:00:00 -0000","(2023, 7, 31, 8, 0, 0, 0, 212, 0)",True,{'href': 'https://content.production.cdn.art19...,"[{'term': 'SERIAL KILLER', 'scheme': 'http://w...",00:07:48,"[{'type': 'audio/mpeg', 'length': '7488574', '...",
1,"Listen Now: Suspect ""Five Shots in the Dark""","{'type': 'text/plain', 'language': None, 'base...",<p>Suspect is an investigative series about mi...,"{'type': 'text/html', 'language': None, 'base'...","Listen Now: Suspect ""Five Shots in the Dark""",trailer,"[{'type': 'text/plain', 'language': None, 'bas...",gid://art19-episode-locator/V0/zOcPUHGtQrWg5FP...,False,"Mon, 17 Jul 2023 08:00:00 -0000","(2023, 7, 17, 8, 0, 0, 0, 198, 0)",True,{'href': 'https://content.production.cdn.art19...,"[{'term': 'serial killer', 'scheme': 'http://w...",00:06:02,"[{'type': 'audio/mpeg', 'length': '5797093', '...",
2,Where to find Episodes 2-7 of The Apology Line,"{'type': 'text/plain', 'language': None, 'base...",<p>The Apology Line has moved. You can listen ...,"{'type': 'text/html', 'language': None, 'base'...",Where to find Episodes 2-7 of The Apology Line,bonus,"[{'type': 'text/plain', 'language': None, 'bas...",gid://art19-episode-locator/V0/FZOLMXrfpw3yiFJ...,False,"Mon, 03 Jul 2023 15:28:44 -0000","(2023, 7, 3, 15, 28, 44, 0, 184, 0)",True,{'href': 'https://content.production.cdn.art19...,"[{'term': 'SERIAL KILLER', 'scheme': 'http://w...",00:00:51,"[{'type': 'audio/mpeg', 'length': '827559', 'h...",
3,Introducing: The Generation Why,"{'type': 'text/plain', 'language': None, 'base...",<p>The Generation Why Podcast released its fir...,"{'type': 'text/html', 'language': None, 'base'...",Introducing: The Generation Why,trailer,"[{'type': 'text/html', 'language': None, 'base...",gid://art19-episode-locator/V0/sUtNdRf3gL71KX2...,False,"Mon, 03 Jul 2023 08:00:00 -0000","(2023, 7, 3, 8, 0, 0, 0, 184, 0)",True,{'href': 'https://content.production.cdn.art19...,"[{'term': 'SERIAL KILLER', 'scheme': 'http://w...",00:08:10,"[{'type': 'audio/mpeg', 'length': '7840914', '...",
4,Who’s Sorry Now? | 1,"{'type': 'text/plain', 'language': None, 'base...",<p>Marissa Bridge has only had a premonition t...,"{'type': 'text/html', 'language': None, 'base'...",Who’s Sorry Now?,full,"[{'type': 'text/plain', 'language': None, 'bas...",gid://art19-episode-locator/V0/eZqOUfJOCHVNII-...,False,"Sun, 17 Jan 2021 08:00:00 -0000","(2021, 1, 17, 8, 0, 0, 6, 17, 0)",True,{'href': 'https://content.production.cdn.art19...,"[{'term': 'Exhibit C', 'scheme': 'http://www.i...",00:37:20,"[{'type': 'audio/mpeg', 'length': '35845433', ...",1.0
5,Introducing: The Apology Line,"{'type': 'text/plain', 'language': None, 'base...",<p>If you could call a number and say you’re s...,"{'type': 'text/html', 'language': None, 'base'...",Introducing: The Apology Line,trailer,"[{'type': 'text/plain', 'language': None, 'bas...",gid://art19-episode-locator/V0/2E7Nce-ZiX0Rmo0...,False,"Tue, 05 Jan 2021 03:26:59 -0000","(2021, 1, 5, 3, 26, 59, 1, 5, 0)",True,{'href': 'https://content.production.cdn.art19...,"[{'term': 'Exhibit C', 'scheme': 'http://www.i...",00:02:24,"[{'type': 'audio/mpeg', 'length': '2320091', '...",


### 10. Count the number of entries per author and sort them in descending order.
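
No answer cell is recorded for this exercise, and the entry keys listed in exercise 6 do not include `author`. The sketch below therefore runs on hypothetical sample entries rather than on this feed. It assumes each entry behaves like a dictionary, and it groups entries without an author under the placeholder label `'unknown'`. The names `entries_per_author` and `sample_author_entries` are illustrative.

```python
from collections import Counter


def entries_per_author(entries):
    """Count entries per author, most frequent first."""
    # Entries without an 'author' key are grouped under 'unknown' (an assumption).
    counts = Counter(entry.get('author', 'unknown') for entry in entries)
    return counts.most_common()  # list of (author, count), descending by count


# Hypothetical sample data, not taken from the feed.
sample_author_entries = [
    {'title': 'A', 'author': 'Ann'},
    {'title': 'B', 'author': 'Bob'},
    {'title': 'C', 'author': 'Ann'},
    {'title': 'D'},
]

author_counts = entries_per_author(sample_author_entries)
print(author_counts)
```

For a feed whose entries do carry an `author` field, the same counts could be obtained from the data frame with `data['author'].value_counts()`, which sorts in descending order by default.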

### 11. Add a new column to the data frame that contains the length (number of characters) of each entry title. Return a data frame that contains the title, author, and title length of each entry in descending order (longest title length at the top).

In [38]:
data['title_length'] = data['title'].apply(len)
# Entries in this feed have no 'author' key (see exercise 6), so only title and title_length are selected.
data[['title', 'title_length']].sort_values('title_length', ascending=False)

,title,title_length
2,Where to find Episodes 2-7 of The Apology Line,46
1,"Listen Now: Suspect ""Five Shots in the Dark""",44
0,Listen Now - Think Twice: Michael Jackson,41
3,Introducing: The Generation Why,31
5,Introducing: The Apology Line,29
4,Who’s Sorry Now? | 1,20


### 12. Create a list of entry titles whose summary includes the phrase "machine learning."
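
No answer cell is recorded for this exercise. One possible approach is sketched below, assuming a case-insensitive substring match on each entry's `summary`. The summaries in the data frame output above are truncated, so the example runs on hypothetical sample entries. The names `titles_with_phrase` and `sample_summary_entries` are illustrative.

```python
def titles_with_phrase(entries, phrase):
    """Return titles of entries whose summary contains phrase (case-insensitive)."""
    phrase = phrase.lower()
    return [
        entry.get('title', '')
        for entry in entries
        # A missing summary is treated as an empty string (an assumption).
        if phrase in entry.get('summary', '').lower()
    ]


# Hypothetical sample data, not taken from the feed.
sample_summary_entries = [
    {'title': 'Intro to ML', 'summary': '<p>A primer on Machine Learning.</p>'},
    {'title': 'Gardening', 'summary': '<p>Tomatoes and soil.</p>'},
    {'title': 'No summary here'},
]

ml_titles = titles_with_phrase(sample_summary_entries, 'machine learning')
print(ml_titles)
```

With the parsed feed, the same helper could be called as `titles_with_phrase(apol.entries, 'machine learning')`, since feedparser entries support dictionary-style access.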