# Working with RSS Feeds Lab

Complete the following set of exercises to solidify your knowledge of parsing RSS feeds and extracting information from them.

In [1]:
%pip install feedparser

Note: you may need to restart the kernel to use updated packages.


In [2]:
import feedparser
import pandas as pd

### 1. Use feedparser to parse the following RSS feed URL.

In [9]:
reddit = feedparser.parse('https://feeds.simplecast.com/54nAGcIl')

### 2. Obtain a list of components (keys) that are available for this feed.

In [5]:
reddit.keys()

dict_keys(['bozo', 'entries', 'feed', 'headers', 'updated', 'updated_parsed', 'href', 'status', 'encoding', 'bozo_exception', 'version', 'namespaces'])

### 3. Obtain a list of components (keys) that are available for the *feed* component of this RSS feed.

In [12]:
reddit['feed'].keys()

dict_keys(['links', 'generator_detail', 'generator', 'title', 'title_detail', 'subtitle', 'subtitle_detail', 'rights', 'rights_detail', 'language', 'published', 'published_parsed', 'updated', 'updated_parsed', 'image', 'link', 'itunes_type', 'summary', 'summary_detail', 'authors', 'author', 'author_detail', 'itunes_explicit', 'itunes_new-feed-url', 'publisher_detail', 'tags'])

### 4. Extract and print the feed title, subtitle, author, and link.

In [14]:
print(reddit['feed']['title'])
print(reddit['feed']['subtitle'])
print(reddit['feed']['author'])
print(reddit['feed']['link'])

The Daily
This is what the news should sound like. The biggest stories of our time, told by the best journalists in the world. Hosted by Michael Barbaro and Sabrina Tavernise. Twenty minutes a day, five days a week, ready by 6 a.m.

Listen to this podcast in New York Times Audio, our new iOS app for news subscribers. Download now at nytimes.com/audioapp
The New York Times
https://www.nytimes.com/the-daily


### 5. Count the number of entries that are contained in this RSS feed.

In [15]:
len(reddit['entries'])

### 6. Obtain a list of components (keys) available for an entry.

*Hint: Remember to index first before requesting the keys*

In [16]:
reddit['entries'][0].keys()

dict_keys(['id', 'guidislink', 'title', 'title_detail', 'summary', 'summary_detail', 'published', 'published_parsed', 'authors', 'author', 'author_detail', 'links', 'link', 'content', 'itunes_title', 'itunes_duration', 'subtitle', 'subtitle_detail', 'itunes_explicit', 'itunes_episodetype'])

### 7. Extract a list of entry titles.

In [17]:
titles = [entry.title for entry in reddit.entries]
print(titles)



### 8. Calculate the percentage of "Four short links" entry titles.
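
No solution cell is recorded for this exercise. A minimal sketch of one approach follows, assuming the titles are available as a plain list of strings such as the one built in exercise 7. The helper name `percent_titles_containing` and the sample titles are hypothetical and only illustrate the calculation; the phrase may not occur in this particular feed at all, in which case the result is simply 0.

```python
def percent_titles_containing(titles, phrase):
    """Return the percentage (0-100) of titles that contain `phrase`, ignoring case."""
    if not titles:
        # Avoid dividing by zero when the feed has no entries.
        return 0.0
    needle = phrase.lower()
    matches = sum(1 for title in titles if needle in title.lower())
    return 100 * matches / len(titles)


# Hypothetical sample titles, not taken from the feed.
sample_titles = [
    "Four short links: 1 August 2023",
    "The Secret History of Gun Rights",
    "Four short links: 31 July 2023",
    "Menopause Is Having a Moment",
]
print(percent_titles_containing(sample_titles, "Four short links"))
```

To apply the same idea to the parsed feed, pass in the list of titles extracted from its entries instead of `sample_titles`.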

### 9. Create a Pandas data frame from the feed's entries.

In [19]:
data = pd.DataFrame(reddit.entries)

In [20]:
data.head()

| | id | guidislink | title | title_detail | summary | summary_detail | published | published_parsed | authors | author | ... | link | content | itunes_title | itunes_duration | subtitle | subtitle_detail | itunes_explicit | itunes_episodetype | itunes_episode | image |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | e01b3eb5-e679-47d1-98ca-deb2302be1a7 | False | The Charges Against Trump for Conspiring to Ov... | {'type': 'text/plain', 'language': None, 'base... | On Tuesday afternoon, the special counsel Jack... | {'type': 'text/plain', 'language': None, 'base... | Wed, 2 Aug 2023 09:45:00 +0000 | (2023, 8, 2, 9, 45, 0, 2, 214, 0) | [{'name': 'The New York Times', 'email': 'thed... | The New York Times | ... | https://www.nytimes.com/the-daily | [{'type': 'text/html', 'language': None, 'base... | The Charges Against Trump for Conspiring to Ov... | 00:26:25 | On Tuesday afternoon, the special counsel Jack... | {'type': 'text/plain', 'language': None, 'base... | | full | | |
| 1 | 3f6e5473-d26a-4c9c-bb97-6e17e5b00260 | False | The Secret History of Gun Rights | {'type': 'text/plain', 'language': None, 'base... | How did the National Rifle Association, Americ... | {'type': 'text/plain', 'language': None, 'base... | Tue, 1 Aug 2023 09:45:00 +0000 | (2023, 8, 1, 9, 45, 0, 1, 213, 0) | [{'name': 'The New York Times', 'email': 'thed... | The New York Times | ... | https://www.nytimes.com/the-daily | [{'type': 'text/html', 'language': None, 'base... | The Secret History of Gun Rights | 00:26:57 | How did the National Rifle Association, Americ... | {'type': 'text/plain', 'language': None, 'base... | | full | | |
| 2 | 6a349a51-f9de-4fd0-820d-e8a9f137b199 | False | Italy’s Giorgia Meloni Charts a Path for the F... | {'type': 'text/plain', 'language': None, 'base... | Last year, Giorgia Meloni, an Italian far-righ... | {'type': 'text/plain', 'language': None, 'base... | Mon, 31 Jul 2023 09:45:00 +0000 | (2023, 7, 31, 9, 45, 0, 0, 212, 0) | [{'name': 'The New York Times', 'email': 'thed... | The New York Times | ... | https://www.nytimes.com/the-daily | [{'type': 'text/html', 'language': None, 'base... | Italy’s Giorgia Meloni Charts a Path for the F... | 00:31:31 | Last year, Giorgia Meloni, an Italian far-righ... | {'type': 'text/plain', 'language': None, 'base... | | full | | |
| 3 | b4282e9e-bbb4-43cb-80ca-49a76f0d9c2f | False | The Sunday Read: ‘The America That Americans F... | {'type': 'text/plain', 'language': None, 'base... | On the weekends, when Roy Gamboa was a little ... | {'type': 'text/plain', 'language': None, 'base... | Sun, 30 Jul 2023 10:00:00 +0000 | (2023, 7, 30, 10, 0, 0, 6, 211, 0) | [{'name': 'The New York Times', 'email': 'thed... | The New York Times | ... | https://www.nytimes.com/the-daily | [{'type': 'text/html', 'language': None, 'base... | The Sunday Read: ‘The America That Americans F... | 01:43:56 | On the weekends, when Roy Gamboa was a little ... | {'type': 'text/plain', 'language': None, 'base... | | full | | |
| 4 | bacbf190-63d7-446c-a92e-4b15823e482f | False | Menopause Is Having a Moment | {'type': 'text/plain', 'language': None, 'base... | Some of the worst symptoms of menopause — incl... | {'type': 'text/plain', 'language': None, 'base... | Fri, 28 Jul 2023 09:45:00 +0000 | (2023, 7, 28, 9, 45, 0, 4, 209, 0) | [{'name': 'The New York Times', 'email': 'thed... | The New York Times | ... | https://www.nytimes.com/the-daily | [{'type': 'text/html', 'language': None, 'base... | Menopause Is Having a Moment | 00:32:08 | Some of the worst symptoms of menopause — incl... | {'type': 'text/plain', 'language': None, 'base... | | full | | |


### 10. Count the number of entries per author and sort them in descending order.
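
No solution cell is recorded for this exercise. The sketch below shows one possible approach in plain Python, assuming each entry behaves like a dictionary with an `author` key, as the entry keys listed in exercise 6 suggest. The helper name `entries_per_author`, the `"unknown"` fallback label, and the sample entries are hypothetical.

```python
from collections import Counter


def entries_per_author(entries):
    """Return (author, count) pairs ordered from most to fewest entries."""
    # Entries without an author are grouped under a placeholder label (an assumption).
    counts = Counter(entry.get("author", "unknown") for entry in entries)
    # most_common() returns the pairs sorted by count in descending order.
    return counts.most_common()


# Hypothetical sample entries, not taken from the feed.
sample_entries = [
    {"title": "Episode A", "author": "Author One"},
    {"title": "Episode B", "author": "Author Two"},
    {"title": "Episode C", "author": "Author One"},
]
print(entries_per_author(sample_entries))
```

With the data frame from exercise 9, `data['author'].value_counts()` should give the same kind of result, since `value_counts` sorts counts in descending order by default.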

### 11. Add a new column to the data frame that contains the length (number of characters) of each entry title. Return a data frame that contains the title, author, and title length of each entry in descending order (longest title length at the top).

In [21]:
data['title_length'] = data['title'].apply(len)
data[['title', 'title_length']].sort_values('title_length', ascending=False).head()

| | title | title_length |
|---|---|---|
| 50 | Special Episode: A Crash Course in Dembow, a M... | 114 |
| 732 | Bonus: The N-Word is Both Unspeakable and Ubiq... | 109 |
| 779 | The Sunday Read: ‘The Amateur Cloud Society Th... | 92 |
| 456 | The Sunday Read: ‘Animals That Infect Humans A... | 92 |
| 344 | The Sunday Read: ‘How Houston Moved 25,000 Peo... | 91 |
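
The exercise also asks for the author alongside the title and its length. As a rough sketch of that fuller result in plain Python, the hypothetical helper `titles_by_length` below builds one row per entry and sorts the rows by title length, longest first. It assumes each entry behaves like a dictionary with `title` and `author` keys; the sample entries are made up for illustration.

```python
def titles_by_length(entries):
    """Return rows with title, author, and title_length, longest title first."""
    rows = [
        {
            "title": entry["title"],
            "author": entry.get("author"),  # None when an entry has no author
            "title_length": len(entry["title"]),
        }
        for entry in entries
    ]
    return sorted(rows, key=lambda row: row["title_length"], reverse=True)


# Hypothetical sample entries, not taken from the feed.
sample_entries = [
    {"title": "Short", "author": "Author One"},
    {"title": "A considerably longer title", "author": "Author Two"},
    {"title": "Medium title", "author": "Author One"},
]
for row in titles_by_length(sample_entries):
    print(row)
```

In pandas, selecting `data[['title', 'author', 'title_length']]` before calling `sort_values('title_length', ascending=False)` would likely produce the equivalent data frame.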


### 12. Create a list of entry titles whose summary includes the phrase "machine learning."
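
No solution cell is recorded for this exercise. One possible approach, sketched in plain Python, filters the entries with a case-insensitive substring check on the summary. The helper name `titles_mentioning` and the sample entries are hypothetical, and the sketch assumes each entry behaves like a dictionary with `title` and `summary` keys. For a feed that never mentions the phrase, the result is an empty list.

```python
def titles_mentioning(entries, phrase):
    """Return the titles of entries whose summary contains `phrase`, ignoring case."""
    needle = phrase.lower()
    return [
        entry["title"]
        for entry in entries
        # Entries without a summary are treated as having an empty one.
        if needle in entry.get("summary", "").lower()
    ]


# Hypothetical sample entries, not taken from the feed.
sample_entries = [
    {"title": "Episode A", "summary": "A look at Machine Learning in newsrooms."},
    {"title": "Episode B", "summary": "An interview about local elections."},
    {"title": "Episode C"},
]
print(titles_mentioning(sample_entries, "machine learning"))
```

To run it against the parsed feed, pass the feed's list of entries in place of `sample_entries`.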