# Working with RSS Feeds Lab

Complete the following set of exercises to solidify your knowledge of parsing RSS feeds and extracting information from them.

In [1]:
import feedparser

### 1. Use feedparser to parse the following RSS feed URL.

In [14]:
burner = feedparser.parse('http://feeds.feedburner.com/oreilly/radar/atom')
burner

{'bozo': 0,
 'encoding': 'UTF-8',
 'entries': [{'author': 'Mac Slocum',
   'author_detail': {'name': 'Mac Slocum'},
   'authors': [{'name': 'Mac Slocum'}],
   'comments': 'https://www.oreilly.com/radar/highlights-from-software-architecture-berlin-2019/#respond',
   'content': [{'base': 'http://feeds.feedburner.com/oreilly/radar/atom',
     'language': None,
     'type': 'text/html',
     'value': '<p>Experts from across the software architecture world came together in Berlin for the <a href="https://conferences.oreilly.com/software-architecture/sa-eu">O&#8217;Reilly Software Architecture Conference</a>. Below you&#8217;ll find links to highlights from the event.</p>\n<h2>Cognitive biases in the architect&#8217;s life</h2>\n<p>Birgitta Boeckeler covers some of the cognitive biases that can trip up architects.</p>\n<ul>\n<li>Watch &#8220;<a href="https://www.oreilly.com/radar/cognitive-biases-in-the-architects-life">Cognitive biases in the architect&#8217;s life</a>&#8220;</li>\n</ul>\n<

### 2. Obtain a list of components (keys) that are available for this feed.

In [15]:
burner.keys()

dict_keys(['feed', 'entries', 'bozo', 'headers', 'etag', 'updated', 'updated_parsed', 'href', 'status', 'encoding', 'version', 'namespaces'])
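
The `bozo` and `status` keys listed above are useful for a quick sanity check before working with the entries. The following is a minimal sketch, not part of the lab solution: a plain dict stands in for the parsed result (feedparser results are dict-like and support `.get()`), the helper name `looks_usable` is hypothetical, and the `status` value of 200 is an assumed example rather than a value taken from the output.

```python
def looks_usable(parsed):
    """Hypothetical check: feed is well-formed (bozo falsy) and has no HTTP error status."""
    well_formed = not parsed.get("bozo", 0)
    # Assumption: a missing status (e.g. a feed parsed from a local file) is treated as OK.
    status = parsed.get("status", 200)
    return well_formed and status < 400

# Stand-in for `burner`; 'bozo': 0 matches the output in exercise 1, 'status' is assumed.
parsed = {"bozo": 0, "status": 200, "entries": [{"title": "example"}]}
print(looks_usable(parsed))
```

With the real object, the same call would be `looks_usable(burner)`.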

### 3. Obtain a list of components (keys) that are available for the *feed* component of this RSS feed.

In [16]:
burner.feed.keys()

dict_keys(['title', 'title_detail', 'links', 'link', 'subtitle', 'subtitle_detail', 'updated', 'updated_parsed', 'language', 'sy_updateperiod', 'sy_updatefrequency', 'generator_detail', 'generator', 'feedburner_info', 'geo_lat', 'geo_long', 'feedburner_emailserviceid', 'feedburner_feedburnerhostname'])

### 4. Extract and print the feed title, subtitle, author, and link.

In [37]:
print(burner.feed.title)
print('')
print(burner.feed.subtitle)
print('')
# The feed component has no 'author' key, so use the first entry's author instead
print(burner.entries[0]['author'])
print('')
print(burner.feed.link)


Radar

Now, next, and beyond: Tracking need-to-know trends at the intersection of business and technology

Mac Slocum

https://www.oreilly.com/radar
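
The key list from exercise 3 contains no `author` key for the feed itself, which is why the cell above reads the author from the first entry. A defensive lookup could be sketched as follows. Plain dicts stand in for `burner.feed` and `burner.entries`, and the helper `feed_author` and its `default` parameter are hypothetical names introduced only for illustration.

```python
# Plain dicts standing in for burner.feed and burner.entries
# (values copied from the output above).
feed = {
    "title": "Radar",
    "link": "https://www.oreilly.com/radar",
}
entries = [{"author": "Mac Slocum"}]

def feed_author(feed, entries, default="unknown"):
    """Hypothetical helper: prefer a feed-level author, else the first entry's author."""
    if "author" in feed:
        return feed["author"]
    if entries:
        return entries[0].get("author", default)
    return default

print(feed_author(feed, entries))
```

This avoids a `KeyError` or `AttributeError` when a feed omits the field, at the cost of silently returning the default.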


### 5. Count the number of entries that are contained in this RSS feed.

In [25]:
len(burner.entries)

18

### 6. Obtain a list of components (keys) available for an entry.

*Hint: Index into the entries list first, then request the keys.*

In [43]:
burner.entries[0].keys()

dict_keys(['title', 'title_detail', 'links', 'link', 'comments', 'published', 'published_parsed', 'authors', 'author', 'author_detail', 'tags', 'id', 'guidislink', 'summary', 'summary_detail', 'content', 'wfw_commentrss', 'slash_comments', 'feedburner_origlink'])

### 7. Extract a list of entry titles.

In [23]:
titles = [entry.title for entry in burner.entries]
print(titles)

['Highlights from the O’Reilly Software Architecture Conference in Berlin 2019', 'Highlights from the O’Reilly Velocity Conference in Berlin 2019', 'From the trenches: Patrick Kua', '5 things Go taught me about open source?', 'Building high-performing engineering teams, one pixel at a time', 'How to deploy infrastructure in just 13.8 billion years', 'Controlled chaos: The inevitable marriage of DevOps and security', 'The ultimate guide to complicated systems', 'Cognitive biases in the architect’s life', 'The three-headed dog: Architecture, process, structure', 'A world of deepfakes', 'Radar trends to watch: November 2019', 'Four short links: 7 November 2019', 'Modern machine learning architectures: Data and hardware and platform, oh my', 'The new norms of cloud native', 'Observability: Understanding production through your customers’ eyes', 'Secure reliable systems', 'My love letter to computer science is very short and I also forgot to mail it']


### 8. Calculate the percentage of "Four short links" entry titles.

In [51]:
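def percentage(prefix):
    # Count the titles that start with the given prefix
    count = 0
    for title in titles:
        if title.startswith(prefix):
            count += 1
    # Share of matching titles, expressed as a percentage
    return 100 * count / len(titles)

percentage('Four short links')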

### 9. Create a Pandas data frame from the feed's entries.

In [52]:
import pandas as pd

In [53]:
df = pd.DataFrame(burner.entries)
df

Unnamed: 0,author,author_detail,authors,comments,content,feedburner_origlink,guidislink,id,link,links,published,published_parsed,slash_comments,summary,summary_detail,tags,title,title_detail,wfw_commentrss
0,Mac Slocum,{'name': 'Mac Slocum'},[{'name': 'Mac Slocum'}],https://www.oreilly.com/radar/highlights-from-...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/highlights-from-...,False,https://www.oreilly.com/radar/?p=10569,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:10:44 +0000","(2019, 11, 7, 20, 10, 44, 3, 311, 0)",0,Experts from across the software architecture ...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",Highlights from the O’Reilly Software Architec...,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/highlights-from-...
1,Mac Slocum,{'name': 'Mac Slocum'},[{'name': 'Mac Slocum'}],https://www.oreilly.com/radar/highlights-from-...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/highlights-from-...,False,https://www.oreilly.com/radar/?p=10577,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:09:07 +0000","(2019, 11, 7, 20, 9, 7, 3, 311, 0)",0,People from across the cloud native and distri...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",Highlights from the O’Reilly Velocity Conferen...,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/highlights-from-...
2,Patrick Kua and Neal Ford,{'name': 'Patrick Kua and Neal Ford'},[{'name': 'Patrick Kua and Neal Ford'}],https://www.oreilly.com/radar/from-the-trenche...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/from-the-trenche...,False,https://www.oreilly.com/radar/?p=10503,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:48 +0000","(2019, 11, 7, 20, 0, 48, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",From the trenches: Patrick Kua,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/from-the-trenche...
3,Dave Cheney,{'name': 'Dave Cheney'},[{'name': 'Dave Cheney'}],https://www.oreilly.com/radar/5-things-go-taug...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/5-things-go-taug...,False,https://www.oreilly.com/radar/?p=10549,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:38 +0000","(2019, 11, 7, 20, 0, 38, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",5 things Go taught me about open source?,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/5-things-go-taug...
4,Lena Reinhard,{'name': 'Lena Reinhard'},[{'name': 'Lena Reinhard'}],https://www.oreilly.com/radar/building-high-pe...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/building-high-pe...,False,https://www.oreilly.com/radar/?p=10556,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:35 +0000","(2019, 11, 7, 20, 0, 35, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...","Building high-performing engineering teams, on...","{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/building-high-pe...
5,Ingrid Burrington,{'name': 'Ingrid Burrington'},[{'name': 'Ingrid Burrington'}],https://www.oreilly.com/radar/how-to-deploy-in...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/how-to-deploy-in...,False,https://www.oreilly.com/radar/?p=10536,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:34 +0000","(2019, 11, 7, 20, 0, 34, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",How to deploy infrastructure in just 13.8 bill...,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/how-to-deploy-in...
6,Kelly Shortridge,{'name': 'Kelly Shortridge'},[{'name': 'Kelly Shortridge'}],https://www.oreilly.com/radar/controlled-chaos...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/controlled-chaos...,False,https://www.oreilly.com/radar/?p=10561,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:30 +0000","(2019, 11, 7, 20, 0, 30, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",Controlled chaos: The inevitable marriage of D...,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/controlled-chaos...
7,Jennifer Davis,{'name': 'Jennifer Davis'},[{'name': 'Jennifer Davis'}],https://www.oreilly.com/radar/the-ultimate-gui...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/the-ultimate-gui...,False,https://www.oreilly.com/radar/?p=10540,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:17 +0000","(2019, 11, 7, 20, 0, 17, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",The ultimate guide to complicated systems,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/the-ultimate-gui...
8,Birgitta Boeckeler,{'name': 'Birgitta Boeckeler'},[{'name': 'Birgitta Boeckeler'}],https://www.oreilly.com/radar/cognitive-biases...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/cognitive-biases...,False,https://www.oreilly.com/radar/?p=10497,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:07 +0000","(2019, 11, 7, 20, 0, 7, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...",Cognitive biases in the architect’s life,"{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/cognitive-biases...
9,Allen Holub,{'name': 'Allen Holub'},[{'name': 'Allen Holub'}],https://www.oreilly.com/radar/the-three-headed...,"[{'type': 'text/html', 'language': None, 'base...",https://www.oreilly.com/radar/the-three-headed...,False,https://www.oreilly.com/radar/?p=10490,http://feedproxy.google.com/~r/oreilly/radar/a...,"[{'rel': 'alternate', 'type': 'text/html', 'hr...","Thu, 07 Nov 2019 20:00:06 +0000","(2019, 11, 7, 20, 0, 6, 3, 311, 0)",0,This is a keynote highlight from the O&#8217;R...,"{'type': 'text/html', 'language': None, 'base'...","[{'term': 'Next Architecture', 'scheme': None,...","The three-headed dog: Architecture, process, s...","{'type': 'text/plain', 'language': None, 'base...",https://www.oreilly.com/radar/the-three-headed...
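
The `published` column above holds date strings such as `Thu, 07 Nov 2019 20:10:44 +0000`, which sort alphabetically rather than chronologically. One way to turn them into real datetimes, sketched here with the standard library only, is `email.utils.parsedate_to_datetime`. The variable names are illustrative and the two strings are copied from the first two rows.

```python
from email.utils import parsedate_to_datetime

# Date strings copied from the `published` column above.
published = [
    "Thu, 07 Nov 2019 20:10:44 +0000",
    "Thu, 07 Nov 2019 20:09:07 +0000",
]

# With an explicit +0000 offset, the result is a timezone-aware datetime.
parsed = [parsedate_to_datetime(s) for s in published]

# Datetimes compare chronologically, so max() gives the most recent entry.
newest = max(parsed)
print(newest.isoformat())
```

Inside the data frame, `pd.to_datetime(df['published'])` would be the usual route to the same result.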


### 10. Count the number of entries per author and sort them in descending order.

In [54]:
author = df.groupby('author',as_index = False).agg({'title':'count'})
author.columns = ['author','entries']
author.sort_values('entries',ascending = False)

Unnamed: 0,author,entries
13,Mac Slocum,2
0,Allen Holub,1
9,James Mickens,1
15,Nat Torkington,1
14,Mike Loukides,1
12,Lena Reinhard,1
11,Kelly Shortridge,1
10,Jennifer Davis,1
8,Ingrid Burrington,1
1,Ana Oprea,1
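
For comparison, the same count can be sketched without pandas using `collections.Counter`. This is an alternative technique, not the lab's method. The author list below covers only the first ten entries, copied from the data frame output in exercise 9.

```python
from collections import Counter

# Authors of the first ten entries, copied from the data frame output in exercise 9.
authors = [
    "Mac Slocum", "Mac Slocum", "Patrick Kua and Neal Ford", "Dave Cheney",
    "Lena Reinhard", "Ingrid Burrington", "Kelly Shortridge", "Jennifer Davis",
    "Birgitta Boeckeler", "Allen Holub",
]

counts = Counter(authors)

# most_common() returns (author, count) pairs sorted by count, highest first.
for author, n in counts.most_common(3):
    print(author, n)
```

In pandas, `df['author'].value_counts()` gives the same counts, already sorted in descending order.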


### 11. Add a new column to the data frame that contains the length (number of characters) of each entry title. Return a data frame that contains the title, author, and title length of each entry in descending order (longest title length at the top).

In [58]:
df['title_length'] = df['title'].apply(len)
# Show every entry, longest title first
df[['title', 'author', 'title_length']].sort_values('title_length', ascending=False)

17    My love letter to computer science is very sho...
13    Modern machine learning architectures: Data an...
0     Highlights from the O’Reilly Software Architec...
15    Observability: Understanding production throug...
6     Controlled chaos: The inevitable marriage of D...
4     Building high-performing engineering teams, on...
1     Highlights from the O’Reilly Velocity Conferen...
5     How to deploy infrastructure in just 13.8 bill...
9     The three-headed dog: Architecture, process, s...
7             The ultimate guide to complicated systems
3              5 things Go taught me about open source?
8              Cognitive biases in the architect’s life
11                 Radar trends to watch: November 2019
12                    Four short links: 7 November 2019
2                        From the trenches: Patrick Kua
14                        The new norms of cloud native
16                              Secure reliable systems
10                                 A world of de

### 12. Create a list of entry titles whose summary includes the phrase "machine learning."

In [61]:
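# Filter on the summary text (not the title), ignoring case
machine = [title for title, summary in zip(df['title'], df['summary']) if 'machine learning' in summary.lower()]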

In [62]:
machine

['Modern machine learning architectures: Data and hardware and platform, oh my']
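
The summaries in this feed are HTML and contain character entities (for example `O&#8217;R...` in the summary column of exercise 9), so a plain substring test can miss phrases that include such characters or different capitalization. A minimal sketch of a more tolerant match follows. The sample entries and the helper name `titles_mentioning` are hypothetical. With the real feed, `burner.entries` would take the place of the sample list.

```python
import html

# Hypothetical sample entries; real ones would come from burner.entries.
entries = [
    {"title": "Post A", "summary": "<p>Notes on Machine Learning in production.</p>"},
    {"title": "Post B", "summary": "<p>The architect&#8217;s guide to systems.</p>"},
]

def titles_mentioning(entries, phrase):
    """Return titles whose summary contains `phrase`, ignoring case."""
    needle = phrase.lower()
    return [
        e["title"]
        for e in entries
        # Decode HTML entities before comparing, then lowercase both sides.
        if needle in html.unescape(e.get("summary", "")).lower()
    ]

print(titles_mentioning(entries, "machine learning"))
```

The tags themselves are left in place here. Stripping them would need an HTML parser and is outside the scope of this sketch.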