# ACQUIRE DATA THROUGH WEBSCRAPING

In [1]:
from requests import get
from bs4 import BeautifulSoup
import os
import pandas as pd

# 1. Codeup Blog Articles
Visit Codeup's blog and record the URLs for at least 5 distinct blog posts. For each post, you should scrape at least the post's title and content.

- Encapsulate your work in a function named get_blog_articles that will return a list of dictionaries, with each dictionary representing one article.

The shape of each dictionary should look like the example in the cell below, plus any additional properties you think might be helpful.

Bonus: Scrape the text of all the articles linked on Codeup's blog page.

In [2]:
{
    'title': 'the title of the article',
    'content': 'the full text content of the article'
}

{'title': 'the title of the article',
 'content': 'the full text content of the article'}
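
For the bonus, one possible approach is to collect the article links from the blog index page first and then scrape each one. The sketch below only shows the link-collection step on a small made-up HTML snippet. The selector `h2.entry-title a`, the base URL `https://codeup.com/blog/`, and the sample post paths are assumptions for illustration, not taken from the real page; inspect the actual markup before relying on them.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_article_urls(html, base_url, link_selector='h2.entry-title a'):
    """Return the unique article URLs found in a blog index page.

    link_selector is a hypothetical CSS selector; adjust it after
    inspecting the real page source.
    """
    soup = BeautifulSoup(html, 'html.parser')
    urls = []
    for link in soup.select(link_selector):
        href = link.get('href')
        if not href:
            continue
        # Resolve relative links against the page address
        absolute = urljoin(base_url, href)
        # Keep the first occurrence only, preserving page order
        if absolute not in urls:
            urls.append(absolute)
    return urls


# Made-up index page with one repeated link
sample_html = '''
<article><h2 class="entry-title"><a href="/featured/post-one/">Post one</a></h2></article>
<article><h2 class="entry-title"><a href="https://codeup.com/codeup-news/post-two/">Post two</a></h2></article>
<article><h2 class="entry-title"><a href="/featured/post-one/">Post one again</a></h2></article>
'''
article_urls = extract_article_urls(sample_html, 'https://codeup.com/blog/')
print(article_urls)
```

The resulting list could then be passed to whatever function scrapes a single article.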

In [3]:
def get_blog_articles():
    # if we already have the data, read it locally
    if os.path.exists('article.txt'):
        with open('article.txt') as f:
            return f.read()
        
    # otherwise go fetch the data
    url = 'https://codeup.com/data-science/math-in-data-science/'
    headers = {'User-Agent': 'Codeup Data Science'}
    response = get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    article = soup.find('div', id='main-content')

    # save it for next time
    with open('article.txt', 'w') as f:
        f.write(article.text)

    return article.text

---

## STEP-BY-STEP WALKTHROUGH 

#### STEPS
- STEP 1: Import the get() function from the requests module, BeautifulSoup from bs4, and pandas.
- STEP 2: Assign the address of the web page to a variable named url.
- STEP 3: Request the content of the web page from the server using get(), and store the server’s response in the variable response.

In [4]:
url1 = 'https://codeup.com/codeup-news/codeup-start-dates-for-march-2022/'
headers = {'User-Agent': 'Innis Data Science cohort'}  # Some websites don't accept the default python-requests user-agent
response = get(url1, headers=headers)

---

- STEP 4: Print the response text to ensure you have an html page.
- STEP 5: Take a look at the actual web page contents and inspect the source to understand the structure a bit.

In [32]:
print(response.text[:400])

<!DOCTYPE html>
<html lang="en-US">
<head>
	<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge">
	<link rel="pingback" href="https://codeup.com/xmlrpc.php" />

	<script type="text/javascript">
		document.documentElement.className = 'js';
	</script>
	
	<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin /><script id="diviarea-loader">window.DiviPopupData=wi


---

- STEP 6: Use BeautifulSoup to parse the HTML into a variable ('soup').

In [6]:
# Make a soup variable holding the response content
soup = BeautifulSoup(response.text, 'html.parser')

In [33]:
print(soup.prettify()[:400])

<!DOCTYPE html>
<html lang="en-US">
 <head>
  <meta charset="utf-8"/>
  <meta content="IE=edge" http-equiv="X-UA-Compatible"/>
  <link href="https://codeup.com/xmlrpc.php" rel="pingback"/>
  <script type="text/javascript">
   document.documentElement.className = 'js';
  </script>
  <link crossorigin="" href="https://fonts.gstatic.com" rel="preconnect"/>
  <script id="diviarea-loader">
   window.Di


---

- STEP 7: Identify the key tags you need to extract the data you are looking for.

In [8]:
title = soup.select_one('h1.entry-title').text
title

'Codeup Start Dates for March 2022'

In [34]:
content = soup.select_one('div.entry-content').text
content

'\n\n\n\n\n\nAWS, Google, Azure, Red Hat, CompTIA…these are big names in IT! And not only for their products, but also for the certifications they offer. If you’re new to tech, you might be wondering: Do certifications really matter? Welcome to IT Certifications 101! What’s the tl;dr? Certifications are critical to getting past HR for certain roles, but they won’t get you the job.\nWhat are IT Certifications?\nCertifications are official credentials given by companies, institutions, and vendors to verify that you have demonstrated a specific knowledge or skill. They are a stamp of approval from a known brand (like Amazon) saying “This person knows their stuff.” Certs are used in a lot of industries, including technology. Even within tech, there are dozens of vendors and hundreds of certs.\nWhy do IT Certifications matter to employers?\nMany roles (including security, system administration, and networking) require certain certs to make it in the door of an interview process. This filter
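
The raw content above starts with a run of newline characters, and other articles in this notebook also contain non-breaking spaces (`\xa0`). A minimal cleanup sketch, assuming it is acceptable to collapse every run of whitespace into a single space (this discards paragraph breaks); `clean_text` is a hypothetical helper name:

```python
def clean_text(text):
    """Collapse each run of whitespace (newlines, tabs, non-breaking
    spaces) into one space and trim both ends."""
    # str.split() with no arguments splits on any Unicode whitespace
    return ' '.join(text.split())


raw = '\n\n\n\n\n\nAWS, Google, Azure\xa0\nWhat are IT Certifications?\n'
cleaned = clean_text(raw)
print(cleaned)
```

If paragraph breaks matter for later analysis, splitting on `'\n'` and cleaning each piece separately would be an alternative.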

---

- STEP 8: Create a dictionary or dataframe of the data desired.

In [24]:
def get_one_blog_article(url):
    headers = {'User-Agent': 'Innis Data Science cohort'}  # Some websites don't accept the default python-requests user-agent
    response = get(url, headers=headers)

    # Make a soup variable holding the response content
    soup = BeautifulSoup(response.text, 'html.parser')
    
    output = {}
    output['title'] = soup.select_one('h1.entry-title').text
    output['contents'] = soup.select_one('div.entry-content').text
    
    return output    
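
One fragile point in this approach: `select_one` returns `None` when no element matches, so calling `.text` on the result raises an `AttributeError` if a page lacks the expected tags. The sketch below separates parsing from fetching and tolerates missing tags. `parse_blog_article` is a hypothetical helper name and the sample HTML is made up; the selectors are the ones used in this notebook.

```python
from bs4 import BeautifulSoup


def parse_blog_article(html):
    """Extract title and contents from article HTML.

    Missing elements produce None instead of raising an error.
    """
    soup = BeautifulSoup(html, 'html.parser')
    title_tag = soup.select_one('h1.entry-title')
    content_tag = soup.select_one('div.entry-content')
    return {
        'title': title_tag.get_text(separator=' ', strip=True) if title_tag else None,
        'contents': content_tag.get_text(separator=' ', strip=True) if content_tag else None,
    }


# Made-up HTML standing in for response.text
sample = (
    '<h1 class="entry-title"> Example Post </h1>'
    '<div class="entry-content">\n\nBody text.\n</div>'
)
print(parse_blog_article(sample))
print(parse_blog_article('<p>unrelated page</p>'))
```

Keeping the parser independent of the request also makes it possible to test it on saved HTML without hitting the website.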

In [25]:
get_one_blog_article('https://codeup.com/it-training/it-certifications-101/')

{'title': 'IT Certifications 101: Why They Matter, and Why They Don’t',
 'contents': '\n\n\n\n\n\nAWS, Google, Azure, Red Hat, CompTIA…these are big names in IT! And not only for their products, but also for the certifications they offer. If you’re new to tech, you might be wondering: Do certifications really matter? Welcome to IT Certifications 101! What’s the tl;dr? Certifications are critical to getting past HR for certain roles, but they won’t get you the job.\nWhat are IT Certifications?\nCertifications are official credentials given by companies, institutions, and vendors to verify that you have demonstrated a specific knowledge or skill. They are a stamp of approval from a known brand (like Amazon) saying “This person knows their stuff.” Certs are used in a lot of industries, including technology. Even within tech, there are dozens of vendors and hundreds of certs.\nWhy do IT Certifications matter to employers?\nMany roles (including security, system administration, and networki

#### URL 2

In [26]:
get_one_blog_article('https://codeup.com/featured/5-books-every-woman-in-tech-should-read/')

{'title': '5 Books Every Woman In Tech Should Read',
 'contents': '\nOn this International Women’s Day 2022 we wanted to tell stories about women in tech. What better way to do that than celebrate female authors! These women have written phenomenal books in the tech space to tell their stories. These are women who have walked the walk in the tech world and/or offered unique perspectives to not just women in tech, but women in the workplace. This list goes in no particular order as we believe you should add all of these to your kindle or book library asap. You can click on each book image to take you to a purchase page on Amazon.\nLet’s dive into the list below:\nReset: My Fight for Inclusion and Lasting Change by Ellen Pao\nFrom the book’s description on Amazon:\n“In 2015, Ellen K. Pao sued a powerhouse Silicon Valley venture capital firm, calling out workplace discrimination and retaliation against women and other underrepresented groups. Her suit rocked the tech world—and exposed its

#### URL 3

In [27]:
get_one_blog_article('https://codeup.com/codeup-news/dallas-campus-re-opens-with-new-grant-partner/')

{'title': 'Dallas Campus Re-opens With New Grant Partner',
 'contents': '\n\n\n\n\n\nWe are happy to announce that our Dallas campus re-opened!\xa0Better yet, we have a new grant partner that can fund up to $15,000 in tuition for eligible students. Are you a DFW resident who:\n\nIs unemployed or under-employed\nwork has been affected by COVID-19\na military veteran\nunder the age of 25\n\nIf so, follow the steps below to seek up to $15,000 in tuition funding!\n\nRegister for an account at www.workintexas.com – click “Sign in” and then select Option 3: Create a User Account\nFind the nearest Workforce Center to your residence using this tool\nAttend a WIOA information session – these sessions occur at different times at different locations. Contact your local center for more information after registering at Work in Texas\nSelect Codeup as the program for training, and notify tuition@codeup.com of your progress\n\nOur next Web Development class starts January 31st, 2022, and will be the 

#### URL 4

In [28]:
get_one_blog_article('https://codeup.com/codeup-news/codeups-placement-team-continues-setting-records/')

{'title': 'Codeup’s Placement Team Continues Setting Records',
 'contents': '\n\n\n\n\n\nOur Placement Team is simply defined as a group that manages relationships with our employer partners and our graduating students to help get our graduating students hired. Last quarter the Placement Team helped 48 students get hired to life-changing careers in tech. Last month our Placement Team has already placed 40 students with top tech companies. For that, we want to send a huge thank you to both our Placement Team and our Employer Partners who have done a tremendous job of helping Codeup empower life change for these students.\nWho exactly got hired and where? Check out the list below!\n\n\nKirsten Collier – hired at CGI as a Java Developer\nMichael Baker – hired to CGI as a Java Developer\nMichael Troia – hired to CDW as an Associate Consulting Engineer\xa0\nCarlos Padilla – hired at CGI as a Java Developer\xa0\nVictor G. Hernandez – hired at Anderson Marketing Group as a Java Developer\nNic

#### URL 5

In [29]:
get_one_blog_article('https://codeup.com/it-training/it-certifications-101/')

{'title': 'IT Certifications 101: Why They Matter, and Why They Don’t',
 'contents': '\n\n\n\n\n\nAWS, Google, Azure, Red Hat, CompTIA…these are big names in IT! And not only for their products, but also for the certifications they offer. If you’re new to tech, you might be wondering: Do certifications really matter? Welcome to IT Certifications 101! What’s the tl;dr? Certifications are critical to getting past HR for certain roles, but they won’t get you the job.\nWhat are IT Certifications?\nCertifications are official credentials given by companies, institutions, and vendors to verify that you have demonstrated a specific knowledge or skill. They are a stamp of approval from a known brand (like Amazon) saying “This person knows their stuff.” Certs are used in a lot of industries, including technology. Even within tech, there are dozens of vendors and hundreds of certs.\nWhy do IT Certifications matter to employers?\nMany roles (including security, system administration, and networki

In [35]:
def get_blog_articles(urls):
    # Create a list of dictionaries
    list_dic = [get_one_blog_article(url) for url in urls] 
    # Return a left-aligned, styled view of the data frame (a pandas Styler, not a plain DataFrame)
    return pd.DataFrame(list_dic).style.set_properties(**{'text-align':'left'})

In [36]:
urls = [
    'https://codeup.com/codeup-news/codeup-start-dates-for-march-2022/',
    'https://codeup.com/featured/5-books-every-woman-in-tech-should-read/',
    'https://codeup.com/codeup-news/dallas-campus-re-opens-with-new-grant-partner/',
    'https://codeup.com/codeup-news/codeups-placement-team-continues-setting-records/',
    'https://codeup.com/it-training/it-certifications-101/'
]

In [37]:
get_blog_articles(urls)

Unnamed: 0,title,contents
0,Codeup Start Dates for March 2022,"As we approach the end of January we wanted to look forward to our next start dates for all of our current programs. Full Stack Web Development – 3/7/22 Full Stack Web Development is the first program we built and also our most popular. You’ve asked and we listened! Our next Web Development cohort will start on 3/7/2022 and is ENTIRELY VIRTUAL! THESE SEATS WILL GO FAST! As one of the most in-demand jobs in the country, software and web development is the tech career with the newest jobs. In the U.S., there’s: 1.5 million developer jobs* 250,000 of them remain open a high growth rate of 13%*  Data Science – 3/22/22 Our first new Data Science class of 2022 starts Monday 3/22/2022 at our downtown campus at the Vogue building. Why consider pivoting careers to Data Science? #1 job in America from 2016-2020 (Glassdoor*) 650% increase in data science positions since 2012 Nearly 12 million new jobs between 2019 and 2029 31% ten-year growth rate The supply of data scientists remains painfully low compared to the outrageous demand. YOU can help close the gap while launching a fulfilling, secure, and high-paying career – one of the very best in the country! Employers are scrambling to find talent due to a lack of qualified applicants. YOU can help fill the gap while future-proofing your skillset. Have the flexibility, security, and salary that you’ve always wanted in a career. Are you ready to launch your career in tech? Apply today so our admissions team can save your seat and get your name on the list. Our application can be found here. Want to experience Codeup early? Join one of our workshops to get an intro to a specific coding language, learn about our financing options, or maybe even code yourself a resume! All of our events can be located here. We can’t wait to help you launch your career in tech!"
1,5 Books Every Woman In Tech Should Read,"On this International Women’s Day 2022 we wanted to tell stories about women in tech. What better way to do that than celebrate female authors! These women have written phenomenal books in the tech space to tell their stories. These are women who have walked the walk in the tech world and/or offered unique perspectives to not just women in tech, but women in the workplace. This list goes in no particular order as we believe you should add all of these to your kindle or book library asap. You can click on each book image to take you to a purchase page on Amazon. Let’s dive into the list below: Reset: My Fight for Inclusion and Lasting Change by Ellen Pao From the book’s description on Amazon: “In 2015, Ellen K. Pao sued a powerhouse Silicon Valley venture capital firm, calling out workplace discrimination and retaliation against women and other underrepresented groups. Her suit rocked the tech world—and exposed its toxic culture and its homogeneity. Her message overcame negative PR attacks that took aim at her professional conduct and her personal life, and she won widespread public support—Time hailed her as “the face of change.” Though Pao lost her suit, she revolutionized the conversation at tech offices, in the media, and around the world. In Reset, she tells her full story for the first time.”  Female Innovators at Work: Women on Top of Tech by Danielle Newnham From the book’s description on Amazon: “This book describes the experiences and successes of female innovators and entrepreneurs in the still largely male-dominated tech world in twenty candid interviews. It highlights the varied life and career stories that lead these women to the top positions in the technology industry that they are in now. Interviewees include CEOs, founders, and inventors from a wide spectrum of tech organizations across sectors as varied as mobile technology, e-commerce, online education, and video games. 
Interviewer Danielle Newnham, a mobile startup and e-commerce entrepreneur herself as well as an online community organizer, presents the insights, instructive anecdotes, and advice shared with her in the interviews, including stories about raising capital for one’s start-up, and about the obstacles these women encountered and how they overcame them.”  Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech by Sara Wachter-Boettcher From the book’s description on Amazon: “Buying groceries, tracking our health, finding a date: whatever we want to do, odds are that we can now do it online. But few of us ask how all these digital products are designed, or why. It’s time we change that. Many of the services we rely on are full of oversights, biases, and downright ethical nightmares. Chatbots that harass women. Signup forms that fail anyone who’s not straight. Social media sites that send peppy messages about dead relatives. Algorithms that put more black people behind bars. Technically Wrong takes an unflinching look at the values, processes, and assumptions that lead to these problems and more. Wachter-Boettcher demystifies the tech industry, leaving those of us on the other side of the screen better prepared to make informed choices about the services we use – and demand more from the companies behind them.”  Brotopia: Breaking Up the Boys’ Club of Silicon Valley by Emily Chang From the book’s description on Amazon: “Silicon Valley is not a fantasyland of unicorns, virtual reality rainbows, and 3D-printed lollipops for women in tech. Instead, it’s a “Brotopia,” where men hold the cards and make the rules. While millions of dollars may seem to grow on trees in this land of innovation, tech’s aggressive, misogynistic, work-at-all costs culture has shut women out of the greatest wealth creation in the history of the world. 
Brotopia reveals how Silicon Valley got so sexist despite its utopian ideals, why bro culture endures even as its companies claim the moral high ground, and how women are speaking out and fighting back. Drawing on her deep network of Silicon Valley insiders, Chang opens the boardroom doors of male-dominated venture capital firms like Kleiner Perkins, the subject of Ellen Pao’s high-profile gender discrimination lawsuit, and Sequoia, where a partner once famously said they “won’t lower their standards” just to hire women. Exposing the flawed logic in common excuses for why tech has long suffered the “pipeline” problem and invests in the delusion of meritocracy, Brotopia also shows how bias coded into AI, internet troll culture, and the reliance on pattern recognition harms not just women in tech but us all, and at an unprecedented scale.”  Life in Code: A Personal History of Technology by Ellen Ullman From the book’s description on Amazon: “The last twenty years have brought us the rise of the internet, the development of artificial intelligence, the ubiquity of once unimaginably powerful computers, and the thorough transformation of our economy and society. Through it all, Ellen Ullman lived and worked inside that rising culture of technology, and in Life in Code she tells the continuing story of the changes it wrought with a unique, expert perspective. When Ellen Ullman moved to San Francisco in the early 1970s and went on to become a computer programmer, she was joining a small, idealistic, and almost exclusively male cadre that aspired to genuinely change the world. In 1997 Ullman wrote Close to the Machine, the now classic and still definitive account of life as a coder at the birth of what would be a sweeping technological, cultural, and financial revolution. Twenty years later, the story Ullman recounts is neither one of unbridled triumph nor a nostalgic denial of progress. 
It is necessarily the story of digital technology’s loss of innocence as it entered the cultural mainstream, and it is a personal reckoning with all that has changed, and so much that hasn’t. Life in Code is an essential text toward our understanding of the last twenty years–and the next twenty.”  Let us know your thoughts on this list on social media! What books or authors should we add to this list for a future post? Are you a woman who is interested in launching your career in tech? Help us close the gender gap in tech and apply for our Women in Tech scholarship! You can learn more by clicking here. We have a Data Science program that starts on 3/21 and a Web Development program that starts on 4/1. Let us know if you have questions by submitting your application or reaching out to us at admissions@codeup.com!"
2,Dallas Campus Re-opens With New Grant Partner,"We are happy to announce that our Dallas campus re-opened! Better yet, we have a new grant partner that can fund up to $15,000 in tuition for eligible students. Are you a DFW resident who: Is unemployed or under-employed work has been affected by COVID-19 a military veteran under the age of 25 If so, follow the steps below to seek up to $15,000 in tuition funding! Register for an account at www.workintexas.com – click “Sign in” and then select Option 3: Create a User Account Find the nearest Workforce Center to your residence using this tool Attend a WIOA information session – these sessions occur at different times at different locations. Contact your local center for more information after registering at Work in Texas Select Codeup as the program for training, and notify tuition@codeup.com of your progress Our next Web Development class starts January 31st, 2022, and will be the first class in-person in two years! Hands-on experience and learning side-by-side with your fellow classmates is an experience hard to duplicate virtually. COVID-19 protocols are in place to keep all students and staff safe while attending the campus in person. Are you ready to start a new career in the New Year!? Seats are limited, so apply now here or email admissions@codeup.com for more info!"
3,Codeup’s Placement Team Continues Setting Records,"Our Placement Team is simply defined as a group that manages relationships with our employer partners and our graduating students to help get our graduating students hired. Last quarter the Placement Team helped 48 students get hired to life-changing careers in tech. Last month our Placement Team has already placed 40 students with top tech companies. For that, we want to send a huge thank you to both our Placement Team and our Employer Partners who have done a tremendous job of helping Codeup empower life change for these students. Who exactly got hired and where? Check out the list below! Kirsten Collier – hired at CGI as a Java Developer Michael Baker – hired to CGI as a Java Developer Michael Troia – hired to CDW as an Associate Consulting Engineer Carlos Padilla – hired at CGI as a Java Developer Victor G. Hernandez – hired at Anderson Marketing Group as a Java Developer Nicholas Martinez – hired at Seggazza as a Software Engineer Jordan Felan – hired at Seggazza as a Software Engineer Savannah Garcia – hired at CGI as a Java Developer Ricardo Figueroa – hired at CGI as a Java Developer Demetrio Tovar – hired at CGI as a Java Developer Raul Martinez – hired at CGI as a Java Developer Prachi Phatak – hired at Olo as a Software Engineer II Jesse Sosa-Leffew – hired at CGI as a Java Developer Amado Azua – hired at CGI as a Java Developer Grady Griffin – hired at Howard.pro as a Software Developer Stephen Nguyen – hired at Silotech Group as an Associate Software Developer Joshua Borreli – hired at Silotech Group as an Associate Software Developer Erik Ayalasga – hired at Accenture Federal as a Java Developer Christopher Espinoza – hired at Silotech Group as an Associate Software Developer Corey Shaw – hired at CGI as a Java Developer Samuel Bowcut – hired at Silotech Group as an Associate Software Developer Sean Lewis – hired at Accenture Federal as a Software Engineering Associate Robert Sledge – hired at 
Accenture Federal as a Software Engineering Associate Matthew Walker – hired at Accenture Federal as a Software Engineering Associate Matthew Dalton – hired at Oracle as an APEX Developer Dustin Martinez – hired at Accenture Federal as a Software Engineering Associate Kelvon Pointer-Patterson – hired at Accenture Federal as a Java Developer Associate Kenyon Luce – hired at Accenture Federal as a Java Developer Associate Juan Garcia – hired at Accenture Federal as a Software Engineering Associate Alexander Hernando-Avitia – hired at Accenture Federal as a Software Engineering Associate John Pierce – hired at Accenture Federal as a Software Engineering Associate David Wagnon – hired at Social Solutions as an Associate Software Engineer Evan Williams – hired at Social Solutions as an Associate Software Engineer Rose Barcus – hired at Accenture Federal as a Software Engineering Associate David Culver – hired at Accenture Federal as a Software Engineering Associate Austin Whitley – hired at Accenture Federal as a Software Engineering Associate Anna Vu – hired at Sparrow Partners as a Data Scientist Roberty Murphy – hired at Lone Star Analysis as an Analysis Professional Cindy Villanueva – hired at Colorado Community Managed Care Network as a Data Engineer Motchell Higue – hired at USAA as a Full Stack Developer Curious what it’s like to hire a Codeup graduate? Check out our hiring process here. Want to land a life changing career like these students did? Check our list of programs offered here."
4,"IT Certifications 101: Why They Matter, and Why They Don’t","AWS, Google, Azure, Red Hat, CompTIA…these are big names in IT! And not only for their products, but also for the certifications they offer. If you’re new to tech, you might be wondering: Do certifications really matter? Welcome to IT Certifications 101! What’s the tl;dr? Certifications are critical to getting past HR for certain roles, but they won’t get you the job. What are IT Certifications? Certifications are official credentials given by companies, institutions, and vendors to verify that you have demonstrated a specific knowledge or skill. They are a stamp of approval from a known brand (like Amazon) saying “This person knows their stuff.” Certs are used in a lot of industries, including technology. Even within tech, there are dozens of vendors and hundreds of certs. Why do IT Certifications matter to employers? Many roles (including security, system administration, and networking) require certain certs to make it in the door of an interview process. This filter matters for several reasons: it helps narrow down the pool of applicants, it establishes a baseline of shared knowledge or experience across all applicants, and it expedites the process so that employers don’t have to spend as much time vetting basic technical skills. Why do they matter to you? Obtaining certs in IT will provide you with a number of benefits: You will show up in recruiter searches on LinkedIn! You can follow specific, concrete paths for your career development You obtain industry-recognized, respected, and transferable credentials relevant for your whole career You develop relevant and broad-base content knowledge You will be able to pass initial HR screenings and land a seat in the interview Why DON’T they matter to employers and you? At the end of the day, certifications are just pieces of paper. They represent your ability to pass a test, not to do a job. 
Sometimes those things are one and the same, but usually, they are very different. Certifications are like the entrance fee to get into a tournament: you still need to fight your way to the top. And when you’re standing in the ring, it’s better to know how to fight than to have a degree in fighting. Additionally, not all jobs within tech care about certifications at all. For example, there is technically a Java programming certification from Oracle, but it’s almost never required or considered for job applications because it tests very specific, abnormal, and academic concepts that don’t apply to most real-world software development. If Certs matter AND don’t matter, what do I do? At Codeup, we like to think of our IT training as CertificationsPLUS, as in – you get certifications PLUS hands-on experience. Here’s a quick way to think about it: Amazon Web Services has basic certifications called Cloud Practitioner and Solutions Architect Associate. The official Amazon training for these certs is 1 week: 4 days of training and 1 day of exam readiness. The 1-week schedule will present to you mostly everything you need to pass the certification exam, and then it’s on you to go cram and memorize to get ready. Once you pass, you’ll have the cert, but you’ll be a paper tiger: all bark and no bite. In other words, you’ll appear knowledgeable and skillful on paper, but in reality, you won’t have any experience or know-how for completing the real work. Alternatively, Codeup spends 5 weeks on AWS, combining exam study and hands-on practice. That way, you can talk the talk and walk the walk. If you want to learn more about IT certification and training paths, reach out to us at info@codeup.com!"


In [10]:
'''
def get_blog_articles():
    output = {}
    output['title'] = soup.select_one('h1.entry-title').text
    output['contents'] = soup.select_one('div.entry-content').text
    return output
    '''

---

# 2. News Articles

We will now be scraping text data from inshorts, a website that provides a brief overview of many different topics.

Write a function that scrapes the news articles for the following topics:

- Business
- Sports
- Technology
- Entertainment

The end product of this should be a function named get_news_articles that returns a list of dictionaries, where each dictionary has this shape:

In [21]:
{
    'title': 'The article title',
    'content': 'The article content',
    'category': 'business' # for example
}

{'title': 'The article title',
 'content': 'The article content',
 'category': 'business'}

Hints:

- Start by inspecting the website in your browser. Figure out which elements will be useful.
- Start by creating a function that handles a single article and produces a dictionary like the one above.
- Next create a function that will find all the articles on a single page and call the function you created in the last step for every article on the page.
- Now create a function that will use the previous two functions to scrape the articles from all the pages that you need, and do any additional processing that needs to be done.
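
The three hints above could be structured roughly as follows. This is a sketch under stated assumptions, not a working inshorts scraper: the selectors `div.news-card`, `[itemprop="headline"]`, and `[itemprop="articleBody"]` are hypothetical and must be checked against the live page, the function names are illustrative, and the sample HTML is made up. The functions take HTML strings so the structure can be tried without any network requests.

```python
from bs4 import BeautifulSoup


def parse_news_card(card, category):
    """Hint 1: turn a single article element into a dictionary."""
    title = card.select_one('[itemprop="headline"]')        # hypothetical selector
    body = card.select_one('[itemprop="articleBody"]')      # hypothetical selector
    return {
        'title': title.get_text(strip=True) if title else None,
        'content': body.get_text(strip=True) if body else None,
        'category': category,
    }


def parse_news_page(html, category):
    """Hint 2: find every article on one page and parse each of them."""
    soup = BeautifulSoup(html, 'html.parser')
    cards = soup.select('div.news-card')                     # hypothetical selector
    return [parse_news_card(card, category) for card in cards]


def collect_news_articles(pages):
    """Hint 3: combine all pages; pages maps category name -> page HTML."""
    articles = []
    for category, html in pages.items():
        articles.extend(parse_news_page(html, category))
    return articles


# Made-up pages standing in for the downloaded category pages
sample_pages = {
    'business': (
        '<div class="news-card"><span itemprop="headline">Markets rise</span>'
        '<div itemprop="articleBody">Stocks closed higher.</div></div>'
    ),
    'sports': (
        '<div class="news-card"><span itemprop="headline">Team wins</span>'
        '<div itemprop="articleBody">A late goal decided it.</div></div>'
        '<div class="news-card"><span itemprop="headline">Coach retires</span>'
        '<div itemprop="articleBody">The announcement came today.</div></div>'
    ),
}
news = collect_news_articles(sample_pages)
print(len(news), news[0])
```

A get_news_articles function would then request each category page and pass the downloaded HTML into `collect_news_articles`.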

In [22]:
def get_news_articles():
    # TODO: not implemented yet; see the hints above for a suggested structure
    pass

In [None]:
url = 'https://inshorts.com/en/read'
response = get(url, headers={'user-agent': 'Codeup DS'})
soup = BeautifulSoup(response.text, 'html.parser')

In [None]:
print(response.text[:400])

---

# 3. Bonus: cache the data

Write your code such that the acquired data is saved locally in some form or fashion. Your functions that retrieve the data should prefer to read the local data instead of having to make all the requests every time the function is called. Include a boolean flag in the functions to allow the data to be acquired "fresh" from the actual sources (re-writing your local cache).
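
A minimal sketch of one way to do this, assuming the scraped data is a list of dictionaries that can be stored as JSON. The names `get_cached_articles`, `fetch_articles`, `cache_path`, and `fresh` are illustrative choices, not part of the exercise. The scraping function is passed in as an argument so the same wrapper could serve both the blog and the news functions.

```python
import json
import os


def get_cached_articles(fetch_articles, cache_path='articles.json', fresh=False):
    """Return articles from cache_path when it exists, unless fresh=True.

    fetch_articles is any zero-argument function that returns a
    JSON-serializable list of dictionaries (for example, a scraper).
    """
    # Prefer the local copy unless the caller asks for fresh data
    if not fresh and os.path.exists(cache_path):
        with open(cache_path, encoding='utf-8') as f:
            return json.load(f)

    # Otherwise acquire the data and (re)write the cache
    articles = fetch_articles()
    with open(cache_path, 'w', encoding='utf-8') as f:
        json.dump(articles, f, ensure_ascii=False, indent=2)
    return articles
```

Calling it with `fresh=True` skips the cache and overwrites the file with newly acquired data. The function given as `fetch_articles` needs to return plain lists and dictionaries, not a styled data frame, for the JSON step to work.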