In [12]:
class Content:
    """Common base class for all articles/pages"""

    def __init__(self, topic, url, title, body):
        self.topic = topic
        self.title = title
        self.body = body
        self.url = url

    def print(self):
        """
        Flexible printing function controls output
        """
        print('New article found for topic: {}'.format(self.topic))
        print('URL: {}'.format(self.url))
        print('TITLE: {}'.format(self.title))
        print('BODY:\n{}'.format(self.body))

In [13]:
class Website:
    """Contains information about website structure"""

    def __init__(self, name, url, searchUrl, resultListing, resultUrl, absoluteUrl, titleTag, bodyTag):
        self.name = name
        self.url = url
        self.searchUrl = searchUrl
        self.resultListing = resultListing
        self.resultUrl = resultUrl
        self.absoluteUrl = absoluteUrl
        self.titleTag = titleTag
        self.bodyTag = bodyTag

In [14]:
import requests
from bs4 import BeautifulSoup

class Crawler:

    def getPage(self, url):
        try:
            req = requests.get(url)
        except requests.exceptions.RequestException:
            return None
        return BeautifulSoup(req.text, 'html.parser')

    def safeGet(self, pageObj, selector):
        childObj = pageObj.select(selector)
        if childObj is not None and len(childObj) > 0:
            return childObj[0].get_text()
        return ''

    def search(self, topic, site):
        """
        Searches a given website for a given topic and records all pages found
        """
        bs = self.getPage(site.searchUrl + topic)
        searchResults = bs.select(site.resultListing)
        for result in searchResults:
            url = result.select(site.resultUrl)[0].attrs['href']
            # Check to see whether it's a relative or an absolute URL
            if(site.absoluteUrl):
                bs = self.getPage(url)
            else:
                bs = self.getPage(site.url + url)
            if bs is None:
                print('Something was wrong with that page or URL. Skipping!')
                return
            title = self.safeGet(bs, site.titleTag)
            body = self.safeGet(bs, site.bodyTag)
            if title != '' and body != '':
                content = Content(topic, title, body, url)
                content.print()


crawler = Crawler()

siteData = [
    ['O\'Reilly Media', 'http://oreilly.com', 'https://ssearch.oreilly.com/?q=',
        'article.product-result', 'p.title a', True, 'h1', 'section#product-description'],
    ['Reuters', 'http://reuters.com', 'http://www.reuters.com/search/news?blob=', 'div.search-result-content',
        'h3.search-result-title a', False, 'h1', 'div.StandardArticleBody_body_1gnLA'],
    ['Brookings', 'http://www.brookings.edu', 'https://www.brookings.edu/search/?s=',
        'div.list-content article', 'h4.title a', True, 'h1', 'div.post-body']
]
sites = []
for row in siteData:
    sites.append(Website(row[0], row[1], row[2],
                         row[3], row[4], row[5], row[6], row[7]))

topics = ['python', 'data science']
for topic in topics:
    print('GETTING INFO ABOUT: ' + topic)
    for targetSite in sites:
        crawler.search(topic, targetSite)

GETTING INFO ABOUT: python
New article found for topic: python
URL: Learning Python, 5th Edition 
TITLE: 
Get a comprehensive, in-depth introduction to the core Python language with this hands-on book. Based on author Mark Lutz’s popular training course, this updated fifth edition will help you quickly write efficient, high-quality code with Python. It’s an ideal way to begin, whether you’re new to programming or a professional developer versed in other languages. 

Complete with quizzes, exercises, and helpful illustrations,  this easy-to-follow, self-paced tutorial gets you started with both Python 2.7 and 3.3— the latest releases in the 3.X  and 2.X lines—plus all other releases in common use today. You’ll also learn some advanced language features that recently have become more common in Python code.

Explore Python’s major built-in object types such as numbers, lists, and dictionaries 
Create and process objects with Python statements, and learn Python’s general syntax model
Use f

New article found for topic: python
URL: Leveraging the disruptive power of artificial intelligence for fairer opportunities
TITLE: 
According to President Obama’s Council of Economic Advisers (CEA), approximately 3.1 million jobs will be rendered obsolete or permanently altered as a consequence of artificial intelligence technologies. Artificial intelligence (AI) will, for the foreseeable future, have a significant disruptive impact on jobs. That said, this disruption can create new opportunities if policymakers choose to harness them—including some with the potential to help address long-standing social inequities. Investing in quality training programs that deliver premium skills, such as computational analysis and cognitive thinking, provides a real opportunity to leverage AI’s disruptive power.

Author






Makada Henry-Nickie
David M. Rubenstein Fellow - Governance Studies, Race, Prosperity, and Inclusion Initiative

 Twitter
mhnickie





AI’s disruption presents a clear challe

New article found for topic: python
URL: Skills, success, and why your choice of college matters
TITLE: 


Amidst growing frustration with the cost of higher education, complaints also abound about its quality. One critique, launched in the book Academically Adrift by two sociologists, finds little evidence that college students score better on measures of critical thinking, writing, and reasoning after attending college. This is something of a paradox, since strong evidence shows that attending college tends to raise earnings power, even for students who start with mediocre preparation. 
Our recent study uses a different approach to assess the value of a college education. We find that the particular skills listed by a college’s alumni on their resumes predict how well graduates from those schools perform in terms of earning a living, meeting debt obligations, and working for high-paying or innovative companies. Since jobs requiring more valuable skills typically require at least some

New article found for topic: python
URL: Modeling with Data: Tools and Techniques for Scientific Computing
TITLE: 

PREFACE


BODY:
https://www.brookings.edu/articles/modeling-with-data-tools-and-techniques-for-scientific-computing/
New article found for topic: python
URL: Forum: Debating Bush’s Wars
TITLE: 

In the 

Winter 2007–08 issue 
of Survival, Philip Gordon argued that America’s strategy against terror is failing ‘because the Bush administration chose to wage the wrong war’. Survival invited former Bush speechwriter and Deputy Assistant to the President Peter Wehner and Kishore Mahbubani, Dean and Professor at the Lee Kuan Yew School of Public Policy in Singapore, to reflect on Gordon’s arguments. Their 
comments are available in the above PDF and Philip Gordon’s response is below.

BODY:
https://www.brookings.edu/articles/forum-debating-bushs-wars/
New article found for topic: python
URL: Appointments Apocalypse
TITLE: 

BODY:
https://www.brookings.edu/opinions/appointments-a

New article found for topic: data science
URL: Data Science Fundamentals for Marketing and Business Professionals
TITLE: 
		What do data scientists and analysts do? What software languages do they use and what soft and hard skills are required? Data science evangelist Tomi Mester answers these questions and more in this peek into the work world of data professionals.

You'll get an introduction into how to use coding, statistics and business thinking for data projects. You’ll see a demonstration of data science's three essential languages (SQL, Python, and R). As you explore the types of business thinking that data professionals use, Tomi will show you the statistical tools and methods data scientists and analysts use in their jobs, and you'll learn about the pathways you can take to become a data scientist.

 Understand what data scientists and analysts do, how they work, and how they think
 Learn about the three data languages every data scientist and analyst must know
 Improve your 

New article found for topic: data science
URL: Artificial intelligence and data analytics in India
TITLE: 
Advances in artificial intelligence and data analytics are propelling innovation in many parts of the world.[1] China, for example, has committed $150 billion towards its goal of becoming a world leader by 2030.[2] And while the United States government is investing only $1.1 billion in non-classified AI research, its private sector is spending billions in fields from finance and healthcare to retail and defense.[3] This is transforming a number of different sectors.[4]

Authors






Shamika Ravi
Director of Research - Brookings India Senior Fellow - Governance Studies Senior Fellow - Brookings India

 Twitter
@ShamikaRavi








Darrell M. West
Vice President and Director - Governance Studies Founding Director - Center for Technology Innovation

 Twitter
@DarrWest





Yet India is playing catch-up in these vital areas. It devotes only 0.6 percent of GDP to R&D, well below the 

New article found for topic: data science
URL: The opportunities and challenges of data analytics in health care
TITLE: 
Data analytics tools have the potential to transform health care in many different ways. In the near future, routine doctor’s visits may be replaced by regularly monitoring one’s health status and remote consultations. The inpatient setting will be improved by more sophisticated quality metrics drawn from an ecosystem of interconnected digital health tools. The care patients receive may be decided in consultation with decision support software that is informed not only by expert judgments but also by algorithms that draw on information from patients around the world, some of whom will differ from the “typical” patient. Support may be customized for an individual’s personal genetic information, and doctors and nurses will be skilled interpreters of advanced ways to diagnose, track, and treat illnesses. In a number of different ways, policymakers are likely to have new

New article found for topic: data science
URL: Charts of the week: Advancing women and girls in science
TITLE: 
“On this International Day, I urge commitment to end bias, greater investments in science, technology, engineering and math education for all women and girls as well as opportunities for their careers and longer-term professional advancement so that all can benefit from their ground-breaking future contributions.” — UN Secretary-General António Guterres
Three years ago, the UN proclaimed February 11 the International Day of Women and Girls in Science. This new designation was part of a larger effort toward closing gender gaps around the globe, as outline in the 2030 Sustainable Development Goals. Though more women are pursuing careers in science, technology, engineering, and mathematics (STEM), it is clear that gender gaps in these fields—and harmful biases– persist today.
Highlighted below are charts and commentary from Brookings experts on the state of gender equity in STEM

New article found for topic: data science
URL: Using big data to link poor farmers to finance
TITLE: 
Two billion adults in the world are excluded from credit. The situation is especially bad for small farmers in rural areas who are unable to access loans to invest in their farms, trapped in a vicious circle of low productivity, low yields, and poor income. The Initiative for Smallholder Finance estimates that smallholders globally access just $50 billion of the $200 billion of lending that they require to grow their operations and improve their lives.	
Authors




R



Roy Parizat
Senior Economist - World Bank





H



Heinz-Wilhelm Strubenhoff
Agribusiness Program Manager, World Bank Group





The global growth of microfinance banks has created new opportunities for financial inclusion, with outstanding lending of $100 billion to around 200 million clients. Yet the majority of lending from microfinance institutions has been to urban populations and not to the rural poor or small fa

New article found for topic: data science
URL: Using big data and artificial intelligence to accelerate global development
TITLE: 
When U.N. member states unanimously adopted the 2030 Agenda in 2015, the narrative around global development embraced a new paradigm of sustainability and inclusion—of planetary stewardship alongside economic progress, and inclusive distribution of income. This comprehensive agenda—merging social, economic and environmental dimensions of sustainability—is not supported by current modes of data collection and data analysis, so the report of the High-Level Panel on the post-2015 development agenda called for a “data revolution” to empower people through access to information.1

Authors




J



Jennifer L. Cohen
Assistant to the Vice President and Director, Global Economy and Development Program - The Brookings Institution







Homi Kharas
Interim Vice President and Director - Global Economy and Development




Today, a central development problem is that h

New article found for topic: data science
URL: Identifying student readiness through science learning progressions in the Philippines
TITLE: 
Learning progressions have been described as roadmaps that help align curriculum, pedagogy, and assessment. These roadmaps point out important locations on the typical journey from novice to expert by describing what someone at each important location on the roadmap knows and can do. The descriptions highlight what is unique about each location, ensuring that differences between locations are emphasized so the transformations in skills and knowledge along the learning journey can be recognized. By mapping the journey, learning progressions help ensure the curriculum developers, teachers, and assessment designers are all working with the same destination in mind.
Increasingly, learning progressions are being used both to construct assessment tools that will reflect the learning pathway, as well as to locate how students are meeting learning goals.

New article found for topic: data science
URL: The “smart society” of the future doesn’t look like science fiction
TITLE: 

Authors






Bhaskar Chakravorti
Non-Resident Senior Fellow - Brookings India





R



Ravi Shankar Chaturvedi
Associate Director for Research - Fletcher’s Institute for Business in the Global Context, Tufts University Doctoral Research Fellow for Innovation - Fletcher’s Institute for Business in the Global Context, Tufts University




What is a “smart” society? While flights of imagination from science-fiction writers, filmmakers, and techno-futurists involve things like flying cars and teleportation, in practice smart technology is making inroads in a piecemeal fashion, often in rather banal circumstances. In Chicago, for example, predictive analytics is improving health inspections schedules in restaurants, while in Boston city officials are collaborating with Waze, the traffic navigation app company, combining its data with inputs from street cameras and se

New article found for topic: data science
URL: Controlling Cambridge Analytica: Managing the new risks of personal data collection
TITLE: 
Revelations continue to surface about how broadly companies share personal data. This week, The New York Times reports how Facebook shared vast amounts of its user’s personal information with phone and mobile device makers.

Authors






Micah Altman
Former Brookings Expert Head/Scientist, Program on Information Science - Massachusetts Institute of Technology

 Twitter
drmaltman








Alexandra Wood
Fellow, Berkman Klein Center for Internet & Society - Harvard University




Collection of personal information has expanded beyond what anyone could have imagined. Information technologies, such as our always-on internet connections, fitness trackers, and mobile phones make it easier to automatically collect information from a wide range of human activities at frequent intervals. And, increasingly, corporations and governments are collecting, analyzi

New article found for topic: data science
URL: Views of AI, robots, and automation based on internet search data
TITLE: 
Executive Summary
Artificial intelligence, robots, and automation are rising in importance in many areas. As noted in the recent book, “The Future of Work:  Robots, AI, and Automation,” there are exciting advances in finance, transportation, national defense, smart cities, and health care, among other areas. Businesses are developing solutions that improve the efficiency and effectiveness of their operations and using these tools to improve the way their firms function.
Yet there also are concerns about the impact of these developments on jobs and personal privacy. A Pew Research Center national survey revealed considerable unease about emerging trends. It found that 65 percent of American adults think that in 50 years, robots and computers “will do much of the work currently done by humans.”1 A 2018 Brookings survey found that 49 percent on adult online users worry 

New article found for topic: data science
URL: Is college choice impacted by data in the College Scorecard?
TITLE: 
National Decision Day—the day thousands of high school students finalize their college plans—is around the corner.  Colleges want to know whether students will attend their institution and after the dust settles, they also want to know how students arrived at those decisions.  Getting inside a teenager’s head is both scary and critical for education policy.  Research has shown that, nationally, approximately 41 percent of prospective college students enroll in a college that is of lower academic quality than they could attend (“under-match”).  This is of concern because attending lower academic quality institutions may decrease the likelihood that students will persist to graduation and, as one well-known study points out, the benefits of a high academic quality education do not necessarily come at an increased cost.  So why do high schoolers make seemingly sub-optimal ed