# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [1]:
# Import Splinter and BeautifulSoup
from splinter import Browser
from bs4 import BeautifulSoup as soup
from webdriver_manager.chrome import ChromeDriverManager

In [4]:
executable_path = {'executable_path': ChromeDriverManager().install()}


In [5]:
browser = Browser('chrome', **executable_path, headless=False)

### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [6]:
# Visit the Mars NASA news site: https://redplanetscience.com
mars_news_url = 'https://redplanetscience.com'
browser.visit(mars_news_url)

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.

In [7]:
# Create a Beautiful Soup object
mars_news_html = browser.html
mars_soup = soup(mars_news_html, 'html.parser')

In [8]:
# Extract all the text elements
# Extract news title
title = mars_soup.find_all('div', class_='content_title')
title

[<div class="content_title">NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet Light</div>,
 <div class="content_title">NASA's Curiosity Takes Selfie With 'Mary Anning' on the Red Planet</div>,
 <div class="content_title">HiRISE Views NASA's InSight and Curiosity on Mars</div>,
 <div class="content_title">3 Things We've Learned From NASA's Mars InSight </div>,
 <div class="content_title">NASA's Mars Perseverance Rover Passes Flight Readiness Review</div>,
 <div class="content_title">With Mars Methane Mystery Unsolved, Curiosity Serves Scientists a New One: Oxygen</div>,
 <div class="content_title">7 Things to Know About the Mars 2020 Perseverance Rover Mission</div>,
 <div class="content_title">NASA's Curiosity Keeps Rolling As Team Operates Rover From Home</div>,
 <div class="content_title">NASA's New Mars Rover Will Use X-Rays to Hunt Fossils</div>,
 <div class="content_title">NASA's Mars Perseverance Rover Gets Its Sample Handling System</div>,
 <div class="content_title

In [9]:
# Extract news preview
preview = mars_soup.find_all('div', class_='article_teaser_body')
preview

[<div class="article_teaser_body">Vast areas of the Martian night sky pulse in ultraviolet light, according to images from NASA’s MAVEN spacecraft. The results are being used to illuminate complex circulation patterns in the Martian atmosphere.</div>,
 <div class="article_teaser_body">The Mars rover has drilled three samples of rock in this clay-enriched region since arriving in July.</div>,
 <div class="article_teaser_body">New images taken from space offer the clearest orbital glimpse yet of InSight as well as a view of Curiosity rolling along.</div>,
 <div class="article_teaser_body">Scientists are finding new mysteries since the geophysics mission landed two years ago.</div>,
 <div class="article_teaser_body">​The agency's Mars 2020 mission has one more big prelaunch review – the Launch Readiness Review, on July 27.</div>,
 <div class="article_teaser_body">For the first time in the history of space exploration, scientists have measured the seasonal changes in the gases that fill th

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary, and give each dictionary two keys: `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!",
   'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.
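
The storage steps above can be sketched with plain strings, before any scraping is involved. This is a minimal illustration, not the required solution: the helper name `pair_articles` and the sample strings are hypothetical, and stripping surrounding whitespace is an optional choice made here for tidiness.

```python
def pair_articles(titles, previews):
    """Pair each title with its preview text, in order."""
    # zip stops at the shorter input, so an unmatched title or preview is dropped
    return [
        {'title': t.strip(), 'preview': p.strip()}
        for t, p in zip(titles, previews)
    ]


# Hypothetical sample data standing in for scraped text
sample_titles = ["Mars Rover Begins Mission!", "Second Headline "]
sample_previews = ["NASA's Mars Rover begins a multiyear mission.", "Second preview."]

articles = pair_articles(sample_titles, sample_previews)
print(articles)
```

Pairing with `zip` avoids index bookkeeping, at the cost of silently ignoring extra items if the two lists differ in length; compare the lengths first if that case matters.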

In [10]:
# Create an empty list to store the dictionaries
news_list = []

In [12]:
# Loop through the text elements
# Extract the title and preview text from the elements
# Store each title and preview pair in a dictionary
# Add the dictionary to the list
for title_elem, preview_elem in zip(title, preview):
    news_dict = {'title': title_elem.text, 'preview': preview_elem.text}
    news_list.append(news_dict)

In [15]:
# Print the list to confirm success
from pprint import pprint
pprint(news_list)

[{'preview': 'Vast areas of the Martian night sky pulse in ultraviolet light, '
             'according to images from NASA’s MAVEN spacecraft. The results '
             'are being used to illuminate complex circulation patterns in the '
             'Martian atmosphere.',
  'title': "NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet "
           'Light'},
 {'preview': 'The Mars rover has drilled three samples of rock in this '
             'clay-enriched region since arriving in July.',
  'title': "NASA's Curiosity Takes Selfie With 'Mary Anning' on the Red "
           'Planet'},
 {'preview': 'New images taken from space offer the clearest orbital glimpse '
             'yet of InSight as well as a view of Curiosity rolling along.',
  'title': "HiRISE Views NASA's InSight and Curiosity on Mars"},
 {'preview': 'Scientists are finding new mysteries since the geophysics '
             'mission landed two years ago.',
  'title': "3 Things We've Learned From NASA's Mars InSi

In [16]:
browser.quit()

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database so that it is easier to share with others. To do so, export the scraped data to either a JSON file or a MongoDB database.
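
As a rough sketch of the JSON option, the snippet below writes a list of dictionaries to a file and reads it back to confirm that nothing was lost. The sample data, the file name `sample_news.json`, and the use of a temporary directory are all assumptions made for illustration; they are not part of the assignment.

```python
import json
import os
import tempfile

# Hypothetical sample standing in for the scraped list of dictionaries
sample_news = [
    {'title': "Mars Rover Begins Mission!",
     'preview': "NASA's Mars Rover begins a multiyear mission."},
]

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "sample_news.json")

    # ensure_ascii=False keeps characters such as curly quotes readable in the file
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(sample_news, f, ensure_ascii=False, indent=2)

    # Read the file back to check the round trip
    with open(out_path, encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded == sample_news)
```

Reading the file back is a cheap way to verify the export before sharing it.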

In [18]:
# Export data to JSON
import json

# The context manager closes the file even if writing fails
with open("Mars_News.json", "w") as json_file:
    json.dump(news_list, json_file)

In [None]:
# Export data to MongoDB
