# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [1]:
# Import Splinter and BeautifulSoup
from splinter import Browser
from bs4 import BeautifulSoup
from webdriver_manager.chrome import ChromeDriverManager

# Imports used for exporting
import json
import pymongo

In [2]:
# Install a matching ChromeDriver and open a visible (non-headless) Chrome window
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [3]:
# Visit the Mars NASA news site: https://redplanetscience.com
url = 'https://redplanetscience.com/'

browser.visit(url)

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.
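
As a minimal sketch of this extraction pattern, the example below parses a small hard-coded HTML string instead of the live page. The markup is a hypothetical stand-in that reuses the class names from this notebook (`col-md-8`, `content_title`, `article_teaser_body`), and `extract_articles` is an illustrative helper name, not part of Beautiful Soup. It skips any container where either element is missing, because `find` returns `None` when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the page structure assumed in this notebook
SAMPLE_HTML = """
<div class="col-md-8">
  <div class="content_title">Sample Title One</div>
  <div class="article_teaser_body">Sample preview one.</div>
</div>
<div class="col-md-8">
  <div class="content_title">Title Without a Preview</div>
</div>
<div class="col-md-8">
  <div class="content_title">Sample Title Two</div>
  <div class="article_teaser_body">Sample preview two.</div>
</div>
"""


def extract_articles(html):
    """Return (title, preview) tuples for containers that have both elements."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for section in soup.find_all('div', class_='col-md-8'):
        title = section.find('div', class_='content_title')
        preview = section.find('div', class_='article_teaser_body')
        # find() returns None when no element matches, so skip incomplete containers
        if title is None or preview is None:
            continue
        results.append((title.get_text(strip=True), preview.get_text(strip=True)))
    return results


articles = extract_articles(SAMPLE_HTML)
for title, preview in articles:
    print(title)
    print(preview)
    print('----------')
```

Note that `get_text(strip=True)` trims surrounding whitespace, whereas `.text` keeps it, so the two can return slightly different strings for the same element.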

In [4]:
# Create a Beautiful Soup object
html = browser.html
soup = BeautifulSoup(html, 'html.parser')

# Select the containers that hold each article's title and preview
sections = soup.find_all('div', class_='col-md-8')

# Print the title and preview text of each article
for x in sections:
    title = x.find('div', class_='content_title').text
    preview_text = x.find('div', class_='article_teaser_body').text
    print(title)
    print(preview_text)
    print('----------')

Q&A with the Student Who Named Ingenuity, NASA's Mars Helicopter
As a longtime fan of space exploration, Vaneeza Rupani appreciates the creativity and collaboration involved with trying to fly on another planet.
----------
3 Things We've Learned From NASA's Mars InSight 
Scientists are finding new mysteries since the geophysics mission landed two years ago.
----------
The Man Who Wanted to Fly on Mars
The Mars Helicopter is riding to the Red Planet this summer with NASA's Perseverance rover. The helicopter's chief engineer, Bob Balaram, shares the saga of how it came into being.
----------
NASA to Hold Mars 2020 Perseverance Rover Launch Briefing
Learn more about the agency's next Red Planet mission during a live event on June 17.
----------
NASA's Perseverance Rover 100 Days Out
Mark your calendars: The agency's latest rover has only about 8,640,000 seconds to go before it touches down on the Red Planet, becoming history's next Mars car.
----------
NASA's MAVEN Maps Winds in the Marti

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary, and give each dictionary two keys: `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!",
   'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.
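
The target structure can be sketched without any scraping. In the example below, `scraped_pairs` holds made-up placeholder data and all variable names are illustrative only; the point is the shape of the result, a list of dictionaries that each have the keys `title` and `preview`.

```python
# Placeholder (title, preview) pairs standing in for scraped text
scraped_pairs = [
    ("Mars Rover Begins Mission!",
     "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."),
    ("Placeholder Second Title", "Placeholder second preview."),
]

# Build one dictionary per pair, each with the keys 'title' and 'preview'
article_dicts = [
    {'title': title, 'preview': preview}
    for title, preview in scraped_pairs
]

# Print the list
print(article_dicts)
```

A list comprehension is used here for brevity; an explicit loop with `append` produces the same list.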

In [5]:
# Create an empty list to store the dictionaries
mars_dict_list = []


In [6]:
# Loop through the text elements
for x in sections:
    
    # Extract the title and preview text from the elements
    title = x.find('div', class_='content_title').text
    preview_text = x.find('div', class_='article_teaser_body').text
    
    # Store each title and preview pair in a dictionary
    new_dict = {
        'title': title,
        'preview': preview_text
    }
    
    # Add the dictionary to the list
    mars_dict_list.append(new_dict)


In [7]:
# Print the list to confirm success
mars_dict_list

[{'title': "Q&A with the Student Who Named Ingenuity, NASA's Mars Helicopter",
  'preview': 'As a longtime fan of space exploration, Vaneeza Rupani appreciates the creativity and collaboration involved with trying to fly on another planet.'},
 {'title': "3 Things We've Learned From NASA's Mars InSight ",
  'preview': 'Scientists are finding new mysteries since the geophysics mission landed two years ago.'},
 {'title': 'The Man Who Wanted to Fly on Mars',
  'preview': "The Mars Helicopter is riding to the Red Planet this summer with NASA's Perseverance rover. The helicopter's chief engineer, Bob Balaram, shares the saga of how it came into being."},
 {'title': 'NASA to Hold Mars 2020 Perseverance Rover Launch Briefing',
  'preview': "Learn more about the agency's next Red Planet mission during a live event on June 17."},
 {'title': "NASA's Perseverance Rover 100 Days Out",
  'preview': "Mark your calendars: The agency's latest rover has only about 8,640,000 seconds to go before it touch

In [8]:
browser.quit()

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database so that it is easier to share with others. To do so, export the data to either a JSON file or a MongoDB database.
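
For the JSON option, one way to check an export is to read the file back and compare it with the original list. The sketch below uses placeholder records and a hypothetical file name (`example_export.json`) inside a temporary directory, so it does not touch any real output file.

```python
import json
import os
import tempfile

# Placeholder records in the same shape as the scraped list of dictionaries
records = [
    {'title': 'Placeholder Title', 'preview': 'Placeholder preview text.'},
]

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example_export.json')

    # Export the records to a JSON file
    with open(path, 'w', encoding='utf-8') as json_file:
        json.dump(records, json_file, indent=2)

    # Read the file back to confirm that nothing was lost
    with open(path, 'r', encoding='utf-8') as json_file:
        loaded = json.load(json_file)

# True when the round trip preserved every record
print(loaded == records)
```

Passing `indent=2` to `json.dump` only affects how the file is laid out for human readers; the data that `json.load` returns is the same either way.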

In [9]:
# Preview the data as a formatted JSON string
mars_json = json.dumps(mars_dict_list, indent=2)
print(mars_json)

# Export the data to a JSON file
with open('mars_scrape.json', 'w') as json_file:
    json.dump(mars_dict_list, json_file)

[
  {
    "title": "Q&A with the Student Who Named Ingenuity, NASA's Mars Helicopter",
    "preview": "As a longtime fan of space exploration, Vaneeza Rupani appreciates the creativity and collaboration involved with trying to fly on another planet."
  },
  {
    "title": "3 Things We've Learned From NASA's Mars InSight ",
    "preview": "Scientists are finding new mysteries since the geophysics mission landed two years ago."
  },
  {
    "title": "The Man Who Wanted to Fly on Mars",
    "preview": "The Mars Helicopter is riding to the Red Planet this summer with NASA's Perseverance rover. The helicopter's chief engineer, Bob Balaram, shares the saga of how it came into being."
  },
  {
    "title": "NASA to Hold Mars 2020 Perseverance Rover Launch Briefing",
    "preview": "Learn more about the agency's next Red Planet mission during a live event on June 17."
  },
  {
    "title": "NASA's Perseverance Rover 100 Days Out",
    "preview": "Mark your calendars: The agency's latest rover 

In [10]:
# Export data to MongoDB
conn = 'mongodb://localhost:27017'
client = pymongo.MongoClient(conn)

# Define database and collection
db = client.mars_db
collection = db.mars_scrape

# Insert each dictionary into MongoDB as a document
for x in mars_dict_list:
    collection.insert_one(x)



In [11]:
# Display the MongoDB records created above
articles = db.mars_scrape.find()
for article in articles:
    print(article)

{'_id': ObjectId('63ac9d6c1a0564db742e5e44'), 'title': "Three New Views of Mars' Moon Phobos", 'preview': "Taken with the infrared camera aboard NASA's Odyssey orbiter, they reveal temperature variations on the small moon as it drifts into and out of Mars’ shadow."}
{'_id': ObjectId('63ac9d6c1a0564db742e5e45'), 'title': 'AI Is Helping Scientists Discover Fresh Craters on Mars', 'preview': "It's the first time machine learning has been used to find previously unknown craters on the Red Planet."}
{'_id': ObjectId('63ac9d6c1a0564db742e5e46'), 'title': 'Join NASA for the Launch of the Mars 2020 Perseverance Rover', 'preview': 'No matter where you live, choose from a menu of activities to join NASA as we "Countdown to Mars" and launch the Perseverance rover to the Red Planet.'}
{'_id': ObjectId('63ac9d6c1a0564db742e5e47'), 'title': "NASA's Perseverance Rover Is Midway to Mars ", 'preview': "Sometimes half measures can be a good thing – especially on a journey this long. The agency's latest 