# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [9]:
# Import Splinter and BeautifulSoup
from splinter import Browser
from bs4 import BeautifulSoup as soup
from webdriver_manager.chrome import ChromeDriverManager
import json
from pprint import pprint
import pymongo

In [10]:
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [11]:
# Visit the Mars NASA news site: https://redplanetscience.com
url = 'https://redplanetscience.com'
browser.visit(url)

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.

In [12]:
# Create a Beautiful Soup object
html = browser.html
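# Note: this rebinds the name `soup` from the BeautifulSoup class to the parsed page,
# so rerun the import cell before rerunning this cell
soup = soup(html, 'html.parser')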
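
To experiment with the parsing step without launching a browser, the same Beautiful Soup calls can be tried on a small HTML string. The sketch below is a minimal illustration, not the site's actual markup: `sample_html` and its placeholder text are made up, and only the class names `list_text`, `content_title`, and `article_teaser_body` come from this notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page is more complex.
sample_html = """
<div class="list_text">
  <div class="content_title">First placeholder title</div>
  <div class="article_teaser_body">First placeholder preview.</div>
</div>
<div class="list_text">
  <div class="content_title">Second placeholder title</div>
  <div class="article_teaser_body">Second placeholder preview.</div>
</div>
"""

# Parse the string with Python's built-in HTML parser
sample_soup = BeautifulSoup(sample_html, 'html.parser')

# find_all returns every element whose class attribute includes 'list_text'
articles = sample_soup.find_all(class_='list_text')
print(len(articles))

# find returns the first matching descendant, or None when nothing matches
first_title = articles[0].find('div', class_='content_title')
print(first_title.text)
```

Because `find` returns `None` when an element is missing, calling `.text` on the result raises an `AttributeError` for any article that lacks the expected `div`.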

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary with two keys: `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!",
   'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.
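
The structure described above can be sketched in plain Python before any scraping happens. The pairs here are placeholders (the first is the example from this section, the second is invented), and `raw_pairs` and `records` are illustrative names, not part of the assignment. Calling `strip()` is an optional extra that trims stray whitespace around scraped text.

```python
from pprint import pprint

# Placeholder (title, preview) pairs standing in for scraped text
raw_pairs = [
    ("Mars Rover Begins Mission!",
     "NASA's Mars Rover begins a multiyear mission to collect data "
     "about the little-explored planet."),
    ("  Placeholder title  ", "Placeholder preview. "),
]

# One dictionary per article, each with the keys 'title' and 'preview'
records = [
    {'title': title.strip(), 'preview': preview.strip()}
    for title, preview in raw_pairs
]

# pprint wraps structures that do not fit on one line
pprint(records)
```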

In [13]:
# Extract all the text elements
list_text_list = soup.find_all(class_='list_text')

# Create an empty list to store the dictionaries
title_preview_list = []

# Loop through the text elements
for list_text in list_text_list:
    # Extract the title and preview text from the element
    title = list_text.find('div', class_='content_title').text
    preview = list_text.find('div', class_='article_teaser_body').text
    # Store the title and preview pair in a dictionary and add it to the list
    title_preview_list.append({'title': title, 'preview': preview})

# Print the list to confirm success
print(title_preview_list)

[{'title': "NASA's Mars Helicopter Attached to Mars 2020 Rover ", 'preview': 'The helicopter will be first aircraft to perform flight tests on another planet.'}, {'title': "Mars InSight Lander to Push on Top of the 'Mole'", 'preview': 'Engineers have a plan for pushing down on the heat probe, which has been stuck at the Martian surface for a year.'}, {'title': "NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet Light", 'preview': 'Vast areas of the Martian night sky pulse in ultraviolet light, according to images from NASA’s MAVEN spacecraft. The results are being used to illuminate complex circulation patterns in the Martian atmosphere.'}, {'title': 'Mars 2020 Unwrapped and Ready for More Testing', 'preview': "In time-lapse video, bunny-suited engineers remove the inner layer of protective foil on NASA's Mars 2020 rover after it was relocated for testing."}, {'title': "NASA's Perseverance Rover Goes Through Trials by Fire, Ice, Light and Sound", 'preview': "The agency's ne

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database to make it easier to share with others. To do so, export the data to either a JSON file or a MongoDB database.
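
As a rough sketch of the JSON option, `json.dump` can write a list of dictionaries directly to a file, and `json.load` reads it back as the same structure. The `records` list and the temporary file path here are placeholders for illustration, not the notebook's scraped data.

```python
import json
import os
import tempfile

# Placeholder data with the same shape as the scraped results
records = [
    {'title': 'Placeholder title', 'preview': 'Placeholder preview.'},
]

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.json')

    # Pass the list itself to json.dump, not a string from json.dumps;
    # dumping an already-serialized string would store one quoted string
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(records, f, indent=2)

    # Reading the file back yields a list of dictionaries again
    with open(path, encoding='utf-8') as f:
        loaded = json.load(f)

print(loaded == records)
```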

In [6]:
# Export data to JSON
json_string = json.dumps(title_preview_list)
print(json_string)

[{"title": "The Man Who Wanted to Fly on Mars", "preview": "The Mars Helicopter is riding to the Red Planet this summer with NASA's Perseverance rover. The helicopter's chief engineer, Bob Balaram, shares the saga of how it came into being."}, {"title": "NASA Invites Public to Share Excitement of Mars 2020 Perseverance Rover Launch", "preview": "There are lots of ways to participate in the historic event, which is targeted for July 30."}, {"title": "NASA, ULA Launch Mars 2020 Perseverance Rover Mission to Red Planet", "preview": "The agency's Mars 2020 mission is on its way. It will land at Jezero Crater in about seven months, on Feb. 18, 2021. "}, {"title": "Q&A with the Student Who Named Ingenuity, NASA's Mars Helicopter", "preview": "As a longtime fan of space exploration, Vaneeza Rupani appreciates the creativity and collaboration involved with trying to fly on another planet."}, {"title": "NASA's Perseverance Rover Will Peer Beneath Mars' Surface ", "preview": "The agency's newest

In [7]:
# Write the list itself; dumping json_string would double-encode it as a single JSON string
with open('mars.json', 'w') as f:
    json.dump(title_preview_list, f)

In [8]:
# Export data to MongoDB
# The default port used by MongoDB is 27017
# https://docs.mongodb.com/manual/reference/default-mongodb-port/
conn = 'mongodb://localhost:27017'
client = pymongo.MongoClient(conn)


# Define the 'mars_DB' database in Mongo
db = client.mars_DB

# Drop the collection if it already exists to avoid duplicate documents
db.title_preview.drop()

collection = db['title_preview']

# Insert the list of dictionaries into the collection
collection.insert_many(title_preview_list)


<pymongo.results.InsertManyResult at 0x1d2e46d6fc8>

In [9]:
browser.quit()