# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [34]:
# Import Splinter, BeautifulSoup, the Chrome driver manager, and PyMongo
from splinter import Browser
from bs4 import BeautifulSoup as soup
from webdriver_manager.chrome import ChromeDriverManager
import pymongo

In [13]:
# Setup splinter
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)



### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [14]:
# Visit the Mars NASA news site: https://redplanetscience.com
url = 'https://redplanetscience.com'
browser.visit(url)

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.
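As a rough illustration of this step, the sketch below parses a small static HTML string instead of the live page. The HTML snippet and its text are made up for the example; only the class names (`list_text`, `content_title`, `article_teaser_body`) come from this notebook, and the variable names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML that imitates the structure of one news item.
# Only the class names are taken from this notebook.
sample_html = """
<div class="list_text">
  <div>January 1, 2023</div>
  <div class="content_title">Example Title</div>
  <div class="article_teaser_body">Example preview text.</div>
</div>
"""

# Parse the string with Python's built-in HTML parser
sample_soup = BeautifulSoup(sample_html, "html.parser")

# find_all returns every matching element; find returns the first match
items = sample_soup.find_all("div", class_="list_text")
first_item = items[0]
sample_title = first_item.find("div", class_="content_title").get_text(strip=True)
sample_preview = first_item.find("div", class_="article_teaser_body").get_text(strip=True)

print(sample_title)
print(sample_preview)
```

Working against a fixed string like this can make it easier to check selectors before pointing them at the browser's HTML.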

In [15]:
# Create a Beautiful Soup object
# (named html_soup so it does not shadow the imported `soup` alias)
html = browser.html
html_soup = soup(html, 'html.parser')

In [29]:
# Extract all the text elements
responses = html_soup.find_all('div', class_='list_text')
for response in responses:
    print(response.text)


January 4, 2023
All About the Laser (and Microphone) Atop Mars 2020, NASA's Next Rover
SuperCam is a rock-vaporizing instrument that will help scientists hunt for Mars fossils.


January 3, 2023
Robotic Toolkit Added to NASA's Mars 2020 Rover
The bit carousel, which lies at the heart of the rover's Sample Caching System, is now aboard NASA's newest rover. 


January 3, 2023
NASA's Push to Save the Mars InSight Lander's Heat Probe
The scoop on the end of the spacecraft's robotic arm will be used to 'pin' the mole against the wall of its hole.


January 3, 2023
NASA to Broadcast Mars 2020 Perseverance Launch, Prelaunch Activities
Starting July 27, news activities will cover everything from mission engineering and science to returning samples from Mars to, of course, the launch itself.


January 2, 2023
From JPL's Mailroom to Mars and Beyond
Bill Allen has thrived as the mechanical systems design lead for three Mars rover missions, but he got his start as a teenager sorting letters for t

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary with two keys, `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!",
   'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.
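The storage layout described above can be sketched without a browser. This is a minimal example under stated assumptions: `build_articles` is a hypothetical helper, the input pairs are made-up stand-ins for scraped text, and trimming whitespace is an optional extra that the steps above do not require.

```python
def build_articles(pairs):
    """Return a list of {'title', 'preview'} dictionaries, one per (title, preview) pair."""
    return [
        {"title": title.strip(), "preview": preview.strip()}
        for title, preview in pairs
    ]


# Hypothetical (title, preview) pairs standing in for scraped text
sample_pairs = [
    ("Mars Rover Begins Mission!",
     "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet. "),
    ("  Second Example Title", "Second example preview."),
]

example_articles = build_articles(sample_pairs)
print(example_articles)
```

Each dictionary keeps the two required keys, and the list preserves the order in which the pairs were supplied.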

In [30]:
# Create an empty list to store the dictionaries
articles = []

In [31]:
# Loop through the text elements
for response in responses:

    # Extract the title and preview text from the elements
    title = response.find('div', class_='content_title').text
    preview = response.find('div', class_='article_teaser_body').text

    # Store each title and preview pair in a dictionary
    # (named `article` to avoid shadowing the built-in `dict`)
    article = {'title': title, 'preview': preview}

    # Add the dictionary to the list
    articles.append(article)


In [32]:
# Print the list to confirm success
print(articles)

[{'title': "All About the Laser (and Microphone) Atop Mars 2020, NASA's Next Rover", 'preview': 'SuperCam is a rock-vaporizing instrument that will help scientists hunt for Mars fossils.'}, {'title': "Robotic Toolkit Added to NASA's Mars 2020 Rover", 'preview': "The bit carousel, which lies at the heart of the rover's Sample Caching System, is now aboard NASA's newest rover. "}, {'title': "NASA's Push to Save the Mars InSight Lander's Heat Probe", 'preview': "The scoop on the end of the spacecraft's robotic arm will be used to 'pin' the mole against the wall of its hole."}, {'title': 'NASA to Broadcast Mars 2020 Perseverance Launch, Prelaunch Activities', 'preview': 'Starting July 27, news activities will cover everything from mission engineering and science to returning samples from Mars to, of course, the launch itself.'}, {'title': "From JPL's Mailroom to Mars and Beyond", 'preview': 'Bill Allen has thrived as the mechanical systems design lead for three Mars rover missions, but he 

In [33]:
# Close the automated browser session
browser.quit()

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database (to ease sharing the data with others). To do so, export the scraped data to either a JSON file or a MongoDB database.
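One possible way to do the JSON export is sketched below with the standard `json` module. The helper `export_articles`, the file name `mars_news.json`, and the sample data are assumptions made for illustration. The example writes to a temporary directory so that it leaves no files behind.

```python
import json
import tempfile
from pathlib import Path


def export_articles(articles, path):
    """Write a list of article dictionaries to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(articles, f, indent=2, ensure_ascii=False)


# Hypothetical sample; in this notebook the `articles` list would be passed instead.
sample_articles = [
    {"title": "Mars Rover Begins Mission!",
     "preview": "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."},
]

# Write to a temporary directory here; the file name is an assumption.
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / "mars_news.json"
    export_articles(sample_articles, json_path)

    # Read the file back to confirm the round trip preserves the data
    with open(json_path, encoding="utf-8") as f:
        reloaded = json.load(f)

print(reloaded == sample_articles)
```

If the same dictionaries are first inserted into MongoDB, PyMongo adds an `_id` field holding an `ObjectId`, which the standard `json` module cannot serialize, so it is simpler to export to JSON before inserting.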

In [10]:
# Export data to JSON


In [35]:
# Export data to MongoDB
# Initialize PyMongo to work with MongoDB
conn = 'mongodb://localhost:27017'
client = pymongo.MongoClient(conn)

In [36]:
# Define database and collection
db = client.mars_news_db
collection = db.articles

In [37]:
# Iterate through the list and insert each article into the collection
for article in articles:
    collection.insert_one(article)

In [41]:
# Display items in MongoDB collection
stories = db.articles.find()

for story in stories:
    print(story)

{'_id': ObjectId('63b5985d0e16ec03ec23c65c'), 'title': "All About the Laser (and Microphone) Atop Mars 2020, NASA's Next Rover", 'preview': 'SuperCam is a rock-vaporizing instrument that will help scientists hunt for Mars fossils.'}
{'_id': ObjectId('63b5985e0e16ec03ec23c65d'), 'title': "Robotic Toolkit Added to NASA's Mars 2020 Rover", 'preview': "The bit carousel, which lies at the heart of the rover's Sample Caching System, is now aboard NASA's newest rover. "}
{'_id': ObjectId('63b5985e0e16ec03ec23c65e'), 'title': "NASA's Push to Save the Mars InSight Lander's Heat Probe", 'preview': "The scoop on the end of the spacecraft's robotic arm will be used to 'pin' the mole against the wall of its hole."}
{'_id': ObjectId('63b5985e0e16ec03ec23c65f'), 'title': 'NASA to Broadcast Mars 2020 Perseverance Launch, Prelaunch Activities', 'preview': 'Starting July 27, news activities will cover everything from mission engineering and science to returning samples from Mars to, of course, the launc