# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [1]:
# Import Splinter and BeautifulSoup
from splinter import Browser
from bs4 import BeautifulSoup as soup
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd

In [2]:
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [3]:
# Visit the Mars NASA news site: https://redplanetscience.com
url = 'https://redplanetscience.com'
browser.visit(url)

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.

In [4]:
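# Create a Beautiful Soup object
# Use a separate name so the imported `soup` alias is not overwritten if the cell is re-run
html = browser.html
news_soup = soup(html, 'html.parser')


In [5]:
# Extract all the text elements

text_elem = news_soup.find_all('div', class_='list_text')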
print(text_elem)


[<div class="list_text">
<div class="list_date">December 18, 2022</div>
<div class="content_title">5 Hidden Gems Are Riding Aboard NASA's Perseverance Rover</div>
<div class="article_teaser_body">The symbols, mottos, and small objects added to the agency's newest Mars rover serve a variety of purposes, from functional to decorative.</div>
</div>, <div class="list_text">
<div class="list_date">December 17, 2022</div>
<div class="content_title">NASA's Mars 2020 Will Hunt for Microscopic Fossils</div>
<div class="article_teaser_body">A new paper identifies a ring of minerals at the rover's landing site that are ideal for fossilizing microbial life.</div>
</div>, <div class="list_text">
<div class="list_date">December 16, 2022</div>
<div class="content_title">Mars Is Getting a New Robotic Meteorologist</div>
<div class="article_teaser_body">Sensors on NASA's Perseverance will help prepare for future human exploration by taking weather measurements and studying dust particles.</div>
</div>,

In [6]:
type(text_elem)

bs4.element.ResultSet
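
The extraction step can also be tried offline on a static snippet, which makes it easier to experiment without opening a browser. The following is a minimal sketch, assuming markup shaped like the output printed above; the sample HTML, its placeholder text, and the variable names are hypothetical. It compares `find_all` with the equivalent CSS selector passed to `select`.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mirrors the structure of the scraped page
sample_html = """
<div class="list_text">
  <div class="list_date">January 1, 2000</div>
  <div class="content_title">Example Title A</div>
  <div class="article_teaser_body">Example preview A.</div>
</div>
<div class="list_text">
  <div class="list_date">January 2, 2000</div>
  <div class="content_title">Example Title B</div>
  <div class="article_teaser_body">Example preview B.</div>
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# Two ways to collect the article containers
by_find_all = sample_soup.find_all("div", class_="list_text")
by_select = sample_soup.select("div.list_text")

# Pull the title text out of each container
titles = [
    elem.find("div", class_="content_title").get_text(strip=True)
    for elem in by_find_all
]
print(titles)
```

Both calls return a `ResultSet`, which behaves like a list, so it can be indexed, measured with `len`, and looped over.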

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary, and give each dictionary two keys: `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!",
   'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.

In [7]:
# Create an empty list to store the dictionaries
mars_news = []


In [8]:
# Loop through the text elements
# Extract the title and preview text from the elements
# Store each title and preview pair in a dictionary
# Add the dictionary to the list

for text in text_elem:
    title = text.find("div", class_="content_title").text.strip()
    preview = text.find("div", class_="article_teaser_body").text.strip()
    
    mars_dict = {
        "title": title, 
        "preview": preview
    }
    mars_news.append(mars_dict)


In [9]:
# Print the list to confirm success
mars_news

[{'title': "5 Hidden Gems Are Riding Aboard NASA's Perseverance Rover",
  'preview': "The symbols, mottos, and small objects added to the agency's newest Mars rover serve a variety of purposes, from functional to decorative."},
 {'title': "NASA's Mars 2020 Will Hunt for Microscopic Fossils",
  'preview': "A new paper identifies a ring of minerals at the rover's landing site that are ideal for fossilizing microbial life."},
 {'title': 'Mars Is Getting a New Robotic Meteorologist',
  'preview': "Sensors on NASA's Perseverance will help prepare for future human exploration by taking weather measurements and studying dust particles."},
 {'title': 'NASA Moves Forward With Campaign to Return Mars Samples to Earth',
  'preview': 'During this next phase, the program will mature critical technologies and make critical design decisions as well as assess industry partnerships.'},
 {'title': "Celebrate Mars Reconnaissance Orbiter's Views From Above",
  'preview': 'Marking its 15th anniversary sinc

In [10]:
type(mars_news)

list
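
Checking the type confirms only the outer container. A slightly stricter check, sketched below, also verifies that every item is a dictionary with exactly the `title` and `preview` keys and non-empty string values. The helper name `check_news_records` is hypothetical, and the sample record is the example from the instructions above.

```python
def check_news_records(records):
    """Return a list of problems; an empty list means the structure is as expected."""
    if not isinstance(records, list):
        return ["top-level object is not a list"]

    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"item {i} is not a dict")
        elif set(rec) != {"title", "preview"}:
            problems.append(f"item {i} has keys {sorted(rec)}")
        elif not all(isinstance(v, str) and v.strip() for v in rec.values()):
            problems.append(f"item {i} has an empty or non-string value")
    return problems


# Sample record taken from the assignment's own example
sample_records = [
    {
        "title": "Mars Rover Begins Mission!",
        "preview": "NASA's Mars Rover begins a multiyear mission to collect "
                   "data about the little-explored planet.",
    }
]

print(check_news_records(sample_records))
```

In the notebook, the same helper could be called as `check_news_records(mars_news)`.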

In [11]:
browser.quit()

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database to make it easier to share with others. To do so, export the scraped data to either a JSON file or a MongoDB database.

In [12]:
mars_news_df = pd.DataFrame(mars_news)
mars_news_df

|  | title | preview |
|---|---|---|
| 0 | 5 Hidden Gems Are Riding Aboard NASA's Perseve... | The symbols, mottos, and small objects added t... |
| 1 | NASA's Mars 2020 Will Hunt for Microscopic Fos... | A new paper identifies a ring of minerals at t... |
| 2 | Mars Is Getting a New Robotic Meteorologist | Sensors on NASA's Perseverance will help prepa... |
| 3 | NASA Moves Forward With Campaign to Return Mar... | During this next phase, the program will matur... |
| 4 | Celebrate Mars Reconnaissance Orbiter's Views ... | Marking its 15th anniversary since launch, one... |
| 5 | NASA's Perseverance Rover Mission Getting in S... | Stacking spacecraft components on top of each ... |
| 6 | InSight's 'Mole' Team Peers into the Pit | Efforts to save the heat probe continue. |
| 7 | NASA Wins 4 Webbys, 4 People's Voice Awards | Winners include the JPL-managed "Send Your Nam... |
| 8 | The MarCO Mission Comes to an End | The pair of briefcase-sized satellites made hi... |
| 9 | NASA's Curiosity Rover Finds an Ancient Oasis ... | New evidence suggests salty, shallow ponds onc... |


In [13]:
# Export data to JSON
mars_news_df.to_json("./part_1_mars_news_json.json")
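
By default, `DataFrame.to_json` writes a column-oriented object keyed by column name and then by row index. If a plain JSON array of `{title, preview}` records is preferred, one alternative is to write the list of dictionaries directly with the standard library `json` module. The sketch below assumes that goal; the helper names and the file name `mars_news_records.json` are hypothetical, and it writes to a temporary directory so that it can run anywhere.

```python
import json
import tempfile
from pathlib import Path


def save_records(records, path):
    """Write a list of dictionaries to `path` as a JSON array and return the path."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return path


def load_records(path):
    """Read the JSON array back into a list of dictionaries."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


# Sample record taken from the assignment's own example
sample_records = [
    {
        "title": "Mars Rover Begins Mission!",
        "preview": "NASA's Mars Rover begins a multiyear mission to collect "
                   "data about the little-explored planet.",
    }
]

# Round trip through a temporary file to confirm nothing is lost
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = save_records(sample_records, Path(tmp_dir) / "mars_news_records.json")
    round_trip = load_records(out_path)

print(round_trip == sample_records)
```

In the notebook, `save_records(mars_news, "mars_news_records.json")` would write the scraped list in the same way.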

In [14]:
# Export data to MongoDB
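
The MongoDB export is left empty in this notebook. One possible approach, sketched here under several assumptions, is to pass the records to a collection's `insert_many` method. The helper `export_records`, the connection URI, and the database and collection names are all hypothetical, and the in-memory stand-in class exists only so that the sketch runs without a database server.

```python
from types import SimpleNamespace


def export_records(records, collection):
    """Insert copies of the records and return how many were written."""
    if not records:
        return 0  # pymongo rejects an empty document list
    # Copy each dict: pymongo adds an `_id` key to the documents it inserts
    result = collection.insert_many([dict(rec) for rec in records])
    return len(result.inserted_ids)


class InMemoryCollection:
    """Hypothetical stand-in that mimics only the `insert_many` call used above."""

    def __init__(self):
        self.docs = []

    def insert_many(self, documents):
        start = len(self.docs)
        self.docs.extend(documents)
        return SimpleNamespace(inserted_ids=list(range(start, len(self.docs))))


# Sample record taken from the assignment's own example
sample_records = [
    {
        "title": "Mars Rover Begins Mission!",
        "preview": "NASA's Mars Rover begins a multiyear mission to collect "
                   "data about the little-explored planet.",
    }
]

demo_collection = InMemoryCollection()
inserted_count = export_records(sample_records, demo_collection)
print(inserted_count)

# With pymongo installed and a server assumed to be running locally,
# the call might look like this (URI and names are placeholders):
# from pymongo import MongoClient
# client = MongoClient("mongodb://localhost:27017")
# export_records(mars_news, client["mars_db"]["news"])
```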
