# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [42]:
# Import Splinter, BeautifulSoup, and supporting libraries
from splinter import Browser
from bs4 import BeautifulSoup as soup
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
import json

In [43]:
# Set up a Splinter browser driven by ChromeDriver
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [44]:
# Visit the Mars NASA news site: https://redplanetscience.com
url = 'https://redplanetscience.com'
browser.visit(url)
# Wait up to 1 second for the article list to appear
browser.is_element_present_by_css('div.list_text', wait_time=1)

True

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.

In [45]:
# Get the rendered HTML from the browser
html = browser.html

In [46]:
# Create a Beautiful Soup object
mars_soup = soup(html, "html.parser")

In [47]:
# Extract all the text elements
title = mars_soup.find_all('div', class_='content_title')
preview = mars_soup.find_all('div', class_='article_teaser_body')

### The approach above did not work very well, so the cells below try a different one.
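
The two `find_all` calls above return parallel lists of title and preview elements. A minimal sketch of pairing them with `zip`, run against a made-up HTML sample (the `sample_html` string and its contents are placeholders, not the live page):

```python
from bs4 import BeautifulSoup

# Placeholder markup that mimics the class names used on the news page
sample_html = """
<div class="list_text">
  <div class="content_title">First title</div>
  <div class="article_teaser_body">First preview</div>
</div>
<div class="list_text">
  <div class="content_title">Second title</div>
  <div class="article_teaser_body">Second preview</div>
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")
titles = sample_soup.find_all("div", class_="content_title")
previews = sample_soup.find_all("div", class_="article_teaser_body")

# Pair the i-th title with the i-th preview
pairs = [
    (t.get_text(strip=True), p.get_text(strip=True))
    for t, p in zip(titles, previews)
]
print(pairs)
```

This pairing relies on every article having both elements in the same order. If one teaser is missing, later pairs shift out of alignment, which is one reason to walk each article container instead.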

### Extract all text elements.

In [48]:
news_data = mars_soup.find_all('div', class_='col-md-8')
# confirm articles are retrieved
print('Article Count: ',len(news_data))
for article in news_data:
    print(article)

Article Count:  15
<div class="col-md-8">
<div class="list_text">
<div class="list_date">January 23, 2023</div>
<div class="content_title">New Selfie Shows Curiosity, the Mars Chemist</div>
<div class="article_teaser_body">The NASA rover performed a special chemistry experiment at the location captured in its newest self-portrait.</div>
</div>
</div>
<div class="col-md-8">
<div class="list_text">
<div class="list_date">January 22, 2023</div>
<div class="content_title">NASA's Mars 2020 Rover Closer to Getting Its Name</div>
<div class="article_teaser_body">155 students from across the U.S. have been chosen as semifinalists in NASA's essay contest to name the Mars 2020 rover, and see it launch from Cape Canaveral this July.</div>
</div>
</div>
<div class="col-md-8">
<div class="list_text">
<div class="list_date">January 19, 2023</div>
<div class="content_title">Two of a Space Kind: Apollo 12 and Mars 2020</div>
<div class="article_teaser_body">Apollo 12 and the upcoming Mars 2020 mission

In [49]:
for article in news_data:
    # get article title
    title = article.find('div', class_='content_title')
    # get article preview
    preview = article.find('div', class_='article_teaser_body')
    # confirm results
    print(title.text, preview.text)

New Selfie Shows Curiosity, the Mars Chemist The NASA rover performed a special chemistry experiment at the location captured in its newest self-portrait.
NASA's Mars 2020 Rover Closer to Getting Its Name 155 students from across the U.S. have been chosen as semifinalists in NASA's essay contest to name the Mars 2020 rover, and see it launch from Cape Canaveral this July.
Two of a Space Kind: Apollo 12 and Mars 2020 Apollo 12 and the upcoming Mars 2020 mission may be separated by half a century, but they share several goals unique in the annals of space exploration.
NASA's Curiosity Mars Rover Snaps Its Highest-Resolution Panorama Yet To go along with the stunning 1.8-billion-pixel image, a new video offers a sweeping view of the Red Planet.
From JPL's Mailroom to Mars and Beyond Bill Allen has thrived as the mechanical systems design lead for three Mars rover missions, but he got his start as a teenager sorting letters for the NASA center.
The Man Who Wanted to Fly on Mars The Mars He
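
The loop above calls `.text` directly on each result, which raises `AttributeError` if `find` returns `None` for an article that lacks one of the elements. A hedged sketch of a more defensive variant, where `text_or_none` is a hypothetical helper and the HTML is a made-up sample rather than the live page:

```python
from bs4 import BeautifulSoup

def text_or_none(parent, css_class):
    """Return the stripped text of the first matching div, or None if absent."""
    node = parent.find("div", class_=css_class)
    return node.get_text(strip=True) if node is not None else None

# Placeholder markup: the second article has no teaser
sample_html = """
<div class="col-md-8"><div class="list_text">
  <div class="content_title">Complete article</div>
  <div class="article_teaser_body">Has a preview.</div>
</div></div>
<div class="col-md-8"><div class="list_text">
  <div class="content_title">Title only</div>
</div></div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")
rows = [
    (text_or_none(a, "content_title"), text_or_none(a, "article_teaser_body"))
    for a in sample_soup.find_all("div", class_="col-md-8")
]
print(rows)
```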

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary, and give each dictionary two keys: `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!",
   'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.
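
The structure described in the list above can be sketched with placeholder data (the `scraped_pairs` values are invented for illustration and are not scraped output):

```python
# Placeholder (title, preview) pairs standing in for scraped text
scraped_pairs = [
    ("Title A", "Preview A"),
    ("Title B", "Preview B"),
]

# One dictionary per article, all collected in a single list
news_items = [{"title": t, "preview": p} for t, p in scraped_pairs]
print(news_items)
```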

In [50]:
# Create an empty list to store the dictionaries
news = []

In [26]:
# Loop through the text elements
# Extract the title and preview text from the elements
# Store each title and preview pair in a dictionary
# Add the dictionary to the list
for article in news_data:
    # get article title
    title = article.find('div', class_='content_title')
    # get article preview
    preview = article.find('div', class_='article_teaser_body')
    # store the pair in a dictionary and add it to the list
    news.append({'title': title.text, 'preview': preview.text})

# Optional: view the results as a DataFrame (originally added to debug an error; kept because it is easier to read)
df = pd.DataFrame(news)
df

|   | title | preview |
|---|-------|---------|
| 0 | NASA's Perseverance Rover 100 Days Out | Mark your calendars: The agency's latest rover... |
| 1 | MAVEN Maps Electric Currents around Mars that ... | Five years after NASA’s MAVEN spacecraft enter... |
| 2 | NASA Moves Forward With Campaign to Return Mar... | During this next phase, the program will matur... |
| 3 | NASA's Mars Helicopter Attached to Mars 2020 R... | The helicopter will be first aircraft to perfo... |
| 4 | A New Video Captures the Science of NASA's Per... | With a targeted launch date of July 30, the ne... |
| 5 | NASA's Curiosity Mars Rover Takes a New Selfie... | Along with capturing an image before its steep... |
| 6 | NASA's Push to Save the Mars InSight Lander's ... | The scoop on the end of the spacecraft's robot... |
| 7 | 5 Hidden Gems Are Riding Aboard NASA's Perseve... | The symbols, mottos, and small objects added t... |
| 8 | NASA Prepares for Moon and Mars With New Addit... | Robotic spacecraft will be able to communicate... |
| 9 | Robotic Toolkit Added to NASA's Mars 2020 Rover | The bit carousel, which lies at the heart of t... |


In [32]:
news

[{'title': "NASA's Perseverance Rover 100 Days Out",
  'preview': "Mark your calendars: The agency's latest rover has only about 8,640,000 seconds to go before it touches down on the Red Planet, becoming history's next Mars car."},
 {'title': 'MAVEN Maps Electric Currents around Mars that are Fundamental to Atmospheric Loss',
  'preview': 'Five years after NASA’s MAVEN spacecraft entered into orbit around Mars, data from the mission has led to the creation of a map of electric current systems in the Martian atmosphere.'},
 {'title': 'NASA Moves Forward With Campaign to Return Mars Samples to Earth',
  'preview': 'During this next phase, the program will mature critical technologies and make critical design decisions as well as assess industry partnerships.'},
 {'title': "NASA's Mars Helicopter Attached to Mars 2020 Rover ",
  'preview': 'The helicopter will be first aircraft to perform flight tests on another planet.'},
 {'title': "A New Video Captures the Science of NASA's Perseveranc

In [51]:
browser.quit()

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database (to ease sharing the data with others). To do so, export the scraped data to either a JSON file or a MongoDB database.

In [41]:
# Serialize the data to a JSON string (this does not write a file)
json.dumps(news)

'[{"title": "NASA\'s Perseverance Rover 100 Days Out", "preview": "Mark your calendars: The agency\'s latest rover has only about 8,640,000 seconds to go before it touches down on the Red Planet, becoming history\'s next Mars car."}, {"title": "MAVEN Maps Electric Currents around Mars that are Fundamental to Atmospheric Loss", "preview": "Five years after NASA\\u2019s MAVEN spacecraft entered into orbit around Mars, data from the mission has led to the creation of a map of electric current systems in the Martian atmosphere."}, {"title": "NASA Moves Forward With Campaign to Return Mars Samples to Earth", "preview": "During this next phase, the program will mature critical technologies and make critical design decisions as well as assess industry partnerships."}, {"title": "NASA\'s Mars Helicopter Attached to Mars 2020 Rover ", "preview": "The helicopter will be first aircraft to perform flight tests on another planet."}, {"title": "A New Video Captures the Science of NASA\'s Perseveranc
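
`json.dumps` returns a string rather than creating a file. A minimal sketch of writing the list to disk with `json.dump`, assuming a hypothetical helper name (`export_news_to_json`) and file name (`mars_news.json`), and using placeholder records in place of the scraped `news` list:

```python
import json
import tempfile
from pathlib import Path

def export_news_to_json(records, path):
    """Write the list of dictionaries to a UTF-8 JSON file and return the path."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps characters such as ’ readable instead of \u2019
        json.dump(records, f, ensure_ascii=False, indent=2)
    return path

# Placeholder records standing in for the scraped `news` list
sample_news = [{"title": "Example title", "preview": "NASA’s example preview."}]

# Write to a temporary directory, then read the file back to confirm the round trip
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = export_news_to_json(sample_news, Path(tmp_dir) / "mars_news.json")
    reloaded = json.loads(out_path.read_text(encoding="utf-8"))

print(reloaded == sample_news)
```

In the notebook, the same helper could be called with `news` and a path of your choice.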

In [30]:
# Export data to MongoDB
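
One way the MongoDB export could look, as a sketch only. It assumes PyMongo is installed and a MongoDB server is reachable. The helper name `export_to_collection`, the connection URI, and the `mars_db` and `news` names are all hypothetical and do not appear in the original notebook:

```python
from copy import deepcopy

def export_to_collection(records, collection):
    """Insert copies of the records and return how many were written."""
    if not records:
        return 0  # insert_many rejects an empty list
    # Copy first: PyMongo adds an `_id` key to each dict it inserts,
    # which would otherwise leak into the original list.
    result = collection.insert_many(deepcopy(records))
    return len(result.inserted_ids)

# Hypothetical usage (requires PyMongo and a running server):
# from pymongo import MongoClient
# client = MongoClient("mongodb://localhost:27017")
# export_to_collection(news, client["mars_db"]["news"])
```

Passing the collection in as an argument keeps the helper independent of any particular connection setup.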
