# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [1]:
# Import Splinter and BeautifulSoup
from splinter import Browser
from bs4 import BeautifulSoup as soup
import pandas as pd
import datetime as dt
from webdriver_manager.chrome import ChromeDriverManager

In [2]:
# Set up Splinter with a managed ChromeDriver (provided starter code)
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

[WDM] - Downloading: 100%|████████████████████████████████████████████████████████| 6.79M/6.79M [00:00<00:00, 7.71MB/s]


### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [4]:
# Visit the Mars NASA news site: https://redplanetscience.com 
url = 'https://redplanetscience.com'
browser.visit(url)
# Inspecting the page in Chrome DevTools shows that each article is wrapped like this:
'''
<div class="list_text">
<div class="list_date">February 4, 2023</div>
<div class="content_title">NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet Light</div>
<div class="article_teaser_body">Vast areas of the Martian night sky pulse in ultraviolet light, 
according to images from NASA’s MAVEN spacecraft. The results are being used to illuminate complex circulation patterns in the Martian atmosphere.</div>											 
</div>
'''
browser.is_element_present_by_css("div.list_text", wait_time=1)

True

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.

In [5]:
# Create a Beautiful Soup object
# Parse the rendered page HTML with Python's built-in HTML parser
html = browser.html
news_soup = soup(html, 'html.parser')
# Select only the first article's text container (date, title, and teaser)
slide_elem = news_soup.select_one('div.list_text')
print(slide_elem)

<div class="list_text">
<div class="list_date">February 5, 2023</div>
<div class="content_title">NASA's Treasure Map for Water Ice on Mars</div>
<div class="article_teaser_body">A new study identifies frozen water just below the Martian surface, where astronauts could easily dig it up.</div>
</div>
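
The selector logic can be checked without launching a browser by parsing a static copy of the markup. The snippet below is a minimal sketch: `SAMPLE_HTML` and the variable names are hypothetical, and the markup is copied from the output above rather than fetched live. It also shows that `select_one` returns `None` when nothing matches, which is worth checking before calling `.text` on the result.

```python
from bs4 import BeautifulSoup

# Hypothetical static sample, copied from the article markup printed above
SAMPLE_HTML = """
<div class="list_text">
<div class="list_date">February 5, 2023</div>
<div class="content_title">NASA's Treasure Map for Water Ice on Mars</div>
<div class="article_teaser_body">A new study identifies frozen water just below the Martian surface, where astronauts could easily dig it up.</div>
</div>
"""

sample_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# select_one takes a CSS selector and returns the first match (or None)
first = sample_soup.select_one("div.list_text")
first_title = first.select_one("div.content_title").get_text(strip=True)
first_date = first.select_one("div.list_date").get_text(strip=True)

# A selector that matches nothing yields None rather than raising
missing = sample_soup.select_one("div.no_such_class")

print(first_title)
print(first_date)
print(missing)
```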


In [11]:
# Extract all the article text elements (each holds a date, a title, and a teaser)
all_data = news_soup.find_all("div", class_='list_text')
print(all_data)

[<div class="list_text">
<div class="list_date">February 5, 2023</div>
<div class="content_title">NASA's Treasure Map for Water Ice on Mars</div>
<div class="article_teaser_body">A new study identifies frozen water just below the Martian surface, where astronauts could easily dig it up.</div>
</div>, <div class="list_text">
<div class="list_date">February 4, 2023</div>
<div class="content_title">NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet Light</div>
<div class="article_teaser_body">Vast areas of the Martian night sky pulse in ultraviolet light, according to images from NASA’s MAVEN spacecraft. The results are being used to illuminate complex circulation patterns in the Martian atmosphere.</div>
</div>, <div class="list_text">
<div class="list_date">February 1, 2023</div>
<div class="content_title">Celebrate Mars Reconnaissance Orbiter's Views From Above</div>
<div class="article_teaser_body">Marking its 15th anniversary since launch, one of the oldest spacecraft at 

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary with two keys: `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!", 
        'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.
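
One way to sketch the storage steps listed above is a small helper that turns page HTML into a list of dictionaries. The function name `extract_articles` and the sample markup are hypothetical, introduced only for illustration. Skipping blocks that lack a title or teaser is an assumption about how incomplete entries should be handled, not a requirement of the assignment.

```python
from bs4 import BeautifulSoup


def extract_articles(html):
    """Return a list of {'title', 'preview'} dicts, one per complete article block."""
    page = BeautifulSoup(html, "html.parser")
    articles = []
    for block in page.find_all("div", class_="list_text"):
        title_tag = block.find("div", class_="content_title")
        preview_tag = block.find("div", class_="article_teaser_body")
        # Assumption: skip blocks that are missing either field
        if title_tag is None or preview_tag is None:
            continue
        articles.append({
            "title": title_tag.get_text(strip=True),
            "preview": preview_tag.get_text(strip=True),
        })
    return articles


# Hypothetical sample: two complete blocks and one without a teaser
sample_html = """
<div class="list_text">
  <div class="content_title">Mars Rover Begins Mission!</div>
  <div class="article_teaser_body">NASA's Mars Rover begins a multiyear mission.</div>
</div>
<div class="list_text">
  <div class="content_title">Second Headline</div>
  <div class="article_teaser_body">Second teaser.</div>
</div>
<div class="list_text">
  <div class="content_title">Headline Without a Teaser</div>
</div>
"""

sample_articles = extract_articles(sample_html)
print(sample_articles)
```

Because the helper builds a fresh list on every call, re-running it does not append duplicate entries, which can happen when a loop cell is re-executed against a list created in an earlier cell.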

In [16]:
# Create an empty list to store the dictionaries
list_of_articles = []

In [33]:
# Loop through the article elements and extract each title and preview text
for article in all_data:
    title = article.find("div", class_="content_title").text
    preview = article.find("div", class_="article_teaser_body").text
    article_dict = {"title": title, "preview": preview}
    list_of_articles.append(article_dict)

In [34]:
# Print the list to confirm that every article was stored
print(list_of_articles)

[{'title': "NASA's Treasure Map for Water Ice on Mars", 'preview': 'A new study identifies frozen water just below the Martian surface, where astronauts could easily dig it up.'}, {'title': "NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet Light", 'preview': 'Vast areas of the Martian night sky pulse in ultraviolet light, according to images from NASA’s MAVEN spacecraft. The results are being used to illuminate complex circulation patterns in the Martian atmosphere.'}, {'title': "Celebrate Mars Reconnaissance Orbiter's Views From Above", 'preview': 'Marking its 15th anniversary since launch, one of the oldest spacecraft at the Red Planet has provided glimpses of dust devils, avalanches, and more.'}, {'title': "A New Video Captures the Science of NASA's Perseverance Mars Rover", 'preview': 'With a targeted launch date of July 30, the next robotic scientist NASA is sending to the to the Red Planet has big ambitions.'}, {'title': 'MAVEN Maps Electric Currents around Mars tha

In [35]:
browser.quit()

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database (to ease sharing the data with others). To do so, export the scraped data to either a JSON file or a MongoDB database.

In [36]:
# Serialize the list to a JSON string (json.dumps returns a string; it does not write a file)
import json
json_data = json.dumps(list_of_articles)

In [37]:
# Print the JSON string to confirm the serialization
print(json_data)


[{"title": "NASA's Treasure Map for Water Ice on Mars", "preview": "A new study identifies frozen water just below the Martian surface, where astronauts could easily dig it up."}, {"title": "NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet Light", "preview": "Vast areas of the Martian night sky pulse in ultraviolet light, according to images from NASA\u2019s MAVEN spacecraft. The results are being used to illuminate complex circulation patterns in the Martian atmosphere."}, {"title": "Celebrate Mars Reconnaissance Orbiter's Views From Above", "preview": "Marking its 15th anniversary since launch, one of the oldest spacecraft at the Red Planet has provided glimpses of dust devils, avalanches, and more."}, {"title": "A New Video Captures the Science of NASA's Perseverance Mars Rover", "preview": "With a targeted launch date of July 30, the next robotic scientist NASA is sending to the to the Red Planet has big ambitions."}, {"title": "MAVEN Maps Electric Currents around Mar
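
Note that `json.dumps` only produces a string. To store the data in a file, as Step 4 suggests, `json.dump` can write directly to an open file handle. The following is a minimal sketch: the two-item `articles` list stands in for `list_of_articles`, and the output path `mars_news_example.json` in the temporary directory is a hypothetical choice. Passing `ensure_ascii=False` keeps characters such as `’` readable in the file instead of escaping them as `\u2019`.

```python
import json
import os
import tempfile

# Stand-in for list_of_articles, using two entries from the scraped output
articles = [
    {
        "title": "NASA's Treasure Map for Water Ice on Mars",
        "preview": "A new study identifies frozen water just below the Martian "
                   "surface, where astronauts could easily dig it up.",
    },
    {
        "title": "NASA's MAVEN Observes Martian Night Sky Pulsing in Ultraviolet Light",
        "preview": "Vast areas of the Martian night sky pulse in ultraviolet light, "
                   "according to images from NASA’s MAVEN spacecraft.",
    },
]

# Hypothetical output location; any writable path would do
out_path = os.path.join(tempfile.gettempdir(), "mars_news_example.json")

# Write the list as UTF-8 JSON, keeping non-ASCII characters unescaped
with open(out_path, "w", encoding="utf-8") as f:
    json.dump(articles, f, ensure_ascii=False, indent=2)

# Read the file back to confirm the round trip preserves the data
with open(out_path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored == articles)
```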