# Module 12 Challenge
## Deliverable 1: Scrape Titles and Preview Text from Mars News

In [2]:
# Import Splinter and BeautifulSoup
from splinter import Browser
from bs4 import BeautifulSoup as soup
from webdriver_manager.chrome import ChromeDriverManager

In [3]:
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

### Step 1: Visit the Website

1. Use automated browsing to visit the [Mars NASA news site](https://redplanetscience.com). Inspect the page to identify which elements to scrape.

      > **Hint** To identify which elements to scrape, you might want to inspect the page by using Chrome DevTools.

In [4]:
# Visit the Mars NASA news site: https://redplanetscience.com
browser.visit("https://redplanetscience.com")

### Step 2: Scrape the Website

Create a Beautiful Soup object and use it to extract text elements from the website.
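
To try the extraction without launching a browser, the same `find_all` call can be run on a static HTML string. The snippet below is a minimal sketch: `sample_html` is a made-up fragment that only mimics the `list_text` structure seen on the site, and `sample_soup`, `blocks`, and `titles` are illustrative names.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment shaped like the site's article blocks (not real scraped data)
sample_html = """
<div class="list_text">
  <div class="list_date">January 1, 2000</div>
  <div class="content_title">Example Title</div>
  <div class="article_teaser_body">Example preview text.</div>
</div>
<div class="list_text">
  <div class="content_title">Second Title</div>
  <div class="article_teaser_body">Second preview.</div>
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# find_all returns every matching tag; class_ avoids clashing with Python's `class` keyword
blocks = sample_soup.find_all("div", class_="list_text")

# find returns the first matching descendant inside each block
titles = [block.find("div", class_="content_title").text for block in blocks]
print(titles)
```

Working against a fixed string like this makes it easier to check selectors before pointing them at the live page.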

In [5]:
# Create a Beautiful Soup object
html = browser.html
news_soup = soup(html, 'html.parser')

In [8]:
# Extract all the text elements
list_text = news_soup.find_all("div", class_="list_text")
list_text[0]

<div class="list_text">
<div class="list_date">January 2, 2023</div>
<div class="content_title">NASA Wins 4 Webbys, 4 People's Voice Awards</div>
<div class="article_teaser_body">Winners include the JPL-managed "Send Your Name to Mars" campaign, NASA's Global Climate Change website and Solar System Interactive.</div>
</div>

### Step 3: Store the Results

Extract the titles and preview text of the news articles that you scraped. Store the scraping results in Python data structures as follows:

* Store each title-and-preview pair in a Python dictionary, and give each dictionary two keys: `title` and `preview`. For example:

  ```python
  {'title': "Mars Rover Begins Mission!",
   'preview': "NASA's Mars Rover begins a multiyear mission to collect data about the little-explored planet."}
  ```

* Store all the dictionaries in a Python list.

* Print the list in your notebook.
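
The target structure can be sketched independently of the scraper. The example below assumes the titles and previews are already available as plain strings; `raw_pairs`, `to_record`, and `records` are hypothetical names, and trimming whitespace is an optional extra rather than a requirement of the assignment.

```python
# Hypothetical (title, preview) pairs standing in for scraped text
raw_pairs = [
    ("Mars Rover Begins Mission!", "NASA's Mars Rover begins a multiyear mission. "),
    ("  Second Headline ", "Another preview."),
]


def to_record(title, preview):
    """Build one dictionary with the two required keys, trimming stray whitespace."""
    return {"title": title.strip(), "preview": preview.strip()}


# One dictionary per pair, collected in a list
records = [to_record(title, preview) for title, preview in raw_pairs]
print(records)
```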

In [17]:
# Create an empty list to store the dictionaries
list_of_dicts = []

In [18]:
# Loop through the text elements
# Extract the title and preview text from the elements
# Store each title and preview pair in a dictionary
# Add the dictionary to the list

for value in list_text:
    title = value.find('div', class_="content_title").text
    snippet = value.find('div', class_="article_teaser_body").text
    list_of_dicts.append({'title': title, 'preview': snippet})


In [19]:
# Print the list to confirm success
list_of_dicts

[{'title': "NASA Wins 4 Webbys, 4 People's Voice Awards",
  'preview': 'Winners include the JPL-managed "Send Your Name to Mars" campaign, NASA\'s Global Climate Change website and Solar System Interactive.'},
 {'title': "The Launch Is Approaching for NASA's Next Mars Rover, Perseverance",
  'preview': "The Red Planet's surface has been visited by eight NASA spacecraft. The ninth will be the first that includes a roundtrip ticket in its flight plan. "},
 {'title': 'Naming a NASA Mars Rover Can Change Your Life',
  'preview': 'Want to name the robotic scientist NASA is sending to Mars in 2020? The student who named Curiosity — the rover currently exploring Mars — will tell you this is an opportunity worth taking.'},
 {'title': 'Mars 2020 Unwrapped and Ready for More Testing',
  'preview': "In time-lapse video, bunny-suited engineers remove the inner layer of protective foil on NASA's Mars 2020 rover after it was relocated for testing."},
 {'title': "6 Things to Know About NASA's Ingenui

In [14]:
browser.quit()

### (Optional) Step 4: Export the Data

Optionally, store the scraped data in a file or database (to ease sharing the data with others). To do so, export the scraped data to either a JSON file or a MongoDB database.
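
As a rough illustration of the JSON route, the sketch below writes a list of dictionaries to a file and reads it back to confirm the round trip. The records, the file name `sample_news.json`, and the use of a temporary directory are all assumptions made for the example.

```python
import json
import os
import tempfile

# Hypothetical records standing in for the scraped list of dictionaries
sample_records = [
    {"title": "Example Title", "preview": "Example preview text."},
    {"title": "Second Title", "preview": "Second preview."},
]

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_news.json")

    # json.dump serializes straight to the open file handle
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(sample_records, outfile, indent=4)

    # Read the file back to confirm nothing was lost
    with open(path, encoding="utf-8") as infile:
        loaded = json.load(infile)

print(loaded == sample_records)
```

Reading the file back is a cheap sanity check that the export is valid JSON and that other people will be able to load it.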

In [21]:
# Export data to JSON
import json
# The approach below follows https://www.geeksforgeeks.org/reading-and-writing-json-to-a-file-in-python/

# Serialize the list of dictionaries to a JSON-formatted string
json_object = json.dumps(list_of_dicts, indent=4)

# Write the JSON string to mars_news_text.json
with open("mars_news_text.json", "w") as outfile:
    outfile.write(json_object)

In [33]:
# Export data to MongoDB
# If MongoDB is not running yet, start it from your terminal with
# "brew services start mongodb-community@6.0". When you are done, stop it with
# "brew services stop mongodb-community@6.0".

from pymongo import MongoClient
mongo = MongoClient('localhost', port=27017)


In [46]:
mars_news_db = mongo['mars_news_text']
mars_news_tbl = mars_news_db['mars_news_table']


In [51]:
# Uncomment to check which databases and collections currently exist
# mongo.list_database_names()
# mars_news_db.list_collection_names()


In [48]:
# Insert the scraped records; insert_many adds an '_id' key to each dictionary in place
mars_news_tbl.insert_many(list_of_dicts)

<pymongo.results.InsertManyResult at 0x7fabd0086850>

In [49]:
results = mars_news_tbl.find()
for result in results:
    print(result)

{'_id': ObjectId('63b37eb66aadbb6990e97862'), 'title': "NASA Wins 4 Webbys, 4 People's Voice Awards", 'preview': 'Winners include the JPL-managed "Send Your Name to Mars" campaign, NASA\'s Global Climate Change website and Solar System Interactive.'}
{'_id': ObjectId('63b37eb66aadbb6990e97863'), 'title': "The Launch Is Approaching for NASA's Next Mars Rover, Perseverance", 'preview': "The Red Planet's surface has been visited by eight NASA spacecraft. The ninth will be the first that includes a roundtrip ticket in its flight plan. "}
{'_id': ObjectId('63b37eb66aadbb6990e97864'), 'title': 'Naming a NASA Mars Rover Can Change Your Life', 'preview': 'Want to name the robotic scientist NASA is sending to Mars in 2020? The student who named Curiosity — the rover currently exploring Mars — will tell you this is an opportunity worth taking.'}
{'_id': ObjectId('63b37eb66aadbb6990e97865'), 'title': 'Mars 2020 Unwrapped and Ready for More Testing', 'preview': "In time-lapse video, bunny-suited e

In [50]:
# Run this cell only if you want to drop the MongoDB collection and database
mars_news_db.drop_collection('mars_news_table')
mongo.drop_database('mars_news_text')