NASA Mars News

- Scrape the NASA Mars News Site and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

    # Example:
    news_title = "NASA's Next Mars Mission to Investigate Interior of Red Planet"
    
    news_p = "Preparation of NASA's next spacecraft to Mars, InSight, has ramped up this summer, on course for launch next May from Vandenberg Air Force Base in central California -- the first interplanetary launch in history from America's West Coast."

JPL Mars Space Images - Featured Image

- Visit the URL for the JPL Featured Space Image (https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars).
- Use splinter to navigate the site and find the image url for the current Featured Mars Image and assign the url string to a variable called featured_image_url.
- Make sure to find the image url to the full size .jpg image.
- Make sure to save a complete url string for this image.

    # Example:
    featured_image_url = 'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA16225_hires.jpg'

Mars Weather

- Visit the Mars Weather Twitter account (https://twitter.com/marswxreport?lang=en) and scrape the latest Mars weather tweet from the page. Save the tweet text for the weather report as a variable called mars_weather.

    # Example:
    mars_weather = 'Sol 1801 (Aug 30, 2017), Sunny, high -21C/-5F, low -80C/-112F, pressure at 8.82 hPa, daylight 06:09-17:55'

Mars Facts

- Visit the Mars Facts webpage (https://space-facts.com/mars/) and use Pandas to scrape the table containing facts about the planet, including Diameter, Mass, etc.
- Use Pandas to convert the data to an HTML table string.

Mars Hemispheres

- Visit the USGS Astrogeology site (https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars) to obtain high resolution images for each of Mars's hemispheres.
- You will need to click each of the links to the hemispheres in order to find the image url to the full resolution image.
- Save both the image url string for the full resolution hemisphere image, and the Hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys img_url and title.
- Append the dictionary with the image url string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

    # Example:
    hemisphere_image_urls = [
        {"title": "Valles Marineris Hemisphere", "img_url": "..."},
        {"title": "Cerberus Hemisphere", "img_url": "..."},
        {"title": "Schiaparelli Hemisphere", "img_url": "..."},
        {"title": "Syrtis Major Hemisphere", "img_url": "..."},
    ]

---

Step 2 - MongoDB and Flask Application

Use MongoDB with Flask templating to create a new HTML page that displays all of the information that was scraped from the URLs above.

- Start by converting your Jupyter notebook into a Python script called scrape_mars.py with a function called scrape that will execute all of your scraping code from above and return one Python dictionary containing all of the scraped data.
- Next, create a route called /scrape that will import your scrape_mars.py script and call your scrape function.
  - Store the return value in Mongo as a Python dictionary.
- Create a root route / that will query your Mongo database and pass the mars data into an HTML template to display the data.
- Create a template HTML file called index.html that will take the mars data dictionary and display all of the data in the appropriate HTML elements. Use the following as a guide for what the final product should look like, but feel free to create your own design.
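
The two routes described above can be sketched as follows. This is a minimal outline, not a finished application: the factory function create_app, the database name mars_db, the collection name mars, and the local MongoDB address are all assumptions made for illustration. It also assumes that templates/index.html exists and that scrape_mars.scrape returns a plain dictionary.

```python
from flask import Flask, redirect, render_template


def create_app(collection, scrape_fn):
    """Build the Flask app around a Mongo-style collection and a scrape function."""
    app = Flask(__name__)

    @app.route("/")
    def index():
        # Fetch the single stored document (None until /scrape has run once).
        mars = collection.find_one()
        return render_template("index.html", mars=mars)

    @app.route("/scrape")
    def scrape():
        # Run the scraper and overwrite whatever document is already stored.
        data = scrape_fn()
        collection.replace_one({}, data, upsert=True)
        return redirect("/")

    return app


def main():
    # Assumed wiring: a local MongoDB server and scrape_mars.py next to this file.
    from pymongo import MongoClient
    import scrape_mars

    client = MongoClient("mongodb://localhost:27017")
    app = create_app(client.mars_db.mars, scrape_mars.scrape)
    app.run(debug=True)
```

Calling main() starts the development server. Passing the collection and the scrape function in as arguments keeps the routes easy to exercise without a running database or browser.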





---

Hints

- Use Splinter to navigate the sites when needed and BeautifulSoup to help find and parse out the necessary data.
- Use Pymongo for CRUD applications for your database. For this homework, you can simply overwrite the existing document each time the /scrape url is visited and new data is obtained.
- Use Bootstrap to structure your HTML template.




In [4]:
# dependencies
from bs4 import BeautifulSoup as bs
from splinter import Browser

import pandas as pd
import numpy as np
import time

In [5]:
#path & chrome driver 
executable_path = {'executable_path':'chromedriver'}
browser = Browser('chrome', **executable_path)

In [6]:
#url for nasa
url = 'https://mars.nasa.gov/news/'
browser.visit(url)

In [8]:
# parse the page with BeautifulSoup
html = browser.html
soup = bs(html, 'html.parser')

In [9]:
# scrape the latest news title and teaser paragraph
news_title = soup.find('div', 'content_title').text
news_p = soup.find('div', 'article_teaser_body').text
print(news_title)
print(news_p)

NASA Launches a New Podcast to Mars
NASA's new eight-episode series 'On a Mission' follows the InSight spacecraft on its journey to Mars and details the extraordinary challenges of landing on the Red Planet.


In [10]:
#visiting url for jpl
jpl_url = "https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars"
browser.visit(jpl_url)

In [11]:
# parse the page HTML with BeautifulSoup
jpl_html = browser.html
jpl_bs = bs(jpl_html, 'html.parser')

In [14]:
#- Use splinter to navigate the site and find the image url for the current 
# Featured Mars Image and assign the url string to a variable called featured_image_url.
#- Make sure to find the image url to the full size .jpg image.
#- Make sure to save a complete url string for this image.
# NOTE: the 'thumb' image is a 640x350 wallpaper-size preview, not the full-size .jpg
image_path = jpl_bs.find('img', class_='thumb')['src']
featured_image_url = f'https://www.jpl.nasa.gov{image_path}'
print(featured_image_url)

https://www.jpl.nasa.gov/spaceimages/images/wallpaper/PIA22793-640x350.jpg
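
Building the complete URL by hand is easy to get wrong, so one option is to let the standard library join the site root and the scraped src value. The sketch below assumes the src attribute is a site-relative path like the one printed above. The helper name absolute_image_url and the constant JPL_BASE_URL are made up for this example.

```python
from urllib.parse import urljoin

# Site root taken from the JPL URLs used elsewhere in this notebook.
JPL_BASE_URL = "https://www.jpl.nasa.gov"


def absolute_image_url(src, base_url=JPL_BASE_URL):
    """Turn a site-relative image path into a complete URL."""
    # urljoin leaves src untouched when it is already an absolute URL.
    return urljoin(base_url, src)


featured_image_url = absolute_image_url(
    "/spaceimages/images/wallpaper/PIA22793-640x350.jpg"
)
print(featured_image_url)
```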


In [17]:
#mars weather twitter 
mars_weather_url = "https://twitter.com/marswxreport?lang=en"
browser.visit(mars_weather_url)

In [19]:
# parse out the most recent tweet from the Mars weather account
mars_weather_html = browser.html
mars_weather_bs = bs(mars_weather_html, "html.parser")

mars_weather = mars_weather_bs.find("ol",{"id":"stream-items-id"}).find("li").find("p").text
print(mars_weather)

Sol 2213 (2018-10-27), high -12C/10F, low -70C/-93F, pressure at 8.74 hPa, daylight 06:11-18:29
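
The newest tweet on the account is not guaranteed to be a weather report, so it may help to filter the tweet texts before keeping one. The sketch below assumes that weather reports start with "Sol ", as both examples in this notebook do. The helper name first_weather_report and the non-weather sample tweet are hypothetical.

```python
def first_weather_report(tweet_texts):
    """Return the first tweet that looks like a weather report, or None."""
    for text in tweet_texts:
        cleaned = text.strip()
        # Assumption: every weather report begins with the Martian day, "Sol NNNN".
        if cleaned.startswith("Sol "):
            return cleaned
    return None


sample_tweets = [
    "Hypothetical non-weather tweet, newest first",
    "Sol 2213 (2018-10-27), high -12C/10F, low -70C/-93F, "
    "pressure at 8.74 hPa, daylight 06:11-18:29",
]
mars_weather = first_weather_report(sample_tweets)
print(mars_weather)
```

In the notebook, the input list could be built from the text of every p tag under the stream-items-id list rather than only the first one.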


In [33]:
#Mars Facts

#Visit the Mars Facts webpage here and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.
#Use Pandas to convert the data to an HTML table string.
mars_facts_url = "https://space-facts.com/mars/"
browser.visit(mars_facts_url)
mars_html = browser.html
mars_bs = bs(mars_html, "html.parser")

In [32]:
table = pd.read_html(mars_facts_url)
table[0]

   0                     1
0  Equatorial Diameter:  6,792 km
1  Polar Diameter:       6,752 km
2  Mass:                 6.42 x 10^23 kg (10.7% Earth)
3  Moons:                2 (Phobos & Deimos)
4  Orbit Distance:       227,943,824 km (1.52 AU)
5  Orbit Period:         687 days (1.9 years)
6  Surface Temperature:  -153 to 20 °C
7  First Record:         2nd millennium BC
8  Recorded By:          Egyptian astronomers
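
The assignment also asks for the table as an HTML string, which the cells above do not produce yet. The sketch below shows one way to do it. To stay self-contained, it rebuilds a small DataFrame from two rows of the output above instead of calling pd.read_html. The column names description and value and the Bootstrap class names are assumptions.

```python
import pandas as pd

# Two rows copied from the table output above, standing in for table[0].
facts_df = pd.DataFrame(
    [["Equatorial Diameter:", "6,792 km"], ["Polar Diameter:", "6,752 km"]]
)
facts_df.columns = ["description", "value"]  # hypothetical column names
facts_df = facts_df.set_index("description")

# to_html() with no buffer argument returns the table markup as a string.
mars_facts_html = facts_df.to_html(classes="table table-striped")
print(mars_facts_html)
```

In the notebook, the same last step would be applied to table[0] after naming its columns.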


In [36]:
#Mars Hemispheres

#- Visit the USGS Astrogeology site here to obtain high resolution images for each of Mars's hemispheres.
#- You will need to click each of the links to the hemispheres in order to find the image url to the full resolution image.
#- Save both the image url string for the full resolution hemisphere image, and the Hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys img_url and title.
#- Append the dictionary with the image url string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

In [37]:
usgs_url = "https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars"
browser.visit(usgs_url)

In [38]:
usgs_html = browser.html
usgs_bs = bs(usgs_html, 'html.parser')
mars_hemispheres=[]

In [39]:
# OK, I'm a bit lost from here.
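
One possible way to continue is to let Splinter click through each result, read the title and the full-resolution link, then go back. This is a sketch under several assumptions about the USGS page that are not confirmed anywhere in this notebook: that each result link matches a.product-item h3, that the detail page title sits in h2.title, and that the full-resolution image is the link labeled "Sample". The function name collect_hemispheres is also made up. It uses browser.links.find_by_text, which recent Splinter releases provide (older releases expose find_link_by_text instead).

```python
def collect_hemispheres(browser, results_url):
    """Visit each hemisphere page and record its title and full-image URL."""
    hemispheres = []
    browser.visit(results_url)

    # Count the result links once; the elements themselves go stale
    # as soon as the browser navigates away from the results page.
    link_count = len(browser.find_by_css("a.product-item h3"))

    for index in range(link_count):
        # Re-query on every pass because back() reloads the results page.
        browser.find_by_css("a.product-item h3")[index].click()

        # Selectors and link text below are assumptions about the page layout.
        title = browser.find_by_css("h2.title").first.text
        img_url = browser.links.find_by_text("Sample").first["href"]

        hemispheres.append({"title": title, "img_url": img_url})
        browser.back()

    return hemispheres
```

With the browser from the earlier cells, the call would be mars_hemispheres = collect_hemispheres(browser, usgs_url). If the selectors do not match the live page, the list simply comes back empty, which is a useful signal to inspect the page HTML.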