# Mission to Mars

In this assignment we build a web application that scrapes several websites for data related to the Mission to Mars and displays the results on a single HTML page.

### Step 1 - Scraping

First, let's install the required libraries and import the dependencies:

In [47]:
#!pip install 'selenium==3.5.0'
#!pip install 'splinter==0.7.6'


In [109]:
import pandas as pd
from bs4 import BeautifulSoup as bs
import time 
from splinter import Browser
from flask import Flask, jsonify, request
from werkzeug.wrappers import Request, Response

#### NASA Mars News

In [49]:
# Open a blank window of Google Chrome.
browser = Browser("chrome", headless=False)

In [50]:
# Visit the NASA newspage using the blank Chrome window. 
nasa_news_url = "https://mars.nasa.gov/news/"
browser.visit(nasa_news_url)

In [51]:
# Get the page's HTML and parse it into a BeautifulSoup object.
html = browser.html
soup = bs(html,"html.parser")

In [52]:
news_title  = soup.find("div",class_="content_title").text
paragraph_text = soup.find("div", class_="article_teaser_body").text
print(f"Title: {news_title}")
print(f"Para: {paragraph_text}")


Title: NASA's InSight Passes Halfway to Mars, Instruments Check In
Para: NASA's InSight spacecraft, en route to a Nov. 26 landing on Mars, passed the halfway mark on Aug. 6. All of its instruments have been tested and are working well.
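`soup.find` returns `None` when no element matches, so calling `.text` on the result fails with an `AttributeError` if the page layout changes or has not finished loading. Here is a minimal sketch of a more defensive lookup. The helper name `first_text` and the sample HTML are made up for illustration; only the two class names come from the cell above.

```python
from bs4 import BeautifulSoup

# Made-up HTML that mimics the two class names used above.
sample_html = """
<ul>
  <li class="slide">
    <div class="content_title"><a href="/news/1">  First headline </a></div>
    <div class="article_teaser_body">First teaser.</div>
  </li>
</ul>
"""


def first_text(soup, tag, css_class):
    """Return the stripped text of the first matching element, or None."""
    node = soup.find(tag, class_=css_class)
    return node.get_text(strip=True) if node is not None else None


soup = BeautifulSoup(sample_html, "html.parser")
print(first_text(soup, "div", "content_title"))
print(first_text(soup, "div", "no_such_class"))
```

Returning `None` instead of raising lets the caller decide whether to retry, wait longer, or skip the item.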


#### JPL Mars Space Images - Featured Image

In [62]:
# Visit the JPL site which includes the featured image and extract the html code.  
jpl_image_url = "https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars"
browser.visit(jpl_image_url)

In [67]:
html = browser.html
soup = bs(html,"html.parser")

featured_image_url = soup.find('a', {'id': 'full_image', 'data-fancybox-href': True}).get('data-fancybox-href') 
featured_image_url


'/spaceimages/images/mediumsize/PIA13664_ip.jpg'

In [71]:
split_url = featured_image_url.split('/')

In [75]:
pia_url = split_url[-1]
pia_url

'PIA13664_ip.jpg'

In [87]:
base_image_url = "https://photojournal.jpl.nasa.gov/jpeg/"

In [88]:
pia_final = pia_url.split('_')[0]+'.jpg'

We explored the website to find the full-resolution image. We clicked the "Full Image" button and then, on the next page, the "More Information" button. The page that follows has a link labeled "Full Res (jpg)", which is the full-resolution JPEG version of the image. Both that page and the full-resolution image are hosted at https://photojournal.jpl.nasa.gov/, and the image URLs contain /jpeg/.


In [91]:
full_image_url = base_image_url + pia_final
full_image_url

'https://photojournal.jpl.nasa.gov/jpeg/PIA13664.jpg'
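The URL construction in the cells above can be condensed into one helper. This is a sketch under the assumption that every featured image file is named `<PIA id>_<suffix>.jpg`, as in the single example we saw. The function name `full_res_url` is ours, not part of any library.

```python
import posixpath
from urllib.parse import urlparse

# Base of the full-resolution JPEGs, as observed on photojournal.jpl.nasa.gov.
PHOTOJOURNAL_JPEG_BASE = "https://photojournal.jpl.nasa.gov/jpeg/"


def full_res_url(featured_path, base=PHOTOJOURNAL_JPEG_BASE):
    """Map a JPL 'mediumsize' image path to its Photojournal full-res URL.

    Assumes the file name has the form '<PIA id>_<suffix>.jpg'.
    """
    # Keep only the file name, e.g. 'PIA13664_ip.jpg'.
    file_name = posixpath.basename(urlparse(featured_path).path)
    stem, _ext = posixpath.splitext(file_name)
    # Drop the size suffix after the first underscore: 'PIA13664_ip' -> 'PIA13664'.
    pia_id = stem.split("_")[0]
    if not pia_id.startswith("PIA"):
        raise ValueError(f"Unexpected image file name: {file_name!r}")
    return f"{base}{pia_id}.jpg"


print(full_res_url("/spaceimages/images/mediumsize/PIA13664_ip.jpg"))
# prints https://photojournal.jpl.nasa.gov/jpeg/PIA13664.jpg
```

The `PIA` check makes the helper fail loudly if the site ever serves a file name that does not follow the assumed pattern, instead of silently building a broken URL.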

#### Mars Weather

In [92]:
mars_weather_twitter_url = "https://twitter.com/marswxreport?lang=en"
browser.visit(mars_weather_twitter_url)

html = browser.html
soup = bs(html,"html.parser")

mars_weather = soup.find('p', class_='TweetTextSize TweetTextSize--normal js-tweet-text tweet-text').text
mars_weather


'Sol 2147 (2018-08-21), high -15C/5F, low -68C/-90F, pressure at 8.70 hPa, daylight 05:30-17:44'
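The first tweet on the timeline is not guaranteed to be a weather report, and the raw string is awkward to display field by field. Here is a sketch that parses the report into a dictionary and returns `None` for anything else. The pattern is inferred from the single tweet shown above, so treat the format as an assumption. `parse_mars_weather` is a hypothetical helper name.

```python
import re

# Pattern inferred from one sample tweet; other reports may differ.
WEATHER_PATTERN = re.compile(
    r"Sol (?P<sol>\d+) \((?P<date>\d{4}-\d{2}-\d{2})\), "
    r"high (?P<high_c>-?\d+)C/-?\d+F, "
    r"low (?P<low_c>-?\d+)C/-?\d+F, "
    r"pressure at (?P<pressure_hpa>\d+(?:\.\d+)?) hPa"
)


def parse_mars_weather(tweet_text):
    """Return the weather fields as a dict, or None if the text does not match."""
    match = WEATHER_PATTERN.search(tweet_text)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "sol": int(fields["sol"]),
        "date": fields["date"],
        "high_c": int(fields["high_c"]),
        "low_c": int(fields["low_c"]),
        "pressure_hpa": float(fields["pressure_hpa"]),
    }


sample = ("Sol 2147 (2018-08-21), high -15C/5F, low -68C/-90F, "
          "pressure at 8.70 hPa, daylight 05:30-17:44")
print(parse_mars_weather(sample))
```

With this in place, we could loop over several tweets and keep the first one for which the function returns a dictionary.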

#### Mars Facts

In [93]:
mars_facts_url = "https://space-facts.com/mars/"

In [97]:
mars_facts_tb1 = pd.read_html(mars_facts_url)[0]
mars_facts_tb1.columns=['Physical Properties', 'Values']
mars_facts_tb1

|   | Physical Properties | Values |
|---|---------------------|--------|
| 0 | Equatorial Diameter: | 6,792 km |
| 1 | Polar Diameter: | 6,752 km |
| 2 | Mass: | 6.42 x 10^23 kg (10.7% Earth) |
| 3 | Moons: | 2 (Phobos & Deimos) |
| 4 | Orbit Distance: | 227,943,824 km (1.52 AU) |
| 5 | Orbit Period: | 687 days (1.9 years) |
| 6 | Surface Temperature: | -153 to 20 °C |
| 7 | First Record: | 2nd millennium BC |
| 8 | Recorded By: | Egyptian astronomers |


In [102]:
mars_facts_tb1.to_html().replace('\n','')

'<table border="1" class="dataframe">  <thead>    <tr style="text-align: right;">      <th></th>      <th>Physical Properties</th>      <th>Values</th>    </tr>  </thead>  <tbody>    <tr>      <th>0</th>      <td>Equatorial Diameter:</td>      <td>6,792 km</td>    </tr>    <tr>      <th>1</th>      <td>Polar Diameter:</td>      <td>6,752 km</td>    </tr>    <tr>      <th>2</th>      <td>Mass:</td>      <td>6.42 x 10^23 kg (10.7% Earth)</td>    </tr>    <tr>      <th>3</th>      <td>Moons:</td>      <td>2 (Phobos &amp; Deimos)</td>    </tr>    <tr>      <th>4</th>      <td>Orbit Distance:</td>      <td>227,943,824 km (1.52 AU)</td>    </tr>    <tr>      <th>5</th>      <td>Orbit Period:</td>      <td>687 days (1.9 years)</td>    </tr>    <tr>      <th>6</th>      <td>Surface Temperature:</td>      <td>-153 to 20 °C</td>    </tr>    <tr>      <th>7</th>      <td>First Record:</td>      <td>2nd millennium BC</td>    </tr>    <tr>      <th>8</th>      <td>Recorded By:</td>      <td>Egyptia
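The generated HTML includes the numeric row index (`<th>0</th>`, `<th>1</th>`, ...), which adds nothing on the final page. Here is a small sketch of how it could be left out with the `index=False` argument of `DataFrame.to_html`. The three-row frame is a shortened stand-in for the scraped table, not the real scrape.

```python
import pandas as pd

# Shortened stand-in for the scraped Mars facts table.
facts = pd.DataFrame(
    {
        "Physical Properties": ["Equatorial Diameter:", "Polar Diameter:", "Moons:"],
        "Values": ["6,792 km", "6,752 km", "2 (Phobos & Deimos)"],
    }
)

# index=False omits the 0, 1, 2, ... row labels from the rendered table.
facts_html = facts.to_html(index=False).replace("\n", "")
print(facts_html)
```

An alternative would be `set_index("Physical Properties")` before rendering, which turns the property names into the row headers instead.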

#### Mars Hemispheres

In [104]:
# URL of the search results page that lists the four hemispheres.
mars_hemi_search_url = "https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars"
browser.visit(mars_hemi_search_url)

html = browser.html
soup = bs(html,"html.parser")

In [110]:
# Loop over each class="item" result: click its h3 link, then collect the title and image URL.

images = soup.find('div', class_='collapsible results')
mars_hemi_urls = []

for i in range(len(images.find_all("div", class_="item"))):
    # Give the page time to load before looking up the links.
    time.sleep(5)
    # Re-query the h3 elements on every pass, since we navigate away and back each time.
    image = browser.find_by_tag('h3')
    image[i].click()
    html = browser.html
    soup = bs(html, 'html.parser')
    title = soup.find("h2", class_="title").text
    # The first link in the "downloads" block is the image URL we want.
    div = soup.find("div", class_="downloads")
    link = div.find('a')
    url = link.attrs['href']
    hemispheres = {
        'title': title,
        'img_url': url
    }
    mars_hemi_urls.append(hemispheres)
    browser.back()
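The fixed `time.sleep(5)` in the loop waits five seconds on every pass, even when the page is ready sooner, and still fails if loading takes longer. One alternative is to poll for a condition with a timeout. Here is a standard-library sketch of that idea. `wait_until` is a hypothetical helper, and the `page_ready` function below only simulates a page that becomes ready on the third check.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Call predicate() repeatedly until it is truthy or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a page that becomes ready on the third check.
attempts = {"count": 0}


def page_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until(page_ready, timeout=2.0, interval=0.01)
print(attempts["count"])  # prints 3
```

In the notebook, the predicate could be something like `lambda: browser.is_element_present_by_tag('h3')`. We have not tested that wiring here, so treat it as an assumption.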

In [111]:
mars_hemi_urls

[{'title': 'Cerberus Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif/full.jpg'},
 {'title': 'Schiaparelli Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif/full.jpg'},
 {'title': 'Syrtis Major Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/syrtis_major_enhanced.tif/full.jpg'},
 {'title': 'Valles Marineris Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/valles_marineris_enhanced.tif/full.jpg'}]