# Mission to Mars

## Step 1 - Scraping
- Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.
- Create a Jupyter Notebook file called mission_to_mars.ipynb and use this to complete all of your scraping and analysis tasks. The following outlines what you need to scrape.

In [1]:
# Dependencies
from bs4 import BeautifulSoup as bs
from splinter import Browser
import time
import pandas as pd
import requests

#### NASA Mars News
Scrape the [NASA Mars News Site](https://mars.nasa.gov/news/) and collect the latest news title and paragraph text. Assign the text to variables that you can reference later.

In [2]:
# URL of page to be scraped
nasaURL = 'https://mars.nasa.gov/news/'

# Retrieve page with the requests module
nasaResponse = requests.get(nasaURL)
# Create BeautifulSoup object; parse with 'html.parser'
nasaSoup = bs(nasaResponse.text, 'html.parser')

In [3]:
newsTitle = nasaSoup.find('div', {'class' : 'content_title'}).find('a').get_text().strip()
newsP = nasaSoup.find('div', {'class' : 'rollover_description_inner'}).get_text().strip()
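The chained `find(...)` calls above raise an `AttributeError` if the page layout changes and a lookup returns `None`. A minimal sketch of a more defensive lookup follows; the helper name `first_text` and the sample markup are hypothetical and only imitate the class names used in this notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the NASA news markup; the real page may differ.
SAMPLE_HTML = """
<div class="content_title"><a href="/news/1">  Example Title  </a></div>
<div class="rollover_description_inner">  Example teaser text.  </div>
"""


def first_text(soup, tag, class_name, child=None):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = soup.find(tag, {'class': class_name})
    if node is not None and child is not None:
        node = node.find(child)
    return node.get_text().strip() if node is not None else None


sample_soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
sample_title = first_text(sample_soup, 'div', 'content_title', child='a')
sample_teaser = first_text(sample_soup, 'div', 'rollover_description_inner')
missing = first_text(sample_soup, 'div', 'no_such_class')
```

With this approach a missing element yields `None`, which the caller can check before storing the value.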

In [4]:
newsTitle

'Nearly a Decade After Mars Phoenix Landed, Another Look'

In [5]:
newsP

"A recent view from Mars orbit of the site where NASA's Phoenix Mars mission landed on far-northern Mars nearly a decade ago captures changes."

#### JPL Mars Space Images - Featured Image
- Visit the URL for JPL's Featured Space Image [here](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars).
- Use Splinter to navigate the site, find the image URL for the current Featured Mars Image, and assign the URL string to a variable called featured_image_url.
- Make sure to find the image URL of the full-size .jpg image.
- Make sure to save a complete URL string for this image.

In [6]:
# Set paths
executable_path = {'executable_path': '/usr/local/Cellar/chromedriver/2.35/bin/chromedriver'}
browser = Browser('chrome', **executable_path, headless=False)
jplURL = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'

In [7]:
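# Navigate to the featured image's detail page
browser.visit(jplURL)
# Open the full-image overlay and give it time to load
browser.click_link_by_partial_text('FULL IMAGE')
time.sleep(3)
# Follow the 'more info' link to the page that hosts the full-size image
browser.click_link_by_partial_text('more info')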

In [8]:
# Get HTML from final page
featureArticle = browser.html

# Create BeautifulSoup object; parse with 'html.parser'
jplSoup = bs(featureArticle,'html.parser')

# Find image source tag
featuredImg = jplSoup.find('figure', {'class' : 'lede'}).find('img')
featuredImgSrc = featuredImg['src']

# Create full URL
featuredImgURL = 'https://www.jpl.nasa.gov' + featuredImgSrc 

featuredImgURL

'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA18847_hires.jpg'
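Concatenating the host and the `src` attribute works when `src` is a root-relative path, as it is here. As a hedged alternative, the standard library's `urljoin` also handles the case where `src` is already an absolute URL. The variable names below are illustrative, and the `example.com` address is a placeholder.

```python
from urllib.parse import urljoin

base_url = 'https://www.jpl.nasa.gov'
relative_src = '/spaceimages/images/largesize/PIA18847_hires.jpg'

# A root-relative path is resolved against the base URL.
full_url = urljoin(base_url, relative_src)

# An absolute URL is returned unchanged (placeholder address).
already_absolute = urljoin(base_url, 'https://example.com/image.jpg')
```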

#### Mars Weather
Visit the Mars Weather Twitter account [here](https://twitter.com/marswxreport?lang=en) and scrape the latest Mars weather tweet from the page. Save the tweet text for the weather report as a variable called mars_weather.

In [9]:
# URL of page to be scraped
twitterURL = 'https://twitter.com/marswxreport'

# Retrieve page with the requests module
twitterResponse = requests.get(twitterURL)
# Create BeautifulSoup object; parse with 'html.parser'
twitterSoup = bs(twitterResponse.text, 'html.parser')

In [11]:
marsWeather = twitterSoup\
              .find('li', {'class' : 'stream-item'})\
              .find('div', {'class' : 'js-tweet-text-container'})\
              .get_text().strip()

marsWeather

'Sol 1967 (Feb 17, 2018), Sunny, high -15C/5F, low -76C/-104F, pressure at 7.34 hPa, daylight 05:39-17:26'
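The first item in the stream is not guaranteed to be a weather report, so it can help to check the text before storing it. The sketch below assumes every report follows the format of the tweet shown above; the pattern and the `parse_weather` helper are hypothetical and would need adjusting if the account changes its wording.

```python
import re

# Pattern modeled on the single example tweet above; other formats will not match.
WEATHER_PATTERN = re.compile(
    r'Sol (?P<sol>\d+) \((?P<date>[^)]+)\), (?P<sky>[^,]+), '
    r'high (?P<high_c>-?\d+)C/(?P<high_f>-?\d+)F, '
    r'low (?P<low_c>-?\d+)C/(?P<low_f>-?\d+)F, '
    r'pressure at (?P<pressure_hpa>[\d.]+) hPa'
)


def parse_weather(tweet_text):
    """Return a dict of weather fields, or None if the text is not a report."""
    match = WEATHER_PATTERN.search(tweet_text)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        'sol': int(fields['sol']),
        'date': fields['date'],
        'sky': fields['sky'],
        'high_c': int(fields['high_c']),
        'low_c': int(fields['low_c']),
        'pressure_hpa': float(fields['pressure_hpa']),
    }


sample_tweet = ('Sol 1967 (Feb 17, 2018), Sunny, high -15C/5F, low -76C/-104F, '
                'pressure at 7.34 hPa, daylight 05:39-17:26')
report = parse_weather(sample_tweet)
not_a_report = parse_weather('Not a weather tweet')
```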

#### Mars Facts
- Visit the Mars Facts webpage [here](http://space-facts.com/mars/) and use Pandas to scrape the table containing facts about the planet, including Diameter, Mass, etc.
- Use Pandas to convert the data to an HTML table string.

In [12]:
# Read URL Tables using pandas
factsURL = 'https://space-facts.com/mars/'
tables = pd.read_html(factsURL)

tables

[                      0                              1
 0  Equatorial Diameter:                       6,792 km
 1       Polar Diameter:                       6,752 km
 2                 Mass:  6.42 x 10^23 kg (10.7% Earth)
 3                Moons:            2 (Phobos & Deimos)
 4       Orbit Distance:       227,943,824 km (1.52 AU)
 5         Orbit Period:           687 days (1.9 years)
 6  Surface Temperature:                  -153 to 20 °C
 7         First Record:              2nd millennium BC
 8          Recorded By:           Egyptian astronomers]

In [13]:
# Make it into a dataframe
df = tables[0]
df.columns = ['Profile','Fact']

df.head()

                Profile                           Fact
0  Equatorial Diameter:                       6,792 km
1       Polar Diameter:                       6,752 km
2                 Mass:  6.42 x 10^23 kg (10.7% Earth)
3                Moons:            2 (Phobos & Deimos)
4       Orbit Distance:       227,943,824 km (1.52 AU)


In [20]:
factsDict = df.to_dict('records')
factsDict

[{'Fact': '6,792 km', 'Profile': 'Equatorial Diameter:'},
 {'Fact': '6,752 km', 'Profile': 'Polar Diameter:'},
 {'Fact': '6.42 x 10^23 kg (10.7% Earth)', 'Profile': 'Mass:'},
 {'Fact': '2 (Phobos & Deimos)', 'Profile': 'Moons:'},
 {'Fact': '227,943,824 km (1.52 AU)', 'Profile': 'Orbit Distance:'},
 {'Fact': '687 days (1.9 years)', 'Profile': 'Orbit Period:'},
 {'Fact': '-153 to 20 °C', 'Profile': 'Surface Temperature:'},
 {'Fact': '2nd millennium BC', 'Profile': 'First Record:'},
 {'Fact': 'Egyptian astronomers', 'Profile': 'Recorded By:'}]
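The task list for this section also asks for an HTML table string, which the cells above do not produce. One way to get it, sketched here under the assumption that a plain table without the index column is wanted, is `DataFrame.to_html`. The frame is rebuilt from a few of the records above so the example runs without network access; the names `facts_records`, `facts_df`, and `facts_html` are illustrative.

```python
import pandas as pd

# A few of the records scraped above, repeated so the sketch is self-contained.
facts_records = [
    {'Profile': 'Equatorial Diameter:', 'Fact': '6,792 km'},
    {'Profile': 'Polar Diameter:', 'Fact': '6,752 km'},
    {'Profile': 'Mass:', 'Fact': '6.42 x 10^23 kg (10.7% Earth)'},
]

facts_df = pd.DataFrame(facts_records, columns=['Profile', 'Fact'])

# index=False drops the 0, 1, 2, ... row labels from the rendered table.
facts_html = facts_df.to_html(index=False)
```

In the notebook itself, the same call could be applied to `df` directly.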

#### Mars Hemispheres
- Visit the USGS Astrogeology site [here](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars) to obtain high-resolution images for each of Mars's hemispheres.
- You will need to click each of the links to the hemispheres in order to find the image URL of the full-resolution image.
- Save both the image URL string for the full-resolution hemisphere image and the hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys img_url and title.
- Append the dictionary with the image URL string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

In [15]:
usgsURL = 'https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars'

# Retrieve page with the requests module
usgsResponse = requests.get(usgsURL)
# Create BeautifulSoup object; parse with 'html.parser'
usgsSoup = bs(usgsResponse.text, 'html.parser')

In [16]:
# Find hemisphere listings
usgsResults = usgsSoup.find_all('div', {'class' : 'item'})

In [17]:
hemispherePhotos = []

# Iterate through results and get hemisphere name and high-res image url
for result in usgsResults:
    # Create detail page URL
    src = result.a['href']
    url = 'https://astrogeology.usgs.gov' + src
    
    # Retrieve page with the requests module
    response = requests.get(url)
    # Create BeautifulSoup object; parse with 'html.parser'
    soup = bs(response.text, 'html.parser')
    
    # Get hemisphere name; the slice drops the last 9 characters (presumably ' Enhanced')
    hemisphere = soup.find('div', {'class' : 'content'}).h2.get_text().strip()
    hemisphereName = hemisphere[:-9]
    
    # Get high-res image URL; the second download entry links to the .tif file
    downloads = soup.find('div', {'class' : 'downloads'}).find_all('li')
    link = downloads[1].a['href']
    
    hemispherePhotos.append({'title' : hemisphereName,
                             'img_url' : link})

hemispherePhotos

[{'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif',
  'title': 'Cerberus Hemisphere'},
 {'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif',
  'title': 'Schiaparelli Hemisphere'},
 {'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/syrtis_major_enhanced.tif',
  'title': 'Syrtis Major Hemisphere'},
 {'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/valles_marineris_enhanced.tif',
  'title': 'Valles Marineris Hemisphere'}]
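The `hemisphere[:-9]` slice in the loop above silently cuts off real characters if a heading ever lacks the expected ending. A small sketch of a safer alternative follows. It assumes the page headings end in `' Enhanced'`, which is inferred from the slice length and the resulting titles rather than confirmed by the source; `strip_suffix` is a hypothetical helper.

```python
def strip_suffix(title, suffix=' Enhanced'):
    """Remove a trailing suffix if present; otherwise return the title unchanged."""
    if title.endswith(suffix):
        return title[:-len(suffix)]
    return title


# Assumed heading text, modeled on the titles in the output above.
with_suffix = strip_suffix('Cerberus Hemisphere Enhanced')
without_suffix = strip_suffix('Cerberus Hemisphere')
```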

In [21]:
marsinfo = {}

marsinfo['latestnews'] = {'title' : newsTitle, 'subtitle': newsP}
marsinfo['featuredimage'] = featuredImgURL
marsinfo['weather'] = marsWeather
marsinfo['planetfacts'] = factsDict
marsinfo['hemispheres'] = hemispherePhotos

In [24]:
import pprint

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(marsinfo)

{   'featuredimage': 'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA18847_hires.jpg',
    'hemispheres': [   {   'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif',
                           'title': 'Cerberus Hemisphere'},
                       {   'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif',
                           'title': 'Schiaparelli Hemisphere'},
                       {   'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/syrtis_major_enhanced.tif',
                           'title': 'Syrtis Major Hemisphere'},
                       {   'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/valles_marineris_enhanced.tif',
                           'title': 'Valles Marineris Hemisphere'}],
    'latestnews': {   'subtitle': 'A recent view from Mars orbit of the site '
                                  "where NASA's Phoenix

In [26]:
for hem in marsinfo['hemispheres']:
    print(hem['title'])

Cerberus Hemisphere
Schiaparelli Hemisphere
Syrtis Major Hemisphere
Valles Marineris Hemisphere
