# Mission to Mars
## University of Denver Data Analytics Bootcamp - Week 12
### Matthew Stewart
### May 12, 2019
---

### Initial Information

This Jupyter notebook is the first half of the DU Data Analytics Bootcamp Week 12 homework. In this notebook, we will scrape information from the following sites:
* [NASA Mars News](https://mars.nasa.gov/news/?page=0&per_page=40&order=publish_date+desc%2Ccreated_at+desc&search=&category=19%2C165%2C184%2C204&blank_scope=Latest)
* [JPL Mars Space Images](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars)
* [Mars Weather (Twitter)](https://twitter.com/marswxreport?lang=en)
* [Mars Facts](https://space-facts.com/mars/)
* [Mars Hemispheres](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars)
---

### Initial Setup

In [198]:
# Dependencies
from bs4 import BeautifulSoup
import requests
import pandas as pd
from splinter import Browser
import time

---

### NASA Mars News

In this section, we will visit the NASA Mars News site and scrape the title and paragraph text from the latest news story.

In [199]:
# URL
mars_news_url = 'https://mars.nasa.gov/news/?page=0&per_page=40&order=publish_date+desc%2Ccreated_at+desc&search=&category=19%2C165%2C184%2C204&blank_scope=Latest'

In [200]:
# Make request to URL and create BeautifulSoup object
mars_news_response = requests.get(mars_news_url)
mars_news_soup = BeautifulSoup(mars_news_response.text, 'html.parser')

In [201]:
# Scrape for title and paragraph text
mars_news_latest = mars_news_soup.select('.slide')[0]
mars_news_title = mars_news_latest.select('.content_title')[0].text.strip()
mars_news_par = mars_news_latest.select('.rollover_description_inner')[0].text.strip()
print(mars_news_title)
print(mars_news_par)

Why This Martian Full Moon Looks Like Candy
For the first time, NASA's Mars Odyssey orbiter has caught the Martian moon Phobos during a full moon phase. Each color in this new image represents a temperature range detected by Odyssey's infrared camera.
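
The indexing with `[0]` above raises an `IndexError` if the page layout changes and a selector matches nothing. A minimal sketch of a more defensive lookup follows, assuming the same class names as above. The sample markup and the helper names `first_text` and `latest_story` are hypothetical and exist only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mimics the class names used in this notebook
SAMPLE_HTML = """
<div class="slide">
  <div class="content_title"><a href="#"> Sample headline </a></div>
  <div class="rollover_description_inner"> Sample teaser text. </div>
</div>
"""


def first_text(node, selector):
    """Return the stripped text of the first match for selector, or None."""
    match = node.select_one(selector)
    return match.get_text(strip=True) if match is not None else None


def latest_story(html):
    """Return the title and paragraph of the first slide, or None if absent."""
    soup = BeautifulSoup(html, 'html.parser')
    slide = soup.select_one('.slide')
    if slide is None:
        return None
    return {
        'title': first_text(slide, '.content_title'),
        'paragraph': first_text(slide, '.rollover_description_inner'),
    }


story = latest_story(SAMPLE_HTML)
print(story)
```

With this shape, a missing element surfaces as `None` rather than an exception, so the caller can decide whether to retry or skip the story.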


### JPL Mars Space Images

In this section, we will visit the JPL Mars Space Images site and scrape the URL for the latest featured image.  
_(Note: we will use Splinter to do so.)_

In [202]:
# Splinter setup
executable_path = {'executable_path': 'Resources/chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

In [203]:
# URL
jpl_space_images_url = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'

In [204]:
# Use splinter to visit URL and navigate to "full image" page
browser.visit(jpl_space_images_url)
browser.click_link_by_partial_text('FULL IMAGE')

In [205]:
# Further navigation to get to the full-sized image
time.sleep(5)
browser.click_link_by_partial_text('more info')

In [206]:
# Create BeautifulSoup object
html = browser.html
jpl_soup = BeautifulSoup(html, 'html.parser')

In [207]:
# Obtain endpoint for featured image URL (relative path)
featured_image = jpl_soup.select('img.main_image')[0]
featured_image_endpoint = featured_image['src']
print(featured_image_endpoint)

/spaceimages/images/largesize/PIA17356_hires.jpg


In [208]:
# Concatenate the page URL (without its query string) and the endpoint (without its leading '/spaceimages/')
featured_image_url = jpl_space_images_url.split('?')[0] + featured_image_endpoint[13:]
print(featured_image_url)

https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA17356_hires.jpg
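
The slice `[13:]` above depends on the exact length of the `/spaceimages/` prefix. As an alternative sketch, the standard library's `urllib.parse.urljoin` can resolve the relative path against the page URL without any hard-coded offsets. The variable names `page_url` and `image_src` are placeholders that stand in for the values scraped above.

```python
from urllib.parse import urljoin

# Values copied from the cells above
page_url = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'
image_src = '/spaceimages/images/largesize/PIA17356_hires.jpg'

# A path starting with '/' replaces the page's path and query string
featured_image_url = urljoin(page_url, image_src)
print(featured_image_url)
```

This is a swapped-in technique rather than the approach used in the notebook; it produces the same URL for this input.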


In [209]:
browser.quit()

---

### Mars Weather (Twitter)

In this section, we will visit the Mars Weather Twitter site and scrape the text of the latest Mars weather tweet.

In [210]:
# URL
mars_weather_url = 'https://twitter.com/marswxreport?lang=en'

In [211]:
# Make request to URL and create BeautifulSoup object
mars_weather_response = requests.get(mars_weather_url)
mars_weather_soup = BeautifulSoup(mars_weather_response.text, 'html.parser')

In [212]:
# Scrape text
mars_weather_tweet = mars_weather_soup.select('li.js-stream-item')[0]
mars_weather = mars_weather_tweet.select('.js-tweet-text-container p')[0].text
print(mars_weather)

InSight sol 162 (2019-05-12) low -100.2ºC (-148.3ºF) high -20.3ºC (-4.5ºF)
winds from the SW at 4.5 m/s (10.1 mph) gusting to 14.3 m/s (32.0 mph)
pressure at 7.50 hPapic.twitter.com/23uEPf5baF
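
The scraped text runs the media link `pic.twitter.com/...` directly into the pressure reading and keeps the tweet's line breaks. A small sketch of one way to tidy it follows, using only the standard library. The helper name `clean_tweet_text` is hypothetical, and the pattern assumes the link always starts with `pic.twitter.com/`.

```python
import re


def clean_tweet_text(raw):
    """Remove an embedded pic.twitter.com link and collapse whitespace."""
    # Drop the media link that Twitter appends without a separator
    without_link = re.sub(r'pic\.twitter\.com/\S+', '', raw)
    # Replace line breaks and repeated spaces with single spaces
    return ' '.join(without_link.split())


# Tweet text copied from the output above
raw_tweet = (
    'InSight sol 162 (2019-05-12) low -100.2ºC (-148.3ºF) high -20.3ºC (-4.5ºF)\n'
    'winds from the SW at 4.5 m/s (10.1 mph) gusting to 14.3 m/s (32.0 mph)\n'
    'pressure at 7.50 hPapic.twitter.com/23uEPf5baF'
)
print(clean_tweet_text(raw_tweet))
```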


---

### Mars Facts

In this section, we will visit the Mars Facts site and use pandas to scrape the tabular data that includes diameter, mass, etc.

In [213]:
# URL
mars_facts_url = 'https://space-facts.com/mars/'

In [214]:
# Scrape tabular data
mars_facts_tables = pd.read_html(mars_facts_url)

In [215]:
# Create dataframe from tabular data (note that the above scraping yields only a single table)
mars_facts_df = mars_facts_tables[0]
mars_facts_df.columns = ['Description', 'Value']
mars_facts_df.set_index('Description', inplace=True)
mars_facts_df

| Description | Value |
| --- | --- |
| Equatorial Diameter: | 6,792 km |
| Polar Diameter: | 6,752 km |
| Mass: | 6.42 x 10^23 kg (10.7% Earth) |
| Moons: | 2 (Phobos & Deimos) |
| Orbit Distance: | 227,943,824 km (1.52 AU) |
| Orbit Period: | 687 days (1.9 years) |
| Surface Temperature: | -153 to 20 °C |
| First Record: | 2nd millennium BC |
| Recorded By: | Egyptian astronomers |


In [216]:
# Export above dataframe to HTML table
mars_facts_df.to_html('Resources/mars_table.html')

In [217]:
# Convert above dataframe to dictionary for future storage
mars_facts_dict = mars_facts_df.T.to_dict('list')
mars_facts_dict

{'Equatorial Diameter:': ['6,792 km'],
 'Polar Diameter:': ['6,752 km'],
 'Mass:': ['6.42 x 10^23 kg (10.7% Earth)'],
 'Moons:': ['2 (Phobos & Deimos)'],
 'Orbit Distance:': ['227,943,824 km (1.52 AU)'],
 'Orbit Period:': ['687 days (1.9 years)'],
 'Surface Temperature:': ['-153 to 20 °C'],
 'First Record:': ['2nd millennium BC'],
 'Recorded By:': ['Egyptian astronomers']}

In [218]:
# Iterate through each key/value pair and convert the value from a list to just the element in that list
for k, v in mars_facts_dict.items():
    mars_facts_dict[k] = v[0]
mars_facts_dict

{'Equatorial Diameter:': '6,792 km',
 'Polar Diameter:': '6,752 km',
 'Mass:': '6.42 x 10^23 kg (10.7% Earth)',
 'Moons:': '2 (Phobos & Deimos)',
 'Orbit Distance:': '227,943,824 km (1.52 AU)',
 'Orbit Period:': '687 days (1.9 years)',
 'Surface Temperature:': '-153 to 20 °C',
 'First Record:': '2nd millennium BC',
 'Recorded By:': 'Egyptian astronomers'}
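
The unwrapping step above can also be written as a dictionary comprehension that builds a new dictionary instead of overwriting values in place. The sketch below uses a shortened, hand-copied subset of the dictionary shown earlier; the names `nested` and `flat` are placeholders.

```python
# Subset of the list-valued dictionary produced by to_dict('list') above
nested = {
    'Equatorial Diameter:': ['6,792 km'],
    'Polar Diameter:': ['6,752 km'],
    'Moons:': ['2 (Phobos & Deimos)'],
}

# Take the single element out of each one-item list
flat = {key: values[0] for key, values in nested.items()}
print(flat)
```

Because the comprehension leaves `nested` untouched, the original list-valued form remains available if it is needed later.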

---

### Mars Hemispheres

In this section, we will visit the Mars Hemispheres Enhanced Resolutions site and scrape Title/Image URL information for the four Mars hemispheres:
* Cerberus
* Schiaparelli
* Syrtis Major
* Valles Marineris

We will use the Python `requests` library to do so, making one request for each of the four hemispheres.

In [219]:
# Four URLs
mars_hem_cerberus_url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/cerberus_enhanced'
mars_hem_schiaparelli_url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/schiaparelli_enhanced'
mars_hem_syrtis_url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/syrtis_major_enhanced'
mars_hem_valles_url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/valles_marineris_enhanced'

In [220]:
# Four requests
mars_hem_cerberus_response = requests.get(mars_hem_cerberus_url)
mars_hem_schiaparelli_response = requests.get(mars_hem_schiaparelli_url)
mars_hem_syrtis_response = requests.get(mars_hem_syrtis_url)
mars_hem_valles_response = requests.get(mars_hem_valles_url)

In [221]:
# Four BeautifulSoup objects
mars_hem_cerberus_soup = BeautifulSoup(mars_hem_cerberus_response.text, 'html.parser')
mars_hem_schiaparelli_soup = BeautifulSoup(mars_hem_schiaparelli_response.text, 'html.parser')
mars_hem_syrtis_soup = BeautifulSoup(mars_hem_syrtis_response.text, 'html.parser')
mars_hem_valles_soup = BeautifulSoup(mars_hem_valles_response.text, 'html.parser')

In [222]:
# Cerberus scraping
mars_hem_cerberus_content = mars_hem_cerberus_soup.select('.container')[0]

In [223]:
# Cerberus title
mars_hem_cerberus_title = mars_hem_cerberus_content.select('.content h2.title')[0].text.strip()[:-9]
print(mars_hem_cerberus_title)

Cerberus Hemisphere


In [224]:
# Cerberus image URL (the first 29 characters of the page URL are the site root, 'https://astrogeology.usgs.gov')
mars_hem_cerberus_image = mars_hem_cerberus_content.select('img.wide-image')[0]['src']
mars_hem_cerberus_image_url = mars_hem_cerberus_url[:29] + mars_hem_cerberus_image
print(mars_hem_cerberus_image_url)

https://astrogeology.usgs.gov/cache/images/cfa62af2557222a02478f1fcd781d445_cerberus_enhanced.tif_full.jpg


In [225]:
# Schiaparelli scraping
mars_hem_schiaparelli_content = mars_hem_schiaparelli_soup.select('.container')[0]

In [226]:
# Schiaparelli title
mars_hem_schiaparelli_title = mars_hem_schiaparelli_content.select('.content h2.title')[0].text.strip()[:-9]
print(mars_hem_schiaparelli_title)

Schiaparelli Hemisphere


In [227]:
# Schiaparelli image URL
mars_hem_schiaparelli_image = mars_hem_schiaparelli_content.select('img.wide-image')[0]['src']
mars_hem_schiaparelli_image_url = mars_hem_schiaparelli_url[:29] + mars_hem_schiaparelli_image
print(mars_hem_schiaparelli_image_url)

https://astrogeology.usgs.gov/cache/images/3cdd1cbf5e0813bba925c9030d13b62e_schiaparelli_enhanced.tif_full.jpg


In [228]:
# Syrtis Major scraping
mars_hem_syrtis_content = mars_hem_syrtis_soup.select('.container')[0]

In [229]:
# Syrtis Major title
mars_hem_syrtis_title = mars_hem_syrtis_content.select('.content h2.title')[0].text.strip()[:-9]
print(mars_hem_syrtis_title)

Syrtis Major Hemisphere


In [230]:
# Syrtis Major image URL
mars_hem_syrtis_image = mars_hem_syrtis_content.select('img.wide-image')[0]['src']
mars_hem_syrtis_image_url = mars_hem_syrtis_url[:29] + mars_hem_syrtis_image
print(mars_hem_syrtis_image_url)

https://astrogeology.usgs.gov/cache/images/ae209b4e408bb6c3e67b6af38168cf28_syrtis_major_enhanced.tif_full.jpg


In [231]:
# Valles Marineris scraping
mars_hem_valles_content = mars_hem_valles_soup.select('.container')[0]

In [232]:
# Valles Marineris title
mars_hem_valles_title = mars_hem_valles_content.select('.content h2.title')[0].text.strip()[:-9]
print(mars_hem_valles_title)

Valles Marineris Hemisphere


In [233]:
# Valles Marineris image URL
mars_hem_valles_image = mars_hem_valles_content.select('img.wide-image')[0]['src']
mars_hem_valles_image_url = mars_hem_valles_url[:29] + mars_hem_valles_image
print(mars_hem_valles_image_url)

https://astrogeology.usgs.gov/cache/images/7cf2da4bf549ed01c17f206327be4db7_valles_marineris_enhanced.tif_full.jpg


In [234]:
# Create list of dictionaries using above titles and image URLs
hemisphere_image_urls = [
    {
        'title': mars_hem_cerberus_title,
        'img_url': mars_hem_cerberus_image_url
    },
    
    {
        'title': mars_hem_schiaparelli_title,
        'img_url': mars_hem_schiaparelli_image_url
    },
    
    {
        'title': mars_hem_syrtis_title,
        'img_url': mars_hem_syrtis_image_url
    },
    
    {
        'title': mars_hem_valles_title,
        'img_url': mars_hem_valles_image_url
    }
]

In [235]:
hemisphere_image_urls

[{'title': 'Cerberus Hemisphere',
  'img_url': 'https://astrogeology.usgs.gov/cache/images/cfa62af2557222a02478f1fcd781d445_cerberus_enhanced.tif_full.jpg'},
 {'title': 'Schiaparelli Hemisphere',
  'img_url': 'https://astrogeology.usgs.gov/cache/images/3cdd1cbf5e0813bba925c9030d13b62e_schiaparelli_enhanced.tif_full.jpg'},
 {'title': 'Syrtis Major Hemisphere',
  'img_url': 'https://astrogeology.usgs.gov/cache/images/ae209b4e408bb6c3e67b6af38168cf28_syrtis_major_enhanced.tif_full.jpg'},
 {'title': 'Valles Marineris Hemisphere',
  'img_url': 'https://astrogeology.usgs.gov/cache/images/7cf2da4bf549ed01c17f206327be4db7_valles_marineris_enhanced.tif_full.jpg'}]
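
The four hemispheres follow the same steps, so the repeated cells could be folded into a few small helpers. The sketch below covers only the string handling (page URL, title cleanup, image URL) and leaves the request and parsing steps out. The helper names are hypothetical, and it assumes each page title ends in " Enhanced", which is what the `[:-9]` slice above implies.

```python
from urllib.parse import urljoin

BASE_URL = 'https://astrogeology.usgs.gov'
# URL fragments taken from the four page URLs defined above
SLUGS = ['cerberus', 'schiaparelli', 'syrtis_major', 'valles_marineris']


def hemisphere_page_url(slug):
    """Build the detail-page URL for one hemisphere."""
    return f'{BASE_URL}/search/map/Mars/Viking/{slug}_enhanced'


def clean_title(page_title, suffix=' Enhanced'):
    """Strip whitespace and the trailing suffix, if it is present."""
    title = page_title.strip()
    return title[:-len(suffix)] if title.endswith(suffix) else title


def hemisphere_record(page_title, image_src):
    """Combine a scraped title and a relative image path into one dictionary."""
    return {
        'title': clean_title(page_title),
        'img_url': urljoin(BASE_URL, image_src),
    }


page_urls = [hemisphere_page_url(slug) for slug in SLUGS]
example = hemisphere_record(
    'Cerberus Hemisphere Enhanced',
    '/cache/images/cfa62af2557222a02478f1fcd781d445_cerberus_enhanced.tif_full.jpg',
)
print(page_urls[0])
print(example)
```

Checking for the suffix before slicing avoids silently truncating a title if the site ever drops the word "Enhanced".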