## BACKGROUND:
This project scrapes several websites for data about NASA's Mission to Mars and displays the results on a single HTML page. This Jupyter Notebook contains all scraping and analysis tasks.

In [12]:
# Dependencies
import pandas as pd
from splinter import Browser
from bs4 import BeautifulSoup as bs
import requests
import time

### Scrape the [NASA Mars News Site](https://mars.nasa.gov/news/):
Collect the latest news title and paragraph text, and assign them to variables for later use.

In [13]:
# Mars News URL
url = "https://mars.nasa.gov/news/"

# Retrieve page with the requests module
html = requests.get(url)

# Create BeautifulSoup object; parse with 'html.parser'
soup = bs(html.text, 'html.parser')

# Establish a dictionary to store scraped information
news_data = {}

# Get news title & paragraph description (match each div by its CSS class)
news_title = soup.find('div', class_='content_title').get_text().strip()
news_paragraph = soup.find('div', class_='rollover_description_inner').get_text().strip()

# Add the title and description to the dictionary
news_data["news_title"] = news_title
news_data["news_paragraph"] = news_paragraph

# View the news_data dictionary containing title and news paragraph
news_data

{'news_title': "NASA's Curiosity Mars Rover Finds a Clay Cache",
 'news_paragraph': 'The rover recently drilled two samples, and both showed the highest levels of clay ever found during the mission.'}
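
The lookup above raises an `AttributeError` if either element is missing from the page. A minimal sketch of a more defensive version follows; the HTML fragment and the helper name `extract_latest_news` are hypothetical and only mimic the class names used in this notebook, so the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment that mimics the class names used in this notebook;
# the real page markup may differ or change over time.
SAMPLE_HTML = """
<div class="slide">
  <div class="content_title"><a href="/news/1/">Example Title</a></div>
  <div class="rollover_description_inner">Example teaser paragraph.</div>
</div>
"""


def extract_latest_news(html):
    """Return the first title/teaser pair, using None for any missing element."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.find("div", class_="content_title")
    teaser_tag = soup.find("div", class_="rollover_description_inner")
    return {
        "news_title": title_tag.get_text(strip=True) if title_tag else None,
        "news_paragraph": teaser_tag.get_text(strip=True) if teaser_tag else None,
    }


news = extract_latest_news(SAMPLE_HTML)
print(news)
```

Returning `None` instead of raising lets the caller decide whether a missing field should stop the scrape or simply be skipped.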

###  Scrape the [NASA Jet Propulsion Laboratory site](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars):
Direction and Guidance:
* Use splinter to navigate the site (using Chrome), find the image URL for the current Featured Mars Image, and assign the URL string to a variable called `featured_image_url`.
* Make sure to find the image URL for the full-size `.jpg` image.
* Make sure to save a complete URL string for this image.
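
Concatenating a base URL and an href by hand can produce a doubled or missing slash. As a minimal sketch of the "complete URL string" requirement, the standard library's `urljoin` resolves a site-relative path against a base. The helper name `to_absolute` is hypothetical, and the sample path is only an example of the form the site returned when this notebook was run.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.jpl.nasa.gov/"


def to_absolute(href, base=BASE_URL):
    """Resolve a site-relative href against the base URL."""
    return urljoin(base, href)


# Sample relative path (an assumption for illustration, not a live lookup)
relative_href = "/spaceimages/images/mediumsize/PIA00271_ip.jpg"
featured_image_url = to_absolute(relative_href)
print(featured_image_url)
```

`urljoin` also leaves an already absolute URL unchanged, so the same helper works whether the page returns relative or absolute links.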

In [16]:
# Use splinter to navigate the JPL site and find the image url for the current 'featured_image' 
# Note: Ensure chromedriver.exe is in the CWD
executable_path = {'executable_path': 'chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False) # False to see the Chrome Browser open

# Open the JPL url using Chrome
JPL_url = "https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars"
browser.visit(JPL_url) # opens up the JPL website

JPL_html = browser.html  # grab the rendered page source as HTML
JPL_soup = bs(JPL_html, 'html.parser')

# Get the featured item
featured = JPL_soup.find('div', class_='default floating_text_area ms-layer')
featured_image = featured.find('footer')
# The href already starts with '/', so the base URL must not end with one
featured_image_url = 'https://www.jpl.nasa.gov' + featured_image.find('a')['data-fancybox-href']

print(featured_image_url)


https://www.jpl.nasa.gov/spaceimages/images/mediumsize/PIA00271_ip.jpg


### Scrape the [Mars Weather Twitter account](https://twitter.com/marswxreport?lang=en):
Collect the latest Mars weather tweet from the Mars Weather Twitter account and save its text in a variable called `mars_weather`. Note: the latest tweet changes often, so extract only the weather report, as in this example: `mars_weather = 'Sol 1801 (Aug 30, 2017), Sunny, high -21C/-5F, low -80C/-112F, pressure at 8.82 hPa, daylight 06:09-17:55'`

In [41]:
# Mars Weather Twitter account URL
mars_twitter_url = 'https://twitter.com/marswxreport?lang=en'
mars_twitter_response = requests.get(mars_twitter_url)

# Retrieve 'mars_weather'
mars_twitter_soup = bs(mars_twitter_response.text, 'html.parser')
mars_twitter_result = mars_twitter_soup.find('div', class_='js-tweet-text-container')

# Assign the scraped text to a variable 'mars_weather'
mars_weather = mars_twitter_result.find('p', class_='js-tweet-text').text
mars_weather


'Temps range between 250ºF and -280ºF on the lunar surface. The @WeatherChannel shares more. Nice pocket protector there @JimCantorehttps://www.youtube.com/watch?v=VHPCO-33yCU\xa0…'
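
Because the most recent tweet is not always a weather report, one option is to scan several tweet texts and keep the first that looks like one. The sketch below assumes that reports begin with `Sol <number>`, as in the example given in this section; the pattern and the helper name `first_weather_report` are assumptions, not part of the original notebook.

```python
import re

# Assumed pattern: a weather report starts with "Sol <number>", as in the
# example in this section. Tweets in other formats will not match.
WEATHER_PATTERN = re.compile(r"^sol\s+\d+", re.IGNORECASE)


def first_weather_report(tweets):
    """Return the first tweet text that looks like a weather report, else None."""
    for text in tweets:
        cleaned = text.strip()
        if WEATHER_PATTERN.match(cleaned):
            return cleaned
    return None


# Sample texts: one unrelated tweet followed by one weather report
tweets = [
    "Temps range between 250ºF and -280ºF on the lunar surface.",
    "Sol 1801 (Aug 30, 2017), Sunny, high -21C/-5F, low -80C/-112F, "
    "pressure at 8.82 hPa, daylight 06:09-17:55",
]
mars_weather = first_weather_report(tweets)
print(mars_weather)
```

In the notebook, the list could be built from every `js-tweet-text` paragraph returned by `find_all` rather than only the first match.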

### Scrape the [Mars Facts Web page](https://space-facts.com/mars/):
Use pandas to scrape the table of facts about the planet, including diameter, mass, etc.
Then use pandas to convert the data to an HTML table string.

In [67]:
# Mars Facts Table
mars_facts_url = 'https://space-facts.com/mars/'

mars_facts_table = pd.read_html(mars_facts_url, index_col=0, flavor=['lxml', 'bs4'])
mars_facts_table

[                                    Mars            Earth
 Mars - Earth Comparison                                  
 Diameter:                       6,779 km        12,742 km
 Mass:                    6.39 × 10^23 kg  5.97 × 10^24 kg
 Moons:                                 2                1
 Distance from Sun:        227,943,824 km   149,598,262 km
 Length of Year:           687 Earth days      365.24 days
 Temperature:               -153 to 20 °C      -88 to 58°C,
                                                   1
 0                                                  
 Equatorial Diameter:                       6,792 km
 Polar Diameter:                            6,752 km
 Mass:                 6.39 × 10^23 kg (0.11 Earths)
 Moons:                          2 (Phobos & Deimos)
 Orbit Distance:            227,943,824 km (1.38 AU)
 Orbit Period:                  687 days (1.9 years)
 Surface Temperature:                   -87 to -5 °C
 First Record:                     2nd millennium 

In [83]:
# Create pandas DataFrame for the first table, "Mars - Earth Comparison"
Mars_Earth_Comparison_df = mars_facts_table[0]
Mars_Earth_Comparison_df.columns = ['Mars', 'Earth']
Mars_Earth_Comparison_df


                                    Mars            Earth
Mars - Earth Comparison                                  
Diameter:                       6,779 km        12,742 km
Mass:                    6.39 × 10^23 kg  5.97 × 10^24 kg
Moons:                                 2                1
Distance from Sun:        227,943,824 km   149,598,262 km
Length of Year:           687 Earth days      365.24 days
Temperature:               -153 to 20 °C      -88 to 58°C


In [103]:
# Convert the data to an HTML table string
# (str.replace returns a new string, so the result must be reassigned)
Mars_facts = Mars_Earth_Comparison_df.to_html()
Mars_facts = Mars_facts.replace("\n", "")

Mars_Earth_Comparison_df.to_html('Output/mars_facts.html')


In [95]:
# Create pandas DataFrame for the second table (other Mars facts)
Mars_df = mars_facts_table[1]
Mars_df.columns = ['Other Mars Facts']
Mars_df

                                   Other Mars Facts
0
Equatorial Diameter:                       6,792 km
Polar Diameter:                            6,752 km
Mass:                 6.39 × 10^23 kg (0.11 Earths)
Moons:                          2 (Phobos & Deimos)
Orbit Distance:            227,943,824 km (1.38 AU)
Orbit Period:                  687 days (1.9 years)
Surface Temperature:                   -87 to -5 °C
First Record:                     2nd millennium BC
Recorded By:                   Egyptian astronomers


In [104]:
# Convert the data to a HTML table string.
# Strip newlines from the HTML string (not from the DataFrame) and keep the result
Other_Mars_facts = Mars_df.to_html().replace("\n", "")

Mars_df.to_html('Output/other_mars_facts.html')
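
The DataFrame-to-HTML step can be sketched in isolation. The small frame below reuses two rows from the comparison table in this notebook; the variable names are illustrative only. The point to note is that `str.replace` returns a new string, so its result has to be assigned.

```python
import pandas as pd

# Two rows copied from the comparison table in this notebook, for illustration
facts = pd.DataFrame(
    {"Mars": ["6,779 km", "2"], "Earth": ["12,742 km", "1"]},
    index=pd.Index(["Diameter:", "Moons:"], name="Mars - Earth Comparison"),
)

# to_html() with no path returns the table as a string;
# replace() returns a new string, so chain it or reassign the result.
facts_html = facts.to_html().replace("\n", "")
print(facts_html[:40])
```

Calling `to_html` with a file path, as the cells above do, writes the table to disk instead of returning a string.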

### Scrape the [USGS Astrogeology site](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars):
Obtain high-resolution images for each of Mars's hemispheres. Click each hemisphere link to find the URL of the full-resolution image. Save both the image URL string for the full-resolution hemisphere image and the hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys `img_url` and `title`.
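
The record layout described here (one dictionary per hemisphere with the keys `title` and `img_url`) can be sketched as follows. The helper name `make_hemisphere_record` is hypothetical, and the sample values are copied from this notebook's scrape output rather than fetched live.

```python
def make_hemisphere_record(title, img_url):
    """Build one record with the two keys named in the task description."""
    title, img_url = title.strip(), img_url.strip()
    if not title or not img_url:
        raise ValueError("title and img_url must both be non-empty")
    return {"title": title, "img_url": img_url}


# Sample (title, URL) pairs copied from this notebook's output
scraped_pairs = [
    ("Cerberus Hemisphere Enhanced",
     "http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif/full.jpg"),
    ("Schiaparelli Hemisphere Enhanced",
     "http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif/full.jpg"),
]

hemisphere_image_urls = [make_hemisphere_record(t, u) for t, u in scraped_pairs]
print(hemisphere_image_urls)
```

Building each record through one helper keeps the key names consistent across the list, which matters when a template later reads `img_url` and `title` by name.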


In [99]:
# Use splinter to navigate to the USGS Astrogeology URL
executable_path = {'executable_path': 'chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

# Open the USGS url using Chrome
USGS_url = 'https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars'
browser.visit(USGS_url)

# Retrieve the page HTML and collect the result items, each of which links to a hemisphere page with its title and full-resolution image
USGS_html = browser.html
USGS_soup = bs(USGS_html, 'html.parser')

USGS_image_list = USGS_soup.find_all('div', class_='item')
USGS_image_list

[<div class="item"><a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><img alt="Cerberus Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/dfaf3849e74bf973b59eb50dab52b583_cerberus_enhanced.tif_thumb.png"/></a><div class="description"><a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><h3>Cerberus Hemisphere Enhanced</h3></a><span class="subtitle" style="float:left">image/tiff 21 MB</span><span class="pubDate" style="float:right"></span><br/><p>Mosaic of the Cerberus hemisphere of Mars projected into point perspective, a view similar to that which one would see from a spacecraft. This mosaic is composed of 104 Viking Orbiter images acquired…</p></div> <!-- end description --></div>,
 <div class="item"><a class="itemLink product-item" href="/search/map/Mars/Viking/schiaparelli_enhanced"><img alt="Schiaparelli Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/7677c0a006b83871b5a2f66985ab5857_schiapa

In [102]:
# Create a list called hemispheres_title_and_image
hemispheres_title_and_image = []

base_url = "https://astrogeology.usgs.gov"  # needed below to build absolute links

# Loop through each hemisphere and visit its link to find the full-resolution image URL
for image in USGS_image_list:
    hemisphere_dict = {}
    
    href = image.find('a', class_='itemLink product-item')
    link = base_url + href['href']
    browser.visit(link)
    
    time.sleep(1)
    
    hemisphere_html_2 = browser.html
    hemisphere_soup_2 = bs(hemisphere_html_2, 'html.parser')
    
    img_title = hemisphere_soup_2.find('div', class_='content').find('h2', class_='title').text
    hemisphere_dict['title'] = img_title
    
    img_url = hemisphere_soup_2.find('div', class_='downloads').find('a')['href']
    hemisphere_dict['img_url'] = img_url

    # Append dictionary to hemispheres_title_and_image list
    hemispheres_title_and_image.append(hemisphere_dict)

# Print hemispheres_title_and_image list
hemispheres_title_and_image

[{'title': 'Cerberus Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif/full.jpg'},
 {'title': 'Schiaparelli Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif/full.jpg'},
 {'title': 'Syrtis Major Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/syrtis_major_enhanced.tif/full.jpg'},
 {'title': 'Valles Marineris Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/valles_marineris_enhanced.tif/full.jpg'}]