# Mission to Mars

## Step 1 - Scraping

Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.

* Create a Jupyter Notebook file called `mission_to_mars.ipynb` and use this to complete all of your scraping and analysis tasks. The following outlines what you need to scrape.

### NASA Mars News

* Scrape the [NASA Mars News Site](https://mars.nasa.gov/news/) and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

In [10]:
# Add Dependencies
from bs4 import BeautifulSoup
from splinter import Browser
from selenium import webdriver
import pandas as pd
import time 
import requests

In [11]:
# Import Splinter and set the chromedriver path
executable_path = {'executable_path': r'C:\Program Files\chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

In [None]:
# Visit the following URL
url = "https://mars.nasa.gov/news/"
browser.visit(url)
time.sleep(3)
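
The fixed `time.sleep(3)` above assumes the page finishes rendering within three seconds. One alternative is to poll for a condition and stop as soon as it holds. The sketch below uses only the standard library; `wait_until` is a hypothetical helper, not part of Splinter, and the commented usage line assumes the `browser` object from the previous cell.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    Returns True if the predicate succeeded, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage with the notebook's browser object:
# wait_until(lambda: "content_title" in browser.html, timeout=15)
```

Unlike a fixed sleep, this returns as soon as the content appears and reports a failure explicitly when it never does.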

In [None]:
# Create HTML object
nasa_html = browser.html
# Parse HTML with BeautifulSoup
nasa_soup = BeautifulSoup(nasa_html, "html.parser")

In [None]:
# Find News Title
news_title = nasa_soup.find("div", class_ = "content_title").text.strip()
news_title

In [None]:
# Find the Paragraph Text
news_p = nasa_soup.find("div", class_ = "article_teaser_body").text.strip()
news_p
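
Chaining `.text` directly onto `find(...)` raises an `AttributeError` when the element is missing, because `find` returns `None` in that case. A minimal sketch of a guarded lookup follows. The HTML is made up to mimic the class names used above, and `first_text` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Made-up HTML that mimics the class names used in the cells above.
sample_html = """
<ul>
  <li class="slide">
    <div class="content_title"><a href="/news/1"> First headline </a></div>
    <div class="article_teaser_body"> First teaser. </div>
  </li>
  <li class="slide">
    <div class="content_title"><a href="/news/2">Second headline</a></div>
    <div class="article_teaser_body">Second teaser.</div>
  </li>
</ul>
"""


def first_text(soup, tag, css_class):
    """Return the stripped text of the first matching element, or None if absent."""
    node = soup.find(tag, class_=css_class)
    return node.get_text(strip=True) if node is not None else None


soup = BeautifulSoup(sample_html, "html.parser")
news_title = first_text(soup, "div", "content_title")
news_p = first_text(soup, "div", "article_teaser_body")
missing = first_text(soup, "div", "no_such_class")  # None instead of an exception
```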

### JPL Mars Space Images - Featured Image

* Visit the url for JPL Featured Space Image [here](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars).

* Use Splinter to navigate the site, find the image URL for the current Featured Mars Image, and assign the URL string to a variable called `featured_image_url`.

* Make sure to find the image url to the full size `.jpg` image.

* Make sure to save a complete url string for this image.

In [2]:
# Create variable to hold url
executable_path = {'executable_path': r'C:\Program Files\chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)
jpl_url = "https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars"
browser.visit(jpl_url)

In [3]:
time.sleep(3)
# Have splinter click full image button
browser.click_link_by_partial_text("FULL IMAGE")
time.sleep(3)
# Have splinter click more info button
browser.click_link_by_partial_text("more info")

In [4]:
# Create HTML object
featured_pg = browser.html
# Create BeautifulSoup object and parse with HTML parser
jpl_soup = BeautifulSoup(featured_pg, "html.parser")

In [5]:
# Get featured image url
featured_img = jpl_soup.find("figure", class_ = "lede")
featured_img_url = featured_img.a["href"]
featured_img_url = "https://www.jpl.nasa.gov" + featured_img_url
featured_img_url

'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA14924_hires.jpg'
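
String concatenation works here because the scraped `href` is site-relative. As a sketch of a slightly more defensive alternative, `urllib.parse.urljoin` from the standard library builds the same complete URL and also leaves an already absolute link untouched. The variable names below are illustrative; the path is taken from the output above.

```python
from urllib.parse import urljoin

base_url = "https://www.jpl.nasa.gov"
relative_href = "/spaceimages/images/largesize/PIA14924_hires.jpg"

# Join a site-relative href onto the base URL.
full_url = urljoin(base_url, relative_href)

# An href that is already absolute passes through unchanged.
unchanged = urljoin(base_url, full_url)
```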

In [8]:
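# Use the requests library to download and save the image from `featured_img_url` above
import shutil
response = requests.get(featured_img_url, stream=True)
# Let urllib3 undo any gzip/deflate transfer encoding before copying the raw stream
response.raw.decode_content = True
# The source is a JPEG, so save it with a matching extension
with open('img_new.jpg', 'wb') as out_file:
    shutil.copyfileobj(response.raw, out_file)

In [9]:
# Display the saved local file with IPython.display
from IPython.display import Image
Image(filename='img_new.jpg')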
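
The name of the saved file is chosen by hand in the cell above. One way to keep the name and extension consistent with the source image is to derive them from the URL. The sketch below uses only the standard library; `filename_from_url` and its `default` argument are hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def filename_from_url(url, default="image.jpg"):
    """Return the last path segment of `url`, or `default` if the path has none."""
    name = posixpath.basename(urlparse(url).path)
    return name or default


local_name = filename_from_url(
    "https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA14924_hires.jpg"
)
```

The returned name could then be passed to `open(..., 'wb')` in place of a hard-coded string.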

### Mars Weather

* Visit the Mars Weather Twitter account [here](https://twitter.com/marswxreport?lang=en) and scrape the latest Mars weather tweet from the page. Save the tweet text for the weather report as a variable called `mars_weather`.

In [12]:
# Create variable to hold url
executable_path = {'executable_path': r'C:\Program Files\chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)
tweet_url = "https://twitter.com/marswxreport?lang=en"
browser.visit(tweet_url)

In [13]:
time.sleep(3)
# Create HTML object
tweet_mars_html = browser.html
# Create BeautifulSoup object and parse with HTML parser
tweet_mars_soup = BeautifulSoup(tweet_mars_html, "html.parser")

In [27]:
#tweet_mars_soup

In [25]:
# Get the tweet text for the Mars weather report as the variable mars_weather
tweets = tweet_mars_soup.find_all("p")
for tweet in tweets:
    # If the tweet contains "Sol", treat it as a weather tweet
    if 'Sol' in tweet.text:
        mars_weather = tweet.text
        break

In [26]:
mars_weather

'Sol 2171 (2018-09-14), high -12C/10F, low -65C/-84F, pressure at 8.79 hPa, daylight 05:43-17:59'
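
The substring check `'Sol' in tweet.text` would also match unrelated text such as "Solar". A stricter option, sketched below, is to match the report's shape with a regular expression and pull out the fields at the same time. The pattern is inferred only from the single sample output above, so it is an assumption about the format rather than a specification; `parse_weather` is a hypothetical helper.

```python
import re

# Pattern inferred from the sample tweet above (an assumption, not a documented format).
WEATHER_PATTERN = re.compile(
    r"Sol (?P<sol>\d+) \((?P<date>\d{4}-\d{2}-\d{2})\), "
    r"high (?P<high_c>-?\d+)C/(?P<high_f>-?\d+)F, "
    r"low (?P<low_c>-?\d+)C/(?P<low_f>-?\d+)F, "
    r"pressure at (?P<pressure_hpa>\d+(?:\.\d+)?) hPa"
)


def parse_weather(text):
    """Return a dict of weather fields, or None if `text` is not a weather report."""
    match = WEATHER_PATTERN.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "sol": int(fields["sol"]),
        "date": fields["date"],
        "high_c": int(fields["high_c"]),
        "low_c": int(fields["low_c"]),
        "pressure_hpa": float(fields["pressure_hpa"]),
    }


report = parse_weather(
    "Sol 2171 (2018-09-14), high -12C/10F, low -65C/-84F, "
    "pressure at 8.79 hPa, daylight 05:43-17:59"
)
not_weather = parse_weather("Solar panels are working well today")
```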

### Mars Facts

* Visit the Mars Facts webpage [here](http://space-facts.com/mars/) and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.

* Use Pandas to convert the data to an HTML table string.

In [17]:
# Create variable to hold url
executable_path = {'executable_path': r'C:\Program Files\chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)
mars_fact_url = "https://space-facts.com/mars/"
browser.visit(mars_fact_url)

In [21]:
tables = pd.read_html(mars_fact_url)
len(tables)

1

In [22]:
tables[0]

|   | 0 | 1 |
|---|---|---|
| 0 | Equatorial Diameter: | 6,792 km |
| 1 | Polar Diameter: | 6,752 km |
| 2 | Mass: | 6.42 x 10^23 kg (10.7% Earth) |
| 3 | Moons: | 2 (Phobos & Deimos) |
| 4 | Orbit Distance: | 227,943,824 km (1.52 AU) |
| 5 | Orbit Period: | 687 days (1.9 years) |
| 6 | Surface Temperature: | -153 to 20 °C |
| 7 | First Record: | 2nd millennium BC |
| 8 | Recorded By: | Egyptian astronomers |


In [25]:
facts_df = tables[0]
facts_df.columns = ["Description", "Values"]
facts_df.set_index("Description", inplace = True)
facts_df

| Description | Values |
|---|---|
| Equatorial Diameter: | 6,792 km |
| Polar Diameter: | 6,752 km |
| Mass: | 6.42 x 10^23 kg (10.7% Earth) |
| Moons: | 2 (Phobos & Deimos) |
| Orbit Distance: | 227,943,824 km (1.52 AU) |
| Orbit Period: | 687 days (1.9 years) |
| Surface Temperature: | -153 to 20 °C |
| First Record: | 2nd millennium BC |
| Recorded By: | Egyptian astronomers |


In [27]:
facts_table = facts_df.to_html()
facts_table

'<table border="1" class="dataframe">\n  <thead>\n    <tr style="text-align: right;">\n      <th></th>\n      <th>Values</th>\n    </tr>\n    <tr>\n      <th>Description</th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>Equatorial Diameter:</th>\n      <td>6,792 km</td>\n    </tr>\n    <tr>\n      <th>Polar Diameter:</th>\n      <td>6,752 km</td>\n    </tr>\n    <tr>\n      <th>Mass:</th>\n      <td>6.42 x 10^23 kg (10.7% Earth)</td>\n    </tr>\n    <tr>\n      <th>Moons:</th>\n      <td>2 (Phobos &amp; Deimos)</td>\n    </tr>\n    <tr>\n      <th>Orbit Distance:</th>\n      <td>227,943,824 km (1.52 AU)</td>\n    </tr>\n    <tr>\n      <th>Orbit Period:</th>\n      <td>687 days (1.9 years)</td>\n    </tr>\n    <tr>\n      <th>Surface Temperature:</th>\n      <td>-153 to 20 °C</td>\n    </tr>\n    <tr>\n      <th>First Record:</th>\n      <td>2nd millennium BC</td>\n    </tr>\n    <tr>\n      <th>Recorded By:</th>\n      <td>Egyptian astronomers</td>\n    </tr

In [28]:
mars_facts_table = facts_table.replace("\n", "")
mars_facts_table

'<table border="1" class="dataframe">  <thead>    <tr style="text-align: right;">      <th></th>      <th>Values</th>    </tr>    <tr>      <th>Description</th>      <th></th>    </tr>  </thead>  <tbody>    <tr>      <th>Equatorial Diameter:</th>      <td>6,792 km</td>    </tr>    <tr>      <th>Polar Diameter:</th>      <td>6,752 km</td>    </tr>    <tr>      <th>Mass:</th>      <td>6.42 x 10^23 kg (10.7% Earth)</td>    </tr>    <tr>      <th>Moons:</th>      <td>2 (Phobos &amp; Deimos)</td>    </tr>    <tr>      <th>Orbit Distance:</th>      <td>227,943,824 km (1.52 AU)</td>    </tr>    <tr>      <th>Orbit Period:</th>      <td>687 days (1.9 years)</td>    </tr>    <tr>      <th>Surface Temperature:</th>      <td>-153 to 20 °C</td>    </tr>    <tr>      <th>First Record:</th>      <td>2nd millennium BC</td>    </tr>    <tr>      <th>Recorded By:</th>      <td>Egyptian astronomers</td>    </tr>  </tbody></table>'

In [29]:
# Write the table to an HTML file; to_html returns None when given a path,
# so the HTML string itself is the one stored in facts_table above
facts_df.to_html("facts_table.html")
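
`DataFrame.to_html` behaves differently depending on whether it receives a path: with no argument it returns the HTML as a string, and with a path it writes the file and returns `None`. The sketch below shows the string form on a small made-up frame that reuses two rows from the table above.

```python
import pandas as pd

# A two-row stand-in for the scraped facts table.
facts_df = pd.DataFrame(
    {
        "Description": ["Moons:", "Orbit Period:"],
        "Values": ["2 (Phobos & Deimos)", "687 days (1.9 years)"],
    }
).set_index("Description")

# No path argument: the HTML comes back as a string.
html_string = facts_df.to_html()

# Strip newlines so the string can be embedded on a single line.
compact = html_string.replace("\n", "")
```

Note that `&` is escaped to `&amp;` by default, which matches the output shown earlier.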

### Mars Hemispheres

* Visit the USGS Astrogeology site [here](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars) to obtain high-resolution images for each of Mars's hemispheres.

* You will need to click each of the links to the hemispheres in order to find the image url to the full resolution image.

* Save both the image url string for the full resolution hemisphere image, and the Hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys `img_url` and `title`.

* Append the dictionary with the image url string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

In [15]:
# Create variable to hold url
executable_path = {'executable_path': r'C:\Program Files\chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)
usgs_url = "https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars"
browser.visit(usgs_url)
time.sleep(3)

In [16]:
# Create HTML object
usgs_html = browser.html
# Create BeautifulSoup object and parse with HTML parser
usgs_html_soup = BeautifulSoup(usgs_html, "html.parser")

In [17]:
# Find the titles of all four hemispheres
results = usgs_html_soup.find_all("h3")
results

[<h3>Cerberus Hemisphere Enhanced</h3>,
 <h3>Schiaparelli Hemisphere Enhanced</h3>,
 <h3>Syrtis Major Hemisphere Enhanced</h3>,
 <h3>Valles Marineris Hemisphere Enhanced</h3>]

In [18]:
# Create an empty list to hold one dictionary per hemisphere
hemisphere_img_urls = []

In [19]:
# Use a loop and Splinter to open each hemisphere link and collect its title and image URL
# Loop through each result
for result in results:
    # Get text info from result
    title = result.text
    browser.click_link_by_partial_text(title)
    time.sleep(1)
    img_url = browser.find_link_by_partial_href("download")["href"]
    hemisphere_dicts = {"title": title, "img_url": img_url}
    hemisphere_img_urls.append(hemisphere_dicts)
    time.sleep(1)
    browser.visit(usgs_url)

In [20]:
print(hemisphere_img_urls)

[{'title': 'Cerberus Hemisphere Enhanced', 'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif/full.jpg'}, {'title': 'Schiaparelli Hemisphere Enhanced', 'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif/full.jpg'}, {'title': 'Syrtis Major Hemisphere Enhanced', 'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/syrtis_major_enhanced.tif/full.jpg'}, {'title': 'Valles Marineris Hemisphere Enhanced', 'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/valles_marineris_enhanced.tif/full.jpg'}]


In [21]:
hemisphere_img_urls

[{'title': 'Cerberus Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif/full.jpg'},
 {'title': 'Schiaparelli Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif/full.jpg'},
 {'title': 'Syrtis Major Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/syrtis_major_enhanced.tif/full.jpg'},
 {'title': 'Valles Marineris Hemisphere Enhanced',
  'img_url': 'http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/valles_marineris_enhanced.tif/full.jpg'}]
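
The list-of-dictionaries structure above can be sketched independently of the browser. In the example below, `collect_hemispheres` is a hypothetical helper, and the `find_image_url` callable stands in for the Splinter click-and-lookup step; the sample data reuses two entries from the output above. A fresh dictionary is created on every iteration so that the appended entries do not share state.

```python
def collect_hemispheres(titles, find_image_url):
    """Build one {"title", "img_url"} dictionary per hemisphere title."""
    records = []
    for title in titles:
        # A new dict per iteration keeps each list entry independent.
        records.append({"title": title, "img_url": find_image_url(title)})
    return records


# Stand-in for the browser lookup, using URLs from the output above.
known_urls = {
    "Cerberus Hemisphere Enhanced":
        "http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif/full.jpg",
    "Schiaparelli Hemisphere Enhanced":
        "http://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif/full.jpg",
}

hemispheres = collect_hemispheres(list(known_urls), known_urls.get)
```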