## Step 1 - Scraping 

Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.

- Create a Jupyter Notebook file called mission_to_mars.ipynb and use this to complete all of your scraping and analysis tasks. The following outlines what you need to scrape.

In [1]:
#Dependencies
import pandas as pd
from bs4 import BeautifulSoup as bs
from splinter import Browser
import requests
import os
from webdriver_manager.chrome import ChromeDriverManager

In [2]:
# Create Splinter browser
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

[WDM] - Current google-chrome version is 86.0.4240
[WDM] - Get LATEST driver version for 86.0.4240
[WDM] - Get LATEST driver version for 86.0.4240
[WDM] - Trying to download new driver from http://chromedriver.storage.googleapis.com/86.0.4240.22/chromedriver_win32.zip


 


[WDM] - Driver has been saved in cache [C:\Users\ellse\.wdm\drivers\chromedriver\win32\86.0.4240.22]


# NASA Mars News

- Scrape the NASA Mars News Site and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

In [12]:
# NASA Mars News URL
url = 'https://mars.nasa.gov/news/'
browser.visit(url)

In [13]:
#HTML object
html = browser.html

#Parse w/ Beautiful Soup
soup = bs(html, 'html.parser')

# Retrieve the latest news title and paragraph text
news_title = soup.find('div', class_='content_title').text
news_p = soup.find('div', class_='article_teaser_body').text

#display
print(news_title)
print(news_p)

Mars Now
NASA's Perseverance rover carries a device to convert Martian air into oxygen that, if produced on a larger scale, could be used not just for breathing, but also for fuel.
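
The printed title, `Mars Now`, does not appear to match the teaser, which can happen when `soup.find` returns the first `content_title` div anywhere on the page rather than the first news item. One possible way to avoid that is to scope the search to the container that holds the news items. The sketch below runs on a small hand-written HTML string, not the live site; the `item_list` and `slide` class names and the `latest_news` helper are assumptions for illustration, and the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hand-written sample markup (an assumption, not copied from the live page).
SAMPLE_HTML = """
<div class="content_title">Mars Now</div>
<ul class="item_list">
  <li class="slide">
    <div class="content_title"><a href="/news/1">Sample headline</a></div>
    <div class="article_teaser_body">Sample teaser text.</div>
  </li>
</ul>
"""


def latest_news(html):
    """Return (title, teaser) of the first news item, or (None, None)."""
    soup = BeautifulSoup(html, "html.parser")
    # Restrict the search to the first list item instead of the whole page.
    first_item = soup.select_one("ul.item_list li.slide")
    if first_item is None:
        return None, None
    title = first_item.find("div", class_="content_title")
    teaser = first_item.find("div", class_="article_teaser_body")
    if title is None or teaser is None:
        return None, None
    return title.get_text(strip=True), teaser.get_text(strip=True)


news_title, news_p = latest_news(SAMPLE_HTML)
print(news_title)
print(news_p)
```

Returning `(None, None)` when an element is missing keeps a layout change from raising an `AttributeError` in the middle of the scrape.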


# JPL Mars Space Images - Featured Image

- Visit the url for JPL Featured Space Image here.


- Use splinter to navigate the site and find the image url for the current Featured Mars Image and assign the url string to a variable called featured_image_url.


- Make sure to find the image url to the full size .jpg image.


- Make sure to save a complete url string for this image.

In [29]:
# JPL Mars images page URL
jpl_url = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'
browser.visit(jpl_url)

In [30]:
#HTML Object
html_img = browser.html

#Parse w/ Beautiful Soup
soup = bs(html_img, 'html.parser')

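# The image path sits in the <article> tag's inline style: background-image: url('...');
image_url = soup.find('article')['style'].replace('background-image: url(','').replace(');','')[1:-1]

# Build the complete URL for the full-size image
featured_image_url = 'https://www.jpl.nasa.gov' + image_url

featured_image_url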

'https://www.jpl.nasa.gov/spaceimages/images/wallpaper/PIA18297-1920x1200.jpg'
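
The chained `replace` calls in the cell above depend on the exact spelling of the style attribute. A slightly more tolerant alternative, sketched here with only the standard library, pulls the `url(...)` target out with a regular expression and resolves it against the site root with `urljoin`. The `url_from_style` helper and the sample `style` string are assumptions for illustration; the string is reconstructed from the output above, not copied from the page.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://www.jpl.nasa.gov"


def url_from_style(style, base=BASE_URL):
    """Extract the url(...) target from a CSS style string and make it absolute."""
    # Accept single quotes, double quotes, or no quotes around the path.
    match = re.search(r"url\(\s*['\"]?(.*?)['\"]?\s*\)", style)
    if match is None:
        return None
    return urljoin(base, match.group(1))


# Reconstructed sample value (an assumption about the attribute's exact text).
style = "background-image: url('/spaceimages/images/wallpaper/PIA18297-1920x1200.jpg');"
featured_image_url = url_from_style(style)
print(featured_image_url)
```

`urljoin` also leaves already-absolute URLs unchanged, so the same helper would work if the site switched to full URLs in the style attribute.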

# Mars Facts


- Visit the Mars Facts webpage here and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.

- Use Pandas to convert the data to an HTML table string.


In [31]:
#Mars Fact URL
fact_url = 'https://space-facts.com/mars/'
browser.visit(fact_url)

In [32]:
#Pandas read html
tables = pd.read_html(fact_url)
tables

[                      0                              1
 0  Equatorial Diameter:                       6,792 km
 1       Polar Diameter:                       6,752 km
 2                 Mass:  6.39 × 10^23 kg (0.11 Earths)
 3                Moons:            2 (Phobos & Deimos)
 4       Orbit Distance:       227,943,824 km (1.38 AU)
 5         Orbit Period:           687 days (1.9 years)
 6  Surface Temperature:                   -87 to -5 °C
 7         First Record:              2nd millennium BC
 8          Recorded By:           Egyptian astronomers,
   Mars - Earth Comparison             Mars            Earth
 0               Diameter:         6,779 km        12,742 km
 1                   Mass:  6.39 × 10^23 kg  5.97 × 10^24 kg
 2                  Moons:                2                1
 3      Distance from Sun:   227,943,824 km   149,598,262 km
 4         Length of Year:   687 Earth days      365.24 days
 5            Temperature:     -87 to -5 °C      -88 to 58°C,
           

In [34]:
#Creating Dataframe
df = tables[0]
df.columns = ['Description', 'Value']
df

            Description                          Value
0  Equatorial Diameter:                       6,792 km
1       Polar Diameter:                       6,752 km
2                 Mass:  6.39 × 10^23 kg (0.11 Earths)
3                Moons:            2 (Phobos & Deimos)
4       Orbit Distance:       227,943,824 km (1.38 AU)
5         Orbit Period:           687 days (1.9 years)
6  Surface Temperature:                   -87 to -5 °C
7         First Record:              2nd millennium BC
8          Recorded By:           Egyptian astronomers


In [35]:
# Use the Description column as the index, replacing the default integer index
df.set_index('Description', inplace=True)
df

                                              Value
Description
Equatorial Diameter:                       6,792 km
Polar Diameter:                            6,752 km
Mass:                 6.39 × 10^23 kg (0.11 Earths)
Moons:                          2 (Phobos & Deimos)
Orbit Distance:            227,943,824 km (1.38 AU)
Orbit Period:                  687 days (1.9 years)
Surface Temperature:                   -87 to -5 °C
First Record:                     2nd millennium BC
Recorded By:                   Egyptian astronomers


In [36]:
#convert to html table
html_table = df.to_html()
html_table

'<table border="1" class="dataframe">\n  <thead>\n    <tr style="text-align: right;">\n      <th></th>\n      <th>Value</th>\n    </tr>\n    <tr>\n      <th>Description</th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>Equatorial Diameter:</th>\n      <td>6,792 km</td>\n    </tr>\n    <tr>\n      <th>Polar Diameter:</th>\n      <td>6,752 km</td>\n    </tr>\n    <tr>\n      <th>Mass:</th>\n      <td>6.39 × 10^23 kg (0.11 Earths)</td>\n    </tr>\n    <tr>\n      <th>Moons:</th>\n      <td>2 (Phobos &amp; Deimos)</td>\n    </tr>\n    <tr>\n      <th>Orbit Distance:</th>\n      <td>227,943,824 km (1.38 AU)</td>\n    </tr>\n    <tr>\n      <th>Orbit Period:</th>\n      <td>687 days (1.9 years)</td>\n    </tr>\n    <tr>\n      <th>Surface Temperature:</th>\n      <td>-87 to -5 °C</td>\n    </tr>\n    <tr>\n      <th>First Record:</th>\n      <td>2nd millennium BC</td>\n    </tr>\n    <tr>\n      <th>Recorded By:</th>\n      <td>Egyptian astronomers</td>\n    </tr>\
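
The string returned by `to_html` contains `\n` characters and only the default `dataframe` class. If the table is later embedded in a web page, it may help to add CSS classes and strip the newlines. The sketch below uses two rows from the table above as sample data; the class names `table table-striped` are an assumption (any CSS classes would do).

```python
import pandas as pd

# Two rows taken from the table shown above, enough to illustrate the call.
df = pd.DataFrame(
    {
        "Description": ["Equatorial Diameter:", "Polar Diameter:"],
        "Value": ["6,792 km", "6,752 km"],
    }
).set_index("Description")

# classes= appends extra CSS class names to the generated <table> tag.
html_table = df.to_html(classes="table table-striped")

# Remove the newlines so the string is easier to embed elsewhere.
html_table = html_table.replace("\n", "")
print(html_table)
```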

# Mars Hemispheres


- Visit the USGS Astrogeology site here to obtain high resolution images for each of Mars's hemispheres.


- You will need to click each of the links to the hemispheres in order to find the image url to the full resolution image.


- Save both the image url string for the full resolution hemisphere image, and the Hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys img_url and title.


- Append the dictionary with the image url string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.
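
The target structure described in the bullets above, one dictionary per hemisphere collected in a list, can be sketched as follows. The `build_hemisphere_records` helper and the sample values are hypothetical placeholders, not scraped data. Pairing titles and URLs with `zip` gives each dictionary a single title rather than the whole list of titles.

```python
def build_hemisphere_records(titles, img_urls):
    """Pair each title with its image URL, one dictionary per hemisphere."""
    if len(titles) != len(img_urls):
        raise ValueError("titles and img_urls must have the same length")
    return [
        {"title": title, "img_url": img_url}
        for title, img_url in zip(titles, img_urls)
    ]


# Placeholder values for illustration only (not scraped data).
titles = ["Hemisphere A", "Hemisphere B"]
img_urls = ["https://example.com/a.jpg", "https://example.com/b.jpg"]

hemisphere_image_urls = build_hemisphere_records(titles, img_urls)
print(hemisphere_image_urls)
```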

In [75]:
# Hemisphere images URL
hemispheres_url = 'https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars'
browser.visit(hemispheres_url)

In [76]:
#HTML Object
html_hemispheres = browser.html

#Parse w/ Beautiful Soup
soup = bs(html_hemispheres, 'html.parser')

items = soup.find_all('div', class_='collapsible results')

items

[<div class="collapsible results">
 <div class="item"><a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><img alt="Cerberus Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/39d3266553462198bd2fbc4d18fbed17_cerberus_enhanced.tif_thumb.png"/></a><div class="description"><a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><h3>Cerberus Hemisphere Enhanced</h3></a><span class="subtitle" style="float:left">image/tiff 21 MB</span><span class="pubDate" style="float:right"></span><br/><p>Mosaic of the Cerberus hemisphere of Mars projected into point perspective, a view similar to that which one would see from a spacecraft. This mosaic is composed of 104 Viking Orbiter images acquired…</p></div> <!-- end description --></div><div class="item"><a class="itemLink product-item" href="/search/map/Mars/Viking/schiaparelli_enhanced"><img alt="Schiaparelli Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/08eac6e2

In [82]:
hemisphere_title = items[0].find_all('h3')
hemisphere_title

[<h3>Cerberus Hemisphere Enhanced</h3>,
 <h3>Schiaparelli Hemisphere Enhanced</h3>,
 <h3>Syrtis Major Hemisphere Enhanced</h3>,
 <h3>Valles Marineris Hemisphere Enhanced</h3>]

In [83]:
hemisphere_names = []
for title in hemisphere_title:
    hemisphere_names.append(title.text)
    
hemisphere_names

['Cerberus Hemisphere Enhanced',
 'Schiaparelli Hemisphere Enhanced',
 'Syrtis Major Hemisphere Enhanced',
 'Valles Marineris Hemisphere Enhanced']

In [84]:
href_links = items[0].find_all('a')
href_links


[<a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><img alt="Cerberus Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/39d3266553462198bd2fbc4d18fbed17_cerberus_enhanced.tif_thumb.png"/></a>,
 <a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><h3>Cerberus Hemisphere Enhanced</h3></a>,
 <a class="itemLink product-item" href="/search/map/Mars/Viking/schiaparelli_enhanced"><img alt="Schiaparelli Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/08eac6e22c07fb1fe72223a79252de20_schiaparelli_enhanced.tif_thumb.png"/></a>,
 <a class="itemLink product-item" href="/search/map/Mars/Viking/schiaparelli_enhanced"><h3>Schiaparelli Hemisphere Enhanced</h3></a>,
 <a class="itemLink product-item" href="/search/map/Mars/Viking/syrtis_major_enhanced"><img alt="Syrtis Major Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/55a0a1e2796313fdeafb17c35925e8ac_syrtis_major_enhanced.tif_thumb.png"/></a>,

In [88]:
hemisphere_urls = []

hemispheres_main_url = 'https://astrogeology.usgs.gov'
# Each hemisphere has two links (thumbnail and title); keep only the thumbnail link to avoid duplicates
for item in href_links:

    if item.img:
        item_url = hemispheres_main_url + item['href']
        hemisphere_urls.append(item_url)

hemisphere_urls
    
    
#     title = item.find('h3').text
    
#     partial_img_url = item.find('a', class_='itemLink product-item')['href']
    
#     browser.visit(hemispheres_main_url + partial_img_url)
    
#     partial_img_html = browser.html
    
#     soup = bs(partial_img_html, 'html.parser')
    
#     img_url = hemispheres_main_url + soup.find('img', class_='wide-image')['src']
    
#     hemisphere_image_urls.append({"title" : title, "img_url" : img_url})
    

# hemispheres_image_urls

['https://astrogeology.usgs.gov/search/map/Mars/Viking/cerberus_enhanced',
 'https://astrogeology.usgs.gov/search/map/Mars/Viking/schiaparelli_enhanced',
 'https://astrogeology.usgs.gov/search/map/Mars/Viking/syrtis_major_enhanced',
 'https://astrogeology.usgs.gov/search/map/Mars/Viking/valles_marineris_enhanced']

In [92]:
hemisphere_img = []
# Visit each hemisphere page, pairing it with its own title
for name, url in zip(hemisphere_names, hemisphere_urls):
    browser.visit(url)
    html = browser.html
    soup = bs(html, 'html.parser')

    # The full-size image is the <img> tag with class "wide-image"
    img_result = soup.find_all('img', class_='wide-image')
    img_source = img_result[0]['src']
    full_image = hemispheres_main_url + img_source
    hemisphere_img.append({'Title': name, 'Images': full_image})

hemisphere_img

[{'Title': 'Cerberus Hemisphere Enhanced',
  'Images': 'https://astrogeology.usgs.gov/cache/images/f5e372a36edfa389625da6d0cc25d905_cerberus_enhanced.tif_full.jpg'},
 {'Title': 'Schiaparelli Hemisphere Enhanced',
  'Images': 'https://astrogeology.usgs.gov/cache/images/3778f7b43bbbc89d6e3cfabb3613ba93_schiaparelli_enhanced.tif_full.jpg'},
 {'Title': 'Syrtis Major Hemisphere Enhanced',
  'Images': 'https://astrogeology.usgs.gov/cache/images/555e6403a6ddd7ba16ddb0e471cadcf7_syrtis_major_enhanced.tif_full.jpg'},
 {'Title': 'Valles Marineris Hemisphere Enhanced',

In [94]:
hemispheres_url = 'https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars'
browser.visit(hemispheres_url)
#HTML Object
html_hemispheres = browser.html

#Parse w/ Beautiful Soup
soup = bs(html_hemispheres, 'html.parser')

items = soup.find_all('div', class_='item')

items



[<div class="item"><a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><img alt="Cerberus Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/39d3266553462198bd2fbc4d18fbed17_cerberus_enhanced.tif_thumb.png"/></a><div class="description"><a class="itemLink product-item" href="/search/map/Mars/Viking/cerberus_enhanced"><h3>Cerberus Hemisphere Enhanced</h3></a><span class="subtitle" style="float:left">image/tiff 21 MB</span><span class="pubDate" style="float:right"></span><br/><p>Mosaic of the Cerberus hemisphere of Mars projected into point perspective, a view similar to that which one would see from a spacecraft. This mosaic is composed of 104 Viking Orbiter images acquired…</p></div> <!-- end description --></div>,
 <div class="item"><a class="itemLink product-item" href="/search/map/Mars/Viking/schiaparelli_enhanced"><img alt="Schiaparelli Hemisphere Enhanced thumbnail" class="thumb" src="/cache/images/08eac6e22c07fb1fe72223a79252de20_schiapa

In [95]:
hemisphere_image_urls = []

hemispheres_main_url = 'https://astrogeology.usgs.gov'
# Loop through the result items
for item in items:
    # Relative link to the hemisphere's detail page
    image = item.find('a')['href']
    title = item.find('div', class_='description').find('a').find('h3').text
    full_url = hemispheres_main_url + image

    browser.visit(full_url)

    partial_img_html = browser.html

    soup = bs(partial_img_html, 'html.parser')

    # The first link in the downloads list points to the full-resolution .jpg
    img_url = soup.find('div', class_='downloads').find('ul').find('li').find('a')['href']

    hemisphere_image_urls.append({"title": title, "img_url": img_url})


hemisphere_image_urls

[{'title': 'Cerberus Hemisphere Enhanced',
  'img_url': 'https://astropedia.astrogeology.usgs.gov/download/Mars/Viking/cerberus_enhanced.tif/full.jpg'},
 {'title': 'Schiaparelli Hemisphere Enhanced',
  'img_url': 'https://astropedia.astrogeology.usgs.gov/download/Mars/Viking/schiaparelli_enhanced.tif/full.jpg'},
 {'title': 'Syrtis Major Hemisphere Enhanced',
  'img_url': 'https://astropedia.astrogeology.usgs.gov/download/Mars/Viking/syrtis_major_enhanced.tif/full.jpg'},
 {'title': 'Valles Marineris Hemisphere Enhanced',
  'img_url': 'https://astropedia.astrogeology.usgs.gov/download/Mars/Viking/valles_marineris_enhanced.tif/full.jpg'}]