<h1>Web Scraping Challenge</h1>

<h2>Step 1 - Scraping</h2>


In [1]:
from bs4 import BeautifulSoup as bs
import requests
import pymongo
from splinter import Browser
from selenium import webdriver
from flask import Flask, render_template, redirect
from flask_pymongo import PyMongo
import pandas as pd

In [2]:
executable_path = {'executable_path': 'chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

<h3>NASA Mars News</h3>

- Scrape the Mars News Site and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

In [4]:
##PAGE TO BE SCRAPED##
news_url = 'https://mars.nasa.gov/news/'
browser.visit(news_url)
html = browser.html
news_soup = bs(html,'html.parser')

In [6]:
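##COLLECT THE LATEST NEWS TITLE AND PARAGRAPH TEXT##
# [0] takes the first matching element on the page, which may not belong to the
# latest article (the title printed below, "Mars Now", does not appear to match the teaser)
news_title = news_soup.find_all('div', class_='content_title')[0].text
news_body = news_soup.find_all('div', class_='article_teaser_body')[0].text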

print(news_title)
print("----------")
print(news_body)

Mars Now
----------
The six-wheeled scientist is heading south to explore Jezero Crater’s lakebed in search of signs of ancient microbial life.


<h3>JPL Mars Space Images</h3>

- Use Splinter to navigate the site, find the image URL for the current Featured Mars Image, and assign the URL string to a variable called featured_image_url.

- Make sure to find the image URL of the full-size .jpg image.

- Make sure to save a complete URL string for this image.

In [37]:
def featured():
    """Return the complete URL of the current featured Mars image."""
    url = "https://spaceimages-mars.com"
    browser.visit(url)
    soup = bs(browser.html, "html.parser")
    # The link to the .jpg is the href of the anchor inside the header div
    header = soup.find('div', class_='header')
    link = header.find('a', class_="showimg fancybox-thumbs")
    # Combine the site URL with the relative path to get a complete URL
    featured_image_url = f"{url}/{link['href']}"
    return featured_image_url

featured()

'https://spaceimages-mars.com/image/featured/mars2.jpg'
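Building the complete URL with an f-string works here because the base URL has no trailing slash and the scraped path has no leading slash. A minimal sketch of a more forgiving alternative uses `urljoin` from the standard library. The base URL and relative path below are copied from the output above, and the names `base_url` and `relative_path` are introduced only for illustration.

```python
from urllib.parse import urljoin

# Values copied from the cell above; variable names are illustrative
base_url = "https://spaceimages-mars.com"
relative_path = "image/featured/mars2.jpg"

# urljoin resolves the relative path against the base URL and
# inserts or collapses the separating slash as needed
featured_image_url = urljoin(base_url, relative_path)

# The same result is produced when the path starts with a slash
# or the base ends with one
with_leading_slash = urljoin(base_url, "/" + relative_path)
with_trailing_slash = urljoin(base_url + "/", relative_path)

print(featured_image_url)
```

This is one possible approach rather than the notebook's own method. It mainly helps if the site later changes how the `href` is written.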

<h3>Mars Facts</h3>

- Visit the Mars Facts webpage (https://galaxyfacts-mars.com/) and use Pandas to scrape the table containing facts about the planet, including Diameter, Mass, etc.

- Use Pandas to convert the data to an HTML table string.

In [49]:
url = 'https://galaxyfacts-mars.com/'
tables = pd.read_html(url)
tables

[                         0                1                2
 0  Mars - Earth Comparison             Mars            Earth
 1                Diameter:         6,779 km        12,742 km
 2                    Mass:  6.39 × 10^23 kg  5.97 × 10^24 kg
 3                   Moons:                2                1
 4       Distance from Sun:   227,943,824 km   149,598,262 km
 5          Length of Year:   687 Earth days      365.24 days
 6             Temperature:     -87 to -5 °C      -88 to 58°C,
                       0                              1
 0  Equatorial Diameter:                       6,792 km
 1       Polar Diameter:                       6,752 km
 2                 Mass:  6.39 × 10^23 kg (0.11 Earths)
 3                Moons:          2 ( Phobos & Deimos )
 4       Orbit Distance:       227,943,824 km (1.38 AU)
 5         Orbit Period:           687 days (1.9 years)
 6  Surface Temperature:                   -87 to -5 °C
 7         First Record:              2nd millennium BC

In [60]:
mars_fact = tables[0]
# Promote the first row to column headers, then drop that row from the data
mars_fact.columns = mars_fact.iloc[0]
mars_fact = mars_fact.iloc[1:].reset_index(drop=True)
mars_fact.columns.name = None
mars_fact

  Mars - Earth Comparison             Mars            Earth
0               Diameter:         6,779 km        12,742 km
1                   Mass:  6.39 × 10^23 kg  5.97 × 10^24 kg
2                  Moons:                2                1
3      Distance from Sun:   227,943,824 km   149,598,262 km
4         Length of Year:   687 Earth days      365.24 days
5            Temperature:     -87 to -5 °C      -88 to 58°C


In [61]:
fact_table = mars_fact.to_html()
fact_table

'<table border="1" class="dataframe">\n  <thead>\n    <tr style="text-align: right;">\n      <th></th>\n      <th>Mars - Earth Comparison</th>\n      <th>Mars</th>\n      <th>Earth</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>Diameter:</td>\n      <td>6,779 km</td>\n      <td>12,742 km</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>Mass:</td>\n      <td>6.39 × 10^23 kg</td>\n      <td>5.97 × 10^24 kg</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>Moons:</td>\n      <td>2</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>Distance from Sun:</td>\n      <td>227,943,824 km</td>\n      <td>149,598,262 km</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>Length of Year:</td>\n      <td>687 Earth days</td>\n      <td>365.24 days</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>Temperature:</td>\n      <td>-87 to -5 °C</td>\n      <td>-88 to 58°C</td>\n    </tr>\n  </tbody>\n</table>'

In [62]:
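# str.replace returns a new string, so the result must be assigned to be kept.
# fact_table_clean is a new name introduced here; fact_table is printed unchanged.
fact_table_clean = fact_table.replace('\n', '')
print(fact_table)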

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Mars - Earth Comparison</th>
      <th>Mars</th>
      <th>Earth</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>Diameter:</td>
      <td>6,779 km</td>
      <td>12,742 km</td>
    </tr>
    <tr>
      <th>1</th>
      <td>Mass:</td>
      <td>6.39 × 10^23 kg</td>
      <td>5.97 × 10^24 kg</td>
    </tr>
    <tr>
      <th>2</th>
      <td>Moons:</td>
      <td>2</td>
      <td>1</td>
    </tr>
    <tr>
      <th>3</th>
      <td>Distance from Sun:</td>
      <td>227,943,824 km</td>
      <td>149,598,262 km</td>
    </tr>
    <tr>
      <th>4</th>
      <td>Length of Year:</td>
      <td>687 Earth days</td>
      <td>365.24 days</td>
    </tr>
    <tr>
      <th>5</th>
      <td>Temperature:</td>
      <td>-87 to -5 °C</td>
      <td>-88 to 58°C</td>
    </tr>
  </tbody>
</table>


<h3>Mars Hemispheres</h3>

- Save both the image URL string for the full-resolution hemisphere image and the hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys img_url and title.

- Append the dictionary with the image URL string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.
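The data structure described above can be sketched without a browser. The example below assumes the scraping step has already produced pairs of hemisphere title and relative image path. The helper `build_hemisphere_list`, the base URL `https://example.com/`, and the sample titles and paths are hypothetical placeholders, not values scraped from the real site.

```python
from urllib.parse import urljoin


def build_hemisphere_list(base_url, scraped_pairs):
    """Turn (title, relative image path) pairs into a list of dictionaries."""
    hemisphere_image_urls = []
    for title, relative_path in scraped_pairs:
        # One dictionary per hemisphere, using the keys img_url and title
        hemisphere_image_urls.append({
            "title": title.strip(),
            "img_url": urljoin(base_url, relative_path),
        })
    return hemisphere_image_urls


# Hypothetical placeholder data standing in for scraped values
sample_pairs = [
    ("Example Hemisphere A ", "images/example_a_full.jpg"),
    ("Example Hemisphere B", "images/example_b_full.jpg"),
]

hemispheres = build_hemisphere_list("https://example.com/", sample_pairs)
for entry in hemispheres:
    print(entry["title"], entry["img_url"])
```

In the notebook itself, the pairs would come from visiting each hemisphere page with Splinter and reading the title and full-resolution image link from the parsed HTML.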