## Step 1 - Scraping

Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.

* Create a Jupyter Notebook file called `mission_to_mars.ipynb` and use this to complete all of your scraping and analysis tasks. The following outlines what you need to scrape.

In [2]:
# Dependencies
import pymongo
import datetime
import requests
from bs4 import BeautifulSoup as bs
from splinter import Browser
import pandas as pd

In [3]:
# The default port used by MongoDB is 27017
# https://docs.mongodb.com/manual/reference/default-mongodb-port/
conn = 'mongodb://localhost:27017'
client = pymongo.MongoClient(conn)

In [4]:
# Declare the database
db = client.nasa_db

# Declare the collection
collection = db.nasa_db

### NASA Mars News

* Scrape the [NASA Mars News Site](https://mars.nasa.gov/news/) and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

```python
# Example:
news_title = "NASA's Next Mars Mission to Investigate Interior of Red Planet"

news_p = "Preparation of NASA's next spacecraft to Mars, InSight, has ramped up this summer, on course for launch next May from Vandenberg Air Force Base in central California -- the first interplanetary launch in history from America's West Coast."
```

In [5]:
# Use Splinter to launch a visible Chrome window through chromedriver
executable_path = {'executable_path': 'chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

In [6]:
# Open NASA Mars website
url = 'https://mars.nasa.gov/news/'
browser.visit(url)

In [7]:
# Parse the rendered page; use Chrome DevTools to find the elements that hold the title and article teaser
html = browser.html
soup = bs(html, 'html.parser')

In [8]:
# Scraping begins here with Beautiful Soup

# find_all returns a list-like ResultSet of matching elements
results = soup.find_all('li', class_='slide')

# Loop through returned results
for result in results:
    # Error handling
    try:
        # Identify and return the title and teaser of each listing
        news_t = result.find('div', class_='content_title')
        news_title = news_t.text.strip()

        news_p1 = result.find('div', class_='article_teaser_body')
        news_p = news_p1.text.strip()

        # Run only if title and paragraph are available
        if news_title and news_p:
            # Print results
            print('-------------')
            print('title: ', news_title)
            print('article: ', news_p)

            # Store the scraped values (not the literal variable names) in MongoDB;
            # a document is analogous to a record and a collection to a table
            post = {
                'title': news_title,
                'teaser': news_p,
            }
            collection.insert_one(post)

    except Exception as e:
        print(e)

# Try to click a link containing the text 'article_title'
try:
    browser.click_link_by_partial_text('article_title')
except Exception:
    print('-------------')
    print("Scraping Complete")

-------------
title:  NASA Invites Students to Name Mars 2020 Rover
article:  Through Nov. 1, K-12 students in the U.S. are encouraged to enter an essay contest to name NASA's next Mars rover.
-------------
title:  NASA's Mars Helicopter Attached to Mars 2020 Rover
article:  The helicopter will be first aircraft to perform flight tests on another planet.
-------------
title:  What's Mars Solar Conjunction, and Why Does It Matter?
article:  NASA spacecraft at Mars are going to be on their own for a few weeks when the Sun comes between Mars and Earth, interrupting communications.
-------------
title:  Scientists Explore Outback as Testbed for Mars
article:  Australia provides a great place for NASA's Mars 2020 and the ESA-Roscosmos ExoMars scientists to hone techniques in preparation for searching for signs ancient life on Mars.
-------------
title:  NASA-JPL Names 'Rolling Stones Rock' on Mars
article:  NASA's Mars InSight mission honored one of the biggest bands of all time at Pasadena

-------------
Scraping Complete



### JPL Mars Space Images - Featured Image

* Visit the url for JPL Featured Space Image [here](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars).

* Use splinter to navigate the site and find the image url for the current Featured Mars Image and assign the url string to a variable called `featured_image_url`.

* Make sure to find the image url to the full size `.jpg` image.

* Make sure to save a complete url string for this image.

```python
# Example:
featured_image_url = 'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA16225_hires.jpg'
```

In [9]:
url = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'
browser.visit(url)

In [21]:
# Capture the new html from the new page
img_html = browser.html

# Create and feed a new scraper the new html
img_scraper = bs(img_html, 'html.parser')

In [22]:
# Locate the "full image" button (this only finds the link; it does not click it)
full_img = img_scraper.find('a', {'class': 'button fancybox'})
full_img

In [23]:
# Click the 'more info' button
# browser.click_link_by_partial_text('more info')

In [24]:
# Scrape the image from the img element
mars_html = browser.html

mars_scraper = bs(mars_html, 'html.parser')

mars_img = mars_scraper.find('img', {'class': 'main_image'})
mars_img

<img alt="On Nov. 26, 2018, MarCO-B, one of NASAs Mars Cube One (MarCO) CubeSats, took this image of Mars from about 11,300 miles (18,200 kilometers) away shortly before NASAs InSight spacecraft landed on Mars." class="main_image" src="/spaceimages/images/largesize/PIA22831_hires.jpg" title="On Nov. 26, 2018, MarCO-B, one of NASAs Mars Cube One (MarCO) CubeSats, took this image of Mars from about 11,300 miles (18,200 kilometers) away shortly before NASAs InSight spacecraft landed on Mars."/>

In [25]:
# The src attribute is site-relative, so prepend the host to get a complete URL
featured_image_url = "https://www.jpl.nasa.gov" + mars_img.get('src')
print(featured_image_url)

https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22831_hires.jpg
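
The task asks for a complete URL string, while the `src` attribute scraped above holds only a site-relative path. As a minimal sketch, the two parts can be joined with the standard library's `urllib.parse.urljoin`, which also leaves an already-absolute `src` unchanged. The helper name `to_absolute_url` and the constant `JPL_BASE_URL` are illustrative assumptions, not part of the original notebook.

```python
from urllib.parse import urljoin

# Host of the JPL site, taken from the page URL visited above
JPL_BASE_URL = "https://www.jpl.nasa.gov"


def to_absolute_url(src, base=JPL_BASE_URL):
    """Return a complete URL for a site-relative or already-absolute src."""
    return urljoin(base, src)


# The relative path is the src value shown in the notebook output above
featured_image_url = to_absolute_url("/spaceimages/images/largesize/PIA22831_hires.jpg")
print(featured_image_url)
```

Unlike `print(a, b)`, which separates its arguments with a space, this stores one contiguous string in `featured_image_url` that can be reused later.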



### Mars Weather

* Visit the Mars Weather Twitter account [here](https://twitter.com/marswxreport?lang=en) and scrape the latest Mars weather tweet from the page. Save the tweet text for the weather report as a variable called `mars_weather`.

```python
# Example:
mars_weather = 'Sol 1801 (Aug 30, 2017), Sunny, high -21C/-5F, low -80C/-112F, pressure at 8.82 hPa, daylight 06:09-17:55'
```

In [26]:
url = 'https://twitter.com/marswxreport?lang=en'
browser.visit(url)

In [27]:
# Examine the results, then determine element that contains sought info
html = browser.html
# response = requests.get(url)
# soup = bs(response.text, 'lxml')
soup = bs(html, 'html.parser')
print(soup.prettify())


In [28]:
results = soup.find_all('p', class_='TweetTextSize')

In [29]:
# Loop through returned results
for result in results:
    # Error handling
    try:
        # Extract the tweet text
        report = result.text.strip()
        print('weather report: ', report)
        print('---------------')
    except Exception as e:
        print(e)      

weather report:  We won’t be hearing from @MarsCuriosity or @NASAInSight for the next 2 weeks during Mars solar conjunction. Read more about why Mars missions go silent every 2 years: https://www.wral.com/mars-spacecraft-go-quiet-during-solar-conjunction/18595551/ …pic.twitter.com/fWruE2v151
---------------
weather report:  InSight sol 265 (2019-08-25) low -99.4ºC (-146.9ºF) high -26.3ºC (-15.3ºF)
winds from the SSE at 5.3 m/s (12.0 mph) gusting to 16.1 m/s (35.9 mph)
pressure at 7.50 hPapic.twitter.com/9YLawm67zS
---------------
weather report:  InSight sol 264 (2019-08-24) low -101.0ºC (-149.7ºF) high -26.7ºC (-16.1ºF)
winds from the SW at 4.4 m/s (9.9 mph) gusting to 17.4 m/s (38.9 mph)
pressure at 7.60 hPapic.twitter.com/xIytu1MnDG
---------------
weather report:  InSight sol 263 (2019-08-23) low -100.9ºC (-149.6ºF) high -27.2ºC (-17.0ºF)
winds from the SW at 4.1 m/s (9.2 mph) gusting to 18.3 m/s (40.9 mph)
pressure at 7.60 hPa
---------------
weather report:  InSight sol 262 (2019
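
The output above shows that the newest tweet is not always a weather report, and that report text ends with a picture link. The sketch below picks the first tweet that looks like a report and trims that link. It assumes reports start with `InSight sol`, as in the output above; the function name `latest_weather` is hypothetical, and the sample strings are copied from that output rather than fetched live.

```python
import re

# Tweet texts copied from the notebook output above, newest first
tweets = [
    "We won’t be hearing from @MarsCuriosity or @NASAInSight for the next 2 weeks "
    "during Mars solar conjunction.",
    "InSight sol 265 (2019-08-25) low -99.4ºC (-146.9ºF) high -26.3ºC (-15.3ºF)\n"
    "winds from the SSE at 5.3 m/s (12.0 mph) gusting to 16.1 m/s (35.9 mph)\n"
    "pressure at 7.50 hPapic.twitter.com/9YLawm67zS",
]


def latest_weather(tweet_texts):
    """Return the first tweet that looks like a weather report, or None."""
    for text in tweet_texts:
        # Assumption: weather reports begin with this prefix
        if text.startswith("InSight sol"):
            # Drop the trailing picture link that Twitter appends to the text
            return re.sub(r"pic\.twitter\.com/\S+$", "", text).strip()
    return None


mars_weather = latest_weather(tweets)
print(mars_weather)
```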

### Mars Facts

* Visit the Mars Facts webpage [here](https://space-facts.com/mars/) and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.

* Use Pandas to convert the data to an HTML table string.


In [30]:
url = 'https://space-facts.com/mars/'

In [31]:
# Based on class example 09 (pandas scraping); read_html returns a list of DataFrames
tables = pd.read_html(url)
tables

[  Mars - Earth Comparison             Mars            Earth
 0               Diameter:         6,779 km        12,742 km
 1                   Mass:  6.39 × 10^23 kg  5.97 × 10^24 kg
 2                  Moons:                2                1
 3      Distance from Sun:   227,943,824 km   149,598,262 km
 4         Length of Year:   687 Earth days      365.24 days
 5            Temperature:    -153 to 20 °C      -88 to 58°C,
                       0                              1
 0  Equatorial Diameter:                       6,792 km
 1       Polar Diameter:                       6,752 km
 2                 Mass:  6.39 × 10^23 kg (0.11 Earths)
 3                Moons:            2 (Phobos & Deimos)
 4       Orbit Distance:       227,943,824 km (1.38 AU)
 5         Orbit Period:           687 days (1.9 years)
 6  Surface Temperature:                   -87 to -5 °C
 7         First Record:              2nd millennium BC
 8          Recorded By:           Egyptian astronomers]

In [32]:
# table first element
df = tables[0]
df.columns = ['Comparison', 'Mars', 'Earth']
df

| | Comparison | Mars | Earth |
|---|---|---|---|
| 0 | Diameter: | 6,779 km | 12,742 km |
| 1 | Mass: | 6.39 × 10^23 kg | 5.97 × 10^24 kg |
| 2 | Moons: | 2 | 1 |
| 3 | Distance from Sun: | 227,943,824 km | 149,598,262 km |
| 4 | Length of Year: | 687 Earth days | 365.24 days |
| 5 | Temperature: | -153 to 20 °C | -88 to 58°C |


In [33]:
# table second element
df = tables[1]
df.columns = ['Fun Facts', 'Mars']
df

| | Fun Facts | Mars |
|---|---|---|
| 0 | Equatorial Diameter: | 6,792 km |
| 1 | Polar Diameter: | 6,752 km |
| 2 | Mass: | 6.39 × 10^23 kg (0.11 Earths) |
| 3 | Moons: | 2 (Phobos & Deimos) |
| 4 | Orbit Distance: | 227,943,824 km (1.38 AU) |
| 5 | Orbit Period: | 687 days (1.9 years) |
| 6 | Surface Temperature: | -87 to -5 °C |
| 7 | First Record: | 2nd millennium BC |
| 8 | Recorded By: | Egyptian astronomers |


### Mars Hemispheres

* Visit the USGS Astrogeology site [here](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars) to obtain high resolution images for each of Mars's hemispheres.

* You will need to click each of the links to the hemispheres in order to find the image url to the full resolution image.

* Save both the image url string for the full resolution hemisphere image, and the Hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys `img_url` and `title`.

* Append the dictionary with the image url string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

```python
# Example:
hemisphere_image_urls = [
    {"title": "Valles Marineris Hemisphere", "img_url": "..."},
    {"title": "Cerberus Hemisphere", "img_url": "..."},
    {"title": "Schiaparelli Hemisphere", "img_url": "..."},
    {"title": "Syrtis Major Hemisphere", "img_url": "..."},
]
```

- - -


In [45]:
url_usgs = 'https://astrogeology.usgs.gov/search/results?q=hemisphere'
browser.visit(url_usgs)

MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=51481): Max retries exceeded with url: /session/5ff695d64c8185607ccd140348d93b61/url (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001FDACA81A58>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

In [None]:
#  following code was borrowed from TA example

In [None]:
urls = [
    'https://astrogeology.usgs.gov/search/map/Mars/Viking/cerberus_enhanced',
    'https://astrogeology.usgs.gov/search/map/Mars/Viking/schiaparelli_enhanced',
    'https://astrogeology.usgs.gov/search/map/Mars/Viking/syrtis_major_enhanced',
    'https://astrogeology.usgs.gov/search/map/Mars/Viking/valles_marineris_enhanced'
]

# create empty list to collect one dictionary per hemisphere
image_data = []

In [43]:
 

# Loop through the urls list and scrape each hemisphere page
for url in urls:
    print(url)

    # create empty dictionary for this hemisphere
    album = {}

    # visit the hemisphere page directly
    browser.visit(url)

    # parse the rendered page
    m_html = browser.html
    m_scraper = bs(m_html, 'html.parser')

    # scrape the title
    m_title = m_scraper.find('h2', {'class': 'title'}).get_text()

    # add title to album
    album['title'] = m_title

    # TODO: repeat the scraping and extracting steps for the image src
    # and store it under album['img_url']
    # jpl_image_url = jpl_image.get('src')
    # jpl_image_url

    # add this hemisphere's dictionary to the list
    image_data.append(album)

    # go back a page in the browser
    browser.back()

https://astrogeology.usgs.gov/search/map/Mars/Viking/cerberus_enhanced


MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=51481): Max retries exceeded with url: /session/5ff695d64c8185607ccd140348d93b61/url (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001FDACA81860>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

In [40]:
image_data

[{'title': 'Cerberus Hemisphere Enhanced'},
 {'title': 'Schiaparelli Hemisphere Enhanced'},
 {'title': 'Syrtis Major Hemisphere Enhanced'},
 {'title': 'Valles Marineris Hemisphere Enhanced'}]
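
The scraped titles above end in "Enhanced", while the example in the task text does not, and each dictionary still lacks its `img_url`. The sketch below shows one way to build entries in the requested shape once a title and an image `src` are in hand. The helper name `build_hemisphere_entry` is an assumption, and the image paths are hypothetical placeholders, not real USGS paths.

```python
from urllib.parse import urljoin

# Host of the USGS Astrogeology site, taken from the URLs above
USGS_BASE_URL = "https://astrogeology.usgs.gov"


def build_hemisphere_entry(raw_title, img_src, base=USGS_BASE_URL):
    """Return one {'title', 'img_url'} dictionary for a hemisphere page."""
    title = raw_title.strip()
    # Drop a trailing " Enhanced" so titles match the example in the task text
    if title.endswith(" Enhanced"):
        title = title[: -len(" Enhanced")]
    # urljoin turns a site-relative src into a complete URL
    return {"title": title, "img_url": urljoin(base, img_src)}


# Titles come from the notebook output above; the paths are HYPOTHETICAL
scraped = [
    ("Cerberus Hemisphere Enhanced", "/hypothetical/path/cerberus.jpg"),
    ("Schiaparelli Hemisphere Enhanced", "/hypothetical/path/schiaparelli.jpg"),
]

hemisphere_image_urls = [build_hemisphere_entry(t, s) for t, s in scraped]
print(hemisphere_image_urls)
```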

In [41]:
# close browser and start step 2
browser.quit()

## Step 2 - MongoDB and Flask Application

Use MongoDB with Flask templating to create a new HTML page that displays all of the information that was scraped from the URLs above.

* Start by converting your Jupyter notebook into a Python script called `scrape_mars.py` with a function called `scrape` that will execute all of your scraping code from above and return one Python dictionary containing all of the scraped data.

* Next, create a route called `/scrape` that will import your `scrape_mars.py` script and call your `scrape` function.

  * Store the return value in Mongo as a Python dictionary.

* Create a root route `/` that will query your Mongo database and pass the Mars data into an HTML template to display the data.

* Create a template HTML file called `index.html` that will take the Mars data dictionary and display all of the data in the appropriate HTML elements. Use the following as a guide for what the final product should look like, but feel free to create your own design.

![final_app_part1.png](Images/final_app_part1.png)
![final_app_part2.png](Images/final_app_part2.png)
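
One possible shape for the `scrape` function described in this step is sketched below. It is not the required implementation: it assumes each section of the notebook has been wrapped in its own zero-argument function and passed in through a hypothetical `section_scrapers` mapping, and the key names and `scraped_at` field are illustrative. Catching errors per section means one unreachable site does not discard the other results.

```python
import datetime


def scrape(section_scrapers):
    """Run each section scraper and return one dictionary of results.

    `section_scrapers` (hypothetical) maps a result key to a
    zero-argument callable that returns the scraped value.
    """
    mars_data = {
        "scraped_at": datetime.datetime.now(datetime.timezone.utc).isoformat()
    }
    for key, scraper in section_scrapers.items():
        try:
            mars_data[key] = scraper()
        except Exception as exc:
            # Record the failure and keep going with the remaining sections
            mars_data[key] = None
            print(f"{key} failed: {exc}")
    return mars_data


def broken_scraper():
    """Stand-in for a section whose site cannot be reached."""
    raise ConnectionError("site unreachable")


# Stand-in callables; real ones would wrap the notebook cells from Step 1
mars_data = scrape({
    "news_title": lambda: "NASA Invites Students to Name Mars 2020 Rover",
    "mars_weather": broken_scraper,
})
print(mars_data)
```

In the `/scrape` route, the returned dictionary could then be stored with a PyMongo call such as `collection.update_one({}, {"$set": mars_data}, upsert=True)`, which keeps a single document up to date instead of inserting a new one on every scrape.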

- - -

## Step 3 - Submission

To submit your work to BootCampSpot, create a new GitHub repository and upload the following:

1. The Jupyter Notebook containing the scraping code used.

2. Screenshots of your final application.

3. Submit the link to your new repository to BootCampSpot.