In [1]:
# # Mission to Mars

# ![mission_to_mars](Images/mission_to_mars.jpg)

# In this assignment, you will build a web application that scrapes various websites for data related to the Mission to Mars and displays the information in a single HTML page. The following outlines what you need to do.

# ## Step 1 - Scraping

# Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.

# * Create a Jupyter Notebook file called `mission_to_mars.ipynb` and use this to complete all of your scraping and analysis tasks. The following outlines what you need to scrape.

# ### NASA Mars News

# * Scrape the [NASA Mars News Site](https://mars.nasa.gov/news/) and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

# ```python
# # Example:
# news_title = "NASA's Next Mars Mission to Investigate Interior of Red Planet"

# news_p = "Preparation of NASA's next spacecraft to Mars, InSight, has ramped up this summer, on course for launch next May from Vandenberg Air Force Base in central California -- the first interplanetary launch in history from America's West Coast."
# ```

# ### JPL Mars Space Images - Featured Image

# * Visit the url for JPL Featured Space Image [here](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars).

# * Use Splinter to navigate the site, find the image url for the current Featured Mars Image, and assign the url string to a variable called `featured_image_url`.

# * Make sure to find the image url to the full size `.jpg` image.

# * Make sure to save a complete url string for this image.

# ```python
# # Example:
# featured_image_url = 'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA16225_hires.jpg'
# ```

# ### Mars Weather

# * Visit the Mars Weather Twitter account [here](https://twitter.com/marswxreport?lang=en) and scrape the latest Mars weather tweet from the page. Save the tweet text for the weather report as a variable called `mars_weather`.

# ```python
# # Example:
# mars_weather = 'Sol 1801 (Aug 30, 2017), Sunny, high -21C/-5F, low -80C/-112F, pressure at 8.82 hPa, daylight 06:09-17:55'
# ```

# ### Mars Facts

# * Visit the Mars Facts webpage [here](http://space-facts.com/mars/) and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.

# * Use Pandas to convert the data to an HTML table string.

# ### Mars Hemispheres

# * Visit the USGS Astrogeology site [here](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars) to obtain high-resolution images for each of Mars's hemispheres.

# * You will need to click each of the links to the hemispheres in order to find the image url to the full resolution image.

# * Save both the image url string for the full resolution hemisphere image, and the Hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys `img_url` and `title`.

# * Append the dictionary with the image url string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

# ```python
# # Example:
# hemisphere_image_urls = [
#     {"title": "Valles Marineris Hemisphere", "img_url": "..."},
#     {"title": "Cerberus Hemisphere", "img_url": "..."},
#     {"title": "Schiaparelli Hemisphere", "img_url": "..."},
#     {"title": "Syrtis Major Hemisphere", "img_url": "..."},
# ]
# ```

# - - -

# ## Step 2 - MongoDB and Flask Application

# Use MongoDB with Flask templating to create a new HTML page that displays all of the information that was scraped from the URLs above.

# * Start by converting your Jupyter notebook into a Python script called `scrape_mars.py` with a function called `scrape` that will execute all of your scraping code from above and return one Python dictionary containing all of the scraped data.

# * Next, create a route called `/scrape` that will import your `scrape_mars.py` script and call your `scrape` function.

#   * Store the return value in Mongo as a Python dictionary.

# * Create a root route `/` that will query your Mongo database and pass the mars data into an HTML template to display the data.

# * Create a template HTML file called `index.html` that will take the mars data dictionary and display all of the data in the appropriate HTML elements. Use the following as a guide for what the final product should look like, but feel free to create your own design.

# ![final_app_part1.png](Images/final_app_part1.png)
# ![final_app_part2.png](Images/final_app_part2.png)

# - - -

# ## Hints

# * Use Splinter to navigate the sites when needed and BeautifulSoup to help find and parse out the necessary data.

# * Use PyMongo for CRUD operations on your database. For this homework, you can simply overwrite the existing document each time the `/scrape` url is visited and new data is obtained.

# * Use Bootstrap to structure your HTML template.



In [2]:
# Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.

# Standard library
from pprint import pprint
from time import sleep

# Third-party packages
import pandas as pd
import pymongo
import requests
import selenium
from bs4 import BeautifulSoup
from splinter import Browser

# Path to the local chromedriver binary; adjust it for your machine.
executable_path = {'executable_path': '/usr/local/bin/chromedriver'}
browser = Browser('chrome', **executable_path, headless=False)



In [3]:
# * Scrape the [NASA Mars News Site](https://mars.nasa.gov/news/) and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

nasa_web = "https://mars.nasa.gov/news/"
browser.visit(nasa_web)
html = browser.html
soup = BeautifulSoup(html,'html.parser')




In [4]:
# print(soup.body.prettify())
html_body = soup.body

# Each news item on the page is an <li class="slide"> element.
slides = soup.find_all('li', class_="slide")

# The first slide is the most recent article; keep its text for later use.
if slides:
    news_title = slides[0].find('div', class_='content_title').text.strip()
    news_p = slides[0].find('div', class_='article_teaser_body').text.strip()

# Print every article on the page for inspection.
for slide in slides:
    nasa_date = slide.find('div', class_='list_date').text
    nasa_news_title = slide.find('div', class_='content_title').text
    nasa_paragraph = slide.find('div', class_='article_teaser_body').text

    print(nasa_date)
    print(nasa_news_title)
    print(nasa_paragraph)
    print('----'*4)

#     post_1 = {
#         'nasa_date': nasa_date,
#         'nasa_title': nasa_news_title,
#         'nasa_paragraph': nasa_paragraph
#      }
#     collection.insert_one(post_1)


December 18, 2018
InSight Engineers Have Made a Martian Rock Garden
Reconstructing Mars here on Earth lets them practice setting down the lander's science instruments.
----------------
December 13, 2018
Mars InSight Lander Seen in First Images from Space 
Look closely, and you can make out the lander's solar panels.
----------------
December 11, 2018
NASA's InSight Takes Its First Selfie
Two new image mosaics detail the lander's deck and "workspace" — the surface where it will eventually set down its science instruments.
----------------
December  7, 2018
NASA InSight Lander 'Hears' Martian Winds 
Vibrations picked up by two spacecraft instruments have provided the first sounds of Martian wind.
----------------
December  6, 2018
NASA's Mars InSight Flexes Its Arm
Now unstowed, the spacecraft's robotic arm will point a camera located on its elbow and take images of the surroundings.
----------------
November 30, 2018
Mars New Home 'a Large Sandbox'
With InSight safely on the surface of 

In [5]:
# ### JPL Mars Space Images - Featured Image

# * Visit the url for JPL Featured Space Image [here](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars).
# 
mars_images = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'
browser.visit(mars_images)

In [6]:
# * Use Splinter to navigate the site, find the image url for the current Featured Mars Image, and assign the url string to a variable called `featured_image_url`.
# HTML for the "More" button:
#   <a class='button' href style='display: inline-block;'> More</a>

# * Make sure to find the image url to the full size `.jpg` image.

# h3 = "class=release_date"
# date = image.find('time')['datetime']

# * Make sure to save a complete url string for this image.
html_3 = browser.html
soup_3 = BeautifulSoup(html_3,'html.parser')

In [7]:
images = soup_3.find_all('li', class_='slide')

jpl_base_url = 'https://www.jpl.nasa.gov'

# Show at most the first 21 slides, numbered from 1.
for x, image in enumerate(images[:21], start=1):
    jpl_dates = image.find('h3', class_='release_date').text
    # data-fancybox-href holds the site-relative path to the large .jpg.
    jpl_link = image.a['data-fancybox-href']

    print(x)
    print('--'*8)
    print(jpl_dates)
    print('image url:', jpl_base_url + jpl_link)

#     post_2 = {
#         'date': jpl_dates,
#         'link': jpl_link,
#     }
#     collection.insert_one(post_2)

1
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22955_hires.jpg
2
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22954_hires.jpg
3
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22953_hires.jpg
4
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22952_hires.jpg
5
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22951_hires.jpg
6
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22880_hires.jpg
7
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22879_hires.jpg
8
----------------
December 18, 2018
image url: https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA22744_hires.jpg
9
----------------
December 18, 
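
# The "complete url string" requirement can also be met with the standard library rather than string concatenation. This is a minimal sketch, assuming the scraped `data-fancybox-href` value is a site-relative path like the ones printed above. `JPL_BASE_URL` and `to_absolute_url` are hypothetical names, not part of the notebook.

```python
from urllib.parse import urljoin

# Assumed site root, taken from the URLs printed above.
JPL_BASE_URL = "https://www.jpl.nasa.gov"


def to_absolute_url(href, base=JPL_BASE_URL):
    """Return a complete URL for a relative or already-absolute href."""
    return urljoin(base, href)


example_url = to_absolute_url("/spaceimages/images/largesize/PIA22955_hires.jpg")
print(example_url)
```

# `urljoin` leaves an href that is already absolute unchanged, so the helper stays safe if the site starts returning full URLs.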

In [8]:
# ### Mars Weather

# * Visit the Mars Weather Twitter account [here](https://twitter.com/marswxreport?lang=en) and scrape the latest Mars weather tweet from the page. Save the tweet text for the weather report as a variable called `mars_weather`.

mars_twitter = "https://twitter.com/marswxreport?lang=en"

browser.visit(mars_twitter)
html_2 = browser.html
soup_2 = BeautifulSoup(html_2,'html.parser')

In [9]:
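tweets = soup_2.find_all('p', class_="TweetTextSize TweetTextSize--normal js-tweet-text tweet-text")

mars_weather = None  # text of the most recent weather tweet

for tweet in tweets:
    tweet_post = tweet.text

    # Weather reports start with "Sol"; skip any other tweets.
    if tweet_post[:3] == "Sol":
        # Tweets are listed newest first, so the first match is the latest report.
        if mars_weather is None:
            mars_weather = tweet_post

        tweet_split = tweet_post.split(",")
        tweet_dict = {"Sol":tweet_split[0], "high":tweet_split[1], "low":tweet_split[2], "pressure": tweet_split[3], "daylight": tweet_split[4]}
        print('---------')
        print(tweet_dict)

#         collection.insert_one(tweet_dict)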
    


---------
{'Sol': 'Sol 2258 (2018-12-13)', 'high': ' high -6C/21F', 'low': ' low -70C/-93F', 'pressure': ' pressure at 8.41 hPa', 'daylight': ' daylight 06:37-18:51'}
---------
{'Sol': 'Sol 2257 (2018-12-12)', 'high': ' high -4C/24F', 'low': ' low -72C/-97F', 'pressure': ' pressure at 8.36 hPa', 'daylight': ' daylight 06:36-18:50'}
---------
{'Sol': 'Sol 2256 (2018-12-11)', 'high': ' high -11C/12F', 'low': ' low -70C/-93F', 'pressure': ' pressure at 8.43 hPa', 'daylight': ' daylight 06:36-18:50'}
---------
{'Sol': 'Sol 2255 (2018-12-10)', 'high': ' high -11C/12F', 'low': ' low -71C/-95F', 'pressure': ' pressure at 8.41 hPa', 'daylight': ' daylight 06:36-18:50'}
---------
{'Sol': 'Sol 2254 (2018-12-09)', 'high': ' high -13C/8F', 'low': ' low -71C/-95F', 'pressure': ' pressure at 8.45 hPa', 'daylight': ' daylight 06:35-18:49'}
---------
{'Sol': 'Sol 2253 (2018-12-08)', 'high': ' high -15C/5F', 'low': ' low -70C/-93F', 'pressure': ' pressure at 8.45 hPa', 'daylight': ' daylight 06:35-18:4
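
# Splitting on commas and indexing by position only works while every tweet has exactly the same five fields. The example tweet in the assignment text has an extra "Sunny" field and a comma inside the date, which would shift the values. The sketch below is one possible alternative that labels each field by the word it starts with. `parse_weather` and `FIELD_NAMES` are hypothetical names, and the field list is an assumption based on the tweets shown above.

```python
import re

# Field labels assumed from the tweets printed above.
FIELD_NAMES = ("high", "low", "pressure", "daylight")


def parse_weather(tweet_text):
    """Label each part of a weather tweet by the keyword that starts it."""
    report = {}

    # The sol number and its parenthesised date may contain a comma,
    # so match them as one unit instead of relying on split(",").
    sol = re.match(r"Sol \d+ \([^)]*\)", tweet_text)
    if sol:
        report["Sol"] = sol.group(0)

    for part in tweet_text.split(","):
        field = part.strip()
        for name in FIELD_NAMES:
            if field.startswith(name):
                report[name] = field
    return report


sample = ("Sol 1801 (Aug 30, 2017), Sunny, high -21C/-5F, low -80C/-112F, "
          "pressure at 8.82 hPa, daylight 06:09-17:55")
print(parse_weather(sample))
```

# Unlike the cell above, this version strips the leading space from each value and ignores fields it does not recognise, such as "Sunny".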

In [10]:
# ### Mars Facts

# * Visit the Mars Facts webpage [here](http://space-facts.com/mars/) and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.

# * Use Pandas to convert the data to an HTML table string.

In [11]:
mars_facts_url = "http://space-facts.com/mars/"

In [12]:
tables = pd.read_html(mars_facts_url)

In [13]:
df = tables[0]
df.columns = ['Description', 'Value']
df_desc = df.Description
df_val = df.Value




In [14]:
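# Called without a file path, to_html() returns the table as a string
# (passing 'table.html' would write a file and return None instead).
html_table = df.to_html()

# Strip newlines so the string can be embedded directly in a template.
html_table = html_table.replace('\n', '')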

In [15]:
# ### Mars Hemispheres

# * Visit the USGS Astrogeology site [here](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars) to obtain high-resolution images for each of Mars's hemispheres.

# * You will need to click each of the links to the hemispheres in order to find the image url to the full resolution image.

# * Save both the image url string for the full resolution hemisphere image, and the Hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys `img_url` and `title`.

# * Append the dictionary with the image url string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

# ```python
# # Example:
# hemisphere_image_urls = [
#     {"title": "Valles Marineris Hemisphere", "img_url": "..."},
#     {"title": "Cerberus Hemisphere", "img_url": "..."},
#     {"title": "Schiaparelli Hemisphere", "img_url": "..."},
#     {"title": "Syrtis Major Hemisphere", "img_url": "..."},
# ]
# ```

In [16]:
astrogeology_url = "https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars"
browser.visit(astrogeology_url)

astro_html = browser.html
astro = BeautifulSoup(astro_html,'html.parser')

In [17]:
browser.find_by_css('.thumb')[0].click()

In [18]:
browser.click_link_by_text('Sample')

In [19]:
link_1 = browser.url

In [20]:
browser.visit(astrogeology_url)
sleep(2)
browser.find_by_css('.thumb')[1].click()
browser.click_link_by_text("Sample")
link_2 = browser.url

In [21]:
browser.visit(astrogeology_url)
browser.find_by_css('.thumb')[2].click()
browser.click_link_by_text("Sample")
link_3 = browser.url

In [22]:
browser.visit(astrogeology_url)
browser.find_by_css('.thumb')[3].click()
browser.click_link_by_text("Sample")
link_4 = browser.url

In [25]:
links_dict = {
    'link1':link_1,
    'link2': link_2,
    'link3': link_3,
    'link4': link_4
}


{'link1': 'https://astrogeology.usgs.gov/search/map/Mars/Viking/cerberus_enhanced',
 'link2': 'https://astrogeology.usgs.gov/search/map/Mars/Viking/schiaparelli_enhanced',
 'link3': 'https://astrogeology.usgs.gov/search/map/Mars/Viking/syrtis_major_enhanced',
 'link4': 'https://astrogeology.usgs.gov/search/map/Mars/Viking/valles_marineris_enhanced'}
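
# The assignment asks for a list with one dictionary per hemisphere, while `links_dict` is a flat mapping. The URLs captured above also look like hemisphere page addresses rather than image files. The sketch below shows one way to reshape them into the requested list form. Deriving the title from the URL slug is an assumption, `title_from_link` is a hypothetical helper, and the `page_url` key is used on purpose because the full-resolution `img_url` would still have to be read from each page.

```python
def title_from_link(link):
    """Derive a display title from the last path segment of a page URL."""
    slug = link.rstrip("/").rsplit("/", 1)[-1]   # e.g. "syrtis_major_enhanced"
    if slug.endswith("_enhanced"):
        slug = slug[: -len("_enhanced")]
    return slug.replace("_", " ").title() + " Hemisphere"


# Page URLs copied from the output above.
links = [
    "https://astrogeology.usgs.gov/search/map/Mars/Viking/cerberus_enhanced",
    "https://astrogeology.usgs.gov/search/map/Mars/Viking/schiaparelli_enhanced",
    "https://astrogeology.usgs.gov/search/map/Mars/Viking/syrtis_major_enhanced",
    "https://astrogeology.usgs.gov/search/map/Mars/Viking/valles_marineris_enhanced",
]

# One dictionary per hemisphere, in the same order as the links.
hemisphere_pages = [
    {"title": title_from_link(link), "page_url": link} for link in links
]

for entry in hemisphere_pages:
    print(entry["title"], "->", entry["page_url"])
```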

In [28]:
# collection.insert_one(links_dict)

<pymongo.results.InsertOneResult at 0x116a29648>

In [None]:
browser.html

In [26]:
# ## Step 2 - MongoDB and Flask Application

# Use MongoDB with Flask templating to create a new HTML page that displays all of the information that was scraped from the URLs above.
conn = 'mongodb://localhost:27017'
client = pymongo.MongoClient(conn)

# * Start by converting your Jupyter notebook into a Python script called `scrape_mars.py` with a function called `scrape` that will execute all of your scraping code from above and return one Python dictionary containing all of the scraped data.
db = client.mars_db
collection = db.items

# * Next, create a route called `/scrape` that will import your `scrape_mars.py` script and call your `scrape` function.

#   * Store the return value in Mongo as a Python dictionary.

# * Create a root route `/` that will query your Mongo database and pass the mars data into an HTML template to display the data.

# * Create a template HTML file called `index.html` that will take the mars data dictionary and display all of the data in the appropriate HTML elements. Use the following as a guide for what the final product should look like, but feel free to create your own design.

# ![final_app_part1.png](Images/final_app_part1.png)
# ![final_app_part2.png](Images/final_app_part2.png)
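
# The hints suggest overwriting the existing document each time `/scrape` is visited. One possible way to do that with PyMongo is an upsert with an empty filter, sketched below. `save_mars_data` is a hypothetical helper, and `collection` is assumed to be a PyMongo collection such as `db.items` above.

```python
def save_mars_data(collection, mars_data):
    """Overwrite the single stored document with the latest scrape results."""
    # An empty filter matches the existing document, if there is one.
    # upsert=True inserts a new document when the collection is still empty.
    return collection.update_one({}, {"$set": mars_data}, upsert=True)
```

# Inside the `/scrape` route this would be called as `save_mars_data(collection, scrape())`, assuming `scrape()` returns the dictionary of scraped values. Note that `$set` only replaces the keys present in `mars_data`, so keys left over from an earlier scrape would remain in the document.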