### Before You Begin
---
1. Create a new repository for this project called `web-scraping-challenge`. **Do not add this homework to an existing repository.**
2. Clone the new repository to your computer.
3. Inside your local Git repository, create a directory for the web scraping challenge and name it after the challenge: `Missions_to_Mars`.
4. Add your notebook files and your Flask app to this folder.
5. Push the above changes to GitHub or GitLab.

### Step 1 - Scraping
---
#### Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.
- Create a Jupyter Notebook file called `mission_to_mars.ipynb` and use this to complete all of your scraping and analysis tasks. The following outlines what you need to scrape.

In [1]:
# Import module used to connect Python with MongoDB
import pymongo
# Import BeautifulSoup
from bs4 import BeautifulSoup as bs
# Import Requests module
import requests
# Import splinter
from splinter import Browser
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
# Import Pandas
import pandas as pd

In [2]:
# The default port used by MongoDB is 27017
# https://docs.mongodb.com/manual/reference/default-mongodb-port/
conn = 'mongodb://localhost:27017'
client = pymongo.MongoClient(conn)

In [3]:
# Define the 'classDB' database in Mongo
db = client.classDB

In [8]:
# Setup splinter
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)

ValueError: Could not get version for Chrome with this command: reg query "HKEY_CURRENT_USER\Software\Google\Chrome\BLBeacon" /v version

### Scrape NASA Mars News
---
- `https://mars.nasa.gov/news/`
- Scrape the NASA Mars News Site and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

##### Example:
- `news_title = "NASA's Next Mars Mission to Investigate Interior of Red Planet"`
- `news_p = "Preparation of NASA's next spacecraft to Mars, InSight, has ramped up this summer, on course for launch next May from Vandenberg Air Force Base in central California -- the first interplanetary launch in history from America's West Coast."`

In [None]:
url = 'https://mars.nasa.gov/news/'
browser.visit(url)

In [None]:
html = browser.html
soup = bs(html, 'html.parser')
print(soup.prettify())

In [None]:
item_list = soup.find('ul', class_='item_list')

In [None]:
slides = item_list.find_all('li', class_="slide")

In [None]:
news_list = []

In [None]:
for slide in slides:
    # Store the text of each element rather than the Tag objects themselves
    news_date = slide.find('div', class_='list_date').text.strip()
    news_list.append(news_date)
    news_title = slide.find('h3').text.strip()
    news_list.append(news_title)
    news_p = slide.find('div', class_='rollover_description_inner').text.strip()
    news_list.append(news_p)
    print('-----')
    print(news_date)
    print(news_title)
    print(news_p)
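
The loop above visits every slide, so once it finishes, `news_title` and `news_p` refer to the last item on the page. The task asks for the latest article only. Assuming the page lists the newest article first, a minimal sketch of reading just the first slide could look like this. The HTML string is a made-up stand-in that reuses the class names from the cells above, and `latest_news` is a hypothetical helper name; in the notebook you would pass `browser.html` instead.

```python
from bs4 import BeautifulSoup

# Stand-in markup (not real NASA content) using the class names from the notebook cells
sample_html = """
<ul class="item_list">
  <li class="slide">
    <div class="list_date">Date A</div>
    <h3>First headline</h3>
    <div class="rollover_description_inner">First teaser paragraph.</div>
  </li>
  <li class="slide">
    <div class="list_date">Date B</div>
    <h3>Second headline</h3>
    <div class="rollover_description_inner">Second teaser paragraph.</div>
  </li>
</ul>
"""

def latest_news(html):
    """Return (title, paragraph) text from the first slide, or (None, None) if absent."""
    soup = BeautifulSoup(html, "html.parser")
    item_list = soup.find("ul", class_="item_list")
    slide = item_list.find("li", class_="slide") if item_list else None
    if slide is None:
        return None, None
    title = slide.find("h3")
    teaser = slide.find("div", class_="rollover_description_inner")
    return (
        title.get_text(strip=True) if title else None,
        teaser.get_text(strip=True) if teaser else None,
    )

news_title, news_p = latest_news(sample_html)
print(news_title)
print(news_p)
```

Returning `None` values instead of raising keeps the notebook running if the page has not finished loading or its markup has changed.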

### Scrape JPL Mars Space Images
---
- `https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars`
- JPL Mars Space Images - Featured Image
- Visit the JPL Featured Space Image page [here](https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars).
- Use Splinter to navigate the site, find the image URL for the current Featured Mars Image, and assign the URL string to a variable called `featured_image_url`.
- Make sure to find the URL of the full-size `.jpg` image.
- Make sure to save the complete URL string for this image.

##### Example:
- `featured_image_url = 'https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA16225_hires.jpg'`
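
Pages often give image paths relative to the site root, so the scraped value usually needs the domain added to become a complete URL. One way to do that without worrying about leading or doubled slashes is `urljoin` from the standard library. The sketch below is illustrative only: `BASE_URL` and `to_absolute` are names chosen here, and the path is taken from the example URL above.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.jpl.nasa.gov/"

def to_absolute(href):
    """Resolve an href scraped from the page against the site root."""
    return urljoin(BASE_URL, href)

# Root-relative path, as in the example URL above
print(to_absolute("/spaceimages/images/largesize/PIA16225_hires.jpg"))
# A path without a leading slash resolves to the same place
print(to_absolute("spaceimages/images/largesize/PIA16225_hires.jpg"))
# An href that is already absolute is returned unchanged
print(to_absolute("https://www.jpl.nasa.gov/spaceimages/images/largesize/PIA16225_hires.jpg"))
```

All three calls print the same complete URL.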

In [None]:
url = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'
browser.visit(url)

In [None]:
html = browser.html
soup = bs(html, 'html.parser')
print(soup.prettify())

In [None]:
articles = soup.find('ul', class_='articles')

In [None]:
slides = articles.find_all('li')

In [None]:
url_list = []

In [None]:
for slide in slides:
    featured_image_url = slide.find('a')['data-fancybox-href']
    # Strip any leading slash from the scraped path so the joined URL has no '//'
    featured_image_url = 'https://www.jpl.nasa.gov/' + featured_image_url.lstrip('/')
    url_list.append(featured_image_url)

In [None]:
url_list

### Scrape Mars Facts
---
- `https://space-facts.com/mars/`
- Visit the Mars Facts webpage [here](https://space-facts.com/mars/) and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.
- Use Pandas to convert the data to an HTML table string.

In [None]:
url = 'https://space-facts.com/mars/'

In [None]:
tables = pd.read_html(url)
tables

In [None]:
type(tables)

In [None]:
df = tables[0]
df

In [None]:
html_table = df.to_html()
html_table

In [None]:
html_table.replace('\n', '')

In [None]:
df.to_html('../Resources/mars_facts_table.html')
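
For a table with no header row, `read_html` typically returns a DataFrame whose columns are labeled `0` and `1`, and `to_html` writes the row index as an extra column by default. The sketch below shows one way to tidy both before producing the HTML string. It uses a small stand-in DataFrame with placeholder values instead of the scraped table, and the column names `Description` and `Value` are a choice made here, not something taken from the site.

```python
import pandas as pd

# Stand-in for tables[0]: two unlabeled columns with placeholder values (not scraped data)
df = pd.DataFrame([
    ["Fact A:", "value A"],
    ["Fact B:", "value B"],
])

# Give the columns readable names
df.columns = ["Description", "Value"]

# index=False leaves out the 0, 1, ... row labels
html_table = df.to_html(index=False)

# Remove newlines so the string can be embedded on a single line
html_table = html_table.replace("\n", "")
print(html_table)
```

Assigning the result of `replace` back to `html_table` matters, because strings are immutable and `replace` returns a new string.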

### Scrape USGS Astrogeology Mars Hemispheres
---
- `https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars`
- Visit the USGS Astrogeology site [here](https://astrogeology.usgs.gov/search/results?q=hemisphere+enhanced&k1=target&v1=Mars) to obtain high-resolution images of each of Mars's hemispheres.
- You will need to click each hemisphere link to find the URL of the full-resolution image.
- Save both the image URL string for the full-resolution hemisphere image and the title containing the hemisphere name. Use a Python dictionary to store the data using the keys `img_url` and `title`.
- Append the dictionary with the image URL string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

##### Example:
`hemisphere_image_urls = [
    {"title": "Valles Marineris Hemisphere", "img_url": "..."},
    {"title": "Cerberus Hemisphere", "img_url": "..."},
    {"title": "Schiaparelli Hemisphere", "img_url": "..."},
    {"title": "Syrtis Major Hemisphere", "img_url": "..."},
]`

In [None]:
hemisphere_image_urls = []

In [None]:
url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/cerberus_enhanced'

In [None]:
browser.visit(url)
html = browser.html
soup = bs(html, 'html.parser')
title = soup.find('h2', class_='title').text
hemisphere_image_urls.append(title)
ul = soup.find('ul')
img_url = ul.find('a')['href']
hemisphere_image_urls.append(img_url)

In [None]:
url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/schiaparelli_enhanced'

In [None]:
browser.visit(url)
html = browser.html
soup = bs(html, 'html.parser')
title = soup.find('h2', class_='title').text
hemisphere_image_urls.append(title)
ul = soup.find('ul')
img_url = ul.find('a')['href']
hemisphere_image_urls.append(img_url)

In [None]:
url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/syrtis_major_enhanced'

In [None]:
browser.visit(url)
html = browser.html
soup = bs(html, 'html.parser')
title = soup.find('h2', class_='title').text
hemisphere_image_urls.append(title)
ul = soup.find('ul')
img_url = ul.find('a')['href']
hemisphere_image_urls.append(img_url)

In [None]:
url = 'https://astrogeology.usgs.gov/search/map/Mars/Viking/valles_marineris_enhanced'

In [None]:
browser.visit(url)
html = browser.html
soup = bs(html, 'html.parser')
title = soup.find('h2', class_='title').text
hemisphere_image_urls.append(title)
ul = soup.find('ul')
img_url = ul.find('a')['href']
hemisphere_image_urls.append(img_url)

In [None]:
hemisphere_image_urls

In [None]:
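# Define a function to convert the flat [title, img_url, title, img_url, ...] list
# into a list with one dictionary per hemisphere
def convert_to_dicts(flat_list):
    it = iter(flat_list)
    # zip(it, it) pairs consecutive items: (title, img_url)
    return [{'title': title, 'img_url': img_url} for title, img_url in zip(it, it)]

# Driver code
print(convert_to_dicts(hemisphere_image_urls))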
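
As an alternative to repeating the same cell for each hemisphere and pairing up a flat list afterwards, the parsing step can be wrapped in one helper that returns a dictionary with the `title` and `img_url` keys directly. This is a sketch under assumptions: `parse_hemisphere` is a hypothetical name, the selectors are the ones used in the cells above, and the sample page is stand-in markup with an `example.com` link, not real USGS content.

```python
from bs4 import BeautifulSoup

def parse_hemisphere(html):
    """Build one {'title', 'img_url'} dictionary from a hemisphere page's HTML."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.find("h2", class_="title").get_text(strip=True)
    img_url = soup.find("ul").find("a")["href"]
    return {"title": title, "img_url": img_url}

# Stand-in page containing only the tags the selectors rely on
sample_page = """
<h2 class="title">Example Hemisphere Enhanced</h2>
<ul><li><a href="https://example.com/full.jpg">Sample</a></li></ul>
"""

hemisphere_image_urls = [parse_hemisphere(page) for page in [sample_page]]
print(hemisphere_image_urls)

# In the notebook, the same helper could be driven by the browser:
# for url in hemisphere_urls:   # hemisphere_urls: a list of the four page URLs used above
#     browser.visit(url)
#     hemisphere_image_urls.append(parse_hemisphere(browser.html))
```

Building each dictionary at scrape time keeps every title next to its image URL, so no conversion step is needed afterwards.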