# Step 1 - Scraping

Complete your initial scraping using Jupyter Notebook, BeautifulSoup, Pandas, and Requests/Splinter.

* Create a Jupyter Notebook file called `mission_to_mars.ipynb` and use this to complete all of your scraping and analysis tasks. The following outlines what you need to scrape.

In [33]:
import pandas as pd
from bs4 import BeautifulSoup as bs
import requests
import os
import pymongo
import time
from webdriver_manager.chrome import ChromeDriverManager
from splinter import Browser

In [29]:
# URL of page to be scraped
news_url = 'https://redplanetscience.com/'

# Retrieve page with the requests module
response = requests.get(news_url)

# Create BeautifulSoup object; parse with 'html.parser'
soup = bs(response.text, 'html.parser')

# NASA Mars News

* Scrape the [Mars News Site](https://redplanetscience.com) and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.

```python
# Example:
news_title = "NASA's Next Mars Mission to Investigate Interior of Red Planet"

news_p = "Preparation of NASA's next spacecraft to Mars, InSight, has ramped up this summer, on course for launch next May from Vandenberg Air Force Base in central California -- the first interplanetary launch in history from America's West Coast."
```

In [30]:
# Retrieve the parent divs for all articles
results = soup.find_all('div', class_="list_text")

In [31]:
# loop over results to get article data
for result in results:
    # scrape the article header 
    news_title = result.find('div', class_="content_title").text
    
    # scrape the article subheader
    news_teaser = result.find('div', class_="article_teaser_body").text
    
    # print article data
    print('-----------------')
    print(news_title)
    print(news_teaser)
    print('-----------------')


In [32]:
print(results)

[]
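
The empty list printed above means the HTML returned by `requests.get` contained no `div.list_text` elements. One plausible cause, stated here as an assumption rather than something the notebook confirms, is that the page fills in its articles with JavaScript. In that case the rendered HTML from a Splinter session (`browser.html`) would be needed instead. The sketch below keeps the parsing separate from the fetching so it works with either source. `extract_latest_news` and `sample_html` are hypothetical names, and the sample markup is made up to mirror the class names used above.

```python
from bs4 import BeautifulSoup


def extract_latest_news(html):
    """Return (title, teaser) for the first article in html, or None if none is found."""
    soup = BeautifulSoup(html, "html.parser")
    # The first div.list_text is assumed to be the most recent article.
    article = soup.find("div", class_="list_text")
    if article is None:
        return None
    title = article.find("div", class_="content_title")
    teaser = article.find("div", class_="article_teaser_body")
    if title is None or teaser is None:
        return None
    return title.get_text(strip=True), teaser.get_text(strip=True)


# Made-up markup for illustration only; it is not copied from the real site.
sample_html = """
<div class="list_text">
  <div class="content_title">Sample title</div>
  <div class="article_teaser_body">Sample teaser text.</div>
</div>
"""

latest = extract_latest_news(sample_html)
if latest is not None:
    news_title, news_p = latest
```

Returning `None` when nothing matches makes an empty scrape explicit instead of leaving `news_title` undefined.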


# JPL Mars Space Images - Featured Image

* Visit the URL for the Featured Space Image page [here](https://spaceimages-mars.com).

* Use Splinter to navigate the site, find the image URL for the current Featured Mars Image, and assign the URL string to a variable called `featured_image_url`.

* Make sure to find the image URL of the full-size `.jpg` image.

* Make sure to save a complete URL string for this image.

```python
# Example:
featured_image_url = 'https://spaceimages-mars.com/image/featured/mars2.jpg'
```

In [40]:
def scrape_mars_images():
    # Set up Splinter
    executable_path = {'executable_path': ChromeDriverManager().install()}
    browser = Browser('chrome', **executable_path, headless=False)

    # Visit the Featured Space Image page
    image_url = "https://spaceimages-mars.com/"
    browser.visit(image_url)

    time.sleep(1)

    # Scrape page into Soup
    html = browser.html
    soup = bs(html, "html.parser")

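    # Find the relative src of the featured image (assumed to be the second <img> on the page)
    relative_image_path = soup.find_all('img')[1]["src"]
    # Build the complete URL; `image_url` already ends with "/"
    featured_image_url = image_url + relative_image_path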

    # Store data in a dictionary
    images_as_url = {
                "featured_image_url": featured_image_url,
                }

    # Close the browser after scraping
    browser.quit()

    # Return results
    return images_as_url

In [None]:
<img class="fancybox-image" src="image/mars/Proctor Crater Dunes 7.jpg" alt="">
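
The `src` value shown above is relative and contains spaces, so plain string concatenation would produce a URL that is not fully encoded. A minimal sketch of building the complete URL with the standard library follows. `build_image_url` is a hypothetical helper, and whether the site needs the spaces percent-encoded is an assumption.

```python
from urllib.parse import quote, urljoin


def build_image_url(base_url, relative_path):
    """Join a page URL and a relative image path, percent-encoding spaces."""
    # quote() leaves "/" untouched by default and turns " " into "%20".
    return urljoin(base_url, quote(relative_path))


featured_image_url = build_image_url(
    "https://spaceimages-mars.com/",
    "image/mars/Proctor Crater Dunes 7.jpg",
)
print(featured_image_url)
```

Note that `quote` would encode a path a second time if it were already percent-encoded, so this helper is meant for raw `src` values only.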

# Mars Facts

* Visit the Mars Facts webpage [here](https://galaxyfacts-mars.com) and use Pandas to scrape the table containing facts about the planet including Diameter, Mass, etc.

* Use Pandas to convert the data to a HTML table string.



In [52]:
facts_url = 'https://galaxyfacts-mars.com/'

In [53]:
# Use pandas' `read_html` to parse the URL into a list of DataFrames
tables = pd.read_html(facts_url)

In [67]:
# Find the relevant dataframe
mars_facts_df = pd.DataFrame(tables[1])

In [70]:
# Change column headers
mars_facts_df.columns = ['Measures', 'Mars Planet Profile']

In [77]:
# mars_facts_df

In [72]:
# Drop the first row and set the index to the `Measures` column
# (chained so that no in-place change is made on a slice)
mars_facts_df = mars_facts_df.iloc[1:].set_index('Measures')
mars_facts_df

| Measures | Mars Planet Profile |
| --- | --- |
| Polar Diameter: | 6,752 km |
| Mass: | 6.39 × 10^23 kg (0.11 Earths) |
| Moons: | 2 ( Phobos & Deimos ) |
| Orbit Distance: | 227,943,824 km (1.38 AU) |
| Orbit Period: | 687 days (1.9 years) |
| Surface Temperature: | -87 to -5 °C |
| First Record: | 2nd millennium BC |
| Recorded By: | Egyptian astronomers |


In [73]:
mars_facts_html = mars_facts_df.to_html()

In [76]:
print(mars_facts_html)

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Mars Planet Profile</th>
    </tr>
    <tr>
      <th>Measures</th>
      <th></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>Polar Diameter:</th>
      <td>6,752 km</td>
    </tr>
    <tr>
      <th>Mass:</th>
      <td>6.39 × 10^23 kg (0.11 Earths)</td>
    </tr>
    <tr>
      <th>Moons:</th>
      <td>2 ( Phobos &amp; Deimos )</td>
    </tr>
    <tr>
      <th>Orbit Distance:</th>
      <td>227,943,824 km (1.38 AU)</td>
    </tr>
    <tr>
      <th>Orbit Period:</th>
      <td>687 days (1.9 years)</td>
    </tr>
    <tr>
      <th>Surface Temperature:</th>
      <td>-87 to -5 °C</td>
    </tr>
    <tr>
      <th>First Record:</th>
      <td>2nd millennium BC</td>
    </tr>
    <tr>
      <th>Recorded By:</th>
      <td>Egyptian astronomers</td>
    </tr>
  </tbody>
</table>
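
The HTML printed above has two header rows because the index carries the name `Measures`. If a single header row is preferred when embedding the table in a page, one option is to keep `Measures` as a regular column and pass `index=False`. The sketch below uses two rows copied from the output above. The CSS class names are placeholders, not something the assignment specifies.

```python
import pandas as pd

# Two rows taken from the table above, kept as ordinary columns.
facts = pd.DataFrame(
    {
        "Measures": ["Polar Diameter:", "Orbit Period:"],
        "Mars Planet Profile": ["6,752 km", "687 days (1.9 years)"],
    }
)

# index=False drops the row labels; classes adds CSS classes to the <table> tag.
facts_html = facts.to_html(index=False, classes="table table-striped")
print(facts_html)
```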


# Mars Hemispheres

* Visit the Astrogeology site [here](https://marshemispheres.com) to obtain high-resolution images for each of Mars's hemispheres.

* You will need to click each of the links to the hemispheres in order to find the image URL of the full-resolution image.

* Save both the image URL string for the full-resolution hemisphere image and the hemisphere title containing the hemisphere name. Use a Python dictionary to store the data using the keys `img_url` and `title`.

* Append the dictionary with the image URL string and the hemisphere title to a list. This list will contain one dictionary for each hemisphere.

```python
# Example:
hemisphere_image_urls = [
    {"title": "Valles Marineris Hemisphere", "img_url": "..."},
    {"title": "Cerberus Hemisphere", "img_url": "..."},
    {"title": "Schiaparelli Hemisphere", "img_url": "..."},
    {"title": "Syrtis Major Hemisphere", "img_url": "..."},
]
```
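
The list described above can be assembled as each hemisphere page is visited. The sketch below covers only the bookkeeping, not the Splinter navigation. `add_hemisphere` is a hypothetical helper, and the relative image paths are placeholders rather than real paths from the site.

```python
from urllib.parse import urljoin

BASE_URL = "https://marshemispheres.com/"


def add_hemisphere(records, title, relative_img_path, base_url=BASE_URL):
    """Append one {"title", "img_url"} dictionary to records and return the list."""
    records.append({"title": title, "img_url": urljoin(base_url, relative_img_path)})
    return records


hemisphere_image_urls = []
# Placeholder paths; in the notebook these would come from each hemisphere page.
for title, path in [
    ("Cerberus Hemisphere", "images/placeholder_cerberus.jpg"),
    ("Schiaparelli Hemisphere", "images/placeholder_schiaparelli.jpg"),
]:
    add_hemisphere(hemisphere_image_urls, title, path)
```

Using `urljoin` here means the stored `img_url` is always a complete URL, whether the scraped path starts with a slash or not.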