# Web Scraping Challenge

## Part 1 - Scraping
### NASA Mars News

* Scrape the [NASA Mars News Site](https://mars.nasa.gov/news/) and collect the latest News Title and Paragraph Text. Assign the text to variables that you can reference later.


In [1]:
# Import the dependencies used in this notebook
import requests
import os
from datetime import datetime
from bs4 import BeautifulSoup as bs
from splinter import Browser
from pprint import pprint

In [2]:
# Establishing the executable path of the chromedriver
executable_path = {'executable_path': 'chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

In [3]:
# Fetch the page two ways: static HTML via requests, rendered HTML via the browser
url = 'https://mars.nasa.gov/news/'
response = requests.get(url)
browser.visit(url)

In [4]:
soup_title = bs(response.text, 'html.parser')
soup_paragraph = bs(browser.html, 'html.parser')
pprint(soup_title.prettify()) # Sanity checking
pprint(soup_paragraph.prettify()) # Sanity checking

('<!DOCTYPE html>\n'
 '<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">\n'
 ' <head>\n'
 '  <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>\n'
 '  <!-- Always force latest IE rendering engine or request Chrome Frame -->\n'
 '  <meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/>\n'
 '  <script type="text/javascript">\n'
 '   '
 'window.NREUM||(NREUM={});NREUM.info={"beacon":"bam.nr-data.net","errorBeacon":"bam.nr-data.net","licenseKey":"5e33925808","applicationID":"59562082","transactionName":"JVcPR0MLWApSRU1eAQVVEhxSC1oSUlkWbBMHXwRAHhdcCUA=","queueTime":0,"applicationTime":291,"agent":""}\n'
 '  </script>\n'
 '  <script type="text/javascript">\n'
 '   '
 '(window.NREUM||(NREUM={})).loader_config={xpid:"VQcPUlZTDxAFXVRUBQEPVA==",licenseKey:"5e33925808",applicationID:"59562082"};window.NREUM||(NREUM={}),__nr_require=function(t,n,e){function '
 'r(e){if(!n[e]){var '
 'o=n[e]={exports:{}};t[e][0].call(o.exports,function(n){var '
 'o=t[e][

('<!DOCTYPE html>\n'
 '<html class="no-flash cookies geolocation svg picture canvas video webgl '
 'srcdoc supports no-hiddenscroll no-touchevents fullscreen flexbox '
 'cssanimations flexboxlegacy no-flexboxtweener csstransforms csstransforms3d '
 'csstransitions preserve3d -webkit-" lang="en" style="--vh:623px;" '
 'xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">\n'
 ' <head>\n'
 '  <script '
 'src="https://bam.nr-data.net/1/5e33925808?a=59562082&amp;v=1167.2a4546b&amp;to=JVcPR0MLWApSRU1eAQVVEhxSC1oSUlkWbBMHXwRAHhdcCUA%3D&amp;rst=2835&amp;ref=https://mars.nasa.gov/news/&amp;ap=291&amp;be=211&amp;fe=2714&amp;dc=1339&amp;af=err,xhr,stn,ins&amp;perf=%7B%22timing%22:%7B%22of%22:1586563081350,%22n%22:0,%22f%22:1,%22dn%22:6,%22dne%22:6,%22c%22:6,%22s%22:26,%22ce%22:59,%22rq%22:64,%22rp%22:92,%22rpe%22:102,%22dl%22:120,%22di%22:1339,%22ds%22:1339,%22de%22:1681,%22dc%22:2712,%22l%22:2713,%22le%22:2720%7D,%22navigation%22:%7B%7D%7D&amp;fp=1081&amp;fcp=1081&amp;jsonp=NREUM.setToken" '
 'ty

 'Brakes\n'
 '                 </h3>\n'
 '                </div>\n'
 '               </div>\n'
 '              </a>\n'
 '              <div class="list_text">\n'
 '               <div class="list_date">\n'
 '                April  3, 2020\n'
 '               </div>\n'
 '               <div class="content_title">\n'
 '                <a '
 'href="/news/8641/nasas-perseverance-mars-rover-gets-its-wheels-and-air-brakes/" '
 'target="_self">\n'
 "                 NASA's Perseverance Mars Rover Gets Its Wheels and Air "
 'Brakes\n'
 '                </a>\n'
 '               </div>\n'
 '               <div class="article_teaser_body">\n'
 '                After the rover was shipped from JPL to Kennedy Space '
 'Center, the team is getting closer to finalizing the spacecraft for launch '
 'later this summer.\n'
 '               </div>\n'
 '              </div>\n'
 '             </div>\n'
 '            </li>\n'
 '            <li class="slide">\n'
 '             <div class="image_and_descripti

 '             <div class="image_and_description_container">\n'
 '              <a '
 'href="/news/8595/nasas-maven-maps-winds-in-the-martian-upper-atmosphere-that-mirror-the-terrain-below-and-gives-clues/" '
 'target="_self">\n'
 '               <div class="rollover_description">\n'
 '                <div class="rollover_description_inner">\n'
 '                 Researchers have created the first map of wind circulation '
 'in the upper atmosphere of a planet besides Earth, using data from NASA’s '
 'MAVEN spacecraft that were collected during the last two years.\n'
 '                </div>\n'
 '                <div class="overlay_arrow">\n'
 '                 <img alt="More" src="/assets/overlay-arrow.png"/>\n'
 '                </div>\n'
 '               </div>\n'
 '               <div class="list_image">\n'
 '                <img alt="" '
 'src="/system/news_items/list_view_images/8595_mars_winds_preview_4-320x240.jpg"/>\n'
 '               </div>\n'
 '               <div class="bo

 '              <div class="rollover_description">\n'
 '               <div class="rollover_description_inner">\n'
 "                The team also fueled the rover's sky crane to get ready for "
 "this summer's history-making launch.\n"
 '               </div>\n'
 '               <div class="overlay_arrow">\n'
 '                <img alt="More" src="/assets/overlay-arrow.png"/>\n'
 '               </div>\n'
 '              </div>\n'
 '              <img alt="Mars Helicopter Attached to NASA\'s Perseverance '
 'Rover" class="img-lazy" '
 'src="/system/news_items/list_view_images/8645_PIA23824-RoverWithHelicopter-32x24.jpg?1586563082803" '
 'style="opacity: 1;"/>\n'
 '             </a>\n'
 '            </div>\n'
 '            <div class="content_title">\n'
 '             <a '
 'href="/news/8645/mars-helicopter-attached-to-nasas-perseverance-rover/">\n'
 "              Mars Helicopter Attached to NASA's Perseverance Rover\n"
 '             </a>\n'
 '            </div>\n'
 '           </div

In [5]:
title = soup_title.find('div', class_='content_title').find('a').text

paragraph = soup_paragraph.find('div', class_='article_teaser_body').text

print(title)
print(paragraph)
print(f"This was done on {datetime.now().date()}")


Mars Helicopter Attached to NASA's Perseverance Rover

The team also fueled the rover's sky crane to get ready for this summer's history-making launch.
This was done on 2020-04-10
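The blank lines around the printed title come from whitespace inside the tags. The following is a minimal sketch of one way to trim it and to read the title and teaser from the same article container. It assumes the `list_text` / `content_title` / `article_teaser_body` layout visible in the rendered dump above; `SAMPLE_HTML` and `latest_news` are hypothetical names used only for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed sample modeled on the rendered markup printed above (assumption:
# each article's title and teaser live in the same "list_text" container).
SAMPLE_HTML = """
<div class="list_text">
  <div class="list_date">April  3, 2020</div>
  <div class="content_title">
    <a href="/news/8641/nasas-perseverance-mars-rover-gets-its-wheels-and-air-brakes/" target="_self">
      NASA's Perseverance Mars Rover Gets Its Wheels and Air Brakes
    </a>
  </div>
  <div class="article_teaser_body">
    After the rover was shipped from JPL to Kennedy Space Center, the team is getting closer to finalizing the spacecraft for launch later this summer.
  </div>
</div>
"""


def latest_news(html):
    """Return (title, teaser) for the first article, or (None, None)."""
    soup = BeautifulSoup(html, "html.parser")
    item = soup.find("div", class_="list_text")
    if item is None:
        return None, None
    title_tag = item.find("div", class_="content_title")
    teaser_tag = item.find("div", class_="article_teaser_body")
    # get_text(strip=True) drops the surrounding newlines and indentation
    title = title_tag.get_text(strip=True) if title_tag else None
    teaser = teaser_tag.get_text(strip=True) if teaser_tag else None
    return title, teaser


title, teaser = latest_news(SAMPLE_HTML)
print(title)
print(teaser)
```

In the notebook itself, `latest_news(browser.html)` would be the equivalent call. Reading both values from one soup avoids pairing a title from the static page with a teaser from the rendered page.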


### JPL Mars Space Images - Featured Image

In [69]:
from splinter import Browser
from bs4 import BeautifulSoup as bs
executable_path = {'executable_path': 'chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

In [70]:
url = 'https://www.jpl.nasa.gov/spaceimages/?search=&category=Mars'
browser.visit(url)
html = browser.html
soup = bs(html, 'html.parser')
print(soup.prettify())

<!DOCTYPE html>
<!--[if IE 9]> <html class="no-js ie ie9" lang="en"> <![endif]-->
<!--[if IE 8]> <html class="no-js ie ie8" lang="en"> <![endif]-->
<html class="js flexbox canvas canvastext webgl no-touch geolocation postmessage websqldatabase indexeddb hashchange history draganddrop websockets rgba hsla multiplebgs backgroundsize borderimage borderradius boxshadow textshadow opacity cssanimations csscolumns cssgradients cssreflections csstransforms csstransforms3d csstransitions fontface generatedcontent video audio localstorage sessionstorage webworkers applicationcache svg inlinesvg smil svgclippaths -webkit-" style="" xmlns="http://www.w3.org/1999/xhtml">
 <!-- START HEADER: "DEFAULT" -->
 <!-- Google Tag Manager -->
 <head>
  <script async="" src="https://www.google-analytics.com/analytics.js" type="text/javascript">
  </script>
  <script src="https://m.addthis.com/live/red_lojson/300lo.json?si=5e9128fc1de2b61e&amp;bkl=0&amp;bl=1&amp;pdt=1380&amp;sid=5e9128fc1de2b61e&amp;pub=&amp;

In [75]:
article = soup.find('article', class_='carousel_item')
print(article)

<article alt="All Eyes on Oldest Recorded Supernova" class="carousel_item" style="background-image: url('/spaceimages/images/wallpaper/PIA14872-1920x1200.jpg');">
<div class="default floating_text_area ms-layer">
<h2 class="category_title">
</h2>
<h2 class="brand_title">
				  FEATURED IMAGE
				</h2>
<h1 class="media_feature_title">
				  All Eyes on Oldest Recorded Supernova				</h1>
<div class="description">
</div>
<footer>
<a class="button fancybox" data-description="This image combines data from four different space telescopes to create a multi-wavelength view of all that remains of the oldest documented example of a supernova, called RCW 86." data-fancybox-group="images" data-fancybox-href="/spaceimages/images/mediumsize/PIA14872_ip.jpg" data-link="/spaceimages/details.php?id=PIA14872" data-title="All Eyes on Oldest Recorded Supernova" id="full_image">
					FULL IMAGE
				  </a>
</footer>
</div>
<div class="gradient_container_top"></div>
<div class="gradient_container_bottom"></d

In [82]:
# Pull the relative image path out of the inline style: url('/path/to/image.jpg')
image_extension = article['style'].split("('", 1)[1].split("')")[0]
print(image_extension)

/spaceimages/images/wallpaper/PIA14872-1920x1200.jpg


In [83]:
# Build the absolute URL from the site root used in the visit above
image_url = f'https://www.jpl.nasa.gov{image_extension}'
print(image_url)

https://www.jpl.nasa.gov/spaceimages/images/wallpaper/PIA14872-1920x1200.jpg
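The string splitting above depends on the exact quoting inside the `style` attribute. As a hedged alternative, the sketch below pulls the `url(...)` target out with a regular expression and joins it to the site root with `urljoin`. `BASE_URL` and `background_image_url` are hypothetical names, and the site root is assumed from the URL visited earlier in this section.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://www.jpl.nasa.gov"  # assumed site root, taken from the visited URL


def background_image_url(style, base_url=BASE_URL):
    """Extract the url(...) target from an inline style and make it absolute."""
    # Accepts url('x'), url("x"), and url(x)
    match = re.search(r"url\(\s*['\"]?(.*?)['\"]?\s*\)", style)
    if match is None:
        return None
    return urljoin(base_url, match.group(1))


style = "background-image: url('/spaceimages/images/wallpaper/PIA14872-1920x1200.jpg');"
print(background_image_url(style))
```

In the notebook, `background_image_url(article['style'])` would take the place of the two splitting steps.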


### Mars Weather

In [102]:
from splinter import Browser
from bs4 import BeautifulSoup as bs
executable_path = {'executable_path': 'chromedriver.exe'}
browser = Browser('chrome', **executable_path, headless=False)

In [124]:
url = 'https://twitter.com/marswxreport'
browser.visit(url)
html = browser.html

In [125]:
soup = bs(html, 'html5lib')
print(soup.prettify())

<!DOCTYPE html>
<html dir="ltr" lang="en" xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <meta charset="utf-8"/>
  <meta content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=0,viewport-fit=cover" name="viewport"/>
  <link href="//abs.twimg.com" rel="preconnect"/>
  <link href="//api.twitter.com" rel="preconnect"/>
  <link href="//pbs.twimg.com" rel="preconnect"/>
  <link href="//t.co" rel="preconnect"/>
  <link href="//video.twimg.com" rel="preconnect"/>
  <link href="//abs.twimg.com" rel="dns-prefetch"/>
  <link href="//api.twitter.com" rel="dns-prefetch"/>
  <link href="//pbs.twimg.com" rel="dns-prefetch"/>
  <link href="//t.co" rel="dns-prefetch"/>
  <link href="//video.twimg.com" rel="dns-prefetch"/>
  <link as="script" crossorigin="anonymous" href="https://abs.twimg.com/responsive-web/web/polyfills.c34a8344.js" nonce="" rel="preload"/>
  <link as="script" crossorigin="anonymous" href="https://abs.twimg.com/responsive-web/web/vendors~main.e0482f54.js" nonce=""

In [130]:
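# NOTE: class_=None matches the first <div> with no class attribute,
# which is not necessarily a tweet (see the output below)
weather_latest = soup.find('div', class_=None).text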
print(weather_latest)

Something went wrong, but don’t fret — let’s give it another shot.
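The message printed above looks like the page's fallback text rather than a tweet, which suggests that the first class-less `div` is not where the weather report lives, or that the page had not finished rendering when `browser.html` was read. The sketch below shows one possible approach: scan `span` elements for text that starts with a known prefix. The prefix `"InSight"` and the sample markup are assumptions for illustration only, not real Twitter output, and `first_text_starting_with` is a hypothetical helper.

```python
from bs4 import BeautifulSoup


def first_text_starting_with(html, prefix, tag="span"):
    """Return the text of the first <tag> whose text starts with prefix, else None."""
    soup = BeautifulSoup(html, "html.parser")
    for node in soup.find_all(tag):
        text = node.get_text(" ", strip=True)
        if text.startswith(prefix):
            return text
    return None


# Placeholder markup for illustration; NOT real Twitter output.
SAMPLE_HTML = """
<div>Something went wrong, but don't fret.</div>
<div class="tweet"><span>InSight sol 000 (placeholder weather text)</span></div>
"""

print(first_text_starting_with(SAMPLE_HTML, "InSight"))
```

In the notebook, the function would be called on `browser.html` once the page has rendered. Waiting first may be necessary, for example with `time.sleep` or splinter's `browser.is_element_present_by_css(selector, wait_time=...)`; which selector to wait on is left open here, because the rendered tweet markup is not shown in the output above.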


### Mars Facts

### Mars Hemispheres
