Web scraping and development using Jupyter Notebook, Python, HTML, Bootstrap, Beautiful Soup, Splinter, and Mongo DB.

JennyJohnson78/Mission_to_Mars

Mission to Mars

Overview

BeautifulSoup and Splinter can be used on scraping-friendly sites to "scrape" images and data about those images. The scraped data can then be stored in a non-relational database such as Mongo DB. A web application is then used to display the data, and its design is altered to accommodate the images. In this analysis, Mars hemisphere images and titles are scraped so that users can navigate between the images online.

Analysis Steps:

  • Scrape Full-Resolution Mars Hemisphere Images and Titles
  • Update the Web App with Mars Hemisphere Images and Titles
  • Add Bootstrap 3 Components

Results

Scrape Full-Resolution Mars Hemisphere Images and Titles

  • Visit website to view Mars images
# Visit the mars nasa news site
url = 'https://redplanetscience.com/'
browser.visit(url)
  • Use the DevTools to inspect the page for the proper elements to scrape
  • Next, create a list to hold the .jpg image URL string and title for each hemisphere image
# Create a list to hold the images and titles.
hemisphere_image_urls = []
  • Write code to retrieve the full-resolution image URL and title for each hemisphere image
# Write code to retrieve the image urls and titles for each hemisphere.
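for i in range(4):
    # Create an empty dictionary for this hemisphere
    hemispheres = {}
    # Click the i-th hemisphere link to open its detail page
    browser.find_by_css('a.product-item h3')[i].click()
    # Find the "Sample" anchor, which links to the full-resolution image
    # (browser.links.find_by_text replaces the deprecated find_link_by_text)
    element = browser.links.find_by_text('Sample').first
    img_url = element['href']
    title = browser.find_by_css("h2.title").text
    hemispheres["img_url"] = img_url
    hemispheres["title"] = title
    hemisphere_image_urls.append(hemispheres)
    # Return to the index page before the next iteration
    browser.back()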

  • Loop through the four hemisphere links, click each one, find the Sample image anchor tag on the page that opens, and get its href
  • Save the full-resolution image URL string as the value for the img_url key that will be stored in the dictionary

image

  • Save the hemisphere image title as the value for the title key that will be stored in the dictionary
  • Add the dictionary with the image URL string and the hemisphere image title to the list
  • Print the list of dictionary items
# Print the list that holds the dictionary of each image url and title.
hemisphere_image_urls
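
The "find the Sample anchor tag and get the href" step can also be tried offline on a saved page. The following is a minimal sketch, not the notebook's code: it uses Python's standard-library html.parser instead of Splinter or Beautiful Soup so that it runs without extra installs. The sample HTML, the example.com base URL, and the names HemisphereParser and parse_hemisphere are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HemisphereParser(HTMLParser):
    """Collect the <h2 class="title"> text and the href of the 'Sample' link."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.img_href = None
        self._in_title = False
        self._pending_href = None  # href of the <a> currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h2" and "title" in (attrs.get("class") or "").split():
            self._in_title = True
        elif tag == "a":
            self._pending_href = attrs.get("href")

    def handle_data(self, data):
        text = data.strip()
        if self._in_title and text:
            self.title = text
        elif self._pending_href and text == "Sample":
            self.img_href = self._pending_href

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False
        elif tag == "a":
            self._pending_href = None


def parse_hemisphere(html, base_url):
    """Return one {'img_url', 'title'} dictionary for a hemisphere detail page."""
    parser = HemisphereParser()
    parser.feed(html)
    # Resolve a relative href against the page URL; keep None if no link was found.
    img_url = urljoin(base_url, parser.img_href) if parser.img_href else None
    return {"img_url": img_url, "title": parser.title}


# Hypothetical detail-page fragment, for illustration only.
sample_html = """
<div class="downloads"><ul>
  <li><a href="images/full.jpg" target="_blank">Sample</a></li>
</ul></div>
<h2 class="title">Example Hemisphere Enhanced</h2>
"""
result = parse_hemisphere(sample_html, "https://example.com/")
print(result)
```

Each dictionary has the same two keys as the entries appended to hemisphere_image_urls in the loop above, so results from this sketch could be collected into the same kind of list.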

Update the Web App with Mars Hemisphere Images and Titles

  • Create a new dictionary in the data dictionary to hold a list of dictionaries with the URL string and title of each hemisphere image
  • Create a function that will scrape the hemisphere data by using previous code
  • Run the app.py file, then check the Mongo database to make sure that it's retrieving all of the data
  • Modify the index.html file to access the database
  • Run the app.py file, open the index.html file, and click the "Scrape New Data" button
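
The first steps above might look roughly like the sketch below. The repository's app.py and scraping module are not shown here, so the key names "hemispheres" and "last_modified", the function names, and the FakeCollection stand-in are all assumptions. The only real API relied on is the shape of pymongo's Collection.update_one(filter, update, upsert=True) call.

```python
import datetime as dt


def build_mars_data(hemisphere_image_urls):
    """Return the document to store; only the hemisphere entry is sketched."""
    return {
        # Assumed key name: a list of {"img_url": ..., "title": ...} dictionaries.
        "hemispheres": list(hemisphere_image_urls),
        "last_modified": dt.datetime.now(),
    }


def save_mars_data(collection, data):
    """Upsert the single scraped document into a pymongo-style collection."""
    collection.update_one({}, {"$set": data}, upsert=True)


class FakeCollection:
    """In-memory stand-in so the sketch runs without a MongoDB server."""

    def __init__(self):
        self.doc = None

    def update_one(self, filter, update, upsert=False):
        if self.doc is None and upsert:
            self.doc = {}
        if self.doc is not None:
            self.doc.update(update["$set"])

    def find_one(self):
        return self.doc


# Hypothetical scraped entry, for illustration only.
scraped = [{"img_url": "https://example.com/full.jpg", "title": "Example Hemisphere"}]

collection = FakeCollection()
save_mars_data(collection, build_mars_data(scraped))
stored = collection.find_one()
print(stored["hemispheres"])
```

With a real database, collection would be a pymongo collection object rather than FakeCollection, and checking it after a scrape is one way to confirm that the hemisphere list was stored alongside the other data.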

Results:

image

Add Bootstrap 3 Components

  • Use the Bootstrap 3 grid system to update the index.html file so the website is mobile-responsive
  • Customize the facts table
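
To make the grid step concrete, here is a hedged sketch of the kind of markup a mobile-responsive hemisphere gallery could use. The actual index.html is not reproduced in this README, so the layout is an assumption, and the Python function render_hemisphere_grid is only a stand-in for whatever templating the app uses. The class names (row, col-xs-12, col-sm-6, col-md-3, thumbnail, caption, img-responsive) are standard Bootstrap 3 classes.

```python
from html import escape


def render_hemisphere_grid(hemispheres):
    """Render hemisphere dictionaries as one Bootstrap 3 row of thumbnails."""
    columns = []
    for item in hemispheres:
        url = escape(item["img_url"], quote=True)
        title = escape(item["title"], quote=True)
        columns.append(
            # Full width on phones, two per row on tablets, four per row on desktops.
            '<div class="col-xs-12 col-sm-6 col-md-3">'
            '<div class="thumbnail">'
            f'<img class="img-responsive" src="{url}" alt="{title}">'
            f'<div class="caption"><h3>{title}</h3></div>'
            "</div></div>"
        )
    return '<div class="row">' + "".join(columns) + "</div>"


# Hypothetical entries, for illustration only.
grid_html = render_hemisphere_grid([
    {"img_url": "https://example.com/a.jpg", "title": "Example A"},
    {"img_url": "https://example.com/b.jpg", "title": "Example B"},
])
print(grid_html)
```

Combining the xs, sm, and md column classes is what lets the same markup reflow across screen sizes without separate mobile and desktop pages.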

Facts Table:

image

Summary

Visualizations are an integral part of data analytics and data science. While more and more companies and institutions are adopting Tableau software, there is still a need for web development within data analytics. This analysis shows how useful and powerful web scraping with Python, paired with a database-backed web app, can be when it comes to data exploration and visualization.
