
Mars Scrape - Python Web Scraping Project

Check out the web app for this project!

https://charlesphil-mars-scrape.herokuapp.com/

About this project

The purpose of this site is to demonstrate the scraping, loading, and storage of many types of content from websites related to Mars and Mars exploration. To access elements in the HTML, I used the popular BeautifulSoup Python library. To automate clicking through pages to reach high-quality images, I used the Splinter Python library to interact with elements on those pages.
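The general pattern, sketched below with an illustrative URL, is to let Splinter drive a real browser and hand the rendered HTML to BeautifulSoup for parsing:

```python
from splinter import Browser
from bs4 import BeautifulSoup

# Splinter drives a real browser; BeautifulSoup parses whatever the page renders.
browser = Browser("chrome")                         # requires chromedriver on your PATH
browser.visit("https://mars.nasa.gov/news/")        # illustrative URL
soup = BeautifulSoup(browser.html, "html.parser")
# ... locate elements with soup.find() / soup.select() here ...
browser.quit()
```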

Latest News

The "Latest News" card scrapes the NASA news page to obtain a high resolution image link of the article as well as getting the headline and blurb.

Image of the Week

The "Image of the Week" comes from NASA's home page, which gets the link from the image element along with the name of the image.

Mars Facts

The "Mars Facts" card scrapes the table element on the NASA Mars Facts page and is processed using the Pandas Python library, which then gets exported as a string of HTML code.

Mars Hemispheres

Lastly, the "Mars Hemisphere" card gets the high resolution images and names from the United States Geological Survey Astropedia page on Mars.


For storing the data, I opted to use MongoDB, a NoSQL database, as this project mainly reads stored data that will not change very frequently. I do not have much need to write large amounts of data, and instead I am solely focused on easy content management of a few documents.
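A minimal sketch of the storage step with PyMongo, assuming the scrape results are collected into a single dictionary (the database and collection names are placeholders):

```python
from pymongo import MongoClient

scraped_data = {"headline": "...", "facts_html": "...", "hemispheres": []}   # result of the scrape

client = MongoClient("mongodb://localhost:27017")
collection = client["mars_db"]["mars_data"]          # placeholder database / collection names
# Upsert a single document: the app mostly reads, so one document is simply
# overwritten on each new scrape.
collection.update_one({}, {"$set": scraped_data}, upsert=True)
```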

Local Installation

Setting up the environment

Environment File

This project uses Anaconda environments to manage dependencies. To install the dependencies required for running the Flask app and the Jupyter notebook, first clone the repository, then open a console in the project directory (with Anaconda installed) and run conda env create -f environment.yml to set up the Conda environment.


Conda Activate

Once the environment is created, activate it with conda activate mars_scrape; you will then be able to run the Python app inside Missions_to_Mars.


Your console will look different depending on your setup.

Start the Mongo database

This project requires MongoDB, a NoSQL database. If MongoDB is not installed on your device, please refer to https://www.mongodb.com/try/download/community for installation.

Please follow these instructions to install and start the service on your platform:

Windows

macOS

Linux (Red Hat, Ubuntu, Debian, SUSE, Amazon)

Running the Flask App

Once the environment is set up, navigate your console to Missions_to_Mars/ and run the command python app.py.


Open either Google Chrome or Mozilla Firefox to the localhost address listed in your console (most commonly http://127.0.0.1:5000/). This project requires either Chrome or Firefox to be installed on your system, as the web automation library uses the Chrome or Gecko web driver to run the scrape.
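For orientation, the Flask side might look roughly like the sketch below; the route, template name, and database names are assumptions rather than the app's exact code:

```python
from flask import Flask, render_template
from pymongo import MongoClient

app = Flask(__name__)
collection = MongoClient("mongodb://localhost:27017")["mars_db"]["mars_data"]

@app.route("/")
def index():
    mars = collection.find_one()                      # the single document written by the scrape
    return render_template("index.html", mars=mars)   # assumed template name

if __name__ == "__main__":
    app.run(debug=True)                               # serves on http://127.0.0.1:5000/ by default
```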
