GitHub - Marianna-Karavangeli/Earthquake_scraper

Data pipelines project-Earthquake scraper

The Earthquake scraper is a web scraper built in Python. It collects data on earthquakes registered by the United States Geological Survey (USGS). Each collected record contains the following fields:

Magnitude
Place
Datetime
Depth

The dataset is intended as possible input to a machine learning (ML) algorithm that predicts features of earthquakes.
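As a rough illustration of the record format listed above, the four fields could be modelled as a small typed structure. This is only a sketch: the class name `EarthquakeRecord`, the helper `to_row`, the Python types, and the depth unit are assumptions made for illustration and are not taken from the repository code. The example values are placeholders, not real measurements.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EarthquakeRecord:
    """One collected row (hypothetical type, not part of the repository)."""

    magnitude: float      # assumed numeric
    place: str            # free-text location description
    event_time: datetime  # the "Datetime" field
    depth: float          # assumed to be in kilometres


def to_row(record: EarthquakeRecord) -> dict:
    """Map a record onto the four column names used in this README."""
    return {
        "Magnitude": record.magnitude,
        "Place": record.place,
        "Datetime": record.event_time.strftime("%Y-%m-%d %H:%M:%S"),
        "Depth": record.depth,
    }


# Placeholder values for illustration only.
example = EarthquakeRecord(
    magnitude=4.6,
    place="placeholder location",
    event_time=datetime(2021, 1, 1, 12, 0, 0),
    depth=10.0,
)
row = to_row(example)
```

A list of such rows maps directly onto a pandas DataFrame with one column per field.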

Necessary libraries and packages

To use this scraper, the following libraries are required (os and time are part of the Python standard library; selenium and pandas must be installed separately):

selenium
os
time
pandas

To connect to an Amazon S3 bucket, boto3 is also needed. Selenium additionally requires a browser driver: chromedriver for Google Chrome or geckodriver for Mozilla Firefox.

Repository content

This repo contains the code for the earthquake scraper in a single Python file (Earthquake scraper.py). The script covers the process from beginning to end: from visiting the main page through to saving and exporting the dataset. It also includes a block of code that uploads the collected data to an Amazon S3 bucket (link to the AWS S3 bucket: https://earthquakescraper.s3.amazonaws.com/df.csv). There will also be a .csv file with the initial collection of data.
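The upload step mentioned above might look roughly like the following. This is a minimal sketch and not the code from Earthquake scraper.py: the function name `upload_csv` and its parameters are hypothetical, and the bucket name and object key are inferred from the link above. It uses boto3's `upload_file` client method and assumes AWS credentials are already configured in the environment.

```python
def upload_csv(local_path, bucket, key, s3_client=None):
    """Upload a local CSV file to S3 and return its virtual-hosted-style URL.

    Hypothetical helper. `s3_client` can be injected (e.g. a stub for
    testing); otherwise a default boto3 S3 client is created.
    """
    if s3_client is None:
        # Imported lazily so the helper can be exercised without boto3
        # when a client object is supplied by the caller.
        import boto3

        s3_client = boto3.client("s3")

    # upload_file(Filename, Bucket, Key) copies the local file into the bucket.
    s3_client.upload_file(local_path, bucket, key)

    # The URL resolves for anonymous readers only if the bucket or object
    # permissions allow public access.
    return f"https://{bucket}.s3.amazonaws.com/{key}"
```

With real credentials, a call such as `upload_csv("df.csv", "earthquakescraper", "df.csv")` would produce a URL in the same form as the link above.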

Step-by-step data collection using the Earthquake scraper

Visit the webpage:

Set the desired start date and time (YYYY-MM-DD HH:MM:SS):

Get and save the results:
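Two of the steps above can be sketched without a browser: checking that the start date and time match the YYYY-MM-DD HH:MM:SS format, and saving the collected rows to a .csv file with pandas. The function names `parse_start` and `save_results` are hypothetical, and the file name df.csv is borrowed from the S3 link in this README. The Selenium part (visiting the page and filling in the form) is left out because the page's element locators are not described here.

```python
from datetime import datetime

import pandas as pd

START_FORMAT = "%Y-%m-%d %H:%M:%S"  # YYYY-MM-DD HH:MM:SS
COLUMNS = ["Magnitude", "Place", "Datetime", "Depth"]


def parse_start(text):
    """Validate a start date-time string; raises ValueError if malformed."""
    return datetime.strptime(text.strip(), START_FORMAT)


def save_results(rows, csv_path):
    """Build a DataFrame from scraped rows (dicts) and write it to CSV."""
    df = pd.DataFrame(rows, columns=COLUMNS)
    df.to_csv(csv_path, index=False)  # index=False keeps only the four fields
    return df
```

A possible usage, with placeholder values rather than real measurements:

```python
start = parse_start("2021-01-01 00:00:00")
rows = [{"Magnitude": 4.6, "Place": "placeholder location",
         "Datetime": "2021-01-01 12:00:00", "Depth": 10.0}]
save_results(rows, "df.csv")
```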

