A Python scraper for collecting historical space launch data from NextSpaceFlight.com and storing it in Google Cloud Storage. Designed to complement the Space-App project by providing a robust data backbone.

Tanguy9862/NextSpaceFlight-Scraper

NextSpaceFlight-Scrapper

Overview

This Python package is designed to scrape historical space launch data from NextSpaceFlight.com and store it in Google Cloud Storage. It complements the Space-App project by providing the data backbone for various visualizations and analyses.

Features

  • Source: Scrapes comprehensive historical data from NextSpaceFlight.com.
  • Historical Data: Gathers detailed information on past space launches.
  • Data Transformation: Transforms the scraped data into CSV format for easy consumption.
  • Google Cloud Storage: Automatically uploads the scraped data to Google Cloud Storage.
  • Data Update: Checks for existing data in Google Cloud Storage and appends new data.
  • Error Handling: Robust error handling to ensure data integrity.
  • Logging: Detailed logging for debugging and monitoring.
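
The transformation and update features above can be pictured with a small sketch. It is not the package's actual implementation: the column names in FIELDS and the launch_id key used to detect rows already stored are assumptions made for illustration, and the standard-library csv module stands in for Pandas so the example runs without extra dependencies. The existing CSV text would come from Google Cloud Storage in practice; here it is passed in as a plain string.

```python
import csv
import io

# Hypothetical column names; the real scraper's schema is not documented here.
FIELDS = ["launch_id", "date", "rocket", "status"]


def rows_to_csv(rows):
    """Serialize a list of dict rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def append_new_rows(existing_csv, scraped_rows):
    """Return CSV text holding the existing rows plus any unseen scraped rows."""
    if existing_csv:
        existing_rows = list(csv.DictReader(io.StringIO(existing_csv)))
    else:
        existing_rows = []  # first run: nothing stored yet
    seen = {row["launch_id"] for row in existing_rows}
    for row in scraped_rows:
        if row["launch_id"] not in seen:  # skip launches already stored
            existing_rows.append(row)
            seen.add(row["launch_id"])
    return rows_to_csv(existing_rows)
```

Keying on a single identifier column keeps repeated runs idempotent: scraping the same launch twice does not duplicate its row.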

Installation

To install this package, run:

pip install git+https://github.com/Tanguy9862/NextSpaceFlight-Scrapper.git

Usage

After installation, you can import the package and use the scrape_past_launches_data() function to scrape and update the data.

from next_spaceflight_scrapper import scraper

# Scrape and update historical launch data
scraper.scrape_past_launches_data()
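
For scheduled or unattended runs, it may help to wrap the call so that failures are logged instead of stopping the surrounding script. The following is a minimal sketch under that assumption; run_scrape and the logger name are hypothetical helpers, not part of the package.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("launch_scrape")  # hypothetical logger name


def run_scrape(job):
    """Run a scraping callable, log the outcome, and report success as a bool."""
    try:
        job()
    except Exception:
        # logger.exception records the traceback alongside the message
        logger.exception("Scrape failed")
        return False
    logger.info("Scrape finished")
    return True
```

With the package installed, the wrapper would be called as run_scrape(scraper.scrape_past_launches_data), passing the function itself and not its result.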

Dependencies

  • Python 3.x
  • BeautifulSoup
  • Requests
  • Pandas
  • Google Cloud Storage
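
A quick way to confirm that these dependencies are importable in the current environment is sketched below. The import names are an assumption based on the usual PyPI distributions (beautifulsoup4 provides bs4, google-cloud-storage provides google.cloud.storage); missing_modules is a hypothetical helper.

```python
import importlib.util

# Import names assumed for the dependencies listed above.
REQUIRED_MODULES = ["bs4", "requests", "pandas", "google.cloud.storage"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # ImportError is raised when a parent package such as "google" is absent
            found = False
        if not found:
            missing.append(name)
    return missing
```

An empty list from missing_modules() means every listed module was found.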

Authentication

To access Google Cloud Storage, you'll need a JSON file containing your GCS authentication keys. Place this file in the past_launches_scrapper directory and name it spacexploration-keys.json.
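
As a rough illustration of how such a key file can be used, the sketch below checks that the file is present and then builds a client from it. resolve_key_path and make_storage_client are hypothetical helpers and may not match how the package loads its credentials; Client.from_service_account_json is the google-cloud-storage constructor for service account key files.

```python
from pathlib import Path

KEY_FILENAME = "spacexploration-keys.json"  # file name given above


def resolve_key_path(package_dir):
    """Return the key file path inside package_dir, or raise if it is missing."""
    key_path = Path(package_dir) / KEY_FILENAME
    if not key_path.is_file():
        raise FileNotFoundError(f"GCS key file not found: {key_path}")
    return key_path


def make_storage_client(key_path):
    """Build a Google Cloud Storage client from a service account key file."""
    # Imported lazily so the path check works without google-cloud-storage installed
    from google.cloud import storage

    return storage.Client.from_service_account_json(str(key_path))
```

Failing early with a clear message when the key file is absent makes a misplaced or misnamed file easier to diagnose than an authentication error raised later during upload.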

License

This project is licensed under the MIT License - see the LICENSE file for details.
