CE observatory data processing

WORK IN PROGRESS 🚀

Purpose

A collection of scripts to:

extract raw data from public official and emerging sources (including via API, web scraping and programmatic download requests), starting from those identified through a dataset review;
transform these data through steps including: cleaning and reformatting; grouping by classifications and summarising; data validation, interpolation and extrapolation; and calculating key variables/metrics; and
export cleaned data outputs to a PostgreSQL database (Supabase) for storage.
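The transform step above can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not taken from the repository's scripts: the record fields (`group`, `year`, `value`), the function names and the sample figures are all hypothetical. It sums values by classification and year, then linearly interpolates a missing year between two known ones.

```python
from collections import defaultdict


def summarise_by_group(records):
    """Sum `value` by (group, year), skipping rows whose value is missing."""
    totals = defaultdict(float)
    for row in records:
        if row["value"] is not None:
            totals[(row["group"], row["year"])] += row["value"]
    return dict(totals)


def interpolate_gaps(series):
    """Linearly interpolate None values lying between two known points.

    `series` maps year -> value (or None). Leading and trailing gaps are
    left as None, since extrapolation is a separate modelling decision.
    """
    years = sorted(series)
    known = [y for y in years if series[y] is not None]
    filled = dict(series)
    for lo, hi in zip(known, known[1:]):
        for y in years:
            if lo < y < hi:
                frac = (y - lo) / (hi - lo)
                filled[y] = series[lo] + frac * (series[hi] - series[lo])
    return filled


# Hypothetical input records (not real observatory data).
raw = [
    {"group": "laptops", "year": 2019, "value": 10.0},
    {"group": "laptops", "year": 2019, "value": 5.0},
    {"group": "laptops", "year": 2021, "value": 25.0},
]

totals = summarise_by_group(raw)
series = {
    2019: totals[("laptops", 2019)],
    2020: None,  # no observation for this year
    2021: totals[("laptops", 2021)],
}
filled = interpolate_gaps(series)
```

Leaving the ends of the series untouched keeps interpolation (filling between observations) separate from extrapolation (projecting beyond them), which usually needs its own assumptions.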

Data outputs from these scripts are used to populate the ce-observatory, a dashboard that, for specific resource-product-industry categories, uses high-quality data to describe in detail the current baseline material and monetary flows and their wider impacts, and provides the means to compare these with alternative circular economy configurations.

How to use

Software requirements and setup

Scripts in this repository are largely written in the programming language R. Please see here for more information on running R scripts and computer software requirements. Files are packaged within an R Project, with relative file paths used to call data inputs and functions. These are most easily navigated and run within the RStudio IDE, though this can also be done in the terminal/command line.

The Python programming language has also been used in the project where it offers better performance or provides functions not otherwise available in R. Python scripts are largely presented within Jupyter Notebooks, an open-source IDE that requires installing the jupyter-notebook package in your Python environment; more information can be found here. In some cases, .py Python scripts are also used. These can be viewed and modified in a code editor such as Visual Studio Code and run in the terminal/command line.

Folder and file descriptions

scripts

Product-group specific scripts

Electronics scripts readme

functions.R

A collection of custom functions regularly used throughout the data processing pipeline and not otherwise provided in R packages.

Updates

The observatory has been designed to incorporate new data as they become available, supporting timely insight, trend assessment, monitoring and evaluation. Data are updated through scheduled extraction scripts, and imported data undergo structure, data type and content validation to reduce the risk of site build failure. Webhooks are then used to trigger a site rebuild following data updates.
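As a rough illustration of the three validation layers mentioned above (structure, data type and content), the following Python sketch checks a batch of rows before import. The schema, column names, year bounds and function name are assumptions made for the example and do not reflect the repository's actual checks.

```python
# Hypothetical schema: column name -> expected Python type.
EXPECTED_COLUMNS = {"year": int, "group": str, "value": float}


def validate_rows(rows, schema=EXPECTED_COLUMNS):
    """Return a list of problems found; an empty list means the batch passes."""
    problems = []
    for i, row in enumerate(rows):
        # Structure: the row must have exactly the expected columns.
        if set(row) != set(schema):
            problems.append(f"row {i}: unexpected columns {sorted(row)}")
            continue
        # Data type: each column must hold the expected type.
        bad = [col for col, typ in schema.items() if not isinstance(row[col], typ)]
        if bad:
            problems.append(f"row {i}: wrong type in {bad}")
            continue
        # Content: simple plausibility checks (bounds are assumed here).
        if row["value"] < 0:
            problems.append(f"row {i}: negative value {row['value']}")
        if not 1900 <= row["year"] <= 2100:
            problems.append(f"row {i}: implausible year {row['year']}")
    return problems


# Hypothetical batches used to exercise the checks.
good_batch = [{"year": 2020, "group": "laptops", "value": 12.5}]
bad_batch = [
    {"year": 2020, "group": "laptops"},                  # missing column
    {"year": "2020", "group": "laptops", "value": 1.0},  # year is a string
    {"year": 2020, "group": "laptops", "value": -3.0},   # negative value
]

good_problems = validate_rows(good_batch)
bad_problems = validate_rows(bad_batch)
```

Collecting every problem rather than stopping at the first makes it easier to decide whether a scheduled update should be rejected outright before any rebuild is triggered.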

Feedback

If you identify any issues, please contact: Oliver Lysaght (oliverlysaght@icloud.com)

