R Package to access and format a variety of data from multiple sources

License: Unknown (LICENSE), MIT (LICENSE.md)
Ecosystem-Assessments/pipedat

pipedat

pipedat is an R package that provides analytical pipelines to access, load, and format a variety of data from multiple sources programmatically. The goal of pipedat is to enhance the capacity of scientists, planners and the wider public to prepare and perform complex and reproducible ecosystem-scale assessments requiring the integration of multiple spatial datasets, such as cumulative effects assessments in the context of ecosystem-based management and Marxan analyses for the establishment of individual marine protected areas (MPAs) and MPA networks. In its current form, pipedat is strictly experimental and in development. We hope, however, to develop this initiative further to greatly enhance the efficiency, transparency and reproducibility of large-scale environmental assessments.

Installation

The easiest way to install pipedat is to use remotes:

install.packages("remotes")
remotes::install_github("Environment-Health/pipedat")

Then, load it:

library(pipedat)
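
For scripts that are re-run often, the installation step can be made conditional so that remotes is only installed when it is missing. The sketch below is one way to do this; `install_pipedat()` is a hypothetical helper name, not part of the package, and the repository slug is simply the one given in the installation command above.

```r
# Hypothetical helper (not part of pipedat): install 'remotes' only if it is
# missing, then install pipedat from GitHub.
# The default slug is the one used in the installation instructions above.
install_pipedat <- function(repo = "Environment-Health/pipedat") {
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github(repo)
}

# install_pipedat()  # uncomment to run; requires internet access
```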

Main features

The pipedat package is built around a function called pipedat() that is used to access, load and format a wide variety of data. This function calls a series of individual scripts built to access data programmatically and reproducibly, which we refer to as data pipelines. Each data pipeline is executed using its unique identifier, which is specific to the pipedat package. The full list of available data pipelines can be viewed with the pipelist() function:

# View list of pipelines 
# pipelist()

# Download and format a single dataset 
# pipedat("a3jsd4jh")

# Download and format multiple datasets
# pipedat(c("a3jsd4jh","a8732975y","soif8yiao"))

The pipedat() function exports the raw and formatted data to the folder ‘project-data/pipedat/’.
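
After running one or more pipelines, it can be useful to check what was written to that folder. The following is a minimal sketch using base R only; `list_pipedat_outputs()` is a hypothetical helper, and it makes no assumption about how pipedat organizes files inside the folder.

```r
# Folder named in the text above; the layout inside it is not documented here.
out_dir <- file.path("project-data", "pipedat")

# Hypothetical helper: list exported files with their sizes, or return an
# empty data frame if no pipeline has been run yet.
list_pipedat_outputs <- function(dir = out_dir) {
  if (!dir.exists(dir)) {
    return(data.frame(file = character(0), size_bytes = numeric(0)))
  }
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  data.frame(file = files, size_bytes = file.size(files))
}

list_pipedat_outputs()
```

The helper returns a data frame with one row per file, which makes it easy to confirm that both raw and formatted outputs are present before using them in an analysis.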

List of pipelines

| Pipeline ID | Name | Description | Source |
| --- | --- | --- | --- |
| 8509eeb1 | Nighttime Lights | A new consistently processed time series of annual global VIIRS nighttime lights has been produced from monthly cloud-free average radiance grids spanning 2012 to 2020. The new methodology is a modification of the original method based on nightly data (Annual VNL V1). Visit https://eogdata.mines.edu/products/vnl/#annual_v2 for more information. | Elvidge Zhizhin et al. (2021) |
| 8449dee0 | AIS global shipping data | Monthly shipping rasters at 0.1 degree resolution including the number of vessels and total hours of vessel presence for all vessels classified by Global Fishing Watch as one of the following: ‘cargo’, ‘specialized_reefer’, ‘tanker’, ‘bunker’, ‘cargo_or_tanker’, ‘cargo_or_reefer’, ‘bunker_or_tanker’, ‘container_reefer’, ‘passenger’. There are two versions of the data available, one based only on actual AIS positions and one where vessel location is interpolated to a regular interval of five minutes. | Watch (2022) |
| c676dc2b | Census cartographic boundary files 2016 | Cartographic boundary files for dissemination areas of the 2016 Canadian census | Canada (2016a); Canada (2017) |
| b9024b04 | Census cartographic boundary files 2021 | Boundary files for dissemination areas of the 2021 Canadian census | Canada (2022c); Canada (2022d) |
| d147406d | Census population 2016 | Population and dwelling counts, for dissemination areas, 2016 Census | Canada (2016b) |
| d96dec16 | Census population 2021 | Population and dwelling counts: Canada, provinces and territories, census subdivisions and dissemination areas | Canada (2022e) |
| e775900b | The GEBCO_2021 Grid | The GEBCO_2021 Grid was published in July 2021 and is a global terrain model for ocean and land, providing elevation data, in meters, on a 15 arc-second interval grid | Group (2021) |
| 7c8c4da1 | Invasive species distribution models | Species distribution models and occurrence data for marine invasive species hotspot identification | Lyons Lowen et al. (2020a); Lyons Lowen et al. (2020b) |
| 70efb2b0 | Native Land Digital | Native Land is an app to help map Indigenous territories, treaties, and languages. The map provided does not represent or intend to represent official or legal boundaries of any Indigenous nations. To learn about definitive boundaries, contact the nations in question. Also, the map is not perfect – it is a work in progress with tons of contributions from the community. Please send fixes to info@native-land.ca if you find errors. | |
| ce594316 | First Nations Location | The First Nations geographic location dataset contains the geographic location of First Nations (groups and subgroups) in Canada as points as well as basic attributes data. | Canada (2022a) |
| 621e9a76 | Inuit Communities Location | The Inuit Communities geographic location dataset contains the geographic location of Inuit Communities in Canada as points, as well as data attributes specific to each community. | Crown-Indigenous Relations and Northern Affairs Canada (2020) |
| e2349037 | Terrestrial human footprint | Change in terrestrial human footprint drives continued loss of intact ecosystems | Venter Sanderson et al. (2016a); Venter Sanderson et al. (2016b); Williams Venter et al. (2020a); Williams Venter et al. (2020b) |
| e328da3a | Community Well-Being Index | The Community Well-Being (CWB) Index is a method of assessing socio-economic well-being in Canadian communities. Various indicators of socio-economic well-being, including education, labour force activity, income and housing, are derived from Statistics Canada’s Census of Population and combined to give each community a well-being ‘score’. These scores are used to compare well-being across First Nations and Inuit communities with well-being in other Canadian communities. | Crown-Indigenous Relations and Northern Affairs Canada (2022) |
| 091d10ec | Mercury concentrations in the Canadian Arctic marine ecosystem | This dataset contains 2005 concentrations of total mercury (THg), gaseous elemental mercury (GEM), methylated mercury, dimethyl mercury (DMHg) in the water column of the Canadian Arctic. | Kirk (2018); Kirk St. Louis et al. (2008) |
| 0bf96a89 | Perfluoroalkyl substances (PFAS) in the Canadian Arctic marine ecosystem | This dataset contains concentrations of perfluoroalkyl substances (PFAS) in seawater sampled in various locations in the Arctic ranging from 2005-2008. | De Silva and Kirk (2018); Benskin Muir et al. (2012) |
| caa1fb75 | Concentrations of organophosphate esters (OPEs) and polybrominated diphenyl ethers (PBDEs) in the North Atlantic Ocean | This dataset contains the ambient dissolved concentrations of organophosphate esters (OPEs) and polybrominated diphenyl ethers (PBDEs) in North Atlantic Ocean (Greenland Sea) as well as a summary of the passive polyethylene samplers (PEs) deployed. | De Silva (2018); McDonough De Silva et al. (2018) |
| d770f210 | Carte écoforestière originale et résultats d’inventaire | The original ecoforest map and inventory results are a collection of ecoforest data comprising the original ecoforest map and many other tables providing information directly related to forest stands. The information in this dataset reflects the state of the forest up to the year of the aerial photograph. | Ministère de la Forêt de la Faune et des Parcs (2022) |
| b5433840 | Geolocated placenames in Canada | The collection of geolocated placenames in Canada represents a consistent and comprehensive distribution of named places across Canada. Named places include large and small cities, villages, First Nations Communities, Small Hamlets etc. | Innovation (2020) |
| 004b3c51 | Canadian Exclusive Economic Zone | Canadian Exclusive Economic Zone | Institute (2019) |
| a56e753b | Timeline of COVID-19 in Canada | The Timeline of COVID-19 in Canada (CovidTimelineCanada) is intended to be the definitive source for data regarding the COVID-19 pandemic in Canada. In addition to making available the ready-to-use datasets, this repository also acts as a hub for collaboration on expanding and improving the availability and quality of COVID-19 data in Canada. This repository is maintained by the COVID-19 Canada Open Data Working Group and is one component of the What Happened? COVID-19 in Canada project. | Berry O’Neill et al. (2021) |
| 8b0bbc44 | Open Database of Healthcare Facilities | The Open Database of Healthcare Facilities (ODHF) contains the names, addresses and geo-coordinates of healthcare facilities across Canada. Facilities are classified by type. The current version (version 1.1) contains approximately 7,000 records compiled from open data sources, publicly available data, and data directly provided by sources for inclusion as open data. | Canada (2020a); Canada (2020b) |
| c71da4d7 | Health Regions: Boundaries and Correspondence with Census Geography | The health region boundaries provided in this product are based on 2016 Census geographic units. The smallest geographic unit available has been used as the building block to define health regions. | Canada (2020c) |
| d2f44fdf | National Pollutant Release Inventory | The National Pollutant Release Inventory (NPRI) is Canada’s public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains data from 1993 to the latest reporting year. These CSV format datasets are in normalized or ‘list’ format and are optimized for pivot table analyses. | Environment and Climate Change Canada (2022) |
| ee7295d7 | Proximity measures database | Statistics Canada (StatCan) and Canada Mortgage and Housing Corporation (CMHC) have collaborated on the implementation of a set of proximity measures to services and amenities. CMHC funded this collaboration to generate data and analytical work in support of the National Housing Strategy. | Canada (2020d); Canada (2020e) |
| 852db1a3 | Census 2021 housing suitability | Housing suitability by tenure: Canada, provinces and territories, census divisions and census subdivisions | Canada (2022f) |
| b48b01d6 | Census 2021 dwelling condition | Dwelling condition by tenure: Canada, provinces and territories, census divisions and census subdivisions | Canada (2022g) |
| f4abec86 | Census 2021 acceptable housing | Acceptable housing by tenure: Canada, provinces and territories, census divisions and census subdivisions | Canada (2022h) |
| 5e4be996 | Census cartographic subdivision boundary files 2021 | Boundary files for subdivision areas of the 2021 Canadian census | Canada (2022i); Canada (2022c) |
| 929a1773 | Census water treatment | Treatment of main source of water by households, Canada, provinces and census metropolitan areas (CMA) | Canada (2021a) |
| 000fd656 | Census cartographic subdivision boundary files 2016 | Boundary files for subdivision areas of the 2016 Canadian census | Canada (2017); Canada (2016c) |
| 175ec912 | Assessment of acceptable housing in Canada | Acceptable housing across Canada based on Statistics Canada’s census and statistics on acceptable housing, dwelling conditions, and housing suitability | |
| 288ca300 | Census cartographic division boundary files 2021 | Boundary files for division areas of the 2021 Canadian census | Canada (2022i); Canada (2022c) |
| 7daa23ee | Census 2021 road network file | The 2021 Census Road Network File includes the unique identifier, DGUID, name and type for each side of a street arc (where applicable) for provinces and territories, and census subdivisions. In the 2021 Census Road Network File, streets are ranked according to five levels of detail, suitable for mapping at small to medium scales. | Canada (2021b); Canada (2021c) |
| 37563350 | Census Profile, 2021 Census of Population | A detailed statistical portrait of Canada and its people by their demographic, social and economic characteristics. | Canada (2021b) |
| 8671c3e4 | Canadian social vulnerabilities using the 2021 Census of Population | Assessment of proxies of social vulnerabilities across Canada using data from the 2021 Census of Population. | |
| c08e9141 | Global cumulative human impacts assessments in 2008 and 2013 | Data from the assessments of global marine cumulative human impacts in 2008 and 2013 | Halpern Walbridge et al. (2008); Halpern Frazier et al. (2015a); Halpern Frazier et al. (2015b) |
| f616de19 | Terrestrial human footprint | Data from: Global terrestrial Human Footprint maps for 1993 and 2009 | Venter Sanderson et al. (2016a); Venter Sanderson et al. (2016b) |
| 6eefac0b | Aboriginal Lands of Canada Legislative Boundaries | The Aboriginal Lands of Canada Legislative Boundaries web service includes legislative boundaries of Indian Reserves, Land Claim Settlement Lands (lands created under Comprehensive Land Claims Process that do not or will not have Indian Reserve status under the Indian Act) and Indian Lands. | Canada (2022b) |
| ce5d1455 | Inuit Regions (Inuit Nunangat) | The Inuit Regions, also known as the Inuit Nunangat, dataset contains the geographical boundaries of the 4 Inuit Regions in Canada: Inuvialuit, Nunavut, Nunavik and Nunatsiavut. | Crown-Indigenous Relations and Northern Affairs Canada (2019b) |
| 7fe284e4 | Native Land Digital | Native Land is an app to help map Indigenous territories, treaties, and languages. The map provided does not represent or intend to represent official or legal boundaries of any Indigenous nations. To learn about definitive boundaries, contact the nations in question. Also, the map is not perfect – it is a work in progress with tons of contributions from the community. Please send fixes to info@native-land.ca if you find errors. | Digital (2022) |
| 758c10a3 | Tribal Councils Location | The tribal council geographic location dataset contains the geographic location of all tribal councils in Canada as points as well as basic attributes data. Each tribal council point represents its address as it is registered in Indigenous and Northern Affairs Canada (INAC) Indian Government Support System (IGSS). | Crown-Indigenous Relations and Northern Affairs Canada (2019a) |
| 92230392 | Geographical Names of Canada Data | These files contain names recognized by the Geographical Names Board of Canada. The records include the names of populated places and administrative areas, water features such as lakes, rivers and bays, terrain features like mountains, capes and valleys; and undersea features such as seamounts and trenches. The data are provided by federal, provincial and territorial naming authorities, and managed by the Geographical Names Board of Canada Secretariat at Natural Resources Canada. | Canada (2023) |
| 98916b4a | Vessel Density Mapping Series in the Northwest Atlantic | | Veinot Nicoll et al. (2023) |

How to contribute

External contributors are welcome to contribute data pipelines to this package. Simply fork the public repo and create your own data pipeline. The pipenew() function creates a dp_#####.R template for you to use to create a new data pipeline with a unique id. Create a pull request for us to review the data pipeline for inclusion in the package.

A single pull request per pipeline should be created, and merged pull requests should be squashed into a single commit.