Access data from the City of Toronto Open Data Portal in R.

opendatatoronto

Travis build status AppVeyor build status Codecov test coverage CRAN status Lifecycle: experimental

opendatatoronto is an R interface to the City of Toronto Open Data Portal. The goal of the package is to help read data directly into R without needing to manually download it via the portal.

For more information, please visit the package website and vignettes.

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("sharlagelfand/opendatatoronto")

Usage

In the Portal, datasets are called packages. You can see a list of available packages by using list_packages(). This shows metadata about each package, including the topics (i.e. tags) it covers, any civic issues it addresses, a description, how many resources it has (and their formats), how often it is refreshed, and when it was last refreshed.

library(opendatatoronto)
packages <- list_packages(limit = 10)
packages
#> # A tibble: 10 x 9
#>    title id    topics excerpt dataset_category num_resources formats
#>    <chr> <chr> <chr>  <chr>   <chr>                    <int> <chr>  
#>  1 Body… c405… City … This d… Table                        2 WEB,CS…
#>  2 Stre… 1db3… City … Transi… Map                          1 CSV,GE…
#>  3 Stre… 74f6… City … Public… Map                          1 CSV,GE…
#>  4 Stre… 821f… City … Public… Map                          1 CSV,GE…
#>  5 Stre… ccfd… City … Poster… Map                          1 CSV,GE…
#>  6 Stre… cf70… City … Poster… Map                          1 CSV,GE…
#>  7 Stre… 3944… City … Litter… Map                          1 CSV,GE…
#>  8 Stre… 99b1… City … Inform… Map                          1 CSV,GE…
#>  9 Stre… 71e6… Trans… "Bike … Map                          1 CSV,GE…
#> 10 Stre… 0c4e… City … Bench … Map                          1 CSV,GE…
#> # … with 2 more variables: refresh_rate <chr>, last_refreshed <date>

You can also search packages by title:

ttc_packages <- search_packages("ttc")

ttc_packages
#> # A tibble: 14 x 9
#>    title id    topics excerpt dataset_category num_resources formats
#>    <chr> <chr> <chr>  <chr>   <chr>                    <int> <chr>  
#>  1 TTC … e271… Trans… TTC Bu… Document                     7 XLSX   
#>  2 TTC … aedd… Trans… This d… Website                      2 WEB,XLS
#>  3 TTC … d9dc… Trans… This d… Document                     1 XLSX   
#>  4 TTC … 8217… Trans… The NV… Document                     1 PDF    
#>  5 TTC … 1444… Trans… This d… Website                      2 WEB,XLS
#>  6 TTC … 4eb6… Trans… This d… Document                     5 XLSX   
#>  7 TTC … 2c4c… Finan… This d… Website                      2 WEB,XLS
#>  8 TTC … c01c… <NA>   "This … Document                     1 SHP    
#>  9 TTC … 7795… Trans… Data c… Document                     1 ZIP    
#> 10 TTC … ef35… Trans… This d… Document                     1 XLSX   
#> 11 TTC … d2a7… Trans… This d… Website                      2 WEB,XLS
#> 12 TTC … 4b80… Trans… This d… Website                      2 WEB,XLS
#> 13 TTC … b68c… Trans… TTC St… Document                     7 XLSX   
#> 14 TTC … 996c… Trans… TTC Su… Document                    29 XLSX   
#> # … with 2 more variables: refresh_rate <chr>, last_refreshed <date>
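
Because search_packages() returns an ordinary tibble, its results can be narrowed with standard data-frame tools. The following is only a sketch: `ttc_sample` is a hypothetical stand-in that copies the num_resources and formats values from six rows of the output above (assuming the printed formats are not truncated), so it runs without calling the Portal. The helper `has_format()` is likewise not part of opendatatoronto. With live data you would apply the same filter to `ttc_packages` itself.

```r
# Hypothetical stand-in for a few rows of `ttc_packages` (values copied from
# the printed output above); with live data, use `ttc_packages` directly.
ttc_sample <- data.frame(
  num_resources = c(7L, 2L, 1L, 1L, 1L, 1L),
  formats = c("XLSX", "WEB,XLS", "XLSX", "PDF", "SHP", "ZIP"),
  stringsAsFactors = FALSE
)

# `formats` is a comma-separated string, so split it and test for an exact
# match; a plain substring search for "XLS" would also match "XLSX".
# `has_format()` is an illustrative helper, not a package function.
has_format <- function(formats, wanted) {
  vapply(
    strsplit(formats, ",", fixed = TRUE),
    function(x) wanted %in% trimws(x),
    logical(1)
  )
}

# Keep only the packages that offer at least one XLSX resource.
xlsx_packages <- ttc_sample[has_format(ttc_sample$formats, "XLSX"), ]
xlsx_packages
```

Splitting on the comma rather than pattern-matching the whole string keeps similar format names such as XLS and XLSX from being confused.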

Or see metadata for a specific package:

show_package("996cfe8d-fb35-40ce-b569-698d51fc683b")
#> # A tibble: 1 x 9
#>   title id    topics excerpt dataset_category num_resources formats
#>   <chr> <chr> <chr>  <chr>   <chr>                    <int> <chr>  
#> 1 TTC … 996c… Trans… TTC Su… Document                    29 XLSX   
#> # … with 2 more variables: refresh_rate <chr>, last_refreshed <date>

Within a package, there are a number of resources, e.g. CSV, XLSX, JSON, SHP files, and more. Resources are the actual “data”.

For a given package, you can get a list of resources using list_package_resources(). You can pass it the package id (which is contained in marriage_licence_packages below):

marriage_licence_packages <- search_packages("Marriage Licence Statistics")

marriage_licence_resources <- marriage_licence_packages %>%
  list_package_resources()

marriage_licence_resources
#> # A tibble: 1 x 4
#>   name                      id                         format last_modified
#>   <chr>                     <chr>                      <chr>  <date>       
#> 1 Marriage Licence Statist… 4d985c1d-9c7e-4f74-9864-7… CSV    2019-08-01

But you can also get a list of resources by using the package’s URL from the Portal:

list_package_resources("https://open.toronto.ca/dataset/sexual-health-clinic-locations-hours-and-services/")
#> # A tibble: 2 x 4
#>   name                            id                   format last_modified
#>   <chr>                           <chr>                <chr>  <date>       
#> 1 sexual-health-clinic-locations… e958dd45-9426-4298-… XLSX   2019-08-15   
#> 2 Sexual-health-clinic-locations… 2edcc4a3-c095-4ce3-… XLSX   2019-08-15

Finally (and most usefully!), you can download the resource (i.e., the actual data) directly into R using get_resource():

marriage_licence_statistics <- marriage_licence_resources %>%
  get_resource()

marriage_licence_statistics
#> # A tibble: 412 x 4
#>    `_id` CIVIC_CENTRE MARRIAGE_LICENSES TIME_PERIOD
#>    <int> <chr>                    <int> <chr>      
#>  1   409 ET                          80 2011-01    
#>  2   410 NY                         136 2011-01    
#>  3   411 SC                         159 2011-01    
#>  4   412 TO                         367 2011-01    
#>  5   413 ET                         109 2011-02    
#>  6   414 NY                         150 2011-02    
#>  7   415 SC                         154 2011-02    
#>  8   416 TO                         383 2011-02    
#>  9   417 ET                         177 2011-03    
#> 10   418 NY                         231 2011-03    
#> # … with 402 more rows
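
Once downloaded, the resource is a regular tibble and can be summarised like any other data frame. As a minimal sketch, the code below totals licences per civic centre with base R. `licences_sample` is a hypothetical stand-in built from the ten rows printed above, so the totals cover only those rows and not the full 412-row dataset. With live data you would pass `marriage_licence_statistics` as the `data` argument instead.

```r
# Hypothetical stand-in containing only the ten rows printed above;
# with live data, use `marriage_licence_statistics` instead.
licences_sample <- data.frame(
  CIVIC_CENTRE = c("ET", "NY", "SC", "TO", "ET", "NY", "SC", "TO", "ET", "NY"),
  MARRIAGE_LICENSES = c(80L, 136L, 159L, 367L, 109L, 150L, 154L, 383L, 177L, 231L),
  TIME_PERIOD = c(rep("2011-01", 4), rep("2011-02", 4), rep("2011-03", 2)),
  stringsAsFactors = FALSE
)

# Sum the licence counts within each civic centre across all time periods.
licences_by_centre <- aggregate(
  MARRIAGE_LICENSES ~ CIVIC_CENTRE,
  data = licences_sample,
  FUN = sum
)
licences_by_centre
```

Note that the column is spelled MARRIAGE_LICENSES in the data, while the package title uses "Licence".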

Please note that the ‘opendatatoronto’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
