censobr: Download Data from Brazil's Population Census

censobr is an R package to download data from Brazil's Population Census. The package is built on top of the Arrow platform, which allows users to work with larger-than-memory census data using familiar {dplyr} functions.

Installation

# install from CRAN
install.packages("censobr")

# or use the development version with latest features
utils::remove.packages('censobr')
remotes::install_github("ipeaGIT/censobr", ref="dev")
library(censobr)

Basic usage

The package currently includes 6 main functions to download & read census data:

read_population()
read_households()
read_mortality()
read_families()
read_emigration()
read_tracts()

censobr also includes a few support functions to help users navigate the documentation of Brazilian censuses, providing convenient information on data variables and methodology:

data_dictionary()
questionnaire()
interview_manual()

Finally, the package includes two functions to help users manage the data cached locally:

censobr_cache()
set_censobr_cache_dir()

All censobr functions that read data follow the same logic, so any data set can be downloaded with a single line of code. For example:

read_households(
  year,          # year of reference
  columns,       # select columns to read
  add_labels,    # add labels to categorical variables
  as_data_frame, # return an Arrow Dataset or a data.frame
  showProgress,  # show download progress bar
  cache          # cache data for faster access later
  )

Note: all data sets in censobr are enriched with geography columns following the naming standards of the {geobr} package, to help data manipulation and integration with spatial data from {geobr}. The added columns are: c('code_muni', 'code_state', 'abbrev_state', 'name_state', 'code_region', 'name_region', 'code_weighting').
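As a minimal sketch of the call pattern described above, the example below reads the 2010 household records and keeps only two of the geography columns listed in the note. The year and the column selection are illustrative choices, not recommendations, and the call downloads the file on first use, so it needs an internet connection.

```r
library(censobr)

# Read household records for 2010, keeping two geography columns.
# The year and columns are assumptions chosen for illustration.
hh <- read_households(
  year = 2010,
  columns = c("code_muni", "abbrev_state"),
  as_data_frame = TRUE,   # return a data.frame instead of an Arrow object
  showProgress = FALSE,   # hide the download progress bar
  cache = TRUE            # reuse the local copy on later calls
)

# Inspect the first rows
head(hh)
```

Selecting columns at read time keeps the returned data.frame small, which matters when as_data_frame is TRUE and the result is loaded into memory.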

Data cache

The first time the user runs a function, censobr will download the file and store it locally. This way, the data only needs to be downloaded once. When the cache parameter is set to TRUE (the default), subsequent calls read the locally cached file, which is much faster.

censobr_cache(): can be used to list and/or delete data files cached locally
set_censobr_cache_dir(): can be used to set custom cache directory for censobr files
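The two cache helpers might be used as sketched below. The argument names (list_files, delete_file, path) and the "all" value are assumptions based on the package documentation as we recall it, so check ?censobr_cache and ?set_censobr_cache_dir for the installed version. The directory name is hypothetical.

```r
library(censobr)

# List the data files currently stored in the local cache
censobr_cache(list_files = TRUE)

# Delete every cached file (assumed value "all"); files are
# downloaded again the next time a read function is called
censobr_cache(delete_file = "all")

# Store future downloads in a custom directory (hypothetical path)
custom_dir <- file.path(tempdir(), "censobr_cache")
dir.create(custom_dir, showWarnings = FALSE)
set_censobr_cache_dir(path = custom_dir)
```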

Larger-than-memory Data

Microdata from Brazilian censuses are often too big to load into users' RAM. To avoid this problem, censobr returns an Arrow table by default, which can be analyzed like a regular data.frame using the dplyr package without loading the full data into memory.
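The sketch below illustrates this workflow on a small toy Arrow table, so it runs without downloading anything. The values and the weight column are hypothetical stand-ins for census microdata. The same dplyr verbs should apply to the object returned by the censobr read functions when as_data_frame is FALSE.

```r
library(arrow)
library(dplyr)

# Toy stand-in for census microdata (hypothetical values and column)
toy <- arrow_table(data.frame(
  abbrev_state = c("RJ", "RJ", "SP", "SP", "SP"),
  weight       = c(10, 20, 5, 5, 10)
))

# Build the query lazily: no rows are pulled into R at this point
query <- toy |>
  group_by(abbrev_state) |>
  summarise(total_weight = sum(weight)) |>
  arrange(abbrev_state)

# collect() runs the query and returns only the small aggregated result
result <- collect(query)
result
```

Only the aggregated rows, one per state, are brought into R, so the result stays small even when the underlying table is large.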

More info in the package vignette.

Contributing to censobr

If you would like to contribute to censobr, you're welcome to open an issue to explain the proposed contribution.

Related projects

As far as we know, censobr is the only R package that provides fast and convenient access to data from Brazilian censuses. The microdadosBrasil package used to provide access to microdata from several public data sets, but it has been discontinued.

Similar packages for other countries

Canada: cancensus
Chile: censo2017
US: tidycensus
World: ipumsr

Credits

Original Census data is collected by the Brazilian Institute of Geography and Statistics (IBGE). The censobr package is developed by a team at the Institute for Applied Economic Research (Ipea), Brazil. If you want to cite this package, you can cite it as:

Pereira, Rafael H. M.; Barbosa, Rogério J. (2023) censobr: Download Data from Brazil's Population Census. R package version v0.2.0, https://CRAN.R-project.org/package=censobr.

bibentry(
  bibtype     = "Manual",
  title       = "censobr: Download Data from Brazil's Population Census",
  author      = "Rafael H. M. Pereira [aut, cre] and Rogério J. Barbosa [aut]",
  year        = 2023,
  version     = "v0.2.0",
  url         = "https://CRAN.R-project.org/package=censobr",
  textVersion = "Pereira, R. H. M.; Barbosa, R. J. (2023) censobr: Download Data from Brazil's Population Census. R package version v0.2.0, <https://CRAN.R-project.org/package=censobr>."
)
