R package to access the EUvsDisinfo API

euvsdisinfoR

Lifecycle: experimental

The goal of euvsdisinfoR is to provide easy-to-use wrapper functions for the EUvsDisinfo API. The API gives access to the EUvsDisinfo database, which contains over 7000 disinformation cases and debunks. The EUvsDisinfo project started its work in 2015; it identifies, compiles, and exposes disinformation cases that originate in pro-Kremlin media and are spread across the EU and Eastern Partnership countries.

The package is based on work done together with my team colleagues, Alix Dumoulin and Alexandra Pavliuc, during the EUvsDisinformation hackathon in January 2020. Our hackathon project can be found here.

Installation

You can install euvsdisinfoR from GitHub with:

install.packages("devtools")
devtools::install_github("corriebar/euvsdisinfoR")

Example

To download the different data tables from the EUvsDisinfo database, it is easiest to collect them in a disinfo() object:

library(euvsdisinfoR)
d <- disinfo()

You can add either only a few tables or all of them at once:

d %>%
  add_claims(pages=3) %>%
  add_reviews("all")

d %>% 
  add_all("all")

For claims and claim reviews, it is also possible to query only those that were published or reviewed after a certain date:

library(lubridate)

d %>%
  add_claims(published_since = today() - months(3)) %>%
  add_reviews(published_since = today() - months(3))

d %>%
  add_reviews(reviewed_since = today() - months(2))

Here, published_since always refers to when the claim was originally published, and reviewed_since refers to when an item was reviewed.
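A side note on the date arithmetic in these queries: subtracting months() from a date in lubridate yields NA when the resulting calendar day does not exist (for example, three months before May 31 would be "February 31"). The sketch below uses fixed example dates to show the effect, and lubridate's %m-% operator, which rolls back to the last valid day of the month instead. It only illustrates how the cutoff date is computed; that the result is then passed on as published_since or reviewed_since exactly as in the queries above is an assumption.

```r
library(lubridate)

# Fixed example date, so the result does not depend on when this is run
ref_date <- as.Date("2020-05-31")

# Plain subtraction: "February 31" does not exist, so the result is NA
naive_cutoff <- ref_date - months(3)

# %m-% rolls back to the last valid day of the target month
safe_cutoff <- ref_date %m-% months(3)

print(naive_cutoff)  # NA
print(safe_cutoff)   # "2020-02-29" (2020 is a leap year)
```

In a query, today() %m-% months(3) could take the place of today() - months(3) to avoid an NA cutoff on such days.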

To flatten the disinfo object into a single data frame, use:

d %>%
  flatten_disinfo()

This attempts to merge as much as possible (according to the relations in the data model below) and returns a single data frame. Each row of the result is a creative work, or a claim if the disinfo object contains no creative works.

It is also possible to download individual data tables:

# get first 3 pages from the claims endpoint
get_claims(pages=3)
# or get all creative works
get_creative_works(pages="all")

The Data

The different data sets form the following data model:

[Figure: diagram of the EUvsDisinfo data model]

Note that the columns appearances, keywords and content_locations in the claims data frame are list columns: each entry contains a list of creative works, keywords, or countries, respectively.
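As a rough illustration of what working with such a list column can look like, here is a minimal base R sketch on a made-up data frame. The column name claim_id and all values are hypothetical stand-ins, and the entries of the real columns may be lists or data frames rather than plain character vectors, so treat this as a pattern rather than the package's actual output.

```r
# Toy stand-in for the claims data frame; claim_id and all values are made up
claims <- data.frame(claim_id = c("c1", "c2", "c3"), stringsAsFactors = FALSE)
claims$keywords <- list(c("elections", "sanctions"), "nato", character(0))

# Number of keywords attached to each claim
n_keywords <- lengths(claims$keywords)

# Expand to one row per (claim, keyword) pair;
# claims without any keywords drop out of the long table
claims_long <- data.frame(
  claim_id = rep(claims$claim_id, times = n_keywords),
  keyword  = unlist(claims$keywords),
  stringsAsFactors = FALSE
)

print(claims_long)
```

If the tidyverse is already in use, tidyr::unnest() performs a similar expansion.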
