jimdale edited this page May 2, 2024 · 33 revisions

viewser 6++ user documentation

This wiki contains documentation for viewser for versions 6.0.0 and above. Much of what follows can also be found in the README.md file in the root directory of this repository.

viewser is a software client enabling users to fetch and transform raw data from the views3 database. Older versions of viewser (<6.0.0) retrieve data from a source which is no longer maintained, and these versions can no longer be used. Previous users who wish to continue using the service should upgrade to the latest version of viewser (see below).

Installing viewser 6++

viewser is a Python package publicly available via pip. It was developed for macOS and Linux platforms and has not been tested on Windows.

It is very strongly recommended that viewser be installed in a dedicated conda environment.

Conda can be obtained here (the miniforge installer is recommended):

https://conda.io/projects/conda/en/latest/user-guide/install/index.html

A minimal viewser environment can be created by executing

conda create -n viewser python=3.11

Once the environment is activated by

conda activate viewser

viewser itself can be installed via

pip install viewser

The viewser package is regularly updated, so users should frequently update it by executing

pip install --upgrade viewser
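As a quick check that the active environment holds a usable release, the installed version can be read with Python's standard `importlib.metadata`. This is a minimal sketch: the helper names `installed_version` and `is_supported` are illustrative and not part of viewser, and the major-version test simply encodes the statement above that versions below 6.0.0 can no longer be used.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="viewser"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_supported(version_string):
    """True if the major version is 6 or higher (assumes 'major.minor.patch' strings)."""
    return int(version_string.split(".")[0]) >= 6


found = installed_version()
if found is None:
    print("viewser is not installed in this environment")
elif is_supported(found):
    print(f"viewser {found} is installed")
else:
    print(f"viewser {found} is too old; run: pip install --upgrade viewser")
```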

Once viewser is installed, it needs to be configured to set the URL from which it fetches data. This can be achieved by

viewser config set REMOTE_URL https://viewser.viewsforecasting.org

Getting help

To open this wiki in a browser window from the terminal, run:

viewser help wiki

Basic concepts

The viewser package is a client which interacts via an HTTPS connection with a remote service, usually known as views3, currently hosted at the Peace Research Institute Oslo (PRIO).

views3 has a Postgres database containing on the order of 10,000 features (e.g. conflict, economic, developmental, environmental) defined at two main temporal levels of analysis (month, year) and two main spatial levels of analysis (country, priogrid).

The database is actively maintained, with several features being updated on monthly timescales.

The viewser client allows users to fetch raw data from the database and apply a wide variety of mathematical transforms to each feature (which may be chained - see below), eventually returning a single pandas dataframe containing the requested data. Users define what data and which transforms they require using a Python class called a Queryset.

Using viewser

The viewser client can be used in two ways:

Via command-line interface (CLI)

viewser functionality is exposed via a CLI on your system after installation. An overview of available commands can be seen by running viewser --help.

The CLI is envisaged mainly as a tool to help users with issues such as selecting appropriate transforms, exploring the database, determining the level of analysis of a given feature, etc.

Useful CLI commands

Show all features in the database:

viewser features list <loa>

with <loa> being one of ['pgm', 'cm', 'pgy', 'cy', 'pg', 'c', 'am', 'a', 'ay', 'm', 'y']

Show all transforms sorted by level of analysis:

viewser transforms list

Show all transforms available at a particular level of analysis:

viewser transforms at_loa <loa>

with <loa> being one of ['any', 'country_month', 'priogrid_month', 'priogrid_year']

Show docstring for a particular transform:

viewser transforms show <transform-name>

List querysets stored in the queryset database:

viewser querysets list

Produce the code required to generate a queryset:

viewser querysets show <queryset-name>
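Note that the two `<loa>` arguments above use different vocabularies: `features list` takes abbreviations such as `pgm`, while `transforms at_loa` takes long names such as `priogrid_month`. When calling the CLI from a script, a small validation step can catch a mix-up before the command is run. The sketch below only builds the argument list. The accepted values are copied from the lists above, and the helper names are hypothetical rather than part of viewser.

```python
# Values copied from the lists above; the helper names are hypothetical.
FEATURE_LOAS = ("pgm", "cm", "pgy", "cy", "pg", "c", "am", "a", "ay", "m", "y")
TRANSFORM_LOAS = ("any", "country_month", "priogrid_month", "priogrid_year")


def _checked(loa, allowed):
    """Return loa unchanged if it is in allowed, otherwise raise ValueError."""
    if loa not in allowed:
        raise ValueError(
            f"unknown level of analysis {loa!r}; expected one of {list(allowed)}"
        )
    return loa


def features_list_command(loa):
    """Argument list for `viewser features list <loa>`."""
    return ["viewser", "features", "list", _checked(loa, FEATURE_LOAS)]


def transforms_at_loa_command(loa):
    """Argument list for `viewser transforms at_loa <loa>`."""
    return ["viewser", "transforms", "at_loa", _checked(loa, TRANSFORM_LOAS)]


print(" ".join(features_list_command("pgm")))
print(" ".join(transforms_at_loa_command("priogrid_month")))
```

In an environment where viewser is installed, the returned list can be passed to `subprocess.run` to execute the command.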

Via API

The full functionality of viewser is exposed via its API for use in scripts and notebooks.

The two fundamental objects used to define what data is fetched by the client are the Queryset and the Column, where a Queryset consists of one or more Columns, and a Column is one raw feature to which zero or more transforms have been applied.
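The relationship between the two objects can be pictured with a small stand-alone model. This is not the viewser API: `QuerysetSketch`, `ColumnSketch`, their methods, and the feature and transform names are all hypothetical, chosen only to illustrate "one or more Columns, each a raw feature with zero or more chained transforms".

```python
from dataclasses import dataclass, field


@dataclass
class ColumnSketch:
    """Illustrative only: one raw feature plus zero or more chained transforms."""
    name: str
    raw_feature: str
    transforms: list[str] = field(default_factory=list)

    def then(self, transform: str) -> "ColumnSketch":
        """Append a transform to the chain and return self so calls can be chained."""
        self.transforms.append(transform)
        return self

    def expression(self) -> str:
        """Render the chain with the first-applied transform innermost."""
        expr = self.raw_feature
        for transform in self.transforms:
            expr = f"{transform}({expr})"
        return expr


@dataclass
class QuerysetSketch:
    """Illustrative only: a named set of columns at one level of analysis."""
    name: str
    loa: str
    columns: list[ColumnSketch] = field(default_factory=list)

    def add(self, column: ColumnSketch) -> "QuerysetSketch":
        self.columns.append(column)
        return self

    def describe(self) -> list[str]:
        """One line per column; a queryset needs at least one column."""
        if not self.columns:
            raise ValueError("a queryset consists of one or more columns")
        return [f"{c.name} = {c.expression()}" for c in self.columns]


# Hypothetical feature and transform names, for illustration only.
qs = (
    QuerysetSketch("example_queryset", "country_month")
    .add(ColumnSketch("raw_col", "feature_a"))
    .add(ColumnSketch("chained_col", "feature_b").then("transform_1").then("transform_2"))
)
for line in qs.describe():
    print(line)
```

The first column carries no transforms and is returned as the raw feature. The second applies `transform_1` first and `transform_2` to its result.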

Follow the links below for guides to creating querysets and fetching data.

Common tasks

  • Defining data: How to define new querysets
  • Getting data: How to retrieve data from the views3 service
  • Drift detection: Detecting possible anomalies in data retrieved through viewser
  • Model object storage: How to store and retrieve trained models

Conventions

ViEWS 3 relies on several conventions for data and naming, to make exchange and interoperation between packages easier:

Packages

Documentation is provided for each of the constituent ViEWS 3 packages. Notebooks are also available, which show complete workflows using the various packages in concert. Note that package names are written with a dash (-) on PyPI, and with an underscore (_) on GitHub, due to differing naming conventions.

  • viewser is the entrypoint for interacting with the ViEWS 3 cloud, which provides data for the ViEWS team.

  • views-runs provides helper classes for model run management.

  • stepshift implements the step-shifting algorithm used for predictive modelling.

  • views-partitioning is used to partition data for training.

  • views-transformation-library contains data transformation functions available in ViEWS 3.

  • views-dataviz has functions for making maps and plots.

  • views-stepshift is a legacy package containing the reference implementation of the stepshifting algorithm.
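The dash/underscore naming convention noted above can be expressed as a pair of one-line conversions. This is a minimal sketch under the assumption that the two spellings differ only in that character; the function names are hypothetical.

```python
def github_name(pypi_name: str) -> str:
    """Convert a PyPI-style name (dashes) to the GitHub-style spelling (underscores)."""
    return pypi_name.replace("-", "_")


def pypi_name(github_style_name: str) -> str:
    """Convert a GitHub-style name (underscores) to the PyPI-style spelling (dashes)."""
    return github_style_name.replace("_", "-")


print(github_name("views-transformation-library"))
print(pypi_name("views_runs"))
```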

Viewserspace

A pre-packaged Docker image is also provided. It serves as a common baseline environment for running notebooks and contains a Jupyter environment with all relevant packages pre-installed.

To run a notebook server with this image, run the command:

viewser notebooks run

This downloads and launches the image.
