Chan Zuckerberg Landscaping Toolkit

Image source on LucidDraw: Link

CZI adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to opensource@chanzuckerberg.com.

Please note: If you believe you have found a security issue, please responsibly disclose by contacting us at security@chanzuckerberg.com.

Install

pip install git+https://github.com/chanzuckerberg/czLandscapingTk.git

How this system was built:

This library is built on databricks_to_nbdev_template, which is a modified version of nbdev_template tailored to work with Databricks notebooks.

Contributing to this library follows a development pipeline that uses Databricks, so this work will mainly be driven internally from within the CZI tech team:

1. Clone this library from within Databricks.
2. Place your scripts and utility notebooks in subdirectories of the databricks folder in the file hierarchy.
3. Any Databricks notebooks that contain the text from nbdev import * will be automatically converted to Jupyter notebooks that live at the root level of the repository.
4. When you push this repository to GitHub from Databricks, Jupyter notebooks will be built, added to the repo, and then processed by nbdev to generate modules and documentation (refer to https://nbdev.fast.ai/ for full documentation on how to do this).

Note that pushing code to GitHub causes additional code to be added and committed to the repository, so you must perform another git pull to load and refer to the latest changes in your code.
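To make the notebook-selection rule concrete, here is a minimal sketch of how notebooks containing the marker text could be located. It is not the conversion code this repository actually uses. The function name find_nbdev_notebooks is hypothetical, and the sketch assumes that Databricks notebooks are stored as .py source files under the databricks folder.

```python
from pathlib import Path

# Marker text that flags a notebook for conversion (taken from the steps above).
NBDEV_MARKER = "from nbdev import *"


def find_nbdev_notebooks(databricks_dir):
    """Return sorted paths of .py files under databricks_dir containing the marker.

    Hypothetical helper for illustration only; assumes notebooks are stored
    as .py source files, which may not match the real pipeline.
    """
    root = Path(databricks_dir)
    matches = []
    for path in sorted(root.rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            # Skip files that cannot be read as UTF-8 text.
            continue
        if NBDEV_MARKER in text:
            matches.append(path)
    return matches
```

Calling find_nbdev_notebooks("databricks") from the repository root would list the candidate notebooks; the conversion to Jupyter format itself is outside the scope of this sketch.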

High-level Design: The Surveying Knowledge Task

This project is focused on providing a suite of generalizable tools that knowledge analysts can use to implement solutions for surveying tasks. The basic structure of this class of data analysis can be described as follows:

Goal

An analytic task in which we attempt to answer a question by (A) surveying existing data sources, (B) compiling an intermediate knowledge corpus drawn from those sources, and (C) analysing that corpus to yield an answer to the question.

Typical Examples

Identifying a set of Key Opinion Leaders (KOLs) with specialized expertise in an understudied area.
Performing a systematic review of available treatments for a specific rare disease.
Developing (and using) reproducible impact metrics for a funded scientific program to study what is working and what is not.

Terminology + Implementation Design

Question - A natural language expression of the research question that is the objective of the task.
Study Data Sources - List of available information sources that can be interrogated by executors of the task.
Information Retrieval Query (IR Query) - A list of logically-defined queries that can be run over the data sources.
Inclusion / Exclusion Criteria - Logical operators to determine if retrieved data should be included in the study.
Intermediate Corpus - Schema and Data of the collection of documents gathered from external information sources.
Analysis - Workflow specification of analyses to be performed over the intermediate corpus to generate an Answer.
Answer - The answer to the question expressed in natural language with a full explanation of the provenance of how the answer was computed.
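The terms above can be read as a small data model. The following is a minimal sketch of that reading, not the toolkit's actual API: the names SurveyTask, Answer, build_corpus, and summarize_titles are hypothetical, documents are assumed to be plain dictionaries, and the inclusion and exclusion criteria are assumed to be simple predicate functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a retrieved document is a flat dict of string fields.
Document = Dict[str, str]
Criterion = Callable[[Document], bool]


@dataclass
class Answer:
    text: str              # natural language answer
    provenance: List[str]  # how the answer was computed


@dataclass
class SurveyTask:
    question: str
    data_sources: List[str]
    ir_queries: List[str]
    inclusion_criteria: List[Criterion] = field(default_factory=list)
    exclusion_criteria: List[Criterion] = field(default_factory=list)

    def build_corpus(self, retrieved: List[Document]) -> List[Document]:
        """Keep documents that meet every inclusion and no exclusion criterion."""
        return [
            doc for doc in retrieved
            if all(rule(doc) for rule in self.inclusion_criteria)
            and not any(rule(doc) for rule in self.exclusion_criteria)
        ]


def summarize_titles(task: SurveyTask, corpus: List[Document]) -> Answer:
    """Toy analysis: count the retained documents and record provenance."""
    titles = sorted(doc["title"] for doc in corpus)
    provenance = [f"sources={task.data_sources}", f"queries={task.ir_queries}"]
    provenance += [f"included: {title}" for title in titles]
    text = f"{len(titles)} document(s) retained for: {task.question}"
    return Answer(text=text, provenance=provenance)
```

In this sketch the intermediate corpus is the list returned by build_corpus, and an analysis is any function that maps a task and a corpus to an Answer. Keeping provenance as an explicit field reflects the requirement that an answer carries an explanation of how it was computed.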

Organizational Model

Image source on LucidDraw: Link

Adopting the CommonKADS knowledge engineering design process, we consider the interplay between agents (swimlanes), processes, and items in the figure. In particular, we seek to characterize how knowledge is needed, used, or derived in the workflow.

The goal of this project is to provide an extensible set of executable computational tools that automate the processes shown above.
