Commit

Merge pull request #353 from bento-platform/ingestion-tweaks
davidlougheed committed Dec 16, 2022
2 parents 052fffe + 98d67fa commit e86055b
Showing 60 changed files with 1,614 additions and 1,107 deletions.
34 changes: 34 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,34 @@
name: Build and push katsu image
on:
  release:
    types: [ published ]
  pull_request:
    branches:
      - develop
  push:
    branches:
      - develop

jobs:
  build-push:
    runs-on: ubuntu-latest

    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          submodules: "recursive"

      - name: Run Bento build action
        uses: bento-platform/bento_build_action@v0.9.3
        with:
          registry: ghcr.io
          registry-username: ${{ github.actor }}
          registry-password: ${{ secrets.GITHUB_TOKEN }}
          image-name: ghcr.io/bento-platform/katsu
          development-dockerfile: bento.dev.Dockerfile
          dockerfile: bento.Dockerfile
31 changes: 0 additions & 31 deletions .github/workflows/docker-publish.yml

This file was deleted.

2 changes: 1 addition & 1 deletion .github/workflows/lint.yml
@@ -13,7 +13,7 @@ jobs:
- uses: actions/setup-python@v2
name: Set up Python
with:
- python-version: 3.8
+ python-version: "3.8"
- name: Install flake8
run: python -m pip install flake8
- name: Run linter
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -16,7 +16,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
- python-version: [3.8, 3.9, 3.10.7]
+ python-version: ["3.8", "3.10"]
services:
postgres:
image: postgres:12
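The matrix change above quotes the version strings. A likely motivation (an assumption; the commit does not state it) is that YAML reads an unquoted `3.10` as a floating-point number, which collapses to `3.1`. The sketch below imitates that coercion with Python's built-in `float` rather than a real YAML parser; `as_unquoted_yaml_scalar` is a hypothetical helper name.

```python
def as_unquoted_yaml_scalar(text: str):
    """Roughly imitate YAML's handling of an unquoted scalar.

    Numeric-looking text becomes a float; anything else stays a string.
    This is an approximation for illustration, not a YAML parser.
    """
    try:
        return float(text)
    except ValueError:
        # e.g. "3.10.7" is not a valid number, so it survives as a string
        return text


for version in ("3.8", "3.10", "3.10.7"):
    parsed = as_unquoted_yaml_scalar(version)
    print(f"{version!r} -> {parsed!r}")
```

Quoting each entry, as the new matrix does, keeps `"3.10"` a string so it cannot be mistaken for `3.1`.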
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -1,4 +1,4 @@
- include chord_metadata_service/chord/workflows/*.wdl
+ include chord_metadata_service/chord/workflows/wdls/*.wdl
include chord_metadata_service/chord/tests/*.json
include chord_metadata_service/dats/*
include chord_metadata_service/mcode/tests/*.json
47 changes: 45 additions & 2 deletions README.md
@@ -23,8 +23,6 @@
* [Phenopacket Commands](#phenopacket-commands)
* [Accessing the Django Shell from inside a Bento Container](#accessing-the-django-shell-from-inside-a-bento-container)
* [Configuring Public overview and public search fields](#configuring-public-overview-and-public-search-fields)
* [Config file specification](#config-file-specification)
* [Public APIs](#public-apis)

## License

@@ -115,6 +113,51 @@ Optionally, you may also install standalone Katsu with the Dockerfile provided.
deploy Katsu as part of the Bento platform, you should use Bento's Docker image instead.


## Environment Variables

Katsu uses several environment variables to configure relevant settings. Below are some:

```bash
# Secret key for sessions; use a securely random value in production
SERVICE_SECRET_KEY=...

# true or false; debug mode enables certain error pages and logging but can leak secrets, DO NOT use in production!
KATSU_DEBUG=true # or BENTO_DEBUG or CHORD_DEBUG

# Mandatory for accepting ingests; temporary directory
KATSU_TEMP= # or SERVICE_TEMP

# Configurable human-readable/translatable name for phenopacket data type (e.g. Clinical Data)
KATSU_PHENOPACKET_LABEL="Clinical Data"

# DRS URL for fetching ingested files
DRS_URL=

# Database configuration
POSTGRES_DATABASE=metadata
POSTGRES_USER=admin
# - If set, will be used instead of POSTGRES_PASSWORD to get the database password.
POSTGRES_PASSWORD_FILE=
POSTGRES_PASSWORD=admin
POSTGRES_HOST=localhost
POSTGRES_PORT=5432

# CHORD/Bento-specific variables:
# - If set, used for setting an allowed host & other API-calling purposes
CHORD_URL=
# - If true, will enforce permissions. Do not run with this not set to true in production!
# Defaults to (not DEBUG)
CHORD_PERMISSIONS=

# CanDIG-specific variables:
CANDIG_AUTHORIZATION=
CANDIG_OPA_URL=
CANDIG_OPA_SECRET=
CANDIG_OPA_SITE_ADMIN_KEY=
INSIDE_CANDIG=
```
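
The precedence rules described in the comments above can be sketched in Python. This is a minimal illustration, not Katsu's actual `settings.py`: the helper names, the set of accepted truthy strings, the order in which the three debug variables are checked, and the `False` default for debug are all assumptions.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping

# Assumption: which strings count as "true" is not specified in the README.
TRUTHY = {"true", "1", "yes"}


def env_flag(env: Mapping[str, str], names: Iterable[str], default: bool) -> bool:
    """Return the first non-empty variable in `names` as a boolean, else `default`."""
    for name in names:
        raw = env.get(name, "").strip().lower()
        if raw:
            return raw in TRUTHY
    return default


def resolve_db_password(env: Mapping[str, str]) -> str:
    """Prefer POSTGRES_PASSWORD_FILE over POSTGRES_PASSWORD, as the README states."""
    password_file = env.get("POSTGRES_PASSWORD_FILE", "").strip()
    if password_file:
        return Path(password_file).read_text().strip()
    return env.get("POSTGRES_PASSWORD", "")


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    # Assumption: KATSU_DEBUG is checked first, then BENTO_DEBUG, then CHORD_DEBUG.
    debug = env_flag(env, ("KATSU_DEBUG", "BENTO_DEBUG", "CHORD_DEBUG"), default=False)
    return {
        "debug": debug,
        # README: CHORD_PERMISSIONS defaults to (not DEBUG) when unset.
        "permissions": env_flag(env, ("CHORD_PERMISSIONS",), default=not debug),
        "db_password": resolve_db_password(env),
    }


if __name__ == "__main__":
    print(resolve_settings({"KATSU_DEBUG": "true", "POSTGRES_PASSWORD": "admin"}))
```

With debug enabled and `CHORD_PERMISSIONS` unset, this sketch leaves permissions unenforced, which is why the README warns against that combination in production.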


## Authentication

Default authentication can be set globally in `settings.py`
19 changes: 19 additions & 0 deletions bento.Dockerfile
@@ -0,0 +1,19 @@
FROM ghcr.io/bento-platform/bento_base_image:python-debian-latest

RUN pip install --no-cache-dir "uvicorn[standard]==0.20.0"

# Backwards-compatible with old BentoV2 container layout
WORKDIR /app

COPY requirements.txt requirements.txt

# Install production dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy all application code
COPY . .

# Create temporary directory for downloading files etc.
RUN mkdir -p tmp

CMD [ "sh", "./entrypoint.sh" ]
18 changes: 18 additions & 0 deletions bento.dev.Dockerfile
@@ -0,0 +1,18 @@
FROM ghcr.io/bento-platform/bento_base_image:python-debian-latest

# Backwards-compatible with old BentoV2 container layout
WORKDIR /app

COPY requirements.txt requirements.txt
COPY requirements-dev.txt requirements-dev.txt

# Install production dependencies
RUN pip install --no-cache-dir -r requirements-dev.txt

# Copy all application code
COPY . .

# Create temporary directory for downloading files etc.
RUN mkdir -p tmp

CMD [ "sh", "./entrypoint.dev.sh" ]
27 changes: 18 additions & 9 deletions chord_metadata_service/chord/data_types.py
@@ -1,41 +1,49 @@
from django.conf import settings

from chord_metadata_service.experiments.search_schemas import EXPERIMENT_SEARCH_SCHEMA
from chord_metadata_service.phenopackets.search_schemas import PHENOPACKET_SEARCH_SCHEMA
from chord_metadata_service.mcode.schemas import MCODE_SCHEMA
# from chord_metadata_service.mcode.schemas import MCODE_SCHEMA
from chord_metadata_service.experiments.schemas import EXPERIMENT_RESULT_SCHEMA

__all__ = [
"DATA_TYPE_EXPERIMENT",
"DATA_TYPE_EXPERIMENT_RESULT",
"DATA_TYPE_PHENOPACKET",
"DATA_TYPE_MCODEPACKET",
"DATA_TYPE_READSET",
"DATA_TYPES",
]

DATA_TYPE_EXPERIMENT = "experiment"
DATA_TYPE_EXPERIMENT_RESULT = "experiment_result"
DATA_TYPE_PHENOPACKET = "phenopacket"
DATA_TYPE_MCODEPACKET = "mcodepacket"
DATA_TYPE_READSET = "readset"
DATA_TYPE_EXPERIMENT_RESULT = "experiment_result"

DATA_TYPES = {
DATA_TYPE_EXPERIMENT: {
"label": "Experiments",
"schema": EXPERIMENT_SEARCH_SCHEMA,
"metadata_schema": {
"type": "object", # TODO
},
},
DATA_TYPE_PHENOPACKET: {
"label": settings.KATSU_PHENOPACKET_LABEL,
"schema": PHENOPACKET_SEARCH_SCHEMA,
"metadata_schema": {
"type": "object", # TODO
}
},
DATA_TYPE_MCODEPACKET: {
"schema": MCODE_SCHEMA,
"metadata_schema": {
"type": "object", # TODO
}
},
},
# De-listed 2022-12-08 - David L
# DATA_TYPE_MCODEPACKET: {
# "schema": MCODE_SCHEMA,
# "metadata_schema": {
# "type": "object", # TODO
# }
# },
DATA_TYPE_READSET: {
"label": "Readsets",
"schema": {
"file_format": EXPERIMENT_RESULT_SCHEMA["properties"]["file_format"]
},
@@ -44,6 +52,7 @@
}
},
DATA_TYPE_EXPERIMENT_RESULT: {
"label": "Experiment Results",
"schema": EXPERIMENT_RESULT_SCHEMA,
"metadata_schema": {
"type": "object"
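The registry change above gives each data type a human-readable `label` and de-lists `mcodepacket`. As a rough, stand-alone sketch of how a consumer might read that registry: the dict below is a trimmed stand-in with placeholder schemas, `PHENOPACKET_LABEL` replaces `settings.KATSU_PHENOPACKET_LABEL` so no Django setup is needed, and `list_data_types` / `is_listed` are hypothetical helpers that do not appear in the commit.

```python
# Stand-in for settings.KATSU_PHENOPACKET_LABEL (value taken from the README example).
PHENOPACKET_LABEL = "Clinical Data"

# Trimmed copy of the registry shape; real schemas are replaced by empty placeholders.
DATA_TYPES = {
    "experiment": {"label": "Experiments", "schema": {}},
    "phenopacket": {"label": PHENOPACKET_LABEL, "schema": {}},
    "readset": {"label": "Readsets", "schema": {}},
    "experiment_result": {"label": "Experiment Results", "schema": {}},
}


def list_data_types(registry: dict) -> list:
    """Return (id, label) pairs, falling back to the ID when an entry has no label."""
    return [(dt_id, entry.get("label", dt_id)) for dt_id, entry in registry.items()]


def is_listed(registry: dict, data_type_id: str) -> bool:
    """True when the data type is still present in the registry."""
    return data_type_id in registry


if __name__ == "__main__":
    for dt_id, label in list_data_types(DATA_TYPES):
        print(f"{dt_id}: {label}")
    # mcodepacket is commented out of the registry in this commit
    print(is_listed(DATA_TYPES, "mcodepacket"))
```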
4 changes: 3 additions & 1 deletion chord_metadata_service/chord/export.py
@@ -1,6 +1,8 @@
import logging
- from chord_metadata_service.chord.ingest import WORKFLOW_CBIOPORTAL
+
from chord_metadata_service.chord.models import Dataset, Project, Table
+ from chord_metadata_service.chord.workflows.metadata import WORKFLOW_CBIOPORTAL
+
from .export_cbio import study_export as export_cbioportal_workflow

logger = logging.getLogger(__name__)
20 changes: 18 additions & 2 deletions chord_metadata_service/chord/export_cbio.py
@@ -13,6 +13,22 @@
from chord_metadata_service.experiments.models import ExperimentResult

__all__ = [
"STUDY_FILENAME",
"SAMPLE_DATA_FILENAME",
"SAMPLE_META_FILENAME",
"PATIENT_DATA_FILENAME",
"PATIENT_META_FILENAME",
"MUTATION_META_FILENAME",
"MUTATION_DATA_FILENAME",
"MAF_LIST_FILENAME",
"CASE_LIST_SEQUENCED",
"CBIO_FILES_SET",

"PATIENT_DATATYPE",
"SAMPLE_DATATYPE",

"REGEXP_INVALID_FOR_ID",

"study_export",
]

@@ -41,8 +57,8 @@
CASE_LIST_SEQUENCED
})

- PATIENT_DATATYPE = 'PATIENT'
- SAMPLE_DATATYPE = 'SAMPLE'
+ PATIENT_DATATYPE = "PATIENT"
+ SAMPLE_DATATYPE = "SAMPLE"

# [ List
# ^ not in...
