Prototyping new json schema #1249

Closed
DSuveges opened this issue Nov 11, 2020 · 3 comments

@DSuveges

As part of the consolidation of the evidence objects in the backend, we are re-modeling the JSON schema to reflect the new simplified, flattened design.

At the meeting we had on 2020.11.11, we reached the following conclusions:

  • We need to maintain a JSON schema that guides our data providers and can be used as a template to generate evidence strings.
  • The schema will reflect the concepts of the new platform design, so the units of the schema are going to be data source centric rather than data type centric.
  • Each of the valuable columns will be defined in a common section.
  • For each data source there will only be a list of required fields.
  • We haven't reached a consensus on how the unique association fields are defined, or at which point of the evidence generation this happens. For the first iteration of the JSON schema, the unique_association_fields will therefore be omitted.
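
The layout described in the list above (one common section of field definitions, plus only a list of required fields per data source) could be sketched roughly as follows. This is an illustration under assumptions, not the actual schema: the data source names `source_a` and `source_b`, the helper `missing_required_fields`, and the assignment of fields to sources are all invented here. The field names are taken from the structured fields listed later in this thread.

```python
# Hypothetical sketch of a data-source-centric schema layout.
# The data source names and their required-field lists are invented
# for illustration; they are not taken from the real schema.

# Common section: every field is defined exactly once.
COMMON_FIELDS = {
    "literature": {"type": "array", "items": {"type": "string"}},
    "significantDriverMethods": {"type": "array", "items": {"type": "string"}},
    "clinicalUrls": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "niceName": {"type": "string"},
                "url": {"type": "string"},
            },
        },
    },
}

# Per data source: only a list of required field names.
REQUIRED_BY_SOURCE = {
    "source_a": ["literature"],
    "source_b": ["literature", "clinicalUrls"],
}


def missing_required_fields(evidence: dict, source: str) -> list:
    """Return the required fields of `source` that are absent from `evidence`."""
    return [name for name in REQUIRED_BY_SOURCE[source] if name not in evidence]


# Placeholder evidence string; the ID is a dummy value.
evidence = {"literature": ["0000000"]}
print(missing_required_fields(evidence, "source_a"))  # []
print(missing_required_fields(evidence, "source_b"))  # ['clinicalUrls']
```

Keeping the definitions in one place means a field's type is stated once, and adding a data source only requires a new list of required names.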

The schema is written based on the most recent iteration of the evidence schema review document.

@DSuveges DSuveges added the Data Relates to Open Targets data team label Nov 11, 2020
@DSuveges DSuveges self-assigned this Nov 11, 2020
@ireneisdoomed ireneisdoomed pinned this issue Nov 11, 2020
@ireneisdoomed ireneisdoomed unpinned this issue Nov 11, 2020
@ireneisdoomed

As discussed at the Nov 12th daily meeting, we should keep an eye on the following:

  • Fields present in the submitted evidence that were not taken into account in the cleaned-up version.
  • Whether the scoring depends on fields that are missing.
  • Which data sources provide PubMed Central IDs.
  • A document collecting the questions that arise and that we want to ask the data providers.
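
The first point above, finding submitted fields that the cleaned-up version does not cover, could be checked with something like the sketch below. It is a minimal example under assumptions: the helper name `unaccounted_fields` and the field `legacyScore` are hypothetical, and only top-level keys are compared, so nested fields would need a recursive variant.

```python
from collections import Counter


def unaccounted_fields(records, schema_fields):
    """Count, per field name, how many records carry a top-level field
    that is not listed in `schema_fields`."""
    counts = Counter()
    for record in records:
        for key in record:
            if key not in schema_fields:
                counts[key] += 1
    return dict(counts)


# Toy input: `legacyScore` is an invented field name used only for illustration.
schema_fields = {"literature", "variations"}
submitted = [
    {"literature": [], "legacyScore": 0.5},
    {"variations": [], "legacyScore": 0.1},
    {"literature": []},
]
print(unaccounted_fields(submitted, schema_fields))  # {'legacyScore': 2}
```

The per-field counts give a rough idea of how widely a dropped field is used, which may help when deciding whether scoring could depend on it.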

@DSuveges
Author

Structured fields:

  • literature: list of PMIDs/PMC IDs:
|-- literature: array (nullable = true)
| |-- element: string (containsNull = true)
  • diseaseModelAssociatedModelPhenotypes: list of objects:
|-- diseaseModelAssociatedModelPhenotypes: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- id: string (nullable = true)
| | |-- label: string (nullable = true)
  • diseaseModelAssociatedHumanPhenotypes: list of objects:
|-- diseaseModelAssociatedHumanPhenotypes: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- id: string (nullable = true)
| | |-- label: string (nullable = true)
  • variations: list of objects
|-- variations: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- functionalConsequenceId: string (nullable = true)
| | |-- inheritancePattern: string (nullable = true)
| | |-- numberMutatedSamples: long (nullable = true)
| | |-- numberSamplesTested: long (nullable = true)
| | |-- numberSamplesWithMutationType: long (nullable = true)
| | |-- variantAminoacidDescription: string (nullable = true)
  • textMiningSentences: list of objects:
|-- textMiningSentences: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- dEnd: long (nullable = true)
| | |-- dStart: long (nullable = true)
| | |-- section: string (nullable = true)
| | |-- tEnd: long (nullable = true)
| | |-- tStart: long (nullable = true)
| | |-- text: string (nullable = true)
  • significantDriverMethods: list of strings:
|-- significantDriverMethods: array (nullable = true)
| |-- element: string (containsNull = true)
  • clinicalUrls: list of objects:
|-- clinicalUrls: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- niceName: string (nullable = true)
| | |-- url: string (nullable = true)
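
For reference, the structured fields above could be written as JSON Schema fragments along these lines. This is a sketch, not the content of the actual schema: the helpers `array_of` and `struct` are hypothetical, the Spark type `long` is assumed to map to the JSON Schema type `integer`, and nullability (`nullable = true`) is not modeled.

```python
import json


def array_of(items: dict) -> dict:
    """JSON Schema fragment for an array whose elements match `items`."""
    return {"type": "array", "items": items}


def struct(**fields: str) -> dict:
    """JSON Schema fragment for an object with the given property types."""
    return {
        "type": "object",
        "properties": {name: {"type": t} for name, t in fields.items()},
    }


# Assumption: Spark `long` becomes `integer`; `string` stays `string`.
STRUCTURED_FIELDS = {
    "literature": array_of({"type": "string"}),
    "diseaseModelAssociatedModelPhenotypes": array_of(
        struct(id="string", label="string")
    ),
    "diseaseModelAssociatedHumanPhenotypes": array_of(
        struct(id="string", label="string")
    ),
    "variations": array_of(
        struct(
            functionalConsequenceId="string",
            inheritancePattern="string",
            numberMutatedSamples="integer",
            numberSamplesTested="integer",
            numberSamplesWithMutationType="integer",
            variantAminoacidDescription="string",
        )
    ),
    "textMiningSentences": array_of(
        struct(
            dEnd="integer",
            dStart="integer",
            section="string",
            tEnd="integer",
            tStart="integer",
            text="string",
        )
    ),
    "significantDriverMethods": array_of({"type": "string"}),
    "clinicalUrls": array_of(struct(niceName="string", url="string")),
}

# Print one fragment to inspect the generated structure.
print(json.dumps(STRUCTURED_FIELDS["clinicalUrls"], indent=2))
```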

@DSuveges
Author

DSuveges commented Dec 9, 2020

I am closing the issue for the first round. The updates are not yet merged into the master branch.
Updates available: https://github.com/opentargets/json_schema/tree/ds_1249_new_json_schema

2 participants