Update nullable fields, add additional fields to AlleleDescription #680

Merged: 8 commits merged into master on May 15, 2023

Conversation

williamdlees
Contributor

@williamdlees williamdlees commented Mar 20, 2023

In germline objects, remove all nullable: false entries and add x-airr: qualifiers instead.

  • Move aligned sequence to SequenceDelineationV and change co-ordinates in SequenceDelineationV to be against the unaligned sequence rather than the aligned sequence. This makes AlleleDescription delineation-agnostic and removes reliance on an alignment for the co-ordinates
  • Add fields to AlleleDescription for Allele Similarity Cluster designation
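
To illustrate the co-ordinate change described above, here is a minimal sketch of mapping a position in a gapped (aligned) sequence to the corresponding position in the unaligned sequence. The function name, the 1-based convention, and the gap characters `.` and `-` are assumptions made for this example; they are not taken from the PR or the schema.

```python
def aligned_to_unaligned(aligned_seq: str, aligned_pos: int, gap_chars: str = ".-") -> int:
    """Map a 1-based position in a gapped sequence to the 1-based
    position of the same base in the ungapped sequence.

    Hypothetical helper: name, 1-based convention, and gap characters
    are assumptions for illustration only.
    """
    if not 1 <= aligned_pos <= len(aligned_seq):
        raise ValueError("position is outside the aligned sequence")
    if aligned_seq[aligned_pos - 1] in gap_chars:
        raise ValueError("position falls on a gap and has no unaligned equivalent")
    # Count the non-gap characters up to and including the requested position.
    return sum(1 for ch in aligned_seq[:aligned_pos] if ch not in gap_chars)


# Made-up example: three gap columns inserted after the first codon.
aligned = "CAG...GTG"
unaligned_pos = aligned_to_unaligned(aligned, 7)
```

With co-ordinates stored against the unaligned sequence, a conversion like this is only needed when a particular alignment is being displayed, which is what makes the record independent of any one delineation.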

@williamdlees
Contributor Author

Now has the two items agreed at our last meeting for implementation in this release:

  • use miairr 'importance' designation in germline objects
  • de-nest objects in Genotype
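
As a rough sketch of how the miairr importance designation could be used by a consumer of the schema, the snippet below lists the properties tagged at a given level and reports which essential ones are missing from a record. The schema fragment is hypothetical, and the `x-airr` layout shown (`miairr` and `nullable` keys) is an assumption for illustration rather than a copy of the actual schema file.

```python
# Hypothetical schema fragment. The property names and the x-airr layout
# are assumptions for this example, not copied from the AIRR schema.
SCHEMA_FRAGMENT = {
    "AlleleDescription": {
        "properties": {
            "species": {"x-airr": {"miairr": "essential", "nullable": False}},
            "label": {"x-airr": {"miairr": "important", "nullable": True}},
            "notes": {"x-airr": {"nullable": True}},
        }
    }
}


def fields_at_level(schema, object_name, level):
    """Return the sorted property names whose x-airr.miairr equals `level`."""
    props = schema[object_name]["properties"]
    return sorted(
        name
        for name, spec in props.items()
        if spec.get("x-airr", {}).get("miairr") == level
    )


def missing_essential(schema, object_name, record):
    """Return the essential properties that are absent or None in `record`."""
    return [
        name
        for name in fields_at_level(schema, object_name, "essential")
        if record.get(name) is None
    ]


essential = fields_at_level(SCHEMA_FRAGMENT, "AlleleDescription", "essential")
problems = missing_essential(SCHEMA_FRAGMENT, "AlleleDescription", {"label": "example"})
```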

@williamdlees williamdlees reopened this Mar 30, 2023
@williamdlees
Contributor Author

fixed UndocumentedAllele name

@williamdlees
Contributor Author

Withdrawn while I work on updating the test data

@williamdlees williamdlees reopened this Apr 8, 2023
Member

@bussec bussec left a comment

@williamdlees please find my comments below:

  • The following properties should IMO be essential:

    • RearrangedSequence.derivation
    • RearrangedSequence.observation_type
    • AlleleDescription.species
    • GermlineSet.species
    • MHCGenotype.mhc_class
  • AlleleDescription.coding_sequence.description: Is it intended to refer to IMGT Ontology here?

  • AlleleDescription.allele_similarity_cluster* needs more description and an example. The description should clarify whether "similarity" is different from "identical" and if yes to which degree.

  • Could Genotype.receptor_genotype_set_id be renamed to genotype_set_id to avoid confusion with the receptor_* properties of the Receptor object?

  • MHCGenotype.mhc_alleles is an array-of-objects (and not an array-of-refs)

  • As a note: According to the wiki, the term "derivation" does not refer to the source of the process (or it's at least ambiguous).

@javh
Contributor

javh commented Apr 17, 2023

From the call:

  • species, mhc_class, and observation_type as essential makes sense.
  • Check with IARC on RearrangedSequence.derivation.
  • coding_sequence.description: can we be more explicit here? Can we use the following as a guide? https://docs.airr-community.org/en/stable/miairr/specification_miairr_ncbi.html.
  • Can't change Genotype.receptor_genotype_set_id in a minor release. Maybe v1.5.
  • Replacement of object (MHCGenotype.mhc_alleles) with reference should be fine for v1.4.2.

Good to merge in after the above fixes.

@williamdlees
Contributor Author

williamdlees commented May 15, 2023

  • species, mhc_class, and observation_type marked as essential.
  • No strong views from IARC regarding RearrangedSequence.derivation (the name was reviewed and agreed by IARC some years ago). Possible alternatives, should they be required, could be derived_from, sample_source, nucleic_acid_class.
  • coding_sequence.description changed to: nucleotide sequence of the core coding region, i.e. the coding region of a D-, J- or C- gene, or the coding region of a V-gene excluding the leader. I think this is pretty specific now.
  • object (MHCGenotype.mhc_alleles) replaced with reference to a new top-level object MHCAllele.
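
For readers unfamiliar with this kind of schema change, here is a minimal sketch of replacing an inline array-of-objects definition with a reference to a top-level object. The helper `hoist_inline_items`, the `#/Name` reference style, and the `allele_designation` property in the sample fragment are assumptions for illustration; the sketch does not reproduce the actual edit made in this PR.

```python
import copy


def hoist_inline_items(schema, parent, prop, new_name):
    """Return a copy of `schema` in which the inline item definition of an
    array property is moved to a top-level object and replaced by a $ref.

    Hypothetical helper; the '#/Name' reference style is an assumption.
    """
    out = copy.deepcopy(schema)  # leave the caller's schema untouched
    array_spec = out[parent]["properties"][prop]
    items = array_spec["items"]
    if "$ref" in items:
        return out  # already a reference, nothing to do
    if new_name in out:
        raise ValueError(f"{new_name} already exists at the top level")
    out[new_name] = items
    array_spec["items"] = {"$ref": f"#/{new_name}"}
    return out


# Illustrative fragment with an embedded object definition.
nested = {
    "MHCGenotype": {
        "type": "object",
        "properties": {
            "mhc_alleles": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {"allele_designation": {"type": "string"}},
                },
            }
        },
    }
}

flat = hoist_inline_items(nested, "MHCGenotype", "mhc_alleles", "MHCAllele")
```

After the call, `flat` holds a top-level `MHCAllele` definition and `mhc_alleles` becomes an array of references to it, while `nested` is left unchanged.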

@bcorrie
Contributor

bcorrie commented May 15, 2023

Re-structuring of MHCGenotype and Genotype objects to include object references rather than embedded objects looks like it is correct to me.

Contributor

@bcorrie bcorrie left a comment

MHCGenotype and Genotype object restructuring (no more embedded objects) looks like it is correct to me.

@javh
Contributor

javh commented May 15, 2023

From the call:

  • Add additional description and clarification for AlleleDescription.allele_similarity_cluster* in the Germline documentation (see above comment from @bussec).

@javh javh merged commit 2520bbd into master May 15, 2023
schristley pushed a commit that referenced this pull request Jul 25, 2023

* Update nullable fields, add CDR fields to AlleleDescription
* fix missing x-airr
* Fix miairr tags in germline objects (#663). De-nest genotype (#667)
* Add fwr3_end
* Fix object name
* Update v-sequence delineation fields and dropped aligned sequence from AlleleDescription. Updated germline test data for germline. AlleleDescription. Updated germline test data for.
* Fix combined test data
* Further updates to germline objects

4 participants