Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update unit tests to match latest serialization model for VRS 2.0.0a5 #85

Closed
ehclark opened this issue Feb 26, 2024 · 11 comments · Fixed by #92
Closed

Update unit tests to match latest serialization model for VRS 2.0.0a5 #85

ehclark opened this issue Feb 26, 2024 · 11 comments · Fixed by #92
Assignees
Labels
bug Something isn't working

Comments

@ehclark
Copy link
Contributor

ehclark commented Feb 26, 2024

The serialization model for VRS 2.0 has been updated in the 2.0.0a5 release. Unit tests need to be updated to account for the changes.

@ehclark ehclark self-assigned this Feb 26, 2024
@ehclark
Copy link
Contributor Author

ehclark commented Feb 26, 2024

@ahwagner @korikuzma @jsstevenson
Is ga4gh/vrs-python#335 a blocker for this? For the /vrs_variation endpoint, the serialization/digest is processed by vrs_enref():

    def put_object(self, variation_object: VrsObject) -> Optional[str]:
        """Attempt to register variation.

        :param variation_object: complete VRS object
        :return: Object digest if successful, None otherwise
        """
        try:
            id, _ = vrs_enref(variation_object, self.object_store, True)
        except ValueError:
            return None
        return id

And I am seeing discrepant behavior. That is, in vrs-python, this test would fail:

allele_dict = {
    "type": "Allele",
    "location": {
        "type": "SequenceLocation",
        "sequenceReference": {
            "type": "SequenceReference",
            "refgetAccession": "SQ.ss8r_wB0-b9r44TQTMmVTI92884QvBiB",
        },
        "start": 87894076,
        "end": 87894077,
    },
    "state": {"type": "LiteralSequenceExpression", "sequence": "T"},
}


def test_vr():
    allele = models.Allele(**allele_dict)
    (id1, obj) = vrs_enref(allele, None, True)
    allele = models.Allele(**allele_dict)
    id2 = ga4gh_identify(allele)
    assert id1 == id2

@korikuzma
Copy link
Contributor

@ehclark I don't think we have any tests for haplotypes. It appears that anyvar currently only supports Alleles / Copy Number. Maybe we should document this in a separate issue?

@korikuzma
Copy link
Contributor

And I am seeing discrepant behavior. That is, in vrs-python, this test would fail:

allele_dict = {
    "type": "Allele",
    "location": {
        "type": "SequenceLocation",
        "sequenceReference": {
            "type": "SequenceReference",
            "refgetAccession": "SQ.ss8r_wB0-b9r44TQTMmVTI92884QvBiB",
        },
        "start": 87894076,
        "end": 87894077,
    },
    "state": {"type": "LiteralSequenceExpression", "sequence": "T"},
}


def test_vr():
    allele = models.Allele(**allele_dict)
    (id1, obj) = vrs_enref(allele, None, True)
    allele = models.Allele(**allele_dict)
    id2 = ga4gh_identify(allele)
    assert id1 == id2

@ehclark Would we just need to add a is_ga4gh_identifiable method to return False in the IRI class?

@ehclark
Copy link
Contributor Author

ehclark commented Feb 27, 2024

@korikuzma I am not familiar enough with the semantics of the underlying model to know what the correct solution is.

While ga4gh/vrs-python#335 only mentions haplotypes, there are several Allele tests in tests/test_vrs2.py that are marked @pytest.mark.skip(reason="Need to refactor enref / deref").

@ahwagner
Copy link
Member

ahwagner commented Mar 1, 2024

Hi @ehclark; yes, the enref/deref work was not addressed per ga4gh/vrs-python#342 (comment) and still needs to be handled. I will take a look at this today.

@ahwagner
Copy link
Member

ahwagner commented Mar 1, 2024

@ehclark see ga4gh/vrs-python#357 for PR to address unit tests in VRS-Python. Enref / deref behavior, as it turns out, is unaffected; however I needed to dig into the enderef code to understand the refatt map and verify it was behaving as before (it is). Not sure if that addresses this issue in its entirety; if there are unit tests that need to be updated in AnyVar I have not looked at that test suite.

@ehclark
Copy link
Contributor Author

ehclark commented Apr 16, 2024

@jsstevenson @korikuzma
The response from the get_variation endpoint is not fully de-referenced as it was before (or maybe it was and the rules for serialization changed in VRS 2?). I'm not sure what the desired result is here. Some guidance would be appreciated.

Previously (according to the unit test expectations) it would work like this:

GET /get_variation/ga4gh:VA.K7akyz9PHB0wg8wBNVlWAAdvMbJUJJfU

{
  "messages": [],
  "object": {
    "id": "ga4gh:VA.K7akyz9PHB0wg8wBNVlWAAdvMbJUJJfU",
    "location": {
      "id": "ga4gh:SL.01EH5o6V6VEyNUq68gpeTwKE7xOo-WAy",
      "end": 87894077,
      "start": 87894076,
      "sequenceReference": {
        "refgetAccession": "SQ.ss8r_wB0-b9r44TQTMmVTI92884QvBiB",
        "type": "SequenceReference"
      },
      "type": "SequenceLocation"
    },
    "state": {
      "sequence": "T",
      "type": "LiteralSequenceExpression"
    },
    "type": "Allele"
  }
}

Now the behavior is as follows:

GET /get_variation/ga4gh:VA.K7akyz9PHB0wg8wBNVlWAAdvMbJUJJfU

{
  "messages": [],
  "data": {
    "location": "01EH5o6V6VEyNUq68gpeTwKE7xOo-WAy",
    "state": {
      "sequence": "T",
      "type": "LiteralSequenceExpression"
    },
    "type": "Allele"
  }
}

If the goal is to have the get_variation endpoint return the fully dereferenced VRS object, I think we cannot have the API response model reference the VRS pydantic models? Because FastAPI is using the pydantic serialization to generate the response bodies.

@jsstevenson jsstevenson added the bug Something isn't working label Apr 17, 2024
@jsstevenson
Copy link
Contributor

I am not up to speed on whether any rules have changed re serialization in VRS 2 (I know there have been a lot of code changes), but -- and maybe this is just my ignorance -- it seems bad that location is a raw digest without a ga4gh:SL. prefix

@korikuzma
Copy link
Contributor

@jsstevenson This may be due to the related issue here

@ehclark
Copy link
Contributor Author

ehclark commented Apr 18, 2024

@korikuzma Yes that issue you referred to is the same as the one in AnyVar. I have only found two solutions:

  • Remove the response_model annotation for the endpoint
  • Change the response class to declare the VRS object as type Dict instead of models.Variation (as is done with RegisterVariationResponse)

But in both cases as you note, the OpenAPI docs do not generate the full response model.

Perhaps we can follow your lead in the other project: remove the response model for now and separate the OpenAPI doc into a separate ticket?

@korikuzma
Copy link
Contributor

@ehclark That works for me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants