Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modified query_template in utility.py #81

Merged
merged 6 commits into from
Feb 15, 2023
Merged

Modified query_template in utility.py #81

merged 6 commits into from
Feb 15, 2023

Conversation

rmanaem
Copy link
Collaborator

@rmanaem rmanaem commented Feb 10, 2023

Main changes:

Wrapped age, sex, diagnosis, subject group, and assessment with
sparql optional pattern
@rmanaem rmanaem self-assigned this Feb 10, 2023
@coveralls
Copy link
Collaborator

coveralls commented Feb 10, 2023

Pull Request Test Coverage Report for Build 4180748389

  • 2 of 2 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.03%) to 91.513%

Totals Coverage Status
Change from base Build 4125821592: 0.03%
Covered Lines: 248
Relevant Lines: 271

💛 - Coveralls

Copy link
Contributor

@alyssadai alyssadai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rmanaem! Is there a reason we're keeping bg:hasSession as non-optional here for now? Does it break the below SELECT block if we also wrap it in an OPTIONAL group?

@rmanaem
Copy link
Collaborator Author

rmanaem commented Feb 11, 2023

Is there a reason we're keeping bg:hasSession as non-optional here for now? Does it break the below SELECT block if we also wrap it in an OPTIONAL group?

I'd refer this to @surchs

@surchs
Copy link
Contributor

surchs commented Feb 13, 2023

Aha, excellent question @alyssadai. I think you are right, based on the current data model it would make sense for hasSession to be optional because. Making the hasSession optional is not a problem from the SPARQL perspective, I just checked. But there are some questions about how we want to use the results.

At the moment, we expect to return a session file-path for every subject that meets the the cohort criteria because (at least in the API) we expect every subject to have at least one session. My view on this at the moment is:

  • let's make the file_path optional. You only get a file_path if you have imaging data
  • let's not make the session optional any more. If all your subjects have only one session, BIDS lets you decide if you want to create a session layer at all - but: you can. I would say: let's always have a session for subjects. As soon as we add the phenotypic attributes to the session, we'll need the session anyway. And it makes our lives easier if we don't have to handle has-session and has-not-session paths for all eternity.

It so happens that I fortuitously failed to create a file_path attribute for all "Queensland Twin IMaging (QTIM)" subjects and sessions in the test graph instance. So we can actually start experimenting with this case now.

@alyssadai
Copy link
Contributor

Thanks for the explanation @surchs. Your second bullet point makes sense to me (and I like always having a session for subjects), but just to clarify the first

let's make the file_path optional. You only get a file_path if you have imaging data

What do you mean by optional? Would participants without imaging data just not be annotated with paths in the graph in the first place? So, from the API aggregate query results perspective (assuming there are participants in the graph who have only non-imaging data), you would still get a list of paths for your query, but the # participants with paths in that list may not be the same as the total # matching participants for the given query (?)

@surchs
Copy link
Contributor

surchs commented Feb 14, 2023

you would still get a list of paths for your query, but the # participants with paths in that list may not be the same as the total # matching participants for the given query

Thanks, that's a good point. I think we haven't really discussed what the decision that "you can exist without neuroimaging data" means for the API userflow - and to a lesser extent also the query tool. My view is that for this PR / release, your suggestion of returning a list of paths with possibly fewer entries than the stated number of subject*sessions is fine.

The main purpose of the paths at the moment is so that a user can get to the binary imaging files if they want to. We don't really have something similar to provide to the user yet for a subject without imaging data, but we still want to mention that these subjects exist.

That said, we should revisit the question of "what can we tell a user about a pheno-only subject" quickly. I made #84 to track this.

With that, I'd say we can go ahead here and make the file_path attribute optional. LMK if you want to look at this together @rmanaem

Wrapped file path with sparql optional pattern
The starlette.response.JSONResponse class which is the default
response class in fastapi doesn't allow nan values and throws
`ValueError: Out of range float values are not JSON compliant` error
Copy link
Contributor

@alyssadai alyssadai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @rmanaem, thanks for catching + addressing this quirk with the SPARQL results!

I've added a commit to drop the NaN file paths in the aggregate query response. With this change the ORJSONResponse class should no longer be needed, but I think it's fine if we keep it in for now in case we decide to revert to returning NaN values for attributes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants