Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ticket-2541: Add project code to several locations. #2623

Merged
merged 1 commit into from
Jan 10, 2023

Conversation

morriscb
Copy link
Contributor

@morriscb morriscb commented Dec 7, 2022

No description provided.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@morriscb morriscb changed the base branch from rc/2.16.1 to master December 13, 2022 00:56
@morriscb morriscb changed the base branch from master to rc/2.14.1 December 23, 2022 17:29
@morriscb morriscb marked this pull request as ready for review December 23, 2022 17:30
Copy link
Contributor

@aamster aamster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, with a few requested changes

@@ -326,7 +328,7 @@ def _get_ophys_experiment_table(self) -> pd.DataFrame:
oec.visual_behavior_experiment_container_id as
ophys_container_id,
pg.group_order AS imaging_plane_group,
pr.code as project_code,
os.project_id as project_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the join on the projects table

@@ -428,10 +430,10 @@ def _get_ophys_session_table(self) -> pd.DataFrame:
SELECT
os.id as ophys_session_id,
bs.id as behavior_session_id,
os.project_id as project_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove join on projects table

output_df : pandas.DataFrame
Fixedup DataFrame
"""
# TODO: Remove this function prior to release and after Jan 5th.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a ticket for this in case we forget?

Copy link
Contributor Author

@morriscb morriscb Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was going to send the NWBs to the science team for a quick look and the mean time this deadline will hopefully pass and we can just remove it on this ticket. Needed to get something running for now.



class ProjectId(DataObject, LimsReadableInterface, NwbReadableInterface):
def __init__(self, project_id: int):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a docstring indicating what project id represents?

from allensdk.internal.api import PostgresQueryMixin


class OphysProjectId(ProjectId):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a docstring.

FROM behavior_sessions bs
WHERE bs.id = {behavior_session_id}
"""
project_id = lims_db.fetchone(query, strict=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of the value seems inconsistent. Here, the code assumes it can be null. While in from_nwb it fills with -1. I think it should be consistently None or -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was a set of spot fixes that that were used to get the unittests/already released data to behave consistently. I'll take a loop through the code and fixup the return types and make them consistent for -1.

@morriscb morriscb force-pushed the ticket/2541/dev branch 2 times, most recently from aeac54a to cd69a0d Compare January 4, 2023 22:31
@morriscb morriscb changed the base branch from rc/2.14.1 to 2.15.0 January 4, 2023 22:31
@morriscb morriscb changed the base branch from 2.15.0 to rc/2.15.0 January 4, 2023 22:34
@morriscb morriscb changed the title Ticket-2541: Add project id to several locations. Ticket-2541: Add project code to several locations. Jan 5, 2023
@morriscb morriscb force-pushed the ticket/2541/dev branch 2 times, most recently from d8a4803 to fb7f18e Compare January 6, 2023 20:54
@morriscb morriscb requested a review from aamster January 6, 2023 20:54
Copy link
Contributor

@aamster aamster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is confusion on the value if the project code is missing and the type of the project_code.
I've seen: "None", "Not available" and -1 used. It somewhere is a string and sometimes an integer. Also the Optional type is used which is specifically for None. I would suggest to keep it as None unless there's good reason otherwise. If you want to use a dummy value instead, I would suggest setting that value in 1 place in the constructor if the value is None and don't pass in a value to the constructor in those cases.

@@ -321,6 +335,12 @@ def behavior_session_uuid(self) -> Optional[uuid.UUID]:
def behavior_session_id(self) -> int:
return self._behavior_session_id.value

@property
def project_code(self) -> Optional[str]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the ProjectCode class, it looks like it always sets project code to a value, so I'm confused by the checking for None and the Optional type. The Optional type is specifically for None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

return cls(project_code=metadata.project_code)
except AttributeError:
# Return values for NWBs without the project code set/available.
return cls(project_code='Not available')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inconsistent with from_lims which returns "None"

"""Unique identifier for the project this BehaviorSession is associated
with. Project ids can be used internally to extract a project code.

If project_Code is null/None, we set the value to string 'None'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why you don't leave the value as None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes the NWB storage simpler. The extensions never wanted to accept a None value and returning a default was the easier solution vs trying to track down it down in all the disparate NWB extension schemas.

@@ -60,6 +60,13 @@ def from_lims(cls, ophys_experiment_id: int,
ophys_metadata = OphysExperimentMetadata.from_lims(
ophys_experiment_id=ophys_experiment_id, lims_db=lims_db)

if ophys_metadata.project_code != behavior_metadata.project_code:
raise ValueError(
'Project id for Ophys experiment table does not match '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo "code" instead of "id"

# project_code needs to be excluded from comparison
# since it's only exposed internally
self._exclude_from_equals = {'project_code'}
self._exclude_from_equals = ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can just remove this line

@@ -142,7 +145,7 @@ def ophys_session_id(self) -> int:
return self._ophys_session_id.value

@property
def project_code(self) -> Optional[str]:
def project_code(self) -> Optional[int]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it's a string. Currently it's never None (optional) since you fill it with a dummy value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

"""Unique identifier for the project this OphysExperiment is associated
with. Project ids can be used internally to extract a project code.

If the returned project id is null, return -1 for the id.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why you don't leave it as None

If you plan on keeping the logic to fill with a dummy value, please refactor setting the dummy value if missing into the base class constructor so that that logic does not need to be repeated here and the code is easier to follow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed to just use a default the class.

@classmethod
def from_lims(cls, ophys_experiment_id: int,
lims_db: PostgresQueryMixin) -> "OphysProjectCode":
query = f"""
Copy link
Contributor

@aamster aamster Jan 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is confusing since there should be a single source of truth for the project code (even though I know the database has it in multiple places). An ophys session is always associated with a behavior session. Is the project code always populated in the behavior sessions table? Can we also just look it up from there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly being cautions. OphysSessions seem to always have a project associated with them. However, it seems that these projects are changed in some cases so consider this a catch for before a project_id is assigned.

@@ -11,3 +14,37 @@ def __init__(self):
self._df = self._df.drop(
['date_of_acquisition_behavior',
'date_of_acquisition_ophys'], axis=1)

if 'project_code_behavior' in self._df and \
'project_code_ophys' in self._df:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate that we need such logic. Can you please put a comment here that the project code is stored in multiple places in the database and we need this logic to deal with that. Also it couldn't hurt to place this in a method since it's pretty long.

@@ -60,6 +60,13 @@ def from_lims(cls, ophys_experiment_id: int,
ophys_metadata = OphysExperimentMetadata.from_lims(
ophys_experiment_id=ophys_experiment_id, lims_db=lims_db)

if ophys_metadata.project_code != behavior_metadata.project_code:
raise ValueError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inconsistent with the project table, in which you just issue a warning. I think a warning should be ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to warning.

Copy link
Contributor Author

@morriscb morriscb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @aamster, thanks for the thorough review. I've added a default to the ProjectCode class that cleans up much of the issues and deals with the None values as we need it.

Copy link
Contributor

@aamster aamster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, there is a bug in the OphysProjectCode.from_lims

"""
project_code = lims_db.fetchone(query, strict=True)
if project_code is None:
cls()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think you forgot return

If project_Code is null/None, we set the value to string 'Not Available'.
"""

def __init__(self, project_code: str = 'Not Available'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to do it is to make this optional and set it to this default value if None. That will fix the duplication of if project_code is None checks.

Add project_code to session metadata
Add project_code to metadata tables.
Fix single process table loading.
Add support to NWBs no project code stored.
Add project_code to BehaviorSession metadata return.
@morriscb morriscb merged commit a18646a into rc/2.15.0 Jan 10, 2023
@aamster aamster linked an issue Jul 14, 2023 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add project_code to a couple new locations
2 participants