Panda dataframe default display; skip DynamicTable Regions #175

Open
CodyCBakerPhD opened this issue Sep 23, 2021 · 4 comments · May be fixed by #176

Comments

@CodyCBakerPhD
Collaborator

I have an NWBFile where a column of the UnitsTable contains a DynamicTableRegion, specifically references to particular IDs of the ElectrodesTable. Note, however, that what we're seeing here likely occurs in the table display for any other DynamicTable object referenced from a custom column.

Screenshot of the issue, together with the actual file contents in HDFView (note that pulling the data through an NWBHDF5IO works correctly and returns the actual DynamicTableRegion object with all its data properly formatted):
image

The specific tab here is the table view of the Units module in the widgets, which loads a simple view as a pandas dataframe display. Basically, it doesn't know how to map the object to a dataframe, so it just grabs the string column names of the target table and breaks each one down into a tuple of characters.
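To make the symptom concrete, here is a minimal sketch in plain Python of how a column name turns into a tuple of characters once it is treated as a sequence of values. The names in target_column_names are made up for illustration, and this is not the actual code path in the widgets or in HDMF, only the general failure mode.

```python
# Hypothetical column names of the referenced table (illustration only).
target_column_names = ["location", "group_name"]

# Treating each name as a sequence of values, instead of as one value,
# splits it into its individual characters.
broken_cells = [tuple(name) for name in target_column_names]

print(broken_cells[0])
```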

We could discuss what the best way of viewing such nested table data should be, but I think the easiest, quickest fix would be to just not try to display any column that has DynamicTableRegion values.

@bendichter
Collaborator

@CodyCBakerPhD I think this is an issue with DynamicTable.to_dataframe. Could you try that, and if so, could you post this issue in HDMF?

@oruebel
Contributor

oruebel commented Sep 23, 2021

DynamicTable.to_dataframe by default resolves the DynamicTableRegion links and will return a nested DataFrame, i.e., each cell of the column will itself contain a DynamicTableRegion with the data from the linked table. If you don't want to_dataframe to resolve the links then simply set index=False and you should get the integer indices (or list/tuple of indices) for the rows that are being referenced. See https://hdmf.readthedocs.io/en/stable/hdmf.common.table.html?highlight=DynamicTable#hdmf.common.table.DynamicTable.to_dataframe
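The two ways of showing a region column (stored row indices versus resolved rows of the linked table) can be sketched with plain Python data. Everything here is a stand-in: the electrodes rows, the unit_electrodes cells, and the render_region_column helper with its resolve flag are hypothetical and do not call HDMF. Which value of the real index argument gives which behavior is best checked against the linked documentation.

```python
# Stand-in for the referenced table: one dict per row.
electrodes = [
    {"id": 0, "location": "CA1"},
    {"id": 1, "location": "CA3"},
    {"id": 2, "location": "DG"},
]

# Stand-in for a region column: each cell lists row indices into `electrodes`.
unit_electrodes = [[0, 1], [2]]


def render_region_column(cells, target, resolve):
    """Return cells as stored indices, or resolved to the target rows."""
    if not resolve:
        return [list(cell) for cell in cells]
    return [[target[i] for i in cell] for cell in cells]


as_indices = render_region_column(unit_electrodes, electrodes, resolve=False)
as_nested = render_region_column(unit_electrodes, electrodes, resolve=True)

print(as_indices)
print(as_nested)
```

The index form stays flat and is easy to show in a simple table view, while the resolved form nests a small table inside each cell.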

@oruebel
Contributor

oruebel commented Sep 23, 2021

Also, if you have a DynamicTable ds and want to exclude any DynamicTableRegion columns when converting to a pandas DataFrame then I think the following should work:

ds.to_dataframe(exclude=set(ds.get_foreign_columns()))
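As a rough picture of what that one-liner does, the sketch below finds the region-type columns by type and leaves them out when building the rows. It is a plain-Python model under stated assumptions: Column, RegionColumn, foreign_column_names, and to_records are hypothetical stand-ins, not HDMF classes or functions.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical plain column: a name plus one value per row."""
    name: str
    data: list


@dataclass
class RegionColumn(Column):
    """Hypothetical column whose cells point at rows of another table."""
    target: str = ""


def foreign_column_names(columns):
    # Names of all columns that reference another table.
    return [c.name for c in columns if isinstance(c, RegionColumn)]


def to_records(columns, exclude=frozenset()):
    # Build one dict per row, skipping any column named in `exclude`.
    kept = [c for c in columns if c.name not in exclude]
    n_rows = len(kept[0].data) if kept else 0
    return [{c.name: c.data[i] for c in kept} for i in range(n_rows)]


units = [
    Column("spike_count", [12, 7]),
    RegionColumn("electrodes", [[0, 1], [2]], target="electrodes"),
]

rows = to_records(units, exclude=set(foreign_column_names(units)))
print(rows)
```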

@CodyCBakerPhD
Collaborator Author

If you don't want to_dataframe to resolve the links then simply set index=False and you should get the integer indices (or list/tuple of indices) for the rows that are being referenced.

index=False is apparently the default, and it is what resulted in the above issue; setting index=True gets things to look better at least.

image

I think this is an issue with DynamicTable.to_dataframe. Could you try that, and if so, could you post this issue in HDMF?

Given the above, I think it'd be best to just set index=True in the view call for rendering dataframes; I'll get a quick PR up for that.

@CodyCBakerPhD CodyCBakerPhD linked a pull request Sep 27, 2021 that will close this issue