anndata views do not work as expected for obs and var dataframes #887

Open
jlause opened this issue Jan 27, 2023 · 3 comments

jlause commented Jan 27, 2023

After talking to @ivirshup on Zulip, opening this issue here to point out an inconsistency in how anndata views currently work.

When I have a view of an anndata object, I expect that any changes to the parent anndata object will propagate to the view. This does not work for the obs and var fields.

Example: When I add a new column to obs, or change an existing one, in the parent anndata, the view is unchanged:

import anndata
import numpy as np

# make adata with an obs field
adata = anndata.AnnData(X=np.zeros((5, 5)),
                        obs=dict(old='old_value'))
# make a view by slicing
view = adata[:3, :3]
# add a new column to the parent
adata.obs['new'] = 'test'
# edit an existing column in the parent
adata.obs['old'] = 'new_value'
# view.obs does not have the changes!
print(view.obs)
# output:
#         old
# 0  old_value
# 1  old_value
# 2  old_value

For X I see the expected behavior:

# change the parent object's .X
adata.X[0, 0] = 1
# the change is visible in view.X
view.X
# output:
# ArrayView([[1., 0., 0.],
#            [0., 0., 0.],
#            [0., 0., 0.]], dtype=float32)

(also works for obsp, obsm and layers, not shown for brevity)

According to @ivirshup:

I believe obs and var get copied when a view is made. IIRC this is about needing an updated obs_names and var_names, which is tied to them being in obs/var.
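
To make the difference concrete, here is a toy model in plain Python of a view that subsets one field lazily and snapshots another eagerly. It is only a sketch of the behavior described above, not anndata's actual implementation; `ToyParent`, `ToyView`, and their attributes are hypothetical names invented for illustration.

```python
class ToyParent:
    """Stand-in for a parent object holding a matrix and a per-row table."""

    def __init__(self, x, obs):
        self.x = x      # list of rows, each row a list of numbers
        self.obs = obs  # dict mapping column name -> list of per-row values


class ToyView:
    """Stand-in for a view onto a subset of rows of a ToyParent."""

    def __init__(self, parent, rows):
        self._parent = parent
        self._rows = list(rows)
        # Eager: the table is subset once, at construction time (a snapshot).
        self.obs = {
            col: [values[i] for i in self._rows]
            for col, values in parent.obs.items()
        }

    @property
    def x(self):
        # Lazy: the rows are looked up in the parent on every access.
        return [self._parent.x[i] for i in self._rows]


parent = ToyParent(
    x=[[0.0] * 5 for _ in range(5)],
    obs={"old": ["old_value"] * 5},
)
view = ToyView(parent, range(3))

# Modify the parent after the view has been created.
parent.x[0][0] = 1.0
parent.obs["new"] = ["test"] * 5
parent.obs["old"] = ["new_value"] * 5

print(view.x[0][0])  # 1.0: the lazy field sees the change
print(view.obs)      # {'old': ['old_value', 'old_value', 'old_value']}
```

In this toy, the lazily computed field tracks the parent while the snapshot does not, which mirrors the X versus obs discrepancy reported here.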

I also quickly checked the docs and did not find a mention of this behavior.
What should be the next steps?

@ivirshup
Member

Thanks for opening this.

I suspect what we need to do is separate obs_names/var_names from the obs and var dataframes.

But, maybe this is possible without that. I would need to take a closer look into the code.
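
One way to picture that separation, as a rough plain-Python sketch under stated assumptions: the view stores its own row names at construction time, while the per-row table is looked up in the parent on each access. This is not a proposed patch and does not reflect anndata's internals; `NamedParent`, `LazyObsView`, and their attributes are hypothetical.

```python
class NamedParent:
    """Hypothetical parent: row names are stored apart from the row table."""

    def __init__(self, obs_names, obs):
        self.obs_names = list(obs_names)
        self.obs = obs  # dict mapping column name -> list aligned with obs_names


class LazyObsView:
    """Hypothetical view: names are fixed at creation, the table is read lazily."""

    def __init__(self, parent, rows):
        self._parent = parent
        self._rows = list(rows)
        # The names are known immediately, without touching the table.
        self.obs_names = [parent.obs_names[i] for i in self._rows]

    @property
    def obs(self):
        # Subset the parent's current table on every access.
        return {
            col: [values[i] for i in self._rows]
            for col, values in self._parent.obs.items()
        }


parent = NamedParent(
    obs_names=[str(i) for i in range(5)],
    obs={"old": ["old_value"] * 5},
)
view = LazyObsView(parent, range(3))

# Changes made to the parent after slicing are now visible through the view.
parent.obs["new"] = ["test"] * 5
parent.obs["old"] = ["new_value"] * 5

print(view.obs_names)  # ['0', '1', '2']
print(view.obs)        # {'old': ['new_value', 'new_value', 'new_value'], 'new': ['test', 'test', 'test']}
```

The trade-off in this sketch is that every access to `view.obs` re-subsets the parent's table, which should be cheap if the dataframes are small relative to the rest of the object.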

Prioritization

It's hard for me to say what the priority of this is. The dataframes should be relatively small compared to the rest of the AnnData.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!

@github-actions github-actions bot added the stale label Jun 11, 2023
@ivirshup ivirshup added Bug 🐛 and removed stale labels Jun 19, 2023
@github-actions

This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!

@github-actions github-actions bot added the stale label Aug 21, 2023