`set_clusters` should use string type for cluster labels for hdf5 dump #40

Closed
bobermayer opened this Issue Jan 18, 2018 · 2 comments


bobermayer commented Jan 18, 2018

Hi,
When setting clusters from an external data frame (e.g., Seurat output read with pandas, via vlm.set_clusters(my_annotation.loc[vlm.ca['CellID'],'cell.type'])), the dtype of vlm.cluster_labels is 'O', which cannot be dumped to HDF5 (TypeError: Object dtype dtype('O') has no native HDF5 equivalent).
Using vlm.cluster_labels=vlm.cluster_labels.astype(np.string_) fixes this problem, but it creates byte strings, which may not be optimal. Is this some Unicode compatibility problem?
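For readers hitting the same error, here is a minimal sketch of the workaround described above, using NumPy only. The label values are hypothetical stand-ins for a column read from an external annotation table. `np.bytes_` is used because it is the same type as the older alias `np.string_`, which was removed in NumPy 2.0. Note that this cast assumes ASCII-only labels.

```python
import numpy as np

# Hypothetical labels, standing in for an annotation column read with pandas
labels = np.array(["neuron", "glia", "neuron"], dtype=object)
print(labels.dtype.kind)  # object dtype, the kind the HDF5 dump rejects

# Cast to fixed-width byte strings (np.bytes_ is the current name of np.string_)
as_bytes = labels.astype(np.bytes_)
print(as_bytes.dtype)

# Byte strings can be decoded back to text after reading the file
as_text = np.char.decode(as_bytes, "utf-8")
print(as_text.tolist())
```

The byte-string array is what gets written; decoding after loading recovers ordinary Python strings, which addresses the concern about ending up with byte strings in memory.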

great package by the way!

@gioelelm gioelelm closed this in d450968 Jan 18, 2018

Member

gioelelm commented Jan 18, 2018

Thanks for reporting the bug. It is a fairly rare use case, but I fixed it anyway in v0.12.4.

> great package by the way!

Thank you!

yueqiw commented Jun 23, 2018

Hi,

I'm running into the same issue with vlm.set_clusters() using Scanpy output. The dtypes of vlm.cluster_labels and vlm.colorandum (hex color codes) are '<U2' and '<U7', and this causes an error in vlm.to_hdf5().

The following code fixes the problem:

if x.dtype.kind == 'U': x = x.astype(np.string_)
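The one-line fix above can be generalized into a small helper that covers both cases reported in this thread (object dtype and Unicode dtype). This is only a sketch: `to_hdf5_safe` is a hypothetical name and is not part of velocyto, and the example arrays are made up to mirror the '<U2' and '<U7' dtypes mentioned. It encodes with UTF-8 via `np.char.encode`, so non-ASCII labels should also survive.

```python
import numpy as np

def to_hdf5_safe(x):
    """Return x with text dtypes converted to fixed-width byte strings.

    Hypothetical helper for illustration; not part of velocyto.
    """
    x = np.asarray(x)
    if x.dtype.kind == "O":
        # Object arrays of Python strings: convert to a Unicode array first
        x = x.astype(str)
    if x.dtype.kind == "U":
        # Unicode arrays: encode each element as UTF-8 bytes
        x = np.char.encode(x, "utf-8")
    return x  # numeric and byte-string arrays pass through unchanged

# Made-up examples mirroring the dtypes reported above
cluster_labels = np.array(["0", "1", "10"])                # dtype '<U2'
colorandum = np.array(["#1f77b4", "#ff7f0e", "#2ca02c"])   # dtype '<U7'

print(to_hdf5_safe(cluster_labels).dtype)
print(to_hdf5_safe(colorandum).dtype)
```

Applying such a helper to each attribute just before the dump leaves the in-memory arrays as ordinary strings and converts only what is written.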