Manual preprocessing with Pandas>1.4+ #272

Closed
bpr4242 opened this issue May 29, 2023 · 3 comments
Labels
bug Something isn't working

bpr4242 commented May 29, 2023

Description of the bug

Hi Zewen! Great package, I have really been enjoying it as I prepare a pipeline for some mouse scRNA-seq with paired BCR/TCR libraries. I'm commenting in reference to #180, where you enforce pandas<1.5 in your Singularity environment: the issue seems to be that pandas>1.5 now raises a ValueError when a DataFrame index is defined by a set.

Minimal reproducible example

import dandelion as ddl
sample = "path/to/fasta"
ddl.pp.assign_isotypes(sample, org="mouse", plot=True, save_plot=True)

The error message produced by the code above

ValueError                                Traceback (most recent call last)
----> 1 ddl.pp.assign_isotypes((samples[0]),filename_prefix=bcr_filename_prefixes,org= "mouse", plot=True, save_plot= True)

~/mambaforge/envs/scrnaseq_env1/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py in ?(fastas, fileformat, org, correct_c_call, correction_dict, plot, save_plot, show_plot, figsize, blastdb, allele, filename_prefix, verbose)
    923 
    924     logg.info("Assign isotypes \n")
    925 
    926     for i in range(0, len(fastas)):
--> 927         assign_isotype(
    928             fastas[i],
    929             fileformat=fileformat,
    930             org=org,

~/mambaforge/envs/scrnaseq_env1/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py in ?(fasta, fileformat, org, evalue, correct_c_call, correction_dict, plot, save_plot, show_plot, figsize, blastdb, allele, filename_prefix, verbose)
    862     # move and rename
    863     move_to_tmp(fasta, filename_prefix)
    864     make_all(fasta, filename_prefix, loci="ig")
    865     rename_dandelion(fasta, filename_prefix, endswith=out_ex, subdir="tmp")
--> 866     update_j_multimap(fasta, filename_prefix)

~/mambaforge/envs/scrnaseq_env1/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py in ?(data, filename_prefix)
   6483             "support_multimappers",
   6484         ]
   6485         check_multimapper(filePath0, filePath2)
   6486         if filePath0 is not None:
-> 6487             jmulti = multimapper(filePath0)
   6488             if filePath1 is not None:
   6489                 dbpass = load_data(filePath1)
   6490                 for col in jmm_transfer_cols:

~/mambaforge/envs/scrnaseq_env1/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py in ?(filename)
   6383     df = pd.read_csv(filename, delimiter="\t")
   6384     df_new = df.loc[
   6385         df["j_support"] < 1e-3, :
   6386     ]  # maybe not needing to filter if j_support has already been filtered
-> 6387     mapped = pd.DataFrame(
   6388         index=set(df_new["sequence_id"]),
   6389         columns=[
   6390             "multimappers",

~/mambaforge/envs/scrnaseq_env1/lib/python3.9/site-packages/pandas/core/frame.py in ?(self, data, index, columns, dtype, copy)
    669         manager = get_option("mode.data_manager")
    670 
    671         # GH47215
    672         if index is not None and isinstance(index, set):
--> 673             raise ValueError("index cannot be a set")
    674         if columns is not None and isinstance(columns, set):
    675             raise ValueError("columns cannot be a set")
    676 

ValueError: index cannot be a set
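
The failing call in the traceback is pd.DataFrame(index=set(...)). Below is a minimal sketch of that pattern and one possible workaround, which is to convert the set to a sorted list before passing it as the index. The toy table df_new and the helper make_empty_frame are made up for illustration; this is not necessarily how dandelion itself resolved the problem.

```python
import pandas as pd

# Hypothetical stand-in for the filtered table seen in the traceback.
df_new = pd.DataFrame(
    {
        "sequence_id": ["c", "a", "b", "a"],
        "j_support": [1e-5, 1e-6, 1e-4, 1e-7],
    }
)


def make_empty_frame(ids, columns):
    """Build an empty frame indexed by the unique ids without passing a set."""
    # sorted() turns the set into a list, which also gives a stable row order.
    return pd.DataFrame(index=sorted(set(ids)), columns=columns)


# Reproduce the pattern from the traceback. Newer pandas rejects a set here,
# while older releases accepted it, so both outcomes are handled.
try:
    pd.DataFrame(index=set(df_new["sequence_id"]), columns=["multimappers"])
    set_index_accepted = True
except ValueError:
    set_index_accepted = False

mapped = make_empty_frame(df_new["sequence_id"], ["multimappers"])
print("set accepted as index:", set_index_accepted)
print(mapped)
```

Sorting is a design choice rather than a requirement: list(set(...)) would also avoid the error, but the row order would then depend on set iteration order.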

OS information

MacOS

Version information

dandelion==0.3.1 pandas==2.0.1 numpy==1.24.3 matplotlib==3.7.1 networkx==3.1 scipy==1.10.1

Additional context

No response

bpr4242 added the bug label May 29, 2023
bpr4242 (Author) commented May 29, 2023

For the time being I am cloning my conda env and downgrading pandas/numpy to confirm the problem is fixed.

zktuong (Owner) commented May 29, 2023

Hi @bpr4242, thanks for the interest in this package! Actually, I just updated the PyPI version to 0.3.2 yesterday, and it included an automatic fix by dependabot.

So if you reinstall and use pandas>=2, it should still work.
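
After reinstalling, it can help to confirm which versions the environment actually picked up. The sketch below uses only the standard library. The distribution name "sc-dandelion" is an assumption about how the package is published on PyPI, and both helper functions are hypothetical names introduced for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Extract the leading major number from a version such as '2.0.1'."""
    return int(version_string.split(".")[0])


# "sc-dandelion" is assumed to be the distribution name; adjust if it differs.
for name in ("pandas", "sc-dandelion"):
    print(name, installed_version(name))
```

If pandas reports a major version of 2 or higher alongside dandelion 0.3.2 or later, the environment matches the combination described in the comment above.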

bpr4242 (Author) commented May 30, 2023

Perfect! Working well now, thank you!

bpr4242 closed this as completed May 30, 2023