Error: "None of ['WhiteListMatch'] are in the columns" Traceback: #11

Closed
Selecton98 opened this issue May 23, 2024 · 10 comments

Comments

@Selecton98

Hello Gotcha authors,

I'm trying your genotyping algorithm and have encountered the following error:

"None of ['WhiteListMatch'] are in the columns"

Traceback:

  1. GotchaLabeling(path = "/lab-share/Public/data/DNV/barcode3/",
    . infile = "Genemeta.csv", gene_id = "Gene", sample_id = "RM30")
  2. reticulate::import_from_path(module = "gotcha_labeling", path = find.package("Gotcha"))$GotchaLabeling(path = path,
    . infile = infile, gene_id = gene_id, sample_id = sample_id)
  3. py_call_impl(callable, call_args$unnamed, call_args$named)

My file has a format matching your demo:
[image]

Could you please tell me which file should contain 'WhiteListMatch'?
Thank you,
Sunny

@fizzo13
Collaborator

fizzo13 commented May 23, 2024

Hi Sunny, thank you for using GoTChA. Can you try renaming the column 'X' to 'WhiteListMatch'?
Tagging @SanjayKottapalli for help with the python module.

@Selecton98
Author

Selecton98 commented May 23, 2024

Hello! Thank you for the reply. I tried changing 'X' to 'WhiteListMatch' in the list, but it didn't help.
@SanjayKottapalli

@SanjayKottapalli
Collaborator

Hi Sunny, I think the issue is actually with the "gene_id" and "sample_id" parameters that you set. Can you show me exactly how your own input .csv file looks?

@Selecton98
Author

Selecton98 commented May 24, 2024

Genemeta.csv
@SanjayKottapalli
Please find my file attached. I used "RM30" so that this column matches your demo (for easier debugging).
Thank you for your help!

@SanjayKottapalli
Collaborator

Hmm, I tried to read the file in python and it worked just fine for me. Can you try the following?

In an interactive python session:

  1. Navigate to the folder where you installed Gotcha, then cd into inst/python
  2. import gotcha_labeling
  3. test = gotcha_labeling.read_data("/lab-share/Public/data/DNV/barcode3/Genemeta.csv", "Gene", "RM30")
  4. print(test)

Let me know what output this gives and if any errors come up.

@Selecton98
Author

[image] @SanjayKottapalli The file is read successfully by "gotcha_labeling.read_data".

@fizzo13
Collaborator

fizzo13 commented May 28, 2024

@Selecton98 was the issue fixed?

@SanjayKottapalli
Collaborator

SanjayKottapalli commented May 28, 2024

Since it works when using Python only and fails when Python is called through the R pipeline, it could be a reticulate issue, @fizzo13, probably related to how dataframes are formatted.

@Selecton98 could you also try commenting/changing this code in your gotcha_labeling.py from

    try:
        genotyping = pd.DataFrame(index=cell_line.index)
        if sample_id in np.unique(cell_line['Sample'].values):
            cell_line = cell_line.loc[cell_line['Sample']==sample_id, :]
        genotyping['WTcount'] = cell_line[gene_id+'_WTcount']
        genotyping['MUTcount'] = cell_line[gene_id+'_MUTcount']
    except:
        cell_line = cell_line.set_index("WhiteListMatch")
        genotyping = pd.DataFrame(index=cell_line.index)
        genotyping['WTcount'] = cell_line['WTcount']
        genotyping['MUTcount'] = cell_line['MUTcount']

to only what's in the try block?

    genotyping = pd.DataFrame(index=cell_line.index)
    if sample_id in np.unique(cell_line['Sample'].values):
        cell_line = cell_line.loc[cell_line['Sample']==sample_id, :]
    genotyping['WTcount'] = cell_line[gene_id+'_WTcount']
    genotyping['MUTcount'] = cell_line[gene_id+'_MUTcount']

The error is strange because the try block shouldn't fail to begin with.
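
A possible reason the message names 'WhiteListMatch' rather than the real cause: a bare "except:" swallows whatever failed in the try block, and the fallback then raises its own error about a different column. The sketch below illustrates that masking effect with plain dictionaries instead of pandas. The function names and the toy table are hypothetical and are not taken from gotcha_labeling.py.

```python
def label_counts(table, gene_id):
    """Toy stand-in for the try/except pattern; `table` maps column names to lists."""
    try:
        # Primary path: expects per-gene count columns.
        return table[gene_id + "_WTcount"], table[gene_id + "_MUTcount"]
    except:  # bare except: the original failure is discarded here
        # Fallback path: expects a different layout with a WhiteListMatch column.
        table["WhiteListMatch"]
        return table["WTcount"], table["MUTcount"]


def label_counts_strict(table, gene_id):
    """Same lookup without a fallback, reporting exactly which columns are missing."""
    wanted = [gene_id + "_WTcount", gene_id + "_MUTcount"]
    missing = [name for name in wanted if name not in table]
    if missing:
        raise KeyError(f"missing columns {missing}; found {sorted(table)}")
    return table[wanted[0]], table[wanted[1]]


def error_message(func, *args):
    """Return the text of the KeyError raised by func, or None if it succeeds."""
    try:
        func(*args)
    except KeyError as exc:
        return str(exc)
    return None


# Hypothetical table that lacks the Gene_MUTcount column.
toy_table = {"Sample": ["RM30"], "Gene_WTcount": [3]}
print(error_message(label_counts, toy_table, "Gene"))
print(error_message(label_counts_strict, toy_table, "Gene"))
```

The first message mentions only 'WhiteListMatch', while the second names the column that is actually absent, which is why removing the except block (as suggested above) can make the underlying problem visible.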

@Selecton98
Author

Selecton98 commented May 29, 2024

@SanjayKottapalli
@fizzo13
Hi everyone,
I finally got this bug solved.

One problem is that the R "GotchaLabeling" does not take a "sample_column" argument, which the Python "GotchaLabeling" and "read_data" functions require. So I manually set "sample_column='Sample'" in each of these two functions.

Another problem is that the CSV file must include both an index column and a "WhiteListMatch" column. Both columns should contain the index/barcode information.
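
For anyone hitting the same error, a quick header check before running the pipeline may help. The following is a minimal sketch using only the standard library. The helper name "missing_columns" is hypothetical, and the list of expected columns is an assumption pieced together from this thread ('WhiteListMatch', the sample column, and the gene_id + '_WTcount' / '_MUTcount' names used in the quoted snippet), not an official specification of the Gotcha input format.

```python
import csv


def missing_columns(csv_path, gene_id, sample_column="Sample"):
    """Return the expected column names that are absent from the CSV header.

    The expected names are assumptions based on this thread, not a
    documented Gotcha requirement.
    """
    expected = [
        "WhiteListMatch",
        sample_column,
        f"{gene_id}_WTcount",
        f"{gene_id}_MUTcount",
    ]
    with open(csv_path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]
```

Calling missing_columns("/lab-share/Public/data/DNV/barcode3/Genemeta.csv", "Gene") would return an empty list if every assumed column is present, and otherwise the names that still need to be added.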

Thank you,
Sunny

@fizzo13
Collaborator

fizzo13 commented May 29, 2024

Thank you @SanjayKottapalli and @Selecton98 . I'll close the issue.

@fizzo13 fizzo13 closed this as completed May 29, 2024