IndexError when running MUSiCC on Windows and Linux #1

Open
zkstewart opened this issue Aug 16, 2017 · 3 comments

@zkstewart

Hi,

I was attempting to use MUSiCC to normalise read count data for a project, but I keep running into an error when using the '-perf' argument, regardless of which OS Python is installed on and regardless of whether I use my own data or the provided example data. Below are four examples of this error. The first two are from my home PC (Windows 10) using Anaconda3 (Python 3.6), the third is from a high-performance computing environment running SUSE with Anaconda3 (Python 3.6), and the last is from the same SUSE environment using Anaconda2 (Python 2.7).

Although I haven't shown the output below, I also ran these same commands with my own tab-delimited gene counts file and received the same error, so I do not believe the example data file is broken.

When I omit the '-perf' argument, everything runs to completion. As the traceback shows, the failure occurs while printing the model performance, so I believe the numpy data structure is being indexed incorrectly at that point (I am not familiar with numpy, so I cannot tell from the code exactly what is wrong).

This is using MUSiCC 1.0.2.

Thanks,
Zac


python C:\abbreviated_dir\Anaconda3\Scripts\run_musicc.py D:\abbreviated_dir\simulated_ko_relative_abundance.tab -o D:\abbreviated_dir\musicc.test.tab -n -perf -v -c learn_model
C:\abbreviated_dir\Anaconda3\lib\site-packages\sklearn\cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
Running MUSiCC...
Input: D:\abbreviated_dir\simulated_ko_relative_abundance.tab
Output: D:\abbreviated_dir\musicc.test.tab
Normalize: True
Correct: learn_model
Compute scores: True
Loading data using pandas module...
20 samples and 3573 genes
Done.
Performing MUSiCC Correction...
Learning sample-specific models
....................Done.
Model performance on various gene sets:
Traceback (most recent call last):
File "C:\abbreviated_dir\Anaconda3\Scripts\run_musicc.py", line 26, in <module>
correct_and_normalize(vars(given_args))
File "C:\abbreviated_dir\Anaconda3\lib\site-packages\musicc\core.py", line 395, in correct_and_normalize
print("Median R^2 across samples for all USCG:" + str(np.nanmedian(all_samples_mean_scores)[0]))
IndexError: invalid index to scalar variable.


run_musicc.py C:\abbreviated_dir\Anaconda3\Scripts\run_musicc.py D:\abbreviated_dir\simulated_ko_relative_abundance.tab -o D:\abbreviated_dir\musicc.test.tab -n -c use_generic -perf -v
C:\abbreviated_dir\Anaconda3\lib\site-packages\sklearn\cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
Running MUSiCC...
Input: D:\abbreviated_dir\simulated_ko_relative_abundance.tab
Output: D:\abbreviated_dir\musicc.test.tab
Normalize: True
Correct: use_generic
Compute scores: True
Loading data using pandas module...
20 samples and 3573 genes
Done.
Performing MUSiCC Correction...
Generic model intercept:1.0
Generic model coefficients:[-0.00509 -0.00189 -0.00031 0.005 0.00126 0.00005 0.00001 0.00006 -0.00016 -0.00016 -0.00048 -0.00099 -0.00062 -0.00413 0.00038 0.00006 -0.00061 -0.00386 -0.00092 0.00002 0.00006 0.00126
0.00009 0.00006 0.00015 -0.00199 -0.00026 -0.00222 -0.01525 -0.04291 -0.01742 0.00447 -0. 0.00001 0.00688]
Correcting samples using generic model
....................Done.
Model performance on various gene sets:
Traceback (most recent call last):
File "C:\abbreviated_dir\Anaconda3\Scripts\run_musicc.py", line 26, in <module>
correct_and_normalize(vars(given_args))
File "C:\abbreviated_dir\Anaconda3\lib\site-packages\musicc\core.py", line 395, in correct_and_normalize
print("Median R^2 across samples for all USCG:" + str(np.nanmedian(all_samples_mean_scores)[0]))
IndexError: invalid index to scalar variable.


python /abbreviated_dir/anaconda3/bin/run_musicc.py /abbreviated_dir/simulated_ko_relative_abundance.tab -o /abbreviated_dir/musicc_norm.tab -n -perf -v -c learn_model
/abbreviated_dir/anaconda3/lib/python3.6/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
Running MUSiCC...
Input: /abbreviated_dir/simulated_ko_relative_abundance.tab
Output: /abbreviated_dir/musicc_norm.tab
Normalize: True
Correct: learn_model
Compute scores: True
Loading data using pandas module...
20 samples and 3573 genes
Done.
Performing MUSiCC Correction...
Learning sample-specific models
....................Done.
Model performance on various gene sets:
Traceback (most recent call last):
File "/abbreviated_dir/anaconda3/bin/run_musicc.py", line 26, in <module>
correct_and_normalize(vars(given_args))
File "/abbreviated_dir/anaconda3/lib/python3.6/site-packages/musicc/core.py", line 395, in correct_and_normalize
print("Median R^2 across samples for all USCG:" + str(np.nanmedian(all_samples_mean_scores)[0]))
IndexError: invalid index to scalar variable.


python2.7 /abbreviated_dir/anaconda2/bin/run_musicc.py /abbreviated_dir/simulated_ko_relative_abundance.tab -o /abbreviated_dir/musicc_norm.tab -n -perf -v -c learn_model
/abbreviated_dir/anaconda2/lib/python2.7/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
Running MUSiCC...
Input: /abbreviated_dir/simulated_ko_relative_abundance.tab
Output: /abbreviated_dir/musicc_norm.tab
Normalize: True
Correct: learn_model
Compute scores: True
Loading data using pandas module...
20 samples and 3573 genes
Done.
Performing MUSiCC Correction...
Learning sample-specific models
....................Done.
Model performance on various gene sets:
Traceback (most recent call last):
File "/abbreviated_dir/anaconda2/bin/run_musicc.py", line 26, in <module>
correct_and_normalize(vars(given_args))
File "/abbreviated_dir/anaconda2/lib/python2.7/site-packages/musicc/core.py", line 395, in correct_and_normalize
print("Median R^2 across samples for all USCG:" + str(np.nanmedian(all_samples_mean_scores)[0]))
IndexError: invalid index to scalar variable.
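
For anyone reading the tracebacks above: the failing line indexes the result of `np.nanmedian(...)` with `[0]`. Called without an `axis` argument, `np.nanmedian` reduces over every element and returns a single NumPy scalar, which cannot be indexed. The snippet below is a minimal reproduction outside MUSiCC. The `scores` array is a made-up stand-in for `all_samples_mean_scores`, whose real contents are not shown in this thread.

```python
import numpy as np

# Hypothetical stand-in for all_samples_mean_scores (values are invented).
scores = np.array([0.91, 0.87, np.nan, 0.93])

# With no axis argument, nanmedian collapses the whole array to one scalar.
median = np.nanmedian(scores)
print(type(median).__name__, "ndim =", median.ndim)

# Indexing that scalar, as core.py line 395 does, raises the reported error.
error_message = None
try:
    median[0]
except IndexError as err:
    error_message = str(err)

print("IndexError:", error_message)
```

Because the failure comes from indexing the scalar rather than from the input data, it would be expected on any OS and Python version, which matches the four runs above.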

@omanor
Collaborator

omanor commented Aug 16, 2017

Thank you, Zac. I will try to look into this. I also saw another email you sent, but it did not open a new issue. Could you open a new issue so I can answer you there?

@jzrapp

jzrapp commented Nov 4, 2019

Hi @zkstewart and @omanor,

I receive the same error. Was there an answer or solution to this? I don't really understand what the software is trying to tell me.

@engal
Collaborator

engal commented Nov 7, 2019

Hi,

Thanks for notifying me that this was still an issue.

It looks like there was a bug when MUSiCC tried to print out performance metrics for learned models. I just released an update that should have addressed this issue.
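
The exact change in the update is not shown in this thread. As a rough sketch only, one way a line like the one in the traceback could be made safe is to use the scalar returned by `np.nanmedian` directly instead of indexing it. The helper name `format_median_r2` and the sample values are hypothetical and are not part of MUSiCC.

```python
import numpy as np


def format_median_r2(label, scores):
    """Build a report line with the NaN-ignoring median of `scores`.

    Hypothetical helper for illustration; not taken from MUSiCC's code.
    """
    # nanmedian already returns a scalar here, so no [0] index is needed.
    median = float(np.nanmedian(scores))
    return "Median R^2 across samples for {}: {}".format(label, median)


# Invented per-sample scores, including a NaN that nanmedian ignores.
example_scores = np.array([0.91, 0.87, np.nan, 0.93])
line = format_median_r2("all USCG", example_scores)
print(line)
```

`str.format` is used so the sketch runs on both Python 2.7 and Python 3.6, the two versions mentioned in the original report.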

Thanks for your interest in MUSiCC!
