java.lang.NullPointerException when running MS-GF+ #13

RiegardtJohnson · 2017-06-07T08:30:20Z

I ran an MS-GF+ (v2017.01.13) search using SearchGUI and received the following error while the output files were being generated:

Writing results...
java.lang.NullPointerException
at edu.ucsd.msjava.mzid.MZIdentMLGen.getDBSequence(MZIdentMLGen.java:661)
at edu.ucsd.msjava.mzid.MZIdentMLGen.getPeptideEvidenceList(MZIdentMLGen.java:619)
at edu.ucsd.msjava.mzid.MZIdentMLGen.addSpectrumIdentificationResults(MZIdentMLGen.java:347)
at edu.ucsd.msjava.ui.MSGFPlus.runMSGFPlus(MSGFPlus.java:397)
at edu.ucsd.msjava.ui.MSGFPlus.runMSGFPlus(MSGFPlus.java:106)
at edu.ucsd.msjava.ui.MSGFPlus.main(MSGFPlus.java:57)

The search otherwise finishes without reporting any errors, but no output .mzid files are generated. The command used to run the search was as follows:
ms-gf+ command:
/home/user/anaconda2/jre/bin/java -Xmx50g -jar /run/media/user/Data/rmj_proteomics/SearchGUI-3.2.18/resources/MS-GF+/MSGFPlus.jar -s /run/media/user/Data/rmj_proteomics/proteomics/RECONVERTED/RJ_FC2_DCE.mgf -d /run/media/user/Data/rmj_proteomics/TREMBL_database/nr_fungal/nr_fungal_concatenated_target_decoy.fasta -o /run/media/user/Data/rmj_proteomics/proteomics/nr_fungal_lin/.SearchGUI_temp/RJ_FC2_DCE.msgf.mzid -t 10.0ppm -tda 0 -mod /run/media/user/Data/rmj_proteomics/SearchGUI-3.2.18/resources/MS-GF+/params/Mods.txt -minCharge 2 -maxCharge 6 -inst 3 -thread 23 -m 3 -e 1 -ntt 2 -protocol 0 -minLength 8 -maxLength 45 -n 10 -addFeatures 0 -ti 0,4

Can you advise on how to resolve this error?

Kind regards,
Riegardt Johnson

alchemistmatt · 2017-06-07T20:31:38Z

That's useful information that you provided, but it's not enough for us to solve the problem. It may be related to the protein names or protein sequences in the FASTA file, but without the actual files, we won't be able to diagnose. Please send SearchGUI-3.2.18/resources/MS-GF+/params/Mods.txt along with a portion of the .mgf file (e.g. a sampling of 25 spectra from the middle of the scan range) to proteomics@pnnl.gov

alchemistmatt · 2017-06-07T20:35:04Z

Also, please provide us info on where you obtained the TREMBL nr_fungal FASTA file. It would also be helpful if you sent us a portion of your FASTA file, including both the normal proteins and the decoy proteins that you added. This will let us see the format you're using for protein names, descriptions, and sequences.

I'm going to guess you're using uniprot_trembl_fungi.dat.gz from ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/ but please confirm.

alchemistmatt · 2017-06-07T21:47:01Z

If you're using the full-size TREMBL nr_fungal FASTA file, I'm frankly surprised that MSGF+ is not running out of memory. We have found that when FASTA files get larger than ~800 MB, we get memory usage issues (in that the system requires 16 GB of memory or more, scaling with FASTA file size). In cases like that we split the FASTA file into multiple parts, run MSGF+ once on each FASTA file part, then merge the results together.

The May 2017 release of uniprot_trembl_fungi.dat has 6.6 million proteins, giving a 4 GB FASTA file. The decoy version of that is 8 GB. I see you're allocating 50 GB via java -Xmx50g, so hopefully that's enough memory, but I suggest you first get things working with a sample of that huge FASTA file. Something like head -5000000 uniprot_trembl_fungi.fasta > uniprot_trembl_fungi_excerpt.fasta
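
For the split-and-search approach described above, here is a minimal Python sketch of one way to cut a FASTA file into parts without breaking a protein entry in half. It is not an MS-GF+ or PNNL tool; the function names (read_entries, split_fasta), the output naming scheme, and the residue budget are all hypothetical, and how decoys are handled for each part is left out.

```python
from pathlib import Path


def read_entries(handle):
    """Yield (lines, residue_count) for each FASTA entry in an open text file."""
    lines, residues = [], 0
    for line in handle:
        is_header = line.startswith(">")
        if not lines and not is_header:
            continue  # skip anything before the first header line
        if is_header and lines:
            yield lines, residues
            lines, residues = [], 0
        lines.append(line)
        if not is_header:
            residues += len(line.strip())
    if lines:
        yield lines, residues


def split_fasta(fasta_path, out_dir, max_residues_per_part):
    """Write consecutive entries into part files, starting a new part when
    the next entry would push the current part past the residue budget.
    Returns the list of part paths that were written."""
    fasta_path, out_dir = Path(fasta_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts, out, used = [], None, 0
    with fasta_path.open() as src:
        for lines, residues in read_entries(src):
            part_full = used > 0 and used + residues > max_residues_per_part
            if out is None or part_full:
                if out is not None:
                    out.close()
                # Hypothetical naming scheme: <stem>_part001.fasta, ...
                part = out_dir / f"{fasta_path.stem}_part{len(parts) + 1:03d}.fasta"
                parts.append(part)
                out = part.open("w")
                used = 0
            out.writelines(lines)
            used += residues
    if out is not None:
        out.close()
    return parts
```

Each part would then be searched separately and the results merged afterwards. Unlike head, which can cut the last entry short, this keeps every entry whole.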

Stortebecker · 2017-09-19T21:38:27Z

I got a similar error when searching an mzML file that had undergone peak picking at the MS2 level with the OpenMS tool PeakPickerHiRes. When I instead use the vendor peak picking provided by MSConvert, MS-GF+ runs without any error.

You can find the database, the original file and the vendor-peak-picked file here. I uploaded the PeakPickerHiRes output to Dropbox.

The command I ran:
java -jar MSGFPlus.jar -s PeakPickerHiRes_on_qExactive01819.mzml -d Human_database_cRAP_added.fasta -t 10ppm

The error I got:

Loading database finished (elapsed time: 20,16 sec)
Reading spectra...
java.lang.NullPointerException
at edu.ucsd.msjava.msutil.Spectrum.getCharge(Spectrum.java:124)
at edu.ucsd.msjava.msutil.SpecKey.getSpecKeyList(SpecKey.java:91)
at edu.ucsd.msjava.ui.MSGFPlus.runMSGFPlus(MSGFPlus.java:220)
at edu.ucsd.msjava.ui.MSGFPlus.runMSGFPlus(MSGFPlus.java:105)
at edu.ucsd.msjava.ui.MSGFPlus.main(MSGFPlus.java:56)

hroest · 2017-12-07T21:45:29Z

@Stortebecker maybe this is related to OpenMS/OpenMS#3082

@alchemistmatt is it possible that MSGF+ relies on optional elements in the mzML file?

FarmGeek4Life · 2018-12-10T21:02:08Z

@Stortebecker That file has no charge state information for the precursors, which is what MS-GF+ is trying to read when it crashes. PeakPickerHiRes does not report the charge states, but as of 2014 there was work in progress to implement charge state determination/deconvolution algorithms as options in OpenMS, according to OpenMS issue #877.
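
To check a file for this before searching, something like the following Python sketch may help. It lists spectra that have a precursor element but no charge state cvParam anywhere inside them. The function names are hypothetical, and it assumes the charge is recorded with the PSI-MS accession MS:1000041 ("charge state"); files that only carry "possible charge state" entries would also be reported as missing.

```python
import xml.etree.ElementTree as ET

CHARGE_STATE = "MS:1000041"  # PSI-MS CV accession for "charge state"


def local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def spectra_missing_charge(mzml_path):
    """Return the ids of spectra that have a precursor but no charge state."""
    missing = []
    # Stream the file so that large mzML files are not loaded whole.
    for _, elem in ET.iterparse(mzml_path, events=("end",)):
        if local(elem.tag) != "spectrum":
            continue
        has_precursor = False
        has_charge = False
        for node in elem.iter():
            name = local(node.tag)
            if name == "precursor":
                has_precursor = True
            elif name == "cvParam" and node.get("accession") == CHARGE_STATE:
                has_charge = True
        if has_precursor and not has_charge:
            missing.append(elem.get("id"))
        elem.clear()  # release the memory held by this spectrum
    return missing
```

An empty list means every spectrum with a precursor also carries a charge state; a long list would be consistent with the crash in Spectrum.getCharge shown above.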

@RiegardtJohnson: This is a problem with how the search is implemented in MS-GF+, combined with a limitation of Java. Java uses a 32-bit signed integer as the index for an array, which limits an array to about 2.147 billion (2^31 - 1) entries; MS-GF+ accesses all peptides in the FASTA file in a way that means each residue is one entry in an array. Your database file, at 4 GB, is big enough to hit this limit in a target-only or decoy-only search; creating the concatenated target/decoy file for a combined search doubles the number of residues, which makes it worse.
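
As a rough way to see whether a database is near that limit, the Python sketch below counts the residues in a FASTA file and compares the total with Java's Integer.MAX_VALUE. The function names are hypothetical, and this is only an estimate: MS-GF+ may store extra characters per protein, so the practical ceiling could be somewhat lower than the raw residue count suggests.

```python
JAVA_MAX_ARRAY_INDEX = 2**31 - 1  # Java's Integer.MAX_VALUE


def count_residues(fasta_path):
    """Count sequence characters in a FASTA file, ignoring header lines."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                total += len(line.strip())
    return total


def exceeds_java_array_limit(residues, concatenated_decoy=False):
    """Return True if the residue count cannot fit in one Java array.

    With concatenated_decoy=True the count is doubled, since a combined
    target/decoy file holds one decoy sequence per target sequence.
    """
    effective = residues * 2 if concatenated_decoy else residues
    return effective > JAVA_MAX_ARRAY_INDEX
```

For example, a database with 1.5 billion residues fits on its own, but exceeds_java_array_limit(1_500_000_000, concatenated_decoy=True) returns True because the doubled count passes 2,147,483,647.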

Wang-kaifei · 2023-12-19T02:53:08Z

Dear Developers,

I ran into the same problem recently. I was using a 14 GB FASTA file, and from reading the replies in this thread I realised that I need to split the database and search each part separately.

Because a single MS/MS spectrum can be matched to different peptides in different searches, it seems to me that the results of these searches cannot simply be concatenated.

So I wonder if there is an official tool for merging the results from these sliced searches?

The command I use is: java -Xms150G -Xmx210G -jar MSGFPlus.jar -conf param_file_path
I am using the software version: MSGFPlus_v20230112

Any replies will be appreciated!

alchemistmatt · 2023-12-19T03:48:43Z

Use the MzidMerger to combine .mzid files from separate MS-GF+ searches of the same instrument file

Info:
- https://github.com/PNNL-Comp-Mass-Spec/MzidMerger
Download:
- https://github.com/PNNL-Comp-Mass-Spec/MzidMerger/releases

Wang-kaifei · 2023-12-19T10:59:26Z

Thanks a lot, I will try it!

Wang-kaifei · 2023-12-22T08:13:24Z

Dear,

I've got another problem.
When I use the command: dotnet /data/liuqingxiu/wkf/MSGFMerge/net5.0/MzidMerger.exe -inDir a -out b, I receive the following error:

Error:
An assembly specified in the application dependencies manifest (MzidMerger.deps.json) has already been found but with a different file extension:
package: 'MzidMerger', version: '1.3.1'
path: 'MzidMerger.dll'
previously found assembly: '/data/liuqingxiu/wkf/MSGFMerge/net5.0/MzidMerger.exe'

I'm using Ubuntu 20.04 with dotnet version 5.0.408.

Any replies will be appreciated!

FarmGeek4Life · 2023-12-22T08:51:04Z

I think you need to use "dotnet /data/liuqingxiu/wkf/MSGFMerge/net5.0/MzidMerger.dll -inDir a -outDir b". I can't say with certainty, but I know the .exe is designed to be run standalone, so it's probably not the correct file to pass to dotnet, and all of the online examples show the use of a .dll.

hbarsnes mentioned this issue Jun 9, 2017

java.lang.NullPointerException while running MSGF+ search compomics/searchgui#146

Closed

FarmGeek4Life added the enhancement label Dec 10, 2018

FarmGeek4Life mentioned this issue Sep 25, 2019

MS-GF+ search in SearchGUI failed #77

Open
