update peptides.rb to parse the peptide section of the dat file successf... #2

Merged
merged 1 commit into from Feb 24, 2012
+31 −4
@@ -42,7 +42,10 @@ def initialize(dat_file, byteoffset, cache_psm_index=true)
@byteoffset = byteoffset
@endbytepos = nil
- @file = File.new(dat_file,'r')
+ # was: @file = File.new(dat_file,'r')
+ # now: reuse the handle passed in; dat_file must be an already-open, seekable IO object
+ @file = dat_file
+
@file.pos = @byteoffset
@curr_psm = [1,1]
@psmidx = []
@@ -73,9 +76,10 @@ def index_psm_positions
rewind
end
-
def rewind
- @file.pos = @byteoffset
+ # was: @file.pos = @byteoffset
+ # @byteoffset points at the boundary string (e.g. gc0p4Jq0M2Yt08jU534c0p), so seek to the first PSM line instead
+ @file.pos = @psmidx[1][1]
end
def psm q,p
@@ -88,20 +92,43 @@ def next_psm
# get the initial values for query & rank
tmp = []
tmp << @file.readline.chomp
+
+ # ===========> added these 2 lines
+ k,v = tmp[0].split "="
+ return nil if v == "-1" # skip when there are no peptides (value equals -1)
+
tmp[0] =~ /q(\d+)_p(\d+)/
q = $1
p = $2
+
+ # ==============> added: track the file position just after the last line that belongs to this psm.
+ # The loop below detects the end of a psm only by reading the first q#{q}_p#{p} line of the next one,
+ # so without seeking back, the following next_psm call would miss the first line of that psm.
+ # ==============> added this line
+ tmp_pos = @file.pos
@file.each do |l|
break if l =~ @boundary
break unless l =~ /^q#{q}_p#{p}/
tmp << l.chomp
+ # ==============> added this line
+ tmp_pos = @file.pos
end
+ # ==============> added this line
+ @file.pos = tmp_pos
+
Mascot::DAT::PSM.parse(tmp)
end
def each
while @file.pos < @endbytepos
- yield next_psm()
+ #yield next_psm()
+ # ===========> changed to this block
+ psm = next_psm()
+ if psm.nil?
+ next # skip this entry: no peptides (value equals -1)
+ else
+ yield psm # hand the parsed psm to the caller
+ end
end
end
end
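
The two behaviours this change adds to `next_psm` and `each` (returning `nil` for a `-1` entry, and seeking back so the first line of the following PSM is not consumed) can be tried in isolation. The sketch below is not the gem's code: `read_psm`, `each_psm`, `SAMPLE`, and the field values in it are hypothetical, and a `StringIO` stands in for the open DAT file handle. It also anchors the key match with `[_=]` so that `q1_p1` does not match `q1_p10`, which is an assumption beyond what the diff does.

```ruby
require 'stringio'

# Made-up stand-in for a peptides section; the values are illustrative only.
SAMPLE = <<~DAT
  q1_p1=0,498.27,0.0012,2,LSDGK
  q1_p1_terms=K,A
  q1_p2=0,498.27,0.0020,1,ISDGK
  q2_p1=-1
  q3_p1=0,612.33,0.0005,3,AAGHK
  --gc0p4Jq0M2Yt08jU534c0p
DAT

BOUNDARY = /^--gc0p4Jq0M2Yt08jU534c0p/

# Read the lines of one PSM. Returns nil for a "-1" placeholder entry.
def read_psm(io)
  first = io.readline.chomp
  _key, value = first.split('=', 2)
  return nil if value == '-1'

  q, p = first.match(/\Aq(\d+)_p(\d+)/).captures
  lines = [first]
  resume_at = io.pos # position just after the last line that belongs to this PSM
  io.each_line do |line|
    break if line =~ BOUNDARY
    break unless line =~ /\Aq#{q}_p#{p}[_=]/
    lines << line.chomp
    resume_at = io.pos
  end
  io.pos = resume_at # un-read the line that ended the loop
  lines
end

# Yield every non-empty PSM located before end_pos.
def each_psm(io, end_pos)
  while io.pos < end_pos
    psm = read_psm(io)
    yield psm unless psm.nil?
  end
end

io = StringIO.new(SAMPLE)
end_pos = SAMPLE.index('--gc0p4Jq0M2Yt08jU534c0p')
each_psm(io, end_pos) { |psm| puts psm.first }
```

Removing the `io.pos = resume_at` line makes the `q1_p2` and `q3_p1` header lines disappear from the output's grouping, which is the symptom the added `tmp_pos` bookkeeping addresses.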