Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize dwarf line numbers decoding #7413

Merged
merged 7 commits into from Feb 15, 2019
78 changes: 60 additions & 18 deletions src/debug/dwarf/line_numbers.cr
Expand Up @@ -213,6 +213,12 @@ module Debug

# Decodes the compressed matrix of addresses to line numbers.
private def decode_sequences(size)
# Map names to indexes to avoid doing linear scans over @files and @directories
indexes = {
directory: {} of String => Int32,
filename: {} of String => Int32,
}

while (@io.tell - @offset) < size
sequence = Sequence.new

Expand All @@ -236,7 +242,7 @@ module Debug
read_filename_table(sequence)

if @io.tell - @offset < sequence.offset + sequence.total_length
read_statement_program(sequence)
read_statement_program(sequence, indexes)
end
end
end
Expand Down Expand Up @@ -277,7 +283,7 @@ module Debug
end

# TODO: support LNE::DefineFile (manually register file, uncommon)
private def read_statement_program(sequence)
private def read_statement_program(sequence, indexes)
registers = Register.new(sequence.default_is_stmt)

loop do
Expand All @@ -289,7 +295,7 @@ module Debug
operation_advance = adjusted_opcode / sequence.line_range
increment_address_and_op_index(operation_advance)
registers.line &+= sequence.line_base + (adjusted_opcode % sequence.line_range)
register_to_matrix(sequence, registers)
register_to_matrix(sequence, registers, indexes)
registers.reset
elsif opcode == 0
# extended opcode
Expand All @@ -299,7 +305,7 @@ module Debug
case extended_opcode
when LNE::EndSequence
registers.end_sequence = true
register_to_matrix(sequence, registers)
register_to_matrix(sequence, registers, indexes)
if (@io.tell - @offset - sequence.offset) < sequence.total_length
registers = Register.new(sequence.default_is_stmt)
else
Expand All @@ -324,7 +330,7 @@ module Debug

case standard_opcode
when LNS::Copy
register_to_matrix(sequence, registers)
register_to_matrix(sequence, registers, indexes)
registers.reset
when LNS::AdvancePc
operation_advance = DWARF.read_unsigned_leb128(@io)
Expand Down Expand Up @@ -363,15 +369,15 @@ module Debug

@current_sequence_matrix : Array(Row)?

private def register_to_matrix(sequence, registers)
private def register_to_matrix(sequence, registers, indexes)
file = sequence.file_names[registers.file]
path = sequence.include_directories[file[1]]

row = Row.new(
registers.address,
registers.op_index,
register_directory(path),
register_filename(file[0]),
register_directory(path, indexes[:directory]),
register_filename(file[0], indexes[:filename]),
registers.line.to_i,
registers.column.to_i,
registers.end_sequence
Expand All @@ -389,20 +395,56 @@ module Debug
end
end

private def register_filename(name)
if index = @files.index(name)
return index
# When decoding statement programs when asking for the current
# filename it's usually the case that it's the same as the old
# one (because all info from a single file comes first, then
# another file comes next, etc.). So we remember the last
# mapped index to avoid unnecessary lookups.
@last_filename : String?
@last_filename_index = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, if the files are read in some order (and not jumping from file to file, do @last_filename ever differ from @files[-1], (and similar for the index)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they differ. They are not read in order. I don't know why, though, I'm not familiar with the format. Maybe understanding the format first, then optimizing it might be better... 😊

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly, the format follows functions (symbols). An instruction group of the function refers to a file:line, then the following group likely refers to the same file at a further line... unless the compiler inlined code, which will cause jumps to another file, etc.

If I recall correctly, that is.


private def register_filename(name, filename_indexes)
if name.same?(@last_filename)
return @last_filename_index
end
@files << name
@files.size - 1

@last_filename = name

index = filename_indexes[name]? || @files.index(name)

unless index
index = @files.size
@files << name
filename_indexes[name] = index
end

@last_filename_index = index

index
end

private def register_directory(name)
if index = @directories.index(name)
return index
# Same logic as `@last_filename` but for directories
@last_directory : String?
@last_directory_index = 0

private def register_directory(name, directory_indexes)
if name.same?(@last_directory)
return @last_directory_index
end
@directories << name
@directories.size - 1

@last_directory = name

index = directory_indexes[name]? || @directories.index(name)

unless index
index = @directories.size
@directories << name
directory_indexes[name] = index
end

@last_directory_index = index

index
end
end
end
Expand Down