Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash collisions among data points in delay model #1525

Closed
hnpl opened this issue Aug 1, 2024 · 0 comments
Closed

Hash collisions among data points in delay model #1525

hnpl opened this issue Aug 1, 2024 · 0 comments
Labels
delay model Predicting/modeling delays of target (backend) processes

Comments

@hnpl
Copy link
Contributor

hnpl commented Aug 1, 2024

Describe the bug
Hash collisions among data points are causing fewer data points being reported than specified in samples.textproto.

The timing characterization client relies on hashing the sample specs to determine which sample's data point will be collected [1]. However, for the generated hashes, the element count and result element count of array ops are not taken into account [2]. This causes a problem that, if the samples of two arrays are the same except for the element counts, the result of one of them will be overwritten by the other.

Currently, the bug affects the data point generation of kArrayIndex, in which the sample [3] and sample [4] produce the same hash. Both have the same result widths and input operand widths, while the element counts are different. As a result, only one of the samples is being written to the data points output file.

Note that, the DataPoint proto will be translated to SampleSpec proto before the timing information is collected. The translation function [5] does not take into account the element counts and the result element counts as well. Note that the Operand proto does not have the result element count field [6].

I believe a fix for this would involve changing Operand proto.

To Reproduce
Follow the procedure for generating data points for any of the PDKs.

Expected behavior
[3] and [4] should be two distinct data points.

Additional context

[1]

spec_key = delay_model_utils.get_sample_spec_key(spec)

[2]
key = (
spec.op_samples.op
+ ': '
+ ', '.join([str(spec.point.result_width)] + bit_count_strs)
)
if spec.op_samples.specialization:
key = key + ' ' + str(spec.op_samples.specialization)
return key

[3]
samples {
result_width: 8
operand_widths: 8
operand_widths: 8
operand_element_counts {
operand_number: 0
element_counts: 256
}

[4]
samples {
result_width: 8
result_element_counts: 4
operand_widths: 8
operand_widths: 8
operand_element_counts {
operand_number: 0
element_counts: 4
element_counts: 256
}
}

[5]
def data_point_to_sample_spec(
data_point: delay_model_pb2.DataPoint,
) -> SampleSpec:
"""Converts a data point to the sample spec describing it."""
op_samples = delay_model_pb2.OpSamples()
op_samples.op = data_point.operation.op
if data_point.operation.specialization:
op_samples.specialization = data_point.operation.specialization
point = delay_model_pb2.Parameterization()
point.result_width = data_point.operation.bit_count
point.operand_widths.extend(
[operand.bit_count for operand in data_point.operation.operands]
)
return SampleSpec(op_samples, point)

[6]
message Operation {
// XLS Op (e.g., kAdd).
string op = 1;
// Number of bits of the result of the operation. For array-types this is the
// number of bits in each element.
int64 bit_count = 2;
// Operands.
message Operand {
// Number of bits of the operand. For array types this is the number of bits
// in each element.
int64 bit_count = 1;
// If the operand is array-typed, this is the number of elements in the
// array.
int64 element_count = 2;
}
repeated Operand operands = 3;
// The category of specialization of this operation, if any. For example, if
// the operands of the xls::Node represented by this proto are all identical,
// this field would be OPERANDS_IDENTICAL.
SpecializationKind specialization = 4;
message LiteralOperandInstanceDetails {
repeated int64 literal_operand = 1;
repeated int64 nonliteral_operand = 2;
}
LiteralOperandInstanceDetails literal_operand_details = 5;
}

@cdleary cdleary added the delay model Predicting/modeling delays of target (backend) processes label Aug 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
delay model Predicting/modeling delays of target (backend) processes
Projects
None yet
Development

No branches or pull requests

2 participants