[Integration] Decimal test files are huge #38745

Closed
pitrou opened this issue Nov 16, 2023 · 0 comments · Fixed by #38746

pitrou commented Nov 16, 2023

### Describe the bug, including details regarding any error messages, version, and platform.

For some reason, the decimal files generated by the integration test suite have as many record batches as columns.
This looks like a programming error rather than something deliberate:

def generate_decimal128_case():
    fields = [
        DecimalField(name='f{}'.format(i), precision=precision, scale=2,
                     bit_width=128)
        for i, precision in enumerate(range(3, 39))
    ]
    possible_batch_sizes = 7, 10
    batch_sizes = [possible_batch_sizes[i % 2] for i in range(len(fields))]
    # 'decimal' is the original name for the test, and it must match
    # provide "gold" files that test backwards compatibility, so they
    # can be appropriately skipped.
    return _generate_file('decimal', fields, batch_sizes)


def generate_decimal256_case():
    fields = [
        DecimalField(name='f{}'.format(i), precision=precision, scale=5,
                     bit_width=256)
        for i, precision in enumerate(range(37, 70))
    ]
    possible_batch_sizes = 7, 10
    batch_sizes = [possible_batch_sizes[i % 2] for i in range(len(fields))]
    return _generate_file('decimal256', fields, batch_sizes)

See in particular:

batch_sizes = [possible_batch_sizes[i % 2] for i in range(len(fields))]
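
To make the effect of that line concrete, here is a minimal standalone sketch that reproduces only the list comprehension. It assumes, as the issue describes, that each entry in `batch_sizes` results in one record batch; `batch_sizes_for` is a hypothetical helper introduced for illustration and is not part of Archery.

```python
# Same tuple as in the generator functions quoted above.
possible_batch_sizes = 7, 10


def batch_sizes_for(num_fields):
    """Hypothetical helper: reproduce the comprehension from the issue."""
    return [possible_batch_sizes[i % 2] for i in range(num_fields)]


# Only the number of fields matters here, so the precision ranges
# from the two generator functions stand in for the field lists.
num_decimal128_fields = len(range(3, 39))
num_decimal256_fields = len(range(37, 70))

decimal128_batches = batch_sizes_for(num_decimal128_fields)
decimal256_batches = batch_sizes_for(num_decimal256_fields)

# One batch-size entry per column, alternating between 7 and 10 rows.
print(num_decimal128_fields, len(decimal128_batches), sum(decimal128_batches))
print(num_decimal256_fields, len(decimal256_batches), sum(decimal256_batches))
```

Under that assumption, the number of record batches grows with the number of columns, which would explain why the generated files are so large.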

### Component(s)

Archery, Integration

pitrou added a commit that referenced this issue Nov 16, 2023
### Rationale for this change

For some reason, the decimal files generated by the integration test suite have as many record batches as columns.
This looks like a programming error rather than something deliberate.

### What changes are included in this PR?

Use similar batch sizes as for other test files.
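
As a rough illustration only, the sketch below contrasts the per-column batch list with a fixed pair of batch sizes. The fixed pair `(7, 10)` and both function names are assumptions made for this example; the actual change in the linked pull request may differ.

```python
possible_batch_sizes = 7, 10


def per_column_batch_sizes(num_fields):
    # Behaviour reported in the issue: one batch size per field.
    return [possible_batch_sizes[i % 2] for i in range(num_fields)]


def fixed_batch_sizes():
    # Assumed shape of a fix: a constant number of batches,
    # independent of how many fields the schema has.
    return list(possible_batch_sizes)


num_fields = len(range(3, 39))  # field count of the decimal128 case
before = per_column_batch_sizes(num_fields)
after = fixed_batch_sizes()

print("batches before:", len(before), "rows:", sum(before))
print("batches after:", len(after), "rows:", sum(after))
```

With a fixed list, adding more decimal columns would no longer add more record batches to the generated file.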

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: #38745

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
@pitrou pitrou added this to the 15.0.0 milestone Nov 16, 2023
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024