Skip to content
This repository has been archived by the owner on Feb 1, 2024. It is now read-only.

Fix handling of numeric cells in Excel workbooks #1663

Merged
merged 1 commit into from Feb 21, 2022

Conversation

jwalgran
Copy link
Contributor

@jwalgran jwalgran commented Feb 18, 2022

Overview

With the introduction of number_of_workers we now have a column that can have a numeric value. This presented 2 challenges to the existing Python code.

  • Excel parsing library returns these as numeric types, but the code was expecting a string.
  • A value that appears as 130 in the workbook can appear as a float 130.0 after parsing.

This commit attempts to fix both of these issues by first manually casing cell values to strings when parsing the workbook, then using an int(parse(x)) nested cast in the extract_int_range_value function to ensure that the decimal portion is removed.

Connects #1638

Testing Instructions

  • Check out develop, log in as c2@example.com, submit
    extended-template-test.xlsx. Verify that an error blocks submission.
  • Check out bugfix/jvw/handle-numeric-excel-cells and attempt to submit the file again. Verify that it succeeds and that ./tools/batch_prcess {is} completes successfully.

Checklist

  • fixup! commits have been squashed
  • CI passes after rebase
  • CHANGELOG.md updated with summary of features or fixes, following Keep a Changelog guidelines

@jwalgran jwalgran marked this pull request as ready for review February 21, 2022 16:14
Copy link
Contributor

@TaiWilkin TaiWilkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this both with the provided list, and the list Embedded.Map.CSV.Template.-.boohoo.xlsx from the issue. Both lists broke on develop, but were able to be uploaded and processed successfully on this branch.
Thanks for the clear write-up and code comments explaining the fix for this!

@TaiWilkin TaiWilkin assigned jwalgran and unassigned TaiWilkin Feb 21, 2022
With the introduction of number_of_workers we now have a column that can have a
numeric value. This presented 2 challenges to the existing Python code.

- Excel parsing library returns these as numeric types, but the code was
expecting a string.
- A value that appears as 130 in the workbook can appear as a float 130.0 after
parsing.

This commit attempts to fix both of these issues by first manually casing cell
values to strings when parsing the workbook, then using an `int(parse(x))`
nested cast in the `extract_int_range_value` function to ensure that the decimal
portion is removed.
@jwalgran jwalgran force-pushed the bugfix/jvw/handle-numeric-excel-cells branch from 0909380 to f6a0aaa Compare February 21, 2022 17:23
@jwalgran
Copy link
Contributor Author

Thank your for the quick review turnaround.

@jwalgran jwalgran merged commit 4e4119d into develop Feb 21, 2022
@jwalgran jwalgran deleted the bugfix/jvw/handle-numeric-excel-cells branch April 15, 2022 15:06
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants