Bug: Inconsistent Calculated Column Formula when applying columns[:].formula using table range references #1015

philippefutureboy · 2023-10-06T20:07:21Z

Current behavior

When writing a Table which is comprised of columns that use formulas thar refer to other columns by their named range reference, the resulting Excel file does not load the formulas to the table and highlights all the affected cells with the error "Inconsistent Calculated Column Formula".
This requires manual correction in the file itself by a person for the formulas to work.

Furthermore, if a second table refers to one of the columns using a formula in the first table, on opening the file, Excel opens a popup that says the file is broken:

"We found a problem with some content in 'file.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes."

When that happens the formulas in the second tables are just completely removed, and need to be manually inserted instead of restored using the contextual menu on each cell.

See loom: https://www.loom.com/share/23fb24892cf54ec9bfe80d6bbe7af776

This issue prevents us from offering automated, high quality excel reports to our clients as it requires manual intervention on the file to correct the error-highlighted formulas. If this is solved, that is the last hurdle for us to offer automated excel reporting as a reverse ELT analysis.

Expected behavior

We expect formulas that refer to other named ranges in the table (or in another named table) to work properly without manual intervention.

Sample code to reproduce

from pathlib import Path
from textwrap import dedent
from typing import Dict

import pandas as pd

PATH = Path(__file__).parent / "bug.xlsx"

def main():
    data = [
        {"index": 0, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"02971\"","amount_charged":225.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 1, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23222\"","amount_charged":263.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 2, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"33300\"","amount_charged":1550.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 3, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":354.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 4, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":177.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 5, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 6, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 7, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 8, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 9, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01130\"","amount_charged":72.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0}
    ]

    with pd.ExcelWriter(PATH) as writer:
        columns = {
            "index": {
                "header": "Index",
            },
            "visible": {
                "header": "Visible",
                "formula": "=SUBTOTAL(3, [@[Index]])"
            },
            "schedule_track_department_label": {
                "header": "Schedule dept.",
                "total_string": "TOTAL"
            },
            "responsible_label": {
                "header": "Assigned Dentist"
            },
            "procedure_key": {
                "header": "Code",
            },
            "amount_charged": {
                "header": "Charges",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "total_function": "sum"
            },
            "amount_paid_correction": {
                "header": "Corrected dentist pay amount",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "formula": dedent("""
                    =(
                        Details[@[Charges]] *
                        Details[@[Corrected percentage]]
                    )
                """),
                "total_function": "sum"
            },
            "amount_paid_percentage_correction": {
                "header": "Corrected percentage",
                "format": writer.book.add_format({"num_format": "0%"}),
                "formula": dedent("""
                    =IF(
                        AND(
                            Details[@[Schedule dept.]] = "Hygiene",
                            IF(
                                IFERROR(MATCH(Details[@Code], {"01110", "01120", "01130", "01200", "01250"}, 0), FALSE) = FALSE,
                                FALSE,
                                TRUE
                            )
                        ),
                        0.25,
                        0.35
                    )
                """)
            },
        }
        # Creating sheets before printing
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 1")

        # Printing data to file
        worksheet = writer.sheets["Sheet 1"]

        # cell_range[2] -> +1: Adds a row for rollup row
        cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

        worksheet.add_table(
            *cell_range,
            {
                "name": "Details",
                "autofilter": True,
                "style": "Table Style Medium 9",
                "total_row": True,
                "columns": [col for col in columns.values()],
                "data": [
                    [v for v in record.values()]
                    for record in data
                ],
            },
        )


if __name__ == "__main__":
    main()

Environment

- XlsxWriter version: 3.1.0
- Python version: 3.8.7
- Excel version: Microsoft Excel for Mac Version 16.77.1 (23091703), Microsoft 365 Subscription
- OS: MacOS Monterey 12.6.7

Any other information

OpenOffice and LibreOffice users

I have tested the output file with Excel.

jmcnamara · 2023-10-06T21:10:24Z

Thanks for the detailed report.

The main issue is that the data from the data in add_table() is overwriting the column formulas. That is a bug in XlsxWriter which I will fix shortly.

In the meantime you can move the data writing out of add_table(). Like this:

from pathlib import Path
from textwrap import dedent
from typing import Dict

import pandas as pd

PATH = Path(__file__).parent / "bug.xlsx"

def main():
    data = [
        {"index": 0, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"02971\"","amount_charged":225.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 1, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23222\"","amount_charged":263.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 2, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"33300\"","amount_charged":1550.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 3, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":354.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 4, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":177.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 5, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 6, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 7, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 8, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 9, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01130\"","amount_charged":72.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0}
    ]

    with pd.ExcelWriter(PATH) as writer:
        columns = {
            "index": {
                "header": "Index",
            },
            "visible": {
                "header": "Visible",
                "formula": "=SUBTOTAL(3, [@[Index]])"
            },
            "schedule_track_department_label": {
                "header": "Schedule dept.",
                "total_string": "TOTAL"
            },
            "responsible_label": {
                "header": "Assigned Dentist"
            },
            "procedure_key": {
                "header": "Code",
            },
            "amount_charged": {
                "header": "Charges",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "total_function": "sum"
            },
            "amount_paid_correction": {
                "header": "Corrected dentist pay amount",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "formula": dedent("""
                    =(
                        Details[@[Charges]] *
                        Details[@[Corrected percentage]]
                    )
                """),
                "total_function": "sum"
            },
            "amount_paid_percentage_correction": {
                "header": "Corrected percentage",
                "format": writer.book.add_format({"num_format": "0%"}),
                "formula": dedent("""
                    =IF(
                        AND(
                            Details[@[Schedule dept.]] = "Hygiene",
                            IF(
                                IFERROR(MATCH(Details[@Code], {"01110", "01120", "01130", "01200", "01250"}, 0), FALSE) = FALSE,
                                FALSE,
                                TRUE
                            )
                        ),
                        0.25,
                        0.35
                    )
                """)
            },
        }
        # Creating sheets before printing
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 1")

        # Printing data to file
        worksheet = writer.sheets["Sheet 1"]

        # cell_range[2] -> +1: Adds a row for rollup row
        cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

        for row_num, record in enumerate(data):
            values = [v for v in record.values()]
            worksheet.write_row(row_num + 1, 0, values)

        worksheet.add_table(
            *cell_range,
            {
                "name": "Details",
                "autofilter": True,
                "style": "Table Style Medium 9",
                "total_row": True,
                "columns": [col for col in columns.values()],
            },
        )


if __name__ == "__main__":
    main()

Which should give the expected result:

The difference in the programs is:

85a86,89
>         for row_num, record in enumerate(data):
>             values = [v for v in record.values()]
>             worksheet.write_row(row_num + 1, 0, values)
>
94,97d97
<                 "data": [
<                     [v for v in record.values()]
<                     for record in data
<                 ],

philippefutureboy · 2023-10-07T00:00:49Z

Hi John @jmcnamara!

Fantastic, thank you so much for such a fast answer! I confirm this workaround solves the issue in the minimal code sample.
I'll have to do quite a bit of work to integrate in my actual code, but that's on my end! :)

Do you have someplace I can donate you a few bucks as a mark of gratitude?

Issue #1015

jmcnamara · 2023-10-07T20:06:12Z

I'll have to do quite a bit of work to integrate in my actual code

I've pushed a fix to main that allows the first, unmodified, example to work as expected. So no modification required on your end. You can try it when you get a chance and if everything it okay I'll wrap it into a release.

Do you have someplace I can donate you a few bucks as a mark of gratitude?

There is a Paypal link on the GitHub page but it isn't necessary. :-)

philippefutureboy · 2023-10-09T15:30:19Z

Hi @jmcnamara !

I've pushed a fix to main that allows the first, unmodified, example to work as expected. So no modification required on your end. You can try it when you get a chance and if everything it okay I'll wrap it into a release.

I've installed XlsxWriter @ 0401664 in my project, and used your solution successfully for single-Table Excel files.
This being said, for cases where there is a second Table which contains a column that uses a formula referring to the first Table, the file cannot be opened and crashes Excel (across all files that were opened).
N.B. that the workaround (print the data separatedly) results in the same issue as well.

You can find an updated code sample to reproduce the issue here:

from pathlib import Path
from textwrap import dedent
from typing import Dict

import pandas as pd

PATH = Path(__file__).parent / "bug.xlsx"

def main():
    with pd.ExcelWriter(PATH) as writer:
        # Creating sheets before printing
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 1")
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 2")

        # Writing sheets content
        sheet1(writer, writer.sheets["Sheet 1"])
        sheet2(writer, writer.sheets["Sheet 2"])

# same code as initial sample
def sheet1(writer, worksheet):
    data = [
        {"index": 0, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"02971\"","amount_charged":225.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 1, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23222\"","amount_charged":263.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 2, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"33300\"","amount_charged":1550.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 3, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":354.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 4, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":177.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 5, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 6, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 7, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 8, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 9, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01130\"","amount_charged":72.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0}
    ]

    columns = {
        "index": {
            "header": "Index",
        },
        "visible": {
            "header": "Visible",
            "formula": "=SUBTOTAL(3, [@[Index]])"
        },
        "schedule_track_department_label": {
            "header": "Schedule dept.",
            "total_string": "TOTAL"
        },
        "responsible_label": {
            "header": "Assigned Dentist"
        },
        "procedure_key": {
            "header": "Code",
        },
        "amount_charged": {
            "header": "Charges",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "total_function": "sum"
        },
        "amount_paid_correction": {
            "header": "Corrected dentist pay amount",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "formula": dedent("""
                =(
                    Details[@[Charges]] *
                    Details[@[Corrected percentage]]
                )
            """),
            "total_function": "sum"
        },
        "amount_paid_percentage_correction": {
            "header": "Corrected percentage",
            "format": writer.book.add_format({"num_format": "0%"}),
            "formula": dedent("""
                =IF(
                    AND(
                        Details[@[Schedule dept.]] = "Hygiene",
                        IF(
                            IFERROR(MATCH(Details[@Code], {"01110", "01120", "01130", "01200", "01250"}, 0), FALSE) = FALSE,
                            FALSE,
                            TRUE
                        )
                    ),
                    0.25,
                    0.35
                )
            """)
        },
    }

    # cell_range[2] -> +1: Adds a row for rollup row
    cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

    # for row_num, record in enumerate(data):
    #     values = [v for v in record.values()]
    #     worksheet.write_row(row_num + 1, 0, values)

    worksheet.add_table(
        *cell_range,
        {
            "name": "Details",
            "autofilter": True,
            "style": "Table Style Medium 9",
            "total_row": True,
            "columns": [col for col in columns.values()],
            "data": [[v for v in record.values()] for record in data]
        },
    )

# new code
def sheet2(writer, worksheet):
    data = [
        {"responsible_label":"Dr Jeanette Valencia","amount_charged": 0,"amount_paid_correction":0},
        {"responsible_label":"Dr Alexis Morgan","amount_charged": 0,"amount_paid_correction":0},
        {"responsible_label":"Dr Jeanette Valencia","amount_charged": 0,"amount_paid_correction":0},
        {"responsible_label":"Dr Stephanie Jackson","amount_charged": 0,"amount_paid_correction":0},
    ]

    columns = {
        "responsible_label": {
            "header": "Assigned Dentist"
        },
        "amount_charged": {
            "header": "Charges",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "formula": dedent("""
                =IFERROR(
                    SUM(
                        FILTER(
                            Details[Charges],
                            (Details[Visible]=1) *
                            (
                                Details[Assigned dentist] = 
                                Summary[@[Assigned dentist]]
                            )
                        )
                    ),
                    0
                )
            """),
            "total_function": "sum"
        },
        "amount_paid_correction": {
            "header": "Corrected dentist pay amount",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "formula": dedent("""
                =IFERROR(
                    SUM(
                        FILTER(
                            Details[Corrected dentist pay amount],
                            (Details[Visible]=1) *
                            (
                            Details[Assigned dentist] = 
                            Summary[@[Assigned dentist]]
                            )
                        )
                    ),
                    0
                )
            """),
            "total_function": "sum"
        },
    }

    # cell_range[2] -> +1: Adds a row for rollup row
    cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

    # for row_num, record in enumerate(data):
    #     values = [v for v in record.values()]
    #     worksheet.write_row(row_num + 1, 0, values)

    worksheet.add_table(
        *cell_range,
        {
            "name": "Summary",
            "autofilter": True,
            "style": "Table Style Medium 9",
            "total_row": True,
            "columns": [col for col in columns.values()],
            "data": [[v for v in record.values()] for record in data]
        },
    )

if __name__ == "__main__":
    main()

And the crash report from Excel here:

https://gist.github.com/philippefutureboy/986a5d0951fdbe9b85b3783f0bcc6b07

There is a Paypal link on the GitHub page but it isn't necessary. :-)

Great, it will be my pleasure still :)

Issue #1015

jmcnamara · 2023-10-09T16:44:28Z

This being said, for cases where there is a second Table which contains a column that uses a formula referring to the first Table, the file cannot be opened and crashes Excel (across all files that were opened).

The issue there is that Filter is a Future Function in Excel and should be written as _xlfn._xlws.FILTER. XlsxWriter generally handles this automatically for cell formulas but misses it for the internal copy used for tables. I've fixed that on main and you can try it out. Here is the output from your sample program after the change:

Also note, you can/should omit the = at the start of the formula. XlsxWriter usually removes it but there is also leading whitespace/newline and that is deliberately maintained (although in your case you probably don't intend it (?). Also the Filter formula refers to Assigned dentist but that should be sentence case Assigned Dentist. Neither of these affect Excel but it is best to fix them in case something else has an issue with them.

philippefutureboy · 2023-10-09T18:24:41Z

Fantastic, that works! 🎉
On my end this issue is resolved! Thanks for your prompt & positive support in this matter!!

Issue #1015

jmcnamara · 2023-10-09T19:21:53Z

Fixed upstream in version v3.1.7. Thanks for the report.

philippefutureboy added the bug label Oct 6, 2023

jmcnamara self-assigned this Oct 6, 2023

jmcnamara added a commit that referenced this issue Oct 7, 2023

table: fix issue where column formulas were overwritten by table data

47eaf27

Issue #1015

jmcnamara added a commit that referenced this issue Oct 9, 2023

table: fix escaping of future funtions

c32b2a1

Issue #1015

jmcnamara added a commit that referenced this issue Oct 9, 2023

table: fix escaping of future funtions

878affa

Issue #1015

jmcnamara closed this as completed Oct 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug: Inconsistent Calculated Column Formula when applying columns[:].formula using table range references #1015

Bug: Inconsistent Calculated Column Formula when applying columns[:].formula using table range references #1015

philippefutureboy commented Oct 6, 2023 •

edited

jmcnamara commented Oct 6, 2023

philippefutureboy commented Oct 7, 2023

jmcnamara commented Oct 7, 2023

philippefutureboy commented Oct 9, 2023 •

edited

jmcnamara commented Oct 9, 2023

philippefutureboy commented Oct 9, 2023

jmcnamara commented Oct 9, 2023

Bug: Inconsistent Calculated Column Formula when applying columns[:].formula using table range references #1015

Bug: Inconsistent Calculated Column Formula when applying columns[:].formula using table range references #1015

Comments

philippefutureboy commented Oct 6, 2023 • edited

Current behavior

Expected behavior

Sample code to reproduce

Environment

Any other information

OpenOffice and LibreOffice users

jmcnamara commented Oct 6, 2023

philippefutureboy commented Oct 7, 2023

jmcnamara commented Oct 7, 2023

philippefutureboy commented Oct 9, 2023 • edited

jmcnamara commented Oct 9, 2023

philippefutureboy commented Oct 9, 2023

jmcnamara commented Oct 9, 2023

philippefutureboy commented Oct 6, 2023 •

edited

philippefutureboy commented Oct 9, 2023 •

edited