Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Inconsistent Calculated Column Formula when applying columns[:].formula using table range references #1015

Closed
1 task done
philippefutureboy opened this issue Oct 6, 2023 · 7 comments
Assignees
Labels

Comments

@philippefutureboy
Copy link

philippefutureboy commented Oct 6, 2023

Current behavior

When writing a Table which is comprised of columns that use formulas thar refer to other columns by their named range reference, the resulting Excel file does not load the formulas to the table and highlights all the affected cells with the error "Inconsistent Calculated Column Formula".
This requires manual correction in the file itself by a person for the formulas to work.

Furthermore, if a second table refers to one of the columns using a formula in the first table, on opening the file, Excel opens a popup that says the file is broken:

"We found a problem with some content in 'file.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes."

When that happens the formulas in the second tables are just completely removed, and need to be manually inserted instead of restored using the contextual menu on each cell.

See loom: https://www.loom.com/share/23fb24892cf54ec9bfe80d6bbe7af776

This issue prevents us from offering automated, high quality excel reports to our clients as it requires manual intervention on the file to correct the error-highlighted formulas. If this is solved, that is the last hurdle for us to offer automated excel reporting as a reverse ELT analysis.

Expected behavior

We expect formulas that refer to other named ranges in the table (or in another named table) to work properly without manual intervention.

Sample code to reproduce

from pathlib import Path
from textwrap import dedent
from typing import Dict

import pandas as pd

PATH = Path(__file__).parent / "bug.xlsx"

def main():
    data = [
        {"index": 0, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"02971\"","amount_charged":225.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 1, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23222\"","amount_charged":263.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 2, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"33300\"","amount_charged":1550.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 3, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":354.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 4, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":177.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 5, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 6, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 7, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 8, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 9, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01130\"","amount_charged":72.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0}
    ]

    with pd.ExcelWriter(PATH) as writer:
        columns = {
            "index": {
                "header": "Index",
            },
            "visible": {
                "header": "Visible",
                "formula": "=SUBTOTAL(3, [@[Index]])"
            },
            "schedule_track_department_label": {
                "header": "Schedule dept.",
                "total_string": "TOTAL"
            },
            "responsible_label": {
                "header": "Assigned Dentist"
            },
            "procedure_key": {
                "header": "Code",
            },
            "amount_charged": {
                "header": "Charges",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "total_function": "sum"
            },
            "amount_paid_correction": {
                "header": "Corrected dentist pay amount",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "formula": dedent("""
                    =(
                        Details[@[Charges]] *
                        Details[@[Corrected percentage]]
                    )
                """),
                "total_function": "sum"
            },
            "amount_paid_percentage_correction": {
                "header": "Corrected percentage",
                "format": writer.book.add_format({"num_format": "0%"}),
                "formula": dedent("""
                    =IF(
                        AND(
                            Details[@[Schedule dept.]] = "Hygiene",
                            IF(
                                IFERROR(MATCH(Details[@Code], {"01110", "01120", "01130", "01200", "01250"}, 0), FALSE) = FALSE,
                                FALSE,
                                TRUE
                            )
                        ),
                        0.25,
                        0.35
                    )
                """)
            },
        }
        # Creating sheets before printing
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 1")

        # Printing data to file
        worksheet = writer.sheets["Sheet 1"]

        # cell_range[2] -> +1: Adds a row for rollup row
        cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

        worksheet.add_table(
            *cell_range,
            {
                "name": "Details",
                "autofilter": True,
                "style": "Table Style Medium 9",
                "total_row": True,
                "columns": [col for col in columns.values()],
                "data": [
                    [v for v in record.values()]
                    for record in data
                ],
            },
        )


if __name__ == "__main__":
    main()

Environment

- XlsxWriter version: 3.1.0
- Python version: 3.8.7
- Excel version: Microsoft Excel for Mac Version 16.77.1 (23091703), Microsoft 365 Subscription
- OS: MacOS Monterey 12.6.7

Any other information

Screen Shot 2023-10-06 at 3 52 03 PM

OpenOffice and LibreOffice users

  • I have tested the output file with Excel.
@jmcnamara
Copy link
Owner

Thanks for the detailed report.

The main issue is that the data from the data in add_table() is overwriting the column formulas. That is a bug in XlsxWriter which I will fix shortly.

In the meantime you can move the data writing out of add_table(). Like this:

from pathlib import Path
from textwrap import dedent
from typing import Dict

import pandas as pd

PATH = Path(__file__).parent / "bug.xlsx"

def main():
    data = [
        {"index": 0, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"02971\"","amount_charged":225.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 1, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23222\"","amount_charged":263.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 2, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"33300\"","amount_charged":1550.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 3, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":354.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 4, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":177.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 5, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 6, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 7, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 8, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 9, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01130\"","amount_charged":72.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0}
    ]

    with pd.ExcelWriter(PATH) as writer:
        columns = {
            "index": {
                "header": "Index",
            },
            "visible": {
                "header": "Visible",
                "formula": "=SUBTOTAL(3, [@[Index]])"
            },
            "schedule_track_department_label": {
                "header": "Schedule dept.",
                "total_string": "TOTAL"
            },
            "responsible_label": {
                "header": "Assigned Dentist"
            },
            "procedure_key": {
                "header": "Code",
            },
            "amount_charged": {
                "header": "Charges",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "total_function": "sum"
            },
            "amount_paid_correction": {
                "header": "Corrected dentist pay amount",
                "format": writer.book.add_format({"num_format": "$#,##0.00"}),
                "formula": dedent("""
                    =(
                        Details[@[Charges]] *
                        Details[@[Corrected percentage]]
                    )
                """),
                "total_function": "sum"
            },
            "amount_paid_percentage_correction": {
                "header": "Corrected percentage",
                "format": writer.book.add_format({"num_format": "0%"}),
                "formula": dedent("""
                    =IF(
                        AND(
                            Details[@[Schedule dept.]] = "Hygiene",
                            IF(
                                IFERROR(MATCH(Details[@Code], {"01110", "01120", "01130", "01200", "01250"}, 0), FALSE) = FALSE,
                                FALSE,
                                TRUE
                            )
                        ),
                        0.25,
                        0.35
                    )
                """)
            },
        }
        # Creating sheets before printing
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 1")

        # Printing data to file
        worksheet = writer.sheets["Sheet 1"]

        # cell_range[2] -> +1: Adds a row for rollup row
        cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

        for row_num, record in enumerate(data):
            values = [v for v in record.values()]
            worksheet.write_row(row_num + 1, 0, values)

        worksheet.add_table(
            *cell_range,
            {
                "name": "Details",
                "autofilter": True,
                "style": "Table Style Medium 9",
                "total_row": True,
                "columns": [col for col in columns.values()],
            },
        )


if __name__ == "__main__":
    main()

Which should give the expected result:

screenshot

The difference in the programs is:

85a86,89
>         for row_num, record in enumerate(data):
>             values = [v for v in record.values()]
>             worksheet.write_row(row_num + 1, 0, values)
>
94,97d97
<                 "data": [
<                     [v for v in record.values()]
<                     for record in data
<                 ],

@jmcnamara jmcnamara self-assigned this Oct 6, 2023
@philippefutureboy
Copy link
Author

Hi John @jmcnamara!

Fantastic, thank you so much for such a fast answer! I confirm this workaround solves the issue in the minimal code sample.
I'll have to do quite a bit of work to integrate in my actual code, but that's on my end! :)

Do you have someplace I can donate you a few bucks as a mark of gratitude?

@jmcnamara
Copy link
Owner

I'll have to do quite a bit of work to integrate in my actual code

I've pushed a fix to main that allows the first, unmodified, example to work as expected. So no modification required on your end. You can try it when you get a chance and if everything it okay I'll wrap it into a release.

Do you have someplace I can donate you a few bucks as a mark of gratitude?

There is a Paypal link on the GitHub page but it isn't necessary. :-)

@philippefutureboy
Copy link
Author

philippefutureboy commented Oct 9, 2023

Hi @jmcnamara !

I've pushed a fix to main that allows the first, unmodified, example to work as expected. So no modification required on your end. You can try it when you get a chance and if everything it okay I'll wrap it into a release.

I've installed XlsxWriter @ 0401664 in my project, and used your solution successfully for single-Table Excel files.
This being said, for cases where there is a second Table which contains a column that uses a formula referring to the first Table, the file cannot be opened and crashes Excel (across all files that were opened).
N.B. that the workaround (print the data separatedly) results in the same issue as well.

You can find an updated code sample to reproduce the issue here:

from pathlib import Path
from textwrap import dedent
from typing import Dict

import pandas as pd

PATH = Path(__file__).parent / "bug.xlsx"

def main():
    with pd.ExcelWriter(PATH) as writer:
        # Creating sheets before printing
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 1")
        pd.DataFrame().to_excel(writer, sheet_name="Sheet 2")

        # Writing sheets content
        sheet1(writer, writer.sheets["Sheet 1"])
        sheet2(writer, writer.sheets["Sheet 2"])

# same code as initial sample
def sheet1(writer, worksheet):
    data = [
        {"index": 0, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"02971\"","amount_charged":225.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 1, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23222\"","amount_charged":263.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 2, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Jeanette Valencia","procedure_key":"=\"33300\"","amount_charged":1550.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 3, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":354.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 4, "visible":0,"schedule_track_department_label":"Restorative","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"23221\"","amount_charged":177.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 5, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 6, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Alexis Morgan","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 7, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 8, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01120\"","amount_charged":46.75,"amount_paid_correction":0,"amount_paid_percentage_correction": 0},
        {"index": 9, "visible":0,"schedule_track_department_label":"Hygiene","responsible_label":"Dr Stephanie Jackson","procedure_key":"=\"01130\"","amount_charged":72.0,"amount_paid_correction":0,"amount_paid_percentage_correction": 0}
    ]

    columns = {
        "index": {
            "header": "Index",
        },
        "visible": {
            "header": "Visible",
            "formula": "=SUBTOTAL(3, [@[Index]])"
        },
        "schedule_track_department_label": {
            "header": "Schedule dept.",
            "total_string": "TOTAL"
        },
        "responsible_label": {
            "header": "Assigned Dentist"
        },
        "procedure_key": {
            "header": "Code",
        },
        "amount_charged": {
            "header": "Charges",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "total_function": "sum"
        },
        "amount_paid_correction": {
            "header": "Corrected dentist pay amount",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "formula": dedent("""
                =(
                    Details[@[Charges]] *
                    Details[@[Corrected percentage]]
                )
            """),
            "total_function": "sum"
        },
        "amount_paid_percentage_correction": {
            "header": "Corrected percentage",
            "format": writer.book.add_format({"num_format": "0%"}),
            "formula": dedent("""
                =IF(
                    AND(
                        Details[@[Schedule dept.]] = "Hygiene",
                        IF(
                            IFERROR(MATCH(Details[@Code], {"01110", "01120", "01130", "01200", "01250"}, 0), FALSE) = FALSE,
                            FALSE,
                            TRUE
                        )
                    ),
                    0.25,
                    0.35
                )
            """)
        },
    }

    # cell_range[2] -> +1: Adds a row for rollup row
    cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

    # for row_num, record in enumerate(data):
    #     values = [v for v in record.values()]
    #     worksheet.write_row(row_num + 1, 0, values)

    worksheet.add_table(
        *cell_range,
        {
            "name": "Details",
            "autofilter": True,
            "style": "Table Style Medium 9",
            "total_row": True,
            "columns": [col for col in columns.values()],
            "data": [[v for v in record.values()] for record in data]
        },
    )

# new code
def sheet2(writer, worksheet):
    data = [
        {"responsible_label":"Dr Jeanette Valencia","amount_charged": 0,"amount_paid_correction":0},
        {"responsible_label":"Dr Alexis Morgan","amount_charged": 0,"amount_paid_correction":0},
        {"responsible_label":"Dr Jeanette Valencia","amount_charged": 0,"amount_paid_correction":0},
        {"responsible_label":"Dr Stephanie Jackson","amount_charged": 0,"amount_paid_correction":0},
    ]

    columns = {
        "responsible_label": {
            "header": "Assigned Dentist"
        },
        "amount_charged": {
            "header": "Charges",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "formula": dedent("""
                =IFERROR(
                    SUM(
                        FILTER(
                            Details[Charges],
                            (Details[Visible]=1) *
                            (
                                Details[Assigned dentist] = 
                                Summary[@[Assigned dentist]]
                            )
                        )
                    ),
                    0
                )
            """),
            "total_function": "sum"
        },
        "amount_paid_correction": {
            "header": "Corrected dentist pay amount",
            "format": writer.book.add_format({"num_format": "$#,##0.00"}),
            "formula": dedent("""
                =IFERROR(
                    SUM(
                        FILTER(
                            Details[Corrected dentist pay amount],
                            (Details[Visible]=1) *
                            (
                            Details[Assigned dentist] = 
                            Summary[@[Assigned dentist]]
                            )
                        )
                    ),
                    0
                )
            """),
            "total_function": "sum"
        },
    }

    # cell_range[2] -> +1: Adds a row for rollup row
    cell_range = (0, 0, len(data) + 1, len(data[0].keys()) - 1)

    # for row_num, record in enumerate(data):
    #     values = [v for v in record.values()]
    #     worksheet.write_row(row_num + 1, 0, values)

    worksheet.add_table(
        *cell_range,
        {
            "name": "Summary",
            "autofilter": True,
            "style": "Table Style Medium 9",
            "total_row": True,
            "columns": [col for col in columns.values()],
            "data": [[v for v in record.values()] for record in data]
        },
    )

if __name__ == "__main__":
    main()

And the crash report from Excel here:

https://gist.github.com/philippefutureboy/986a5d0951fdbe9b85b3783f0bcc6b07

There is a Paypal link on the GitHub page but it isn't necessary. :-)

Great, it will be my pleasure still :)

jmcnamara added a commit that referenced this issue Oct 9, 2023
@jmcnamara
Copy link
Owner

This being said, for cases where there is a second Table which contains a column that uses a formula referring to the first Table, the file cannot be opened and crashes Excel (across all files that were opened).

The issue there is that Filter is a Future Function in Excel and should be written as _xlfn._xlws.FILTER. XlsxWriter generally handles this automatically for cell formulas but misses it for the internal copy used for tables. I've fixed that on main and you can try it out. Here is the output from your sample program after the change:

screenshot

Also note, you can/should omit the = at the start of the formula. XlsxWriter usually removes it but there is also leading whitespace/newline and that is deliberately maintained (although in your case you probably don't intend it (?). Also the Filter formula refers to Assigned dentist but that should be sentence case Assigned Dentist. Neither of these affect Excel but it is best to fix them in case something else has an issue with them.

@philippefutureboy
Copy link
Author

Fantastic, that works! 🎉
On my end this issue is resolved! Thanks for your prompt & positive support in this matter!!

jmcnamara added a commit that referenced this issue Oct 9, 2023
@jmcnamara
Copy link
Owner

Fixed upstream in version v3.1.7. Thanks for the report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants