Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UDF with expand=table disappears on recalculation #1164

Closed
nickderobertis opened this issue Oct 16, 2019 · 8 comments · Fixed by #1318
Closed

UDF with expand=table disappears on recalculation #1164

nickderobertis opened this issue Oct 16, 2019 · 8 comments · Fixed by #1318
Milestone

Comments

@nickderobertis
Copy link

nickderobertis commented Oct 16, 2019

OS (e.g. Windows 10 or macOS Sierra)

Windows 10 Enterprise

Versions of xlwings, Excel and Python (e.g. 0.11.8, Office 365, Python 3.7)

xlwings 0.15.10, Excel Professional Plus 2016, Python 3.7

Describe your issue (incl. Traceback!)

Thanks for including the expand functionality in the library, it seems very useful if I can get it working smoothly.

I had noticed as I was building a workbook using UDFs with expand='table' that sometimes my UDFs would disappear entirely upon recalculation. I had to do a lot of experimenting to narrow down exactly when this occurs.

This happens for UDFs with expand='table' for both @xw.ret and @xw.arg if they are referencing another UDF which has expand='table' for @xw.ret. It also only occurs if this is the last function which was defined referencing the cells of the table.

E.g. the setup looks something like this:
Plain Cells -> Reference plain cells by UDF with expand='table' for both @xw.ret and @xw.arg -> Reference first UDF output by another UDF with expand='table' for @xw.ret.

If any of these conditions are not met, the problem will not occur. Calculating any function, UDF or otherwise (=SUM works), on the full cell range of the table after defining the UDF will make the UDF call not disappear. If the same UDF is called a second time, the second occurrence will disappear, but the first will not.

If expand='table' is excluded for either @xw.ret or @xw.arg on the first UDF in the chain, the problem will not occur. If expand='table' is excluded for @xw.ret on the second UDF, the problem will not occur.

A temporary workaround on the test project is to call my UDF first in the spot where I want it, then call it in an empty spot that I don't want it. Upon recalculate, the one in the empty spot disappears, but the first one will always persist from that point on. However there may be still some conditions where this occurs that I don't understand, as I have tried this fix in my full workbook, and it works initially, but after recalculating many times, some functions will still disappear. In this workbook I have a chain of 4 UDFs using expand.

Include a minimal code sample to reproduce the issue (and attach a sample workbook if required!)

While this will happen for any UDF matching the conditions above, I have included the most simple version possible here.

@xw.func
@xw.arg('data', expand='table')
@xw.ret(expand='table', index=False)
def print_table(data):
    return data

This sample project shows the issue. Open the workbook, and recalculate with CTRL + ALT + F9, and the second UDF disappears on my machine.

Try opening the workbook again, if instead, you call print_table a second time referencing the first UDF, then only this newly created UDF will disappear on recalculation, and not ever the original second UDF again.

Try again, if you call =SUM(E4:E6), then recalculate, the second UDF will not disappear. You can even delete the =SUM call then, and recalculate as many times as you want and the second UDF will not disappear.

disappearing.zip

I appreciate any help you could give. I could also try to fix this if you point me in the right direction.

@nickderobertis
Copy link
Author

In the first post, I described some conditions where the problem always occurs. But it seems to happen in some other situations as even with my workaround, my UDFs were still disappearing sometimes on my full project.

This seems to happen somewhat randomly and it's linked to the speed of calculation. If I recalculate the workbook many times quickly, it is likely that the UDFs disappear. If the calculation is not as fast, it doesn't happen. I'm trying to run Monte Carlo simulations on a workbook with UDFs. I'm using xlwings from Python to change the inputs and collect the outputs. If I do this without a delay or too short of a delay, the UDFs will disappear after enough iterations. If I introduce a 0.5s sleep between iterations, then the UDFs are able to persist for me.

@Colin-b
Copy link
Contributor

Colin-b commented Oct 16, 2019

I think this is a duplicate of #1010

@fzumstein
Copy link
Member

Thanks for adding a reproducible sample. Here are some thoughts: expand='table' is a giant hack so it's doesn't surprise me too much that there are some issues and I am not entirely sure how big the effort is to solve them. I've once started an attempt to fix #1010 which is indeed something similar and had some success on branch xlwings-1010-improve-async but it doesn't seem to help here.
Now, in Office 365, monthly channel, Microsoft has introduced dynamic arrays natively and if anyhow possible that should be the way to solve your issue as it's guaranteed free of side-effects (see also https://youtu.be/Gk0p3hFRo0k) and much faster. I know, it probably takes corporate environments years to get this, but in my eyes it makes more sense to focus on other issues that Microsoft hasn't solved yet.

@nickderobertis
Copy link
Author

Ah yes thanks @Colin-b , I looked for existing issues on this but I guess didn't hit the right key words.

@fzumstein Yes I totally understand that you want to focus on native dynamic arrays going forward. This is a small team and you have plenty to work on so it's hard to invest in a feature that will ultimately be removed or at least become obsolete. But I think there will still be a lot of users on the older versions of Excel for a while, especially in corporate and academic environments where it's not up to the user what version they are using. As such I may try to submit a PR for this if you'd be willing to accept it.

I've looked over the UDF code. It seems that for dynamic arrays, there is a resize operation which is happening async on the win32event API. That resize operation removes the formula and re-adds it if the output is the wrong size. I need to dig into this a lot more to really understand it, but I wonder if this is some timing issue where after the formula is cleared but before it is written back to the cell, something happens which prevents it writing to the cell. Perhaps we could add some kind of retry behavior there where it checks if the formula is truly there and re-writes it if necessary before exiting. Let me know if you think I'm on the right track at all.

@fzumstein
Copy link
Member

Sure, if you want to dig into this, I am happy to accept a PR. There's a few things going on connected with resizing and async and more so when you use a combination of the two. First of all, resizing/async can cause the formula to fire up to 3 times which is why there is a caching mechanism. What I saw when looking at #1010 is that querying the formula during a resize even causes a com error (see: https://github.com/xlwings/xlwings/compare/xlwings-1010-improve-async#diff-7a2b1d6768f72b43e18a5449f6c0beb7R300-R304). Since the resize with the legacy CSE arrays requires at some point to delete the formulas, this is most likely the reason things get tripped up if they read the formula during this time when it was just deleted. Maybe an approach would be to get rid of async first and fix the legacy dynamic arrays first before taking another shot at the async stuff.

@sakvaua
Copy link

sakvaua commented Mar 5, 2020

First, thanks for the terrific library. It made Excel infinitely expandable :)
I upgraded xlwings to the latest version and it still erases my UDFs from time to time. So I assume there's no fix for this? And there won't be any? Is there any workaround? Upgrading from Office 2016 to Office 365 is not possible now.
Fortunately, I have only one UDF per sheet now and can just restore formula after it was killed by a macro. But that's terribly inflexible.

@davtoh
Copy link

davtoh commented May 1, 2020

First, thanks for the terrific library. It made Excel infinitely expandable :)
I upgraded xlwings to the latest version and it still erases my UDFs from time to time. So I assume there's no fix for this? And there won't be any? Is there any workaround? Upgrading from Office 2016 to Office 365 is not possible now.
Fortunately, I have only one UDF per sheet now and can just restore formula after it was killed by a macro. But that's terribly inflexible.

Unfortuantelly this happens from time to time if there are undesired errors in our application iff the function could not finish well. Nontheless i have been trying to address that issue and you can use the udfs.py script in my fork to solve it as it has worked for me even loading the whole workbook without a single error nor deleting the equations.

@nickderobertis
Copy link
Author

First, thanks for the terrific library. It made Excel infinitely expandable :)
I upgraded xlwings to the latest version and it still erases my UDFs from time to time. So I assume there's no fix for this? And there won't be any? Is there any workaround? Upgrading from Office 2016 to Office 365 is not possible now.
Fortunately, I have only one UDF per sheet now and can just restore formula after it was killed by a macro. But that's terribly inflexible.

Yes, I believe I have this fixed in my PR, #1177 , though dynamic UDFs with legacy arrays seem to not be a priority for xlwings right now and possibly not ever. You could install from this fork, though it seems it is starting to get out of date.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants