Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add caching to return Python objects to Excel cells #555

Open
fzumstein opened this issue Aug 24, 2016 · 8 comments
Open

Add caching to return Python objects to Excel cells #555

fzumstein opened this issue Aug 24, 2016 · 8 comments

Comments

@fzumstein
Copy link
Member

No description provided.

@fzumstein fzumstein added this to the short list milestone Aug 24, 2016
@ericremoreynolds
Copy link
Contributor

References: 1 2

@fzumstein fzumstein removed this from the short list milestone Sep 25, 2016
@skasi7
Copy link
Contributor

skasi7 commented Jan 19, 2018

I have a implementation of this feature (based on https://www.pyxll.com/docs/examples/objectcache.html but simpler). The problem is this feature requires event handling (that I also implemented) to be efficient.

If you are interested, I can send you more information/examples/PRs.

@sdementen
Copy link
Contributor

+1

@fzumstein
Copy link
Member Author

sure, please send!

@audetto
Copy link

audetto commented Mar 13, 2021

This is similar to the issue I recently opened #1541
I work in finance and have see countless poor implementations of this concept which make spreadsheets a nightmare to follow, with issues like: F9 does not work, memory builds up, multiple sheet do not work together, painful to use the same functions outside Excel.

I think Microsoft should solve it once and for all, but IMHO xlwings is the next best location to do so for python objects.

What I am writing at the moment uses these ideas

  • put the object in a global variable (a dictionary)
  • keyed by the cell address
  • add some visible tweaks to identify as objects, giving the impression that the object has been recomputed

so one could use keys like

object:[filename.xls]Sheet3!B4@10:45:56

where the time is not used in the lookup (because we must delete the object already in the same cell).
optionally, add the type as visual info (not used in lookup)

DataFrame:[filename.xls]Sheet3!B4@10:45:56

Converters are really a good place for this, as they ensure a full separation between object handling and function code (another pitfall of the many bad designs).

@sdementen
Copy link
Contributor

You may try the converter CacheConverter below that can be used as:

import pandas
import xlwings as xw


@xw.func
@xw.arg("df", pandas.DataFrame)
@xw.ret(CacheConverter)
def write_df2cache(df):
    return dict(my_value=df, my_other_value="45", the_third_value=dict(info=len(df),cols=df.columns))



@xw.func
@xw.arg("df", CacheConverter)
def read_cache(df):
    return df


@xw.func
def see_cached_keys():
    return [[e] for e in _cache_name_timestamp]

In Excel, you use write_df2cache to read a range. This will return 3 values as reference similar to:
my_value@Sheet1!$E$7[2022-02-06 09:55:11.750726]{DataFrame-3}
my_other_value@Sheet1!$E$7[2022-02-06 09:55:11.750726]{str-2}
the_third_value@Sheet1!$E$7[2022-02-06 09:55:11.750726]{dict-2}

You can thereafter extract the values behind one of the 3 references by using read_cache with as argument one of the 3 references.

The API is not final but could be a good start for discussion.

from xlwings.conversion import Converter

_cache_name_timestamp = {}


class CacheConverter(Converter):
    """Convert an object to a reference. The reference can include the 'caller' address to ensure
    uniqueness (not needed if singleton - caller=False).

    The object are stored in a global dict by their key and looked up when reading.
    
    When used on a return value, the converter will transform the value into a reference. 
    If the value is a dict, it will return a reference for each item in the dict.
    
    When used in an input, the converter will return the value that was stored in a reference.
    """

    @staticmethod
    def read_value(value, options):
        if not value:
            return
        key = value.split("[")[0]
        if key in _cache_name_timestamp:
            return _cache_name_timestamp[key]
        else:
            raise ValueError(f"No data available for {key}")

    @staticmethod
    def write_value(value, options):
        if value is None:
            return
        now = datetime.datetime.today()
        if options.get("caller", True):
            caller = xw.apps.active.api.Caller
            caller = f"@{caller.Worksheet.Name}!{caller.Address}"
        else:
            caller = ""

        if not isinstance(value, dict):
            if not caller:
                raise ValueError("You cannot use 'caller'=False if the function does not return a dict.")
            value = dict(anonymous=value)

        hashes = []
        for name, item in value.items():
            key = f"{name}{caller}"
            _cache_name_timestamp[key] = item
            hashes.append(f"{key}[{now}]{{{type(item).__name__}-{len(item)}}}")

        if options.get("vertical", True):
            hashes = [[h] for h in hashes]

        return hashes

@sdementen
Copy link
Contributor

Hi @fzumstein
Do you think my converter could be added to xlwings or to specific?

@fzumstein
Copy link
Member Author

Yeah I guess we still want to add that object caching functionally at some point, it just hasn't made it to the top...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants