Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile individual functions (or entire model) to python code #44

Closed
allenlawrence94 opened this issue Mar 29, 2020 · 3 comments
Closed

Comments

@allenlawrence94
Copy link

I'm working on converting a large excel workbook into a python tool. The workbook has gotten complex enough that it takes a long time to run, and maintaining it is painstaking. This great package you've built largely solves my first problem - I could convert the workbook into an ExcelModel to be used behind an API, which would speed things up and enable parallelization. However, it doesn't make my tool easier to maintain.

I would like to be able to convert the DAGs that ExcelModel and Parser build into a readable python module. Each function in the DAG would become an atomic function in the module. As an example: if we had an excel sheet where the D1 cell contains the formula =SUM(A1:C1), this tool might produce a function like this:

def d1(a1_to_c1: list) -> float:
    return sum(a1_to_c1)

This would lay out working (albeit not very clear) code that would be the starting point for turning a workbook into a clean python project.

I started fiddling with this idea, but my ignorance of the inner workings of formulas and schedula made this a very slow-going process. I was hoping someone might have some insights into how this could be done. Totally understood if you think this shouldn't be a feature of this package, but I would really appreciate some ideas regardless.

@vinci1it2000
Copy link
Owner

The easiest solution and the most portable one is to dump the ExcelModel (i.e., the DAG) into a readable JSON format. In this way, you may "easily" maintain your code from the JSON, and afterward, you load it into the ExcelModel.

What do you think? Is it a suitable solution for your problem?

@vinci1it2000
Copy link
Owner

I have implemented the solution that I mentioned above. Follows an example of usage:

import json
import formulas
xl_model = formulas.ExcelModel().loads('test/test_files/excel.xlsx').finish()
xl_json = json.dumps(xl_model.to_dict(), indent=True)

The JSON has the following format:

{
 "'[EXCEL.XLSX]DATA'!A1": "inputs",
 "'[EXCEL.XLSX]DATA'!B1": "Intermediate",
 "'[EXCEL.XLSX]DATA'!C1": "outputs",
 "'[EXCEL.XLSX]DATA'!D1": "defaults",
 "'[EXCEL.XLSX]DATA'!A2": 2,
 "'[EXCEL.XLSX]DATA'!D2": 1,
 "'[EXCEL.XLSX]DATA'!A3": 6,
 "'[EXCEL.XLSX]DATA'!A4": 5,
 "'[EXCEL.XLSX]DATA'!B2": "=('[EXCEL.XLSX]DATA'!A2 + '[EXCEL.XLSX]DATA'!A3)",
 "'[EXCEL.XLSX]DATA'!C2": "=(('[EXCEL.XLSX]DATA'!B2 / '[EXCEL.XLSX]DATA'!B3) + '[EXCEL.XLSX]DATA'!D2)",
 "'[EXCEL.XLSX]DATA'!B3": "=('[EXCEL.XLSX]DATA'!B2 - '[EXCEL.XLSX]DATA'!A3)",
 "'[EXCEL.XLSX]DATA'!C3": "=(('[EXCEL.XLSX]DATA'!C2 * '[EXCEL.XLSX]DATA'!A2) + '[EXCEL.XLSX]DATA'!D3)",
 "'[EXCEL.XLSX]DATA'!D3": "=(1 + '[EXCEL.XLSX]DATA'!D2)",
 "'[EXCEL.XLSX]DATA'!B4": "=MAX('[EXCEL.XLSX]DATA'!A3:A4, '[EXCEL.XLSX]DATA'!B2)",
 "'[EXCEL.XLSX]DATA'!C4": "=(('[EXCEL.XLSX]DATA'!B3 ^ '[EXCEL.XLSX]DATA'!C2) + '[EXCEL.XLSX]DATA'!D4)",
 "'[EXCEL.XLSX]DATA'!D4": "=(1 + '[EXCEL.XLSX]DATA'!D3)"
}

To retrieve the model:

xl_mdl = ExcelModel().from_dict(json.loads(xl_json))

@allenlawrence94
Copy link
Author

This is great - I ended up going in an altogether different direction with my project, but I think I'll try out this new feature as well. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants