# Welcome
Solution Extraction is the process by which we take a Project Drawdown Solution, in the form of an Excel Workbook, and create a corresponding Python solution that implements _most_ of the same functionality.  This notebook will guide you through that process.  See also `Extraction_Guide.md` for more explanation and notes.

The first step is to _make a copy of this notebook_.  Give it a name that represents the model you will be working on.  That way it won't collide with other notebooks when you check in or merge fixes.

## Setup


In [1]:
from tools import solution_xls_extract as sxe
from tools import create_expected_zip as cez
from tools import expected_ghost
from tests import test_excel_integration as tei
from solution import factory
from pathlib import Path
import pandas as pd
import openpyxl
import importlib

In [2]:
# Identify where you will be storing your Excel file while you work on it, and what directory the final result will go into.

excelfile = Path("/Users/anshul/Developer/drawdown/excel/ImprovedCattleFeed_Apr2021.xlsm")
outdir = Path("/Users/anshul/Developer/drawdown/solutions/solution/improvedlivestockfeed")
outdir.mkdir(parents=True, exist_ok=True)

In [3]:
# If you make changes to the extraction code (or any other code), reload it
# NOTE: This kind of reloading DOES NOT work for solutions themselves, unfortunately.  If you re-generate or modify your solution,
# you have to restart the Jupyter kernel to get it to reload properly.

importlib.reload(sxe)

<module 'tools.solution_xls_extract' from '/Users/anshul/Developer/drawdown/solutions/tools/solution_xls_extract.py'>

## Extract Code
Extraction is done by the `sxe.output_solution_python_file` function.  This function reads most of the data it needs from the `ScenarioRecord` tab, plus additional data from the TAM, Adoption and other tabs, and writes the results to a solution directory in the form of an `__init__.py` file and a set of CSV and JSON files.  All of the solutions in `/solution` were produced this way.

In [4]:
# Expect to see some warnings from openpyxl; these can be ignored.  If there are other warnings, please note them, but they are not necessarily
# a problem.

sxe.output_solution_python_file(outputdir=outdir, xl_filename=str(excelfile))

  warn(msg)
  warn(msg)
  warn(msg)
  warn(msg)
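
To see what the extraction actually wrote, it can help to count the files in the output directory by type. The following is a minimal sketch using only the standard library; `summarize_output` is a hypothetical helper written for this notebook, not part of `tools`.

```python
from collections import Counter
from pathlib import Path


def summarize_output(directory):
    """Count the files under `directory` by suffix, searching subdirectories too."""
    counts = Counter()
    for path in Path(directory).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix or "(none)"] += 1
    return dict(counts)
```

In this notebook you would call `summarize_output(outdir)` after the extraction cell. You should expect at least one `.py` file (the `__init__.py`) along with some `.csv` and `.json` files; the exact counts depend on the workbook.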


In [5]:
# %debug is your friend.  If the extraction fails with an exception, jump in and see if anything looks wrong

# %debug

If you are working on one of the [Excel import issues](https://github.com/ProjectDrawdown/solutions/issues?q=is%3Aissue+is%3Aopen+label%3A%22Excel+Import%22), please add comments to it describing problems you run into.  And if you have found something that looks like a general problem, please [open a new issue](https://github.com/ProjectDrawdown/solutions/issues/new) for it on the github repo.

I can't overemphasize this: 
> **Finding, researching and reporting issues is hugely valuable for us, even if you don't fully solve them.**


## Load Code / Sniff Test

Once the code has been successfully extracted and placed into a directory in `solution/`, all the tools that work with solutions should become available.

In [6]:
# factory.one_solution_scenarios loads a single solution, by name.  It returns a constructor that can construct scenario objects
# for this solution, and a list of the scenario names.

constructor, scenarios = factory.one_solution_scenarios("improvedlivestockfeed")

Creating AdvancedControls object from file /Users/anshul/Developer/drawdown/solutions/solution/improvedlivestockfeed/ac/PDS-45p2050-avg7scen.json
Creating AdvancedControls object from file /Users/anshul/Developer/drawdown/solutions/solution/improvedlivestockfeed/ac/PDS-28p2050-low7scen.json


KeyError: '"[\'SOLUTION Operating Cost per Functional Unit per Annum\', \'SOLUTION Fixed Operating Cost (FOM)\']" must be included in vmas to calculate mean/high/low.vmas included: dict_keys([\'Current Adoption\', \'CONVENTIONAL First Cost per Implementation Unit\', \'SOLUTION First Cost per Implementation Unit\', \'CONVENTIONAL Lifetime Capacity\', \'SOLUTION Lifetime Capacity\', \'CONVENTIONAL Average Annual Use\', \'SOLUTION Average Annual Use\', \'CONVENTIONAL Variable Operating Cost (VOM) per Functional Unit\', \'SOLUTION Variable Operating Cost (VOM) per Functional Unit\', \'CONVENTIONAL Fixed Operating Cost (FOM)\', \'SOLUTION Revenue from Increased Milk Yield\', \'CONVENTIONAL Total Energy Used per Functional Unit\', \'SOLUTION Energy Efficiency Factor\', \'SOLUTION Total Energy Used per Functional Unit\', \'CONVENTIONAL Fuel Consumed per Functional Unit\', \'SOLUTION Fuel Efficiency Factor\', \'CONVENTIONAL Direct Emissions per Functional Unit\', \'SOLUTION Direct Emissions per Functional Unit\', \'CONVENTIONAL Indirect CO2 Emissions per Unit\', \'SOLUTION Indirect CO2 Emissions per Unit\', \'CH4-CO2eq Tons Reduced\', \'N2O-CO2eq Tons Reduced\', \'CONVENTIONAL Revenue per Functional Unit\', \'SOLUTION Revenue per Functional Unit\', \'CONVENTIONAL Milk Yield\', \'SOLUTION Increased Milk Yield\'])'
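
The `KeyError` above lists VMA titles that the generated code asked for but that are not among the VMAs that were extracted. One thing worth checking (an assumption, not a diagnosis) is whether the workbook uses a slightly different title for the same quantity. The sketch below uses the standard library's `difflib` to suggest near matches; `suggest_vma_names` is a hypothetical helper, not part of the extraction tools.

```python
import difflib


def suggest_vma_names(missing, available, cutoff=0.6):
    """Map each missing VMA title to up to three similar available titles."""
    return {
        name: difflib.get_close_matches(name, list(available), n=3, cutoff=cutoff)
        for name in missing
    }


# Titles copied from the error message above (the available list is abbreviated here).
missing = [
    "SOLUTION Operating Cost per Functional Unit per Annum",
    "SOLUTION Fixed Operating Cost (FOM)",
]
available = [
    "CONVENTIONAL Fixed Operating Cost (FOM)",
    "SOLUTION Variable Operating Cost (VOM) per Functional Unit",
    "SOLUTION Revenue from Increased Milk Yield",
]
for name, candidates in suggest_vma_names(missing, available).items():
    print(name, "->", candidates)
```

A near match is only a hint about where to look. Whether the workbook, the extraction, or the generated code needs to change is still a judgment call, and worth noting on the issue.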

In [None]:
# What scenarios did we get?

scenarios

In [None]:
# Let's build the 2nd one

myscenario = constructor(scenarios[1])

In [None]:
# %debug is your friend.

%debug

## Look at some results

TODO: it would be nice to put some examples below, for example showing a little graph of something.

In [None]:
myscenario.c2.co2_mmt_reduced()

## Create Test Results

**This step requires the Excel application, and thus can only be run on Windows or Mac.**

Create a clean temporary directory to generate the test set in.  Put (a copy of) your Excel spreadsheet in that directory.

Follow the instructions in `tools/CREATING_EXPECTED_ZIP.md` to create the CSV files in that directory.

In [None]:
# Run the VB macros first!

# Assemble the resulting csv files into the expected_zip file

csvdirectory = Path("C:\\Working\\temp")
cez.create_expected_zip(csvdirectory)

In [None]:
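# Copy the resulting file to where it belongs.
import shutil

testdirectory = outdir / "testdata"
testdirectory.mkdir(exist_ok=True)

# shutil.copy works on both Windows and Mac; `cp` is typically not available in the Windows command shell.
shutil.copy(csvdirectory / "expected.zip", testdirectory / "expected.zip")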
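
Before wiring up the test, it can be worth a quick look inside the archive. The following is a minimal sketch using the standard library's `zipfile`; `list_zip_members` is a hypothetical helper, and it makes no assumption about what the files inside `expected.zip` are called.

```python
import zipfile
from pathlib import Path


def list_zip_members(zip_path):
    """Return (name, size in bytes) for every file stored in the archive."""
    with zipfile.ZipFile(Path(zip_path)) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]
```

In this notebook you would call `list_zip_members(testdirectory / "expected.zip")`. An empty list, or members with a size of zero, may mean the macros did not write what you expected.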

## Create the Test

Now we are going to add your new solution to the testing infrastructure.  In an editor, open the file `tests/test_excel_integration.py`.
Scroll down to the bottom, copy one of the test functions there, and modify it to fit your new model.  It will look something
like the following, where you replace `<SOLUTIONNAME>` with a unique name for your test, and `<MODULE>` with the name of your solution module (which is the same as the name of the directory it is in).

```
def test_<SOLUTIONNAME>_RRS():
    from solution import <MODULE>
    zipfilename = str(solutiondir.joinpath(
        '<MODULE>', 'testdata', 'expected.zip'))
    zip_f = zipfile.ZipFile(file=zipfilename)
    for scenario in <MODULE>.scenarios.keys():
        obj = <MODULE>.Scenario(scenario=scenario)
        verify = RRS_solution_verify_list(obj=obj, zip_f=zip_f)
        check_excel_against_object(
            obj=obj, zip_f=zip_f, scenario=scenario, verify=verify)
```

Be sure to note whether your model is an RRS model or a Land model and copy the right kind of test!

## Run the Test

The following cell runs the test you just created (swapping in the name of your own test function, of course).

In [None]:
# You could run pytest from the shell, but it is a little more convenient to call the
# test function directly:
tei.test_Composting_RRS()

If the test throws an exception, you might be tempted to use `%debug` to look at it. Unfortunately the main location where exceptions get thrown has already lost the context of the error. Instead, what you generally have to do is figure out where in the test suite the failure was and add a `breakpoint()` (also known as `import pdb; pdb.set_trace()` to old-schoolers) there, then run it again. Hint: the error message will probably contain an Excel range, like `B91:C137`. This is a good string to search for in `test_excel_integration.py`.

From there, you work your way back to the same questions we were working on above: is this a failure in extraction, model code, the Excel workbook, or the test? Rinse and repeat.
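
The search suggested in the hint above can also be done from the notebook. This is a minimal sketch; `find_in_file` is a hypothetical helper, and the path in the usage note assumes the notebook is running from the repository root.

```python
from pathlib import Path


def find_in_file(path, needle):
    """Return (line number, stripped line) for each line of `path` containing `needle`."""
    matches = []
    with Path(path).open(encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            if needle in line:
                matches.append((lineno, line.strip()))
    return matches
```

For example, `find_in_file("tests/test_excel_integration.py", "B91:C137")` would list the lines that mention that range, which is usually a good place for the `breakpoint()`.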

## Hot off the Press for the Hackathon: Improved Test Control!

If you look at the last two solution test definitions in `tests/test_excel_integration.py`, you will see they take a few extra optional arguments:
 * `scenario_skip`: if present, an array of (zero-based) scenario indices to skip over
 * `test_skip`: if present, an array of strings that should match the descriptions of tests to skip
 * `test_only`: if present, an array of strings such that _only_ tests whose description matches one of them will be executed.

So for example, you could skip the second scenario, and only do the 'First Cost' and 'Operating Cost' tests, but skip the first 'First Cost' test, with the following:

In [None]:
tei.test_WaveAndTidal_RRS(scenario_skip=[1], test_only=['First Cost', 'Operating Cost'], test_skip=['C37:C82'])

Copy from one of those last two tests and you will be able to control your testing without having to resort to hacks like commenting out tests.
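
To make the interaction of the three arguments concrete, here is a minimal sketch of how such filtering could behave. It is not the code in `tests/test_excel_integration.py`: the function names are hypothetical, the test descriptions are made up for illustration, and it assumes that "match" means the string appears somewhere in the description.

```python
def select_scenarios(names, scenario_skip=None):
    """Drop the scenarios whose zero-based index is listed in scenario_skip."""
    skip = set(scenario_skip or [])
    return [name for index, name in enumerate(names) if index not in skip]


def select_tests(descriptions, test_only=None, test_skip=None):
    """Keep descriptions allowed by test_only, then drop those hit by test_skip."""
    selected = []
    for description in descriptions:
        if test_only is not None and not any(s in description for s in test_only):
            continue  # test_only is given and nothing in it matches
        if test_skip is not None and any(s in description for s in test_skip):
            continue  # explicitly skipped
        selected.append(description)
    return selected


# Made-up descriptions, for illustration only.
descriptions = [
    "First Cost C37:C82",
    "First Cost E37:E82",
    "Operating Cost B126:B250",
    "Adoption Data X46:Z94",
]
print(select_tests(descriptions,
                   test_only=["First Cost", "Operating Cost"],
                   test_skip=["C37:C82"]))
```

Under these assumptions, `test_skip` wins over `test_only`: a test that matches both is skipped.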


# Tips

## Don't forget to restart the Jupyter Notebook kernel if you have modified code

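If you change code in `tools` (or other library code), you need to either reload the module (the third cell of this notebook) or restart the kernel.  If you re-generate or modify the solution itself, reloading is not enough: restart the kernel.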

## When comparing to Excel, make sure you've loaded the right Scenario

On the `ScenarioRecord` tab, cell `B9` shows the currently loaded scenario.  When a workbook is first opened, this is usually empty, meaning you don't know which
scenario was last loaded.  Select the scenario you are debugging against from the dropdown, and click on 'Load Scenario'.

## Beautifier for Excel Formulas

Are you looking at an Excel formula with five nested `IF(...` expressions?  Try [https://www.excelformulabeautifier.com/](https://www.excelformulabeautifier.com/).  You're welcome.

# You Finished!
Did you get a clean test run?  Hurrah!  You've finished this import task.  Project Drawdown thanks you!