# Welcome
Solution Extraction is the process of taking a Project Drawdown Solution, in the form of an Excel Workbook, and creating a corresponding Python solution that implements _most_ of the same functionality.  This notebook will guide you through that process.  See also `Extraction_Guide.md` for more explanation and notes.

The first step is to _make a copy of this notebook_.  Give it a name that represents the model you will be working on.  That way it won't collide with other notebooks when you check in or merge fixes.

## Setup


In [1]:
from tools import solution_xls_extract as sxe
from tools import create_expected_zip as cez
from tools import expected_ghost
from solution import factory
from pathlib import Path
import pandas as pd
import openpyxl
import importlib

In [4]:
# Identify where you will be storing your Excel file while you work on it, and what directory the final result will go into.

excelfile = Path("C:\\Working\\ModelsNew\\Glass_RRS_Model_Residential-Nov19.xlsm")
outdir = Path("C:\\Working\\solutions\\solution\\residentialglass")
outdir.mkdir(parents=True, exist_ok=True)

In [4]:
# If you make changes to the extraction code (or any other code), reload it
# NOTE: This kind of reloading DOES NOT work for solutions themselves, unfortunately.  If you re-generate or modify your solution,
# you have to restart the Jupyter kernel to get it to reload properly.

importlib.reload(sxe)

<module 'tools.solution_xls_extract' from 'C:\\Working\\solutions\\tools\\solution_xls_extract.py'>

## Extract Code

> Note: if you are working on a model that has already been extracted, skip this step and move on to whichever next step is appropriate.

Extraction is done by the `sxe.output_solution_python_file` function.  This function reads most of the data it needs from the `ScenarioRecord` tab and additional data from the TAM, Adoption and other tabs, and writes them to a solution directory in the form of an `__init__.py` file and a set of CSV and JSON files.  All of the solutions in `/solution` were produced this way.

In [5]:
# Expect to see some warnings from openpyxl; these can be ignored.  If there are other warnings, please note them, but they are not necessarily
# a problem.

sxe.output_solution_python_file(outputdir=outdir, xl_filename=str(excelfile))

  warn(msg)
  warn(msg)
  warn(msg)
  warn(msg)


In [7]:
# %debug is your friend.  If the extraction fails with an exception, jump in and see if anything looks wrong

%debug

> c:\projects\project drawdown\solutions\tools\extraction\solution_xls_extract.py(878)write_ad()
    876     f.write("             'Middle East and Africa', 'Latin America', 'China', 'India', 'EU', 'USA'],\n")
    877     f.write("            ['trend', self.ac.soln_pds_adoption_prognostication_trend, ")
--> 878     f.write(q(xls(a, 'L17')) + ",\n")
    879     f.write("             " + q(xls(a, 'L20')

It is not uncommon to encounter issues at this stage or later.  I can't overemphasize this: 
> **Finding, researching and reporting issues is hugely valuable for us, even if you don't fully solve them.**

As you work through issues, please keep a log of what you have done; it can help the next person to pick up where you leave off.  Our convention is to create a file named `changelog` in your solution directory, so the information stays with the solution.
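The log can also be kept from the notebook itself. The following is a minimal sketch: the helper name `log_change` and the dated one-line entry format are assumptions made for illustration, and the two messages are example entries. The only convention stated above is the `changelog` filename in the solution directory.

```python
from datetime import date
from pathlib import Path
import tempfile


def log_change(solution_dir, message):
    """Append a dated entry to the `changelog` file in solution_dir.

    Hypothetical helper: only the filename `changelog` comes from the
    project convention; the entry format is an assumption.
    """
    changelog = Path(solution_dir) / "changelog"
    with changelog.open("a", encoding="utf-8") as f:
        f.write(f"{date.today().isoformat()}: {message}\n")
    return changelog


# Demonstrated against a throwaway directory; in the notebook you would pass `outdir`.
demo_dir = Path(tempfile.mkdtemp())
log_change(demo_dir, "Example entry: extraction raised an exception in write_ad; investigating.")
log_change(demo_dir, "Example entry: re-ran extraction.")
changelog_text = (demo_dir / "changelog").read_text(encoding="utf-8")
print(changelog_text)
```

Appending rather than overwriting keeps earlier notes intact for the next person.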

## Load Code / Sniff Test

Once the code has been successfully extracted and placed into a directory in `solution/`, all the tools that work with solutions should become available.

In [4]:
# factory.one_solution_scenarios loads a single solution, by name.  It returns a constructor that can construct scenario objects
# for this solution, and a list of the scenario names.

(constructor,scenarios) = factory.one_solution_scenarios("residentialglass")

In [5]:
# What scenarios did we get?

scenarios

['PDS1-73p2050-2.75% Retrofit Rate (Integrated)',
 'PDS1-73p2050-Based on 2.75% Retrofit Rate',
 'PDS1-83p2050-2.75%retrofit rate-40%initial adoption-Plausible',
 'PDS1-97p2050-based on a 2.75% retrofit rate',
 'PDS2-100p2050-based on a 5% retrofit rate',
 'PDS2-87p2050-5% Retrofit Rate (Integrated)',
 'PDS2-87p2050-Based on 5% Retrofit Rate',
 'PDS2-93p2050-5.0-%retrofit rate-40%initial adoption-Drawdown',
 'PDS3-100p2050-based on an 8% retrofit rate',
 'PDS3-95p2050-8% Retrofit Rate (Integrated)',
 'PDS3-95p2050-Based on 8% Retrofit Rate',
 'PDS3-98p2050-8.0-%retrofit rate-40%initial adoption-Optimum']

In [6]:
# Let's build the 2nd one

myscenario = constructor(scenarios[1])
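Indexing by position works, but the scenario names themselves carry information. As a small sketch in plain Python, using a few of the names listed above, scenarios can be grouped by their `PDS1`/`PDS2`/`PDS3` prefix. The helper name `group_by_prefix` is made up for this example.

```python
from collections import defaultdict

# A few scenario names copied from the output above.
scenario_names = [
    'PDS1-73p2050-2.75% Retrofit Rate (Integrated)',
    'PDS1-73p2050-Based on 2.75% Retrofit Rate',
    'PDS2-87p2050-5% Retrofit Rate (Integrated)',
    'PDS3-95p2050-8% Retrofit Rate (Integrated)',
]


def group_by_prefix(names):
    """Group scenario names by the text before the first '-' (e.g. 'PDS1')."""
    groups = defaultdict(list)
    for name in names:
        prefix = name.split("-", 1)[0]
        groups[prefix].append(name)
    return dict(groups)


groups = group_by_prefix(scenario_names)
for prefix, names in sorted(groups.items()):
    print(prefix, len(names))
```

In the notebook, `group_by_prefix(scenarios)` would give the same grouping for the full list, and any of the resulting names can be passed to `constructor`.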

In [2]:
# %debug is your friend.

%debug

ERROR:root:No traceback has been produced, nothing to debug.


## Look at some results

TODO: it would be nice to put some examples below, for example showing a little graph of something.

In [28]:
myscenario.c2.co2_mmt_reduced()

| Year | World | OECD90 | Eastern Europe | Asia (Sans Japan) | Middle East and Africa | Latin America | China | India | EU | USA |
|---|---|---|---|---|---|---|---|---|---|---|
| 2014 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2015 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2016 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2017 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2018 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2019 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2020 | 39.412745 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2021 | 58.587796 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2022 | 77.415573 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2023 | 95.903612 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
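
Because the result is a pandas DataFrame indexed by year, the usual pandas operations apply. The sketch below rebuilds the `World` column from the rows shown above (so it runs without a loaded scenario) and computes the year-over-year increase. In the notebook you would start from `myscenario.c2.co2_mmt_reduced()` instead of the hand-built series.

```python
import pandas as pd

# World values for 2019-2023, copied from the output above.
world = pd.Series(
    [0.0, 39.412745, 58.587796, 77.415573, 95.903612],
    index=pd.Index([2019, 2020, 2021, 2022, 2023], name="Year"),
    name="World",
)

# Year-over-year change in the reduction; the first year has no predecessor
# and is dropped.
yearly_increase = world.diff().dropna()
print(yearly_increase)

# With matplotlib installed, `world.plot()` would draw this as a line chart.
```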


## Create Test Results

**This step requires the Excel application, and thus can only be run on Windows or Mac.**

Create a clean temporary directory to generate the test set in.  Put (a copy of) your Excel spreadsheet in that directory.

Follow the instructions in `tools/CREATING_EXPECTED_ZIP.md` to create the CSV files in that directory.

In [21]:
# Run the VB macros first!

# Assemble the resulting csv files into the expected_zip file

csvdirectory = Path("C:\\Working\\temp")
cez.create_expected_zip(csvdirectory)

In [7]:
# Move the resulting file where it belongs.

testdirectory = outdir / "tests"
testdirectory.mkdir(exist_ok=True)

!cp $csvdirectory/expected.zip $testdirectory/expected.zip
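
`!cp` depends on a Unix-style shell command being available, which is not guaranteed on Windows. A portable alternative, shown here as a sketch, is `shutil.copy` from the standard library. The directories below are temporary stand-ins so the example runs on its own; in the notebook you would use `csvdirectory` and `testdirectory` directly.

```python
import shutil
import tempfile
from pathlib import Path

# Stand-ins for csvdirectory and testdirectory.
src_dir = Path(tempfile.mkdtemp())
dst_dir = Path(tempfile.mkdtemp()) / "tests"
dst_dir.mkdir(parents=True, exist_ok=True)

# Placeholder file standing in for the real expected.zip.
(src_dir / "expected.zip").write_bytes(b"placeholder")

# Copy the file into the tests directory; shutil.copy returns the destination path.
copied = Path(shutil.copy(src_dir / "expected.zip", dst_dir / "expected.zip"))
print(copied.exists())
```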

## Create the Solution Test File

Copy the template file `tools/solution_test_template.py` to your new `tests` directory and give it a unique name based on the solution name:


In [8]:
solution_name='residentialglass'
solution_testfile_name=f"test_{solution_name}.py"

!cp tools/solution_test_template.py $testdirectory/$solution_testfile_name

Open the output file and replace all the occurrences of the string SOLUTION with the name of your solution.
Also replace the string IS_LAND with True or False depending on whether this is a Land solution or an RRS solution.
Save the file and exit.

## Run the Test

Now you can run your new test!

In [10]:
!python -m pytest $outdir

platform win32 -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: C:\Working\solutions, configfile: tox.ini
collected 2 items

solution\composting\tests\test_composting.py .F                          [100%]

___________________________ test_composting_results ___________________________

scenario_skip = None, test_skip = None, test_only = None

    @pytest.mark.slow
    def test_composting_results(scenario_skip=None, test_skip=None, test_only=None):
        """Test computed results against stored Excel results"""
        scenario_skip = scenario_skip or SCENARIO_SKIP
        test_skip = test_skip or TEST_SKIP
>       expected_result_tester.one_solution_tester(
            solution_name,
            expected_file, is_land=False,
            scenario_skip=scenario_skip, test_skip=test_skip, test_only=test_only)

solution\composting\tests\test_composting.py:31: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tools\expected_result_tester.py:10

If errors occur, look through the error output for an Excel range (like Q135:AA181 in the result above).  Search for this string in `tools/expected_result_tester.py` to find the specific test that failed.  From there, you work your way back to the same questions we were working on above: is this a failure in extraction, model code, the Excel workbook, or the test?  Rinse and repeat.
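
When the output is long, a regular expression can pull out candidate ranges. The sketch below assumes ranges look like `Q135:AA181` (one to three capital letters followed by digits on each side of a colon). The sample text is invented for the example, and `find_excel_ranges` is a hypothetical helper.

```python
import re

# Matches A1-style ranges such as Q135:AA181 or C37:C82.
RANGE_PATTERN = re.compile(r"\b[A-Z]{1,3}[0-9]+:[A-Z]{1,3}[0-9]+\b")


def find_excel_ranges(text):
    """Return the distinct Excel ranges found in text, in order of appearance."""
    seen = []
    for match in RANGE_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


# Invented sample standing in for pytest's error output.
sample_output = """
AssertionError: Solution: composting Scenario: 0 Q135:AA181 mismatch
also failed: C37:C82
repeated: Q135:AA181
"""
ranges = find_excel_ranges(sample_output)
print(ranges)
```

Each range it returns is a string you can then search for in `tools/expected_result_tester.py`.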

## Digging into the Excel Variations

The code in `__init__.py` creates a series of class objects: usually a `CustomAdoption` object, a `HelperTables` object, a `UnitAdoption` object, etc.  Because the Excel models have varied from each other, and over time, there are often quirks that need to be managed, especially for these three classes.  If you look at the definitions of those classes, you will see a set of parameters that control which variation the Python code should run in order to match the Excel behavior.  Look carefully at the formulas in your Excel and at the results the Excel is producing compared to what the Python is producing in the corresponding tests.  Many (but not all) issues can be solved by setting these special parameters correctly in the constructors in `__init__.py` for a specific solution.

## Controlling which Tests Run

The solution results tests actually run many, many tests.  You may want to skip past some of those tests to find and work on other issues.  There is a way to do that, but it requires running the tests from within Python, rather than via pytest.

If you look at the second function definition in your test file, you will see it takes some extra arguments that you can set:
 * `scenario_skip`: if present, an array of scenario indices to skip over
 * `test_skip`: if present, an array of strings that should match the descriptions of tests to skip
 * `test_only`: if present, an array of strings such that _only_ tests whose description matches one of them will be executed.

So for example, you could skip the second scenario, and only do the 'First Cost' and 'Operating Cost' tests, but skip the first 'First Cost' test, with the following (substituting your own solution name, of course):

In [1]:
import solution.afforestation.tests.test_afforestation as mytests
mytests.test_afforestation_results(scenario_skip=[1],test_only=['First Cost', 'Operating Cost'], test_skip=['C37:C82'])

Checking scenario 0: PDS-100p2050-Drawdown-CustomPDS-high-Aug2019
**** Skipped scenario 1 'PDS-100p2050-Drawdown-CustomPDS-High-July2019'
Checking scenario 2: PDS-100p2050-Optimum-CustomPDS-HighgrowthKreidenweis-July2018
Checking scenario 3: PDS-100p2050-Optimum-PDScustom-high-kreidnweis-Aug2019
Checking scenario 4: PDS-57p2050-Plausible-cusomPDS-avg-30Jan2020
Checking scenario 5: PDS-57p2050-Plausible-CustomPDS-Avg-Jan2020
Checking scenario 6: PDS-62p2050-Plausible-PDSCustom-low-Nov2019
Checking scenario 7: PDS-65p2050-drawdown-customPDS-30Jan2020
Checking scenario 8: PDS-65p2050-Drawdown-CustomPDS-high0.5stdv-Jan2020
Checking scenario 9: PDS-69p2050-Drawdown-PDSCustom-avg-Nov2019
Checking scenario 10: PDS-82p2050-Optimum-PDSCustom-high-Nov2019
Checking scenario 11: PDS-84p2050-Plausible-PDScustom-low-BookVersion1
Checking scenario 12: PDS-87p2050-plausible-avg-aug
Checking scenario 13: PDS-99p2050-Drawdown-Optimum-PDScustom-high-BookVersion1



# Tips

## Don't forget to restart the Jupyter Notebook kernel if you have modified code

If you change code, you need to either reload the library (the 3rd cell of this notebook) or restart the kernel.
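
To see what reloading does, here is a self-contained sketch using a throwaway module (the module name `reload_demo_mod` is made up). After the file on disk changes, the already-imported module keeps its old value until `importlib.reload` is called.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module in a temporary directory and make it importable.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "reload_demo_mod.py"
module_file.write_text("VALUE = 1\n")
sys.path.insert(0, str(module_dir))

import reload_demo_mod

before = reload_demo_mod.VALUE

# Edit the file on disk; the imported module object does not change yet.
# The new source has a different length, so cached bytecode is treated as stale.
module_file.write_text("VALUE = 2  # edited\n")
stale = reload_demo_mod.VALUE

# Reloading re-executes the module source in the existing module object.
importlib.invalidate_caches()
importlib.reload(reload_demo_mod)
after = reload_demo_mod.VALUE

print(before, stale, after)
```

This is the mechanism behind `importlib.reload(sxe)`. As noted near the top of this notebook, it does not work for solutions themselves, which need a kernel restart.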

## When comparing to Excel, make sure you've loaded the right Scenario

On the `ScenarioRecord` tab, cell `B9` shows the currently loaded scenario.  When a workbook is first opened, this is usually empty, meaning you don't know which scenario was last loaded.  Select the scenario you are debugging against from the dropdown, and click on 'Load Scenario'.

## Beautifier for Excel Formulas

Are you looking at an Excel formula with five nested `IF(...` expressions?  Try [https://www.excelformulabeautifier.com/](https://www.excelformulabeautifier.com/).  You're welcome.


# Contributing your Result

Ideally you end up with a clean test run.  But even if you don't, we want to use the work you have done.  If you have made it as far as getting a Scenario object to load, please create a PR with your result.  Make sure to include any changes you made, and your observations of what worked and didn't, in a `changelog` file in your solution directory.

Thank you for helping!