Copyright © 2023, SAS Institute Inc., Cary, NC, USA.  All Rights Reserved.
SPDX-License-Identifier: Apache-2.0

# Automatic Generation of the requirements.json File
To validate Python models in a container publishing destination, the Python packages that provide the modules used in the Python score code file and its score resource files must be installed in the run-time container. You can install these packages when you publish a Python model, or a decision that contains a Python model, to a container publishing destination by adding to your model a `requirements.json` file that includes the package install statements.

This notebook provides an example execution and assessment of the `create_requirements_json()` function added in python-sasctl v1.8.0. The function helps create the instructions (the `requirements.json` file) for a lightweight Python container in SAS Model Manager. Here, lightweight means that the container installs only the packages found in the model's pickle files and Python scripts.
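To make the idea of "packages found in the model's Python scripts" concrete, here is a minimal sketch of how top-level imports could be collected from a script with the standard-library `ast` module. It is an illustration only and is not taken from python-sasctl; `find_imported_packages` is a hypothetical helper name.

```python
import ast


def find_imported_packages(source):
    """Return the set of top-level package names imported by a Python source string."""
    packages = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" -> "os"
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as "from . import helper" (level > 0)
            if node.module and node.level == 0:
                packages.add(node.module.split(".")[0])
    return packages
```

Note that a scan like this returns import names, including standard-library modules, so the result would still need filtering and name checking before it could be turned into install statements.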

### **User Warnings**
The methods used in this function can determine package dependencies and versions from the provided scripts and pickle files, but there are some caveats to consider:

1. If run outside of the development environment in which the model was created, the `create_requirements_json()` function **CANNOT** determine the required package _versions_ accurately.
2. Not all Python packages have matching import and install names, so some of the packages added to the `requirements.json` file may be incorrectly named (e.g. `import sklearn` vs `pip install scikit-learn`).

We therefore recommend reviewing the `requirements.json` file for package name and version accuracy before deploying to a run-time container in SAS Model Manager.
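One way to carry out the version part of that review is to compare each pinned version against what is installed in the current environment. The following is a minimal sketch using the standard-library `importlib.metadata` module; `installed_version` and `find_version_mismatches` are hypothetical helpers, not part of python-sasctl, and the sketch assumes each command has the form `pip install <name>==<version>`.

```python
from importlib import metadata


def installed_version(install_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(install_name)
    except metadata.PackageNotFoundError:
        return None


def find_version_mismatches(requirements):
    """Return (name, pinned, installed) tuples for entries that need a closer look."""
    mismatches = []
    for requirement in requirements:
        # Assumption: the package spec is the last token of the command
        spec = requirement["command"].split()[-1]
        name, _, pinned = spec.partition("==")
        installed = installed_version(name)
        # Flag packages that are not installed, have no pin, or differ from the pin
        if installed is None or installed != pinned:
            mismatches.append((name, pinned or None, installed))
    return mismatches
```

A package that is reported as not installed may simply be listed under its import name rather than its install name, which is the second caveat above.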

---

As an example, let's create the `requirements.json` file for the HMEQ Decision Tree Classification model created and uploaded in pzmmModelImportExample.ipynb. Simply import the function and point it at the model directory.

In [1]:
from pathlib import Path
from sasctl import pzmm

In [None]:
model_dir = Path.cwd() / "data/hmeqModels/DecisionTreeClassifier"
requirements_json = pzmm.JSONFiles.create_requirements_json(model_dir)

Let's take a quick look at what packages were determined for the Decision Tree Classifier model:

In [3]:
import json
print(json.dumps(requirements_json, sort_keys=True, indent=4))

[
    {
        "command": "pip install pandas==1.5.3",
        "step": "install pandas"
    },
    {
        "command": "pip install sklearn==0.23.1",
        "step": "install sklearn"
    },
    {
        "command": "pip install numpy==1.23.5",
        "step": "install numpy"
    }
]


Note that the function returned the import name `sklearn`, which is meant to refer to the scikit-learn package. Running `pip install sklearn` would not install the correct package, so the entry needs to be corrected before the file is used.

Let's fix the package name in Python, confirm that the pinned version is the one we expect, and rewrite the `requirements.json` file to match.

In [6]:
for requirement in requirements_json:
    if "sklearn" in requirement["step"]:
        requirement["command"] = requirement["command"].replace("sklearn", "scikit-learn")
        requirement["step"] = requirement["step"].replace("sklearn", "scikit-learn")

print(json.dumps(requirements_json, sort_keys=True, indent=4))

[
    {
        "command": "pip install pandas==1.5.3",
        "step": "install pandas"
    },
    {
        "command": "pip install scikit-learn==0.23.1",
        "step": "install scikit-learn"
    },
    {
        "command": "pip install numpy==1.23.5",
        "step": "install numpy"
    }
]
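The rename above handles one package by hand. A minimal sketch of a more general approach is a lookup table from import names to install names, applied to every entry. The `IMPORT_TO_INSTALL` table and `fix_package_names` helper are hypothetical and not part of python-sasctl; the table lists only a few well-known mismatches and would need to be extended for other packages.

```python
import copy

# Hypothetical, hand-maintained mapping of import names to pip install names
IMPORT_TO_INSTALL = {
    "sklearn": "scikit-learn",
    "cv2": "opencv-python",
    "PIL": "Pillow",
    "yaml": "PyYAML",
}


def fix_package_names(requirements, mapping=IMPORT_TO_INSTALL):
    """Return a copy of the requirements list with import names replaced by install names."""
    fixed = copy.deepcopy(requirements)
    for requirement in fixed:
        # Assumption: commands look like "pip install <name>" or "pip install <name>==<version>"
        prefix, _, spec = requirement["command"].rpartition(" ")
        name, separator, version = spec.partition("==")
        install_name = mapping.get(name, name)
        requirement["command"] = f"{prefix} {install_name}{separator}{version}"
        requirement["step"] = f"install {install_name}"
    return fixed
```

Returning a copy leaves the original list untouched, which makes it easy to compare the entries before and after the rename.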


In [5]:
with open(Path(model_dir) / "requirements.json", "w") as req_file:
    req_file.write(json.dumps(requirements_json, indent=4))

Now we have a complete and accurate requirements.json file for deploying models to containers in SAS Model Manager!
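As a final sanity check, the written file can be read back and inspected. The sketch below assumes, based only on the output shown in this notebook, that the file is a JSON list whose entries each carry a `step` and a `command` key; `find_requirement_problems` is a hypothetical helper, not part of python-sasctl.

```python
import json
from pathlib import Path


def find_requirement_problems(path):
    """Return a list of human-readable problems found in a requirements.json file."""
    entries = json.loads(Path(path).read_text())
    if not isinstance(entries, list):
        return ["top-level value is not a list"]
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index} is not an object")
            continue
        # Assumption: every entry needs both keys seen in the notebook output
        missing = {"step", "command"} - set(entry)
        if missing:
            problems.append(f"entry {index} is missing keys: {sorted(missing)}")
    return problems
```

Calling `find_requirement_problems(model_dir / "requirements.json")` would then return an empty list when every entry has the expected shape.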