# Notebook to use a model

Once the model is trained and uploaded to the Artifacts Server, we can use it in a new notebook. Here, we apply the model with the Punch Python Kernel.

Since the data scientist does not have access to the production dataset, this notebook uses the test dataset. The notebook will later be packaged to generate a punchline, and that punchline will run on the production dataset.

### Adding dependencies to the environment

We reuse the PEX created in the previous notebook and add the model to the dependencies list.

In [None]:
%%punch_dependencies
additional-pex:demo:dependencies:1.0.0
model:demo:credit_card:1.0.0

++ java -Xmx1g -Xms256m -Dlog4j.configurationFile=/punch/conf/log4j2/log4j2-stdout.xml -cp /punch/resourcectl.jar com.github.punchplatform.resourcectl.ResourceCtl -u http://artifacts-server.punch-artifacts:4245 download -r additional-pex:demo:dependencies:1.0.0 -o /usr/share/punch/extlib/python


Resource additional-pex:demo:dependencies:1.0.0 downloaded to /usr/share/punch/extlib/python/dependencies-1.0.0.pex


++ java -Xmx1g -Xms256m -Dlog4j.configurationFile=/punch/conf/log4j2/log4j2-stdout.xml -cp /punch/resourcectl.jar com.github.punchplatform.resourcectl.ResourceCtl -u http://artifacts-server.punch-artifacts:4245 download -r model:demo:credit_card:1.0.0


Resource model:demo:credit_card:1.0.0 downloaded to /usr/share/punch/artifacts/demo/credit_card/1.0.0/credit_card_1.0.0.zip


SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
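
The resource identifiers used in the cell above appear to follow a four-part, colon-separated pattern. The sketch below splits such an identifier and rebuilds the download location seen in the log for the model. The field names (`kind`, `group`, `name`, `version`), the helper names, and the path layout are assumptions inferred from this single log output, not documented Punch behavior.

```python
import posixpath
from collections import namedtuple

# Hypothetical field names; Punch may name these parts differently.
Resource = namedtuple("Resource", "kind group name version")


def parse_resource(coordinates):
    """Split 'kind:group:name:version' into its four parts."""
    parts = coordinates.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated parts, got {len(parts)}")
    return Resource(*parts)


def guessed_model_path(resource, root="/usr/share/punch/artifacts"):
    """Rebuild the path layout observed in the download log (an assumption)."""
    filename = f"{resource.name}_{resource.version}.zip"
    return posixpath.join(
        root, resource.group, resource.name, resource.version, filename
    )


model = parse_resource("model:demo:credit_card:1.0.0")
print(guessed_model_path(model))
```

For `model:demo:credit_card:1.0.0`, this reproduces the path reported in the download message above.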



In [1]:
import mlflow

### Loading the model

Punch provides a line magic that stores the path of the model in a variable. We can then use this variable to load the model with the loader that matches the model type (MLflow in this example).

In [2]:
%punch_get_model --model demo:credit_card:1.0.0 --output model_path

List of files in the model directory:
	 requirements.txt
	 credit_card_1.0.0.zip
	 conda.yaml
	 MLmodel
	 model.pkl
	 python_env.yaml

Model path is available in model_path variable.


In [3]:
credit_card_model = mlflow.pyfunc.load_model(model_path)
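
Before loading, it can help to confirm that the directory really looks like an MLflow model, since MLflow expects an `MLmodel` descriptor file at its root (it appears in the file listing above). The following is a minimal sketch using only the standard library. `looks_like_mlflow_model` is a hypothetical helper, not part of Punch or MLflow, and the demonstration uses a temporary directory rather than the real `model_path`.

```python
import os
import tempfile


def looks_like_mlflow_model(path):
    """Return True if `path` is a directory that contains an MLmodel file."""
    return os.path.isdir(path) and os.path.isfile(os.path.join(path, "MLmodel"))


# Demonstration on a throwaway directory standing in for model_path.
with tempfile.TemporaryDirectory() as tmp:
    before = looks_like_mlflow_model(tmp)
    # Create an empty MLmodel file to mimic the layout of an MLflow model.
    open(os.path.join(tmp, "MLmodel"), "w").close()
    after = looks_like_mlflow_model(tmp)

print(before, after)
```

In the notebook itself, the same check could be applied to `model_path` just before the call to `mlflow.pyfunc.load_model`.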

### Reading data

In [4]:
%%punch_source --type s3 --name data -o 
bucket: demo
prefix: test/test.csv

Data is available in data variable.
Execution time: 0:00:00.217423


In [5]:
data = data[['distance_from_home', 'distance_from_last_transaction',
       'ratio_to_median_purchase_price', 'repeat_retailer', 'used_chip',
       'used_pin_number', 'online_order', 'fraud']]

### Adding parameters cell

You can define parameters whose values can be overridden when the punchline is executed.

In [6]:
#parameters
nb_rows = 10000

In [7]:
data = data[0:nb_rows]
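
The exact mechanism Punch uses to override the parameters cell is not shown in this notebook. The sketch below only illustrates the general idea: default values declared in the notebook are replaced by values supplied at execution time. `resolve_parameters` and the string-valued override are hypothetical and are not part of the Punch API.

```python
def resolve_parameters(defaults, overrides):
    """Return the defaults updated with overrides, rejecting unknown names."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    resolved = dict(defaults)
    for name, value in overrides.items():
        # Cast to the type of the default, so the string "500" becomes 500.
        resolved[name] = type(defaults[name])(value)
    return resolved


defaults = {"nb_rows": 10000}  # value from the parameters cell
params = resolve_parameters(defaults, {"nb_rows": "500"})
print(params["nb_rows"])
```

Casting to the type of the default keeps downstream code such as `data[0:nb_rows]` working when an override arrives as text.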

### Applying the model

In [8]:
data["prediction"] = credit_card_model.predict(data.drop('fraud', axis=1))
data.head()



| | distance_from_home | distance_from_last_transaction | ratio_to_median_purchase_price | repeat_retailer | used_chip | used_pin_number | online_order | fraud | prediction |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11.188842 | 0.067784 | 1.659848 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 1 | 8.359728 | 0.186258 | 0.495259 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 11.401608 | 17.712808 | 2.364811 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 3.102588 | 0.258822 | 4.853085 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 4.660351 | 2.72908 | 5.257262 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |


In [9]:
data.groupby(["fraud", "prediction"]).size()

fraud  prediction
0.0    0.0           9144
       1.0              1
1.0    1.0            855
dtype: int64
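
The grouped counts above form a confusion matrix, so the usual classification metrics can be derived from them. The sketch below copies the three counts from the output and treats the missing (fraud = 1.0, prediction = 0.0) group as zero rows. `classification_metrics` is an illustrative helper, not something the notebook defines.

```python
# Counts copied from the groupby output: (fraud, prediction) -> number of rows.
counts = {(0.0, 0.0): 9144, (0.0, 1.0): 1, (1.0, 1.0): 855}


def classification_metrics(counts):
    """Compute accuracy, precision and recall from (label, prediction) counts."""
    tn = counts.get((0.0, 0.0), 0)  # legitimate, predicted legitimate
    fp = counts.get((0.0, 1.0), 0)  # legitimate, predicted fraud
    fn = counts.get((1.0, 0.0), 0)  # fraud, predicted legitimate
    tp = counts.get((1.0, 1.0), 0)  # fraud, predicted fraud
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


metrics = classification_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On these 10,000 test rows, a single legitimate transaction is flagged as fraud and no fraudulent transaction is missed.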

### Save results

In [10]:
%%punch_sink --type s3 -df data
bucket: demo
path: results/df.csv

created results/df.csv object; bucket: demo ; etag: "2b74a3252136719561fd17f2b87b7525"
Data saved.
Execution time: 0:00:00.065528
