# Introduction

In this tutorial, we will walk through an example of updating a preexisting model. This is useful when you come across additional data that you want the model to take into account, without having to train it from scratch.

The main abstraction that Lightwood offers for this is the `BaseMixer.partial_fit()` method. To call it, you need to pass new training data and a held-out dev subset for internal mixer usage (e.g. early stopping). If you are using an aggregate ensemble, you will likely want to do this for every mixer. The convenient `PredictorInterface.adjust()` method does this automatically for you.
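To make the relationship between the two methods more concrete, here is a minimal sketch of the idea, not Lightwood's actual implementation. `ToyMixer` and `ToyPredictor` are hypothetical stand-ins invented for illustration; only the method names `partial_fit` and `adjust` come from the description above, and the way the `'old'` and `'new'` entries are routed to each mixer is an assumption.

```python
from typing import Dict, List


class ToyMixer:
    """Hypothetical stand-in for a mixer: it predicts the running mean of the target."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def fit(self, train_data: List[float]) -> None:
        # Initial training: start from scratch
        self.total = sum(train_data)
        self.count = len(train_data)

    def partial_fit(self, train_data: List[float], dev_data: List[float]) -> None:
        # Update the existing state with new data instead of starting over.
        # dev_data is accepted to mirror the interface described above (a real
        # mixer might use it for early stopping); this toy mixer ignores it.
        self.total += sum(train_data)
        self.count += len(train_data)

    def predict(self) -> float:
        return self.total / self.count


class ToyPredictor:
    """Hypothetical predictor that owns several mixers."""

    def __init__(self, mixers: List[ToyMixer]) -> None:
        self.mixers = mixers

    def learn(self, train_data: List[float]) -> None:
        for mixer in self.mixers:
            mixer.fit(train_data)

    def adjust(self, data: Dict[str, List[float]]) -> None:
        # Assumption: 'new' is the update data, 'old' plays the dev role
        for mixer in self.mixers:
            mixer.partial_fit(data['new'], data['old'])


predictor = ToyPredictor([ToyMixer(), ToyMixer()])
predictor.learn([1.0, 3.0])
predictor.adjust({'old': [1.0, 3.0], 'new': [5.0, 7.0]})
print([mixer.predict() for mixer in predictor.mixers])
```

The point of the sketch is only that a single `adjust` call fans out to one `partial_fit` call per mixer, each of which continues from the state left by the initial training.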


# Initial model training

First, let's train a Lightwood predictor for the `concrete strength` dataset:

In [1]:
from lightwood.api.high_level import ProblemDefinition, json_ai_from_problem, predictor_from_json_ai
import pandas as pd

In [2]:
# Load data
df = pd.read_csv('https://raw.githubusercontent.com/mindsdb/lightwood/staging/tests/data/concrete_strength.csv')

df = df.sample(frac=1, random_state=1)
train_df = df[:int(0.2*len(df))]
update_df = df[int(0.2*len(df)):int(0.8*len(df))]
test_df = df[int(0.8*len(df)):]

print(f'Train dataframe shape: {train_df.shape}')
print(f'Update dataframe shape: {update_df.shape}')
print(f'Test dataframe shape: {test_df.shape}')

Train dataframe shape: (206, 10)
Update dataframe shape: (618, 10)
Test dataframe shape: (206, 10)


Note that we have three different data splits.

We will use the `train` split for the initial model training. As you can see, it is only 20% of the total data we have. The `update` split will be used as training data to adjust/update our model. Finally, the held-out `test` split will give us a rough idea of the impact our updating procedure has on the model's predictive capabilities.
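The slice boundaries above are computed with `int()` truncation, so the exact split sizes depend on the row count. As a quick sanity check, the sketch below reproduces that arithmetic in plain Python. `split_sizes` is a hypothetical helper written for this tutorial, not part of Lightwood or pandas, and the row count of 1030 is simply the sum of the three shapes printed above.

```python
from typing import Tuple


def split_sizes(n_rows: int, train_end: float = 0.2, update_end: float = 0.8) -> Tuple[int, int, int]:
    """Return (train, update, test) row counts for the slicing scheme used above."""
    first_cut = int(train_end * n_rows)    # end of the train slice
    second_cut = int(update_end * n_rows)  # end of the update slice
    return first_cut, second_cut - first_cut, n_rows - second_cut


train_rows, update_rows, test_rows = split_sizes(1030)
print(train_rows, update_rows, test_rows)
```

Because the three slices share the same two cut points, they are disjoint and together cover every row exactly once.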

In [3]:
# Define predictive task and predictor
target = 'concrete_strength'
pdef = ProblemDefinition.from_dict({'target': target, 'time_aim': 200})
jai = json_ai_from_problem(df, pdef)

# We will keep the architecture simple: a single neural mixer, and a `BestOf` ensemble:
jai.outputs[target].mixers = [{
    "module": "Neural",
    "args": {
        "fit_on_dev": False,
        "stop_after": "$problem_definition.seconds_per_mixer",
        "search_hyperparameters": False,
    }
}]

jai.outputs[target].ensemble = {
    "module": "BestOf",
    "args": {
        "args": "$pred_args",
        "accuracy_functions": "$accuracy_functions",
    }
}

# Build and train the predictor
predictor = predictor_from_json_ai(jai)
predictor.learn(train_df)

INFO:lightwood-825583:Dropping features: []
INFO:lightwood-825583:Analyzing a sample of 979
INFO:lightwood-825583:from a total population of 1030, this is equivalent to 95.0% of your data.
INFO:lightwood-825583:Using 7 processes to deduct types.
INFO:lightwood-825583:Infering type for: slag
INFO:lightwood-825583:Infering type for: cement
INFO:lightwood-825583:Infering type for: flyAsh
INFO:lightwood-825583:Infering type for: id
INFO:lightwood-825583:Infering type for: water
INFO:lightwood-825583:Infering type for: superPlasticizer
INFO:lightwood-825583:Infering type for: coarseAggregate
INFO:lightwood-825583:Column cement has data type float
INFO:lightwood-825583:Column coarseAggregate has data type float
INFO:lightwood-825583:Column flyAsh has data type float
INFO:lightwood-825583:Column slag has data type float
INFO:lightwood-825583:Column id has

In [4]:
# Get predictions for the held-out test set
predictions = predictor.predict(test_df)
predictions

INFO:lightwood-825583:Dropping features: []
INFO:lightwood-825583:Cleaning the data
INFO:lightwood-825583:Featurizing the data
INFO:lightwood-825583:The block ICP is now running its explain() method
INFO:lightwood-825583:The block AccStats is now running its explain() method
INFO:lightwood-825583:AccStats.explain() has not been implemented, no modifications will be done to the data insights.


|     | prediction | truth | confidence | lower | upper |
|-----|------------|-------|------------|-------|-------|
| 0 | 52.993881 | 71.30 | 0.9991 | 32.340706 | 73.647056 |
| 1 | 27.877298 | 39.60 | 0.9991 | 7.224123 | 48.530473 |
| 2 | 18.540179 | 10.79 | 0.9991 | 0.000000 | 39.193354 |
| 3 | 16.238102 | 4.83 | 0.9991 | 0.000000 | 36.891276 |
| 4 | 32.959752 | 47.71 | 0.9991 | 12.306577 | 53.612927 |
| ... | ... | ... | ... | ... | ... |
| 201 | 47.220826 | 40.93 | 0.9991 | 26.567652 | 67.874001 |
| 202 | 42.638142 | 52.82 | 0.9991 | 21.984967 | 63.291317 |
| 203 | 31.631880 | 39.66 | 0.9991 | 10.978705 | 52.285054 |
| 204 | 29.147330 | 13.29 | 0.9991 | 8.494156 | 49.800505 |


# Updating the predictor

As previously mentioned, you can update any given mixer with a `BaseMixer.partial_fit()` call. If you have multiple mixers and want to update them all at once, you should use `PredictorInterface.adjust()`. 

Both methods take two encoded datasources (`EncodedDs`) as input. For `adjust()`, you need to wrap them in a dictionary with `'old'` and `'new'` keys: here, the original training data and the update data, respectively.

Let's `adjust` our predictor:

In [5]:
from lightwood.data import EncodedDs

train_ds = EncodedDs(predictor.encoders, train_df, target)
update_ds = EncodedDs(predictor.encoders, update_df, target)

predictor.adjust({'old': train_ds, 'new': update_ds})

INFO:lightwood-825583:Updating the mixers
INFO:lightwood-825583:Loss @ epoch 1: 0.06395960412919521
INFO:lightwood-825583:Loss @ epoch 2: 0.0760517530143261
INFO:lightwood-825583:Loss @ epoch 3: 0.06467204913496971
INFO:lightwood-825583:Loss @ epoch 4: 0.0686721174667279
INFO:lightwood-825583:Loss @ epoch 5: 0.059960046162207924
INFO:lightwood-825583:Loss @ epoch 6: 0.05878346599638462
INFO:lightwood-825583:Loss @ epoch 7: 0.059159028654297195
INFO:lightwood-825583:Loss @ epoch 8: 0.05405611855288347
INFO:lightwood-825583:Loss @ epoch 9: 0.054099527498086296
INFO:lightwood-825583:Loss @ epoch 10: 0.05619463324546814


In [6]:
new_predictions = predictor.predict(test_df)
new_predictions

INFO:lightwood-825583:Dropping features: []
INFO:lightwood-825583:Cleaning the data
INFO:lightwood-825583:Featurizing the data
INFO:lightwood-825583:The block ICP is now running its explain() method
INFO:lightwood-825583:The block AccStats is now running its explain() method
INFO:lightwood-825583:AccStats.explain() has not been implemented, no modifications will be done to the data insights.


|     | prediction | truth | confidence | lower | upper |
|-----|------------|-------|------------|-------|-------|
| 0 | 54.147115 | 71.30 | 0.9991 | 33.493941 | 74.800290 |
| 1 | 29.181826 | 39.60 | 0.9991 | 8.528651 | 49.835001 |
| 2 | 16.265376 | 10.79 | 0.9991 | 0.000000 | 36.918551 |
| 3 | 13.720440 | 4.83 | 0.9991 | 0.000000 | 34.373615 |
| 4 | 32.030441 | 47.71 | 0.9991 | 11.377266 | 52.683616 |
| ... | ... | ... | ... | ... | ... |
| 201 | 42.245359 | 40.93 | 0.9991 | 21.592184 | 62.898534 |
| 202 | 37.356423 | 52.82 | 0.9991 | 16.703248 | 58.009598 |
| 203 | 29.892014 | 39.66 | 0.9991 | 9.238839 | 50.545189 |
| 204 | 28.064979 | 13.29 | 0.9991 | 7.411804 | 48.718153 |


Nice! Our predictor was updated, and new predictions are looking good. Let's compare the old and new accuracies:

In [7]:
from sklearn.metrics import r2_score

old_acc = r2_score(predictions['truth'], predictions['prediction'])
new_acc = r2_score(new_predictions['truth'], new_predictions['prediction'])

print(f'Old Accuracy: {round(old_acc, 3)}\nNew Accuracy: {round(new_acc, 3)}')

Old Accuracy: 0.589
New Accuracy: 0.632


After updating, the R² score of the predictions for the held-out test set increases from 0.589 to 0.632.
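Note that the "accuracy" printed above is the coefficient of determination, not a classification accuracy. For readers who want to see what that number measures, here is a small from-scratch sketch of the standard definition, 1 - SS_res / SS_tot. The function name `r2` and the tiny input lists are illustrative only and are not taken from the concrete strength data.

```python
from typing import Sequence


def r2(truth: Sequence[float], prediction: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_truth = sum(truth) / len(truth)
    # Squared error of the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, prediction))
    # Squared error of a baseline that always predicts the mean of the truth
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


perfect = r2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # predictions match the truth
baseline = r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # predictions equal the mean
print(perfect, baseline)
```

A score of 1 means perfect predictions and a score of 0 means the model does no better than always predicting the mean, so a higher value after updating indicates that the adjusted predictor explains more of the variance in the test targets.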

# Conclusion

We have gone through a simple example of how Lightwood predictors can leverage newly acquired data to improve their predictions. The interface for doing so is fairly simple, requiring only the encoded old and new data and a single call to `adjust()`.

You can further customize the logic for updating your mixers by modifying the `partial_fit()` methods in them.
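As one illustration of what such a customization could look like, the sketch below implements a `partial_fit` that continues training from the current weights and uses the dev data for early stopping. Everything here is hypothetical: `ToyLinearMixer` is a one-weight toy model in plain Python, not a Lightwood class, and the `max_epochs` and `patience` parameters are assumptions made for the example. A real mixer would operate on encoded datasources rather than on lists of pairs.

```python
from typing import List, Tuple

Pairs = List[Tuple[float, float]]  # (feature, target) pairs


def mse(w: float, data: Pairs) -> float:
    """Mean squared error of the model y = w * x on the given pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


class ToyLinearMixer:
    """Hypothetical one-weight model y = w * x; not a Lightwood class."""

    def __init__(self, w: float = 0.0, lr: float = 0.01) -> None:
        self.w = w
        self.lr = lr

    def partial_fit(self, train_data: Pairs, dev_data: Pairs,
                    max_epochs: int = 200, patience: int = 5) -> int:
        """Continue training from the current weight; return the epochs run."""
        best_w, best_loss, bad_epochs = self.w, mse(self.w, dev_data), 0
        epochs_run = 0
        for _ in range(max_epochs):
            epochs_run += 1
            # One full-batch gradient step on the new training data
            grad = sum(2 * (self.w * x - y) * x for x, y in train_data) / len(train_data)
            self.w -= self.lr * grad
            # Early stopping is decided on the dev data only
            dev_loss = mse(self.w, dev_data)
            if dev_loss < best_loss:
                best_w, best_loss, bad_epochs = self.w, dev_loss, 0
            else:
                bad_epochs += 1
                if bad_epochs >= patience:
                    break
        self.w = best_w  # keep the weight that did best on the dev data
        return epochs_run


mixer = ToyLinearMixer(w=1.0)              # pretend w=1.0 came from initial training
new_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
dev_data = [(4.0, 8.0)]
epochs = mixer.partial_fit(new_data, dev_data)
print(epochs, round(mixer.w, 3))
```

The design choice worth noting is that the weight is never reset: the update starts from whatever the initial training produced, and the dev data only decides when to stop and which weight to keep.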