# Introduction

In this tutorial, we will walk through an example of updating a preexisting model. This is useful when you come across additional data that you want the model to take into account, without having to train it from scratch.

The main abstraction that Lightwood offers for this is the `BaseMixer.partial_fit()` method. To call it, you need to pass new training data and a held-out dev subset for internal mixer usage (e.g. early stopping). If you are using an aggregate ensemble, you will likely want to do this for every single mixer. The convenient `PredictorInterface.adjust()` method does this automatically for you.
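To make the idea of a partial fit more concrete, here is a minimal, library-free sketch. `ToyMixer`, its linear model, and the synthetic data are all hypothetical and are not part of Lightwood; the only thing borrowed from the description above is the shape of the call, which takes new training data plus a dev subset used for early stopping.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (feature, target) pair


class ToyMixer:
    """Hypothetical stand-in for a mixer. This is NOT Lightwood's BaseMixer."""

    def __init__(self, lr: float = 0.05) -> None:
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def predict(self, x: float) -> float:
        return self.w * x + self.b

    def loss(self, data: List[Sample]) -> float:
        # Mean squared error over a list of (x, y) samples
        return sum((self.predict(x) - y) ** 2 for x, y in data) / len(data)

    def partial_fit(self, train_data: List[Sample], dev_data: List[Sample],
                    max_epochs: int = 200, patience: int = 5) -> None:
        # Start from the current weights instead of re-initializing them
        best_loss, best_w, best_b = self.loss(dev_data), self.w, self.b
        epochs_without_improvement = 0
        for _ in range(max_epochs):
            for x, y in train_data:  # one pass of plain SGD
                err = self.predict(x) - y
                self.w -= self.lr * err * x
                self.b -= self.lr * err
            dev_loss = self.loss(dev_data)
            if dev_loss < best_loss:
                best_loss, best_w, best_b = dev_loss, self.w, self.b
                epochs_without_improvement = 0
            else:
                epochs_without_improvement += 1
                if epochs_without_improvement >= patience:
                    break  # early stopping driven by the dev subset
        # Keep the weights that scored best on the dev subset
        self.w, self.b = best_w, best_b


# Synthetic data following y = 2x + 1 (made up for this sketch)
old_data = [(x / 10, 2 * x / 10 + 1) for x in range(0, 5)]
new_data = [(x / 10, 2 * x / 10 + 1) for x in range(5, 10)]
dev_data = [(0.05, 1.1), (0.55, 2.1), (0.95, 2.9)]

mixer = ToyMixer()
mixer.partial_fit(old_data, dev_data)   # initial training
loss_before = mixer.loss(dev_data)
mixer.partial_fit(new_data, dev_data)   # update with the additional data
loss_after = mixer.loss(dev_data)
print(f'Dev loss before update: {loss_before:.4f}, after: {loss_after:.4f}')
```

Because this toy version restores the best weights seen on the dev subset, an update can never leave the dev loss worse than it was before the call. Real mixers may make different choices here.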


# Initial model training

First, let's train a Lightwood predictor for the `concrete strength` dataset:

In [1]:
from lightwood.api.high_level import ProblemDefinition, json_ai_from_problem, predictor_from_json_ai
import pandas as pd

In [2]:
# Load data
df = pd.read_csv('https://raw.githubusercontent.com/mindsdb/lightwood/staging/tests/data/concrete_strength.csv')

df = df.sample(frac=1, random_state=1)
train_df = df[:int(0.2*len(df))]
update_df = df[int(0.2*len(df)):int(0.8*len(df))]
test_df = df[int(0.8*len(df)):]

print(f'Train dataframe shape: {train_df.shape}')
print(f'Update dataframe shape: {update_df.shape}')
print(f'Test dataframe shape: {test_df.shape}')

Train dataframe shape: (206, 10)
Update dataframe shape: (618, 10)
Test dataframe shape: (206, 10)


Note that we have three different data splits.

We will use the `train` split for the initial model training. As you can see, it is only 20% of the total data we have. The `update` split (the next 60%) will be used as training data to adjust/update our model. Finally, the held-out `test` split (the remaining 20%) will give us a rough idea of the impact our updating procedure has on the model's predictive capabilities.
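The split sizes printed above follow directly from the slicing arithmetic. As a small sketch, the same boundaries can be reproduced on a plain list of 1030 row indices (the total row count implied by the three shapes); `split_bounds` is a helper name introduced here for illustration only.

```python
from typing import Tuple


def split_bounds(n_rows: int, train_frac: float = 0.2,
                 test_start_frac: float = 0.8) -> Tuple[int, int]:
    # Same truncating arithmetic as the slicing in the cell above
    train_end = int(train_frac * n_rows)
    update_end = int(test_start_frac * n_rows)
    return train_end, update_end


rows = list(range(1030))  # stand-in for the shuffled dataframe rows
train_end, update_end = split_bounds(len(rows))

train_rows = rows[:train_end]
update_rows = rows[train_end:update_end]
test_rows = rows[update_end:]

sizes = (len(train_rows), len(update_rows), len(test_rows))
print(sizes)  # (206, 618, 206)
```

The three slices are contiguous and non-overlapping, so every row lands in exactly one split.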

In [4]:
# Define predictive task and predictor
target = 'concrete_strength'
pdef = ProblemDefinition.from_dict({'target': target, 'time_aim': 200})
jai = json_ai_from_problem(df, pdef)

# We will keep the architecture simple: a single neural mixer, and a `BestOf` ensemble:
jai.outputs[target].mixers = [{
    "module": "Neural",
    "args": {
        "fit_on_dev": False,
        "stop_after": "$problem_definition.seconds_per_mixer",
        "search_hyperparameters": False,
    }
}]

jai.outputs[target].ensemble = {
    "module": "BestOf",
    "args": {
        "args": "$pred_args",
        "accuracy_functions": "$accuracy_functions",
    }
}

# Build and train the predictor
predictor = predictor_from_json_ai(jai)
predictor.learn(train_df)

INFO:lightwood-91181:Dropping features: []
INFO:lightwood-91181:Analyzing a sample of 979
INFO:lightwood-91181:from a total population of 1030, this is equivalent to 95.0% of your data.
INFO:lightwood-91181:Using 15 processes to deduct types.
INFO:lightwood-91181:Starting statistical analysis
INFO:lightwood-91181:Finished statistical analysis
INFO:lightwood-91181:Unable to import black formatter, predictor code might be a bit ugly.
INFO:lightwood-91181:Dropping features: []
INFO:lightwood-91181:Performing statistical analysis on data
INFO:lightwood-91181:Starting statistical analysis
INFO:lightwood-91181:Finished statistical analysis
INFO:lightwood-91181:Cleaning the data
INFO:lightwood-91181:Splitting the data into train/test
INFO:lightwood-91181:Preparing the encoders
INFO:lightwood-91181:Encoder prepping dict length of: 1
INFO:lightwood-91181:En

DEBUG:lightwood-91181:Loss @ epoch 72: 0.05157444253563881
DEBUG:lightwood-91181:Loss @ epoch 73: 0.05137106031179428
DEBUG:lightwood-91181:Loss @ epoch 74: 0.05131785199046135
DEBUG:lightwood-91181:Loss @ epoch 75: 0.05133713781833649
DEBUG:lightwood-91181:Loss @ epoch 76: 0.05156172439455986
INFO:lightwood-91181:Ensembling the mixer
INFO:lightwood-91181:Mixer: Neural got accuracy: 0.5960601553597429
INFO:lightwood-91181:Picked best mixer: Neural
INFO:lightwood-91181:Analyzing the ensemble of mixers
INFO:lightwood-91181:The block ICP is now running its analyze() method
INFO:lightwood-91181:The block AccStats is now running its analyze() method
INFO:lightwood-91181:The block GlobalFeatureImportance is now running its analyze() method
INFO:lightwood-91181:Adjustment on validation requested.
INFO:lightwood-91181:Updating the mixers
torch.cuda.amp.GradScaler is ena

In [6]:
# Get predictions for the held-out test set
predictions = predictor.predict(test_df)
predictions

INFO:lightwood-91181:Dropping features: []
INFO:lightwood-91181:Cleaning the data
INFO:lightwood-91181:Featurizing the data
INFO:lightwood-91181:The block ICP is now running its explain() method
INFO:lightwood-91181:The block AccStats is now running its explain() method
INFO:lightwood-91181:AccStats.explain() has not been implemented, no modifications will be done to the data insights.
INFO:lightwood-91181:The block GlobalFeatureImportance is now running its explain() method
INFO:lightwood-91181:GlobalFeatureImportance.explain() has not been implemented, no modifications will be done to the data insights.


| | prediction | truth | confidence | lower | upper |
|---|---|---|---|---|---|
| 0 | 51.193603 | 71.30 | 0.9991 | 30.540443 | 71.846764 |
| 1 | 28.503390 | 39.60 | 0.9991 | 7.850229 | 49.156551 |
| 2 | 18.356139 | 10.79 | 0.9991 | 0.000000 | 39.009300 |
| 3 | 16.062094 | 4.83 | 0.9991 | 0.000000 | 36.715254 |
| 4 | 32.623629 | 47.71 | 0.9991 | 11.970469 | 53.276790 |
| ... | ... | ... | ... | ... | ... |
| 201 | 45.633811 | 40.93 | 0.9991 | 24.980650 | 66.286972 |
| 202 | 41.613209 | 52.82 | 0.9991 | 20.960048 | 62.266369 |
| 203 | 31.297044 | 39.66 | 0.9991 | 10.643883 | 51.950204 |
| 204 | 29.409258 | 13.29 | 0.9991 | 8.756097 | 50.062418 |


## Updating the predictor

As previously mentioned, you can update any given mixer with a `BaseMixer.partial_fit()` call. If you have multiple mixers and want to update them all at once, you should use `PredictorInterface.adjust()`. 

For both of these methods, two encoded datasources are needed as input (for `adjust` you need to wrap them in a dictionary with 'old' and 'new' keys). 
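As a rough sketch of how these two calls relate, the snippet below loops over a list of mixers and forwards the two datasources to each one's `partial_fit()`. Everything in it is hypothetical: `RecordingMixer` and this free-standing `adjust` function are illustrative stand-ins, plain lists replace the encoded datasources, and the mapping of `'new'` to training data and `'old'` to the dev subset is an assumption rather than something this tutorial confirms.

```python
from typing import Dict, List


class RecordingMixer:
    """Hypothetical mixer that only records what partial_fit() received."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.calls: List[Dict[str, int]] = []

    def partial_fit(self, train_data: list, dev_data: list) -> None:
        self.calls.append({'train_rows': len(train_data), 'dev_rows': len(dev_data)})


def adjust(mixers: List[RecordingMixer], data: Dict[str, list]) -> None:
    # Assumption: 'new' is the training data and 'old' acts as the dev subset
    for mixer in mixers:
        mixer.partial_fit(data['new'], data['old'])


mixers = [RecordingMixer('mixer_a'), RecordingMixer('mixer_b')]
# Row counts mirror the train (206) and update (618) splits used in this tutorial
adjust(mixers, {'old': list(range(206)), 'new': list(range(618))})

for m in mixers:
    print(m.name, m.calls)
```

The point of the wrapper is simply that one call reaches every mixer, so no mixer in the ensemble is left behind on stale data.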

Let's `adjust` our predictor:

In [8]:
from lightwood.data import EncodedDs

train_ds = EncodedDs(predictor.encoders, train_df, target)
update_ds = EncodedDs(predictor.encoders, update_df, target)

predictor.adjust({'old': train_ds, 'new': update_ds})

INFO:lightwood-91181:Updating the mixers
torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
DEBUG:lightwood-91181:Loss @ epoch 1: 0.06545061928530534
DEBUG:lightwood-91181:Loss @ epoch 2: 0.0679960281898578
DEBUG:lightwood-91181:Loss @ epoch 3: 0.07171888339022796
DEBUG:lightwood-91181:Loss @ epoch 4: 0.07307156516859929
DEBUG:lightwood-91181:Loss @ epoch 5: 0.06360626469055812
DEBUG:lightwood-91181:Loss @ epoch 6: 0.06457449619968732
DEBUG:lightwood-91181:Loss @ epoch 7: 0.057915804286797844
DEBUG:lightwood-91181:Loss @ epoch 8: 0.06492673171063264


In [9]:
new_predictions = predictor.predict(test_df)
new_predictions

INFO:lightwood-91181:Dropping features: []
INFO:lightwood-91181:Cleaning the data
INFO:lightwood-91181:Featurizing the data
INFO:lightwood-91181:The block ICP is now running its explain() method
INFO:lightwood-91181:The block AccStats is now running its explain() method
INFO:lightwood-91181:AccStats.explain() has not been implemented, no modifications will be done to the data insights.
INFO:lightwood-91181:The block GlobalFeatureImportance is now running its explain() method
INFO:lightwood-91181:GlobalFeatureImportance.explain() has not been implemented, no modifications will be done to the data insights.


| | prediction | truth | confidence | lower | upper |
|---|---|---|---|---|---|
| 0 | 53.392253 | 71.30 | 0.9991 | 32.739093 | 74.045414 |
| 1 | 27.886292 | 39.60 | 0.9991 | 7.233132 | 48.539453 |
| 2 | 16.301788 | 10.79 | 0.9991 | 0.000000 | 36.954948 |
| 3 | 13.862827 | 4.83 | 0.9991 | 0.000000 | 34.515988 |
| 4 | 31.421035 | 47.71 | 0.9991 | 10.767875 | 52.074196 |
| ... | ... | ... | ... | ... | ... |
| 201 | 42.631037 | 40.93 | 0.9991 | 21.977876 | 63.284197 |
| 202 | 37.502444 | 52.82 | 0.9991 | 16.849283 | 58.155604 |
| 203 | 29.491487 | 39.66 | 0.9991 | 8.838326 | 50.144647 |
| 204 | 28.013570 | 13.29 | 0.9991 | 7.360410 | 48.666731 |


Nice! Our predictor was updated, and new predictions are looking good. Let's compare the old and new accuracies:

In [10]:
from sklearn.metrics import r2_score

old_acc = r2_score(predictions['truth'], predictions['prediction'])
new_acc = r2_score(new_predictions['truth'], new_predictions['prediction'])

print(f'Old Accuracy: {round(old_acc, 3)}\nNew Accuracy: {round(new_acc, 3)}')

Old Accuracy: 0.583
New Accuracy: 0.624


After updating, the R2 score of the predictions for the held-out test set increases from 0.583 to 0.624.
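For readers unfamiliar with the metric, R2 is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. The sketch below implements that formula in plain Python and applies it to the first five rows shown in the two prediction tables. Five rows are far too few to reproduce the scores above, so treat this only as an illustration of the calculation; `r2` is a helper written for this example.

```python
from typing import Sequence


def r2(truth: Sequence[float], pred: Sequence[float]) -> float:
    # R2 = 1 - SS_res / SS_tot
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


# First five rows of the prediction tables (before and after the update)
truth = [71.30, 39.60, 10.79, 4.83, 47.71]
old_pred = [51.193603, 28.503390, 18.356139, 16.062094, 32.623629]
new_pred = [53.392253, 27.886292, 16.301788, 13.862827, 31.421035]

old_r2 = r2(truth, old_pred)
new_r2 = r2(truth, new_pred)
print(f'Old R2 (5 rows): {old_r2:.3f}\nNew R2 (5 rows): {new_r2:.3f}')
```

A perfect predictor scores 1.0, and a predictor that always outputs the mean of the true values scores 0.0, which is a useful reference point when reading the scores above.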

## Conclusion

We have gone through a simple example of how Lightwood predictors can leverage newly acquired data to improve their predictions. The interface for doing so is fairly simple, requiring only some new data and a single call to update.

You can further customize the logic for updating your mixers by modifying the `partial_fit()` methods in them.