# Toying Around with MindsDB

In this notebook, we take a quick look at MindsDB, a library that automatically trains a neural network in just a few lines of code and takes care of all the hard ML work for us.

For simplicity, no data pre-processing steps will be taken.

<a href="https://colab.research.google.com/github/ggasbarri/data-science-portfolio/blob/master/fun/minds_db.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
%%capture
!pip install mindsdb --user

You may need to restart the runtime at this point (Ctrl + M) so that the freshly installed package is picked up.

Now, we import the library:

In [0]:
from mindsdb import MindsDB  # the only name used in the cells below

Then, we fit a model on the Colab sample CSV `california_housing_train.csv`.
Insights on this data can be found [here](https://developers.google.com/machine-learning/crash-course/california-housing-data-description).
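
Before training, it can help to check which columns the file actually contains, since both `predict` and the keys of `when` further down must be column names. A minimal sketch using only the standard library; the helper name `describe_csv` is made up for this notebook and is not part of MindsDB:

```python
import csv


def describe_csv(lines):
    """Return (column_names, row_count) for CSV text that starts with a header row.

    `lines` can be an open file object or any iterable of text lines.
    """
    reader = csv.reader(lines)
    header = next(reader)               # first row holds the column names
    row_count = sum(1 for _ in reader)  # every remaining row is a data row
    return header, row_count


# Usage in Colab (assumes the sample file is present at this path):
# with open("sample_data/california_housing_train.csv", newline="") as handle:
#     columns, rows = describe_csv(handle)
#     print(columns, rows)
```

On the Colab sample, the row count should match the total that MindsDB reports in its `DataExtractor` log.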

This can take quite a while, since MindsDB searches for the best model it can build on its own (in my case, training took 744.85 seconds).
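
If you want to measure the training time yourself, one option is a small wrapper around the call. This is only a sketch; `timed` is a hypothetical helper defined here, not something MindsDB provides:

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Usage with the training call from the next cell:
# _, seconds = timed(
#     MindsDB().learn,
#     from_data="sample_data/california_housing_train.csv",
#     predict="median_house_value",
#     model_name="housing_price",
# )
# print("Training took {:.2f} seconds".format(seconds))
```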

In [5]:
MindsDB().learn(
    from_data="sample_data/california_housing_train.csv", # path to the file to learn from (can also be a URL)
    predict='median_house_value', # the column we want to learn to predict from the other columns in the file
    model_name='housing_price' # the name under which this model is stored
)



[START] DataExtractor
- Train: 13668 rows
- Test: 1643 rows
- Validation: 1689 rows
-- Total: 17000 rows
[END] DataExtractor, execution time: 0.298 seconds
[START] StatsGenerator
population_size=17000,  sample_size=2820  16.59%
[END] StatsGenerator, execution time: 0.222 seconds
[START] DataVectorizer
[END] DataVectorizer, execution time: 1.864 seconds
[START] ModelTrainer
Training: model housing_price, epoch 0
Starting model...
Training model...
Test Error:0.18670423328876495, Accuracy:0.006562607823397015 | Best Accuracy so far: 0
Test Error:0.19144566357135773, Accuracy:0.023733085490065697 | Best Accuracy so far: 0.006562607823397015
Test Error:0.1876358836889267, Accuracy:0.03364800032626125 | Best Accuracy so far: 0.006562607823397015
Test Error:0.18636871874332428, Accuracy:0.015645175243118703 | Best Accuracy so far: 0.006562607823397015
[SAVING MODEL] Lowest ERROR so far! - Test Error: 0.18636871874332428, Accuracy: 0.015645175243118703
Test Error:0.1890260875225067, Accuracy:
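
The row counts in the log show how MindsDB split the data. A quick bit of arithmetic on those numbers (copied from the log, nothing is recomputed from the file) shows that the split is roughly 80/10/10 and that the statistics sample matches the reported 16.59%:

```python
# Row counts copied from the DataExtractor and StatsGenerator log lines.
splits = {"train": 13668, "test": 1643, "validation": 1689}
total = sum(splits.values())

# Share of the whole dataset that ends up in each split.
fractions = {name: rows / total for name, rows in splits.items()}

# Share of rows that StatsGenerator sampled (sample_size / population_size).
sample_fraction = 2820 / total

for name, fraction in fractions.items():
    print("{}: {:.1%}".format(name, fraction))
print("stats sample: {:.2%}".format(sample_fraction))
```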

Now we can make predictions based on the learnt model:

In [8]:
# use the model to make predictions
result = MindsDB().predict(predict='median_house_value', 
                           when={
                               'households': 900.0,
                               'total_rooms': 2800.0,
                               'median_income': 11.0
                           },
                           model_name='housing_price')

[START] StatsLoader
[END] StatsLoader, execution time: 0.015 seconds
[START] DataExtractor
[END] DataExtractor, execution time: 0.003 seconds
[START] DataVectorizer
[END] DataVectorizer, execution time: 0.000 seconds
[START] ModelPredictor
Predict: model housing_price, epoch 0
Starting model...
Inferring from model and data...
predicting batch...
Predict: model housing_price [OK], TOTAL TIME: 0.03 seconds
[END] ModelPredictor, execution time: 0.030 seconds
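
Note that the `when` dictionary only sets three of the input columns. A typo in one of its keys is easy to miss, so a small check against the column names can be handy. The sketch below assumes the column list from the dataset description linked earlier; `unknown_keys` is a hypothetical helper, not part of MindsDB:

```python
# Column names as listed in the California housing data description
# (assumption: the Colab sample file uses exactly these headers).
COLUMNS = [
    "longitude", "latitude", "housing_median_age", "total_rooms",
    "total_bedrooms", "population", "households", "median_income",
    "median_house_value",
]


def unknown_keys(when, columns):
    """Return the keys of `when` that are not column names, sorted alphabetically."""
    return sorted(set(when) - set(columns))


when = {"households": 900.0, "total_rooms": 2800.0, "median_income": 11.0}
print(unknown_keys(when, COLUMNS))  # an empty list means every key is a known column
```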


And finally, profit...

In [9]:
# you can now print the results
prediction = result.predicted_values[0]
print('The predicted price is ${price} with {conf} confidence'.format(
    price=prediction['median_house_value'],
    conf=prediction['prediction_confidence']))

The predicted price is $320232. with 0.12 confidence
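
The raw output is a little rough around the edges. If you prefer a thousands separator and a percentage, Python's format specifiers can handle that. A minimal sketch, assuming both values are plain numbers; `format_prediction` is a name made up for this example:

```python
def format_prediction(price, confidence):
    """Format a price with a thousands separator and a confidence as a percentage."""
    return 'The predicted price is ${:,.0f} with {:.0%} confidence'.format(
        price, confidence)


# Usage with the result from the cell above:
# first = result.predicted_values[0]
# print(format_prediction(first['median_house_value'],
#                         first['prediction_confidence']))
```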
