Example from a Colab notebook, available [here](https://colab.research.google.com/gist/JohannesFerner/88773019cd385fe6ba0a9377a4779f40/mindsdb.ipynb).

`rental_price` is the column we want to learn to predict from the other columns in the file.

In [5]:
from mindsdb import MindsDB

# First we instantiate MindsDB
mdb = MindsDB()

In [2]:
import pandas as pd
import io
import requests
# Download the CSV so it can be inspected with pandas before training
url = "https://raw.githubusercontent.com/mindsdb/main/master/docs/examples/basic/home_rentals.csv"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')))
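
The decode-and-wrap step in the cell above can be tried without a network call. A minimal sketch using only the standard library: the `raw` bytes are a stand-in for `requests.get(url).content`, built from the first two rows shown in the `df.head()` output, and `csv.DictReader` is used in place of `pd.read_csv`. The exact column layout of the real file is an assumption here.

```python
import csv
import io

# Stand-in for requests.get(url).content: HTTP bodies arrive as bytes.
# Column layout is assumed from the df.head() output, not read from the real file.
raw = (
    b"number_of_rooms,number_of_bathrooms,sqft,location,days_on_market,"
    b"initial_price,neighborhood,rental_price\n"
    b"0,1,4848,great,10,2271,south_side,2271.0\n"
    b"1,1,674,good,1,2167,downtown,2167.0\n"
)

# Decode the bytes to text, then wrap the text in a file-like object,
# which is what CSV readers (including pd.read_csv) accept.
text = raw.decode("utf-8")
rows = list(csv.DictReader(io.StringIO(text)))

print(len(rows))
print(rows[0]["rental_price"])
```

Note that `csv.DictReader` leaves every value as a string, whereas `pd.read_csv` infers numeric types.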

In [3]:
df.head()

| | number_of_rooms | number_of_bathrooms | sqft | location | days_on_market | initial_price | neighborhood | rental_price |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 4848 | great | 10 | 2271 | south_side | 2271.0 |
| 1 | 1 | 1 | 674 | good | 1 | 2167 | downtown | 2167.0 |
| 2 | 1 | 1 | 554 | poor | 19 | 1883 | westbrae | 1883.0 |
| 3 | 0 | 1 | 529 | great | 3 | 2431 | south_side | 2431.0 |
| 4 | 3 | 2 | 1219 | great | 3 | 5510 | south_side | 5510.0 |


In [8]:
df.corr()

| | number_of_rooms | number_of_bathrooms | days_on_market | initial_price | rental_price |
|---|---|---|---|---|---|
| number_of_rooms | 1.0 | 0.772407 | -0.032797 | 0.930444 | 0.920471 |
| number_of_bathrooms | 0.772407 | 1.0 | -0.01582 | 0.841306 | 0.831994 |
| days_on_market | -0.032797 | -0.01582 | 1.0 | -0.220385 | -0.265681 |
| initial_price | 0.930444 | 0.841306 | -0.220385 | 1.0 | 0.998529 |
| rental_price | 0.920471 | 0.831994 | -0.265681 | 0.998529 | 1.0 |


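`rental_price` is strongly correlated with `initial_price` (0.9985), `number_of_rooms` (0.92), and `number_of_bathrooms` (0.83), and only weakly with `days_on_market` (-0.27), so I don't expect this to be a difficult problem.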
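
`df.corr()` reports Pearson correlation coefficients by default. A minimal sketch of that computation in plain Python, run on the five rows from the `df.head()` output. Five rows are far too few to reproduce the full-dataset values in the table, and the `pearson` helper is illustrative, not part of pandas or MindsDB.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# The five rows from the df.head() output
initial_price = [2271, 2167, 1883, 2431, 5510]
rental_price = [2271.0, 2167.0, 1883.0, 2431.0, 5510.0]
days_on_market = [10, 1, 19, 3, 3]

r_price = pearson(initial_price, rental_price)
r_days = pearson(days_on_market, rental_price)
print(round(r_price, 6), round(r_days, 6))
```

In these five rows `initial_price` equals `rental_price`, so the first coefficient comes out at (or within rounding of) 1, while the `days_on_market` coefficient is negative, the same sign as in the full table.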

In [6]:
%%time
# We tell MindsDB what we want to learn and from what data
mdb.learn(
    from_data=url, # path to the training data file (a URL also works)
    predict='rental_price', # the column we want to learn to predict
    model_name='home_rentals' # the name of this model
)



[START] DataExtractor
- Train: 3975 rows
- Test: 497 rows
- Validation: 565 rows
-- Total: 5037 rows
[END] DataExtractor, execution time: 0.155 seconds
[START] StatsGenerator
population_size=5037,  sample_size=2023  40.16%
[END] StatsGenerator, execution time: 0.348 seconds
[START] DataVectorizer
[END] DataVectorizer, execution time: 0.265 seconds
[START] ModelTrainer
Training: model home_rentals, epoch 0
Starting model...
Training model...
Test Error:0.16059516370296478, Accuracy:0.16008341784684021 | Best Accuracy so far: 0
Test Error:0.14714638888835907, Accuracy:0.35131176209433856 | Best Accuracy so far: 0.16008341784684021
[SAVING MODEL] Lowest ERROR so far! - Test Error: 0.14714638888835907, Accuracy: 0.35131176209433856
Test Error:0.13824787735939026, Accuracy:0.5685227261368614 | Best Accuracy so far: 0.35131176209433856
[SAVING MODEL] Lowest ERROR so far! - Test Error: 0.13824787735939026, Accuracy: 0.5685227261368614
Test Error:0.13353748619556427, Accuracy:0.703512442636438
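
To check the trend in the training output without reading it line by line, the `Test Error` lines can be parsed with a regular expression. This is a rough sketch that assumes the log format shown above stays stable. The pattern and variable names are illustrative, not something MindsDB provides.

```python
import re

# Evaluation lines copied from the training output
log = """\
Test Error:0.16059516370296478, Accuracy:0.16008341784684021 | Best Accuracy so far: 0
Test Error:0.14714638888835907, Accuracy:0.35131176209433856 | Best Accuracy so far: 0.16008341784684021
Test Error:0.13824787735939026, Accuracy:0.5685227261368614 | Best Accuracy so far: 0.35131176209433856
Test Error:0.13353748619556427, Accuracy:0.703512442636438
"""

# Capture the error and accuracy from each line that starts with "Test Error:"
pattern = re.compile(r"^Test Error:([0-9.]+), Accuracy:([0-9.]+)", re.MULTILINE)
history = [(float(err), float(acc)) for err, acc in pattern.findall(log)]

errors = [err for err, _ in history]
accuracies = [acc for _, acc in history]

# True when every evaluation improved on the previous one
error_decreasing = all(a > b for a, b in zip(errors, errors[1:]))
accuracy_increasing = all(a < b for a, b in zip(accuracies, accuracies[1:]))

print(len(history), error_decreasing, accuracy_increasing)
```

For the four evaluations logged here, the test error falls and the accuracy rises at every step.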

In [7]:
# Use the model to make predictions
result = mdb.predict(
    predict='rental_price',
    when={'number_of_rooms': 2, 'number_of_bathrooms': 1, 'sqft': 1190},
    model_name='home_rentals'
)

# You can now print the results
prediction = result.predicted_values[0]
print('The predicted price is ${price} with {conf} confidence'.format(
    price=prediction['rental_price'],
    conf=prediction['prediction_confidence']
))

[START] StatsLoader
[END] StatsLoader, execution time: 0.020 seconds
[START] DataExtractor
[END] DataExtractor, execution time: 0.002 seconds
[START] DataVectorizer
[END] DataVectorizer, execution time: 0.000 seconds
[START] ModelPredictor
Predict: model home_rentals, epoch 0
Starting model...
Inferring from model and data...
predicting batch...
Predict: model home_rentals [OK], TOTAL TIME: 0.04 seconds
[END] ModelPredictor, execution time: 0.039 seconds
The predicted price is $3166.38 with 0.65 confidence
