In [1]:
%pip install --upgrade scikit-learn==0.23.0

Note: you may need to restart the kernel to use updated packages.


In [2]:
from sklearn.datasets import load_boston

In [3]:
X,y = load_boston(return_X_y=True)

In [4]:
from sklearn.neighbors import KNeighborsRegressor


KNeighborsRegressor is a machine learning model for regression based on the k-nearest neighbors (KNN) idea.
Let’s break it down step by step.

What problem does it solve?
It predicts a continuous value (like price, temperature, score, etc.) based on similar past data.
Example tasks:


- Predicting house prices
- Estimating exam scores
- Predicting temperature from weather features



What does “k-nearest neighbors” mean?
Instead of learning a formula, the model:

1. Stores all training data
2. When given a new data point:
   - Finds the k most similar data points (neighbors)
   - Uses their target values to make a prediction


“Similar” usually means close in distance (Euclidean distance by default).
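
To make the neighbor search concrete, here is a minimal pure-Python sketch of finding the k closest training points by Euclidean distance. The helper name `k_nearest` and the toy points are made up for illustration; scikit-learn's own search is more elaborate and is not reproduced here.

```python
import math

def k_nearest(train_points, query, k):
    """Return indices of the k training points closest to `query` (Euclidean)."""
    distances = [math.dist(p, query) for p in train_points]
    # Sort training indices by their distance to the query, nearest first
    order = sorted(range(len(train_points)), key=lambda i: distances[i])
    return order[:k]

# Hypothetical toy data: four 2-D training points and one query point
train_points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (1.0, 0.0)]
query = (0.9, 0.2)

neighbors = k_nearest(train_points, query, k=2)
print(neighbors)  # indices into train_points, nearest first
```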

How does KNeighborsRegressor make a prediction?
Suppose:


- k = 3
- Your neighbors’ target values are: 10, 12, 14


Then the prediction is usually:
$$\text{prediction} = \frac{10 + 12 + 14}{3} = 12$$
(You can also use weighted averaging, where closer neighbors matter more.)
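
The arithmetic above, plus the weighted variant, can be sketched in a few lines of plain Python. The function name `knn_average` and the distances 1, 2, 4 are assumptions chosen for illustration; inverse-distance weighting is used here to mirror what `weights='distance'` does.

```python
def knn_average(targets, distances=None):
    """Average neighbor targets; weight each by 1/distance when distances are given."""
    if distances is None:
        return sum(targets) / len(targets)
    weights = [1.0 / d for d in distances]  # assumes no distance is exactly zero
    return sum(w * t for w, t in zip(weights, targets)) / sum(weights)

targets = [10, 12, 14]

uniform = knn_average(targets)                               # plain mean
weighted = knn_average(targets, distances=[1.0, 2.0, 4.0])   # closest neighbor counts most

print(uniform, weighted)
```

With these made-up distances the weighted result is pulled below 12, toward the target of the closest neighbor (10).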

Important parameters
KNeighborsRegressor(
    n_neighbors=5,   # k value
    weights='uniform',  # or 'distance'
    metric='minkowski'  # distance measure (with the default p=2 this is Euclidean)
)

Key ones to know:


- n_neighbors: How many neighbors to look at
- weights:
  - 'uniform' → all neighbors count equally
  - 'distance' → closer neighbors have more influence (each neighbor is weighted by the inverse of its distance)
- metric: How distance is computed
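
As a rough illustration of how `weights` changes the result, the sketch below fits two regressors on the same toy data and queries the same point. The arrays `X_toy`, `y_toy`, and `query` are invented for this example and are not part of the notebook's dataset.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical 1-D toy data; the last point is far away and should be ignored for k=3
X_toy = np.array([[0.0], [1.0], [2.0], [10.0]])
y_toy = np.array([10.0, 12.0, 14.0, 100.0])
query = np.array([[0.5]])

uniform_model = KNeighborsRegressor(n_neighbors=3, weights="uniform").fit(X_toy, y_toy)
distance_model = KNeighborsRegressor(n_neighbors=3, weights="distance").fit(X_toy, y_toy)

uniform_pred = uniform_model.predict(query)[0]    # mean of the 3 nearest targets
distance_pred = distance_model.predict(query)[0]  # inverse-distance weighted mean

print(uniform_pred, distance_pred)
```

Both models pick the same three neighbors; only the averaging differs, so the distance-weighted prediction leans toward the two closest targets.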



Why is this model “non-parametric”?


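- It does not learn a fixed set of coefficients like linear regression
- There is no fitted equation
- fit() mainly stores the training data (optionally indexed in a KD-tree or ball tree); the actual computation happens at prediction time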
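
One way to see this in practice is to check that a fitted model exposes no coefficients and that its prediction can be rebuilt from the stored training targets. This is a small sketch on invented toy arrays (`X_toy`, `y_toy`), not a statement about the library's internals.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical toy data
X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = np.array([0.0, 10.0, 20.0, 30.0])

model = KNeighborsRegressor(n_neighbors=2).fit(X_toy, y_toy)

# Unlike LinearRegression, there is no coef_ attribute after fitting
has_coefficients = hasattr(model, "coef_")

# kneighbors() returns distances and indices of the nearest stored training rows
point = np.array([[1.2]])
distances, indices = model.kneighbors(point)
neighbor_targets = y_toy[indices[0]]

# With uniform weights, the prediction is the mean of those neighbors' targets
prediction = model.predict(point)[0]
print(has_coefficients, indices[0], neighbor_targets.mean(), prediction)
```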



Strengths
✔ Simple and intuitive
✔ Works well for small datasets
✔ No assumption about data shape

Weaknesses
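✘ Slow at prediction time on large datasets, because each query has to search the stored training data
✘ Sensitive to feature scaling (features with large ranges dominate the distance, so standardize or normalize them first)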
✘ Performance depends heavily on choosing a good k
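
The last two weaknesses are commonly handled together: scale the features inside a pipeline and pick k by cross-validation. The sketch below assumes synthetic data from `make_regression` and an arbitrary candidate list for k; neither comes from the notebook, and the best k will differ on real data.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption for this sketch)
X_syn, y_syn = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Scaling lives inside the pipeline so it is re-fit on each training fold
pipeline = make_pipeline(StandardScaler(), KNeighborsRegressor())

# make_pipeline names each step after its lowercased class name
param_grid = {"kneighborsregressor__n_neighbors": [1, 3, 5, 10, 20]}

search = GridSearchCV(pipeline, param_grid, cv=5)  # default scoring for regressors is R^2
search.fit(X_syn, y_syn)

best_k = search.best_params_["kneighborsregressor__n_neighbors"]
print(best_k, search.best_score_)
```

Keeping the scaler in the pipeline avoids leaking statistics from the validation folds into training.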

Very small example
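from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Hold out part of the X, y loaded above so there is data to predict on
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = KNeighborsRegressor(n_neighbors=3)
model.fit(X_train, y_train)  # stores the training data

prediction = model.predict(X_test)  # one predicted value per test row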


One-sentence summary
KNeighborsRegressor predicts a value by averaging the outputs of the k most similar training points.

In [5]:
mod = KNeighborsRegressor()


In [6]:
mod.fit(X,y)

KNeighborsRegressor()