# Nonlinear Regression with scikit-learn – Predicting Ice Cream Sales

### Project Description:
This project applies polynomial regression to ice cream sales, using temperature as the only feature.

### Objectives:
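* Fit a degree-2 polynomial regression of ice cream sales on temperature with a scikit-learn pipeline.
* Evaluate the model on a held-out test set using MSE and R².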

### Public dataset source:
[Kaggle Ice Cream Selling Data Set](https://www.kaggle.com/datasets/mirajdeepbhandari/polynomial-regression)

The data contains temperatures (°C) and the corresponding number of units of ice cream sold.

In [11]:
# Importing libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error, r2_score

In [3]:
# Establish file path and import data
path = 'ice_cream_sales.csv'
df = pd.read_csv(path)
df.head()

Unnamed: 0,Temperature (°C),Ice Cream Sales (units)
0,-4.662263,41.842986
1,-4.316559,34.66112
2,-4.213985,39.383001
3,-3.949661,37.539845
4,-3.578554,32.284531


In [13]:
# The relationship looks quadratic, so a degree-2 polynomial is fitted in a later cell
# Select the feature column and the target column
X = df[['Temperature (°C)']] # Input
y = df['Ice Cream Sales (units)'] # Target

In [15]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, random_state=42)

In [None]:
# Manual two-step version, kept commented out in favor of the more streamlined pipeline
# poly = PolynomialFeatures(degree=2)
# X_train_poly = poly.fit_transform(X_train)

# lr = LinearRegression()
# lr.fit(X_train_poly, y_train)
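
The commented-out cell above expands the features by hand and then fits a linear model, which is what a pipeline built with `make_pipeline` automates. The sketch below illustrates that the two routes should give the same predictions. It runs on synthetic stand-in data (an assumption, not the Kaggle file), and the names `manual_pred`, `pipe_pred`, and `same` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (an assumption, not the Kaggle file):
# a noisy parabola over a temperature-like range.
rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(60, 1))
y = 2.0 * X[:, 0] ** 2 + 3.0 + rng.normal(scale=1.0, size=60)
X_train, X_test = X[:48], X[48:]
y_train = y[:48]

# Manual route: expand the features, then fit an ordinary linear model.
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train)  # columns: 1, x, x^2
lr = LinearRegression().fit(X_train_poly, y_train)
# The test data is only transformed; the transformer is not re-fitted.
manual_pred = lr.predict(poly.transform(X_test))

# Pipeline route: the same two steps chained into one estimator.
pipe = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
pipe_pred = pipe.fit(X_train, y_train).predict(X_test)

same = np.allclose(manual_pred, pipe_pred)
print("Expanded training shape:", X_train_poly.shape)
print("Predictions match:", same)
```

One practical reason to prefer the pipeline is that the manual route needs a separate `poly.transform(X_test)` call before predicting, which is easy to forget or to replace by mistake with `fit_transform`.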

In [16]:
# Chain the polynomial feature transformer and the linear regression estimator into one pipeline
degree = 2  # Try 2, 3, etc.
model = make_pipeline(PolynomialFeatures(degree), LinearRegression())

# Fit the model
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Evaluate
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("MSE:", mse)
print("R² score:", r2)

MSE: 14.87879644098148
R² score: 0.843055137193884
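
The `degree` comment in the cell above suggests trying other values. One possible way to compare them is cross-validation rather than a single train/test split. The sketch below is a minimal illustration on synthetic quadratic data (an assumption, not the Kaggle file). The candidate degrees, fold count, and the `cv_r2` dictionary are illustrative choices, and the scores it prints say nothing about the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (an assumption): quadratic trend plus noise.
rng = np.random.default_rng(42)
X = rng.uniform(-5, 5, size=(80, 1))
y = 2.0 * X[:, 0] ** 2 + 3.0 + rng.normal(scale=2.0, size=80)

# Mean 5-fold cross-validated R² for each candidate degree.
cv_r2 = {}
for degree in (1, 2, 3, 4):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    cv_r2[degree] = scores.mean()

for degree, score in cv_r2.items():
    print(f"degree={degree}: mean CV R² = {score:.3f}")
```

To apply this to the notebook's data, `X` and `y` from the earlier cells could be passed in place of the synthetic arrays. A higher degree that does not clearly improve the cross-validated score is usually not worth the extra flexibility.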
