Load data from a CSV file into a pandas DataFrame.

In [None]:
import pandas as pd

data = pd.read_csv('data.csv')

Explore the data by displaying the first few rows and summary statistics.

In [None]:
print(data.head())
print(data.describe())
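
By default, describe() summarizes only numeric columns, and neither call reports missing values. A minimal sketch of two extra checks follows. The small `sample` frame and its `Area` column are hypothetical stand-ins, since the real contents of data.csv are not shown here.

```python
import pandas as pd

# Hypothetical stand-in for data.csv; the real column set is not known here.
sample = pd.DataFrame({
    'Date': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'Area': [50.0, None, 80.0],
    'Price': [100.0, 150.0, 160.0],
})

# Column types: non-numeric columns such as a text 'Date' are skipped by describe().
print(sample.dtypes)

# Number of missing values per column.
missing = sample.isna().sum()
print(missing)
```

On the real data, the same calls would be `data.dtypes` and `data.isna().sum()`.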

Preprocess the data by splitting it into features and a target variable, then into training and test sets.

In [None]:
from sklearn.model_selection import train_test_split
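# Keep only numeric features; LinearRegression cannot fit on text columns such as an unparsed 'Date'.
X = data.drop('Price', axis=1).select_dtypes(include='number')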
y = data['Price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
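
To see what the split produces, the sketch below applies the same call to a hypothetical 10-row frame (the `Area` column is an assumption for illustration). With test_size=0.2, two rows go to the test set and eight to the training set, and no row appears in both.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 10-row frame standing in for the real data.
sample = pd.DataFrame({
    'Area': np.arange(10, dtype=float),
    'Price': np.arange(10, dtype=float) * 2,
})

X_demo = sample.drop('Price', axis=1)
y_demo = sample['Price']

# Same arguments as in the cell above: 20% test rows, fixed seed for repeatability.
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

print(Xd_train.shape, Xd_test.shape)
```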

Visualize the price by date using a line plot.

In [None]:
import matplotlib.pyplot as plt
# Parse dates so the x-axis is a time axis rather than one tick per text label.
plt.plot(pd.to_datetime(data['Date']), data['Price'])
plt.title('Price by Date')
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()
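
A line plot connects points in row order, so the rows should be sorted by date before plotting. The sketch below shows one way to parse and sort, using a hypothetical three-row frame. It assumes the `Date` column holds date strings, which the source does not confirm.

```python
import pandas as pd

# Hypothetical rows, deliberately out of chronological order.
sample = pd.DataFrame({
    'Date': ['2021-03-01', '2021-01-01', '2021-02-01'],
    'Price': [160.0, 100.0, 150.0],
})

# Convert the text column to datetime values, then sort chronologically.
sample['Date'] = pd.to_datetime(sample['Date'])
sample = sample.sort_values('Date').reset_index(drop=True)

print(sample)
```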

Create a Kernel Density Estimate (KDE) plot for the price.

In [None]:
import seaborn as sns
sns.kdeplot(data['Price'])
plt.title('KDE Plot for Price')
plt.show()

Generate a pair plot to visualize relationships between variables.

In [None]:
sns.pairplot(data)
# Figure-level title; plt.title would label only a single subplot of the grid.
plt.suptitle('Pair Plot', y=1.02)
plt.show()

Train a linear regression model using the training data.

In [None]:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)
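
After fitting, the learned coefficients and intercept can be inspected. The sketch below fits a separate model on synthetic data built from a known linear rule (Price = 3 * Area + 2 * Rooms + 5), so the recovered values can be checked against it. The `Area` and `Rooms` columns are hypothetical and are not taken from data.csv.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data generated from a known linear rule.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    'Area': rng.uniform(30, 120, 50),
    'Rooms': rng.integers(1, 6, 50).astype(float),
})
y_demo = 3.0 * X_demo['Area'] + 2.0 * X_demo['Rooms'] + 5.0

demo_model = LinearRegression().fit(X_demo, y_demo)

# One coefficient per feature, labeled with the column names.
coefs = pd.Series(demo_model.coef_, index=X_demo.columns)
print(coefs)
print('Intercept:', demo_model.intercept_)
```

For the model trained above, the equivalent would be `pd.Series(model.coef_, index=X_train.columns)`.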

Compare predictions with actual prices using a scatter plot.

In [None]:
predictions = model.predict(X_test)
plt.scatter(y_test, predictions)
plt.xlabel('Actual Prices')
plt.ylabel('Predicted Prices')
plt.title('Predictions vs Actual Prices')
plt.show()

Visualize the distribution of errors between actual and predicted prices.

In [None]:
sns.histplot(y_test - predictions, kde=True)
plt.title('Error Distribution')
plt.xlabel('Actual - Predicted Price')
plt.show()

Calculate and display evaluation metrics: Mean Squared Error and R-squared.

In [None]:
from sklearn.metrics import mean_squared_error, r2_score
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f'MSE: {mse}, R2: {r2}')
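
For reference, both metrics can be computed by hand: MSE is the mean of the squared residuals, and R-squared is 1 minus the ratio of the residual sum of squares to the total sum of squares around the mean. The sketch below uses four made-up values (not results from this notebook) and compares the manual results with scikit-learn. It also takes the square root of MSE to get RMSE, which is in the same units as the price.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up values for illustration only.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 10.0])

residuals = y_true - y_pred

# MSE: mean of squared residuals. RMSE: its square root.
mse_manual = np.mean(residuals ** 2)
rmse_manual = np.sqrt(mse_manual)

# R-squared: 1 - SS_res / SS_tot.
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(f'MSE: {mse_manual}, RMSE: {rmse_manual:.4f}, R2: {r2_manual}')
print(mean_squared_error(y_true, y_pred), r2_score(y_true, y_pred))
```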