### Problem Statement

You are a data scientist / AI engineer at a meteorological consulting firm. You have been provided with a dataset named **`"weather_data.csv"`**, which includes detailed records of various weather conditions. The dataset comprises the following columns:

- `hours_sunlight`: The total number of hours of sunlight received in a day.
- `humidity_level`: The humidity level as a percentage.
- `daily_temperature`: The temperature recorded at the end of the day in degrees Celsius.

Your task is to use this dataset to build a linear regression model to predict the daily temperature based on the hours of sunlight and humidity level. You will need to split the data into training and test sets, train the model, and evaluate its performance using appropriate metrics.

**Import Necessary Libraries**

In [43]:
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

### Task 1: Data Preparation and Exploration
1. Import the data from the `"weather_data.csv"` file and store it in a variable df.
2. Display the number of rows and columns in the dataset.
3. Display the first few rows of the dataset to get an overview.
4. Check for any missing values in the dataset.

In [22]:

# Step 1: Import the data from the "weather_data.csv" file and store it in a variable 'df'
df = pd.read_csv("D:\\CodeBasics\\ML_Regression_Exercise3\\weather_data.csv")

# Step 2: Display the number of rows and columns in the dataset
df.shape

# Step 3: Display the first few rows of the dataset to get an overview
df.head()

# Only the last expression in a cell is rendered, so the shape is repeated here
df.shape

(49, 3)

In [13]:
# Step 4: Check for any missing values in the dataset
df.isna().sum()

hours_sunlight       0
humidity_level       0
daily_temperature    0
dtype: int64
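
The exploration steps above can be tried without the original file. The following is a minimal sketch that reads a few hypothetical rows from an in-memory CSV; the sample values are made up for illustration and are not taken from `weather_data.csv`.

```python
import io

import pandas as pd

# Hypothetical sample rows; the values are invented for illustration only
csv_text = """hours_sunlight,humidity_level,daily_temperature
8.0,50,21.5
6.5,60,19.2
10.0,40,24.1
"""

# pd.read_csv accepts a file-like object as well as a path
sample = pd.read_csv(io.StringIO(csv_text))

n_rows, n_cols = sample.shape            # number of rows and columns
missing_per_column = sample.isna().sum() # count of missing values per column

print(n_rows, n_cols)
print(sample.head())
print(missing_per_column)
```

A count of zero in every column, as in the notebook output above, means no imputation or row dropping is needed before modeling.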

### Task 2: Train a Linear Regression Model

1. Select the features (hours_sunlight, humidity_level) and the target variable (daily_temperature) for modeling.
2. Split the data into training and test sets with a test size of 30%.
3. Create a Linear Regression model and fit it using the training data.
4. Print the model's coefficients and intercept.

In [27]:
# Step 1: Select the features and target variable for modeling
features = ['hours_sunlight', 'humidity_level']
X = df[features]
y = df[['daily_temperature']]

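# Step 2: Split the data into training and test sets
# Note: test_size=0.25 holds out 25% of the rows, not the 30% the task asks for;
# the outputs below (36 training rows out of 49) come from this 25% split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=10)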
X_train.shape

(36, 2)
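
The `(36, 2)` shape above follows from how `train_test_split` sizes the test set: a fractional `test_size` is multiplied by the row count and rounded up, and the remaining rows go to training. The sketch below checks this on placeholder arrays with 49 rows (the names `X_demo` and `y_demo` are hypothetical) and compares the 25% split used in the code with the 30% split the task describes.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same number of rows as the dataset (49)
X_demo = np.arange(49).reshape(-1, 1)
y_demo = np.arange(49)

split_sizes = {}
for test_size in (0.25, 0.30):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_demo, y_demo, test_size=test_size, random_state=10
    )
    # Record (train rows, test rows) for each test fraction
    split_sizes[test_size] = (len(X_tr), len(X_te))
    print(f"test_size={test_size}: {len(X_tr)} train rows, {len(X_te)} test rows")
```

With 49 rows, `test_size=0.25` gives 36 training rows and 13 test rows, while `test_size=0.30` would give 34 and 15.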

In [39]:
# Step 3: Create a Linear Regression model and fit it using the training data
model = LinearRegression()
model.fit(X_train, y_train)  # Train the model on the training data

# Step 4: Print the model's coefficients and intercept
print('Model coefficients:', model.coef_)
print('Model intercept:', model.intercept_)

Model coefficients: [[ 0.98517225 -0.07570413]]
Model intercept: [17.38453711]
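
The coefficients and intercept define the fitted line directly: predicted temperature = intercept + 0.985 × `hours_sunlight` − 0.076 × `humidity_level`. In other words, each extra hour of sunlight raises the prediction by about 0.99 °C and each additional percentage point of humidity lowers it by about 0.08 °C, holding the other feature fixed. A minimal sketch of a manual prediction follows; the input day (8 hours of sunlight, 50% humidity) is a hypothetical example, not a row from the dataset.

```python
import numpy as np

# Coefficients and intercept copied from the output above
coef = np.array([0.98517225, -0.07570413])  # [hours_sunlight, humidity_level]
intercept = 17.38453711

# Hypothetical day: 8 hours of sunlight, 50% humidity
example_day = np.array([8.0, 50.0])

# Linear regression prediction: intercept + dot(coefficients, features)
predicted_temp = intercept + coef @ example_day
print(f"Predicted temperature: {predicted_temp:.2f} °C")
```

This is the same computation `model.predict` performs for each row of its input.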



### Task 3: Model Evaluation

1. Make predictions on the test set using the trained model.
2. Evaluate the model using Mean Squared Error (MSE) and R-squared (R2) metrics.
3. Print the MSE and R2 values.
4. Display the first few actual vs. predicted values for the daily temperature.

In [45]:
# Step 1: Make predictions on the test set using the trained model
y_pred = model.predict(X_test)

# Step 2: Evaluate the model using Mean Squared Error (MSE) and R-squared (R2) metrics
mse = mean_squared_error(y_test, y_pred)
print('MSE:', mse)
r2score = r2_score(y_test, y_pred)
print('R2 score:', r2score)

MSE: 0.10558258285909675
R2 score: 0.9822177145587306
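
To make the metrics concrete: MSE is the mean of the squared residuals, and R2 is one minus the ratio of the residual sum of squares to the total sum of squares around the mean. The reported MSE of about 0.106 corresponds to a root mean squared error of roughly 0.32 °C. The sketch below recomputes the metrics by hand and compares them with scikit-learn, including `mean_absolute_error`, which the notebook imports but does not use. The arrays are hypothetical values chosen for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical actual and predicted temperatures, for illustration only
y_true = np.array([20.0, 22.5, 19.0, 24.0])
y_hat = np.array([20.5, 22.0, 19.5, 23.0])

residuals = y_true - y_hat

# Manual definitions of the metrics
mse_manual = np.mean(residuals ** 2)
mae_manual = np.mean(np.abs(residuals))
rmse_manual = np.sqrt(mse_manual)
r2_manual = 1 - np.sum(residuals ** 2) / np.sum((y_true - y_true.mean()) ** 2)

# The manual values should agree with scikit-learn's implementations
print("MSE:", mse_manual, mean_squared_error(y_true, y_hat))
print("MAE:", mae_manual, mean_absolute_error(y_true, y_hat))
print("R2:", r2_manual, r2_score(y_true, y_hat))
print("RMSE:", rmse_manual)
```

RMSE and MAE are in the same units as the target (degrees Celsius here), which often makes them easier to interpret than MSE.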


In [54]:
# Step 3: Print the MSE and R2 values
print('MSE: ', mse)
print('R2_Score: ', r2score)

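# Step 4: Display the first few actual vs. predicted values for the daily temperature
# (not done in this run; the line below instead cross-checks R2 via model.score,
# which for LinearRegression returns R2 on the given data)
print('Model score (R2): ', model.score(X_test, y_test))

MSE:  0.10558258285909675
R2_Score:  0.9822177145587306
Model score (R2):  0.9822177145587306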
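
Step 4 asks for a side-by-side view of actual and predicted temperatures. A minimal sketch of one way to build it is shown below. Because `weather_data.csv` is not available here, it uses a synthetic stand-in dataset: the generating coefficients, noise level, and every name ending in `_demo` are assumptions for illustration, so the printed numbers will not match the notebook's results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for weather_data.csv (assumed values, illustration only)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "hours_sunlight": rng.uniform(4, 12, size=49),
    "humidity_level": rng.uniform(30, 90, size=49),
})
demo["daily_temperature"] = (
    17.0
    + 1.0 * demo["hours_sunlight"]
    - 0.08 * demo["humidity_level"]
    + rng.normal(0, 0.3, size=49)
)

X_demo = demo[["hours_sunlight", "humidity_level"]]
y_demo = demo["daily_temperature"]  # a Series keeps predictions one-dimensional

X_train_demo, X_test_demo, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=10
)

model_demo = LinearRegression().fit(X_train_demo, y_train_demo)
y_pred_demo = model_demo.predict(X_test_demo)

# Side-by-side comparison of actual and predicted values, plus the error
comparison = pd.DataFrame({
    "actual": y_test_demo.to_numpy(),
    "predicted": y_pred_demo,
})
comparison["error"] = comparison["actual"] - comparison["predicted"]
print(comparison.head())
```

In the notebook itself, `y` is a one-column DataFrame, so `y_pred` has shape `(n, 1)`; flattening it with `y_pred.ravel()` before building the comparison frame would be needed there.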
