## Problem Statement

You are a data scientist / AI engineer at a meteorological consulting firm. You have been provided with a dataset named **"weather_data.csv"**, which includes detailed records of various weather conditions. The dataset comprises the following columns:

- **`hours_sunlight`:** The total number of hours of sunlight received in a day.
- **`humidity_level`:** The humidity level as a percentage.
- **`daily_temperature`:** The temperature recorded at the end of the day in degrees Celsius.

Your task is to use this dataset to build a linear regression model to predict the daily temperature based on the hours of sunlight and humidity level.
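For orientation, the model described above can be written as a single equation. This is only a sketch of the standard linear regression form; the symbols $\beta_0$, $\beta_1$, $\beta_2$ and $\hat{T}$ are notation introduced here, not names from the dataset.

```latex
\hat{T} = \beta_0 + \beta_1 \cdot \text{hours\_sunlight} + \beta_2 \cdot \text{humidity\_level}
```

Here $\hat{T}$ is the predicted `daily_temperature`, $\beta_0$ is the intercept, and $\beta_1$, $\beta_2$ are the coefficients learned from the data. Task 1 uses only the `hours_sunlight` term; Task 2 uses both.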

**Import Necessary Libraries**

In [1]:
# Import the necessary libraries

import pandas as pd
from sklearn.linear_model import LinearRegression

In [2]:
# A forward slash works on Windows, macOS, and Linux alike
df = pd.read_csv("datasets/weather_data.csv")

### Task 1: Train a Linear Regression with a Single Variable

1. Import the data from the "weather_data.csv" file and store it in a variable `df`.
2. Display the number of rows and columns in the dataset.
3. Display the first few rows of the dataset to get an overview.
4. Create a Linear Regression model and fit it using only the `hours_sunlight` variable to predict `daily_temperature`.
5. Print the model's coefficient and intercept.
6. Predict the daily temperature for the following hours of sunlight:
   - 5 hours
   - 8 hours
   - 12 hours

In [3]:
# Step 1: Import the data from the "weather_data.csv" file and store it in a variable 'df'
# (a forward slash keeps the path portable across operating systems)
df = pd.read_csv("datasets/weather_data.csv")

# Step 2: Display the number of rows and columns in the dataset
print(df.shape)

# Step 3: Display the first few rows of the dataset to get an overview
df.head()

(49, 3)


|   | hours_sunlight | humidity_level | daily_temperature |
|---|---|---|---|
| 0 | 10.5 | 65 | 22.3 |
| 1 | 9.2 | 70 | 21.0 |
| 2 | 7.8 | 80 | 18.5 |
| 3 | 6.4 | 90 | 17.2 |
| 4 | 8.1 | 75 | 19.4 |


In [10]:
# Step 4: Create a Linear Regression model and fit it using only the 'hours_sunlight' variable to predict 'daily_temperature'
model = LinearRegression()
model.fit(df[["hours_sunlight"]], df["daily_temperature"])

In [11]:
# Step 5: Print the model's coefficient and intercept
model.coef_, model.intercept_

(array([1.36753934]), 8.533832092006133)

In [17]:
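# Step 6: Predict the daily temperature for the following hours of sunlight: 5, 8, and 12
# Passing a DataFrame with the training column name avoids scikit-learn's feature-name warning
hours_to_predict = pd.DataFrame({"hours_sunlight": [5, 8, 12]})
temp_predict = model.predict(hours_to_predict)

# Print the predicted temperatures
for index_temp, temp in enumerate(temp_predict):
    print(index_temp, temp)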

0 15.371528783998372
1 19.474146799193715
2 24.944304152787506
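
The predictions above follow directly from the fitted line, so they can be cross-checked by hand. The sketch below plugs the coefficient and intercept printed in Step 5 into `temperature = coef * hours + intercept`. The names `COEF_SUNLIGHT`, `INTERCEPT`, and `predict_temperature` are illustrative and not part of the original notebook, and the coefficient is the rounded value shown in the output, so results agree only to several decimal places.

```python
# Coefficient and intercept copied from the Step 5 output above (rounded as printed).
COEF_SUNLIGHT = 1.36753934
INTERCEPT = 8.533832092006133


def predict_temperature(hours_sunlight):
    """Evaluate the fitted line: temperature = coef * hours + intercept."""
    return COEF_SUNLIGHT * hours_sunlight + INTERCEPT


hours_to_check = [5, 8, 12]
manual_predictions = [predict_temperature(h) for h in hours_to_check]

for hours, temp in zip(hours_to_check, manual_predictions):
    print(f"{hours} hours of sunlight -> {temp:.2f} degrees Celsius")
```

Because the coefficient is positive, each additional hour of sunlight raises the predicted temperature by roughly 1.37 °C.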




### Task 2: Train a Linear Regression with Multiple Variables

- Create a Linear Regression model and fit it using both `hours_sunlight` and `humidity_level` variables to predict `daily_temperature`.
- Print the model's coefficients and intercept.
- Predict the daily temperature for the following conditions:
    - Hours of sunlight: 5 hours, Humidity level: 60%
    - Hours of sunlight: 8 hours, Humidity level: 75%
    - Hours of sunlight: 12 hours, Humidity level: 50%

In [18]:
# Step 1: Create a Linear Regression model and fit it using both 'hours_sunlight' and 'humidity_level' variables to predict 'daily_temperature'
model = LinearRegression()
model.fit(df[["hours_sunlight", "humidity_level"]], df["daily_temperature"])

In [19]:
# Step 2: Print the model's coefficients and intercept
model.coef_, model.intercept_

(array([ 1.09803993, -0.05430624]), 14.833402427458843)

In [29]:
# Step 3: Predict the daily temperature for the following conditions:
# Hours of sunlight: 5 hours, Humidity level: 60%
# Hours of sunlight: 8 hours, Humidity level: 75%
# Hours of sunlight: 12 hours, Humidity level: 50%
test_data = pd.DataFrame([{"hours_sunlight": 5, "humidity_level": 60},
                          {"hours_sunlight": 8, "humidity_level": 75},
                          {"hours_sunlight": 12, "humidity_level": 50}])


In [30]:
model.predict(test_data)

array([17.06522775, 19.54475397, 25.29456968])
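
As with the single-variable model, these predictions can be reproduced by hand from the coefficients and intercept printed in Step 2. The sketch below is a plain-Python check; `COEFFICIENTS`, `INTERCEPT_MULTI`, and `predict_temperature_multi` are illustrative names that do not appear in the original notebook, and the coefficients are the rounded values from the output.

```python
# Coefficients (hours_sunlight, humidity_level) and intercept copied from the Step 2 output.
COEFFICIENTS = (1.09803993, -0.05430624)
INTERCEPT_MULTI = 14.833402427458843


def predict_temperature_multi(hours_sunlight, humidity_level):
    """Evaluate intercept + b1 * hours_sunlight + b2 * humidity_level."""
    b_sunlight, b_humidity = COEFFICIENTS
    return INTERCEPT_MULTI + b_sunlight * hours_sunlight + b_humidity * humidity_level


# The three (hours_sunlight, humidity_level) conditions listed in Task 2.
conditions = [(5, 60), (8, 75), (12, 50)]
manual_multi_predictions = [predict_temperature_multi(h, hum) for h, hum in conditions]

for (hours, humidity), temp in zip(conditions, manual_multi_predictions):
    print(f"{hours} h sunlight, {humidity}% humidity -> {temp:.2f} degrees Celsius")
```

The negative humidity coefficient means that, at a fixed number of sunlight hours, higher humidity lowers the predicted temperature slightly (about 0.05 °C per percentage point).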

In [1]:
import pandas as pd
df = pd.read_csv("datasets/home_prices.csv")
df

|   | area_sqr_ft | price_lakhs |
|---|---|---|
| 0 | 656 | 39.0 |
| 1 | 1260 | 83.2 |
| 2 | 1057 | 86.6 |
| 3 | 1259 | 59.0 |
| 4 | 1800 | 140.0 |
| 5 | 1325 | 80.1 |
| 6 | 1085 | 116.0 |
| 7 | 1110 | 45.0 |
| 8 | 1700 | 100.0 |
| 9 | 960 | 89.0 |
