# Lab6
This repository contains the code and analysis for a Machine Learning lab (ARTI308) focused on implementing Linear Regression. The project is divided into two parts, applying predictive modeling to two distinct industries: Real Estate and E-commerce.
The goal of this project is to build machine learning models that can predict continuous target variables based on various features.
- Part 1 (USA Housing): Helping a real estate agent predict house prices based on regional features.
- Part 2 (E-commerce): Helping an online business decide whether to focus their efforts on their mobile app experience or their website by predicting customer yearly spend.
- Python 3
- Jupyter Notebook
- Pandas (Data manipulation and analysis)
- NumPy (Numerical operations)
- Matplotlib & Seaborn (Data visualization)
- Scikit-Learn (Machine Learning model building and evaluation)
The USA Housing dataset (Part 1) contains information about houses in various regions of the United States.
- Features: Avg. Area Income, Avg. Area House Age, Avg. Area Number of Rooms, Avg. Area Number of Bedrooms, Area Population.
- Target Variable: Price (The price the house sold at).
The E-commerce dataset (Part 2) contains customer data from an online clothing store that offers both in-store style advice sessions and online platforms (App and Website).
- Features: Avg. Session Length, Time on App, Time on Website, Length of Membership.
- Target Variable: Yearly Amount Spent.
- Data Loading & Exploration: Importing the datasets, checking data types, and exploring basic statistical summaries (`head()`, `info()`, `describe()`).
- Exploratory Data Analysis (EDA): Visualizing data distributions and relationships between variables using Seaborn.
- Data Preparation: Splitting the data into feature variables (`X`) and target variables (`y`), and using `train_test_split` to create training and testing sets.
- Model Training: Initializing and fitting a `LinearRegression` model from Scikit-Learn.
- Model Evaluation: Interpreting model coefficients and calculating regression evaluation metrics (MAE, MSE, RMSE).
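The steps above can be sketched end to end as follows. This is a minimal illustration, not the lab's notebook: the data is synthetic (generated with made-up coefficients so the block runs without the dataset files), and the `test_size` and `random_state` values are assumptions. Only the column names come from the e-commerce dataset description.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the e-commerce data (NOT the lab's dataset).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Avg. Session Length": rng.normal(33, 1, n),
    "Time on App": rng.normal(12, 1, n),
    "Time on Website": rng.normal(37, 1, n),
    "Length of Membership": rng.normal(3.5, 1, n),
})
# Hypothetical linear relationship plus noise, chosen only for illustration.
df["Yearly Amount Spent"] = (
    25 * df["Avg. Session Length"]
    + 38 * df["Time on App"]
    + 0.2 * df["Time on Website"]
    + 61 * df["Length of Membership"]
    + rng.normal(0, 10, n)
)

# Data preparation: separate features (X) from the target (y).
features = [
    "Avg. Session Length",
    "Time on App",
    "Time on Website",
    "Length of Membership",
]
X = df[features]
y = df["Yearly Amount Spent"]

# Hold out 30% of the rows for testing (split ratio and seed are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101
)

# Model training.
model = LinearRegression()
model.fit(X_train, y_train)

# Model evaluation on the held-out test set.
predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
mse = mean_squared_error(y_test, predictions)
rmse = np.sqrt(mse)

# One coefficient per feature, labeled for easier reading.
coefficients = pd.DataFrame(model.coef_, index=features, columns=["Coefficient"])
print(coefficients)
print(f"MAE: {mae:.2f}  MSE: {mse:.2f}  RMSE: {rmse:.2f}")
```

To run the same steps on the real data, replace the synthetic `df` with the dataset loaded through `pd.read_csv`; the rest of the sketch should carry over unchanged as long as the column names match.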
Based on the Linear Regression coefficients of the e-commerce model (Part 2), we found how different customer behaviors relate to yearly spend. Each coefficient is the predicted change in Yearly Amount Spent for a one-unit increase in that feature, with the other features held fixed:
- Length of Membership has the largest coefficient. A one-unit increase in Length of Membership is associated with an increase of ~$61.28 in Yearly Amount Spent.
- Time on App is far more strongly tied to spending than Time on Website. A one-unit increase in Time on App is associated with an increase of ~$38.59 spent, whereas a one-unit increase in Time on Website is associated with an increase of only ~$0.19.
- Business Recommendation: The company should either focus on further developing their Mobile App (since it is already working well and driving sales) or overhaul their Website to catch up to the App's performance. Fostering customer loyalty (Length of Membership) should also be a top priority.
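As a quick check on how the coefficients reported above translate into predicted spend, the arithmetic can be sketched as below. The coefficient values are the ones quoted in this README; the helper `predicted_change` and the two-unit scenario are hypothetical and exist only for illustration.

```python
# Coefficients as reported in the findings above
# (dollars of Yearly Amount Spent per one-unit increase in the feature).
coefficients = {
    "Length of Membership": 61.28,
    "Time on App": 38.59,
    "Time on Website": 0.19,
}


def predicted_change(deltas):
    """Predicted change in Yearly Amount Spent for the given feature changes,
    assuming all other features are held fixed (hypothetical helper)."""
    return sum(coefficients[name] * delta for name, delta in deltas.items())


# Hypothetical scenario: a customer spends two more units of time
# on the app versus two more units on the website.
app_gain = predicted_change({"Time on App": 2})
web_gain = predicted_change({"Time on Website": 2})

# How many times larger the App coefficient is than the Website coefficient.
ratio = coefficients["Time on App"] / coefficients["Time on Website"]

print(f"Two more units on the App:     +${app_gain:.2f}")
print(f"Two more units on the Website: +${web_gain:.2f}")
print(f"App-to-Website coefficient ratio: {ratio:.0f}x")
```

These numbers describe associations in a linear model, so they support the recommendation above without proving that extra time on the app causes the extra spend.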
- Clone this repository to your local machine:
  `git clone https://github.com/your-username/your-repo-name.git`