A machine learning project that analyzes a laptop dataset to understand what determines price and to predict laptop prices and types. It combines data analysis, visualization, and machine learning in Python.
This project explores a dataset containing information on various laptop models, including:
- Brand, Product, and Type
- Screen size and resolution
- RAM, storage, and CPU/GPU specifications
- Operating system, weight, and price in euros
The goal is to analyze the factors affecting laptop prices, build predictive models, and visualize data patterns.
The dataset laptop_prices.csv contains the following columns:
Company, Product, TypeName, Inches, Ram, OS, Weight, Price_euros, Screen, ScreenW, ScreenH, RetinaDisplay, CPU_company, CPU_freq, CPU_model, PrimaryStorage, SecondaryStorage, PrimaryStorageType, SecondaryStorageType, GPU_company, GPU_model
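A quick sanity check can confirm that a loaded file has the columns listed above before any analysis starts. This is a minimal sketch: the helper name missing_columns is hypothetical, and the expected list simply repeats the column names from this README.

```python
import pandas as pd

# Column names as listed in this README (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "Company", "Product", "TypeName", "Inches", "Ram", "OS", "Weight",
    "Price_euros", "Screen", "ScreenW", "ScreenH", "RetinaDisplay",
    "CPU_company", "CPU_freq", "CPU_model", "PrimaryStorage",
    "SecondaryStorage", "PrimaryStorageType", "SecondaryStorageType",
    "GPU_company", "GPU_model",
]


def missing_columns(df: pd.DataFrame) -> list:
    """Return the expected column names that are absent from df."""
    return [name for name in EXPECTED_COLUMNS if name not in df.columns]
```

After loading the CSV into df, an empty result from missing_columns(df) suggests the file matches the layout described here.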
The project addresses the following tasks:
- Brand Analysis: Identify the top 5 laptop brands by the number of products.
- Price Analysis: Calculate average laptop prices per brand and find the highest/lowest average.
- Correlation Analysis: Find correlations between Price_euros and numeric features like CPU_freq, Ram, Inches, and Weight.
- Feature Engineering: Create a new feature StorageTotal = PrimaryStorage + SecondaryStorage.
- Regression Modeling: Predict Price_euros using features like Ram, Inches, CPU_freq, PrimaryStorage, and GPU_company. Suggest the best regression model.
- Classification Modeling: Predict TypeName using features like Inches, Ram, PrimaryStorage, and Weight. Identify important features and the best classification model.
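The analysis and feature engineering tasks above can be sketched with pandas roughly as follows. This is an illustrative sketch, not the project's actual notebook code: the helper names are hypothetical, and it assumes that Ram, Weight, PrimaryStorage, and SecondaryStorage are already stored as numbers.

```python
import pandas as pd

# Numeric features named in the correlation task.
NUMERIC_FEATURES = ["CPU_freq", "Ram", "Inches", "Weight"]


def top_brands(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """Brands with the most products, largest count first."""
    return df["Company"].value_counts().head(n)


def average_price_by_brand(df: pd.DataFrame) -> pd.Series:
    """Mean Price_euros per brand, sorted from highest to lowest."""
    means = df.groupby("Company")["Price_euros"].mean()
    return means.sort_values(ascending=False)


def price_correlations(df: pd.DataFrame) -> pd.Series:
    """Pearson correlation of each numeric feature with Price_euros."""
    corr = df[NUMERIC_FEATURES + ["Price_euros"]].corr()
    return corr["Price_euros"].drop("Price_euros")


def add_storage_total(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with StorageTotal = PrimaryStorage + SecondaryStorage."""
    out = df.copy()
    out["StorageTotal"] = out["PrimaryStorage"] + out["SecondaryStorage"]
    return out
```

With df loaded from laptop_prices.csv, the first and last entries of average_price_by_brand(df) give the brands with the highest and lowest average price.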
- Load the dataset in a Jupyter Notebook or Google Colab:

    import pandas as pd
    df = pd.read_csv("laptop_prices.csv")

- Perform analysis and modeling using Python libraries like pandas, numpy, matplotlib, seaborn, and scikit-learn.
- Follow the project goals to answer all analysis and modeling questions.
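For the modeling questions, one possible starting point with scikit-learn is sketched below. It is a sketch under stated assumptions rather than the project's chosen method: the two candidate regressors, the random forest classifier, the R² scoring, and the helper names are all illustrative choices, and the numeric columns are assumed to contain no missing values.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Feature sets taken from the regression and classification tasks.
REG_NUMERIC = ["Ram", "Inches", "CPU_freq", "PrimaryStorage"]
REG_CATEGORICAL = ["GPU_company"]
CLF_FEATURES = ["Inches", "Ram", "PrimaryStorage", "Weight"]


def compare_regressors(df: pd.DataFrame, cv: int = 5) -> pd.Series:
    """Mean cross-validated R^2 for each candidate model, best first."""
    X = df[REG_NUMERIC + REG_CATEGORICAL]
    y = df["Price_euros"]
    # Candidate models are an assumption; swap in others as needed.
    candidates = {
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    }
    scores = {}
    for name, model in candidates.items():
        # One-hot encode GPU_company and pass numeric columns through unchanged.
        encode = ColumnTransformer(
            [("gpu", OneHotEncoder(handle_unknown="ignore"), REG_CATEGORICAL)],
            remainder="passthrough",
        )
        pipeline = make_pipeline(encode, model)
        scores[name] = cross_val_score(pipeline, X, y, cv=cv, scoring="r2").mean()
    return pd.Series(scores).sort_values(ascending=False)


def type_feature_importances(df: pd.DataFrame) -> pd.Series:
    """Fit a random forest on TypeName and rank features by importance."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(df[CLF_FEATURES], df["TypeName"])
    importances = pd.Series(clf.feature_importances_, index=CLF_FEATURES)
    return importances.sort_values(ascending=False)
```

Calling compare_regressors(df) ranks the candidates by cross-validated score, which gives a basis for suggesting a regression model. The importances from type_feature_importances(df) come from a model fit on the full data, so classification accuracy should be evaluated separately on held-out rows.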
Required libraries: pandas, numpy, matplotlib, seaborn, and scikit-learn.
Install via pip if needed:
pip install pandas numpy matplotlib seaborn scikit-learn