Project Overview
This project covers data analysis and predictive modeling in R, using RStudio. It works with three datasets (Iris, DHFR, and Boston Housing) to explore data patterns, build machine learning models, and present the results through visualizations. The focus is on using R packages and techniques to preprocess data, train models, and evaluate their performance.
Datasets Used
Iris Dataset: A classic dataset used for pattern recognition and classification.
DHFR Dataset: A dataset used to predict biological activity based on molecular features.
Boston Housing Dataset: A dataset used to predict median house prices based on various features.
For the Iris dataset, the project explored species-specific patterns and built a Support Vector Machine (SVM) model with a polynomial kernel. The model was developed using stratified random sampling and cross-validation to make species classification robust. Scatter plots and feature importance plots were used to communicate the model's results.
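The Iris workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it assumes the caret package (with kernlab for the SVM backend), and the seed, the 80/20 split, and the 10-fold cross-validation are arbitrary choices made for the example.

```r
# Sketch only: caret and kernlab are assumed to be installed;
# the seed, split ratio, and fold count are illustrative choices.
library(caret)

set.seed(100)

# Stratified split: createDataPartition samples within each species,
# so class proportions are preserved in both subsets.
train_idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# SVM with a polynomial kernel, tuned by 10-fold cross-validation.
svm_fit <- train(Species ~ ., data = train_set,
                 method = "svmPoly",
                 preProcess = c("center", "scale"),
                 trControl = trainControl(method = "cv", number = 10))

# Evaluate on the held-out rows.
pred <- predict(svm_fit, test_set)
print(confusionMatrix(pred, test_set$Species))
```

Because the split is stratified on `Species`, each class contributes the same share of rows to the training and test sets, which matters for small, balanced datasets like this one.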
For the DHFR dataset, the project focused on predicting biological activity. Data cleaning included removing zero-variance variables and handling missing data through imputation. SVM and Random Forest models were developed and evaluated, with parallel processing used to reduce training time. Visualizations of model performance and feature importance were used to check how reliable the predictions are.
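A rough sketch of the DHFR cleaning and Random Forest steps is shown below. It assumes the copy of the data that ships with caret (`data(dhfr)`, with the activity class in column `Y`) and the doParallel and randomForest packages; median imputation, two worker processes, and 5-fold cross-validation are illustrative assumptions rather than the project's confirmed settings.

```r
# Sketch only: caret, randomForest, and doParallel are assumed installed.
# Imputation method, worker count, and fold count are illustrative.
library(caret)
library(doParallel)

data(dhfr)  # bundled with caret; Y is the activity class

# Drop zero-variance predictors and impute any missing values with medians.
x  <- dhfr[, names(dhfr) != "Y"]
pp <- preProcess(x, method = c("zv", "medianImpute"))
x_clean <- predict(pp, x)

# Start a small cluster so caret can run resampling folds in parallel.
cl <- makePSOCKcluster(2)
registerDoParallel(cl)

set.seed(100)
rf_fit <- train(x = x_clean, y = dhfr$Y,
                method = "rf",
                trControl = trainControl(method = "cv", number = 5))

# Release the workers and fall back to sequential execution.
stopCluster(cl)
registerDoSEQ()

print(rf_fit)
plot(varImp(rf_fit), top = 20)  # feature importance for the top predictors
```

An SVM can be fitted the same way by changing `method` (for example to `"svmPoly"`), which makes it straightforward to compare the two models on identical resampling settings.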
For the Boston Housing dataset, the project predicted median house prices using a linear regression model. Preprocessing consisted of centering and scaling the features, followed by stratified random sampling to split the data into training and test sets. The model’s performance was assessed with scatter plots of actual versus predicted values, which show how closely the predictions track observed prices and give a starting point for examining the factors that drive them.
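The Boston Housing steps can be illustrated with a short sketch. It assumes the `BostonHousing` data from the mlbench package (target column `medv`) and caret for splitting and training; the seed and the 80/20 split are arbitrary choices for the example.

```r
# Sketch only: caret and mlbench are assumed installed;
# the seed and split ratio are illustrative.
library(caret)
library(mlbench)

data(BostonHousing)

set.seed(100)

# For a numeric outcome, createDataPartition stratifies on quantile groups
# of medv, so both subsets cover the full price range.
idx <- createDataPartition(BostonHousing$medv, p = 0.8, list = FALSE)
train_set <- BostonHousing[idx, ]
test_set  <- BostonHousing[-idx, ]

# Linear regression with predictors centered and scaled before fitting.
lm_fit <- train(medv ~ ., data = train_set,
                method = "lm",
                preProcess = c("center", "scale"))

pred <- predict(lm_fit, test_set)

# Actual versus predicted prices; points near the dashed line are well predicted.
plot(test_set$medv, pred,
     xlab = "Actual medv", ylab = "Predicted medv")
abline(0, 1, lty = 2)
```

Since the predictors are on a common scale, the magnitudes of the coefficients in `summary(lm_fit$finalModel)` can be compared directly as a rough guide to which features matter most.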
Contact
For questions, suggestions, or collaboration opportunities, feel free to reach out:
Email: herc.ju@gmail.com