Exploratory Data Analysis (EDA), data cleaning, data preprocessing, and feature engineering.
Updated Feb 24, 2022 - HTML
Reviewed unstructured data to understand the patterns and natural categories that the data fits into. Used multiple algorithms and both empirically and theoretically compared and contrasted their results. Made predictions about the natural categories of multiple types in a dataset, then checked these predictions against the result of unsupervise…
End-to-end project: customer churn prediction using a Random Forest classifier with 97% accuracy. Covers pre-processing, EDA and visualization, fitting the data to the algorithm, and hyper-parameter tuning to reduce TN and FN values so that the model performs well on new data. Finally, the model is deployed as a Streamlit web app.
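The TN and FN values mentioned in this description are cells of a binary confusion matrix. As a minimal sketch of how those counts and the accuracy figure relate, assuming hypothetical labels (1 = churned, 0 = stayed) that are not taken from the project itself:

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a binary classifier."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "TN", "FP", "FN")}


def accuracy(counts):
    """Share of predictions that were correct (TP + TN over all cases)."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0


# Hypothetical toy labels, for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(counts))
```

Accuracy counts true negatives as correct predictions, so on an imbalanced churn dataset it is worth reading all four cells rather than the accuracy figure alone.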
Reviewed unstructured data to understand the patterns and natural categories that the data fits into. Made predictions about the natural categories of multiple types in a dataset, then checked these predictions against the result of unsupervised analysis.
This workshop is part of the "Machine Learning in R" graduate course held at University of Münster, School of Business and Economics (winter term 2020/21). 🎓
Forecasting Oil Prices with Time Series & Generalized Additive Models for Location, Scale and Shape
Which of the top 10 most common features or activities make trails in national parks the most popular or highest rated?
Homework Solutions for Statistical Learning Course as Computer Science B.Sc. Student at Department of Mathematical Sciences, Sharif University of Technology
A web application that estimates your app's rating in the Google Play Store based on the features you choose
Statistical Learning with Applications in R
Complete data analysis project for the Statistical Learning course, using a store dataset from Kaggle.
Basketball project for the Kaggle competition: "Kobe Bryant Shot Selection". While the competition is closed, my best submission currently places me in the 93rd percentile (top 7%) of the leaderboard for this competition.
Fraud Detection in Click Traffic on Mobile App Advertisements
Wind energy prediction employing PySpark in Databricks.
Investigate the reasons behind bankruptcy and attempt to identify early warning signs. Perform exploratory data analysis using pandas profiling, and apply missing-value treatment and oversampling.
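The missing-value treatment and oversampling steps named in this description can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: mean imputation and random oversampling are stand-ins for whichever methods the repository uses, and the `debt_ratio` feature and labels are hypothetical.

```python
import random
from statistics import mean


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [value for value in column if value is not None]
    fill = mean(observed)
    return [fill if value is None else value for value in column]


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class is the same size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: one feature with a gap; 1 = bankrupt, 0 = healthy.
debt_ratio = impute_mean([0.2, None, 0.4, 0.9, 0.3])
labels = [0, 0, 0, 1, 0]
rows, balanced = random_oversample([[value] for value in debt_ratio], labels)
print(balanced.count(0), balanced.count(1))
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.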
VAST Challenge MC2 using Python Altair
Creating Customer Segments
Collaboration with TSMC
Machine Learning Classification on Unbalanced Real World Dataset
Detecting potential corruption events from public expenditure time-series data