- The Web Application Firewall (WAF) project aims to build a robust, comprehensive solution for safeguarding web applications against a wide range of security threats. It combines machine learning-based threat detection with real-time monitoring and response mechanisms to protect web applications from common attacks such as Cross-Site Scripting (XSS), SQL Injection (SQLI), Command Injection (CMDI), and Path Traversal (PATHT).
- Project Paper: Project Paper Link
- Project Video: Project Video Link
- Project Website: Project Website Link
- This application helps you find the most suitable job based on your technological skills. Additionally, it suggests skills to develop further in your chosen field.
- Project steps: feature selection, data cleaning, feature engineering, EDA, data preprocessing, modeling, prediction pipeline, deployment
- Project article: Medium
- Project website: Streamlit
- The Hybrid-based Book Recommendation System encompasses an end-to-end development process, incorporating data exploration, cleaning, and the implementation of both content-based and collaborative filtering recommendation systems using Python.
- Project article: Medium
- This project builds a movie recommendation system that combines popularity-based and content-based filtering. The model integrates features such as genres, cast, crew, overview, release year, runtime, keywords, vote average, and director, drawn from a dataset of several thousand films. It uses cosine similarity and Euclidean distance to measure how close two movies are and to rank suggestions.
- Project steps: feature selection, data cleaning, feature engineering, data preprocessing, similarity matrix, prediction app, deployment
- Project website: Streamlit
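
The similarity step in a content-based recommender can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the `MOVIES` table, its one-hot genre vectors, and the `recommend` helper are all hypothetical names made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical one-hot genre features: [action, comedy, drama, sci-fi]
MOVIES = {
    "Movie A": [1, 0, 0, 1],
    "Movie B": [1, 0, 0, 1],
    "Movie C": [0, 1, 1, 0],
    "Movie D": [1, 0, 1, 0],
}


def recommend(title, movies, top_n=2):
    """Return the top_n titles most similar to `title` by cosine similarity."""
    target = movies[title]
    scores = [
        (other, cosine_similarity(target, vector))
        for other, vector in movies.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:top_n]]


print(recommend("Movie A", MOVIES))
```

Cosine similarity ignores vector length and compares direction only, which tends to suit sparse one-hot or text features. Euclidean distance is sensitive to scale, so numeric features such as runtime would usually need normalizing first.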
- Airbnb-Listing-EDA: This project performs an exploratory data analysis (EDA) on Airbnb listing data for a particular city. The analysis focuses on factors such as price, availability, location, and property type to identify trends and patterns in the demand for Airbnb listings in the city. The project includes data cleaning, visualization, and statistical analysis.
- Electronics store sales EDA: In this data analysis project, I explored 12 months of sales data for an electronics store. The dataset includes purchase information such as product types, costs, and purchase addresses.
- Startup expansion: In this project, I worked with a dataset covering key aspects of startup growth, such as location, marketing spending, and revenue. I also built a dashboard and a report to make the project's insights easy to digest.
- Tracking Maji Ndogo's water funds (Power BI): Our mission is to communicate with transparency: where did the money go? The project tracks the total budget against project completion, monitors teams' performance, and compares budgeted versus actual costs to flag potential corruption, promoting transparency and accountability in addressing Maji Ndogo's water crisis.
- FastML With EDA Python package (my own package): FastML_With_EDA is a versatile Python package designed to simplify the machine learning pipeline, from exploratory data analysis (EDA) and preprocessing to automated machine learning (AutoML) model training and evaluation. Whether you're a beginner or an experienced data scientist, FastML provides the tools you need to streamline your workflow and make data-driven decisions.
- I also built a web app that serves as a GUI for this package.
- Heart disease prediction (Logistic Regression): A classification model that predicts whether a person has heart disease based on physical features of that person (age, sex, cholesterol, etc.).
- Detecting a Rock or a Mine (KNN): A machine learning model that distinguishes a rock from a mine based on the responses at 60 separate sonar frequencies.
- Fraud in Wine (SVM): A machine learning model that attempts to predict whether a wine is "Legit" or "Fraud" based on various chemical features.
- Customer Churn (Tree Methods Focus): A model that predicts whether a customer will churn, using Decision Tree, Random Forest, AdaBoost, and Gradient Boosting.
- Text Classification (NLP, Naive Bayes): Classifies text reviews of films as positive or negative.
- Customer Segmentation for E-commerce (K-means, PCA): This project focuses on customer segmentation using K-means clustering and PCA (Principal Component Analysis). The goal is to identify distinct groups of customers based on their purchasing behavior in an e-commerce dataset. Customer segmentation enables personalized marketing strategies and recommendations.
- PCA Manual Implementation: Principal Component Analysis (PCA) is a dimensionality reduction technique commonly used in machine learning and data analysis. This project provides step-by-step instructions for implementing PCA.
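
As a rough illustration of what a manual PCA involves, the sketch below centers the data, builds the covariance matrix, and finds the first principal component. It is not the project's code: all function names are hypothetical, and it swaps in power iteration for a full eigendecomposition so that it needs only the standard library.

```python
import math


def mean_center(rows):
    """Subtract each column's mean so the data is centered at the origin."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    return [[row[j] - means[j] for j in range(dims)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix (divides by n - 1) of already-centered data."""
    n = len(centered)
    dims = len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def first_component(cov, iterations=200):
    """Approximate the top eigenvector and eigenvalue of `cov` by power iteration.

    Assumes `cov` is non-zero and the start vector is not orthogonal
    to the top eigenvector.
    """
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(dims))
    return v, eigenvalue


def project(centered, component):
    """Project each centered row onto the component, giving 1-D coordinates."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]


# Hypothetical data lying exactly on the line y = 2x
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = mean_center(data)
component, variance = first_component(covariance_matrix(centered))
scores = project(centered, component)
```

Because the sample points lie on a single line, the first component points along that line and captures all of the variance. Further components would require deflating the covariance matrix or using a full eigendecomposition.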
- Monthly Retail Sales Forecasting: This project forecasts monthly sales for clothing and clothing accessory stores using LSTM (Long Short-Term Memory) neural networks. The dataset consists of monthly sales data (in millions of dollars, not seasonally adjusted) retrieved from FRED (Federal Reserve Economic Data), Federal Reserve Bank of St. Louis.
- TensorFlow and Keras regression: This project analyzes housing data using Pandas, NumPy, TensorFlow, and Keras. It explores the data distribution and geographical properties, and performs feature engineering. A neural network model is built, trained, and evaluated for predicting house prices.
- TensorFlow and Keras classification: The project runs a t-test on each feature to assess its significance in predicting a binary target variable (breast cancer diagnosis), then selects the features with p-values below a significance level of 0.05. The data is split into training and testing sets, scaled, and used to train a neural network model. To address overfitting, the model uses early stopping and dropout.
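
The early stopping idea mentioned above can be sketched in plain Python as follows. This is only an illustration of the patience logic, not the project's Keras setup; the `early_stopping` function, its parameters, and the sample losses are hypothetical.

```python
def early_stopping(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs. Epochs are 0-indexed.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Ran out of epochs without triggering the stop condition.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improvement stalls after epoch 2
stop_epoch, best_epoch = early_stopping([0.9, 0.7, 0.6, 0.65, 0.64, 0.7], patience=2)
```

In Keras the same behavior is typically configured through the `EarlyStopping` callback, usually together with restoring the weights from the best epoch.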
- LendingClub Loan Default Prediction: This project builds a deep learning model to predict whether a borrower will repay their loan, based on historical loan data from LendingClub. LendingClub is a peer-to-peer lending platform that facilitates loan origination by connecting borrowers with investors. The dataset includes features such as loan amount, interest rate, borrower employment details, and credit history. The target variable is loan_status, which indicates whether a loan was "Fully Paid" or "Charged Off" (defaulted).
- Image Classification with CNN and Transfer Learning: This project uses a Convolutional Neural Network (CNN) with TensorFlow and Keras to classify cat and dog images. It demonstrates transfer learning with the VGG16 model, which improves model performance and reduces training time. It also offers insights into interpreting model decisions, helping to explain how the CNN makes predictions based on features learned from the images.
- Malaria Cell Image Classification: This project classifies cell images as either parasitized or uninfected using a Convolutional Neural Network (CNN). The dataset consists of cell images labeled according to whether the cells are parasitized by the malaria parasite. The goal is to build and train a CNN model that classifies the cell images accurately.
- Self-Organizing Map for Fraud Detection: An implementation of a Self-Organizing Map (SOM) for detecting fraud in credit card applications. The SOM is trained on a dataset of credit card applications to learn patterns and identify potential outliers that may indicate fraudulent activity.
- AutoEncoder for Movie Recommendations: This project implements a Stacked AutoEncoder (SAE) for movie recommendations using PyTorch. The SAE is trained on the MovieLens 100k dataset to learn the underlying patterns in user-movie interactions and predict ratings for unrated movies.
- Text classification using TF-IDF: The goal of this project is to develop a machine learning model that can accurately classify tweets as either related to real disasters or not.
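
For readers unfamiliar with TF-IDF, the weighting behind this kind of classifier can be sketched as below. This is the textbook formula (term frequency times `log(N / df)`) in plain Python, not the project's pipeline; the `tf_idf` function and the sample tweets are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Hypothetical tweets, made up for illustration
TWEETS = [
    "forest fire near town",
    "this concert is fire",
    "flood warning near river",
]
weights = tf_idf(TWEETS)
```

Terms that appear in many tweets, such as "fire" here, receive lower weights than rarer terms such as "forest". Library implementations often apply smoothing and normalization on top of this basic formula, so their numbers will differ.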
- Text Classification with GloVe and LSTM: This project uses LSTM neural networks for text classification on disaster-related messages. It preprocesses and tokenizes the text data, uses pre-trained word embeddings, and trains the model with Keras. Finally, it evaluates the model's performance, generates predictions, and creates a submission file for a Kaggle competition.
- Text Generation with GRU: This project uses TensorFlow and Keras to build a text generation model. It preprocesses Shakespearean text data and creates sequences for training. The model architecture includes an Embedding layer and a GRU layer. After training, the model generates text from a given starting seed.