JensBender/README.md

Profile-banner

Hi there! I'm Jens

👋 About Me

I'm an enthusiastic data scientist with over eight years of experience in data analysis, data visualization, and data storytelling. I enjoy solving challenging problems, harnessing the power of machine learning to derive valuable insights, and effectively communicating complex information.

🛠️ Skills

Programming: Python · MySQL
Data Manipulation: NumPy · Pandas
Data Visualization: Matplotlib · Plotly · Power BI
Machine Learning: scikit-learn · TensorFlow
Big Data: Spark
Web Framework: Flask
Cloud Computing: AWS
Version Control: Git
Code Editors: Jupyter Notebook · PyCharm · Spyder · Google Colab

💻 Portfolio

rental-price-prediction

  • Motivation: Simplify the process of finding rental properties in Singapore's expensive real estate market by using machine learning to estimate rental prices.
  • Data collection: Scraped 1680 property listings from an online property portal, including information on price, size, address, bedrooms, bathrooms and more.
  • Exploratory data analysis: Visualized property locations on an interactive map, generated a word cloud to extract insights from property agent descriptions, and examined descriptive statistics, distributions, and correlations.
  • Data preprocessing: Handled missing address data and engineered location-related features using the Google Maps API, extracted property features from agent descriptions, and systematically evaluated multiple outlier handling methods.
  • Model training: Trained five machine learning models with baseline configurations, then selected an XGBoost regression model with optimized hyperparameters, which achieved an RMSE of 995, a MAPE of 0.13, and an R² of 0.90 on the test dataset.
  • Model deployment: Built a web application with the Flask framework to serve the XGBoost model, containerized it with Docker, and deployed the container on render.com.
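
As an illustration of the three test metrics quoted above (RMSE, MAPE, R²), here is a minimal sketch using only the Python standard library. The function name `regression_metrics` and the sample rents are hypothetical and are not taken from the project, which may well compute these values with scikit-learn instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAPE (as a fraction), and R² for two equal-length sequences.

    Assumes no true value is zero (MAPE divides by it) and that the true
    values are not all identical (R² divides by their total variance).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Root mean squared error: same unit as the target.
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)

    # Mean absolute percentage error, expressed as a fraction (0.13 = 13%).
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - residual variance / total variance.
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mape": mape, "r2": r2}


if __name__ == "__main__":
    # Made-up monthly rents, for illustration only.
    actual = [2000, 3000, 4000, 5000]
    predicted = [2200, 2700, 4000, 5500]
    print(regression_metrics(actual, predicted))
```

Reporting all three together is useful because RMSE is in the unit of the target, MAPE is relative to each true value, and R² is relative to the variance of the target.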

hate-speech-detection

  • Motivation: Develop a hate speech detector for social media comments.
  • Data: Utilized the ETHOS Hate Speech Detection Dataset.
  • Models: Trained and evaluated three deep learning models using TensorFlow and scikit-learn. The fine-tuned BERT model performed best (78.0% accuracy), ahead of the LSTM (70.7%) and SimpleRNN (66.3%) models.
  • Deployment: Prepared the fine-tuned BERT model for production by integrating it into a web application and an API endpoint using the Flask web framework.
Figures: Fine-tuned BERT confusion matrix · Model deployment
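
To show how an accuracy figure relates to a confusion matrix like the one referenced above, here is a minimal standard-library sketch. The function names and the labels are hypothetical, and the encoding (1 = hate speech, 0 = not hate speech) is an assumption; the project's actual evaluation code is not shown in this README.

```python
def binary_confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels.

    Rows index the true class and columns the predicted class, which is
    the same layout scikit-learn uses for binary problems.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matrix = [[0, 0], [0, 0]]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth][pred] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal (correct predictions)."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    # Made-up labels, assuming 1 = hate speech and 0 = not hate speech.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    cm = binary_confusion_matrix(y_true, y_pred)
    print(cm, accuracy(cm))
```

The off-diagonal cells separate false positives from false negatives, which a single accuracy number hides.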

chatgpt-cover-letter-generator

  • Developed an AI-assisted cover letter generator that helps job seekers craft personalized, professional cover letters tailored to specific job offers.
  • Scraped job postings with Python and Beautiful Soup and used the ChatGPT API to extract key information, including requirements and tasks, in JSON format.
  • Used the ChatGPT API again to generate cover letter suggestions that align the candidate's education, work experience, skills, and motivation with the job's requirements and tasks.
{
  "employer": "OpenAI",
  "job title": "Research Scientist",
  "requirements": [
    "Track record of coming up with new ideas or improving upon existing ideas in machine learning",
    "Ability to own and pursue a research agenda",
    "Excitement about OpenAI's approach to research",
    "Nice to have: Interested in and thoughtful about the impacts of AI technology",
    "Nice to have: Past experience in creating high-performance implementations of deep learning algorithms"
  ],
  "tasks": [
    "Develop innovative machine learning techniques",
    "Advance the research agenda of the team",
    "Collaborate with peers across the organization"
  ],
  "contact person": "unknown",
  "address": "San Francisco, California, United States"
}
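
Model output that is supposed to be JSON is usually worth checking before it is used further. The following is a minimal sketch of such a check with Python's built-in `json` module, using the keys from the example above. The function `parse_job_posting` and the validation rules are assumptions for illustration; the README does not say whether the project validates the response this way.

```python
import json

# Keys taken from the example output above.
REQUIRED_KEYS = {
    "employer", "job title", "requirements",
    "tasks", "contact person", "address",
}
LIST_KEYS = ("requirements", "tasks")


def parse_job_posting(raw):
    """Parse a JSON string and check it has the expected job-posting shape.

    Raises ValueError on invalid JSON (json.JSONDecodeError is a subclass
    of ValueError), on missing keys, or on list fields of the wrong type.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")

    for key in LIST_KEYS:
        value = data[key]
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"'{key}' must be a list of strings")

    return data


if __name__ == "__main__":
    # Shortened, hypothetical response text for illustration.
    response_text = """
    {
      "employer": "OpenAI",
      "job title": "Research Scientist",
      "requirements": ["Ability to own and pursue a research agenda"],
      "tasks": ["Advance the research agenda of the team"],
      "contact person": "unknown",
      "address": "San Francisco, California, United States"
    }
    """
    posting = parse_job_posting(response_text)
    print(posting["employer"], len(posting["requirements"]))
```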

🏅 Course Certificates

Advanced SQL: MySQL for Ecommerce & Web Analytics, Udemy, February 2024, 🔗 see certificate
Skills: MySQL · SQL

AWS Certified Cloud Practitioner, AWS, January 2024, 🔗 see certificate
Skills: Amazon Web Services (AWS)

Ultimate AWS Certified Cloud Practitioner CLF-C02, Udemy, January 2024, 🔗 see certificate
Skills: Amazon Web Services (AWS)

Spark and Python for Big Data with PySpark, Udemy, January 2024, 🔗 see certificate
Skills: Spark · PySpark · AWS · Python · Machine Learning · Linear Regression · Logistic Regression · Decision Trees · Random Forest · Gradient Boosting · k-means clustering · Recommender Systems · Natural Language Processing (NLP)

Microsoft Power BI Data Analyst, Udemy, November 2023, 🔗 see certificate
Skills: Power BI

Deep Learning, alfatraining Bildungszentrum GmbH, April 2023
Skills: TensorFlow · NumPy · Natural Language Processing (NLP) · Python · Deep Learning · Recurrent Neural Networks (RNN) · Neural Networks · Scikit-Learn · Reinforcement Learning · Transfer Learning · Convolutional Neural Networks (CNN) · Time Series Analysis

Machine Learning by Stanford University & DeepLearning.AI, Coursera, April 2023, 🔗 see certificate
Skills: Decision Trees · Recommender Systems · Anomaly Detection · Python · Linear Regression · Neural Networks · Logistic Regression · Reinforcement Learning · Principal Component Analysis · k-means clustering

Python for Machine Learning & Data Science Masterclass, Udemy, March 2023, 🔗 see certificate
Skills: Decision Trees · Support Vector Machine (SVM) · Matplotlib · Random Forest · Naive Bayes · NumPy · Seaborn · Hierarchical Clustering · Natural Language Processing (NLP) · Pandas · Python · Linear Regression · Scikit-Learn · Logistic Regression · Principal Component Analysis · Gradient Boosting · DBSCAN · k-means clustering · K-Nearest Neighbors (KNN)

Machine Learning, alfatraining Bildungszentrum GmbH, February 2023
Skills: Decision Trees · Support Vector Machine (SVM) · Matplotlib · Naive Bayes · NumPy · Hierarchical Clustering · Pandas · Python · Linear Regression · Neural Networks · Scikit-Learn · Principal Component Analysis · DBSCAN · k-means clustering · K-Nearest Neighbors (KNN)

The Ultimate MySQL Bootcamp: Go from SQL Beginner to Expert, Udemy, December 2022, 🔗 see certificate
Skills: MySQL · SQL

👨‍💻 GitHub Statistics

Top Languages

©️ Credits

Profile banner GIF based on the video by RDNE Stock project from Pexels

Popular repositories

  1. rental-price-prediction (Public)

    A machine learning project for estimating rental property prices in Singapore.

    Jupyter Notebook 3

  2. chatgpt-cover-letter-generator (Public)

    AI-assisted cover letter writing leveraging the ChatGPT API and webscraping techniques.

    Jupyter Notebook 2 2

  3. internet-speed-database (Public)

    Utilize webscraping to automate internet speed tests and file complaints to the internet service provider via Twitter.

    Python

  4. hate-speech-detection (Public)

    Employing deep learning techniques to train and deploy a hate speech detection model for social media comments.

    PureBasic 1

  5. JensBender (Public)

  6. youtube-channel-analytics (Public)

    Jupyter Notebook