This repository contains the code for a K-Nearest Neighbors (KNN) model built to classify customer segments in Türkiye using the teleCust1000T dataset. The project includes data cleaning, visualization, feature scaling, model training, and evaluation with accuracy metrics.

Prometheussx/knn-customer-segmentation

Customer Segmentation With K-NN

Project: Customer Segmentation with K-Nearest Neighbors Algorithm

This project aims to segment customers in the teleCust1000T dataset using the K-Nearest Neighbors (KNN) algorithm. The project involves data visualization, feature analysis, model training and evaluation, and identification of the optimal number of neighbors for KNN.

Data and Libraries

The project utilizes the following:

  • Data: teleCust1000T.csv containing information about customers, such as tenure, age, income, and customer category.
  • Libraries: NumPy, pandas, scikit-learn, seaborn, Matplotlib

You can install these libraries using pip:

pip install numpy
pip install pandas
pip install scikit-learn
pip install seaborn
pip install matplotlib
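
After installing, a quick check can confirm that every library is importable. The snippet below is a minimal sketch and not part of the repository; the helper name `missing_packages` is hypothetical. Note that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Import names match the pip names except scikit-learn, which is imported as `sklearn`.
REQUIRED = ["numpy", "pandas", "sklearn", "seaborn", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
print("missing:", ", ".join(missing) if missing else "none")
```

If anything is listed as missing, rerun the matching `pip install` command in the same environment.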

Project Structure

The project is organized into the following sections:

  1. Data Import and Exploration: Reads the CSV data, analyzes data distribution, and identifies potential outliers.

  2. Feature Selection: Selects relevant features for the KNN model.

  3. Data Preprocessing: Standardizes numeric features and encodes categorical features.

  4. Train-Test Split: Divides data into training and testing sets for model training and evaluation.

  5. KNN Model Training: Trains a KNN model with different values of K.

  6. Model Evaluation: Evaluates the performance of trained models using metrics like accuracy and confusion matrix.

  7. Finding Optimal K: Identifies the optimal number of neighbors for KNN based on model performance.

  8. Visualization: Plots data distributions, accuracy curves, and confusion matrices for different K values.

  9. Results and Conclusion: Summarizes key findings and interpretations of the KNN model's performance.
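
To make steps 3 to 7 concrete, here is a dependency-free sketch of what they do: standardize the features, split the data, classify by majority vote among the K nearest neighbors, and pick the K with the best test accuracy. It is an illustration under stated assumptions, not the repository's code. The project itself uses scikit-learn, the data here is synthetic rather than teleCust1000T.csv, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def standardize(train, test):
    """Scale columns to zero mean and unit variance using training statistics only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def knn_predict(train_X, train_y, point, k):
    """Majority vote among the k training rows closest to `point` (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy_by_k(train_X, train_y, test_X, test_y, k_values):
    """Return {k: test accuracy} for each candidate k."""
    scores = {}
    for k in k_values:
        hits = sum(knn_predict(train_X, train_y, x, k) == y
                   for x, y in zip(test_X, test_y))
        scores[k] = hits / len(test_y)
    return scores


# Synthetic stand-in data: four clusters, one per made-up customer category.
rng = random.Random(4)
centers = {1: (10, 30, 40), 2: (30, 45, 90), 3: (50, 35, 60), 4: (60, 55, 150)}
rows = [([rng.gauss(c, 8) for c in center], label)
        for label, center in centers.items() for _ in range(50)]
rng.shuffle(rows)

split = int(0.8 * len(rows))  # 80/20 train-test split
train, test = rows[:split], rows[split:]
train_X, test_X = standardize([x for x, _ in train], [x for x, _ in test])
train_y, test_y = [y for _, y in train], [y for _, y in test]

scores = accuracy_by_k(train_X, train_y, test_X, test_y, range(1, 11))
best_k = max(scores, key=scores.get)
print(f"best k = {best_k}, accuracy = {scores[best_k]:.2f}")
```

The scaling statistics come from the training rows only, so no information from the test set leaks into the model. Choosing K on the test set, as sketched here, gives an optimistic accuracy estimate; a separate validation split or cross-validation is the usual remedy.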

Clone the project repository:

git clone https://github.com/Prometheussx/knn-customer-segmentation.git
cd knn-customer-segmentation

Key Results

  • The implemented KNN model achieved a best accuracy of [accuracy value]% with [optimal K value] neighbors.
  • The model was able to successfully identify patterns and segment customers into different categories based on their features.
  • The results demonstrate the effectiveness of KNN for customer segmentation and provide valuable insights for targeted marketing campaigns.
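
An accuracy figure is easier to interpret next to a confusion matrix, which shows which categories get mistaken for which. The following is a small sketch with made-up labels for four hypothetical customer categories. It does not reproduce the project's results, and the function names are illustrative.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Rows are actual classes, columns are predicted classes, in sorted label order."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = Counter(zip(y_true, y_pred))
    matrix = [[counts[(actual, predicted)] for predicted in labels]
              for actual in labels]
    return labels, matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels, not project output.
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 2, 2, 2, 3, 1, 4, 4]

labels, matrix = confusion_matrix(y_true, y_pred)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(y_true, y_pred))  # 6 of 8 correct -> 0.75
```

Off-diagonal entries are the misclassifications; in this toy example one customer of category 1 was predicted as category 2 and one of category 3 as category 1.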

Future Work

  • This project can be extended by incorporating additional features and exploring other machine learning algorithms for customer segmentation.
  • Further analysis could be done to understand the influence of individual features on customer segmentation and develop explainable models.
  • The model could be integrated into a real-world application for customer targeting and personalized recommendations.

Images

The README.md file includes images to visualize data distributions, accuracy curves, and confusion matrices for different K values. This enhances the understanding of the project's results and provides visual aids for interpreting the KNN model's performance.

image

License

This project is released under the MIT License.

Author

Feel free to reach out if you have any questions or need further information about the project.
