TELCO CUSTOMER CHURN

About

A solution for customer churn prediction. The dataset used can be found here.

Built with

Python 3.9
Spark 3.2.1

Getting started

To get a local copy of the repository, run the following commands in your terminal:

$ cd <folder>
$ git clone https://github.com/charapennikaurm/telco_customer_churn

Install Python dependencies

$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

Install required Spark jars

$ wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.12.211/aws-java-sdk-1.12.211.jar -P $SPARK_HOME/jars/
$ wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-s3/1.12.211/aws-java-sdk-s3-1.12.211.jar -P $SPARK_HOME/jars/
$ wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-core/1.12.211/aws-java-sdk-core-1.12.211.jar -P $SPARK_HOME/jars/
$ wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-dynamodb/1.12.211/aws-java-sdk-dynamodb-1.12.211.jar -P $SPARK_HOME/jars/
$ wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.3.1/hadoop-aws-3.3.1.jar -P $SPARK_HOME/jars/
$ wget https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kinesis-asl_2.13/3.2.1/spark-streaming-kinesis-asl_2.13-3.2.1.jar -P $SPARK_HOME/jars/
$ wget https://repo1.maven.org/maven2/com/qubole/spark/spark-sql-kinesis_2.12/1.2.0_spark-3.0/spark-sql-kinesis_2.12-1.2.0_spark-3.0.jar -P $SPARK_HOME/jars/
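
Since a missing jar usually surfaces only as a class-not-found error at runtime, it can help to verify the downloads first. The following is a minimal sketch, not part of the repository: the helper name `missing_jars` is hypothetical, and the file names are simply copied from the wget commands above.

```python
from pathlib import Path

# Jar file names taken from the wget commands above.
REQUIRED_JARS = [
    "aws-java-sdk-1.12.211.jar",
    "aws-java-sdk-s3-1.12.211.jar",
    "aws-java-sdk-core-1.12.211.jar",
    "aws-java-sdk-dynamodb-1.12.211.jar",
    "hadoop-aws-3.3.1.jar",
    "spark-streaming-kinesis-asl_2.13-3.2.1.jar",
    "spark-sql-kinesis_2.12-1.2.0_spark-3.0.jar",
]


def missing_jars(spark_home):
    """Return the required jar names that are absent from <spark_home>/jars."""
    jars_dir = Path(spark_home) / "jars"
    return [name for name in REQUIRED_JARS if not (jars_dir / name).is_file()]
```

Calling `missing_jars(os.environ["SPARK_HOME"])` after the downloads should return an empty list if every jar is in place.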

Model training

To run training, set your own environment variables in src/models/example.env, then run the following commands:

$ source src/models/example.env
$ python src/models/run.py
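
A forgotten variable tends to fail late, in the middle of a run, so a quick check before starting can save time. The sketch below is an illustration only and assumes the env file consists of simple `KEY=value` lines, optionally prefixed with `export`; the actual layout of example.env is not shown here, and both function names are hypothetical.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=value lines (optionally prefixed with 'export') into a dict."""
    env = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment
        # Drop surrounding whitespace and quote characters from the value.
        env[key.strip()] = value.strip().strip("'\"")
    return env


def missing_variables(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

For example, `missing_variables(parse_env_lines(open(path)).keys())` lists the variables named in the file that are not yet set in the current process.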

Notes

Hyperopt is used for hyperparameter optimization.
Knock Knock is used for Telegram notifications about training.
The best trained model can be saved locally as well as to Amazon S3.
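
To give an idea of what a Hyperopt search looks like, here is a minimal sketch. It is not the repository's training code: the objective is a toy stand-in for a real training run, and the parameter names `learning_rate` and `max_depth` and their ranges are assumptions chosen for illustration.

```python
def objective(params):
    """Toy stand-in for a training run: returns the loss to minimise.

    A real objective would train a model with `params` and return a
    validation loss (for example 1 - AUC).
    """
    return (params["learning_rate"] - 0.1) ** 2 + 0.01 * abs(params["max_depth"] - 5)


def search(max_evals=25):
    """Run a TPE search over a small, hypothetical search space."""
    # Imported here so that `objective` stays usable without hyperopt installed.
    from hyperopt import Trials, fmin, hp, tpe

    space = {
        "learning_rate": hp.loguniform("learning_rate", -5, 0),  # e^-5 .. 1
        "max_depth": hp.quniform("max_depth", 2, 10, 1),  # float values in steps of 1
    }
    trials = Trials()  # records every evaluated configuration
    best = fmin(fn=objective, space=space, algo=tpe.suggest,
                max_evals=max_evals, trials=trials)
    return best, trials
```

`search()` returns the best parameter values found together with the `Trials` object, which can be inspected to compare the runs.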

Inference

Local inference

This option starts a process that connects to AWS Kinesis using Spark Streaming. The code is located in src/inference. To run it, set the environment variables in src/inference/example.env, then run the following commands:

$ source src/inference/example.env
$ python src/inference/inference.py
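
As a rough illustration of how a Spark process can read from Kinesis, the sketch below uses the `kinesis` source provided by the spark-sql-kinesis jar installed earlier. This is an assumption about the approach, not a copy of src/inference/inference.py; the function names are hypothetical, the option names follow that connector's documentation, and credential handling is left out.

```python
def kinesis_options(stream_name, region, starting_position="LATEST"):
    """Build reader options for the `kinesis` Structured Streaming source."""
    return {
        "streamName": stream_name,
        "endpointUrl": f"https://kinesis.{region}.amazonaws.com",
        "startingposition": starting_position,
    }


def read_kinesis_stream(spark, options):
    """Return a streaming DataFrame for the given SparkSession and options.

    With this connector the record payload is expected in a binary
    `data` column, which still has to be decoded before scoring.
    """
    reader = spark.readStream.format("kinesis")
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()
```

Passing the `SparkSession` in as an argument keeps the option-building logic easy to check without starting Spark.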

AWS Lambda Inference

The code in src/lambda_inference runs inference on AWS Lambda. The Lambda function is triggered by a Kinesis stream when records arrive. A Docker container image is used because PySpark is too large to upload to AWS Lambda as a zip package.
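
When Kinesis triggers a Lambda function, each record's payload arrives base64-encoded under `Records[i].kinesis.data`. The following sketch shows how such an event could be decoded. It is not the code in src/lambda_inference: the function names are hypothetical, and it assumes the records are JSON-encoded, which the repository may or may not do.

```python
import base64
import json


def decode_kinesis_records(event):
    """Decode the base64 payload of every Kinesis record in a Lambda event."""
    rows = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        rows.append(json.loads(payload))  # assumes JSON-encoded records
    return rows


def handler(event, context):
    """Lambda entry point: decode the records, then hand them to the model."""
    rows = decode_kinesis_records(event)
    # Model scoring would happen here; this sketch only reports the count.
    return {"records_received": len(rows)}
```

Keeping the decoding in its own function makes it possible to test it locally with a hand-built event, without deploying the container.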
