Cassandra22-Data-Science

Invoice payment time prediction model.

Solution to Cassandra 22, a data science event.

Introduction :

Our team secured 3rd position in "CASSANDRA" 2022, the data science event held under "Udyam", the Electronics Department fest of IIT BHU. This repository contains our work for the event.

image

Link to the leaderboard - https://www.udyamfest.com/leaderboard (the link may not work after 2022)

Link to the code - https://github.com/ayush-agarwal-0502/Cassandra22-Data-Science/blob/main/Cassandra_PAV_BHU_JEE.ipynb (also uploaded to this repository)

Link to the competition - https://www.kaggle.com/competitions/cassandra-udyam-2022/overview

Link to the dataset - https://www.kaggle.com/competitions/cassandra-udyam-2022/data (a copy of the dataset is also included in this repository in case this link doesn't work)

Link to the final presentation slides on Canva - https://www.canva.com/design/DAE9kYtOh4I/ewkPV5L1gdrpSoMIfRGwFA/view#4 (the slides are also available in this repository in case the link doesn't work)

The PS and the Solution :

For the problem statement (PS), refer to the Kaggle page linked above. In short, we were required to predict, in number of days, when an invoice would be paid, based on the data given to us.

For an explanation of our solution, it is best to refer to the slides, which cover in full how we solved the problem. A few important and noteworthy slides are included in this README too.
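To make the target concrete, here is a minimal sketch of how a "days until payment" value can be derived from two dates. The field names `invoice_date` and `payment_date` and the sample records are hypothetical, chosen only for illustration; the actual dataset columns are described on the Kaggle page.

```python
from datetime import date


def days_to_payment(invoice_date: date, payment_date: date) -> int:
    """Number of days between issuing an invoice and receiving payment."""
    return (payment_date - invoice_date).days


# Hypothetical records; the real dataset's column names may differ.
invoices = [
    {"invoice_date": date(2022, 1, 10), "payment_date": date(2022, 2, 9)},
    {"invoice_date": date(2022, 3, 1), "payment_date": date(2022, 3, 16)},
]

# Regression target: one integer (days) per invoice.
targets = [days_to_payment(r["invoice_date"], r["payment_date"]) for r in invoices]
print(targets)
```

A regression model is then trained to predict this number from the other invoice attributes.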

Noteworthy slides :

The difference feature :

image

Fraud detection in invoices using this dataset :

image

Mutual Information Scores (and Pandas Profiling report correlation) :

image
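For readers unfamiliar with mutual information, the sketch below computes it for two discrete sequences using only the standard library. It is an illustration of the quantity itself, not the code used in the notebook; the toy feature and target values are made up. For continuous targets, a library estimator such as scikit-learn's `mutual_info_regression` is the usual choice.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data (hypothetical): a feature that fully determines the target
# versus one that is independent of it.
target = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

print(mutual_information(informative, target))
print(mutual_information(uninformative, target))
```

A higher score means the feature tells us more about the target, which is why such scores are useful for ranking candidate features.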

Usage of K-Means Clustering in predictions :

image
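The slide describes how clustering was actually used in our solution. As a general illustration only, the sketch below shows one common pattern: running K-Means on a numeric column and using the resulting cluster label as an extra feature for the predictor. The function `kmeans_1d` and the `amounts` values are hypothetical and written for this README; they are not taken from the notebook.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain Lloyd's algorithm on 1-D data; returns (centroids, labels)."""
    # Deterministic start: spread the initial centroids over the sorted values.
    ordered = sorted(values)
    step = max(k - 1, 1)
    centroids = [float(ordered[(i * (len(ordered) - 1)) // step]) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels


# Hypothetical invoice amounts forming two obvious groups.
amounts = [100, 120, 110, 5000, 5200, 5100]
centroids, labels = kmeans_1d(amounts, k=2)

# The cluster label can be appended to each row as an additional feature.
rows_with_cluster = list(zip(amounts, labels))
print(centroids, labels)
```

In practice a library implementation such as scikit-learn's `KMeans` would be used on several columns at once; the hand-written version here just keeps the example dependency-free.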
