R project for the Data Science for Business program (X-HEC)
SUBJECT - "Win money when your train is late"
SNCF trains are reportedly making a habit of being late (30 min or more)
We decided to dig into SNCF's historical datasets and its real-time API to gather some insights. We want to build a model predicting how often a train on a specific line is late over a given period of time.
Our objective is to build a model that gives a confidence interval for the probability that a specific train on a specific line will be late by 30 min or more. We then want to share those results with the public as a dashboard.
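As an illustration of the kind of output we are aiming for, here is a minimal sketch in R that turns historical counts into a probability estimate with a confidence interval. It is a plain frequency-based estimate using the exact binomial interval from `binom.test` (base R `stats`), not the final model. The function name `late_ci` and the counts in the usage line are hypothetical.

```r
# Estimate the probability of a delay of 30 min or more from historical counts,
# with an exact (Clopper-Pearson) binomial confidence interval.
# `late_ci` is a hypothetical helper name, not part of any SNCF dataset or API.
late_ci <- function(n_late, n_total, conf_level = 0.95) {
  stopifnot(n_total > 0, n_late >= 0, n_late <= n_total)
  test <- binom.test(n_late, n_total, conf.level = conf_level)
  list(
    estimate = n_late / n_total,   # observed share of late trains
    lower    = test$conf.int[1],   # lower bound of the interval
    upper    = test$conf.int[2]    # upper bound of the interval
  )
}

# Made-up example: 12 late trains out of 200 observed on one line
ci <- late_ci(12, 200)
print(ci)
```

A model using line, period, and other features would replace the raw frequency here, but the dashboard output would have the same shape: an estimate plus a lower and an upper bound.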
Creation of a basic MVP on which you can bet on whether a train will be late or not. Our system will automatically compute the odds for a specific train and compare the prediction with the real data. If you bet correctly, you win your stake multiplied by 1/probability.
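The payout rule can be sketched in R as follows. This is only an illustration under assumptions: the function name `settle_bet` and its parameters are hypothetical, the odds are taken as exactly 1/probability with no margin, and a bet on "not late" is assumed to use the complementary probability.

```r
# Settle a single bet using odds = 1 / probability of the chosen outcome.
# `settle_bet` and all parameter names are hypothetical, for illustration only.
settle_bet <- function(stake, p_late, bet_late, delay_min, threshold_min = 30) {
  stopifnot(stake >= 0, p_late > 0, p_late < 1)
  # Probability of the outcome the player bet on
  p_outcome <- if (bet_late) p_late else 1 - p_late
  # Compare the bet with the real data: was the train actually late?
  was_late <- delay_min >= threshold_min
  won <- (bet_late == was_late)
  if (won) stake * (1 / p_outcome) else 0
}

# Made-up example: stake of 10 on "late", estimated probability 0.2,
# and the train arrives 45 min late
payout <- settle_bet(stake = 10, p_late = 0.2, bet_late = TRUE, delay_min = 45)
print(payout)
```

With these made-up numbers the odds are 1/0.2 = 5, so a stake of 10 pays 50. With odds set exactly to 1/probability, the expected payout equals the stake, so a real system would need to decide whether to apply a margin.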