Block 3: Predictive analysis of structured data using artificial intelligence.

g0thier/Bloc-3

Contact

voguant-cal0n@icloud.com

Video explanation

Block 3: Predictive analysis of structured data using artificial intelligence.

Goals

Walmart Sales

  • Part 1: perform an EDA and all the preprocessing needed to prepare the data for machine learning
  • Part 2: train a linear regression model (baseline)
  • Part 3: avoid overfitting by training a regularized regression model
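
As an illustration of Part 3, the sketch below shows how an L2 (ridge) penalty shrinks the slope of a one-feature linear regression compared with the unregularized baseline. It is a dependency-free toy under stated assumptions: the function `fit_line`, the `alpha` value and the data points are all hypothetical and are not taken from the project notebooks or from the Walmart dataset.

```python
from statistics import fmean


def fit_line(xs, ys, alpha=0.0):
    """Fit y = slope * x + intercept with an L2 penalty `alpha` on the slope.

    alpha = 0 is ordinary least squares (the baseline);
    alpha > 0 is ridge regression with an unpenalized intercept.
    """
    x_mean, y_mean = fmean(xs), fmean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # the penalty inflates the denominator
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Made-up toy data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

ols_slope, ols_intercept = fit_line(xs, ys)                   # baseline
ridge_slope, ridge_intercept = fit_line(xs, ys, alpha=5.0)    # regularized

print(f"baseline slope: {ols_slope:.3f}, ridge slope: {ridge_slope:.3f}")
```

The ridge slope is always closer to zero than the baseline slope. That shrinkage is what limits overfitting when there are many correlated features.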

Conversion Rate Challenge

  • Part 1: perform an EDA and the preprocessing, then train a baseline model with the file data_train.csv
  • Part 2: improve your model's f1-score on your test set (you can try feature engineering, feature selection, regularization, non-linear models, hyperparameter optimization by grid search, etc.)
  • Part 3: Once you are satisfied with your model's score, use it to make predictions on the file data_test.csv. Dump the predictions into a .csv file that will be sent to Kaggle (actually, to your teacher/TA 🤓). You can make as many submissions as you want, so feel free to try different models!
  • Part 4: Take some time to analyze your best model's parameters. Are there any levers for action that would help improve the newsletter's conversion rate? What recommendations would you make to the team?
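
To make the Part 2 objective concrete, here is a minimal sketch of the f1-score and of a grid search that picks the value maximizing it. For simplicity the grid is over a single decision threshold applied to predicted scores, which stands in for any tunable parameter. The helper names, the labels and the scores are hypothetical, and none of this is taken from the challenge data.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores, grid):
    """Grid search: return the (threshold, f1) pair with the highest f1."""
    results = []
    for th in grid:
        preds = [1 if s >= th else 0 for s in scores]
        results.append((th, f1_score(y_true, preds)))
    return max(results, key=lambda r: r[1])


# Made-up labels (1 = converted) and model scores, for illustration only.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
grid = [0.3, 0.4, 0.5, 0.6]

threshold, f1 = best_threshold(y_true, scores, grid)
print(f"best threshold: {threshold}, f1: {f1:.3f}")
```

In practice the search should be scored on held-out data, not on the training data, so that the chosen value generalizes.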

Uber Pickups

  • Create an algorithm to find hot zones
  • Visualize results on a nice dashboard

Information about the files:

Walmart Sales

  • 01 EDA: This file converts the date into separate variables and casts integer-coded categories to object type for later preprocessing
  • 02 Linear regression: This linear regression finds the most explanatory variables for predicting Weekly_Sales
  • 03 Lasso Rigde: This model slightly improves the prediction score, from 0.943 to 0.957 (+1.4 %)
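
The date conversion described for 01 EDA can be sketched as follows. This is an assumption-laden illustration using only the standard library: the column names `Store` and `Date`, the `%d-%m-%Y` format and the helper names `expand_date` and `as_category` are hypothetical, and only `Weekly_Sales` appears in the description above.

```python
from datetime import datetime


def expand_date(row, date_key="Date", fmt="%d-%m-%Y"):
    """Replace a date string with numeric year/month/day/day_of_week fields."""
    d = datetime.strptime(row[date_key], fmt)
    out = {k: v for k, v in row.items() if k != date_key}
    out.update(year=d.year, month=d.month, day=d.day, day_of_week=d.weekday())
    return out


def as_category(row, key):
    """Cast an integer-coded category to a string so it is not read as a quantity."""
    out = dict(row)
    out[key] = str(out[key])
    return out


# Made-up row; column names and date format are assumptions.
row = {"Store": 1, "Date": "05-02-2010", "Weekly_Sales": 1000.0}
prepared = as_category(expand_date(row), "Store")
print(prepared)
```

Casting the store identifier to a string keeps a later encoding step from treating store 20 as "twice" store 10.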

Conversion Rate Challenge

  • 01 EDA: The dataset is cleaned and each column is plotted against converted users
  • 02 Model Prediction: A random forest model is trained and a decision tree is displayed
  • 03 Model Conclusion: The model is reused on the test set, and plots of the explanatory values support the conclusion

Uber Pickups

  • 01 Uber Data: The datasets are merged and the dates cleaned
  • 02 Uber Kmeans: Clusters are found with a k-means algorithm
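
For readers unfamiliar with k-means, the sketch below implements the basic algorithm (Lloyd's iterations) on a handful of pickup coordinates. It is a toy under stated assumptions, not the code of the notebook: the function `kmeans`, its evenly spaced initialization and the coordinates are all hypothetical, and plain Euclidean distance on (latitude, longitude) is only a rough approximation over a small area.

```python
import math


def kmeans(points, k, n_iter=20):
    """Basic k-means: alternate assignment and centroid update until stable."""
    # Naive deterministic initialization: k evenly spaced points from the input.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Made-up (latitude, longitude) pickups, for illustration only.
pickups = [
    (40.70, -74.00), (40.71, -74.01), (40.72, -74.02),
    (40.80, -73.95), (40.81, -73.96), (40.82, -73.94),
]
centroids, labels = kmeans(pickups, k=2)
print(centroids, labels)
```

Each centroid can then be read as a candidate hot zone, with the number of points per label giving a rough measure of how busy it is.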
