Must Read Papers for Data Science, ML, and DL

A curated collection of must-read Data Science, Machine Learning, and Deep Learning papers, reviews, and articles.


NOTE: 🚧 This list is still being updated. Let me know which additional papers, articles, or blogs to add, and I will add them here.

How to use

👉 ⭐ this repo

Contributing

  • 👉 🔃 Please feel free to submit a pull request if links are broken or if I am missing any important papers, blogs, or articles.

Maintenance

👇 READ THIS 👇

  • 👉 Reading math-heavy papers is hard. It takes time and effort to understand them, and most of it comes down to dedication and the motivation not to quit. Don't be discouraged: read once, read twice, read thrice... until it clicks and blows you away.

🥇 - Read it first

🥈 - Read it second

🥉 - Read it third


Data Science

📊 Pre-processing & EDA

🥇 📄 Data preprocessing - Tidy data - by Hadley Wickham

📓 General DS

🥇 📄 Statistical Modeling: The Two Cultures - by Leo Breiman

🥈 📄 A study in Rashomon curves and volumes: A new perspective on generalization and model simplicity in machine learning

🥇 📄 Frequentism and Bayesianism: A Python-driven Primer - by Jake VanderPlas


Machine Learning

🎯 General ML

🥇 📄 Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning - by Sebastian Raschka

🥇 📄 A Brief Introduction into Machine Learning - by Gunnar Rätsch

🥉 📄 An Introduction to the Conjugate Gradient Method Without the Agonizing Pain - by Jonathan Richard Shewchuk

🥉 📄 On Model Stability as a Function of Random Seed

🔍 Outlier/Anomaly detection

🥇 📰 Outlier Detection: A Survey

🚀 Boosting

🥈 📄 XGBoost: A Scalable Tree Boosting System

🥈 📄 LightGBM: A Highly Efficient Gradient Boosting Decision Tree

🥈 📄 AdaBoost and the Super Bowl of Classifiers - A Tutorial Introduction to Adaptive Boosting

🥉 📄 Greedy Function Approximation: A Gradient Boosting Machine

📖 Unraveling Blackbox ML

🥉 📄 Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation

🥉 📄 Data Shapley: Equitable Valuation of Data for Machine Learning

✂️ Dimensionality Reduction

🥇 📄 A Tutorial on Principal Component Analysis

🥈 📄 How to Use t-SNE Effectively

🥉 📄 Visualizing Data using t-SNE

📈 Optimization

🥇 📄 A Tutorial on Bayesian Optimization

🥈 📄 Taking the Human Out of the Loop: A Review of Bayesian Optimization


Famous Blogs

Sebastian Raschka

Chip Huyen


🎱 🔮 Recommenders

Surveys

🥇 📄 A Survey of Collaborative Filtering Techniques

🥇 📄 Collaborative Filtering Recommender Systems

🥇 📄 Deep Learning Based Recommender System: A Survey and New Perspectives

🥇 📄 🤔 ⭐ Explainable Recommendation: A Survey and New Perspectives

Case Studies

🥈 📄 The Netflix Recommender System: Algorithms, Business Value, and Innovation

🥈 📄 Two Decades of Recommender Systems at Amazon.com

🥈 🌐 How Does Spotify Know You So Well?

👉 For a more in-depth study: 📕 Recommender Systems Handbook


Famous Deep Learning Blogs 🤠

🌐 Stanford UFLDL Deep Learning Tutorial

🌐 Distill.pub

🌐 Colah's Blog

🌐 Andrej Karpathy

🌐 Zack Lipton

🌐 Sebastian Ruder

🌐 Jay Alammar


📚 Neural Networks and Deep Learning

Neural Networks

⭐ 🥇 📰 The Matrix Calculus You Need For Deep Learning - Terence Parr and Jeremy Howard

🥇 📰 Deep learning - Yann LeCun, Yoshua Bengio & Geoffrey Hinton

🥇 📄 Generalization in Deep Learning

🥇 📄 Topology of Learning in Artificial Neural Networks

🥇 📄 Dropout: A Simple Way to Prevent Neural Networks from Overfitting

🥈 📄 Polynomial Regression As an Alternative to Neural Nets

🥈 🌐 The Neural Network Zoo

🥈 🌐 Image Completion with Deep Learning in TensorFlow

🥈 📄 Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

🥉 📄 A systematic study of the class imbalance problem in convolutional neural networks

🥉 📄 All Neural Networks are Created Equal

🥉 📄 Adam: A Method for Stochastic Optimization

🥉 📄 AutoML: A Survey of the State-of-the-Art

🖼️ CNNs

🥇 📄 Visualizing and Understanding Convolutional Networks - by Matthew D. Zeiler and Rob Fergus

🥈 📄 Deep Residual Learning for Image Recognition

🥈 📄 AlexNet - ImageNet Classification with Deep Convolutional Neural Networks

🥈 📄 VGG Net - Very Deep Convolutional Networks for Large-Scale Image Recognition

🥉 📄 A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction

🥉 📄 Large-scale Video Classification with Convolutional Neural Networks

🥉 📄 Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

⚫ CapsNet 🔱

🥇 📄 Dynamic Routing Between Capsules

🏞️ 💬 Image Captioning

🥇 📄 Show and Tell: A Neural Image Caption Generator

🥈 📄 Neural Machine Translation by Jointly Learning to Align and Translate

🥈 📄 StyleNet: Generating Attractive Visual Captions with Styles

🥈 📄 Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

🥈 📄 Where to put the Image in an Image Caption Generator

🥈 📄 Dank Learning: Generating Memes Using Deep Neural Networks

🚗 🚶‍♂️ Object Detection 🦅 🏈

🥈 📄 ResNet - Deep Residual Learning for Image Recognition

🥈 📄 YOLO - You Only Look Once: Unified, Real-Time Object Detection

🥈 📄 Microsoft COCO: Common Objects in Context

🥈 📄 (R-CNN) Rich feature hierarchies for accurate object detection and semantic segmentation

🥈 📄 Fast R-CNN

🥈 📄 Faster R-CNN

🥈 📄 Mask R-CNN

🚗 🚶‍♂️ 👫 Pose Detection 🏃 💃

🥈 📄 DensePose: Dense Human Pose Estimation In The Wild

🥈 📄 Parsing R-CNN for Instance-Level Human Analysis

🔡 🔣 Deep NLP 💱 🔢

🥇 📄 A Primer on Neural Network Models for Natural Language Processing

🥇 📄 Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

🥇 📄 On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

🥇 📄 LSTM: A Search Space Odyssey - by Klaus Greff et al.

🥇 📄 A Critical Review of Recurrent Neural Networks for Sequence Learning

🥇 📄 Visualizing and Understanding Recurrent Networks

⭐ 🥇 📄 Attention Is All You Need

🥇 📄 An Empirical Exploration of Recurrent Network Architectures

🥇 📄 OpenAI (GPT-2) Language Models are Unsupervised Multitask Learners

🥇 📄 BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

🥉 📄 Parameter-Efficient Transfer Learning for NLP

🥉 📄 A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification

🥉 📄 A Survey on Recent Advances in Named Entity Recognition from Deep Learning models

🥉 📄 Convolutional Neural Networks for Sentence Classification

🥉 📄 Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

🥉 📄 Single Headed Attention RNN: Stop Thinking With Your Head

👽 GANs

🥇 📄 Generative Adversarial Nets - Goodfellow et al.

📚 GAN Rabbit Hole -> GAN Papers

⭕➖⭕ GNNs (Graph Neural Networks)

🥉 📄 A Comprehensive Survey on Graph Neural Networks


👨‍⚕️ 💉 Medical AI 💊 🔬

Machine learning classifiers and fMRI: a tutorial overview - by Francisco Pereira et al.


👇 Cool Stuff 👇

🔊 📄 SoundNet: Learning Sound Representations from Unlabeled Video

🎨 📄 CAN: Creative Adversarial Networks, Generating “Art” by Learning About Styles and Deviating from Style Norms

🎨 📄 Deep Painterly Harmonization

🕺 💃 📄 Everybody Dance Now

Soccer on Your Tabletop

👱‍♀️ 💇‍♀️ 📄 SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color

📸 📄 Handheld Mobile Photography in Very Low Light

🏯 🕌 📄 Learning Deep Features for Scene Recognition using Places Database

🚅 🚄 📄 High-Speed Tracking with Kernelized Correlation Filters

🎬 📄 Recent progress in semantic image segmentation

Rabbit hole -> 🔊 🌐 Analytics Vidhya Top 10 Audio Processing Tasks and their papers

👱 -> 👴 📄 📄 Face Aging With Conditional GANs

👱 -> 👴 📄 📄 Dual Conditional GANs for Face Aging and Rejuvenation

⚖️ 📄 BAGAN: Data Augmentation with Balancing GAN

labml.ai Annotated PyTorch Paper Implementations


📰 Capstone Projects 📰

8 Awesome Data Science Capstone Projects

10 Powerful Applications of Linear Algebra in Data Science

Top 5 Interesting Applications of GANs

Deep Learning Applications a beginner can build in minutes


CHANGELOG

2019-10-28 Started must-read-papers-for-ml repo

2019-10-29 Added Analytics Vidhya use case studies article links

2019-10-30 Added Outlier/Anomaly detection paper, separated Boosting, CNN, Object Detection, NLP papers, and added Image captioning papers

2019-10-31 Added Famous Blogs from Deep and Machine Learning Researchers

2019-11-01 Fixed markdown issues, added contribution guideline

2019-11-20 Added Recommender Surveys, and Papers

2019-12-12 Added R-CNN variants, PoseNets, GNNs

2020-02-23 Added GRU paper