gmonce/datascience

Computer Science, Machine Learning & Natural Language Processing

Under (permanent) construction

Historical papers

  1. On Computable Numbers, with an Application to the Entscheidungsproblem - By A.M. Turing, 1936
  2. Computing Machinery and Intelligence By A.M. Turing, 1950.
  3. A Mathematical Theory of Communication By C. Shannon, 1948
  4. The perceptron: A probabilistic model for information storage and organization in the brain By F. Rosenblatt.
  5. Learning representations by back-propagating errors By D. Rumelhart, G. Hinton and R. Williams.

Books

  1. Probability Theory: The Logic of Science By E.T. Jaynes
  2. Speech and Language Processing By D. Jurafsky and J.H. Martin (3rd edition draft)
  3. Reinforcement Learning: an Introduction - By R. Sutton and A. Barto
  4. Machine Learning - By T. Mitchell

General Surveys

  1. Turing Machines - A 1984 Scientific American article by John E. Hopcroft about Turing machines, A.M. Turing, and the history of computability and computational complexity.
  2. Deep Learning - A review of deep learning published in Nature. By LeCun, Bengio & Hinton

Foundations

  1. Machine Learning is fun! - A really nice machine learning intro, a topic that actually needs an intro. By Adam Geitgey.
  2. Intuition for Simulated Annealing - Shake! By Robb Seaton.
  3. Everything You Wanted to Know about the Kernel Trick (But Were Too Afraid to Ask). By Eric Kim.
  4. Principal Component Analysis (PCA) vs Ordinary Least Squares (OLS): A Visual Explanation - By J.D. Long
  5. Markov Chains - A visual explanation. By Lewis Lehe.
  6. A Beginner’s Guide to Eigenvectors, PCA, Covariance and Entropy - by Skymind. The most intuitive introduction to Eigenvectors and Eigenvalues I've found so far.
  7. Visual Information Theory - by C. Olah. Entropy, Cross-entropy, and KL-divergence visually explained...
  8. The Matrix Calculus You Need For Deep Learning - by Terrence Parr and Jeremy Howard.
  9. Seeing Theory By Daniel Kunin. A visual introduction to Probability and Statistics

Causality

  1. The Book of Why - by J. Pearl and D. Mackenzie
  2. Causal Inference in Statistics - A Primer - by J. Pearl (webpage and references)
  3. Fairness and machine learning - Chapter 4: Causality by S. Barocas et al.
  4. Causality: Models, Reasoning, and Inference by J. Pearl
  5. Causality for Machine Learning by B. Schölkopf
  6. ML beyond Curve Fitting: An Intro to Causal Inference and do-Calculus by F. Huszár
  7. Introduction to Causal Inference course by B. Neal

Deep Learning

  1. Deep Learning, NLP, and Representations - By C. Olah
  2. Neural Networks and Deep Learning - By Michael Nielsen. A great online book on neural networks.
  3. Calculus on computational graphs: backpropagation - by C. Olah. Backpropagation explained as calculus on computational graphs
  4. Understanding LSTM Networks - by C. Olah
  5. The Unreasonable Effectiveness of Recurrent Neural Networks - by A. Karpathy. An introduction to RNNs and character-level language models.
  6. Understanding Convolutions - by C. Olah (2014)
  7. Conv Nets: A Modular Perspective - by C. Olah (2014) - How convolutional neural networks work.
  8. Attention is All you Need: Before you Read Transformer - Video tutorial by @NamVo about the Transformer architecture presented in the paper Attention is All You Need

Unsupervised Learning

  1. A tutorial on PCA - Lindsay Smith - 2002 - A very clear, step-by-step introduction to Principal Component Analysis

Supervised Learning

  1. Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning - Sebastian Raschka - A great overview of supervised learning methodology

Programming Machine Learning

  1. Introduction to NumPy - By Sebastian Raschka (Appendix F)
  2. An introduction to NumPy and SciPy - By M. Scott Shell
  3. Implementing a Principal Component Analysis (PCA) - by Sebastian Raschka. Using Python and NumPy.

Visualization

  1. Visual Vocabulary (.png) - By ft.com - How to visualize your data, depending on what you want to emphasize.
  2. Visualizing the uncertainty in data - By Nathan Yau
  3. Fundamentals of Data Visualization - By Claus Wilke - "The book is meant as a guide to making visualizations that accurately reflect the data, tell a story, and look professional."
  4. How to make beautiful data visualizations in Python with matplotlib - By Randal Olson

Applications

  1. Movie Recommendations with k-Nearest Neighbors and Cosine Similarity - By Nicole White.
  2. Sentiment Analysis on Movie Reviews - By Rafael Carrascosa. Sentiment Analysis using Random Forests.

Assorted

  1. Logs, Tails, Long Tails - By Ryan Moulton. Why log probabilities are useful. Why long tails matter.
  2. Tiny Data, Approximate Bayesian Computation and the Socks of Karl Broman - By Rasmus Bååth.

Advanced

  1. Deep Reinforcement Learning Doesn't Work Yet - By Alex Irpan.
  2. The Bitter Lesson by Richard Sutton. Reflections on The Bitter Lesson by Michael Nielsen.
  3. On the Bias-Variance Tradeoff: Textbooks Need an Update - By Brady Neal

Reviews

  1. NLP Year in Review 2019 - By Elvis. Very comprehensive.

My tutorials / Guías (English/Spanish)

  1. Yet Another Python Encoding Tutorial (Python 2)
  2. Matrices for Data Scientists
  3. Natural Language Parsing with Python
  4. Ciencia de Datos: lo mínimo que hay que saber

Presentations (in Spanish)

  1. Seminario Ciencia de Datos - Slides for an 8-hour seminar on Data Science. Facultad de Ciencias Económicas - Universidad de la República - Uruguay
  2. Veinte Años de Aprendizaje Automático - Talk at the GX27 Meeting - Uruguay - 2017
  3. Machine Learning, Python y el Titanic - Talk at Tech Meetup Uruguay - 2014 - Slides
  4. Aprendizaje automático en el mundo real - Talk at the GX28 Meeting - Uruguay 2018
  5. Olas, inviernos, ciencia y tecnología: Lo que aprendí del Procesamiento de Lenguaje Natural - Talk at the GX29 Meeting - Uruguay - 2019
  6. Computabilidad y Máquinas de Turing - Talk about computability for a Cognitive Sciences course.

Amusements / Entretenimiento

  1. Figuritas
  2. Mentiras, malditas mentiras, y encuestas
  3. Mi "predicción" para las elecciones 2014 en Uruguay
  4. Sobreajuste - An incredibly accurate prediction of COVID-19 cases