misc thoughts and reminders

topics to touch on in some fashion

uncertain excursions (talking about uncertainty)
cross validation and related

Interludes

Practical modeling vs. idealistic (academic) modeling
Data Issues
- missing data
  - recommend assessment of predicted data similarity to observed data
- data quality and reliability/measurement
- Sparsity
- Outliers
- Imbalanced data
- 'Big' Data, Scalability
- Data types
  - Categorical
  - Ordinal
  - Continuous
  - Time series
  - Text
  - Images
  - Audio
  - Video
  - Geospatial
  - etc.
- Feature Engineering/Pre-processing/Categorical Embeddings/Dimensionality Reduction/Feature Selection/Feature Extraction
- misc feature types: ordinal, zero-inflated, etc.
- Transformations: std, log, max
- Data leakage
- Data drift
- Data bias (lack of representativeness), vs. statistical bias
- Misc:
  - Data privacy, security, ethics
  - Data provenance, governance
Causality
- Causal inference
- Techniques: experimental design, matching, meta-learners, uplift modeling, etc.
Model interpretability
- Feature importance
- Model explainability
- Model transparency (e.g. model cards)
- Model debugging
- Model fairness: models are ideas, and ideas may be incorrect, ill-posed, generally off-base, or even wrong by most standards. The data may be inaccurate or unrepresentative. None of this is the model's fault.
Uncertainty
- Bayesian inference
- Bootstrap
- Conformal Predictions
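
For the bootstrap item, a minimal sketch of a percentile bootstrap interval might help when this section gets written. The function name `bootstrap_ci`, the toy `sample`, and the defaults (2000 resamples, 95% level, fixed seed) are all assumptions for illustration, not something these notes specify.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement, recompute the statistic each time, then sort.
    estimates = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Made-up data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9, 3.1, 4.0]
lower, upper = bootstrap_ci(sample)
print(f"95% interval for the mean: ({lower:.2f}, {upper:.2f})")
```

The percentile interval is the simplest variant; other bootstrap intervals exist and may be better choices for skewed statistics.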
Misc Models
- Graphical/Network models
- Survival models, censoring
- MMM
- Mixture models/Clustering
- zero-inflated/altered/adjusted
- Time series models
- Reinforcement learning
- Ensemble models
- Regression vs. Classification
Inference vs./is not Prediction (somewhere along the way these ideas were conflated). Prediction could be said to be a form of inference, but not all inference is prediction. Inference does not require a data-driven model, nor even the direction of generalization implied by such models. In the modeling sense, inference strongly suggests a causal framework. We can make inference mean whatever we want in the modeling context (causal modeling, understanding the data generating process, prediction, or whatever), but doing so doesn't add clarity, because the term has long-standing usage outside of the modeling context (usage which also applies to modeling in a general way). Refs: ISL, https://stats.stackexchange.com/questions/244017/what-is-the-difference-between-prediction-and-inference

random thoughts

optimization function that was developed on the spot in the classroom? RMSProp by Hinton
https://arxiv.org/pdf/1609.04747.pdf (Ruder)
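
As a reminder of what RMSProp does, here is a rough sketch of the update: keep a decaying average of squared gradients and divide each step by its square root. The names `rmsprop` and `quad_grad`, the toy quadratic objective, and the learning rate of 0.01 are assumptions for illustration; the 0.9 decay is the commonly cited default, and `eps` is placed outside the square root here, which is one of several variants.

```python
import math

def rmsprop(grad, x0, lr=0.01, decay=0.9, eps=1e-8, steps=2000):
    """Minimize a function given its gradient `grad` (hypothetical helper)."""
    x = list(x0)
    avg_sq = [0.0] * len(x)  # running average of squared gradients
    for _ in range(steps):
        g = grad(x)
        for i in range(len(x)):
            # Exponentially decaying average of the squared gradient.
            avg_sq[i] = decay * avg_sq[i] + (1 - decay) * g[i] ** 2
            # Per-parameter step, scaled by the root of that average.
            x[i] -= lr * g[i] / (math.sqrt(avg_sq[i]) + eps)
    return x

def quad_grad(x):
    # Gradient of the toy objective (x0 - 3)^2 + (x1 + 1)^2.
    return [2 * (x[0] - 3.0), 2 * (x[1] + 1.0)]

x_min = rmsprop(quad_grad, [0.0, 0.0])
print(x_min)  # should land close to [3, -1]
```

With a constant learning rate the iterates hover near the minimum rather than settling exactly on it, so the result is only approximate.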

issues of focus

The initial 'book' was just some algorithms; this book should not be.

Folks to quote

Literati
- Arthur C. Clarke
- Barthelme
- Bukowski
- Huxley
Music
- Chuck D.
- Wu-Tang
- Joy Division
- David Berman (Oh data, you shine with an evil light...)
Film
- Star Wars/Trek
Science
- Jacques Cousteau
- Carl Sagan
- Tukey

Preface:

The book does not need to be read in order; take what you need, but realize there is a thread.
