2020.12.21 Spring-based regression with different spring strengths during training vs. prediction/inference

Smart Regression or One-of-Many-Possibilities Regression

  • 2020-12-28

    • Rather than allowing for a "curvy" model, as ML would provide, allow for a "wider" model where the center of the model is not a line but rather a region.
    • The effect of this should be that the center of the model is only used to predict points in the belly of the distribution, while the extremes of the distribution are predicted by whichever part of the wider model is closest (i.e., look for a "reason" for an outlier).
    • Effectively, this would also be like using all of the datapoints for training (perhaps discounting the outliers, or using Ridge/Lasso) but then using a different, datapoint-specific model (based on orthogonal distance to the "wider" region) for each datapoint during prediction/inference.
    • What is a simple mathematical formulation of such a model? Something akin to Ridge/Lasso/normalization? (One candidate is sketched below.)
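    • One possible formulation, sketched below under assumed details (the band width, data, and optimizer are illustrative choices, not the note's definitive answer): an epsilon-insensitive "tube" loss, as in support vector regression, makes the model's center a band rather than a line, and each point can then be related to whichever part of the band is closest.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)

EPS = 0.5  # half-width of the "wide" central region (illustrative choice)

def band_loss(beta):
    """Epsilon-insensitive loss: residuals inside the +/-EPS band cost
    nothing, so the fitted "center" is a region rather than a line."""
    resid = y - X @ beta
    return np.maximum(np.abs(resid) - EPS, 0.0).sum()

beta = minimize(band_loss, np.zeros(3), method="Nelder-Mead").x

# Belly of the distribution: predicted by the band itself. Extremes: the
# in-sample "explanation" of a point is the nearest point of the band,
# i.e. the center plus the residual clipped to the band's edges.
center = X @ beta
explained = center + np.clip(y - center, -EPS, EPS)
```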
  • 2020-12-21

    • Even massively overfit ML models still don't allow for coding individual neurons to handle individual cases/datapoints. Every independent variable (or transformation/combination thereof) has the same parameters and the same activations. The springs are all the same strength.
    • Perhaps what's missing is allowing individual parameters to have different strengths based on the dependent variable (i.e., "special-case regression"). For example, if the dependent variable (DV) is an outlier, then search the independent variable (IV) space for a value that could explain that outlier, even if not fully.
    • Maybe soft labels (or here) get at this same idea somehow; in other words, adjust the outlier DVs so that they are "softer": more flexible, looser springs when it comes to their effect on the whole model, but tighter springs when it comes to making a prediction of their own E[DV].
    • Kws: smart regression
    • http://feedproxy.google.com/~r/marginalrevolution/feed/~3/JlM_lLt_1wk/my-conversation-with-john-o-brennan.html
    • search the IV space for a value that could explain that outlier

      • Call this "one of many possibilities" regression: one independent variable is allowed to explain the dependent variable in the absence of the others, and for a specific datapoint. Allow specific datapoints, especially outliers, to be explained by individual independent variables. (A sketch follows below.)
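      • A minimal sketch of this idea under assumed details (the data, the 2.5-sigma outlier cutoff, and the use of univariate least squares are all illustrative): fit one regression per IV, then for each outlier datapoint credit whichever single IV's prediction comes closest to the observed DV.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 300, 4
X = rng.normal(size=(n, k))
y = X[:, 0] + rng.normal(scale=0.5, size=n)
X[0, 2], y[0] = 6.0, 6.0  # plant an outlier that only IV 2 can explain

# one univariate least-squares slope per IV
slopes = np.array([(X[:, j] @ y) / (X[:, j] @ X[:, j]) for j in range(k)])
preds = X * slopes  # preds[i, j] = prediction of y[i] from IV j alone

# for each DV outlier, look for the single-IV "reason" that comes closest
z = (y - y.mean()) / y.std()
for i in np.flatnonzero(np.abs(z) > 2.5):
    j = int(np.argmin(np.abs(y[i] - preds[i])))
    print(f"point {i}: y={y[i]:+.2f}, best single-IV 'reason' is IV {j} "
          f"(predicts {preds[i, j]:+.2f})")
```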
  • 2020-12-15

    • What are the most important statistical ideas of the past 50 years?
      • "We argue that the most important statistical ideas of the past half century are: counterfactual causal inference, bootstrapping and simulation-based inference, overparameterized models and regularization, multilevel models, generic computation algorithms, adaptive decision analysis, robust inference, and exploratory data analysis."
    • Is there another form of regression where the strength of each spring is proportional to the inverse of the outlier-ness of each dependent variable value? That's what should matter when trying to "explain" an outlier: you might not be able to explain it fully, but if there's even something a little out of the ordinary going on (in the independent variables), then that should be sufficient explanation.
    • Maybe it's similar to going from linear regression, with its vertical "springs," to PCA, with its orthogonal "springs," to something even further with beyond-orthogonal "springs"? (Both variants are sketched below.)
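    • A sketch of both ideas under assumed details (the 1/(1+|z|) weighting and the data are illustrative choices): weighted least squares where each spring's strength falls with the DV's outlier-ness, plus a PCA-style total-least-squares variant for the orthogonal springs.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 2))
y = X @ np.array([1.5, -1.0]) + rng.standard_t(df=3, size=500)  # fat tails

# spring strength: weaker the more outlying the DV (assumed functional form)
z = (y - np.median(y)) / (1.4826 * np.median(np.abs(y - np.median(y))))
w = 1.0 / (1.0 + np.abs(z))

# vertical springs: weighted least squares, beta = (X'WX)^{-1} X'Wy
XtW = X.T * w
beta_wls = np.linalg.solve(XtW @ X, XtW @ y)

# orthogonal springs: a weighted total-least-squares (PCA-style) analogue,
# from the smallest right-singular vector of the centered block [X | y]
A = np.column_stack([X, y]) * np.sqrt(w)[:, None]
A -= A.mean(axis=0)
v = np.linalg.svd(A, full_matrices=False)[2][-1]
beta_tls = -v[:2] / v[2]
```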
  • 2020-11-22

    • When buying a signal (dependent variable)... and neutralizing it with respect to some other signal (independent variable) via linear regression, you end up buying the extremes of the first signal (DV) but also selling the extremes of the latter (IV), which is suboptimal because you have no opinion on the latter.
    • So is there some sort of point-by-point regression technique? This is similar to the "reasons for movements" theory. (See the illustration below.)
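    • A numerical illustration of the effect above (the signals and their correlation are assumed for the example): residualizing signal A against signal B leaves positions a - beta*b, so among names with similar A values, B's extremes are mechanically sold despite there being no view on B.

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.normal(size=1000)            # signal we want to buy (the DV)
b = 0.6 * a + rng.normal(size=1000)  # signal we neutralize against (the IV)

beta = (b @ a) / (b @ b)  # OLS slope of a on b
pos = a - beta * b        # "neutralized" positions

# Unconditionally, pos is uncorrelated with b by construction. But holding
# A roughly fixed, the book is mechanically short B's extremes: within a
# slice of similar A values, the position falls linearly in b.
band = np.abs(a - np.quantile(a, 0.9)) < 0.1
print(np.corrcoef(pos[band], b[band])[0, 1])  # strongly negative
```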
  • 2020-11-10

    • "Truth is in the extremes. There is no noise in the far tails." - Nassim Taleb
    • What is the implication here for quant development? It's that P(info) is really high in the tails (of a forecast distribution)! So the tails should not be traded. What is the profitability of positions that are the result of large forecasts? What is the profitability of large positions vs. small positions?
  • 2020-10-22

    • From an analogy mentioned by Jacob Kline: quant investing works with pictures of companies that have been taken with a camera. You aren't working with real cats; you're working with pictures of cats. And you're trying to predict the picture, not the cat.
    • Try to predict the picture, the image. The image is the return and the fundamentals and all the other "obvious," typical financial datapoints. The typical data are dependent variables, not independent. They are part of the image, part of the story, along with the future return.
    • And that is the analogy to the camera. Predict the picture. Pictures of cats.
    • Look, prices move. To say that you know how much a price should move based on a particular change in information is one thing. But to relate a particular price move, regardless of size, to a particular change in information is something much less (look for any "reason" or "reason for movements"). Who are we to say that we can guess by how much a price should move according to a particular change in information?
  • 2020-10-07

    • Always start with returns and ignore prices
    • Start with returns and look for things that cause them ("reasons"), using nonlinear methods. Reasons for returns are not linear when put through the lens of humanity/groupthink/emotion, which modulate reasons. This is like p(info) but for all ratios including price, not just SRs.
    • All we know is that there are times of stability, of a particular framing/rationalization/regime, and nonlinear shifts between those regimes, either instrument-specific shifts or cross-sectional regime shifts. Either can be used to explain returns; just don't trust the price either before or after, because those prices can lead to prolonged periods of constant forecasts (if price is used in a ratio) and, as a result, large static positions.
    • Never use price as the numerator or denominator of a forecast, because doing so implies an implicit dependency on the return or price change. (See the illustration below.)
    • And use volume and return to fit (discount/neutralize) NLP sentiment.
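    • A small illustration of the price-ratio point (the data are assumed): with a static fundamental, an E/P-style forecast co-moves one-for-one (negatively) with the return, i.e. the forecast has an implicit price dependency built in.

```python
import numpy as np

rng = np.random.default_rng(4)
ret = rng.normal(scale=0.02, size=250)  # daily returns
price = 100 * np.exp(np.cumsum(ret))    # implied price path
earnings = 5.0                          # a static fundamental

# price-in-denominator forecast: E/P reprices with every return even
# though the fundamental never changes
ep = earnings / price
print(np.corrcoef(np.diff(np.log(ep)), ret[1:])[0, 1])  # exactly -1.0
```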
  • Inference vs. Prediction

    • "Inference: Use the model to learn about the data generation process."
    • "Prediction: Use the model to predict the outcomes for new data points."
    • "Model interpretability is a necessity for inference"