Testing the predictive power of various CFB metrics with career yards per game as the target variable.
The data set consists of players who played in the NFL. The CFB data begins in 2003, and 2004 is the first year a player can appear in the data set. Career NFL yards-per-game averages were calculated through 2016. Data was collected from Yahoo Sports pages, and player stats were spot-checked against College Football Reference. Some errors may still be present.
The out-of-sample prediction is pretty garbage. There are probably ways to get incremental improvement with depth-of-target and route-type data, but this level of performance is typical and generally state of the art.
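As a rough illustration of what "out-of-sample" means here, the sketch below holds out a quarter of a synthetic data set, fits a one-variable least-squares line on the rest, and scores the held-out predictions against a naive "predict the training mean" baseline. Everything in it is an assumption made for illustration: the synthetic numbers, the names `cfb_metric` and `nfl_ypg`, the helper functions, and the choice of a simple linear model. It is not the model or the data used in this analysis.

```python
import random

def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def out_of_sample_r2(y_test, y_pred, train_mean):
    """1 - (model squared error) / (squared error of predicting the training mean)."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_test, y_pred))
    ss_base = sum((t - train_mean) ** 2 for t in y_test)
    return 1.0 - ss_res / ss_base

# Synthetic stand-in data (hypothetical): one noisy college metric, one NFL target.
rng = random.Random(0)
cfb_metric = [rng.uniform(0.0, 0.4) for _ in range(300)]
nfl_ypg = [20.0 + 60.0 * m + rng.gauss(0.0, 15.0) for m in cfb_metric]

# Shuffle the row indices and hold out the last 25% as the test set.
indices = list(range(len(cfb_metric)))
rng.shuffle(indices)
train_idx, test_idx = indices[:225], indices[225:]

x_train = [cfb_metric[i] for i in train_idx]
y_train = [nfl_ypg[i] for i in train_idx]
x_test = [cfb_metric[i] for i in test_idx]
y_test = [nfl_ypg[i] for i in test_idx]

slope, intercept = fit_simple_ols(x_train, y_train)
train_mean = sum(y_train) / len(y_train)
preds = [intercept + slope * x for x in x_test]
r2 = out_of_sample_r2(y_test, preds, train_mean)
print(f"out-of-sample R^2: {r2:.3f}")
```

A value near zero (or below it) means the model does little better than guessing the average, which is the sense in which noisy targets like this tend to score poorly.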
Here is the first of two feature importance plots. This one measures the loss in performance, in terms of mean squared error, when a feature's values are "shuffled".
Same idea, but measuring the loss using mean absolute error.
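The shuffling idea behind both plots is usually called permutation importance: score the model, shuffle one feature's column so it no longer lines up with the target, score again, and treat the increase in error as that feature's importance. Below is a minimal sketch of that procedure under both error metrics. The toy data, the stand-in `predict` function, and the helper names are hypothetical and are not taken from this analysis.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, loss, n_repeats=10, seed=0):
    """Average increase in loss when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = loss(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j; all other columns stay in place.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            shuffled_loss = loss(y, [predict(row) for row in X_shuffled])
            increases.append(shuffled_loss - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy setup (hypothetical): feature 0 drives the target, feature 1 is ignored.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [2.0 * row[0] for row in X]

def predict(row):
    # Stand-in for a fitted model; it uses only feature 0.
    return 2.0 * row[0]

imp_mse = permutation_importance(predict, X, y, mse)
imp_mae = permutation_importance(predict, X, y, mae)
print("MSE-based importances:", imp_mse)
print("MAE-based importances:", imp_mae)
```

Because squared error penalizes large misses more heavily than absolute error, the two metrics can rank features differently, which is one reason to look at both plots.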
In general, market share of CFB yards is important, and it might be even more important than this analysis shows, but it's not a huge driver of predictive power when combined with a rather typical mix of explanatory variables.