End of 2nd day materials #62

Merged
merged 7 commits into from
Jul 21, 2022

Conversation

juliasilge
Member

Closes #35 with a few more slides and an addition to 06-classwork.qmd

I have a couple of questions about how this turned out. Take a look at the coefficients:

══ Workflow [trained] ════════════════════════════════════════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: logistic_reg()

── Preprocessor ──────────────────────────────────────────────────────────────────────────────────────────────────────────────
8 Recipe Steps

• step_lencode_mixed()
• step_dummy()
• step_mutate()
• step_rm()
• step_zv()
• step_ns()
• step_ns()
• step_normalize()

── Model ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Call:  stats::glm(formula = ..y ~ ., family = stats::binomial, data = data)

Coefficients:
               (Intercept)                      period                   game_time                      player  
                -0.2407680                   0.0241843                  -0.0717082                  -0.4339961  
               player_diff           offense_goal_diff                        year         period_type_regular  
                -0.0055436                  -0.1205708                   0.0466566                   0.0433042  
strength_even_short_handed         strength_power_play       strength_short_handed            offense_team_ARI  
                -0.1049770                  -0.0542697                          NA                   0.0285016  
          offense_team_BOS            offense_team_BUF            offense_team_CAR            offense_team_CBJ  
                 0.0292468                   0.0136300                   0.0488204                   0.0420298  
          offense_team_CGY            offense_team_CHI            offense_team_COL            offense_team_DAL  
                 0.0205413                   0.0301954                   0.0562829                   0.0806207  
          offense_team_DET            offense_team_EDM            offense_team_FLA            offense_team_LAK  
                 0.0343408                   0.0685353                  -0.0008946                   0.0403156  
          offense_team_MIN            offense_team_MTL            offense_team_NJD            offense_team_NSH  
                 0.0189033                   0.0339079                   0.0401949                   0.0373214  
          offense_team_NYI            offense_team_NYR            offense_team_OTT            offense_team_PHI  
                 0.0369202                   0.0565962                   0.0497961                   0.0909577  
          offense_team_PIT            offense_team_SJS            offense_team_STL            offense_team_TBL  
                 0.1072885                   0.0407732                   0.0026542                  -0.0011043  
          offense_team_TOR            offense_team_VAN            offense_team_WPG            offense_team_WSH  
                 0.0268908                   0.0261062                   0.0084148                   0.0604129  
          defense_team_ARI            defense_team_BOS            defense_team_BUF            defense_team_CAR  
                 0.0223947                  -0.0009265                  -0.0109686                   0.0128539  
          defense_team_CBJ            defense_team_CGY            defense_team_CHI            defense_team_COL  
                 0.0627095                   0.0291604                   0.0266095                   0.0658975  
          defense_team_DAL            defense_team_DET            defense_team_EDM            defense_team_FLA  
                -0.0311725                  -0.0245353                   0.0327484                   0.0308394  
          defense_team_LAK            defense_team_MIN            defense_team_MTL            defense_team_NJD  
                -0.0013984                   0.0318766                  -0.0106160                  -0.0237498  
          defense_team_NSH            defense_team_NYI            defense_team_NYR            defense_team_OTT  
                -0.0156214                  -0.0116193                   0.0752893                  -0.0084344  
          defense_team_PHI            defense_team_PIT            defense_team_SJS            defense_team_STL  
                 0.0546110                          NA                   0.0417056                   0.0079831  
          defense_team_TBL            defense_team_TOR            defense_team_VAN            defense_team_WPG  
                -0.0526966                   0.0008425                  -0.0175121                  -0.0005599  
          defense_team_WSH           game_type_playoff         position_defenseman             position_goalie  
                 0.0695311                  -0.0156897                   0.1292470                  -0.1044855  
        position_left_wing         position_right_wing                     dow_Mon                     dow_Tue  
                 0.0043754                   0.0287042                   0.1174710                  -0.0193820  
                   dow_Wed                     dow_Thu                     dow_Fri                     dow_Sat  
                 0.0051505                  -0.0253829                  -0.0494033                   0.0066051  
                 month_Feb                   month_Mar                   month_Apr                   month_May  
                 0.0540164                   0.0136917                  -0.0025169                   0.0870004  
                 month_Jun                   month_Oct                   month_Nov                   month_Dec  
                 0.0467819                   0.0594591                   0.0090612                          NA  
          behind_goal_line                  angle_ns_1                  angle_ns_2                  angle_ns_3  
                -0.0513810                   0.0534008                   0.0394828                  -0.2836569  

...
and 10 more lines.

Defense has a higher probability than the other positions. Is that right?

Here is how the PDPs look:
[Image: pdp]
Notice the shape (this is with the splines) and how the positions are arranged.

Should we be saying, like I did here, "Predicted probability of not being on goal" on the y-axis? Are the factor levels getting messed up at some point? Am I misunderstanding this somehow and this is all right, and I need to change to "Predicted probability of being on goal"?
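One way to check the direction question is to look at how `stats::glm()` handles a factor outcome: with `family = binomial`, the first factor level is treated as "failure", so the coefficients describe the probability of the other level. The sketch below uses a made-up outcome called `on_goal` with assumed levels `c("yes", "no")`; the variable name, the level order, and the simulated data are illustrative assumptions, not taken from the workshop data.

```r
# Hypothetical outcome and levels, for illustration only.
set.seed(2022)
n <- 1000
x <- rnorm(n)

# Larger x makes "yes" more likely.
on_goal <- ifelse(runif(n) < plogis(2 * x), "yes", "no")
on_goal <- factor(on_goal, levels = c("yes", "no"))

fit <- glm(on_goal ~ x, family = binomial)

# glm() treats the first factor level as "failure", so it models
# P(on_goal == "no"). The slope on x therefore comes out negative here.
slope <- unname(coef(fit)["x"])
print(slope)

# Fitted values are probabilities of the second level ("no").
p_no <- fitted(fit)
print(mean(p_no[on_goal == "no"]) > mean(p_no[on_goal == "yes"]))
```

If the real outcome has the event as its first level, then the raw coefficients printed above would be on the "not on goal" scale, while the `.pred_*` columns from `predict()` on the workflow are labeled by level and can be checked directly.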

Also, just for the record, these decks are not reproducible, I don't think (need more seeds?). This is more a note for later, but a lot changed when I rendered again in terms of xgboost results. I did not check in those lines with changes.
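As a minimal sketch of the seeding idea: resetting the seed immediately before each random step makes that step repeat exactly. The values and the use of `sample()` here are placeholders; whether extra seeds alone would make the xgboost results in these decks reproducible is not established in this thread.

```r
# Set the seed right before the random step.
set.seed(123)
first_draw <- sample(1:100, 5)

# Resetting to the same seed reproduces the same draw.
set.seed(123)
second_draw <- sample(1:100, 5)

print(identical(first_draw, second_draw))
```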

@juliasilge
Member Author

juliasilge commented Jul 21, 2022

One more note: we are getting a loooooooot of `prediction from a rank-deficient fit may be misleading` warnings, which participants will see when they work along through the feature engineering and tuning materials. We need to prep how to talk about this, or do we need to add `step_zv()` or similar to fix it?
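For context on where a rank-deficient fit comes from, here is a small base R sketch with simulated data (all names and values are made up). One predictor is an exact linear combination of two others, which gives the same kind of `NA` coefficient that appears in the output above. Depending on the R version, calling `predict()` with `newdata` on such a fit may then emit the rank-deficiency warning.

```r
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2  # exact linear combination of x1 and x2
y <- rbinom(n, 1, plogis(0.5 * x1 - 0.25 * x2))
dat <- data.frame(y, x1, x2, x3)

fit <- glm(y ~ x1 + x2 + x3, family = binomial, data = dat)

# The aliased column gets an NA coefficient.
print(coef(fit))

# Fewer estimable coefficients than model columns: the fit is rank deficient.
print(fit$rank)
print(length(coef(fit)))
```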

@topepo
Member

topepo commented Jul 21, 2022

There's a long thing in the annotations about the rank deficiency. We can solve it by adding the step for linear combinations. I wouldn't do that right away in the notes, but we can add it later to get rid of the warnings.
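The "step for linear combination" is presumably `step_lincomb()` from recipes; that reading is an assumption, as is everything in the toy data below. A minimal sketch of how it might be used:

```r
library(recipes)

# Simulated data with one exactly redundant predictor (illustrative only).
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$x3 <- dat$x1 + dat$x2
dat$y <- factor(ifelse(runif(n) < plogis(dat$x1), "yes", "no"))

# step_lincomb() drops columns until no exact linear combinations remain.
rec <- recipe(y ~ ., data = dat) |>
  step_lincomb(all_numeric_predictors())

baked <- bake(prep(rec), new_data = NULL)
print(names(baked))

# With the redundant column gone, no coefficient is NA.
fit <- glm(y ~ ., family = binomial, data = baked)
print(anyNA(coef(fit)))
```

In the workshop recipe this step would most likely go after `step_dummy()`, since the `NA` coefficients above are on dummy columns, but the exact placement is a judgment call.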

That shape makes a lot of sense. The x axis is going to give symmetric patterns since it is the distance from the centerline to each goal. It makes sense for shots to get a higher probability of being on goal as you get closer to the 89 foot mark on either side. It should rapidly decrease after that because you are behind the goal.
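The geometry can be sketched with a few made-up coordinates. The column names `coord_x`, `dist_from_center`, and `behind_goal_line` are assumptions for illustration (the recipe does produce a `behind_goal_line` term, but how it is computed is not shown in this thread).

```r
# Hypothetical x coordinates in feet, measured from the centerline.
coord_x <- c(-95, -89, -60, 0, 60, 89, 95)
goal_line <- 89  # the "89 foot mark" from the comment above

shots <- data.frame(
  coord_x = coord_x,
  # Same distance on either side of center, hence the symmetric pattern.
  dist_from_center = abs(coord_x),
  # Past the goal line on either end means the shot came from behind the goal.
  behind_goal_line = abs(coord_x) > goal_line
)
print(shots)
```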

@juliasilge
Member Author

And defenders are more likely to have a shot be on goal when they take it?

@topepo
Member

topepo commented Jul 21, 2022

> And defenders are more likely to have a shot be on goal when they take it?

They are, in general, less likely to have a shot on goal since almost all of the other players are in front of them.

@topepo
Member

topepo commented Jul 21, 2022

I'm getting into the habit of only committing sources (qmd + new pre-made images) and rendering in main once merged. The number of file changes + conflicts is a pain.

@topepo topepo merged commit 6161e43 into main Jul 21, 2022
@topepo topepo deleted the end-of-2nd-day branch July 21, 2022 14:28
Development

Successfully merging this pull request may close these issues.

Add material for end of Day 2
2 participants