Gradient boosting tutorial never actually touches `X_train` and `y_train` #185

mobiusklein · 2023-07-03T16:23:13Z

Pre-issue creation checklist

No duplicates of this issue have already been created.

Name of the tutorial/dataset

NIST (part 2): Traditional ML: Gradient Boosting

Describe the mistake or typo

The gradient boosting tutorial defines `X_train` and `y_train` but never actually uses them: both `reg.fit` calls shown below are given `X_test` and `y_test` instead.

Additional context

Lines that use `X_test`/`y_test` where the context suggests `X_train`/`y_train`:

ProteomicsML/tutorials/fragmentation/_nist-2-traditional-ml-gradient-boosting.ipynb

Lines 402 to 412 in 512dc93

    "outputs": [],
    "source": [
     "reg =  GradientBoostingRegressor()\n",
     "\n",
     "X_train = train_val_encoded.drop(columns=[\"spectrum_id\", \"b_target\",  \"y_target\"])\n",
     "y_train = train_val_encoded[\"b_target\"]\n",
     "X_test = test_encoded.drop(columns=[\"spectrum_id\", \"b_target\",  \"y_target\"])\n",
     "y_test = test_encoded[\"b_target\"]\n",
     "\n",
     "reg.fit(X_test, y_test)"
    ]
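
For reference, a minimal sketch of what I would expect this cell to do: fit on the training split and keep the test split for evaluation only. The synthetic data, the `make_encoded` helper, and the feature names `f0` to `f3` are made up so the snippet runs on its own; only `spectrum_id`, `b_target`, `y_target`, and the variable names come from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def make_encoded(n_rows):
    """Hypothetical stand-in for the notebook's encoded tables."""
    features = rng.normal(size=(n_rows, 4))
    frame = pd.DataFrame(features, columns=["f0", "f1", "f2", "f3"])
    frame["spectrum_id"] = np.arange(n_rows)
    frame["b_target"] = features[:, 0] + 0.1 * rng.normal(size=n_rows)
    frame["y_target"] = features[:, 1] + 0.1 * rng.normal(size=n_rows)
    return frame

train_val_encoded = make_encoded(200)
test_encoded = make_encoded(50)

non_feature_columns = ["spectrum_id", "b_target", "y_target"]
X_train = train_val_encoded.drop(columns=non_feature_columns)
y_train = train_val_encoded["b_target"]
X_test = test_encoded.drop(columns=non_feature_columns)
y_test = test_encoded["b_target"]

reg = GradientBoostingRegressor(random_state=0)
reg.fit(X_train, y_train)             # fit on the training split
test_r2 = reg.score(X_test, y_test)   # R^2 on the untouched test split
```

With the fit and the evaluation on different splits, `test_r2` reflects generalization rather than how well the model memorized the data it was scored on.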

ProteomicsML/tutorials/fragmentation/_nist-2-traditional-ml-gradient-boosting.ipynb

Lines 442 to 452 in 512dc93

    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
     "def objective(n_estimators):\n",
     "    # Define algorithm\n",
     "    reg =  GradientBoostingRegressor(n_estimators=n_estimators)\n",
     "\n",
     "    # Fit model\n",
     "    reg.fit(X_test, y_test)\n",

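The excerpt above cuts off before the objective returns anything, so I cannot tell what it scores against. One possible shape for it, as a sketch only: fit and score on the training data with cross-validation so the test set stays out of hyperparameter tuning. The use of `cross_val_score`, the mean absolute error metric, and the synthetic arrays are my assumptions, not what the notebook does.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical training data standing in for the notebook's X_train / y_train.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(150, 4))
y_train = X_train[:, 0] + 0.1 * rng.normal(size=150)

def objective(n_estimators):
    # Define algorithm
    reg = GradientBoostingRegressor(n_estimators=n_estimators, random_state=0)

    # Fit and score on training folds only; the test set is never seen here
    scores = cross_val_score(
        reg, X_train, y_train, cv=3, scoring="neg_mean_absolute_error"
    )

    # Return a loss to minimize (mean absolute error across folds)
    return -scores.mean()

errors = {n: objective(n) for n in (10, 100)}
```

Whatever optimizer calls `objective`, the point is the same: the value it minimizes should come from data the final test evaluation does not reuse.
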
Also, this may just be me misreading the notebook, but the text says the trained model can only predict y-ion intensities, while the training target in the first snippet is `b_target`, not `y_target` (see line 407).

ProteomicsML/tutorials/fragmentation/_nist-2-traditional-ml-gradient-boosting.ipynb

Line 619 in 512dc93

    
           "And of course, this model can only predict y-ion intensities. You can repeat the\n",

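If the intent is to cover both ion types, one option is a separate regressor per target column. This is only a sketch under that assumption: the `models` dictionary and the synthetic frame are hypothetical, and only the column names `spectrum_id`, `b_target`, and `y_target` are taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical stand-in for train_val_encoded.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
train_val_encoded = pd.DataFrame(features, columns=["f0", "f1", "f2"])
train_val_encoded["spectrum_id"] = np.arange(200)
train_val_encoded["b_target"] = features[:, 0]
train_val_encoded["y_target"] = features[:, 1]

target_columns = ["b_target", "y_target"]
X_train = train_val_encoded.drop(columns=["spectrum_id"] + target_columns)

# One regressor per ion type, each fitted on its own target column.
models = {
    target: GradientBoostingRegressor(random_state=0).fit(
        X_train, train_val_encoded[target]
    )
    for target in target_columns
}
y_ion_predictions = models["y_target"].predict(X_train)
```

Either way, the target column used for training and the ion type named in the text should agree.
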
RobbinBouwmeester assigned RalfG Jul 18, 2023
