'XGBRanker' object has no attribute 'score' #19

Closed
paulperry opened this issue Jan 31, 2019 · 7 comments

@paulperry

But this stuff doesn't work! ;-)

I've created an XGBRanker object with the sklearn API and tried to use the rankeval effectiveness analysis. It requires a score() function, which makes sense, but I don't see that XGBRanker has one, and I don't know if sklearn requires one. Thoughts?

from rankeval.analysis.effectiveness import model_performance

model_perf = model_performance(
    datasets=[x_valid], 
    models=[model], 
    metrics=[precision_10, recall_10, ndcg_10])
model_perf.to_dataframe()

with the following output

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-707-76c7248fd353> in <module>()
      4     datasets=[x_valid],
      5     models=[model],
----> 6     metrics=[precision_10, recall_10, ndcg_10])
      7 model_perf.to_dataframe()

~/rankeval/rankeval/rankeval/analysis/effectiveness.py in model_performance(datasets, models, metrics, cache)
     54     for idx_dataset, dataset in enumerate(datasets):
     55         for idx_model, model in enumerate(models):
---> 56             y_pred = model.score(dataset, detailed=False, cache=cache)
     57             for idx_metric, metric in enumerate(metrics):
     58                 data[idx_dataset][idx_model][idx_metric] = metric.eval(dataset,


AttributeError: 'XGBRanker' object has no attribute 'score'
@strani
Contributor

strani commented Jan 31, 2019

This stuff works if used correctly ;-)

You are missing one step: converting the XGBoost model into an RTEnsemble model (the class that represents an ensemble of regression trees in RankEval).

For example, take a look at the following notebook showing how to use the RTEnsemble model in conjunction with LightGBM (XGBoost is almost the same).

So to summarize:

  • train the model with the xgboost API
  • save the model to a file in text format (not binary). This is done with the dump_model method of the XGBoost API.
  • load the model from that file with the RTEnsemble class
  • score, predict, or analyze with this model as needed; from this point on it is in the RankEval format

P.S. The dataset also has to be in the RankEval format. A raw numpy array is not accepted, but you can easily create a RankEval dataset from numpy arrays (using X, y and query_ids).
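
One practical detail for the dataset step: XGBRanker's fit accepts per-query group sizes through its group argument, whereas the RankEval dataset described above is built from X, y and query_ids. A minimal sketch of the conversion, assuming query_ids means one id per row; the helper name is hypothetical and is not part of either library:

```python
from itertools import chain, repeat


def group_sizes_to_query_ids(group_sizes):
    """Expand per-query group sizes into one query id per row.

    Hypothetical helper: [2, 3] becomes [0, 0, 1, 1, 1].
    """
    if any(size <= 0 for size in group_sizes):
        raise ValueError("every query must contain at least one row")
    # Repeat each query index as many times as the query has rows.
    return list(chain.from_iterable(
        repeat(qid, size) for qid, size in enumerate(group_sizes)))


query_ids = group_sizes_to_query_ids([2, 3, 1])
print(query_ids)  # [0, 0, 1, 1, 1, 2]
```

The resulting list can be converted to a numpy array and supplied as query_ids alongside X and y.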

@paulperry
Author

Closer, but I'm not able to reload the model. Do I need to worry about how the model was trained?

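from rankeval.model import RTEnsemble

filename = 'xgb.model'
# Dump the trained booster as text; RTEnsemble parses this text dump
xgb_model.get_booster().dump_model(filename)
rankeval_model = RTEnsemble(filename, name="XGB model", format="XGBoost")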

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-728-5975ee1d8cc2> in <module>()
      6 rankeval_model = None
      7 xgb_model.get_booster().dump_model(filename)
----> 8 rankeval_model = RTEnsemble(filename, name="XGB model", format="XGBoost")

~/rankeval/rankeval/rankeval/model/rt_ensemble.py in __init__(self, file_path, name, format, base_score, learning_rate, n_trees)
    124         elif format == "XGBoost":
    125             from rankeval.model import ProxyXGBoost
--> 126             ProxyXGBoost.load(file_path, self)
    127         elif format == "ScikitLearn":
    128             from rankeval.model import ProxyScikitLearn

~/rankeval/rankeval/rankeval/model/proxy_XGBoost.py in load(file_path, model)
    104                     node_id = int(match_leaf.group(1).strip()) + root_node
    105                     leaf_value = float(match_leaf.group(2).strip())
--> 106                     model.trees_nodes_value[node_id] = leaf_value
    107 
    108                 if match_node or match_leaf:

IndexError: index 344 is out of bounds for axis 0 with size 344

@strani
Contributor

strani commented Jan 31, 2019

I feel like your problem is related to the open issue #12 I recently discovered. It occurs when XGBoost, after fitting a tree, prunes out some nodes before going on with the boosting phase...

I'll try to fix it tomorrow...your code seems to be correct.
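
To illustrate the kind of failure described here: if a parser sizes its arrays by the number of nodes it finds in the dump, a pruned tree whose node ids have gaps will index past the end. The sketch below is plain Python on a hand-written dump in XGBoost's text format. It is not RankEval's parser, and the sample dump and helper name are made up for illustration.

```python
import re

# Hand-written sample in XGBoost's text dump format (hypothetical values).
# Node ids 3 and 4 are absent, as if the subtree under node 1 had been pruned.
SAMPLE_DUMP = (
    "booster[0]:\n"
    "0:[f0<0.5] yes=1,no=2,missing=1\n"
    "\t1:leaf=0.1\n"
    "\t2:[f1<1.5] yes=5,no=6,missing=5\n"
    "\t\t5:leaf=-0.2\n"
    "\t\t6:leaf=0.3\n"
)

# A node line starts with optional indentation, then "<id>:".
NODE_LINE = re.compile(r"^\s*(\d+):")


def node_id_stats(dump_text):
    """Return (number of node lines, largest node id) for a one-tree dump."""
    ids = []
    for line in dump_text.splitlines():
        match = NODE_LINE.match(line)
        if match:
            ids.append(int(match.group(1)))
    return len(ids), max(ids)


count, max_id = node_id_stats(SAMPLE_DUMP)
values = [0.0] * count  # sized by node count, as a naive parser might do
try:
    values[max_id] = 0.3  # indexing by the raw node id overflows
except IndexError:
    print("node id", max_id, "does not fit in", count, "slots")
```

Sizing by the largest id plus one, or remapping ids to consecutive positions, would avoid the overflow in this sketch. Whether that matches the actual fix is not stated in this thread.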

@strani
Contributor

strani commented Feb 1, 2019

The problem has been fixed in the develop branch. The fix will be merged into master soon.

@strani strani closed this as completed Feb 1, 2019
@paulperry
Author

paulperry commented Feb 4, 2019

I'm able to build/run master, but am not able to run develop:

import rankeval

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-c66b5899c31b> in <module>()
----> 1 import rankeval

~/anaconda3/lib/python3.6/site-packages/rankeval-0.7.2-py3.6-macosx-10.7-x86_64.egg/rankeval/__init__.py in <module>()
      9 
     10 __version__ = io.open(os.path.join(cur_dir, '..', 'VERSION'),
---> 11                       encoding='utf-8').read().strip()

FileNotFoundError: [Errno 2] No such file or directory: '/Users/paulperry/anaconda3/lib/python3.6/site-packages/rankeval-0.7.2-py3.6-macosx-10.7-x86_64.egg/rankeval/../VERSION'

I just copied the VERSION file over and it worked. I'm now past this problem and the model now loads.
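
The traceback shows the package reading a VERSION file one directory above rankeval/__init__.py, and that file was missing from the installed egg. A minimal sketch of a more forgiving read, assuming a fallback string is acceptable; read_version and its default value are hypothetical and are not what RankEval does:

```python
import io
import os


def read_version(package_dir, default="unknown"):
    """Read ../VERSION relative to package_dir, or return default if absent."""
    version_path = os.path.join(package_dir, "..", "VERSION")
    try:
        with io.open(version_path, encoding="utf-8") as handle:
            return handle.read().strip()
    except FileNotFoundError:
        # The file was not shipped with the install; do not fail the import.
        return default
```

Calling read_version(os.path.dirname(os.path.abspath(__file__))) from a package's __init__.py would mirror the lookup shown in the traceback.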

@strani
Contributor

strani commented Feb 11, 2019

Could you kindly retry? This bug should already have been fixed in master, so I merged all the PRs made there into the develop branch as well. Let me know.

@strani strani reopened this Feb 11, 2019
@paulperry
Author

I'm past this issue, so closing.
