Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add slate bandit dataset #82

Merged
merged 21 commits into from
Apr 18, 2021
Merged

Conversation

fullflu
Copy link
Contributor

@fullflu fullflu commented Mar 14, 2021

Overview

Add SyntheticSlateBanditDataset (based on SyntheticBanditDataset)

Tasks

@usaito
Copy link
Contributor

usaito commented Mar 17, 2021

@fullflu Thanks! I've looked through the slate dataset class and suggest some rephrasing as listed below.

  • impression_id -> slate_id
  • pscore_joint_above -> pscore_cascade
  • pscore_joint_all -> pscore
  • pscore_marginal -> pscore_item_position
  • action_set = np.arange(self.n_actions) -> unique_action_set = np.arange(self.n_actions)
  • you are using get sometimes which is ambiguous, I suggest you use more concrete verbs such as obtain, calc, or define
  • you set RIPS to reward_structure but RIPS is an estimator not an assumption. I think reward_structure should be one of cascade(=RIPS), item_position(=IIPS), or None (=SIPS, which allows any complex interaction)
  • def self.sample_action( -> self.sample_action_and_obtain_pscore(

[nits]

I think you can implement the same process as follows

# calculate joint pscore
pscore_i *= score_[sampled_action_index]
pscore_joint_above[i * self.len_list + position_] = pscore_i

@fullflu
Copy link
Contributor Author

fullflu commented Apr 3, 2021

  • impression_id -> slate_id
  • pscore_joint_above -> pscore_cascade
  • pscore_joint_all -> pscore
  • pscore_marginal -> pscore_item_position
  • action_set = np.arange(self.n_actions) -> unique_action_set = np.arange(self.n_actions)
  • you are using get sometimes which is ambiguous, I suggest you use more concrete verbs such as obtain, calc, or define
  • you set RIPS to reward_structure but RIPS is an estimator not an assumption. I think reward_structure should be one of cascade(=RIPS), item_position(=IIPS), or None (=SIPS, which allows any complex interaction) -> defined five reward structures (standard_additive, standard_exponential, independent, cascade_additive, cascade_exponential)
  • def self.sample_action( -> self.sample_action_and_obtain_pscore(
  • you use np.arange and range alternately. I suggest you use only np.arange (just because I use it in other parts of the package) if there is no reason to use both
  • calculate joint pscore

@fullflu fullflu changed the title [WIP] add slate bandit dataset Add slate bandit dataset Apr 3, 2021
@fullflu
Copy link
Contributor Author

fullflu commented Apr 3, 2021

Error: Unable to resolve action psf/black@stable, unable to find version stable

psf/black#2079

@usaito
Copy link
Contributor

usaito commented Apr 3, 2021

@fullflu

Thanks! Overall, the slate stuff is great; I can't wait to see some simulations results!

Please address the following minor comments.

[must]

[imo]

        slot_weight_matrix = np.identity(len_list)
        for position_ in np.arange(len_list):
            slot_weight_matrix[:, position_] = -1 / np.exp(
                np.abs(np.arange(len_list) - position_)
            )
        return slot_weight_matrix

instead of the current one below.
https://github.com/fullflu/zr-obp/blob/2a8b38b19698cb847eaa202b7c53f587c2ad526b/obp/dataset/synthetic_slate.py#L269-L278

[ask]

[nits]

@fullflu
Copy link
Contributor Author

fullflu commented Apr 4, 2021

[must]

[imo]

        slot_weight_matrix = np.identity(len_list)
        for position_ in np.arange(len_list):
            slot_weight_matrix[:, position_] = -1 / np.exp(
                np.abs(np.arange(len_list) - position_)
            )
        return slot_weight_matrix
instead of the current one below.

https://github.com/fullflu/zr-obp/blob/2a8b38b19698cb847eaa202b7c53f587c2ad526b/obp/dataset/synthetic_slate.py#L269-L278

--> skipped (because those lines are necessary to generate upper triangular matrix)

[ask]

[nits]

@usaito
Copy link
Contributor

usaito commented Apr 11, 2021

@fullflu Thanks! Some additional comments.

dataset/synthetic_slate.py

[ask]

[nits]

dataset/test_synthetic_slate.py

[nits]

@usaito usaito merged commit 9bd83c2 into st-tech:master Apr 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants