Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conditional Contextual Bandit #1816

Merged
merged 72 commits into from Jul 9, 2019

Conversation

Projects
None yet
7 participants
@jackgerrits
Copy link
Member

commented Apr 1, 2019

Opening a draft PR for review and comment.

  • Implements ccb_explore_adf reduction
  • Implements CCB format for text, json and dsjson format

I want to point out that we are 100% leaking memory in CCB. Primarily around labels and predictions. I'll need to do a pass fixing them at some point. Leaks should be fixed

Still left to do:

  • Remove random sampling from learn branch

jackgerrits and others added some commits Feb 11, 2019

Add CCB label parsing, label and prediction types (#1754)
* Add initial types for ccb

* Add unit test file

* WIP ccb parser

* finish parsing component add tests

* implement ccb label caching

* Fix test

* Change parsing to agreed format

* Add missing header

* trigger ci
[CCB] Enable interactions to be overridden by an example (#1770)
* Enable per example interaction override

* rename to override_interactions
[CCB] Add ccb_explore_adf reduction (#1761)
* Add new empty reduction for CCB

* implement the ccb reduction

* parse and use the explicit included actions

* fix case where no action is available for a decision

* comments

* use unordered_set to manage excludelist and includelist

* fix build

* fix from comments

* cosmetics

* add a finish function

* fix indices with 0-based index

* handle decisions w/o actions

* cosmetics

* cosmetics: auto-format document

* comments

* add TODO
[CCB] Dsjson and json parsing for CCB (#1767)
* Add incomplete ccb json parser, start tests

* WIP

* WIP

* fix ccb, finish tests

* Address comments

* add extra check

* fix tests
Sample from PDF and swap chosen as part of CCB (#1797)
* Sample as part of ccb

* remove test code

* Move sampling to it's own reduction. allow a custom seed in tag

* Fix pred file in test

* ccb uses cb_sample

* Fix compile break
Extend CCB parser for Dsjson/Json to support implicitly converting CB…
… examples to CCB examples (#1805)

* Extend ccb parser mode to implicitly convert cb logs

* normal json cb_as_ccb

* Fix merge conflict
Implement finish example and output for CCB (#1795)
* Add output for CCB

* Generify get_unbiased_cost, add const

* Move constant to top of namespace
Shared example decision feature injection and auto crossing (#1796)
* First cut at automatic interactions and merging decision example with shared example

* Move to use label dict for feature mutation

* Remove code not for this PR

* Reserve vector memory and add named default_namespace

* remove cast

* Fix bad merge

@jackgerrits jackgerrits added the CCB label Apr 1, 2019

@yannstad

This comment has been minimized.

Copy link
Collaborator

commented Apr 10, 2019

I already reviewed and approved this code before this PR was submitted. Also, I wrote some part of it, so reviewing again is not helpful.
Can someone else have a look? @rajan-chari @kumpera @ataymano ?

Show resolved Hide resolved vowpalwabbit/conditional_contextual_bandit.cc Outdated
Show resolved Hide resolved vowpalwabbit/conditional_contextual_bandit.cc Outdated
void clear_all(ccb& data)
{
data.shared = nullptr;
data.actions.clear();

This comment has been minimized.

Copy link
@kumpera

kumpera Apr 10, 2019

Collaborator

This could be a problem if we're dealing with a large enough set of actions/decisions that calling clear actually relocates the underlying array.

This comment has been minimized.

Copy link
@jackgerrits

jackgerrits May 2, 2019

Author Member

The size of the vector should be sufficiently small for it not to matter here, but I agree this is not great. Would be good if there was a way to clear it and guarantee the memory is not deallocated

This comment has been minimized.

Copy link
@rajan-chari

rajan-chari Jun 28, 2019

Member

v_array?

This comment has been minimized.

Copy link
@jackgerrits

jackgerrits Jun 28, 2019

Author Member

I don't think using v_array is necessary here. v_array still reallocates as well after a number of clears

Show resolved Hide resolved vowpalwabbit/conditional_contextual_bandit.cc Outdated

jackgerrits added some commits Apr 11, 2019

@jackgerrits jackgerrits marked this pull request as ready for review Apr 11, 2019

@jackgerrits

This comment has been minimized.

Copy link
Member Author

commented Apr 11, 2019

Had to unmark it is as draft so appveyor would build it

@JohnLangford
Copy link
Member

left a comment

Reviewed everything except ccb itself.

{VW} --warm_cb 10 --cb_explore_adf --cb_type mtr --epsilon 0.05 --warm_start 3 --interaction 7 --choices_lambda 8 --warm_start_update --interaction_update --corrupt_type_warm_start 2 --corrupt_prob_warm_start 0.5 -d train-sets/multiclass
train-sets/ref/warm_cb_cyc.stderr
# Test 185 warm_cb warm start with input cost-sensitive examples
# Test 186 warm_cb warm start with input cost-sensitive examples
{VW} --warm_cb 3 --cb_explore_adf --cb_type mtr --epsilon 0.05 --warm_start 1 --interaction 2 --choices_lambda 8 --warm_start_update --interaction_update --warm_cb_cs -d train-sets/cs_cb
train-sets/ref/warm_cb_cs.stderr

This comment has been minimized.

Copy link
@JohnLangford

JohnLangford Apr 15, 2019

Member

We're missing an end-to-end test of CCB.

This comment has been minimized.

Copy link
@jackgerrits

jackgerrits May 2, 2019

Author Member

We definitely are, this is on the todo

This comment has been minimized.

Copy link
@JohnLangford

JohnLangford Jun 3, 2019

Member

Still missing...

This comment has been minimized.

Copy link
@jackgerrits

jackgerrits Jun 7, 2019

Author Member

I've been waiting to settle on the learning algorithm before adding the test

Show resolved Hide resolved vowpalwabbit/cache.cc
Show resolved Hide resolved vowpalwabbit/cache.h Outdated
Show resolved Hide resolved vowpalwabbit/cb_algs.h
Show resolved Hide resolved vowpalwabbit/cb_sample.cc Outdated
Show resolved Hide resolved vowpalwabbit/example.h Outdated
Show resolved Hide resolved vowpalwabbit/parser.cc Outdated
Show resolved Hide resolved vowpalwabbit/parse_args.cc Outdated
Show resolved Hide resolved vowpalwabbit/conditional_contextual_bandit.h Outdated
Show resolved Hide resolved vowpalwabbit/conditional_contextual_bandit.h

jackgerrits and others added some commits Jun 26, 2019

@JohnLangford

This comment has been minimized.

Copy link
Member

commented Jul 1, 2019

Is this mergable today?

@jackgerrits

This comment has been minimized.

Copy link
Member Author

commented Jul 1, 2019

I think so, at this point I've addressed all comments.

@jackgerrits

This comment has been minimized.

Copy link
Member Author

commented Jul 2, 2019

I added a benchmark for the bool bookkeeping case. Turns out that with the knowledge of the fixed size we can be significantly more efficient.

2019-07-02 09:51:53
Running ./perf_tests
Run on (40 X 2195 MHz CPU s)
Load Average: 0.52, 0.58, 0.59
----------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations
----------------------------------------------------------------------
set                               1106 ns         1099 ns       640000
vector_binary_search_insert       1025 ns         1025 ns       746667
no_deduplication                  1058 ns         1050 ns       640000
fixed_bool_bookkeeping             430 ns          430 ns      1600000
@JohnLangford

This comment has been minimized.

Copy link
Member

commented Jul 2, 2019

Good. If you profile, is the computational complexity not in this reduction?

@jackgerrits

This comment has been minimized.

Copy link
Member Author

commented Jul 2, 2019

Here is the runtime for learn down the entire stack. The thing to keep in mind is that since they are recursive the numbers accumulate.
image

Internally to CCB, the primary cost is the interactions generation. So it is a very good thing that we went with the implementation that we did.
image

Note: This is with 4000 samples a second which was the high setting, which can itself affect the perf.

@jackgerrits jackgerrits changed the title [Draft for comment] Conditional Contextual Bandit Conditional Contextual Bandit Jul 4, 2019

@jackgerrits

This comment has been minimized.

Copy link
Member Author

commented Jul 9, 2019

@JohnLangford Do you think this is ready to merge?

@JohnLangford JohnLangford merged commit b4a54fc into master Jul 9, 2019

8 of 9 checks passed

coverage/coveralls Coverage decreased (-0.3%) to 73.017%
Details
LGTM analysis: C# No code changes detected
Details
LGTM analysis: C/C++ No new or fixed alerts
Details
LGTM analysis: Java No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No code changes detected
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
@JohnLangford

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

Yep merged. Sorry about the slowness---I ended up sick for the last week-or-so.

This is a big one. Congrats :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.