Conditional Contextual Bandit (#1816)

* Add CCB label parsing, label and prediction types (#1754)

* Add initial types for ccb

* Add unit test file

* WIP ccb parser

* finish parsing component add tests

* implement ccb label caching

* Fix test

* Change parsing to agreed format

* Add missing header

* trigger ci

* [CCB] Enable interactions to be overridden by an example (#1770)

* Enable per example interaction override

* rename to override_interactions

* [CCB] Add ccb_explore_adf reduction (#1761)

* Add new empty reduction for CCB

* implement the ccb reduction

* parse and use the explicit included actions

* fix case where no action is available for a decision

* comments

* use unordered_set to manage excludelist and includelist

* fix build

* fix from comments

* cosmetics

* add a finish function

* fix indices with 0-based index

* handle decisions w/o actions

* cosmetics

* cosmetics: auto-format document

* comments

* add TODO

* Fix ordering to be consistent with existing cb reduction (#1786)

* [CCB] Dsjson and json parsing for CCB (#1767)

* Add incomplete ccb json parser, start tests

* WIP

* WIP

* fix ccb, finish tests

* Address comments

* add extra check

* fix tests

* Sample from PDF and swap chosen as part of CCB (#1797)

* Sample as part of ccb

* remove test code

* Move sampling to its own reduction. Allow a custom seed in tag

* Fix pred file in test

* ccb uses cb_sample

* Fix compile break

* Extend CCB parser for Dsjson/Json to support implicitly converting CB examples to CCB examples (#1805)

* Extend ccb parser mode to implicitly convert cb logs

* normal json cb_as_ccb

* Fix merge conflict

* Implement finish example and output for CCB (#1795)

* Add output for CCB

* Generify get_unbiased_cost, add const

* Move constant to top of namespace

* Shared example decision feature injection and auto crossing (#1796)

* First cut at automatic interactions and merging decision example with shared example

* Move to use label dict for feature mutation

* Remove code not for this PR

* Reserve vector memory and add named default_namespace

* remove cast

* Fix bad merge

* Fix learning and stashed pred (#1836)

* trigger ci

* trigger ci

* "remove cb_sample_seed, only can override with tag"

* - comment out explicit slot actions
- use variables in sizeof
- change from unordered_map and unordered_set to using vector

* fix spelling error

* Use variable names instead of types for sizeof

* stop using read_object helper

* Rename get_unbiased_cost to get_cost_estimate

* Fix test number in runtests

* C# test needs an extra empty line to recognize the data file as multiline

* change seed handling to match explore

* move to asserts

* remove unused variable

* Update overridable interactions implementation (#1866)

* Use overridable interactions per example

* Change line endings

* Implement history rollup and higher order interactions for CCB (#1869)

* Implement history rollup and higher order interaction

* Change to todo comment

* Fix naming based on spec discussion (#1879)

* Resolve leaking of label and prediction in CCB (#1882)

* Clean up and restore label correctly

* Fix clear issue with decision_scores

* CCB Learning fixes (#1890)

* Add ccb sim

* Update namespaces

* Must finish example to return to pool

* add cmake

* rename decision to slot

* Update to user, slot, action display

* pre-rerepresentation

* move to different representation of probs

* Implement new display

* Fix sampling and id association

* Explore seed message, skip clearing history, insert interactions to all examples, add constant into history namespace

* remove sim from learning fix branch

* Move function definition

* do not sample when learning, but use label

* Sample reduction should swap the labelled action for learning

* Fix double def

* update sample test for learning based sampling

* Fix parser constructors in test

* remove history related changes

* split ccb into two files, remove history

* Automatic quadratic features

* Add ccb test

* Fix runtests, includes and move to warning

* fix runtests file

* Fix warning

* Add new line so windows tests work

* move cb_sample beneath shared_merger

* use seed for sample for consistent behavior

* Make sure interactions are set in locations where examples are allocated (#1942)

* Make sure interactions are set in locations where examples are allocated

* remove added file

* Fix ptr

* Reuse memory in ccb rather than alloc/dealloc (#1938)

* reuse memory

* Change v_array pool implementation to work by value

* Move to std::vector

* Inject slot id and revise automatic interactions to slot id with all existing namespaces and interactions (#1945)

* implement revised ccb interactions and id

* remove file

* Address comments

* Overwrite test results

* Add audit

* fix feature injection

* Fix namespace duplication and ignore bugs

* Fix interaction object for ccb parser (#1947)

* Address comments

* Fix scope of clear

* Fix scope of clear

* Update tests

* Address PR comments

* Remove json parser

* Fix usage of json_parser

* Fix bad merge of tests

* revert file change

* Remove if conditional

* rename dfstate to slotsstate

* Improve error handling, and improve perf of interactions generation

* Fix brackets
jackgerrits authored and JohnLangford committed Jul 9, 2019
1 parent b0a4601 commit b4a54fcf2e434ad7550d73aa4910f603ac12ba5e
Showing with 2,837 additions and 159 deletions.
  1. +17 −17 cs/cli/vw_cbutil.cpp
  2. +8 −0 explore/explore_internal.h
  3. +2 −1 java/src/main/c++/jni_spark_vw.cc
  4. +1 −0 python/pylibvw.cc
  5. +9 −0 test/RunTests
  6. +4 −0 test/pred-sets/ref/cb_sample_seed.predict
  7. +7 −0 test/test-sets/cb_sample_seed.data
  8. +240 −0 test/train-sets/ccb_test.dat
  9. +96 −0 test/train-sets/ref/ccb_test.predict
  10. +22 −0 test/train-sets/ref/ccb_test.stderr
  11. +2 −1 test/unit_test/CMakeLists.txt
  12. +143 −0 test/unit_test/ccb_parser_test.cc
  13. +39 −0 test/unit_test/ccb_test.cc
  14. +267 −0 test/unit_test/dsjson_parser_test.cc
  15. +226 −0 test/unit_test/json_parser_test.cc
  16. +2 −0 test/unit_test/test_common.h
  17. +6 −2 test/unit_test/unit_test.vcxproj
  18. +10 −1 test/unit_test/unit_test.vcxproj.filters
  19. +3 −2 vowpalwabbit/CMakeLists.txt
  20. +2 −0 vowpalwabbit/baseline.cc
  21. +9 −0 vowpalwabbit/cache.cc
  22. +5 −0 vowpalwabbit/cache.h
  23. +1 −10 vowpalwabbit/cb.cc
  24. +1 −1 vowpalwabbit/cb_adf.cc
  25. +1 −1 vowpalwabbit/cb_algs.cc
  26. +11 −4 vowpalwabbit/cb_algs.h
  27. +1 −1 vowpalwabbit/cb_explore.cc
  28. +1 −1 vowpalwabbit/cb_explore_adf.cc
  29. +98 −0 vowpalwabbit/cb_sample.cc
  30. +3 −0 vowpalwabbit/cb_sample.h
  31. +1 −0 vowpalwabbit/cbify.cc
  32. +333 −0 vowpalwabbit/ccb_label.cc
  33. +37 −0 vowpalwabbit/ccb_label.h
  34. +672 −0 vowpalwabbit/conditional_contextual_bandit.cc
  35. +28 −0 vowpalwabbit/conditional_contextual_bandit.h
  36. +5 −14 vowpalwabbit/cost_sensitive.cc
  37. +1 −0 vowpalwabbit/example.cc
  38. +7 −0 vowpalwabbit/example.h
  39. +3 −0 vowpalwabbit/example_predict.h
  40. +1 −1 vowpalwabbit/explore_eval.cc
  41. +1 −0 vowpalwabbit/expreplay.h
  42. +4 −4 vowpalwabbit/gd.h
  43. +6 −1 vowpalwabbit/global_data.h
  44. +2 −2 vowpalwabbit/interactions.cc
  45. +11 −5 vowpalwabbit/interactions.h
  46. +2 −1 vowpalwabbit/learner.h
  47. +2 −2 vowpalwabbit/mwt.cc
  48. +3 −0 vowpalwabbit/nn.cc
  49. +103 −8 vowpalwabbit/object_pool.h
  50. +3 −0 vowpalwabbit/parse_args.cc
  51. +317 −72 vowpalwabbit/parse_example_json.h
  52. +11 −0 vowpalwabbit/parse_primitives.cc
  53. +3 −0 vowpalwabbit/parse_primitives.h
  54. +6 −6 vowpalwabbit/parser.cc
  55. +0 −1 vowpalwabbit/parser.h
  56. +1 −0 vowpalwabbit/search_dep_parser.cc
  57. +1 −0 vowpalwabbit/search_entityrelationtask.cc
  58. +1 −0 vowpalwabbit/search_sequencetask.cc
  59. +1 −0 vowpalwabbit/stagewise_poly.cc
  60. +28 −0 vowpalwabbit/v_array_pool.h
  61. +6 −0 vowpalwabbit/vw_core.vcxproj
@@ -1,17 +1,17 @@
/*
Copyright (c) by respective owners including Yahoo!, Microsoft, and
individual contributors. All rights reserved. Released under a BSD (revised)
license as described in the file LICENSE.
*/

#include "vw_cbutil.h"
#include "cb_algs.h"

namespace VW
{
float VowpalWabbitContextualBanditUtil::GetUnbiasedCost(uint32_t actionObservered, uint32_t actionTaken, float cost, float probability)
{ CB::cb_class observation = { cost, actionObservered, probability };

return CB_ALGS::get_unbiased_cost(&observation, actionTaken);
}
}
/*
Copyright (c) by respective owners including Yahoo!, Microsoft, and
individual contributors. All rights reserved. Released under a BSD (revised)
license as described in the file LICENSE.
*/

#include "vw_cbutil.h"
#include "cb_algs.h"

namespace VW
{
float VowpalWabbitContextualBanditUtil::GetUnbiasedCost(uint32_t actionObservered, uint32_t actionTaken, float cost, float probability)
{ CB::cb_class observation = { cost, actionObservered, probability };

return CB_ALGS::get_cost_estimate(&observation, actionTaken);
}
}
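
The wrapper above passes a logged observation (cost, logged action, probability) and the action being evaluated to `CB_ALGS::get_cost_estimate`. As a rough illustration of what an inverse-propensity style cost estimate computes, here is a standalone sketch. The names `LoggedObservation` and `ips_cost_estimate` are hypothetical and this is not the library's code; the real helper may differ in detail, and the guard for a non-positive probability is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a logged bandit observation (not CB::cb_class).
struct LoggedObservation
{
  float cost;         // cost observed for the logged action
  uint32_t action;    // action that was taken when the data was logged
  float probability;  // probability with which the logging policy chose it
};

// Inverse-propensity estimate of the cost of `action`:
// cost / probability when `action` is the logged action, otherwise 0.
inline float ips_cost_estimate(const LoggedObservation& obs, uint32_t action)
{
  // Assumption of this sketch: treat an invalid probability as "no information".
  if (obs.probability <= 0.f)
    return 0.f;
  return obs.action == action ? obs.cost / obs.probability : 0.f;
}
```

With a logged cost of 1.0 at probability 0.5, the estimate for the logged action is 2.0 and the estimate for any other action is 0.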
@@ -229,6 +229,8 @@ namespace exploration
return enforce_minimum_probability(minimum_uniform, update_zero_elements, pdf_first, pdf_last, pdf_category());
}

// Warning: `seed` must be sufficiently random for the PRNG to produce uniform random values. Using sequential seeds will result in a very biased distribution.
// If unsure how to update seed between calls, merand48 (in rand48.h) can be used to inplace mutate it.
template<typename It>
int sample_after_normalizing(uint64_t seed, It pdf_first, It pdf_last, uint32_t& chosen_index, std::input_iterator_tag /* pdf_category */)
{
@@ -277,20 +279,26 @@ namespace exploration
return S_EXPLORATION_OK;
}

// Warning: `seed` must be sufficiently random for the PRNG to produce uniform random values. Using sequential seeds will result in a very biased distribution.
// If unsure how to update seed between calls, merand48 (in rand48.h) can be used to inplace mutate it.
template<typename It>
int sample_after_normalizing(uint64_t seed, It pdf_first, It pdf_last, uint32_t& chosen_index)
{
typedef typename std::iterator_traits<It>::iterator_category pdf_category;
return sample_after_normalizing(seed, pdf_first, pdf_last, chosen_index, pdf_category());
}

// Warning: `seed` must be sufficiently random for the PRNG to produce uniform random values. Using sequential seeds will result in a very biased distribution.
// If unsure how to update seed between calls, merand48 (in rand48.h) can be used to inplace mutate it.
template<typename It>
int sample_after_normalizing(const char* seed, It pdf_first, It pdf_last, uint32_t& chosen_index, std::random_access_iterator_tag pdf_category)
{
uint64_t seed_hash = uniform_hash(seed, strlen(seed), 0);
return sample_after_normalizing(seed_hash, pdf_first, pdf_last, chosen_index, pdf_category);
}

// Warning: `seed` must be sufficiently random for the PRNG to produce uniform random values. Using sequential seeds will result in a very biased distribution.
// If unsure how to update seed between calls, merand48 (in rand48.h) can be used to inplace mutate it.
template<typename It>
int sample_after_normalizing(const char* seed, It pdf_first, It pdf_last, uint32_t& chosen_index)
{
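
The `sample_after_normalizing` overloads above take a seed and a PDF and write the chosen index. The idea can be sketched in isolation as follows. This is a minimal sketch, not the explore library's implementation: it swaps in `std::mt19937_64` for the library's own PRNG, the name `sample_index_sketch` is hypothetical, and rejecting an all-zero PDF is a choice made here for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Illustrative only: draws an index from an unnormalized PDF using a seeded
// standard-library engine. Returns false if the PDF is empty, contains a
// negative entry, or sums to zero.
inline bool sample_index_sketch(uint64_t seed, const std::vector<float>& pdf, uint32_t& chosen_index)
{
  double total = 0.0;
  for (float p : pdf)
  {
    if (p < 0.f)
      return false;
    total += p;
  }
  if (pdf.empty() || total <= 0.0)
    return false;

  // Drawing in [0, total) is equivalent to normalizing the PDF first.
  std::mt19937_64 engine(seed);
  std::uniform_real_distribution<double> uniform(0.0, total);
  const double draw = uniform(engine);

  // Walk the cumulative sum until it passes the draw.
  double cumulative = 0.0;
  for (std::size_t i = 0; i < pdf.size(); ++i)
  {
    cumulative += pdf[i];
    if (pdf[i] > 0.f && draw < cumulative)
    {
      chosen_index = static_cast<uint32_t>(i);
      return true;
    }
  }

  // Rounding can leave the draw at the very top of the range;
  // fall back to the last entry with non-zero mass.
  for (std::size_t i = pdf.size(); i-- > 0;)
  {
    if (pdf[i] > 0.f)
    {
      chosen_index = static_cast<uint32_t>(i);
      return true;
    }
  }
  return false;
}
```

Calling the sketch twice with the same seed and PDF yields the same index, which is the property the `seed=` tag in the `cb_sample` test data appears to rely on.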
@@ -201,6 +201,7 @@ JNIEXPORT jlong JNICALL Java_org_vowpalwabbit_spark_VowpalWabbitExample_initiali
try
{
example* ex = VW::alloc_examples(0, 1);
ex->interactions = &all->interactions;

if (isEmpty)
{
@@ -426,4 +427,4 @@ JNIEXPORT jobject JNICALL Java_org_vowpalwabbit_spark_VowpalWabbitExample_predic
{
rethrow_cpp_exception_as_java_exception(env);
}
}
}
@@ -148,6 +148,7 @@ example* my_empty_example0(vw_ptr vw, size_t labelType)
{ label_parser* lp = get_label_parser(&*vw, labelType);
example* ec = VW::alloc_examples(lp->label_size, 1);
lp->default_label(&ec->l);
ec->interactions = &vw->interactions;
if (labelType == lCOST_SENSITIVE)
{ COST_SENSITIVE::wclass zero = { 0., 1, 0., 0. };
ec->l.cs.costs.push_back(zero);
@@ -1739,4 +1739,13 @@ printf '3 |f a b c |e x y z\n2 |f a y c |e x\n' | {VW} --oaa 3 -q ef --audit
{VW} -d train-sets/rcv1_smaller.dat --memory_tree 10 --learn_at_leaf --max_number_of_labels 2 --dream_at_update 0 --dream_repeats 3 --leaf_example_multiplier 10 --alpha 0.1 -l 0.001 -b 15 -c --passes 2 --loss_function squared --holdout_off
train-sets/ref/cmt_rcv1_smaller_offline.stderr
# Test 198: test cb_sample
{VW} --cb_sample --cb_explore_adf -d test-sets/cb_sample_seed.data -p cb_sample_seed.predict --random_seed 1234
pred-sets/ref/cb_sample_seed.predict
# Test 199: CCB train then test
{VW} -d train-sets/ccb_test.dat --ccb_explore_adf -p ccb_test.predict
train-sets/ref/ccb_test.stderr
train-sets/ref/ccb_test.predict
# Do not delete this line or the empty line above it
@@ -0,0 +1,4 @@
1:0.5,0:0.5

0:0.5,1:0.5

@@ -0,0 +1,7 @@
shared seed=1234| s_1 s_2
| a:1 b:1 c:1
| a:0.5 b:2 c:1

shared seed=1234| s_1 s_2
0:1.0:0.5 | a:1 b:1 c:1
| a:0.5 b:2 c:1
@@ -0,0 +1,240 @@
ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 0:0:0.2 |Slot h
ccb slot 1:0:0.25 |Slot i
ccb slot 2:0:0.333333 |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 3:0:0.2 |Slot h
ccb slot 4:0:0.25 |Slot i
ccb slot 0:0:0.333333 |Slot j

ccb shared |User c
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.2 |Slot h
ccb slot 3:0:0.25 |Slot i
ccb slot 2:0:0.333333 |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 1:0:0.2 |Slot h
ccb slot 4:-1:0.25 |Slot i
ccb slot 2:0:0.333333 |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.84 |Slot h
ccb slot 3:0:0.316667 |Slot i
ccb slot 1:-1:0.466667 |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 1:0:0.04 |Slot h
ccb slot 4:-1:0.85 |Slot i
ccb slot 0:0:0.866667 |Slot j

ccb shared |User c
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:-1:0.84 |Slot h
ccb slot 1:0:0.85 |Slot i
ccb slot 2:0:0.0666667 |Slot j

ccb shared |User c
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:-1:0.84 |Slot h
ccb slot 1:0:0.85 |Slot i
ccb slot 0:0:0.866667 |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.84 |Slot h
ccb slot 1:0:0.85 |Slot i
ccb slot 2:0:0.0666667 |Slot j

ccb shared |User c
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.84 |Slot h
ccb slot 1:0:0.85 |Slot i
ccb slot 0:0:0.0666667 |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.84 |Slot h
ccb slot 1:0:0.85 |Slot i
ccb slot 3:0:0.866667 |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.84 |Slot h
ccb slot 1:0:0.85 |Slot i
ccb slot 2:0:0.0666667 |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.84 |Slot h
ccb slot 1:0:0.85 |Slot i
ccb slot 3:0:0.866667 |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 4:0:0.84 |Slot h
ccb slot 2:-1:0.05 |Slot i
ccb slot 1:0:0.866667 |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot 2:0:0.84 |Slot h
ccb slot 4:0:0.85 |Slot i
ccb slot 1:0:0.866667 |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User c
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User c
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User b
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j

ccb shared |User a
ccb action |Action d
ccb action |Action e
ccb action |Action f
ccb action |Action ff
ccb action |Action fff
ccb slot |Slot h
ccb slot |Slot i
ccb slot |Slot j
