Connects Campaign Manager to the RTB4FREE bidders
Scott Switzer
Latest commit cf0321e Jul 10, 2019


Crosstalk - Communication Interface for RTB4Free

Crosstalk is a Java 1.8-based interface between the Campaign Manager and the Bidder. The Campaign Manager creates, edits and maintains its campaigns in a MySQL database. Crosstalk converts those MySQL tables into a JSON representation of the campaigns that the bidders understand.

An image of this repo is available directly from Docker Hub.

Crosstalk loads the JSON representation of the campaigns into Zerospike, which is the shared context of the bidders. Crosstalk then tells the bidders which campaigns in the Zerospike store they should load.

Crosstalk also handles budgets. By querying Elastic Search, it knows, up to the minute, what the spend on every campaign is. If a campaign exceeds its budget, Crosstalk tells the bidders to unload the running campaign.

Docker

Crosstalk is designed to run as a Docker container.

Docker Swarm

Use Docker Swarm to run Crosstalk, Bidder, Zerospike, Kafka and Zookeeper:

  1. Copy docker-compose.yml from the project's root directory.

  2. Start the swarm

    $docker swarm init

  3. Start the network

    $docker network create --driver overlay rtb_net --subnet=10.0.9.0/24

  4. Deploy

    $docker stack deploy -c docker-compose.yml crosstalk

Docker Compose

Use Docker Compose to run Crosstalk, Bidder, Zerospike, Kafka and Zookeeper in a single console window:

  1. Copy docker-compose.yml from the project's root directory.

  2. Start the network

    $docker network create rtb_net

  3. Deploy

    $docker-compose up

Docker Stack

To run the entire set of crosstalk, bidder, zerospike, kafka and zookeeper services as a swarm:

$docker swarm init
$docker network create --driver overlay rtb_net --subnet=10.0.9.0/24

#
# Join any workers, if desired; then,
#

$docker stack deploy -c docker-compose.yml crosstalk

Working with Source

If you want to modify the code:

  1. Clone the repository with git and cd into the cloned directory.

  2. Make your changes...

  3. Run maven:

    $mvn assembly:assembly -DdescriptorId=jar-with-dependencies -Dmaven.test.skip=true

  4. Build the Docker image locally (replace jacamars with the name of your own repository):

    $docker build -t jacamars/crosstalk:v1 -f Docker.crosstalk .

  5. If you need to push to the repo:

    $docker push jacamars/crosstalk:v1

Changing Operational Parameters

Crosstalk reads its parameters from a config.json file inside the container. If you need to change those parameters, edit your own copy and mount it over the container's file with a volumes directive. For example, suppose you copied config.json, modified it, and named the copy ./myconfig.json. You would then modify the crosstalk service section in docker-compose.yml to mount it. Note the volumes directive:

    crosstalk:
      image: "rtb4free/crosstalk"
      ports:
        - "8200:8200"
      environment:
        REGION: "AP"
        PASSWORD: "iamspartacus"
        GHOST: "192.92.68.11"
        AHOST: "192.92.68.10"
        ESPORT: "9200"
        BROKERLIST: "kafka:9092"
        PUBSUB: "zerospike"
        CONTROL: "8100"
      volumes:
        - ./myconfig.json:/config.json
      networks:
        - rtb_net
      depends_on:
        - kafka
        - zerospike
      command: bash -c "./wait-for-it.sh kafka:9092 -t 120 && ./wait-for-it.sh zerospike:6000 -t 120 && sleep 1; ./crosstalk"

Handy Docker Activities

List Running Containers

$docker ps

List Images

$docker image ls

Attach to a Running Container

Do a docker ps and then use the container-id or name:

$docker exec -it <id-or-name> /bin/bash

Attach to the Log of a Running Container

Do a docker ps and then use the container-id or name:

$docker logs -f <id-or-name>

Delete an Image

List the images first:

$docker image ls

Find the id of the image you want to delete, then remove it:

$docker image rm <image-id> --force

Correct Checksum Error

If docker-compose complains about a checksum after you delete a container, do this:

$docker-compose rm

Run a Container Just to Inspect It

All the containers we employ use Alpine Linux. It will always have /bin/sh. Our containers also contain /bin/bash. To inspect the container for crosstalk do this:

$docker image ls | grep crosstalk
# Find the image id, let's say it is c720081030ce
$docker run -it c720 /bin/bash
bash-4.3 #

System Overview

*** Please Note. This is a --- Maven --- Project, not Gradle or Ant ***

This system connects to MySQL to obtain campaigns, creatives and rtb_standard rules. It connects to Elastic Search to get current spends.

  1. The SQL is turned into a JSON array. The array elements are campaigns. Each campaign JSON object contains a JSON object for the target, an rtb_rules object and an array of JSON objects that represent the creatives. Each creative object has its own rtb_rules.

  2. The first job of the system is to convert these JSON objects of SQL representations of the campaigns into RTB4FREE JSON objects. Then the system stores the RTB4FREE version into Zerospike.

  3. Elastic Search is queried every minute to update the current total, daily and hourly spend against each budget.

  4. Then the bidders are told to load from Zerospike the campaigns that are active and have not exceeded their budgets.

Steps 1-4 run in a loop. If any changes occur in the campaigns, the bidders are told to delete or add the associated campaign.
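One pass of the loop ends with a decision about which campaigns the bidders should add and which they should delete. The following is a minimal sketch of that decision as a set difference. The function name `diff_campaigns` and the use of plain campaign-id strings are assumptions for illustration, not the names used in the Java source, and the sketch does not cover campaigns whose contents were edited.

```python
from __future__ import annotations


def diff_campaigns(loaded: set[str], eligible: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_add, to_delete) for one pass of the loop.

    loaded   -- ids the bidders currently have loaded (hypothetical input)
    eligible -- ids that are active and within budget after steps 1-3
    """
    to_add = eligible - loaded      # newly eligible campaigns
    to_delete = loaded - eligible   # campaigns that went offline or over budget
    return to_add, to_delete


loaded = {"c1", "c2", "c3"}
eligible = {"c2", "c3", "c4"}
to_add, to_delete = diff_campaigns(loaded, eligible)
print(sorted(to_add), sorted(to_delete))  # ['c4'] ['c1']
```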

Other activities:

  1. Bidders log their presence in Zerospike. Each presence entry is timed to delete after 1 minute if it is not refreshed, so if a bidder crashes, Crosstalk can become aware of it. If a new bidder is seen, it is automatically loaded with the results of steps 1-4.

  2. Crosstalk writes its own presence in Zerospike, which times out after 1 minute if not refreshed. This is the deadman switch. If Crosstalk crashes, the deadman switch is deleted automatically. The bidders watch for this event, and if they see it they stop bidding.

  3. Crosstalk no longer collects wins, bids, pixels, and clicks. These events are logged directly to ELK using kafka.

  4. Crosstalk provides a web API to allow other systems to interact with the system. It is located by default on :8100/api. The kinds of things you can do are:

     a. Get current budgets.
     b. Get a campaign.
     c. Get prices.
     d. Get reasons why a campaign is not bidding.
     e. Get spend rates.
     f. Add/Modify weights on campaigns with rotating creatives.
     g. List campaigns.
     h. Ping the system.
     i. Refresh the bidders with the campaigns.
     j. Set budgets.
     k. Set prices for campaigns, creatives, deals.
     l. Start bidders.
     m. Stop bidders.
     n. Update a single campaign in the bidders.
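Items 1 and 2 above both rely on entries that disappear unless they are refreshed within a minute. The sketch below models that behaviour with an in-memory store. `PresenceStore`, its methods, and the key names are hypothetical stand-ins; the real entries live in Zerospike, whose client API is not shown in this README.

```python
import time


class PresenceStore:
    """In-memory stand-in for keys that expire unless refreshed (hypothetical)."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the example can fake time
        self._deadlines = {}         # key -> time at which the entry expires

    def refresh(self, key: str) -> None:
        """Write or renew a presence entry."""
        self._deadlines[key] = self._clock() + self._ttl

    def is_alive(self, key: str) -> bool:
        """True while the entry has been refreshed within the TTL."""
        deadline = self._deadlines.get(key)
        return deadline is not None and self._clock() < deadline


# Fake clock so the timeout can be shown without waiting a minute.
now = [0.0]
store = PresenceStore(ttl_seconds=60.0, clock=lambda: now[0])
store.refresh("crosstalk")           # the deadman switch is written
now[0] = 59.0
print(store.is_alive("crosstalk"))   # True: still within one minute
now[0] = 61.0
print(store.is_alive("crosstalk"))   # False: a bidder would stop bidding here
```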

Packages

com.jacamars.dsp.crosstalk.api - The web API.
com.jacamars.dsp.crosstalk.budget - Interface to Elastic Search, used for budgeting.
com.jacamars.dsp.crosstalk.config - The configuration JSON reader.
com.jacamars.dsp.crosstalk.manager - The code that implements the crosstalk system.
com.jacamars.dsp.crosstalk.tests - Some tests.
com.jacamars.dsp.crosstalk.tools - Various tools.
com.jacamars.dsp.crosstalk.unified - The unified logger.

Overall Flow

The system's main class is in Crosstalk.java.

The Crosstalk class starts a Scanner class. The Scanner periodically reads the SQL and converts it to a JSON array. The JSON array is converted into RTB4FREE formatted JSON and is stored in AccountingCampaign.java and AccountingCreative.java. These classes hold the RTB4FREE representations of the campaigns. These 2 classes also make sure budgets are maintained. The Scanner loops through the AccountingCampaign Set and checks whether each campaign is online or offline, and whether its budgets are within limits. If a campaign is offline or over budget, its AccountingCampaign is moved to a set of parked campaigns (not deleted, because it can come back online).
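The park-instead-of-delete idea can be sketched as a simple partition. `CampaignState` and `partition` are hypothetical names used only for this illustration; the real logic lives in the Scanner and AccountingCampaign classes and carries far more state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CampaignState:
    """Hypothetical, reduced view of one AccountingCampaign."""
    campaign_id: str
    online: bool
    budget_exceeded: bool


def partition(campaigns):
    """Split campaigns into (active_ids, parked_ids).

    Parked campaigns are kept, not discarded, so they can be
    re-evaluated on the next scan and become active again.
    """
    active, parked = [], []
    for c in campaigns:
        if c.online and not c.budget_exceeded:
            active.append(c.campaign_id)
        else:
            parked.append(c.campaign_id)
    return active, parked


states = [
    CampaignState("a", online=True, budget_exceeded=False),
    CampaignState("b", online=True, budget_exceeded=True),
    CampaignState("c", online=False, budget_exceeded=False),
]
print(partition(states))  # (['a'], ['b', 'c'])
```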

Once a minute the budgets are checked: Elastic Search is queried for the spend on all campaigns and creatives. A separate set of Campaign and Creative Maps is maintained - these are used to keep track of current spends on ALL campaigns and creatives. Note that these Maps are separate from the AccountingCampaign and AccountingCreative objects.

Each AccountingCampaign first calls runUsingElk() - the spends are obtained from the BudgetController class (via these internal Campaign and Creative objects). Think of it this way: AccountingCampaign and AccountingCreative maintain budget limits from the SQL system, and BudgetController.Campaign and BudgetController.Creative maintain current spends. The AccountingCampaign queries the BudgetController for the current spend.

Then the AccountingCampaign can call budgetExceeded() to see if the budget has been exceeded for total, daily, or hourly.

Each AccountingCampaign object also calls AccountingCreative.runUsingElk() to determine up-to-date spend amounts. Then AccountingCreative.budgetExceeded() is called to see if the creative's budget has been exceeded.
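A budget check of this kind compares each limit (total, daily, hourly) against the matching spend. The following is a minimal sketch, not the Java implementation: the `Budget` and `Spend` types are hypothetical, treating a missing limit as "no limit" is an assumption, and so is treating a spend equal to the limit as exceeded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Budget:
    """Limits from the SQL side; None means 'no limit' (an assumption)."""
    total: Optional[float] = None
    daily: Optional[float] = None
    hourly: Optional[float] = None


@dataclass
class Spend:
    """Current spends as reported by the budgeting side."""
    total: float = 0.0
    daily: float = 0.0
    hourly: float = 0.0


def budget_exceeded(limit: Budget, spend: Spend) -> bool:
    """True if any configured limit has been reached (>= is an assumption)."""
    pairs = [
        (limit.total, spend.total),
        (limit.daily, spend.daily),
        (limit.hourly, spend.hourly),
    ]
    return any(cap is not None and spent >= cap for cap, spent in pairs)


limit = Budget(total=1000.0, daily=100.0, hourly=10.0)
print(budget_exceeded(limit, Spend(total=500.0, daily=40.0, hourly=3.0)))   # False
print(budget_exceeded(limit, Spend(total=500.0, daily=40.0, hourly=12.0)))  # True
```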

The BudgetController runs once a minute. It does Elastic Search queries for Total, Daily, and Hourly. The latency is also computed. This latency is the lag between now, and the last time ES saw a log record for the spend. This latency is in seconds.
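The latency described here is the gap between the current time and the timestamp of the newest spend record. A minimal sketch, assuming both values are available as timezone-aware datetimes (the function name `latency_seconds` and the clamping to zero are illustrative choices, not taken from the source):

```python
from datetime import datetime, timezone


def latency_seconds(last_log_time: datetime, now: datetime) -> float:
    """Seconds between the newest spend record and 'now'.

    Clamped at zero so a slightly fast clock on the logging side
    never yields a negative lag (an assumption of this sketch).
    """
    return max(0.0, (now - last_log_time).total_seconds())


last_seen = datetime(2019, 7, 10, 12, 0, 0, tzinfo=timezone.utc)
now = datetime(2019, 7, 10, 12, 2, 30, tzinfo=timezone.utc)
print(latency_seconds(last_seen, now))  # 150.0
```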

The spends are held in Aggregator class instances. The Aggregator has a TreeMap of Campaign objects. These Campaign objects are visible only within the budget package. Each Campaign object has a list of Creatives. The Campaign and Creative objects each contain the total, daily and hourly spend amounts, and the spend delta (a moving average). To find the current campaign or creative spend or rate, query the BudgetController instance.

Please note that the total aggregation only covers spend up to hour 0 (the start) of the current day. After the total, daily and hourly spends have all been obtained, the daily amount is added to the total to come up with the current total spend.

The lag of the Elastic Search system is known. For example, if the current lag is 2.5 minutes, then the effective total spend is equal to the current spend rate (per minute) * 2.5 + the observed total. The same calculation for effective spend works for hourly and daily too. This way, if the log falls behind, budgeting still stays fairly accurate.
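The two calculations above are simple enough to write out. This is a sketch with made-up numbers; the function names are hypothetical and only the arithmetic comes from the text.

```python
def current_total(total_before_today: float, daily: float) -> float:
    """Total aggregation (up to the start of today) plus today's spend."""
    return total_before_today + daily


def effective_spend(observed: float, rate_per_minute: float, lag_minutes: float) -> float:
    """Observed spend plus the spend assumed to have occurred during the lag."""
    return observed + rate_per_minute * lag_minutes


total = current_total(total_before_today=900.0, daily=80.0)  # 980.0
# With a spend rate of 2.0 per minute and a lag of 2.5 minutes:
print(effective_spend(total, rate_per_minute=2.0, lag_minutes=2.5))  # 985.0
```

The same `effective_spend` call applies to the daily and hourly figures by passing the observed daily or hourly spend instead of the total.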

Build Crosstalk

$mvn assembly:assembly -DdescriptorId=jar-with-dependencies -Dmaven.test.skip=true

The all-in-one jar file is in ./target

Javadoc

To make just the javadoc:

$mvn javadoc:javadoc

Javadoc is located in ./target/site/apidocs/index.html

Configuration Files

There are 5 configuration files: 3 Elastic Search queries, log4j.properties and config.json.

The application configuration file is: ./config.json

The logging configuration file is: ./log4j.properties

The Elastic Search queries are stored in the queries directory as:

./queries/daily.json
./queries/total.json
./queries/lastlog.json

Running Crosstalk on the Local Machine

You can run crosstalk in development mode if you do the following.

  1. First you need a bidder running on the same localhost as your Crosstalk development. You can open a console window and use the following command to start a production zookeeper, kafka, zerospike and bidder on your local system:

     $docker-compose -f bidder.yml up
    
  2. Now you can start Crosstalk in a console window or in your IDE's debugger. Note that Crosstalk uses ./config.json as its default config file. To start Crosstalk with a different configuration file, pass the file name as its argument. The region in config.json is set to NJ.

    To run in the console:

     $tools/crosstalk [config-file-name]
    
  3. Configuration file and shell variables. The config file can access shell environment variables. The environment variables listed below come with defaults. If you use one of these and don't provide a value for the shell variable, Crosstalk uses the indicated default.

    If you use your own shell environment variables and don't provide a value, then "" will be used.

    Crosstalk specific:

    $CONTROL "8100"
    $JDBC "jdbc:mysql://localhost/rtb4free?user=ben&password=test"

    Shared with the bidder:

    $ADMINPORT "8155"
    $BROKERLIST "[localhost:9092]"
    $CONCURRENCY "3"
    $EXTERNAL "http://localhost:8080"
    $FREQGOV "true"
    $HOSTNAME Docker instance
    $INITPORT "6002"
    $PIXEL "localhost"
    $PUBPORT "6000"
    $PUBSUB "localhost"
    $REQUESTSTRATEGY "100"
    $SUBPORT "6001"
    $THREADS "2000"
    $VIDEO "localhost"
    $WIN "localhost"

    Note, all substitutions are strings.
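The substitution rules in item 3 can be sketched as follows. This is an illustration only: it assumes variables appear in the config text as `$NAME`, the `substitute` function is hypothetical, and the defaults table holds just three of the values listed above. How Crosstalk treats a variable that is set but empty is not stated here, so the sketch only distinguishes set from unset.

```python
import re
from typing import Mapping, Optional

# A few of the documented defaults; the full list is in the text above.
DEFAULTS = {"CONTROL": "8100", "PUBSUB": "localhost", "PUBPORT": "6000"}

_VAR = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute(text: str, env: Mapping[str, str],
               defaults: Optional[Mapping[str, str]] = None) -> str:
    """Replace $NAME with env[NAME], else its default, else ""."""
    table = DEFAULTS if defaults is None else defaults

    def repl(match):
        name = match.group(1)
        if name in env:
            return env[name]          # value provided by the shell
        return table.get(name, "")    # documented default, or empty string

    return _VAR.sub(repl, text)


config_text = '{"port": "$CONTROL", "pubsub": "$PUBSUB", "note": "$MYVAR"}'
print(substitute(config_text, {"PUBSUB": "zerospike"}))
# {"port": "8100", "pubsub": "zerospike", "note": ""}
```

Every substituted value is a string, which matches the note above; in real use `env` would typically be `os.environ`.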

Ping

http://ip-address-of-crosstalk:8100/status
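A liveness check against this URL might look like the sketch below. It uses only the Python standard library; the helper names `status_url` and `ping` are hypothetical, and because the format of the response body is not documented here, the sketch only checks for an HTTP 200 status.

```python
from urllib.error import URLError
from urllib.request import urlopen


def status_url(host: str, port: int = 8100) -> str:
    """Build the status URL; 8100 is the default $CONTROL port."""
    return f"http://{host}:{port}/status"


def ping(host: str, port: int = 8100, timeout: float = 2.0) -> bool:
    """Return True if the status endpoint answers with HTTP 200."""
    try:
        with urlopen(status_url(host, port), timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx reply.
        return False


print(status_url("10.0.0.5"))  # http://10.0.0.5:8100/status
```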

API

http://ip-address-of-crosstalk:8100/api

POST the appropriate command to this URL.

Other Directories

src - The Java source code.
config - Up-to-date configurations used in all the regions.
docs - Any extraneous documentation.
libs - External libraries not maintained with Maven.
logs - Where the log files go (not the application log, which is in /var/log/crosstalk.log).
python - Python library for interfacing with Crosstalk's API.
queries - The template queries used with Elastic Search.
shell - Shell script/curl stuff to test and demo the web API.
target - Where the build artifacts are stored.
tools - Where the console tools are located.
