
Description:

UBlog is a performance evaluation toolkit that mimics the usage of the Twitter social network. The workload can be used with both key-value stores and relational databases. Thanks to its modular design, anyone is welcome to write an implementation targeting another system and plug it into the framework.

Our workload definition has been shaped by the results of recent studies on Twitter. We consider only the seven most used operations of the Twitter API (Search and REST APIs as of March 2010):

  • statuses_user_timeline
  • statuses_friends_timeline
  • statuses_mentions
  • search_contains_hashtag
  • statuses_update
  • friendships_create
  • friendships_destroy

About:

The current version has implementations for Cassandra, Voldemort, and MySQL.

Requirements

Almost all dependencies are automatically fetched by Maven. The others must be placed in a folder named lib, depending on the target implementation:

Cassandra

  • uuid-3.1.jar
  • apache-cassandra-0.6.1.jar
  • libthrift-r917130.jar
  • hector-0.6.0-11.jar

Voldemort

  • voldemort-0.70.1.jar

Build

After downloading the source, change into its root directory and run:

  1. $ cd Bench
  2. $ mvn install
  3. $ cd ..
  4. $ cd GraphServer
  5. $ mvn assembly:assembly
  6. $ cd ..
  7. $ cd <TargetImplementation>   # one of Cassandra, MYSQL or Voldemort
  8. $ mvn assembly:assembly

This creates a tar.gz with the GraphServer and the target implementation.

Configuration

The tar.gz of each target implementation contains the required libraries, a Windows script, a UNIX script, and a folder named conf with the configuration files. The conf folder contains:

  • log4j.properties : To configure the proper logger level.

  • policy : To define the access policy to the RMI server.

  • ublog.properties : To configure several parameters of the benchmark:

    • benchmark.social.initialTweetsFactor : An initial tweet factor of n means that a user with f followers will have n × f initial tweets.
    • benchmark.social.maximumMessagesTimeline : The maximum number of tweets in the timeline of a user.
    • benchmark.social.seedNextOperation, benchmark.social.seedOwner, benchmark.social.seedTopic, benchmark.social.seedStartFollow : The random seeds used in the benchmark. If not defined, System.nanoTime() is used.
    • benchmark.social.probabilities.probabilitySearchPerTopic, benchmark.social.probabilities.probabilitySearchPerOwner, benchmark.social.probabilities.probabilityGetRecentTweets, benchmark.social.probabilities.probabilityGetFriendsTimeline, benchmark.social.probabilities.probabilityStartFollowing, benchmark.social.probabilities.probabilityStopFollowing : The probability that each type of operation occurs. The sum must be less than or equal to one. The remaining probability is assigned to the new tweet operation.
    • benchmark.server.name, benchmark.server.port : The RMI server name and port.
  • A properties file to configure the data connection to the target implementation.
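The probability and seed rules above can be sketched in Java as follows. This is only an illustration of the rules as described, not the toolkit's own code: the class name OperationMix, the "newTweet" label, and the choice to treat a missing probability key as 0 are all assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Random;

/** Illustrative sketch: reads the operation probabilities and picks operations. */
class OperationMix {
    static final String PREFIX = "benchmark.social.probabilities.";
    static final String[] KEYS = {
        "probabilitySearchPerTopic", "probabilitySearchPerOwner",
        "probabilityGetRecentTweets", "probabilityGetFriendsTimeline",
        "probabilityStartFollowing", "probabilityStopFollowing"
    };
    // Hypothetical label for the leftover probability (the new tweet operation).
    static final String NEW_TWEET = "newTweet";

    private final Map<String, Double> mix = new LinkedHashMap<>();

    OperationMix(Properties props) {
        double sum = 0.0;
        for (String key : KEYS) {
            // Assumption: a missing key counts as probability 0.
            double p = Double.parseDouble(props.getProperty(PREFIX + key, "0"));
            if (p < 0.0) {
                throw new IllegalArgumentException(key + " is negative");
            }
            mix.put(key, p);
            sum += p;
        }
        if (sum > 1.0 + 1e-9) {
            throw new IllegalArgumentException("probabilities sum to " + sum + ", which exceeds 1");
        }
        // Whatever is left over goes to the new tweet operation.
        mix.put(NEW_TWEET, Math.max(0.0, 1.0 - sum));
    }

    double probabilityOf(String operation) {
        return mix.get(operation);
    }

    /** Maps a uniform draw in [0, 1) onto the cumulative distribution. */
    String pick(double draw) {
        double cumulative = 0.0;
        for (Map.Entry<String, Double> entry : mix.entrySet()) {
            cumulative += entry.getValue();
            if (draw < cumulative) {
                return entry.getKey();
            }
        }
        return NEW_TWEET; // guards against floating-point rounding
    }

    /** Returns the configured seed, or System.nanoTime() when the key is absent. */
    static long seed(Properties props, String key) {
        String value = props.getProperty(key);
        return value == null ? System.nanoTime() : Long.parseLong(value.trim());
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(
            PREFIX + "probabilitySearchPerTopic=0.15\n"
            + PREFIX + "probabilityGetFriendsTimeline=0.50\n"
            + "benchmark.social.seedNextOperation=42\n"));
        OperationMix mix = new OperationMix(props);
        Random random = new Random(seed(props, "benchmark.social.seedNextOperation"));
        for (int i = 0; i < 5; i++) {
            System.out.println(mix.pick(random.nextDouble()));
        }
    }
}
```

With the sample values in main, the six configured probabilities sum to 0.65, so the new tweet operation receives the remaining 0.35. Fixing the seed makes the sequence of picked operations repeatable across runs.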

Cassandra

The cassandra.properties file configures the following parameters:

  • node : The list of nodes in the form hostname:port.
  • maxActiveConnections : The maximum number of active connections per node.
  • partitioner : The partitioner to use, random or ordered.

The conf folder also contains an example storage-conf.xml configured to run this workload on Cassandra.
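As an illustration of the hostname:port form used by the node property, the following Java sketch parses such a list. The comma separator, the class name NodeList, and the host names are assumptions; this README does not state how entries are separated, so check the sample cassandra.properties shipped in conf.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch: parses a node list such as "node1:9160,node2:9160". */
class NodeList {
    static List<InetSocketAddress> parse(String value) {
        List<InetSocketAddress> nodes = new ArrayList<>();
        // Assumption: entries are separated by commas.
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("expected hostname:port, got '" + trimmed + "'");
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            // Unresolved addresses avoid a DNS lookup while parsing.
            nodes.add(InetSocketAddress.createUnresolved(host, port));
        }
        return nodes;
    }

    public static void main(String[] args) {
        for (InetSocketAddress node : parse("node1:9160, node2:9160")) {
            System.out.println(node.getHostString() + " on port " + node.getPort());
        }
    }
}
```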

MySQL

The mysql.properties file configures the following parameters:

  • host.name, host.port : The host name and port of the MySQL server.
  • dbName : The database name.
  • userName : The user name.
  • password : The password.
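To show how these properties relate to a connection, here is a minimal Java sketch that assembles a standard MySQL JDBC URL from them. The class name MySqlUrl, the sample values, and the fallback to port 3306 (MySQL's default) are assumptions, and the toolkit itself may build its connection differently.

```java
import java.util.Properties;

/** Illustrative sketch: builds a JDBC URL from the mysql.properties keys. */
class MySqlUrl {
    static String jdbcUrl(Properties props) {
        String host = required(props, "host.name");
        // Assumption: fall back to MySQL's default port when host.port is absent.
        String port = props.getProperty("host.port", "3306");
        String database = required(props, "dbName");
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    private static String required(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("host.name", "db.example.org"); // hypothetical host
        props.setProperty("host.port", "3306");
        props.setProperty("dbName", "ublog");             // hypothetical database name
        System.out.println(jdbcUrl(props));
    }
}
```

With a MySQL JDBC driver on the classpath, the resulting URL can be passed to java.sql.DriverManager.getConnection(url, userName, password) together with the userName and password values.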

Voldemort

The voldemort.properties file configures the following parameters:

  • node : The list of nodes in the form hostname:port.

The conf folder also contains an example stores.xml configured to run this workload.

Usage

  1. GraphServer

$ ./run.sh hostname

  2. Benchmark

$ ./run.sh sizeTotal size usernameStarter nOperations thinkTime

Where:

  • size: The number of concurrent clients.
  • usernameStarter: The id of the first user to be emulated.
  • nOperations: The total number of operations to be generated by the workload.
  • thinkTime: The time between operations for each client.
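The five positional arguments can be pictured with the following Java sketch. It is not the toolkit's entry point: the class name BenchmarkArgs and the numeric types are assumptions, sizeTotal is not described in this README, and the unit of thinkTime is not stated.

```java
/** Illustrative sketch: holds the five positional arguments passed to run.sh. */
class BenchmarkArgs {
    final int sizeTotal;       // not described in this README; parsed as an integer by assumption
    final int size;            // number of concurrent clients
    final int usernameStarter; // id of the first user to be emulated
    final long nOperations;    // total number of operations to generate
    final long thinkTime;      // time between operations for each client (unit not stated)

    BenchmarkArgs(String[] args) {
        if (args.length != 5) {
            throw new IllegalArgumentException(
                "usage: sizeTotal size usernameStarter nOperations thinkTime");
        }
        sizeTotal = Integer.parseInt(args[0]);
        size = Integer.parseInt(args[1]);
        usernameStarter = Integer.parseInt(args[2]);
        nOperations = Long.parseLong(args[3]);
        thinkTime = Long.parseLong(args[4]);
        if (size <= 0 || nOperations < 0 || thinkTime < 0) {
            throw new IllegalArgumentException("size must be positive; counts and times non-negative");
        }
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        BenchmarkArgs parsed = new BenchmarkArgs(new String[] {"1000", "10", "0", "5000", "100"});
        System.out.println(parsed.size + " clients, " + parsed.nOperations + " operations");
    }
}
```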

Feedback

Updated source and an issue tracker are available at:

https://github.com/rmpvilaca/UBlog-Benchmark

Your feedback is welcome.

Contact

Ricardo Vilaça (rmvilaca@di.uminho.pt)

Francisco Cruz (fmcruz@di.uminho.pt)