ashrithr/game_data_gen

Game DataSet

Generates mock data resembling the data produced by a gaming website.

DataSet Format:

| column_header | description |
| --- | --- |
| cid | customer ID |
| gender | customer's gender |
| age | age of the customer |
| country | country to which the customer belongs |
| register_date | date on which the user registered |
| friend_count | number of friends the user has |
| lifetime | number of days the user has been active |
| citygame_played | number of times the user has played the city game |
| pictionarygame_played | number of times the user has played the pictionary game |
| scramblegame_played | number of times the user has played the scramble game |
| snipergame_played | number of times the user has played the sniper game |
| revenue | revenue generated by the user |
| paid_subscriber | whether the customer is a paying customer, represented by yes or no |
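
To make the column layout concrete, here is a minimal sketch of reading one generated record into a hash keyed by column name. It rests on assumptions the README does not confirm: that fields appear in the order listed above, and that they are tab-separated. `parse_record` is a hypothetical helper, and the sample line is made up for illustration.

```ruby
# Column names in the order the format table lists them
# (assumed, not confirmed, to be the order used in the output file).
COLUMNS = %w[
  cid gender age country register_date friend_count lifetime
  citygame_played pictionarygame_played scramblegame_played
  snipergame_played revenue paid_subscriber
].freeze

# Hypothetical helper: split one line on an assumed delimiter and pair
# each value with its column name. All values stay strings.
def parse_record(line, delimiter = "\t")
  values = line.chomp.split(delimiter, -1)
  unless values.size == COLUMNS.size
    raise ArgumentError, "expected #{COLUMNS.size} fields, got #{values.size}"
  end
  Hash[COLUMNS.zip(values)]
end

# Made-up sample line, for illustration only.
sample = %w[1 male 25 US 2012-01-01 10 30 1 2 3 4 9.99 yes].join("\t")
record = parse_record(sample)
puts record["country"]          # prints "US"
puts record["paid_subscriber"]  # prints "yes"
```

If the real files use a different delimiter, pass it as the second argument, for example `parse_record(line, ",")`.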

If the --extra-data option is enabled, the following additional columns are populated as well:

| extra_columns | description |
| --- | --- |
| name | name of the user |
| email | email address used to register the account |
| phone | contact number of the user |
| address | address provided by the user during registration |

Prerequisites

This generator requires Ruby version ≥ 1.9. To install Ruby 1.9.3 using rvm, follow these instructions.

Generating the dataset:

This generator takes in various options for generating data:

Usage: generator.rb [options]
    -l, --lines LINES                number of lines to generate
    -c LinesPerProcess,              number of lines to generate per process (default: 50,000)
        --lines-per-process
    -m, --multiple-tables            generates data in multi-table format
    -p, --output-path PATH           directory path where output should be written to
    -e, --extra-data                 generates additional user information
    -h, --help
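
For readers curious how a usage text like this maps onto code, the following is a hedged sketch using Ruby's standard `optparse` library. It is not the generator's actual source: `parse_options` and the keys of the options hash are hypothetical names, and only the flags and the 50,000 default are taken from the usage text above.

```ruby
require 'optparse'

# Hypothetical parser mirroring the usage text; not generator.rb itself.
def parse_options(argv)
  options = {
    :lines_per_process => 50_000, # default stated in the usage text
    :multiple_tables   => false,
    :extra_data        => false
  }
  parser = OptionParser.new do |opts|
    opts.banner = "Usage: generator.rb [options]"
    opts.on("-l", "--lines LINES", Integer,
            "number of lines to generate") { |v| options[:lines] = v }
    opts.on("-c", "--lines-per-process LinesPerProcess", Integer,
            "number of lines to generate per process") { |v| options[:lines_per_process] = v }
    opts.on("-m", "--multiple-tables",
            "generates data in multi-table format") { options[:multiple_tables] = true }
    opts.on("-p", "--output-path PATH",
            "directory path where output should be written to") { |v| options[:output_path] = v }
    opts.on("-e", "--extra-data",
            "generates additional user information") { options[:extra_data] = true }
    opts.on("-h", "--help") { puts opts; exit }
  end
  parser.parse!(argv.dup) # work on a copy so the caller's array is untouched
  options
end

parsed = parse_options(%w[--lines 100000 --output-path /tmp --extra-data])
puts parsed[:lines]       # prints 100000
puts parsed[:output_path] # prints /tmp
```

Passing `Integer` to `opts.on` makes `optparse` reject non-numeric values for `--lines` and `--lines-per-process` instead of silently accepting them.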

### Generating data in single table mode

This mode mocks random user interaction data into a single file, which can be loaded into a single table.

The following command generates 100,000 lines into file(s) named analytics_[process_id].data in /tmp, the directory given by --output-path, and also mocks extra user information (name, email, phone, address):

ruby generator.rb --lines 100000 --output-path /tmp --extra-data
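
Because the file names embed a process ID that is not known in advance, a glob is a convenient way to find the output afterwards. The sketch below assumes only the analytics_[process_id].data naming described above; `generated_files` is a hypothetical helper, and the demo runs against a throwaway directory with made-up process IDs rather than real generator output.

```ruby
require 'tmpdir'

# Hypothetical helper: list single-table output files under output_path,
# matching the analytics_[process_id].data naming pattern.
def generated_files(output_path)
  Dir.glob(File.join(output_path, "analytics_*.data")).sort
end

# Demonstrate against a temporary directory holding placeholder files.
found = nil
Dir.mktmpdir do |dir|
  %w[analytics_101.data analytics_102.data notes.txt].each do |name|
    File.open(File.join(dir, name), "w") { |f| f.puts "placeholder" }
  end
  found = generated_files(dir).map { |path| File.basename(path) }
end
puts found # prints analytics_101.data and analytics_102.data, one per line
```

After running the command above, `generated_files("/tmp")` would return the matching files in /tmp.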

### Generating data in multi table mode

This mode mocks random user interaction data into multiple files:

  • analytics_customer[process_id] stores user information (cid, name, gender, age, register_date, country, total_days)
  • analytics_facts[process_id] stores user-game facts (cid, game_played, game_played_time)
  • analytics_revenue[process_id] stores paying users (cid, payed_date, revenue)

ruby generator.rb --lines 100000 --output-path /tmp --extra-data --multiple-tables

### Generating data in multi-process mode

To generate data using multiple processes, use the --lines-per-process option, which specifies the number of lines each process generates. For example, the following command generates 75,000 lines with 25,000 lines per process:

ruby generator.rb --lines 75000 --output-path /tmp --extra-data --multiple-tables --lines-per-process 25000
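
The arithmetic behind this split can be sketched as follows. This is an illustration, not the generator's code: `lines_per_worker` is a hypothetical helper, and giving any remainder to one extra process is an assumption, since the README does not say how a total that is not a multiple of --lines-per-process is handled.

```ruby
# Hypothetical helper: the number of lines each process would write.
# Assumes a final, smaller chunk when the total does not divide evenly.
def lines_per_worker(total_lines, lines_per_process)
  raise ArgumentError, "lines_per_process must be positive" unless lines_per_process > 0
  full, remainder = total_lines.divmod(lines_per_process)
  chunks = Array.new(full, lines_per_process)
  chunks << remainder if remainder > 0
  chunks
end

p lines_per_worker(75_000, 25_000) # prints [25000, 25000, 25000]
```

For the command above this gives three chunks of 25,000 lines, which corresponds to three processes and therefore three distinct [process_id] values in the output file names.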

Note: [process_id] represents the ID of the invoked process.
