Gem5 NoSQL (CE Group - UC)
The gem5 simulator, adapted to run the NoSQL YCSB/Cassandra workload
Version:
- Initial commit from official gem5: a09d5f86ae9653dd787bb9f80acfcb167c6dcbab
- Date: Sat Oct 15 15:11:07 2016 -0500
Apache Cassandra official site
Dependencies.
- Things you'll need that aren't part of gem5 itself.
Hardware
gem5 is largely agnostic about the hardware it runs on. However, there are several considerations to keep in mind when running gem5:
- A 64-bit platform is strongly preferred over a 32-bit platform.
- gem5's ISA support involves some very large auto-generated C++ files, which can require up to 1 GB of memory for g++ to compile.
- Ideally you should choose a host with the same endianness as the ISA you will be simulating.
Operating System
gem5 runs best on Linux and Unix. Most developers, and the current regression system, use Linux, so this platform has the best support. At the CE Group, we recommend Debian 8 (jessie) or newer.
System prerequisites
- Things that need to be ready on your host.
KVM support
gem5 can run much faster when it uses the host's virtual machine support. This can be used to boot the system or to fast-forward to a given state quickly. You don't need anything special to enable it in gem5 (but /dev/kvm must be enabled at compile time). What you need is KVM (the qemu-kvm package) installed on the host.
(root)$ apt-get install qemu-kvm
Also, the administrators should give users access to /dev/kvm by adding them to the kvm group. The -a option appends the group; without it, usermod replaces the user's existing supplementary groups.
(root)$ usermod -a -G kvm <username>
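Before building, it can help to confirm that the current user can actually open the KVM device. The following is a minimal sketch and not part of gem5-NoSQL: the helper name kvm_usable is hypothetical, and it assumes a Python 3 interpreter on the host (separate from the Python 2 that gem5 itself requires).

```python
import os


def kvm_usable(device="/dev/kvm"):
    """Return True if `device` exists and the current user can read and write it."""
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if kvm_usable():
        print("/dev/kvm is accessible")
    else:
        print("/dev/kvm is missing or not accessible; check qemu-kvm and the kvm group")
```

If the check fails right after running usermod, log out and back in so that the new group membership takes effect.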
External tools and required versions
- g++, version 4.8 or newer, or clang, version 3.1 or newer.
- Python, version 2.6 - 2.7 (Python 3.x is not supported).
- SCons, version 0.98.1 or newer.
- zlib, any recent version. For Debian/Ubuntu, you will need the "zlib-dev" or "zlib1g-dev" package to get the zlib.h header file as well as the library itself.
- m4, the macro processor.
Additional software
- sudo. Additionally, as root, you need to configure it to avoid repeated password requests.
(root)$ vi /etc/sudoers
# ...
<your username> ALL=(ALL:ALL) NOPASSWD: ALL
- debootstrap, version 1.0.67 or newer.
- git, version 2.1.4 or newer.
- gzip, version 1.6-4 or newer.
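The tool lists above can be checked quickly before starting a build. This is a rough sketch under stated assumptions: the helper missing_tools and the REQUIRED_TOOLS tuple are hypothetical, the executable names may differ on your distribution, and the check only looks for the tools on PATH; it does not verify versions or the zlib headers. It assumes Python 3 on the host.

```python
import shutil

# Executable names are assumptions; adjust them to match your distribution.
REQUIRED_TOOLS = ("g++", "python", "scons", "m4", "sudo", "debootstrap", "git", "gzip")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH")
```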
Get gem5-NoSQL.
Download our modified version of gem5 from GitHub.
$ git clone https://github.com/abadp/gem5-NoSQL.git
Directories.
- Description of the main directory hierarchy
- atc_scripts: disk image generation scripts for running YCSB/Cassandra workloads
- configs: example simulation configuration scripts
- ext: less-common external packages needed to build gem5
- src: source code of the gem5 simulator
- system: source for some optional system software for simulated systems
- tests: regression tests
- util: useful utility programs and files
- images: kernel file (.gz) and disk images
- nosql: Cassandra and YCSB source (adapted)
How to build or modify the gem5 system.
$ cd gem5-NoSQL
$ scons build/X86/gem5.opt
Setting up the disk image and benchmark applications for full system simulation
$ cd gem5-NoSQL
$ cd atc_scripts
$ vi config.py
# the line below must be changed to your gem5 absolute path
gem5_dir = "/absolute/path/to/your/GEM5"
$ ./create_disk_img.py
$ ./update_disk_img.py
Running these commands produces two files: a base disk image of Debian jessie (x86_debian-jessie.img) and a second image with everything needed to run the YCSB/Cassandra workload on gem5 (x86_debian_MULTIYCSB-cassandra).
Note that the kernel file is delivered in the images/kernels directory. Because of its size, the file is shipped compressed, so you need to uncompress it before using it with gem5.
$ cd images/kernels
$ gzip -d x86_64-vmlinux-3.18.34_ceconfig.smp.gz
- How to run YCSB/cassandra workload on gem5's build system
Launch scripts
In the atc_scripts/launch_apps/NoSQL/cassandra directory there are two sample files (script, script-run); both are needed to run gem5 and YCSB/Cassandra properly. Review them and adjust them as needed, following this form:
./launch_app.sh $num_nodes $app $DB_size $num_threads
./launch_app-run.sh $num_nodes $app $DB_size $num_threads $app_size
Where:
- $num_nodes is the number of nodes simulated by gem5.
- $app is the identifier of a YCSB workload. It can be from "a" to "f". See YCSB Core Workloads for details.
- $DB_size is the number of records to load into the Cassandra database initially (default: 0). We recommend using $DB_size=950000 to create a database of 1 GB.
- $num_threads is the number of YCSB client threads. By default, the YCSB client uses a single worker thread, but additional threads can be specified. This is often done to increase the load offered to the database. We recommend using $num_threads=1 in any case.
- $app_size is the number of operations to be performed by the YCSB client (for each thread). Typically you will use it to control the amount of offered load.
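To make the parameter order concrete, the two launch lines can be generated from the values described above. This is only an illustrative sketch: the helper launch_lines is hypothetical and not shipped with gem5-NoSQL, and the validation rules (positive node and thread counts, non-negative sizes) are assumptions rather than checks performed by the real scripts.

```python
# YCSB core workloads "a" to "f", as listed above.
VALID_WORKLOADS = ("a", "b", "c", "d", "e", "f")


def launch_lines(num_nodes, app, db_size, num_threads, app_size):
    """Return the (load, run) lines for the script and script-run files."""
    if app not in VALID_WORKLOADS:
        raise ValueError("app must be one of: " + ", ".join(VALID_WORKLOADS))
    if num_nodes < 1 or num_threads < 1:
        raise ValueError("num_nodes and num_threads must be at least 1")
    if db_size < 0 or app_size < 0:
        raise ValueError("db_size and app_size must not be negative")
    load = "./launch_app.sh {} {} {} {}".format(num_nodes, app, db_size, num_threads)
    run = "./launch_app-run.sh {} {} {} {} {}".format(
        num_nodes, app, db_size, num_threads, app_size
    )
    return load, run


if __name__ == "__main__":
    # 4 nodes and 100000 operations are arbitrary example values.
    for line in launch_lines(4, "a", 950000, 1, 100000):
        print(line)
```

With the example values, the first line goes into script and the second into script-run.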
Load the cassandra DB
$ build/X86/gem5.opt configs/ac/fs_ac.py \
    --kernel=</absolute/path/to/your/GEM5>/images/kernels/x86_64-vmlinux-3.18.34_ceconfig.smp \
    --disk-image=</absolute/path/to/your/GEM5>/images/disks/x86_debian_MULTIYCSB-cassandra.img \
    --cpu-type=kvm \
    --cluster=<number of nodes> \
    --num-cpus=<number of cores per node> \
    --mem-size=<main memory per simulated node>MB \
    --sim_quantum=50000000 \
    --ethernet=switch \
    --script=</absolute/path/to/your/GEM5>/atc_scripts/launch_apps/NoSQL/cassandra/script \
    --checkpoint-at-end
You can use m5term (util/term/) to connect to the console of each simulated node and watch the running processes.
Run the simulation
$ build/X86/gem5.opt configs/ac/fs_ac.py \
    --kernel=</absolute/path/to/your/GEM5>/images/kernels/x86_64-vmlinux-3.18.34_ceconfig.smp \
    --disk-image=</absolute/path/to/your/GEM5>/images/disks/x86_debian_MULTIYCSB-cassandra.img \
    --cpu-type=atomic \
    --restore-with-cpu=atomic \
    --cluster=<number of nodes> \
    --num-cpus=<number of cores per node> \
    --mem-size=<main memory per simulated node>MB \
    --ethernet=switch \
    --script=</absolute/path/to/your/GEM5>/atc_scripts/launch_apps/NoSQL/cassandra/script-run \
    -r 1
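Since the load and run commands share most of their options, they can be assembled from one place to avoid copy-and-paste mistakes. The sketch below is an assumption-laden illustration, not part of gem5-NoSQL: the helper gem5_command and its parameter names are hypothetical, it uses only the options and paths shown in the two commands above, and it emits them in a different order, which is assumed not to matter to the option parser. It assumes Python 3 on the host.

```python
import os


def gem5_command(gem5_dir, phase, num_nodes, num_cpus, mem_mb):
    """Return the argument list for the "load" or "run" phase."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    kernel = os.path.join(
        gem5_dir, "images", "kernels", "x86_64-vmlinux-3.18.34_ceconfig.smp"
    )
    disk = os.path.join(
        gem5_dir, "images", "disks", "x86_debian_MULTIYCSB-cassandra.img"
    )
    script_dir = os.path.join(
        gem5_dir, "atc_scripts", "launch_apps", "NoSQL", "cassandra"
    )
    # Options common to both phases.
    common = [
        "build/X86/gem5.opt",
        "configs/ac/fs_ac.py",
        "--kernel=" + kernel,
        "--disk-image=" + disk,
        "--cluster={}".format(num_nodes),
        "--num-cpus={}".format(num_cpus),
        "--mem-size={}MB".format(mem_mb),
        "--ethernet=switch",
    ]
    if phase == "load":
        # Load the database under KVM and checkpoint when the script ends.
        return common + [
            "--cpu-type=kvm",
            "--sim_quantum=50000000",
            "--script=" + os.path.join(script_dir, "script"),
            "--checkpoint-at-end",
        ]
    # Restore checkpoint 1 and run with the atomic CPU model.
    return common + [
        "--cpu-type=atomic",
        "--restore-with-cpu=atomic",
        "--script=" + os.path.join(script_dir, "script-run"),
        "-r",
        "1",
    ]
```

The returned list can be passed to subprocess.call from the gem5-NoSQL root directory, for example subprocess.call(gem5_command("/absolute/path/to/your/GEM5", "load", 4, 2, 2048)), where the node, core, and memory values are arbitrary examples.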
Enjoy!