Skip to content

A "Hello World" for the Power8 CAPI Interface. Includes application C and RTL code.

License

Notifications You must be signed in to change notification settings

sbates130272/capi-textswap

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

# Text Swap AFU

This package contains RTL and C test code for an AFU accelerated text searching and
replacing. It works using IBM's Coherent Accelerator Processor Interface.

To get started, if this is an initial clone of the repositor, you should
initialize the submodules:

git submodule update --init

After that is done, configure the project:

./waf configure [--hardware]

This will check for the required dependancies. If it doesn't succeed, please
find the missing components and re-run until it does. Note at the
current time a common issue during the configure phase is a missing
cxl.h which should be obtained from a working Power8 based system and
placed somewhere on the build path.

Once the configuration passes, you may build the C code with:

./waf

Note that you may see an issue when you do this on a machine which
does not have a capi enabled kernel. In that case you will have to
copy the offending files from a suitable machine to the current
machine. If you have no access to said files you should contact
IBM. Note that certain files (e.g. cxl.h) are now pulled by the waf
script using the internet so you will need network connectivity for
that to work. 

Note that currently hardware based builds will not complete
sucessfully on non-PPC architectures.

## Running Simulations

To run a simulation, use the ./sim script. Currently only Cadence's ncsim tools
are supported. You may select a binary to run with the -e option (it defaults to
'textswap') and you may pass additional arguments to this binary by passing them
after a '--'. For example, to simulate unittest with random data run:

./sim -e unittest -- -R

There is also a regression script

./run_tests

## Running on Hardware

To run on actual powerpc hardware you must configure the code specially:

./waf configure --hardware

Then you may rebuild the code and run the binaries normally:

./waf
build/unittest -R

The regression script also works on hardware (when available)

./run_tests

## FPGA Build and Bitfiles

To build the FPGA rbf file to run against the C code you will need to
work with IBM to enable you with the PSL files. This files should then
be placed or symlinked in a folder called libs/psl_fpga and then run

make

in the fpga folder (of course you will need Quartus tools installed on
your system). A pre-built bitfile that works with this codebase is
available on the releases page of this GitHub project at

https://github.com/sbates130272/capi-textswap/releases

To download the rbf file to the FPGA on the CAPI card you will need to
tools provided by IBM and/or Nallatech. Support for other cards is an
open issue for this project.

## Generate Datasets

You can generate datasets using

./build/gen_haystack -s <SIZE> [OUTPUT_FILE]

and you can run

./build/gen_haystack -h

to see the defaults. For example

./build/gen_haystack -s 8G /mnt/nvme/demo.GoPower8.50.8G.dat

Will generate a 8GiB dataset with the phase "GoPower8" inserted in 50
random locations in a file located at /mnt/nvme/demo.GoPower8.50.8G.dat.
This file can be on any block IO device such as a HDD or SSD.

## Performance Testing

You can test how quickly you can read the datasets and count the
occurances of the needle string using something like

./build/textswap -R -E 50 /mnt/nvme/demo.GoPower8.50.8G.dat -r 14 -c 8M -q 22

which should produce an output like

Transfer rate:
    8.00GiB in 2.9    s     2.72GiB/s
    Matches Found: 50 (Good)

Note this if all check pass the program will return with exit code 0
(to faciliate scripting). A failing test will exist with a non-zero
exit code. You should play with the -r, -c and -q options to tune the
performance. Run

./build/textswap -h

for a complete list of command line options and defaults. Note you may
need to run the above command as root or set your udev rules to permit
access to the CAPI device.

A word of warning, be sure to flush your caches before and between
performance test runs or you will see exceptional performance due to
hitting the page cache. We normally install a simple script in
/usr/local/sbin called drop_caches which consists of

#!/bin/sh

sync
echo 3 > /proc/sys/vm/drop_caches

## CPU Utilization

A simple script called cpuperf.py lives in the scripts folder and it
can be run either in an adjacent shell or as a background task (using
something like nohup). By running

./scripts/cpuperf.py -C textswap -w 100 -s | tee cpuperf.log

the script will capture CPU and memory utilization using a system call
to ps once every 100ms when textswap is running. Omit the -s to
capture at all times. Refer to the ps manual page for specifics on
exactly what is being measured.

## Updates

This code is open-source, we welcome patches and pull requests against
this codebase. You are under no obligation to submit code back to us
but we are hoping you will ;-).