AIFM stands for Application-Integrated Far Memory. It provides a simple, general, and high-performance mechanism for adapting ordinary memory-intensive applications to far memory. Unlike existing paging-based systems, AIFM exposes far memory as far-memory pointers and containers at the language level. AIFM's API allows its runtime to accurately capture application semantics, so it can make intelligent decisions on data placement and movement.
Currently, AIFM supports C++ and TCP-enabled remote server memory.
- AIFM: High-Performance, Application-Integrated Far Memory
Zhenyuan Ruan, Malte Schwarzkopf, Marcos Aguilera, Adam Belay
The 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI '20)
We strongly recommend running AIFM on the xl170 instance of Cloudlab, as the code has been thoroughly tested there. We have not tested it in any other hardware environment. If you have trouble applying for a Cloudlab account, please contact us for assistance.
- Apply for a Cloudlab account if you do not have one.
- Log in to the Cloudlab console. Click Experiments |--> Create Experiment Profile, then upload the cloudlab.profile provided in the repo root.
- Create a two-node instance using the profile.
Once you have logged in to your Cloudlab instances, install the dependencies needed to build AIFM. Note that you have to run these steps on every Cloudlab node you have created.
- Update package database and Linux kernel version.
sudo apt-get update
echo Y | sudo apt-get install linux-headers-5.0.0-20 linux-headers-5.0.0-20-generic linux-hwe-edge-tools-5.0.0-20 linux-image-5.0.0-20-generic linux-modules-5.0.0-20-generic linux-tools-5.0.0-20-generic
sudo reboot
- Install Mellanox OFED.
wget "http://content.mellanox.com/ofed/MLNX_OFED-4.6-1.0.1.1/MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64.tgz"
tar xvf MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64.tgz
cd MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64
sudo ./mlnxofedinstall --add-kernel-support --dpdk --upstream-libs # it's fine to see 'Failed to install libibverbs-dev DEB'
sudo /etc/init.d/openibd restart
- Install libraries and tools.
echo Y | sudo apt-get --fix-broken install
echo Y | sudo apt-get install libnuma-dev libmnl-dev libnl-3-dev libnl-route-3-dev
echo Y | sudo apt-get install libcrypto++-dev libcrypto++-doc libcrypto++-utils
echo Y | sudo apt-get install software-properties-common
echo Y | sudo apt-get install gcc-9 g++-9 python-pip
echo Y | sudo add-apt-repository ppa:ubuntu-toolchain-r/test
echo Y | sudo apt-get purge cmake
sudo pip install cmake
- Set bash as the default shell.
chsh -s /bin/bash
On all nodes, clone our GitHub repo to the same path, for example your home directory.
AIFM relies on Shenango's threading and TCP runtime. The build_all.sh script in the repo root compiles both Shenango and AIFM automatically.
./build_all.sh
After every reboot, you have to rerun the following script to set up Shenango.
sudo ./scripts/setup_machine.sh
So far you have built AIFM on both nodes. One node is used as the compute node to run applications, while the other is used as the remote memory node. Now edit aifm/configs/ssh on the compute node: change MEM_SERVER_SSH_IP to the IP of the remote memory node (the inet address of eno49 in the ifconfig output), and MEM_SERVER_SSH_USER to your ssh username. Please make sure the compute node can ssh into the remote memory node without a password.
Now you are able to run AIFM programs. aifm/test contains a set of tests that use local/far pointers and containers (and a few other system components). You can also treat those tests as examples of using AIFM. aifm/test.sh is a script that runs all tests automatically. It includes the commands for running AIFM end-to-end.
./test.sh
We provide code and scripts in the aifm/exp folder for reproducing our experiments. For more details, see aifm/exp/README.md.
GitHub Repo Root
 |---- build_all.sh # A push-button build script for building both Shenango and AIFM.
 |---- shenango # A modified version of Shenango runtime for AIFM. DO NOT USE OTHER VERSIONS.
 |---- aifm # AIFM code base.
        |---- bin # Test binaries and a TCP server run on the remote memory node.
        |---- configs # Configuration files for running AIFM.
        |---- inc # AIFM headers.
        |---- src # AIFM cpp files.
        |---- test # Test files of using far-memory pointers and containers.
        |---- snappy # An AIFM-enhanced snappy.
        |---- DataFrame # C++ DataFrame library (which includes both the original version and the AIFM version).
        |---- exp # Code and scripts for reproducing our experiments.
        |---- Makefile
        |---- build.sh # The script for building AIFM.
        |---- test.sh # The script for testing AIFM.
        |---- shared.sh # A collection of helper functions for other scripts.
AIFM is a research prototype rather than a production-ready system. Its current implementation has two main limitations.
- A thread cannot have more than one live DerefScope at any time (but you can have multiple live DerefScopes across different threads). For example, when executing foo()->bar() in a thread, if you've already instantiated a DerefScope in foo(), you must not instantiate another one in bar(). The right way is to pass the one in foo() as a reference to bar().
- AIFM assumes that the remote memory is sufficiently large, so it never garbage collects dead (i.e., freed) objects in the remote memory; see FarMemManager::mutator_wait_for_gc_far_mem() in aifm/src/manager.cpp. However, AIFM does garbage collect dead objects in the local memory, since local memory is a precious resource.
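The first limitation can be illustrated with a small, self-contained sketch. It does not use AIFM's headers: MockDerefScope and live_on_this_thread() are hypothetical stand-ins invented here to count live scopes per thread, and the real DerefScope may look and behave differently. Only the calling pattern, where foo() creates the scope and bar() receives it by reference, is taken from the description above.

```cpp
#include <cassert>

// Hypothetical stand-in for AIFM's DerefScope, NOT the real class.
// It only counts how many scopes are live on the current thread, so the
// one-live-scope-per-thread rule can be checked in this sketch.
class MockDerefScope {
 public:
  MockDerefScope() { ++live_count_; }
  ~MockDerefScope() { --live_count_; }
  MockDerefScope(const MockDerefScope&) = delete;
  MockDerefScope& operator=(const MockDerefScope&) = delete;

  // Number of scopes currently live on the calling thread.
  static int live_on_this_thread() { return live_count_; }

 private:
  static thread_local int live_count_;
};

thread_local int MockDerefScope::live_count_ = 0;

// bar() borrows the caller's scope by reference instead of creating its own.
// Declaring a second MockDerefScope here would raise the count to 2, which is
// the pattern the limitation forbids.
int bar(MockDerefScope& scope) {
  (void)scope;  // A real callee would dereference far pointers under this scope.
  assert(MockDerefScope::live_on_this_thread() == 1);
  return MockDerefScope::live_on_this_thread();
}

// foo() owns the only scope on this thread and lends it to bar().
int foo() {
  MockDerefScope scope;
  return bar(scope);
}
```

In this sketch, foo() returns 1 because only the scope created in foo() is live while bar() runs, and the count drops back to 0 once foo() returns.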
Contact zainruan [at] csail [dot] mit [dot] edu for assistance.