RFC: AMD ROCm support with plugin #1519

Closed
wants to merge 29 commits into from
Changes from 27 commits
Commits
29 commits
51efa5c
Revert "Allow systemcfg proc file to be dumped"
rajbhar May 15, 2021
fbfa556
criu/parse: Treat some unsupported VMAs as regular
rajbhar Nov 20, 2020
7146313
criu/plugin: Initialize AMD KFD header
rajbhar May 4, 2021
4028ddc
criu/files-reg: Add offset and file path plugin
rajbhar Apr 15, 2021
311aee4
criu/plugin: Support AMD ROCm Checkpoint Restore with KFD
rajbhar Apr 15, 2021
4c1b8f3
criu/plugin: Optimize the proto image size
rajbhar Feb 3, 2021
f55fe48
criu/plugin: optimization for large bar read
rajbhar Feb 26, 2021
ca99c15
criu/restore: Introduce restore late stage hook
rajbhar Apr 15, 2021
ffeb86b
criu/plugin: Implement restore late hook for kfd
rajbhar Apr 15, 2021
0a6771f
criu/plugin: Add support for dumping and restoring queues
dayatsin-amd Jan 26, 2021
c08db4a
criu/plugin: dump debug logs selectively
rajbhar Feb 12, 2021
24a6761
criu/plugin: Support larger memory footprints
dayatsin-amd Feb 16, 2021
0c32304
criu/plugin: Dump and restore events
dayatsin-amd Apr 15, 2021
9ff8973
criu/plugin: Add initial documentation for ROCm support.
rajbhar Mar 18, 2021
e4819aa
criu/plugin: Re-adjust doorbell offset for queues
dayatsin-amd Mar 22, 2021
1199c20
criu/plugin: Pytorch container with criu
rajbhar Apr 13, 2021
95c9258
criu/plugin: Dockerfile for AMD criu repo
rajbhar Apr 20, 2021
8268b61
criu/files: *RFC* Don't cache fd for amdgpu devices
rajbhar Apr 27, 2021
d83ddd5
criu/plugin: Add whitepaper document
fxkamd Apr 30, 2021
274aabd
criu/plugin: Add build options for amdgpu plugin
rajbhar May 12, 2021
84135f4
criu/plugin: Implement system topology parsing
dayatsin-amd Apr 20, 2021
16778cc
criu/plugin: Remap GPUs on checkpoint restore
dayatsin-amd Apr 20, 2021
98eddc9
criu/plugin: Add parameters to override mapping
dayatsin-amd Apr 20, 2021
c87fdf5
criu/plugin: Add unit tests for GPU remapping
dayatsin-amd May 18, 2021
ee928e1
criu/plugin: Read and write BO contents in parallel
dayatsin-amd May 18, 2021
d838942
criu/plugin: Restore libhsakmt shared memory files
dayatsin-amd Jun 10, 2021
a5df3ad
criu/plugin: fix build warnings
rajbhar Jun 25, 2021
f81a453
script/builds: add build dependepncy for libdrm
rajbhar Jun 25, 2021
4f864a1
Merge branch 'criu-dev' into criu-dev
rajbhar Jun 25, 2021
1 change: 1 addition & 0 deletions Documentation/Makefile
@@ -13,6 +13,7 @@ endif
FOOTER := footer.txt
SRC1 += crit.txt
SRC1 += compel.txt
SRC1 += amdgpu_plugin.txt
Member

The name amdgpu_plugin (i.e., man amdgpu_plugin) is global.
What do you think about using something like criu-amdgpu-plugin instead?

SRC8 += criu.txt
SRC := $(SRC1) $(SRC8)
XMLS := $(patsubst %.txt,%.xml,$(SRC))
99 changes: 99 additions & 0 deletions Documentation/amdgpu_plugin.txt
@@ -0,0 +1,99 @@
ROCM Support(1)
===============

NAME
----
amdgpu_plugin - A plugin extension to CRIU to support checkpoint/restore in
userspace for AMD GPUs.


CURRENT SUPPORT
---------------
Single and Multi GPU systems (Gfx9)
Checkpoint / Restore on same system
Checkpoint / Restore inside a docker container
PyTorch

DESCRIPTION
-----------
Although *criu* is a capable tool for checkpointing and restoring running
applications, it has limitations; for example, it cannot handle
applications that have device files open. To support *ROCm*-based
workloads, criu's core functionality is augmented through its
plugin-based extension mechanism. *amdgpu_plugin* provides the support
criu needs to checkpoint and restore ROCm applications.


Dependencies
~~~~~~~~~~~~~~
*amdkfd support*::
To snapshot the *VRAM* and other *GPU* device state, an updated version
of the amdkfd (amdgpu) driver is required. The kernel patches are
currently under review.

*criu 3.15*::
This work is rebased on the latest criu release available at the time of writing.


OPTIONS
-------
Optional parameters can be passed as environment variables before
executing the criu command.

*KFD_FW_VER_CHECK*::
Enable or disable the firmware version check.
If enabled, the firmware version on the restored GPU must be greater than or
equal to the firmware version on the checkpointed GPU. Default: Enabled

E.g.:
KFD_FW_VER_CHECK=0

*KFD_SDMA_FW_VER_CHECK*::
Enable or disable the SDMA firmware version check.
If enabled, the SDMA firmware version on the restored GPU must be greater than
or equal to the SDMA firmware version on the checkpointed GPU. Default: Enabled

E.g.:
KFD_SDMA_FW_VER_CHECK=0

*KFD_CACHES_COUNT_CHECK*::
Enable or disable the caches count check. If enabled, the caches count on
the restored GPU must be greater than or equal to the caches count on the
checkpointed GPU. Default: Enabled

E.g.:
KFD_CACHES_COUNT_CHECK=0

*KFD_NUM_GWS_CHECK*::
Enable or disable the num_gws check. If enabled, num_gws on the
restored GPU must be greater than or equal to num_gws on the checkpointed
GPU. Default: Enabled

E.g.:
KFD_NUM_GWS_CHECK=0

*KFD_VRAM_SIZE_CHECK*::
Enable or disable the VRAM size check. If enabled, the VRAM size on the
restored GPU must be greater than or equal to the VRAM size on the
checkpointed GPU. Default: Enabled

E.g.:
KFD_VRAM_SIZE_CHECK=0

*KFD_NUMA_CHECK*::
Enable or disable the NUMA CPU region check. If enabled, the plugin restores
GPUs that belonged to one CPU NUMA region onto GPUs in the same CPU NUMA region.
Default: Enabled

E.g.:
KFD_NUMA_CHECK=0
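
The options above share two rules: each check defaults to enabled, and most compare a property of the restore GPU against the checkpointed GPU with "greater than or equal to". A minimal C sketch of both rules follows. The helper names `parse_check_flag`, `check_enabled` and `restore_target_ok` are hypothetical and not taken from the plugin source, and treating any value other than `0` as "enabled" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the plugin source): interpret the value of
 * a check variable. Unset or empty keeps the default, "0" disables the
 * check, and any other value enables it (an assumption of this sketch).
 */
bool parse_check_flag(const char *value, bool default_enabled)
{
	if (value == NULL || value[0] == '\0')
		return default_enabled;
	return strcmp(value, "0") != 0;
}

/* Read a variable such as KFD_FW_VER_CHECK; checks default to enabled. */
bool check_enabled(const char *name)
{
	assert(name != NULL);
	return parse_check_flag(getenv(name), true);
}

/*
 * "Greater than or equal" rule used by the firmware, caches count, num_gws
 * and VRAM size checks: the restore GPU must offer at least what the
 * checkpointed GPU had, unless the check has been disabled.
 */
bool restore_target_ok(unsigned long checkpointed, unsigned long restored,
		       bool enabled)
{
	return !enabled || restored >= checkpointed;
}
```

With this sketch, `restore_target_ok(src_fw, dst_fw, check_enabled("KFD_FW_VER_CHECK"))` would reject a restore onto older firmware unless the variable is set to `0` in the environment of the criu command.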


AUTHOR
------
The AMDKFD team.


COPYRIGHT
---------
Copyright \(C) 2020-2021, Advanced Micro Devices, Inc. (AMD)
15 changes: 12 additions & 3 deletions Makefile
@@ -147,7 +147,7 @@ HOSTCFLAGS += $(WARNINGS) $(DEFINES) -iquote include/
export AFLAGS CFLAGS USERCLFAGS HOSTCFLAGS

# Default target
all: flog criu lib crit
all: flog criu lib crit amdgpu_plugin
Member

As Adrian mentioned in his comment, the plugin should not be built automatically by typing make. This is because it requires additional dependencies that are optional for CRIU.

To enable the build we could use something like AMDGPU_PLUGIN=1 make instead?

Contributor Author

@rajbhar rajbhar Jul 14, 2021

ack, I've added nmk based dependencies check on libdrm in the Makefile

.PHONY: all

#
@@ -290,15 +290,19 @@ clean mrproper:
$(Q) $(MAKE) $(build)=crit $@
.PHONY: clean mrproper

clean-amdgpu_plugin:
$(Q) $(MAKE) -C plugins/amdgpu clean
.PHONY: clean-amdgpu_plugin

clean-top:
$(Q) $(MAKE) -C Documentation clean
$(Q) $(MAKE) $(build)=test/compel clean
$(Q) $(RM) .gitid
.PHONY: clean-top

clean: clean-top
clean: clean-top clean-amdgpu_plugin

mrproper-top: clean-top
mrproper-top: clean-top clean-amdgpu_plugin
$(Q) $(RM) $(CONFIG_HEADER)
$(Q) $(RM) $(VERSION_HEADER)
$(Q) $(RM) $(COMPEL_VERSION_HEADER)
@@ -326,6 +330,10 @@ test: zdtm
$(Q) $(MAKE) -C test
.PHONY: test

amdgpu_plugin:
$(Q) $(MAKE) -C plugins/amdgpu all
.PHONY: amdgpu_plugin

#
# Generating tar requires tag matched CRIU_VERSION.
# If not found then simply use GIT's describe with
@@ -403,6 +411,7 @@ help:
@echo ' cscope - Generate cscope database'
@echo ' test - Run zdtm test-suite'
@echo ' gcov - Make code coverage report'
@echo ' amdgpu_plugin - Make AMD GPU plugin'
.PHONY: help

lint:
10 changes: 8 additions & 2 deletions Makefile.install
@@ -7,6 +7,7 @@ MANDIR ?= $(PREFIX)/share/man
INCLUDEDIR ?= $(PREFIX)/include
LIBEXECDIR ?= $(PREFIX)/libexec
RUNDIR ?= /run
PLUGINDIR ?= /var/lib/criu

#
# For recent Debian/Ubuntu with multiarch support.
@@ -26,7 +27,7 @@ endif
LIBDIR ?= $(PREFIX)/lib

export PREFIX BINDIR SBINDIR MANDIR RUNDIR
export LIBDIR INCLUDEDIR LIBEXECDIR
export LIBDIR INCLUDEDIR LIBEXECDIR PLUGINDIR

install-man:
$(Q) $(MAKE) -C Documentation install
@@ -40,12 +41,16 @@ install-criu: criu
$(Q) $(MAKE) $(build)=criu install
.PHONY: install-criu

install-amdgpu_plugin: amdgpu_plugin
$(Q) $(MAKE) -C plugins/amdgpu install
.PHONY: install-amdgpu_plugin

install-compel: $(compel-install-targets)
$(Q) $(MAKE) $(build)=compel install
$(Q) $(MAKE) $(build)=compel/plugins install
.PHONY: install-compel

install: install-man install-lib install-criu install-compel ;
install: install-man install-lib install-criu install-compel install-amdgpu_plugin ;
.PHONY: install

uninstall:
@@ -54,4 +59,5 @@ uninstall:
$(Q) $(MAKE) $(build)=criu $@
$(Q) $(MAKE) $(build)=compel $@
$(Q) $(MAKE) $(build)=compel/plugins $@
$(Q) $(MAKE) -C plugins/amdgpu uninstall
.PHONY: uninstall
15 changes: 15 additions & 0 deletions criu/cr-restore.c
@@ -2451,6 +2451,21 @@ static int restore_root_task(struct pstree_item *init)
pr_err("Unable to flush breakpoints\n");

finalize_restore();
/*
* Some external devices such as GPUs might need a very late
* trigger to kick-off some events, memory notifiers and for
* restarting the previously restored queues during criu restore
* stage. This is needed since criu pie code may shuffle VMAs
* around so things such as registering MMU notifiers (for GPU
* mapped memory) could be done sanely once the pie code hands
* over the control to master process.
*/
for_each_pstree_item(item) {
pr_info("Run late stage hook from criu master for external devices\n");
ret = run_plugins(RESUME_DEVICES_LATE, item->pid->real);
if (ret < 0)
pr_err("criu: restore late stage hook for external plugin failed\n");
}

ret = run_scripts(ACT_PRE_RESUME);
if (ret)
11 changes: 9 additions & 2 deletions criu/file-ids.c
@@ -77,11 +77,18 @@ int fd_id_generate_special(struct fd_parms *p, u32 *id)
{
if (p) {
struct fd_id *fi;
struct stat st_kfd;

fi = fd_id_cache_lookup(p);
if (fi) {
*id = fi->id;
return 0;
if (stat("/dev/kfd", &st_kfd) == -1) {
*id = fi->id;
return 0;
} else {
/* Don't cache the id */
*id = fd_tree.subid++;
return 1;
}
}
}

18 changes: 18 additions & 0 deletions criu/files-reg.c
@@ -2314,6 +2314,23 @@ static int open_filemap(int pid, struct vma_area *vma)
BUG_ON((vma->vmfd == NULL) || !vma->e->has_fdflags);
flags = vma->e->fdflags;

/* update the new device file page offsets and file paths set during restore */
if (vma->e->status & VMA_UNSUPP) {
uint64_t new_pgoff;
char new_path[PATH_MAX];
int ret;

struct reg_file_info *rfi = container_of(vma->vmfd, struct reg_file_info, d);
ret = run_plugins(UPDATE_VMA_MAP, rfi->rfe->name, new_path, vma->e->start,
vma->e->pgoff, &new_pgoff);
if (ret == 1) {
pr_info("New mmap %#016"PRIx64"->%#016"PRIx64" path %s\n", vma->e->pgoff,
new_pgoff, new_path);
vma->e->pgoff = new_pgoff;
strcpy(rfi->path, new_path);
}
}

if (ctx.flags != flags || ctx.desc != vma->vmfd) {
if (vma->e->status & VMA_AREA_MEMFD)
ret = memfd_open(vma->vmfd, &flags);
@@ -2331,6 +2348,7 @@ static int open_filemap(int pid, struct vma_area *vma)

ctx.vma = vma;
vma->e->fd = ctx.fd;

return 0;
}
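
In the hunk above, the caller treats a return value of 1 from the `UPDATE_VMA_MAP` hook as "path and page offset were rewritten" and then copies `new_path` with `strcpy`. The sketch below shows what a plugin-side lookup with that return convention could look like, using a bounded copy instead. `struct vma_remap` and `update_vma_map_sketch` are hypothetical. Returning 0 for "no change" and -1 for "path does not fit" are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical remap record: where a device mapping moved on restore. */
struct vma_remap {
	const char *old_path;
	uint64_t old_pgoff;
	const char *new_path;
	uint64_t new_pgoff;
};

/*
 * Look up (old_path, old_pgoff) in the table. On a match, write the new
 * path and offset and return 1, matching the "ret == 1" test in the caller.
 * Return 0 when nothing changes and -1 when new_path is too small.
 */
int update_vma_map_sketch(const struct vma_remap *table, size_t entries,
			  const char *old_path, char *new_path,
			  size_t new_path_size, uint64_t old_pgoff,
			  uint64_t *new_pgoff)
{
	assert(old_path != NULL && new_path != NULL && new_pgoff != NULL);

	for (size_t i = 0; i < entries; i++) {
		int len;

		if (table[i].old_pgoff != old_pgoff ||
		    strcmp(table[i].old_path, old_path) != 0)
			continue;

		/* snprintf never writes past new_path_size bytes. */
		len = snprintf(new_path, new_path_size, "%s", table[i].new_path);
		if (len < 0 || (size_t)len >= new_path_size)
			return -1;

		*new_pgoff = table[i].new_pgoff;
		return 1;
	}
	return 0;
}
```

Passing the destination size explicitly lets the caller copy into a buffer of any length without relying on both sides agreeing on `PATH_MAX`.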

12 changes: 12 additions & 0 deletions criu/include/criu-plugin.h
@@ -22,6 +22,7 @@

#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define CRIU_PLUGIN_GEN_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
#define CRIU_PLUGIN_VERSION_MAJOR 0
@@ -50,6 +51,10 @@ enum {

CR_PLUGIN_HOOK__DUMP_EXT_LINK = 6,

CR_PLUGIN_HOOK__UPDATE_VMA_MAP = 7,

CR_PLUGIN_HOOK__RESUME_DEVICES_LATE = 8,

CR_PLUGIN_HOOK__MAX
};

@@ -63,6 +68,9 @@ DECLARE_PLUGIN_HOOK_ARGS(CR_PLUGIN_HOOK__RESTORE_EXT_FILE, int id);
DECLARE_PLUGIN_HOOK_ARGS(CR_PLUGIN_HOOK__DUMP_EXT_MOUNT, char *mountpoint, int id);
DECLARE_PLUGIN_HOOK_ARGS(CR_PLUGIN_HOOK__RESTORE_EXT_MOUNT, int id, char *mountpoint, char *old_root, int *is_file);
DECLARE_PLUGIN_HOOK_ARGS(CR_PLUGIN_HOOK__DUMP_EXT_LINK, int index, int type, char *kind);
DECLARE_PLUGIN_HOOK_ARGS(CR_PLUGIN_HOOK__UPDATE_VMA_MAP, const char* old_path, char *new_path,
const uint64_t addr, const uint64_t old_pgoff, uint64_t *new_pgoff);
DECLARE_PLUGIN_HOOK_ARGS(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, int pid);

enum {
CR_PLUGIN_STAGE__DUMP,
@@ -128,5 +136,9 @@ typedef int (cr_plugin_restore_file_t)(int id);
typedef int (cr_plugin_dump_ext_mount_t)(char *mountpoint, int id);
typedef int (cr_plugin_restore_ext_mount_t)(int id, char *mountpoint, char *old_root, int *is_file);
typedef int (cr_plugin_dump_ext_link_t)(int index, int type, char *kind);
typedef int (cr_plugin_update_vma_offset_t)(const char* old_path, char *new_path,
const uint64_t addr, const uint64_t old_pgoff,
uint64_t *new_pgoff);
typedef int (cr_plugin_resume_devices_late_t)(int pid);

#endif /* __CRIU_PLUGIN_H__ */
3 changes: 3 additions & 0 deletions criu/include/proc_parse.h
@@ -8,6 +8,9 @@
#define PROC_TASK_COMM_LEN 32
#define PROC_TASK_COMM_LEN_FMT "(%31s"

#define DRM_FIRST_RENDER_NODE 128
#define DRM_LAST_RENDER_NODE 255

struct proc_pid_stat {
int pid;
char comm[PROC_TASK_COMM_LEN];
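
The two constants added in this hunk bound the minor numbers that DRM uses for render nodes (128 to 255, exposed as `/dev/dri/renderD<minor>`). A small sketch of how they might be used to classify a device minor follows. `is_drm_render_minor` and `drm_render_node_path` are hypothetical helpers, not functions from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define DRM_FIRST_RENDER_NODE 128
#define DRM_LAST_RENDER_NODE 255

/* True when the minor number falls in the DRM render node range. */
bool is_drm_render_minor(unsigned int minor_nr)
{
	return minor_nr >= DRM_FIRST_RENDER_NODE &&
	       minor_nr <= DRM_LAST_RENDER_NODE;
}

/*
 * Build "/dev/dri/renderD<minor>" into buf. Returns 0 on success, or -1
 * if the minor is outside the render range or buf is too small.
 */
int drm_render_node_path(char *buf, size_t size, unsigned int minor_nr)
{
	int len;

	assert(buf != NULL);
	if (!is_drm_render_minor(minor_nr))
		return -1;

	len = snprintf(buf, size, "/dev/dri/renderD%u", minor_nr);
	return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```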
2 changes: 2 additions & 0 deletions criu/plugin.c
@@ -53,6 +53,8 @@ static cr_plugin_desc_t *cr_gen_plugin_desc(void *h, char *path)
__assign_hook(DUMP_EXT_MOUNT, "cr_plugin_dump_ext_mount");
__assign_hook(RESTORE_EXT_MOUNT, "cr_plugin_restore_ext_mount");
__assign_hook(DUMP_EXT_LINK, "cr_plugin_dump_ext_link");
__assign_hook(UPDATE_VMA_MAP, "cr_plugin_update_vma_map");
__assign_hook(RESUME_DEVICES_LATE, "cr_plugin_resume_devices_late");

#undef __assign_hook
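
Each `__assign_hook` line pairs a hook index from `criu-plugin.h` with the symbol name a plugin is expected to export. The pairing for the hooks visible in this diff can be sketched as a plain lookup table. The enum values and symbol strings below are copied from the diff. The table `hook_symbols` and the function `hook_index_for_symbol` are hypothetical and only illustrate the mapping, not how CRIU resolves symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Subset of the hook indices shown in criu-plugin.h in this PR. */
enum {
	CR_PLUGIN_HOOK__DUMP_EXT_LINK = 6,
	CR_PLUGIN_HOOK__UPDATE_VMA_MAP = 7,
	CR_PLUGIN_HOOK__RESUME_DEVICES_LATE = 8,
	CR_PLUGIN_HOOK__MAX
};

/* Symbol names as passed to __assign_hook; unlisted slots stay NULL. */
static const char *const hook_symbols[CR_PLUGIN_HOOK__MAX] = {
	[CR_PLUGIN_HOOK__DUMP_EXT_LINK] = "cr_plugin_dump_ext_link",
	[CR_PLUGIN_HOOK__UPDATE_VMA_MAP] = "cr_plugin_update_vma_map",
	[CR_PLUGIN_HOOK__RESUME_DEVICES_LATE] = "cr_plugin_resume_devices_late",
};

/* Return the hook index for an exported symbol name, or -1 if unknown. */
int hook_index_for_symbol(const char *symbol)
{
	assert(symbol != NULL);

	for (int i = 0; i < CR_PLUGIN_HOOK__MAX; i++) {
		if (hook_symbols[i] != NULL && strcmp(hook_symbols[i], symbol) == 0)
			return i;
	}
	return -1;
}
```

Because `CR_PLUGIN_HOOK__MAX` follows the last enumerator, adding the two new hooks grows the table automatically without touching its declared size.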
