Automatic CPU-GPU affinity #57

Closed
eile opened this issue Nov 28, 2011 · 6 comments

Comments

@eile
Member

eile commented Nov 28, 2011

On multi-socket systems, performance may vary widely depending on which core a thread executes on:

readback "channel" RGBA/401/101/0 1920x1200: 249MPix/sec (8.80081ms, 113FPS)
readback "channel" RGBA/401/101/1 1920x1200: 243MPix/sec (9.02805ms, 110FPS)
readback "channel" RGBA/401/101/2 1920x1200: 257MPix/sec (8.54776ms, 116FPS)
readback "channel" RGBA/401/101/3 1920x1200: 241MPix/sec (9.10284ms, 109FPS)
readback "channel" RGBA/401/101/4 1920x1200: 263MPix/sec (8.34019ms, 119FPS)
readback "channel" RGBA/401/101/5 1920x1200: 262MPix/sec (8.37638ms, 119FPS)
readback "channel" RGBA/401/101/6 1920x1200: 173MPix/sec (12.6451ms, 79FPS)
readback "channel" RGBA/401/101/7 1920x1200: 173MPix/sec (12.6364ms, 79FPS)
readback "channel" RGBA/401/101/8 1920x1200: 173MPix/sec (12.6432ms, 79FPS)
readback "channel" RGBA/401/101/9 1920x1200: 173MPix/sec (12.6438ms, 79FPS)
readback "channel" RGBA/401/101/a 1920x1200: 175MPix/sec (12.5237ms, 79FPS)
readback "channel" RGBA/401/101/b 1920x1200: 174MPix/sec (12.5634ms, 79FPS)

Provide a way, preferably automatic, to configure CPU affinity. hwloc seems to be the most promising package for this.
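
For reference, a minimal sketch of what pinning a thread looks like on Linux without hwloc. It uses the glibc calls sched_getaffinity and sched_setaffinity. The helper names first_allowed_cpu and pin_current_thread are made up for illustration and are not Equalizer or hwloc API. An automatic solution would still have to work out which cores sit closest to the GPU, which is the part hwloc would be used for.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Return the lowest-numbered CPU the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
int first_allowed_cpu(void)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure (errno is set by the kernel). */
int pin_current_thread(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

A render thread would call pin_current_thread() once at startup, before it creates its GL context, with a core index picked for its GPU. The core numbering in the readback log above is specific to that machine, so any hard-coded index is an assumption.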

@ghost ghost assigned eile Nov 28, 2011
@eile
Member Author

eile commented Nov 28, 2011

It also matters for network IO:

[eilemann@node01 Equalizer]$ numactl --cpunodebind=0 ./release/bin/netperf -s node01i:4242:RDMA
Recv perf: 1758.73MB/s (1758.73pps) from RDMA#5000000#node01i##4242#default#
Recv perf: 1787.89MB/s (1787.89pps) from RDMA#5000000#node01i##4242#default#
Recv perf: 1771.19MB/s (1771.19pps) from RDMA#5000000#node01i##4242#default#

[eilemann@node01 Equalizer]$ numactl --cpunodebind=1 ./release/bin/netperf -s node01i:4242:RDMA
Recv perf: 2028.07MB/s (2028.07pps) from RDMA#5000000#node01i##4242#default#
Recv perf: 2022.14MB/s (2022.14pps) from RDMA#5000000#node01i##4242#default#
Recv perf: 1921.8MB/s (1921.8pps) from RDMA#5000000#node01i##4242#default#
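
To do the equivalent of numactl --cpunodebind from inside the application, the set of CPUs belonging to a NUMA node has to be known first. On Linux the kernel lists them in /sys/devices/system/node/nodeN/cpulist as comma-separated ranges such as "0-5,12-17". Below is a rough sketch of a parser for that format. The function name parse_cpulist and the MAX_CPUS limit are hypothetical, and this is not how hwloc or Equalizer does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_CPUS 1024  /* illustrative upper bound, not a kernel constant */

/* Parse a cpulist string such as "0-5,12-17" into a per-CPU flag array.
 * Returns the number of CPUs set, or -1 on malformed input. */
int parse_cpulist(const char *text, bool cpus[MAX_CPUS])
{
    assert(text != NULL && cpus != NULL);
    for (int i = 0; i < MAX_CPUS; ++i)
        cpus[i] = false;

    int count = 0;
    const char *p = text;
    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p)
            return -1;              /* no digits where a number was expected */
        long last = first;
        if (*end == '-') {          /* a range "first-last" */
            p = end + 1;
            last = strtol(p, &end, 10);
            if (end == p)
                return -1;
        }
        if (first < 0 || last < first || last >= MAX_CPUS)
            return -1;
        for (long cpu = first; cpu <= last; ++cpu) {
            if (!cpus[cpu]) {
                cpus[cpu] = true;
                ++count;
            }
        }
        p = end;
        if (*p == ',')
            ++p;                    /* move on to the next entry */
        else if (*p != '\0' && *p != '\n')
            return -1;              /* unexpected separator */
    }
    return count;
}
```

The resulting flags can then be copied into an affinity mask for the receiver thread. Which node is closest to the network adapter still has to be determined per machine, as the two numactl runs above show.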

@eile
Member Author

eile commented Nov 30, 2011

@eile
Member Author

eile commented Dec 19, 2011

@ghost ghost assigned marwan-abdellah Jan 16, 2012
@eile
Member Author

eile commented Mar 21, 2012

FindHWLOC needs version checking. The Ubuntu version is too old: it doesn't have hwloc_bitmap_t, which is deprecated in newer versions. Later we'll also need your new code. Please do this in https://github.com/Eyescale/CMake and then merge to Eq (see doc).

@marwan-abdellah
Member

This issue was resolved in commit 08f4ae5.

@eile
Member Author

eile commented Jul 6, 2012

Implemented, except for node thread affinity: the small gain does not warrant the implementation overhead right now.
