Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First attempt to use more gpus #339

Merged
merged 86 commits into from
Jan 8, 2020
Merged

First attempt to use more gpus #339

merged 86 commits into from
Jan 8, 2020

Conversation

tfarago
Copy link
Contributor

@tfarago tfarago commented Oct 21, 2014

A collaborative PR for making UFO work on more GPUs from concert.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 21, 2014

This is a deep WIP so far...

@tfarago
Copy link
Contributor Author

tfarago commented Oct 24, 2014

@matze so I tried this:

shape = 2048, 4096
image = np.random.normal(1000, 10, size=shape).astype(np.float32)


def generate(num_images=10):
    image = np.random.normal(1000, 10, size=shape).astype(np.float32)

    for i in range(num_images):
        print 'yielding', id(image), i + 1
        yield image
        print 'yielded', id(image), i + 1


@async
def process(backproject, arch, gpu, result, num_images=10):
    try:
        inject(generate(num_images=num_images), backproject(result(), arch=arch, gpu=gpu)) 
    finally:
        backproject.wait()


def run(num_images=10):
    arch = Ufo.ArchGraph()
    gpus = arch.get_gpu_nodes()
    resources = {}
    for i, gpu in enumerate(gpus):
        resources[gpu] = (Backproject(image.shape[1] / 2), Result())

    futures = []
    st = time.time()
    for gpu, (bp, result) in resources.iteritems():
        futures.append(process(bp, arch, gpu, result, num_images=num_images))

    wait(futures)
    print 'duration: {} s'.format(time.time() - st)

    return zip(*resources.values())[1]

But unfortunately I got no speedup plus the memory was not released.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 24, 2014

Even though I call wait the resources are not released. But when I type exit in the session I see the release messages from the framework, so it actually does the job but a little late. Maybe we need to release resources manually with this scheduler?

@tfarago
Copy link
Contributor Author

tfarago commented Oct 24, 2014

And funnily, those yielding and yielded printouts look like the stuff gets parallelized... weird.

@matze
Copy link
Contributor

matze commented Oct 27, 2014

There is a nasty race condition for me in this example. Waiting for the futures begins before the processes started which results in an AttributeError because there is no Backproject.thread object yet created.

@matze
Copy link
Contributor

matze commented Oct 27, 2014

Actually I have problems to get this running at all. Do you made any changes to either ufo-core or Concert?

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

Did you install this branch when you tried?

@matze
Copy link
Contributor

matze commented Oct 27, 2014

Yes, of course.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

Can you re-check? I tried with master and got the same error you mentioned above

AttributeError: 'Backproject' object has no attribute 'thread'

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

I just logged in, did no changes to ufo-* and it works.

@matze
Copy link
Contributor

matze commented Oct 27, 2014

As I said, it's a race condition. If it starts executing before waiting on the futures begins it should be fine. But I am not sure how to solve this generally at the moment.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

It is not a race condition because when you wait for the futures (if you mean the ones just before the duration printout) you are waiting for the process function which executes these commands serially (pseudo):

try:
    backproject.start() # inside the __call__, backproject.thread exists from now on
    while True:
        # crunch
finally:
    backproject.wait()

Try this process function instead:

@async
def process(backproject, arch, gpu, result, num_images=10):
    inject(generate(num_images=num_images), backproject(result(), arch=arch, gpu=gpu)) 
    backproject.wait()

Does it by any chance say:

TypeError: __call__() got an unexpected keyword argument 'gpu'

@matze
Copy link
Contributor

matze commented Oct 27, 2014

Does it by any chance say:

No, it never did so because I told you already that I am using the correct branch. After removing the try/finally blocks it kind of works now.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

Hm, but then I have no idea what the problem is. Especially since for me it works every time. Actually I experienced some random hangings and segfaults but that was very rare and before your final python multithreading fix, now I have no idea if it still happens.

@matze
Copy link
Contributor

matze commented Oct 27, 2014

I release the GIL in the output task now which let's all schedulers run fully in parallel. I haven't checked the memory consumption yet though.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

Should I try? Do I need to update/do smth. special on the ufo server?

@matze
Copy link
Contributor

matze commented Oct 27, 2014

Yeah, try what you wanted to try. I see the speed up and I guess you have to update ufo-core if you don't use a local installation.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

After a fresh pull it doesn't build...

EDIT: All OK, make clean rules.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

Nope, it builds on my machine but not on the server...

@matze
Copy link
Contributor

matze commented Oct 27, 2014

Nope, it builds on my machine but not on the server...

I ran it on the server ...

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

In /opt/ufo-core:

[ 47%] Building C object ufo/CMakeFiles/ufo.dir/ufo-output-task.c.o
/opt/ufo-core/ufo/ufo-output-task.c: In function ‘ufo_output_task_get_output_buffer’:
/opt/ufo-core/ufo/ufo-output-task.c:116:5: error: ‘buffer’ may be used uninitialized in this function [-Werror=uninitialized]
cc1: all warnings being treated as errors

make[2]: *** [ufo/CMakeFiles/ufo.dir/ufo-output-task.c.o] Error 1
make[1]: *** [ufo/CMakeFiles/ufo.dir/all] Error 2
make: *** [all] Error 2

How did you manage locally?

@matze
Copy link
Contributor

matze commented Oct 27, 2014

I have no idea why it didn't complain in my case. Anyway, I pushed a change that removes that warning, compiled and installed it. Have fun.

@tfarago
Copy link
Contributor Author

tfarago commented Oct 27, 2014

Thanks, I will!

@tfarago
Copy link
Contributor Author

tfarago commented Mar 3, 2015

I am trying to make this work with CopyTask but no luck. The code from here crashes with the following awfully:

f = FooProcess()
im = np.empty((512, 512), dtype=np.float32)
result = Result()
inject((im,), f(result()))

Can @matze take a look please?

@tfarago tfarago merged commit fa18e84 into master Jan 8, 2020
@tfarago tfarago deleted the ufo-multi-gpus branch January 8, 2020 11:32
@tfarago
Copy link
Contributor Author

tfarago commented Jan 8, 2020

This has been around for long enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants