
Multicore support? #22

Open
hthuwal opened this issue Oct 3, 2018 · 15 comments

Comments

@hthuwal commented Oct 3, 2018

I am trying to run it on a 1.5 GB text file. The model uses only a single core and hence it's taking too long.

I couldn't find a flag to specify the number of threads to use. Is there a way to run the model on multiple cores?

@guilherme-salome commented Oct 3, 2018

I couldn't either. An easy workaround is to split the text file and launch two processes, one for each half of the file.
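For reference, a minimal sketch of that workaround. It assumes the assembly jar accepts input and output file paths as positional arguments (check your build's README for the exact invocation); the file names are hypothetical.

```bash
# Split the corpus into two halves without breaking lines, then run one process per half.
split -n l/2 -d corpus.txt part_        # produces part_00 and part_01

java -Xmx10g -jar openie-assembly.jar part_00 part_00.out &
java -Xmx10g -jar openie-assembly.jar part_01 part_01.out &
wait                                    # block until both halves finish

cat part_00.out part_01.out > extractions.txt
```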

@hthuwal (Author) commented Oct 3, 2018 via email

@guilherme-salome

If you find a more efficient solution or update the code to allow for multiple cores, please post it here!

@guilherme-salome

@hthuwal I've been using this project to go over a lot of text. I was running it on a single powerful machine and it was very slow. I then went to DigitalOcean, got one of the high-tier droplets, and started running OpenIE 5 in parallel (with https://www.gnu.org/software/parallel/) on small batches of sentences (1000 at a time, mostly for debugging, but this could be increased). Processing time was about 3 minutes per 1000 sentences (roughly 1 minute of that is OpenIE 5 start-up).
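A sketch of that setup with GNU parallel, again assuming the assembly jar takes input and output paths as arguments (hypothetical file names; add or drop JVM memory flags as discussed below):

```bash
# Split the corpus into 1000-line chunks and process several chunks at a time.
# -j 4 caps the number of simultaneous OpenIE processes; tune it to your RAM
# (each process needed roughly 13 GB in practice).
split -l 1000 -d corpus.txt chunk_

ls chunk_* | parallel -j 4 'java -jar openie-assembly.jar {} {}.out'

cat chunk_*.out > extractions.txt
```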

At first I tried the -Xmx10g -XX:+UseConcMarkSweepGC options and it was not working at all: no lines were being parsed. These options seem to work on Red Hat and macOS but did not work for me on Ubuntu 18.04. I removed them and it started working.
However, I noticed that memory usage was higher than 10 GB, about 13 GB per OpenIE 5 process.
I was also using top to monitor CPU usage, and each process was using about 150% CPU on average.
With 64 GB of RAM I was able to run 4 processes simultaneously (a 5th would crash because of low memory).

The droplet I was using has 64 GB of RAM and 32 vCPUs; its type is "CPU Optimized Droplet".
There is another type called "Standard Droplets", whose highest tier has 192 GB of RAM and 32 vCPUs.
Since the bottleneck on the CPU Optimized droplet was RAM, it may be possible to run more processes on the Standard droplet, even though its CPUs are less powerful.

@guilherme-salome commented Oct 5, 2018

Update: I tested their Standard Droplet with 192 GB of memory and 32 vCPUs and was able to run 8 processes at the same time. That consumed about 92% of the memory. Average CPU use was 1200%. So the bottleneck is definitely memory.

Update: looking at the top output, there still seems to be some memory free with 8 processes, so maybe 9 or 10 could run in parallel. 12 processes definitely does not work, and neither does 11.

Anyway, maybe this can help you speed things up. By the way, DigitalOcean (referral link) is giving $100 of credit for use during October; that buys about 60 hours of the most expensive droplet.

@hthuwal (Author) commented Oct 6, 2018

Thanks @Salompas. Yes, memory is the bottleneck, because each process requires ~10 GB of memory just to run. I have access to a machine with ~80 GB of RAM and 32 cores; I was able to run 3 processes simultaneously, and any further increase in the number of processes chokes the machine.

Thanks for reminding me about the parallel command. I had completely forgotten about it and instead wrote a script that splits the data and spawns processes in multiple tmux windows.
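A minimal sketch of that kind of script, assuming the same hypothetical jar invocation as above; the tmux commands themselves are standard.

```bash
#!/usr/bin/env bash
# Split the data into one chunk per process and run each chunk in its own tmux window.
NPROC=3
split -n l/$NPROC -d data.txt chunk_

tmux new-session -d -s openie
for f in chunk_*; do
  tmux new-window -t openie -n "$f" \
    "java -Xmx10g -jar openie-assembly.jar $f $f.out"
done
```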

@bhadramani

OpenIE 4.2+ had multicore support in a multithreaded environment (with approximately constant RAM usage).
Performance recommendations:

  1. Use one core per thread (N threads on N cores); you may observe up to an Nx improvement, up to 8 cores.
  2. Use taskset to pin processes to specific cores (see the sketch below).
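A hedged example of point 2, pinning each OpenIE process to its own disjoint group of cores with taskset (the core ranges and jar invocation are illustrative, not from this project):

```bash
# Pin two OpenIE processes to separate groups of 8 cores each.
taskset -c 0-7  java -Xmx10g -jar openie-assembly.jar part_00 part_00.out &
taskset -c 8-15 java -Xmx10g -jar openie-assembly.jar part_01 part_01.out &
wait
```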

Swarna may be able to confirm: is OpenIE 5.x thread safe?

@bhadramani

One more performance-related suggestion: reading the files is costly, so smaller chunks should help, and choosing the chunk size well is another smart thing to do.
Similarly, writing the output should be done smartly (for very large data, consider RabbitMQ or a similar system, which maintains the queue and saves asynchronously).

@ambujpd commented Feb 17, 2020

@vaibhavad @swarnaHub @harrysethi @schmmd @bhadramani
Could you please suggest an approach to multicore support?
Alternatively, is it possible to load the model in a separate process so that it can be shared (since model size is one of the major bottlenecks)?

I tried naively using concurrent Futures in Scala and divided the sentences among them (in OpenIECli.scala). (I found the OpenNLP Chunker to be non-thread-safe, so I wrapped that call in blocking{}.) But this is not giving me any improvement: with 8 concurrent Futures (and 80 sentences), the run time is slightly slower than serial. The extractions are getting serialized at some point, even though they run in different threads.

PS: I also see some nThreads set to 1 in some targets:

edu/stanford/nlp/models/pos-tagger/wsj-0-18-left3words-nodistsim.tagger.props:                nthreads = 1
edu/stanford/nlp/models/pos-tagger/english-bidirectional/english-bidirectional-distsim.tagger.props:                nthreads = 1
edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger.props:                nthreads = 1

@ambujpd commented Mar 2, 2020

The multithreaded implementation is working now, giving a 4x improvement with 6 threads (tried on a 20-core machine; increasing threads further showed no additional improvement). The reason it wasn't showing any improvement earlier was that I was giving it too little heap memory (10 GB). Increasing the heap from 10 GB to 12 GB already gave a substantial improvement in runtime (around 10x in extractions).

@vaibhavad (Collaborator)

@ambujpd
Glad to know that the multithreaded implementation is working. Can you share the changes you made in a pull request? We can test them and merge them into the codebase.

@ambujpd commented Mar 5, 2020

@vaibhavad

With a higher number of threads (8+), I sporadically see one or two sentences (out of 80) throwing a NullPointerException from the OpenNLP Chunker, even though I've put that call within blocking{}. I'm currently looking into it.

@moinnadeem

@ambujpd Hey! Are you able to share your multithreaded implementation? It would be super useful for me personally and would cut down my development time by quite a bit. Happy to spend time on the code to help if necessary.

@ambujpd commented Sep 25, 2020

@moinnadeem Unfortunately I don't have the code with me (I remember I was able to use a thread-safe NLP chunker, along with Scala concurrency, and had gotten rid of the sporadic NullPointerException issue). But in the end I found it was not worth the effort, as the scalability was quite limited. A much better alternative is multiprocessing (at the cost of extra memory), which is what I eventually ended up using.

@vaibhavad (Collaborator)

Hi @ambujpd @moinnadeem @bhadramani @hthuwal @Salompas,

We have just released a neural OpenIE system, OpenIE6, which performs better and is at least 10x faster than OpenIE-5 (if you run it on a GPU). You can check it out here: https://github.com/dair-iitd/openie6
