
Any ideas how to iterate faster using multi core? #96

Closed · koustuvsinha opened this issue Apr 13, 2016 · 8 comments
@koustuvsinha (Author) commented Apr 13, 2016

Hi! This is an amazing library, and the only one that works for me while using the least RAM. I have a question about faster iteration. Currently I am iterating through a 2 GB XML file (5,000,000+ nodes) using the built-in iterator. Since it is taking quite a lot of time, I want to use multi-core support to iterate faster. I tried using OpenMP, but it's not working for me. Any ideas?

@zeux (Owner) commented Apr 13, 2016

  1. You may want to try compact mode (pugiconfig.hpp ships with a commented-out define PUGIXML_COMPACT; uncomment it, as shown in the snippet after this list). The performance effects of this are not fully characterized yet, but it may improve traversal performance of large documents due to better locality, given a fast CPU.
  2. I'd like to know a bit more about your XML. Is it a root node with lots (millions) of children, or is the number of children of the root relatively small? And how are you trying to use OpenMP?
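
For reference, enabling compact mode is a one-line edit in pugiconfig.hpp (the define ships commented out; this is just that line uncommented, not new API):

// In pugiconfig.hpp: uncomment this define to enable compact mode,
// which makes the document tree use substantially less memory.
#define PUGIXML_COMPACT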

@koustuvsinha (Author) commented:

@zeux the dataset is DBLP (the Computer Science Bibliography). It's mostly a flat structure, i.e. a root with millions of children, with depth up to 3 or 4. I am trying to use OpenMP by putting the pragma above the loop, like this:

#pragma omp parallel for
for (int n = 0; n < node.size(); ++n) {
    // ... per-node work ...
}

but for some strange reason it's not working. I will check out compact mode as you mentioned.

@zeux (Owner) commented Apr 14, 2016

This is a neat data set to test with!

I wrote a simple test program that loads this XML into memory and then runs a simple XPath query to get all the article nodes:

pugi::xpath_node_set ns = doc.select_nodes("dblp/article");
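
A minimal self-contained version of such a test might look like the following sketch (the file path and the timing code here are assumptions, not part of the original test):

#include <chrono>
#include <cstdio>
#include "pugixml.hpp"

int main()
{
    // Load the whole DBLP dump into memory (the file name is an assumption).
    auto t0 = std::chrono::steady_clock::now();

    pugi::xml_document doc;
    pugi::xml_parse_result result = doc.load_file("dblp.xml");
    if (!result)
    {
        printf("parse error: %s\n", result.description());
        return 1;
    }

    auto t1 = std::chrono::steady_clock::now();

    // Select all article nodes with a simple XPath query.
    pugi::xpath_node_set ns = doc.select_nodes("dblp/article");

    auto t2 = std::chrono::steady_clock::now();

    printf("parse: %.1f s, query: %.1f s, articles: %zu\n",
        std::chrono::duration<double>(t1 - t0).count(),
        std::chrono::duration<double>(t2 - t1).count(),
        ns.size());
}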

Without compact mode I'm getting 5.7 seconds to parse the file and 0.9 seconds to run the query, as well as 6.98 GB memory consumption.

With compact mode I'm getting 4.8 seconds to parse the file and 0.3 seconds to run the query; memory consumption is 2.77 GB.

Now, these timings are pretty reasonable, but of course my test is trivial: the XPath query is simple and I'm not doing any processing on its results.

In your application, what's the distribution of time between parsing, populating the node variable (which I'm assuming is an XPath node set), and processing the results? And how heavy is that processing: what kind of work do you do inside your loop?

@koustuvsinha (Author) commented:

Yeah, actually I figured out that the processing inside the loop is what's taking time. I am trying to index the dataset with a Dewey-based coding scheme, so within each iteration I have to retrieve the child nodes, do some computation to get the computed index label, and move on. So faster iteration would have helped.

On that note, is there any API to directly get the child nodes of an element? Since I am using a custom method to retrieve them, the complexity rises to O(n^2). I guess that's my bottleneck.

@zeux (Owner) commented Apr 14, 2016

Since I am using a custom method to retrieve them, the complexity rises to O(n^2).

So you have a function that, given an xml_node and a child index, iterates through the children and returns that child?

There's no way to provide faster access: you can think of the child container as a linked list, so index-based operations are naturally slow.

I'd suggest grabbing the children of the root element once and storing them in a vector, e.g.:

pugi::xml_node root = doc.child("dblp");
std::vector<pugi::xml_node> children(root.begin(), root.end());

And using that for indexing instead.
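
That cached vector also gives the original OpenMP idea something to parallelize over. A sketch (the helper name process_items is hypothetical, and it assumes the per-item work is strictly read-only; pugixml permits concurrent read-only access to a tree, but check the thread-safety notes in the manual before relying on this):

#include <vector>
#include "pugixml.hpp"

void process_items(pugi::xml_document& doc)
{
    pugi::xml_node root = doc.child("dblp");

    // Materialize the child list once; xml_node handles are cheap to copy.
    std::vector<pugi::xml_node> children(root.begin(), root.end());

    // Parallel loop over random-access storage; each iteration must only
    // read from the tree, never mutate it.
    #pragma omp parallel for
    for (long i = 0; i < (long)children.size(); ++i)
    {
        pugi::xml_node item = children[i];
        // ... per-item, read-only computation goes here ...
        (void)item;
    }
}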

If you're talking about access to the children of nodes like article/phdthesis/etc., then ideally you should not need index-based access, since those children just represent properties of the item. Posting a code example may help me understand the use case better.
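
For instance, the fields of a single article can be read without any indices (a sketch; the helper name dump_article_fields is hypothetical, and it uses the children() range plus child_value() for named fields):

#include <cstdio>
#include "pugixml.hpp"

void dump_article_fields(pugi::xml_document& doc)
{
    pugi::xml_node article = doc.child("dblp").child("article");

    // Named access to a single field:
    printf("title: %s\n", article.child_value("title"));

    // Or iterate all fields in document order, no index-based access needed:
    for (pugi::xml_node field : article.children())
        printf("%s = %s\n", field.name(), field.child_value());
}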

@koustuvsinha (Author) commented:

As per my requirement I have to perform a DFS on the tree, get every child of each element, and perform some computation, kind of like this:

struct mdc_faster_calc : xml_tree_walker
{
    long t = 0;

    virtual bool for_each(xml_node& node)
    {
        cout << "Iteration : " << t++ << endl;

        if (node_types[node.type()] == "element")
        {
            // if there is no parent, assign label "0" to self
            if (string(node.parent().name()).size() == 0)
            {
                cout << "----------Parent Node-----------" << endl;
                label_hash.insert({node, "0"});
                m_hash.insert({node, 0});
            }
            else
            {
                // not a parent:
                // retrieve the parent's children tags & nodes
                node_info ni = getChildrenNodesWithTags(node.parent());
                // .... some linear computation with the retrieved nodes
            }
        }

        return true; // keep traversing
    }
};

This getChildrenNodesWithTags(node.parent()) call is where the bottleneck is; it is what creates the O(n^2) complexity. As for your suggestion of grabbing the children with XPath, here I have to do it at every step, so going that way I would have to generate an XPath query for every node.

@zeux (Owner) commented Apr 14, 2016

If getChildrenNodesWithTags just iterates through the children of a given node and returns them, then the bigger problem seems to be the fact that you call this function at all, not the implementation of the function.

Basically, regardless of how getChildrenNodesWithTags is implemented, it has to be linear, since it returns all the nodes in the parent. Plus you mention a linear computation with the retrieved nodes.

It seems like you're doing this processing redundantly, though: that code executes for every child of dblp (e.g. article), gets the parent (which is dblp, the root), and then fetches all 5M children of that node again and again. Is it possible to cache the results of this processing?
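
One way to cache it, reusing the node_info and getChildrenNodesWithTags names from the walker above (a sketch; the cache names are hypothetical, and it relies on pugi::xml_node providing comparison operators so it can serve as a std::map key):

#include <map>
#include "pugixml.hpp"

// Memoize the per-parent child scan so each distinct parent (e.g. the dblp
// root) is traversed once instead of once per child.
std::map<pugi::xml_node, node_info> info_cache;

const node_info& cached_children(pugi::xml_node parent)
{
    auto it = info_cache.find(parent);
    if (it == info_cache.end())
        it = info_cache.emplace(parent, getChildrenNodesWithTags(parent)).first;
    return it->second;
}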

@koustuvsinha (Author) commented:

Right, good idea. I will try to add a separate processing loop to cache the results and then reuse them. Thanks! Will keep you posted here.

zeux added the question label Jun 13, 2016
zeux closed this as completed Jun 13, 2016