This repository has been archived by the owner on May 12, 2021. It is now read-only.

drat 'go' command stalls at reduce prompt #17

Closed
lewismc opened this issue Jul 25, 2014 · 11 comments

Comments

@lewismc
Member

lewismc commented Jul 25, 2014

BTW the go command is dynamite. Really easy to get up and running.
I've recently been running DRAT on Apache Usergrid (Incubating) [0] in an attempt to add missing license headers, detect binaries and also set us up for a release, as we have yet to release within the Incubator.

I see that after we've crawled, indexed and fired off map task(s), we are immediately prompted to fire off reducers, e.g.

Waiting for the mapping and partitioning to finish...
There are still MapReduce mappers running! It is reccomended you wait for them to finish.
Are you sure you wish to continue? [yN]

This essentially stalls the 'go' process. I would imagine that for a user it would be a catch-22 situation as to whether to answer yes or no to this prompt.

I think we should find another mechanism for firing off reduce tasks, one which does not require user input to complete (or destroy) the current DRAT job.

I need to delve more closely into the mechanics of getting this working but will try my best over the next wee while.

[0] https://git-wip-us.apache.org/repos/asf?p=incubator-usergrid.git;a=tree;h=refs/heads/master;hb=master

@tbpalsulich
Contributor

Hmm. The go function should wait for all mappers to finish before firing off the reducer. But the reducer still checks for running mappers. In this case, it seems go didn't find a mapper, but the reduce function did. My guess is that one unluckily fired off in the short time between the two checks.

Regardless, we should give the command to rerun later in the confirmation message. Will send a PR tomorrow.

@tbpalsulich
Contributor

We could also wait only 10 or so seconds for user input if the reduce function finds running mappers, then simply try to run reduce again. That way, asleep go users don't get an unpleasant surprise in the morning, and reduce users can just say no and try again later.
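
A minimal Python sketch of the timed-prompt idea above, not DRAT's actual code. All names here (`ask_with_timeout`, `reduce_with_timed_prompt`, `mappers_running`, `run_reducer`, the 30-second retry delay) are hypothetical and only illustrate the control flow: prompt for about 10 seconds, treat silence as "wait and check again", treat an explicit "no" as "stop and let the user rerun reduce later".

```python
import queue
import threading
import time


def ask_with_timeout(prompt, timeout_s, read_line=input):
    """Return the stripped, lower-cased answer, or None if none arrives in time."""
    answers = queue.Queue()

    def reader():
        try:
            answers.put(read_line(prompt))
        except EOFError:
            answers.put("")  # closed stdin counts as an empty answer

    # Daemon thread: an unanswered prompt must not keep the process alive.
    threading.Thread(target=reader, daemon=True).start()
    try:
        return answers.get(timeout=timeout_s).strip().lower()
    except queue.Empty:
        return None


def reduce_with_timed_prompt(mappers_running, run_reducer, read_line=input,
                             timeout_s=10.0, retry_delay_s=30.0,
                             sleep=time.sleep):
    """Run the reducer; return False if the user declined to continue.

    mappers_running and run_reducer are hypothetical callables standing in
    for whatever check and launch mechanism the real script uses.
    """
    while mappers_running():
        answer = ask_with_timeout(
            "Mappers still running. Continue? [yN] ", timeout_s, read_line)
        if answer == "y":
            break             # user explicitly chose to reduce now
        if answer is not None:
            return False      # user said no (or just pressed Enter)
        sleep(retry_delay_s)  # nobody answered: wait, then re-check mappers
    run_reducer()
    return True
```

One known limitation of this sketch: a prompt that times out leaves its reader thread blocked on standard input, so a real implementation would probably want a non-blocking read instead.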

@chrismattmann
Collaborator

I'm wondering why it stalls - I thought the point of go was to be automated - I think it dynamically checks for the mappers to be done and then fires the reducer, right? Where does the prompt come in? I thought it comes in if you don't use "go" and if you ran each cycle (crawl, index, map) before running reduce?

@tbpalsulich
Contributor

Correct, go waits for mappers to finish, then calls the reduce function. The reduce function checks for running mappers, then prompts the user for confirmation if there are still mappers running.

This double-check is useful when a user doesn't want to use go and instead calls reduce manually. For automation, go attempts to prevent this check from finding any mappers.

In this issue, go found no mappers, but reduce did. I don't know why, except for maybe being really unlucky. This should never happen, since they both use the same method to check for currently running mappers.
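
The flow described above can be sketched in Python as follows. This is an illustration under assumptions, not the DRAT script itself: `count_running`, `run_reducer`, `confirm`, and the `interactive` switch are all hypothetical names. The sketch shows where the second check sits and one possible way for an automated caller to wait out a late mapper instead of prompting.

```python
import time


def wait_for_mappers(count_running, poll_s=5.0, sleep=time.sleep):
    """Block until count_running() reports zero running mappers."""
    while count_running() > 0:
        sleep(poll_s)


def reduce_step(count_running, run_reducer, confirm, interactive=True,
                poll_s=5.0, sleep=time.sleep):
    """Run the reducer after double-checking for running mappers.

    interactive=True mirrors the manual reduce path: ask before continuing.
    interactive=False is a hypothetical mode for automated callers: wait
    for the mappers instead of prompting.
    """
    if count_running() > 0:
        if interactive:
            if not confirm():
                return False  # user declined; nothing was run
        else:
            wait_for_mappers(count_running, poll_s, sleep)
    run_reducer()
    return True


def go(count_running, run_reducer, poll_s=5.0, sleep=time.sleep):
    """Automated path: wait for mappers, then reduce without prompting."""
    wait_for_mappers(count_running, poll_s, sleep)
    # A mapper may start between the check above and the one inside
    # reduce_step; with interactive=False that case is waited out.
    return reduce_step(count_running, run_reducer, confirm=lambda: False,
                       interactive=False, poll_s=poll_s, sleep=sleep)
```

With this shape the manual reduce path keeps its confirmation prompt, while go never blocks on user input.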

Unless, does this happen every time, Lewis?

@lewismc
Member Author

lewismc commented Jul 25, 2014

Every time, Tyler. It seems to be built in.


@lewismc
Member Author

lewismc commented Jul 25, 2014

BTW another observation from running on the Usergrid codebase was that some tasks took around 18-20 mins whereas others took literally a second or two. Please see the screenshot of OPSUI. I am investigating further, folks.
[Screenshot: OPSUI, 2014-07-24 at 10:07:07 PM]

@chrismattmann
Collaborator

Lewis, that's totally expected (some tasks taking 20 mins versus seconds). It has to do with the type of code it's looking at: JavaScript and some other MIME types usually take WAY longer to check than, e.g., Java. I think this is more a function of RAT than anything else, but it could be that RAT has some difficulty finding licenses in particular programming languages. But note, this is a KNOWN issue (one that Tika and DRAT helped to uncover; check out the DRAT docs for more info via the XNET presentation).

@lewismc
Member Author

lewismc commented Jul 25, 2014

Great.
This is something new I am learning, and I am extremely happy that you've now put DRAT into ASLv2.0...
I am now a contributor :)


@chrismattmann
Collaborator

you are awesome!



@lewismc
Member Author

lewismc commented Jul 30, 2014

WOW I totally missed all of this conversation and also @tpalsulich's patch.
The ...................... counting update for map is not only familiar but also very professional.
Thanks for this functionality folks. MUCH more precise.

@lewismc lewismc closed this as completed Jul 30, 2014
@lewismc
Member Author

lewismc commented Jul 30, 2014

BOOM XNET PRESSIE
https://www.youtube.com/watch?v=9w3fpnNWdIE
