The normal use case for building a scalding job involves writing a class that subclasses Job. This class is then rendered as a cascading flow by scalding.Tool. There are two issues with this: 1) reflection is normally used to launch the job, so any error the job throws at constructor time is generally hidden from the user behind a reflection failure. 2) For people building stand-alone jobs, this needlessly complicates their build, since they have to launch with a special, redundant class-name string.
My idea is the ability to do something like:
```scala
object MyNewJob extends App with ToolJob {
  // args is the raw input passed in, provided by App
  TypedTsv[String](parsedArgs("input"))
    .mapTo('words) { _.split("\\s+") }
    .groupBy('words) { _.size }
    .write(Tsv(parsedArgs("out")))
}
```
And then be able to run that with `hadoop jar MyJar.jar --input infile --out wordcount.tsv` and have the default main method baked in correctly.
Alternatively, we could implement something equivalent to App, that doesn't introduce the args confusion (i.e. args will return a scalding.Args), and you just type object MyJob extends ToolJob.
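A minimal sketch of what such a `ToolJob` trait could look like. This is a hypothetical illustration, not scalding's actual implementation: the trait name `ToolJob`, the `parsedArgs` field, and the `run` hook are assumptions from this proposal, though `com.twitter.scalding.Args` is scalding's existing `--key value` argument parser. The point is that `main` lives on the object itself, so constructor-time errors surface directly instead of being wrapped in a reflection failure, and no extra class-name string is needed on the command line.

```scala
import com.twitter.scalding.Args

// Hypothetical trait sketching the proposed ToolJob idea: supply a real
// main method so the job object can be launched directly from `hadoop jar`,
// without reflection and without the redundant class-name argument that
// scalding.Tool requires.
trait ToolJob {
  // Filled in by main before the job body runs; Args parses
  // command-line flags of the form `--key value1 value2 ...`.
  protected var parsedArgs: Args = Args(Nil)

  // The job body; a concrete object would build and run its flow here.
  def run(): Unit

  def main(cmdLine: Array[String]): Unit = {
    parsedArgs = Args(cmdLine)
    run() // any exception here propagates as itself, not a reflection error
  }
}
```

Under this sketch, `object MyJob extends ToolJob` would avoid the `App`/`args` naming collision entirely, since `parsedArgs` (or an `args` method returning `scalding.Args`) is defined by the trait rather than inherited from Scala's `App`.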