IllegalArgumentException: fieldDeclaration must be the same size as the given values #216

Closed
dkincaid opened this issue Nov 26, 2013 · 13 comments


@dkincaid
Contributor

After upgrading to Cascalog 2.0, one query that uses cascalog-checkpoint throws the following exception:

13/11/26 08:56:45 ERROR checkpointed-workflow: Component failed
java.lang.IllegalArgumentException: fieldDeclaration must be the same size as the given values
    at cascalog.ops.KryoInsert.<init>(KryoInsert.java:21)
    at cascalog.cascading.operations$insert_STAR_$fn__5580.invoke(operations.clj:105)
    at cascalog.cascading.operations$each$fn__5554.invoke(operations.clj:64)
    at clojure.lang.AFn.applyToHelper(AFn.java:161)
    at clojure.lang.AFn.applyTo(AFn.java:151)
    at clojure.core$apply.invoke(core.clj:619)
    at clojure.core$update_in.doInvoke(core.clj:5587)
    at clojure.lang.RestFn.invoke(RestFn.java:445)
    at cascalog.cascading.operations$add_op.invoke(operations.clj:42)
    at cascalog.cascading.operations$each.invoke(operations.clj:60)
    at cascalog.cascading.operations$insert_STAR_.doInvoke(operations.clj:105)
    at clojure.lang.RestFn.applyTo(RestFn.java:139)
    at clojure.core$apply.invoke(core.clj:619)
    at cascalog.cascading.operations$insert_subs.invoke(operations.clj:680)
    at cascalog.cascading.operations$with_constants.invoke(operations.clj:687)
    at cascalog.cascading.operations$logically.invoke(operations.clj:738)
    at cascalog.cascading.platform$assem_STAR_$fn__7224.invoke(platform.clj:76)
    at cascalog.cascading.platform$eval7293$fn__7294.invoke(platform.clj:129)
    at clojure.lang.MultiFn.invoke(MultiFn.java:241)
    at cascalog.cascading.platform$eval7454$fn__7456.invoke(platform.clj:219)
    at cascalog.cascading.platform$eval7370$fn__7371$G__7361__7376.invoke(platform.clj:202)
    at cascalog.cascading.platform$compile_query$fn__7477.invoke(platform.clj:305)
    at cascalog.logic.zip$postwalk_edit.doInvoke(zip.clj:56)
    at clojure.lang.RestFn.invoke(RestFn.java:494)
    at cascalog.cascading.platform$compile_query.invoke(platform.clj:303)
    at cascalog.cascading.platform$eval7487$fn__7488.invoke(platform.clj:312)
    at cascalog.cascading.types$eval5409$fn__5410$G__5400__5415.invoke(types.clj:35)
    at cascalog.cascading.operations$add_op.invoke(operations.clj:42)
    at cascalog.cascading.operations$rename_pipe.invoke(operations.clj:76)
    at cascalog.cascading.operations$in_branch.invoke(operations.clj:595)
    at cascalog.cascading.operations$write_STAR_.invoke(operations.clj:602)
    at clojure.core$comp$fn__409.invoke(core.clj:2332)
    at clojure.core$map$fn__470.invoke(core.clj:2492)
    at clojure.lang.LazySeq.sval(LazySeq.java:42)
    at clojure.lang.LazySeq.seq(LazySeq.java:60)
    at clojure.lang.RT.seq(RT.java:484)
    at clojure.core$seq.invoke(core.clj:133)
    at clojure.core.protocols$seq_reduce.invoke(protocols.clj:26)
    at clojure.core.protocols$eval2802$fn__2803.invoke(protocols.clj:53)
    at clojure.core.protocols$eval2735$fn__2736$G__2726__2749.invoke(protocols.clj:13)
    at clojure.core$reduce.invoke(core.clj:6175)
    at cascalog.logic.algebra$sum.invoke(algebra.clj:26)
    at cascalog.cascading.flow$compile_flow.doInvoke(flow.clj:91)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at clojure.core$apply.invoke(core.clj:619)
    at cascalog.api$_QMARK__.doInvoke(api.clj:153)
    at clojure.lang.RestFn.invoke(RestFn.java:436)
    at com.idexx.lambda.hadoop.jobs.patientvisits.summary$launch_workflow$fn__10967.invoke(summary.clj:52)
    at cascalog.checkpoint$mk_runner$fn__8338.invoke(checkpoint.clj:60)
    at clojure.lang.AFn.run(AFn.java:24)
    at java.lang.Thread.run(Thread.java:724)
@chetmancini

@dkincaid Thanks for opening. I'm seeing this same issue come up running my tests.

@Quantisan
Collaborator

do you have the query or, better yet, a unit test?

@dkincaid
Contributor Author

dkincaid commented Dec 9, 2013

The query that I think is throwing the exception is in the Gist at https://gist.github.com/dkincaid/4235b4a4aaa5f95ba6cf

Once you've had a chance to look it over, I'll need to delete it. Thanks.

@Quantisan
Collaborator

that's one long query, could you extract out the problem into a unit test please?

@Quantisan
Collaborator

it's just difficult for me to eyeball the problem since I can't run the code

@dkincaid
Contributor Author

dkincaid commented Dec 9, 2013

Ok. I'll see if I can make it fail with a simpler query.

@Quantisan
Collaborator

I think I know the problem. This happens when a mapfn fails, so the query builder doesn't know how many fields it outputs. In particular, get-in doesn't seem to work because of #217.

@dkincaid could you please confirm whether your stripped-down query with just get-in throws the same exception?

@mping

mping commented Dec 12, 2013

I'm hitting the same error when running this simple gist in the REPL: https://gist.github.com/mping/7931708#comment-968513
I'm using Cascalog 2.0.0.

@Quantisan
Collaborator

@mping @dkincaid this should be fixed by pull request #223. Could you give that patch a try, please?

@mping

mping commented Dec 16, 2013

I'm still hitting the same error :\ I'm a total n00b so I could be doing something wrong.
I cloned the cascalog project, built it, and installed it locally using maven. I double-checked the classpath of my project and it shows cascalog-2.0.1-SNAPSHOT, so I'm guessing it's set up correctly.

@Quantisan
Collaborator

@mping could you write a test case so I can reproduce the error and help you debug it please?

@mping

mping commented Feb 12, 2014

Sorry for the late reply. The test case is here: https://gist.github.com/mping/7931708#comment-968513. I hit the error with that snippet and with similar code. Let me know if you need more info.

@sritchie
Collaborator

Okay, looks like this is fixed:

(def line
  [[(->> ["contentHost" "my.site.org"
          "contentKeywords" "something"
          "contentPath" "/loader.php"
          "contentReferer" "http://www.mysite.org/hr"
          "contentTitle" "myunited"
          "geoCountry" "Croatia"
          "geoCountryId" "HR"
          "userAgentOsName" "Windows"
          "userAgentOsVersion" "Windows 7"
          "userAgentScreenResolution" "1280x800"
          "userAgentUaString" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31"]
         (clojure.string/join "\t"))]])

(defn zipmap-fields [line]
  (apply hash-map (clojure.string/split line #"\t")))

(defn field-values [line ks]
  (let [m (zipmap-fields line)]
    (for [k ks]
      (get m k))))

(??<- [?contentHost ?keywords]
      (line ?line)
      (field-values ?line ["contentHost" "contentKeywords"] :> ?contentHost ?keywords))
;;=> (["my.site.org" "something"])
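
For anyone reproducing this, the helper functions above can also be checked in plain Clojure before involving Cascalog at all. The sketch below is only an illustration: `sample-line`, `requested`, and `result` are hypothetical names, the row is a shortened version of the one above, and `field-values` is wrapped in `vec` so the result prints as a vector. Since the exception in this issue complains about a size mismatch between the declared fields and the given values, the thing worth confirming is that the function returns exactly as many values as there are output variables in the query.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample row: tab-separated key/value pairs (shortened).
(def sample-line
  (str/join "\t" ["contentHost" "my.site.org"
                  "contentKeywords" "something"
                  "contentPath" "/loader.php"]))

;; Same logic as in the thread: alternating keys and values become a map.
(defn zipmap-fields [line]
  (apply hash-map (str/split line #"\t")))

;; Same logic as in the thread, but realized as a vector for easy inspection.
(defn field-values [line ks]
  (let [m (zipmap-fields line)]
    (vec (for [k ks]
           (get m k)))))

;; One requested key per output variable in the query.
(def requested ["contentHost" "contentKeywords"])
(def result (field-values sample-line requested))

(prn result)
;;=> ["my.site.org" "something"]

;; The number of returned values should match the number of requested keys.
(prn (= (count requested) (count result)))
;;=> true
```

If the counts differ here, the mismatch is in the function itself rather than in how the query binds its output variables.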
