Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom loaders can't use functions defined elsewhere #46

Closed
pkozikow opened this issue Aug 21, 2014 · 6 comments
Closed

Custom loaders can't use functions defined elsewhere #46

pkozikow opened this issue Aug 21, 2014 · 6 comments
Labels

Comments

@pkozikow
Copy link
Contributor

The example on the wiki shows this:

(defn square [x]
  (* x x))

(defn baz [data]
  (->> data
    (pig/map square)))

I'm trying to use a function like square above from inside a custom loader (raw/load$), but that results in an "Unable to resolve symbol" exception. Defining the function in a separate namespace and requiring it doesn't work either.

I'd guess this could be solved the same way as for operations like pig/map in the example.

@pkozikow
Copy link
Contributor Author

Update: using functions in a separate ns using the full name works (e.g. my.full.ns/my-func), but one can't use an ns alias (my.full.ns :as x, and then x/my-func)

@mbossenbroek
Copy link
Contributor

What you need is pigpen.code/trap. This function does the job of trapping the lexical scope and current namespace. It re-writes the user function & wraps it as such:

(defn foo [x]
  (let [y 42]
    (pigpen.code/trap (fn [z] (+ x y z)))))

If you call this, it will produce this:

=> (foo 123)

(pigpen.pig/with-ns pigpen-demo.core
  (clojure.core/let [y (quote 42)
                     x (quote 123)]
    (fn [z] (+ x y z))))

Here's how it's used in pigpen.core/map: https://github.com/Netflix/PigPen/blob/master/pigpen-core/src/main/clojure/pigpen/map.clj#L57

This function is generally used within a macro to prevent clojure from evaluating the function. The map example above is a good pattern to use - have the macro only add the code/trap and then call map* with the trapped function.

Here's some skeleton code for a loader that would trap a function:

(defn square [x]
  (* x x))

(defn load-custom* [location f]
  (->
    (raw/load$ location '[value] raw/default-storage {})
    (raw/bind$ `(pigpen.pig/map->bind ~f) {})))

(defmacro load-custom [location f]
  `(load-custom* ~location (pigpen.code/trap ~f)))

Let me know if that helps.

Thanks,
Matt

@pkozikow
Copy link
Contributor Author

Thanks for the prompt response. I'm just getting my head around this, but wouldn't it be possible to make map->bind be a macro that does this so that the user doesn't need to care about it? (just like map).

@mbossenbroek
Copy link
Contributor

It might be possible to clean up the way that custom loaders are implemented (I certainly didn't expect it to be one of the most common things users would do), but you'd still be stuck with using macros & pigpen.code/trap if you need to take a user function as an argument. If you don't, clojure evaluates the function and we get a compiled function, which can't easily be serialized into a pigpen script.

Generally, trap is separate from map->bind because it's used in many other places & it's usually easier to pick which of the bind operations is appropriate for the situation. There's also pigpen.pig/filter->bind and pigpen.pig/mapcat->bind, which convert those types of functions into mapcat-like functions that can be squished into one big mapcat operation. This way it keeps these distinct operations de-complected from one another.

That said, if you've got any ideas for making it cleaner, I'm totally open to making changes to this part of pigpen. I've looked at a number of different loader implementations & haven't seen a good generalization yet.

@pkozikow
Copy link
Contributor Author

Update: I spoke too soon. Using functions from a separate ns as described above only works locally. When I tried it on the cluster, I got a ClassNotFoundException. After some experimenting, the only way I could get it to work is to add :gen-class to that ns with the method I'm using defined with :methods. At that point I gave up and implemented the function as a Java static method. If (when) this comes up again, I'll try to use the trap approach you suggest above.

By the way, this was the last hurdle before being able to run my first PigPen job on our real cluster. Hooray!

@mbossenbroek
Copy link
Contributor

Oh no! You definitely shouldn't have to go to that extent. I thought I had it set up such that if it worked in the repl it would work in the cluster, but it looks like I missed a case. I might have to start using something that will produce a clean, isolated ns when running locally.

pigpen.code/trap is definitely what you're looking for though. Feel free to send me your loader code & I can help you get trap working with it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants