-
-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow to pass parameters to run when called programmatically #18
Comments
I have bene giving the passing-in and returning of parameters some thought too. I have been toying with the idea of extending the DSL language to include “param” and “return” statements. The “param” statement would be used to unpack parameters passed into the script to instance variables. Similarly the “return” statement would pack any instance variables for returning back to the caller. I agree that executing ‘run’ multiple times on the ‘control’ class needs careful consideration. One thought I had, was to parse the DSL and create a class which could then be instantiated and run by calling it’s methods directly. I think this approach would allow the garbage collector to free up unreferenced instances. However, there is a risk that approach this might detract from the simplicity of the code. I have not given much thought to eager loading of the scripts because of the previous consideration. From a SideKiq worker, I call Kiba.run (including the script loading and parsing) each time with the script name and parameters (using my fork). I instantiate a new worker for each different tenant configured on the application. Ideally, it would be nice to initialise (load and parse) all the different scripts and then just call the worker with parameters for the script. I still need to learn how I would try to accomplish this with SideKiq. I assume I would need to add some code to the ‘Sidekiq.configure_server’ block in the initialiser to tell SideKiq which scripts to load and label them for calling later. From: Thibaut Barrère [mailto:notifications@github.com] When calling Kiba.run programmatically (which apparently a growing number of users are doing), it would be very useful to be able to pass parameters to configure the run. Points to be investigated
Related issues
— |
+1 |
@bcherrington thanks for the thoughts, appreciated! I'm leaning toward a kind of In all cases thanks, I'll try to play around with these ideas and see if I can implement a clean, expandable pattern. |
This is useful when doing programmatic calls to `Kiba.parse`. Note that this is experimental and implementation may change.
This so far underlines that it’s probably going to be hard to call `run` and expect to get an output, without calling `parse` each time before.
A first experiment is available. This implementation makes it more or less mandatory to call Maybe a better implementation could rely on some sort of |
I also note that I initially wanted a clear separation between the parsing and |
Closing this one for now, since programmatic calls via API are not the main intended use today. I'll revisit this if/when I officially support programmatic calls. |
@thbar thanks for this great gem! One of our projects was using the
This branch was recently deleted. The dev who worked on the implementation is no longer with us. Can you remind me what this branch accomplished? We're getting the following:
we're running this method in a rake task:
|
@joshRpowell glad you like Kiba, and sorry for your trouble today! Indeed this was an experimental branch which ultimately is not exactly how I want to solve that problem, so I deleted it (but hadn't thought someone would actually be using it). I will make sure to mark such branches explicitely as "experimental" in the future, sorry about that! To solve the problem right now, I suggest you fork Kiba then cherry-pick the commits here, since I think it should mostly be this. Let me know how it goes! |
@joshRpowell another possibility (since I do think you are using Kiba from a background job, right?) would be to rely on the block form of module ETL
module MySync
module_function
# variables could be a string, but also a Sequel connection for instance
def setup(source_file, sequel_connection)
Kiba.parse do
source CSVSource, filename: source_file
# SNIP
destination MyDBDestination, sequel_connection: sequel_connection
end
end
end
end Which you then call with: Kiba.run(ETL::MySync.setup(source_file, sequel_connection)) Please note that in case of errors here, This approach will require a bit of refactoring on your side but it also more future-proof if you are using background jobs. I'll address that in an upcoming blog post - hope this helps though! |
@thbar the cherry-pick worked. back up and running. We are using Kiba from a Sidekiq job. I like your recommended approach and agree, some refactoring is required. I'll be in touch after I give it a try. Thanks again! |
@joshRpowell glad to hear everything's fine now 😄 - you welcome! |
As a heads-up for anyone following this issue, Kiba (from v2 and upward) officially supports the programmatic job declaration which I hinted @joshRpowell about. This is the best way to handle any form of input & output parameters, much more flexible than parsing an external text file. |
When calling
Kiba.run
programmatically (which apparently a growing number of users are doing), it would be very useful to be able to pass parameters to configure the run.Points to be investigated
Control
) should be made public and documented properly, maybe renamed (JobDefinition
?).run
multiple time on the sameControl
, so I must consider the potential issues with this scenario (e.g. if you store some@state
in the control, you'd have to initialize it)Potential work-arounds for now
See 97758bb and ca5240a which may provide a way to do that now.
Related issues
Implementation points to consider
At this point I believe an input is mostly useful at
parse
time, for instance:We should also make sure that there is a way to ultimately pass that from both API calls and command-line.
The text was updated successfully, but these errors were encountered: