Evaluate feasibility of scheduling hoodie pipelines w/o need for external workflow managers #128

Closed
vinothchandar opened this issue Mar 28, 2017 · 3 comments

Comments

@vinothchandar
Member

Per http://spark.apache.org/docs/latest/job-scheduling.html, Spark can already schedule tasks internally (we do this at scale already), and Spark has APIs to request and relinquish executors.

https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/SparkContext.html#requestExecutors(int)

Even for batch pipelines, we would like hoodie pipelines to run efficiently as long-running applications, like Spark Streaming, except that we give up containers when they are not in use.
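To make the "request, run, relinquish" idea concrete, here is a minimal sketch, not a Hudi or Spark implementation. `ExecutorPool`, `FakePool`, and `run_round` are hypothetical names invented for illustration. On a real cluster, `request_executors` would map to `SparkContext.requestExecutors(int)` from the link above, and `kill_executors` would map to what I believe is the companion call, `SparkContext.killExecutors`. The sketch also assumes executors are available as soon as they are requested, which a real cluster manager does not guarantee.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Protocol, TypeVar

T = TypeVar("T")


class ExecutorPool(Protocol):
    """Hypothetical stand-in for the executor calls on a Spark context."""

    def request_executors(self, n: int) -> bool: ...
    def kill_executors(self, executor_ids: List[str]) -> bool: ...
    def executor_ids(self) -> List[str]: ...


@dataclass
class FakePool:
    """In-memory pool so the sketch runs without a cluster."""

    ids: List[str] = field(default_factory=list)
    next_id: int = 0

    def request_executors(self, n: int) -> bool:
        # Pretend the cluster manager grants every request immediately.
        for _ in range(n):
            self.ids.append(f"exec-{self.next_id}")
            self.next_id += 1
        return True

    def kill_executors(self, executor_ids: List[str]) -> bool:
        self.ids = [i for i in self.ids if i not in executor_ids]
        return True

    def executor_ids(self) -> List[str]:
        return list(self.ids)


def run_round(pool: ExecutorPool, work: Callable[[], T], executors_needed: int) -> T:
    """Acquire executors, run one pipeline round, then release what was acquired."""
    before = set(pool.executor_ids())
    pool.request_executors(executors_needed)
    acquired = [i for i in pool.executor_ids() if i not in before]
    try:
        return work()
    finally:
        # Give the containers back even if the round fails.
        pool.kill_executors(acquired)
```

A long-running driver could call `run_round` once per scheduled ingest and hold no executors between rounds, which is the difference from a plain Spark Streaming deployment described above.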

Blocker: #123

@SemanticBeeng

SemanticBeeng commented Apr 2, 2019

Do we want to consider more exactly what this means from the perspective of HudiIO (integration with Apache Beam, how the Spark session is managed for long-running jobs, what "workflow manager" means, etc.)?

From https://issues.apache.org/jira/browse/HUDI-70

"connection management"

See https://github.com/apache/beam/search?q=ProcessContext&unscoped_q=ProcessContext
Can we use something like that to hold the session?

@vinothchandar
Member Author

@SemanticBeeng this is orthogonal to connection management, I believe. This ticket was about figuring out how to deploy hudi in a long-running mode. Some aspects, like giving up containers via dynamic allocation, could still be useful. Let me post this into HUDI-70.
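For reference, a rough sketch of the dynamic allocation settings that let Spark itself release idle containers. The keys are standard Spark configuration properties; the values and the `to_submit_args` helper are assumptions chosen for illustration, not settings taken from any Hudi deployment.

```python
from typing import Dict, List

# Assumed values for illustration; tune them for the actual cluster.
DYNAMIC_ALLOCATION_CONF: Dict[str, str] = {
    "spark.dynamicAllocation.enabled": "true",
    # Allow the application to shrink to zero executors while idle.
    "spark.dynamicAllocation.minExecutors": "0",
    # Release an executor after it has been idle for this long.
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
    # The external shuffle service keeps shuffle files available after an executor is removed.
    "spark.shuffle.service.enabled": "true",
}


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a conf mapping as spark-submit `--conf key=value` arguments."""
    args: List[str] = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args
```

With settings like these, a long-running job can stay up between runs while holding few or no containers.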

@vinothchandar
Member Author

Once we approach Beam, we can consider ProcessContext. For now, I believe SparkStreamingContext or an equivalent would do something similar.

vinishjail97 pushed a commit to vinishjail97/hudi that referenced this issue Dec 15, 2023
Co-authored-by: jainendra tarun <jainendratarun@onehouse.ai>