Support feeding from Spark #9158

Open
lesters opened this issue Apr 23, 2019 · 5 comments

lesters commented Apr 23, 2019

Today, the Hadoop integration tools for Vespa support Hadoop and Pig for feeding and querying Vespa. The Pig feeder is a thin wrapper around the Vespa HTTP client.

We should support feeding directly from Spark as well, so that Spark pipelines do not have to write to HDFS and then run a separate Pig job for the actual feeding. Like the Pig feeder, this could be implemented as a thin wrapper around the HTTP client.
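
To make the "thin wrapper" idea concrete, here is a rough sketch of a per-partition feed function that a Spark job could call. It is not the Vespa HTTP client: it swaps in plain HTTP POSTs against Vespa's `/document/v1` API using only the Python standard library, with no batching, retries, or throttling. The endpoint, namespace, document type, the `id` column, and the function names `build_put_request`, `http_post` and `feed_partition` are all assumptions made for illustration.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Iterable, Iterator, Tuple
from urllib.parse import quote

# Assumed values for illustration only; replace with your own deployment.
VESPA_ENDPOINT = "http://localhost:8080"
NAMESPACE = "mynamespace"
DOCTYPE = "mydoctype"


def build_put_request(endpoint: str, namespace: str, doctype: str,
                      doc_id: str, fields: Dict[str, Any]) -> Tuple[str, bytes]:
    """Build the URL and JSON body for a /document/v1 put of one document."""
    path = "/document/v1/{}/{}/docid/{}".format(
        quote(namespace, safe=""), quote(doctype, safe=""), quote(doc_id, safe=""))
    body = json.dumps({"fields": fields}).encode("utf-8")
    return endpoint.rstrip("/") + path, body


def http_post(url: str, body: bytes) -> int:
    """POST one document and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a real
    feeder would need to catch that and decide whether to retry.
    """
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def feed_partition(rows: Iterable[Dict[str, Any]],
                   send: Callable[[str, bytes], int] = http_post
                   ) -> Iterator[Tuple[str, int]]:
    """Feed every row of one partition and yield (document id, status).

    Each row is a plain dict; the "id" key (an assumption) becomes the
    document id and the remaining keys become document fields.
    """
    for row in rows:
        fields = dict(row)
        doc_id = str(fields.pop("id"))
        url, body = build_put_request(VESPA_ENDPOINT, NAMESPACE, DOCTYPE, doc_id, fields)
        yield doc_id, send(url, body)
```

In a PySpark job this could be wired in with something like `df.rdd.map(lambda r: r.asDict()).mapPartitions(feed_partition).collect()`, so each executor feeds its own partitions without an intermediate write to HDFS. The `send` parameter exists only so the HTTP call can be replaced, for example with a stub in tests or with a client that does proper retries.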

@frodelu frodelu added this to the later milestone Apr 24, 2019
@kkraune kkraune added the HackTogether https://yahoo.github.io/hacktogether/ label Mar 10, 2021
@kkraune kkraune removed the HackTogether https://yahoo.github.io/hacktogether/ label Apr 21, 2021
@prasad-marne

@kkraune I don't see the Hadoop integration anymore. Do we still want Spark support? I would be interested in taking it up.

kkraune commented Oct 9, 2023

Hi, yes that would be a great addition! A good starting point is https://docs.vespa.ai/en/vespa-feed-client.html. Thanks!

@prasad-marne

Great. I will spend some time investigating how we can design a sink in Spark.

tsafacjo commented Nov 1, 2023

Can I take this issue?

kkraune commented Nov 2, 2023

Sure, thanks for contributing! https://github.com/vespa-engine/vespa/blob/master/CONTRIBUTING.md is a good place to start.
