Python implementation of Kafka Streams? #38

Open
dalejin2014 opened this Issue Aug 29, 2016 · 27 comments

dalejin2014 commented Aug 29, 2016

We are interested in using Kafka Streams.
Is it on the roadmap for the Confluent Kafka Python library?

Member

ewencp commented Aug 30, 2016

@dalejin2014 We'd love to have native stream processing libraries in different languages and having really good Kafka clients is the basis for that. That said, we don't have a timeline for adding this yet.

Member

miguno commented Aug 30, 2016

@dalejin2014: As @ewencp mentioned we don't have a timeline yet. The reason for this is that we first want to ensure we have a strong foundation in the form of the Java implementation of Kafka Streams before venturing into non-JVM languages.

That said, of course I took a note of your request. :-)

Do you mind sharing some information about your use case where you'd use Kafka Streams from Python?

@miguno miguno added the question label Aug 30, 2016

@miguno miguno changed the title from kafka streaming to Python implementation of Kafka Streams? Aug 30, 2016

dalejin2014 commented Aug 30, 2016

We are interested in developing a commenting feature similar to Google Docs.
The use case is as follows:

  • users in a thread should be notified of events at a frequency of their choosing (real time, hourly, daily, etc.)

So we are thinking about using Kafka Streams since it provides:

  • windowing
  • group-by
  • accumulation
  • etc
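As a rough illustration of what windowing, group-by, and accumulation mean for the notification use case above, here is a minimal plain-Python sketch. It is not Kafka Streams code; the event shape, the function name `tumbling_window_counts`, and the sample data are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (timestamp_seconds, thread_id, text).
Event = Tuple[int, str, str]


def tumbling_window_counts(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[str, int], int]:
    """Count events per thread in fixed, non-overlapping time windows."""
    counts: Dict[Tuple[str, int], int] = defaultdict(int)
    for timestamp, thread_id, _text in events:
        # Windowing: align the timestamp to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        # Group-by (thread, window) and accumulate a running count.
        counts[(thread_id, window_start)] += 1
    return dict(counts)


# Made-up sample events for an hourly digest.
events = [
    (0, "thread-a", "first comment"),
    (1800, "thread-a", "reply"),
    (3600, "thread-a", "later reply"),
    (3700, "thread-b", "new thread"),
]
hourly = tumbling_window_counts(events, window_seconds=3600)
```

A stream processing library would additionally keep this state fault-tolerant and update it continuously as records arrive, which is the part that is hard to reproduce by hand.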

Is there an easy way to port the features from Java client?

Member

miguno commented Aug 31, 2016

Thanks for sharing the background info @dalejin2014.

> Is there an easy way to port the features from Java client?

It's not super-hard but also not trivial. Also, one would need to continuously maintain any such Kafka Streams libraries for other languages with the same commitment and high quality as the current Kafka Streams library for Java, so "porting" is not a one-off effort but an ongoing time investment. Hence our current decision to focus our efforts first on the Java implementation of Kafka Streams.

spicykaiju commented Oct 11, 2016

+100 :)

zzbennett commented Dec 3, 2016

Kafka Streams for Python would be so amazing. I'm currently evaluating stream processing frameworks and I like what I've been reading about Kafka Streams. My use case is essentially this: I'm laying down the infrastructure to enable real-time analytics and processing of log/event data. The primary users of this data are data scientists who would be standing up their own Kafka Streams apps, mostly for doing transformations, joins, partitioning and windowed analytics. I think Kafka Streams fits this use case nicely since the Streams library eliminates a lot of the boilerplate code involved in configuring Kafka consumers and producers but leaves developers the freedom and flexibility to do lots of cool stuff with the data in each Kafka topic. The only catch is that not many of the data scientists are well versed in Java; our language of choice is Python for almost everything. As much as I like Kafka and as excited as I am about Kafka Streams, getting the data scientists on board with writing Java will be an uphill battle.

With that said, have there been any developments with regard to supporting a Python-based Kafka Streams library?
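For readers unfamiliar with the consumer/producer boilerplate mentioned above, the following is a minimal sketch of a hand-written consume-transform-produce loop. It assumes objects that follow the confluent-kafka-python `Consumer`/`Producer` interface (`poll`, `produce`, `flush`, and `error`/`key`/`value` on messages). The function `run_transform_loop`, the `Fake*` classes, and the topic name are hypothetical stand-ins so the example runs without a broker.

```python
from typing import Callable, Optional


def run_transform_loop(consumer, producer, output_topic: str,
                       transform: Callable[[bytes], Optional[bytes]],
                       max_messages: int) -> int:
    """Poll records, transform each value, and forward non-None results."""
    forwarded = 0
    for _ in range(max_messages):
        msg = consumer.poll(1.0)  # wait up to one second for a record
        if msg is None or msg.error():
            continue  # nothing received, or a broker-side error/event
        result = transform(msg.value())
        if result is not None:
            producer.produce(output_topic, value=result, key=msg.key())
            forwarded += 1
    producer.flush()  # block until queued records are delivered
    return forwarded


# In-memory stand-ins (hypothetical) so the sketch runs without Kafka.
class FakeMessage:
    def __init__(self, key: bytes, value: bytes):
        self._key, self._value = key, value

    def error(self):
        return None

    def key(self) -> bytes:
        return self._key

    def value(self) -> bytes:
        return self._value


class FakeConsumer:
    def __init__(self, messages):
        self._messages = list(messages)

    def poll(self, timeout):
        return self._messages.pop(0) if self._messages else None


class FakeProducer:
    def __init__(self):
        self.sent = []

    def produce(self, topic, value=None, key=None):
        self.sent.append((topic, key, value))

    def flush(self):
        pass


def uppercase_errors(value: bytes) -> Optional[bytes]:
    """Keep only error lines and upper-case them; drop everything else."""
    return value.upper() if b"error" in value else None


consumer = FakeConsumer([
    FakeMessage(b"k1", b"error: disk full"),
    FakeMessage(b"k2", b"info: ok"),
])
producer = FakeProducer()
forwarded = run_transform_loop(consumer, producer, "alerts",
                               uppercase_errors, max_messages=5)
```

Even this stateless loop leaves out offset management, rebalancing, and error handling; joins and windowed aggregations would add state handling on top, which is the work a streams library takes over.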

Member

miguno commented Dec 5, 2016

@zzbennett I hear you, Elizabeth. :-)

Unfortunately our short-term roadmap does not include work on a Python library of Kafka Streams. (We'd definitely welcome contributors though!) Same situation for e.g. kafka-python, a community project.

I'm kinda hesitant to suggest this, but perhaps it would be worth a try to experiment with Jython? IIRC some Ruby users have been experimenting with Kafka Streams' Java library via JRuby. FWIW, there are a few community/external projects already working on various "wrappers" (in a broad sense) for Kafka's Streams and Connect APIs, but they haven't been released yet; I don't remember off the top of my head whether a Python-based one was among them.

zzbennett commented Dec 6, 2016

Thanks for your reply @miguno and thanks for the suggestions. Jython might be a good option for prototyping. I may actually be able to drum up support for Scala-based Streams apps, which would work a bit better with the Java libraries.

As far as contributing, I may even end up putting together a Python port of Kafka Streams for our use cases. Eventually with the help of some collaborators in the Kafka Python community we'd hopefully be able to contribute something upstream. But I suppose we can cross that bridge when we get there. At any rate, thanks again for the help!

murphyke commented Dec 13, 2016

@zzbennett Somebody in my group was talking about working on this also. If you create a repo with issues laying out the work and then solicit help, you may find yourself with some contributors reasonably soon.

zzbennett commented Dec 13, 2016

@murphyke that would be super. I actually just created a repo last weekend to start working on it (https://github.com/python-kafka-streams/python-kafka-streams). I haven't committed any work or created any tickets yet, but hopefully I'll get a chance to do that in the next couple of days. Feel free to send people over there if they are itching to work on it. Once a little momentum gets built up I'll post to some user groups to solicit help.

supertramped commented Dec 26, 2016

@zzbennett I'd love to contribute to the python-kafka-streams repo.

ayanguha commented Mar 22, 2017

I would love to work on this, as well as love the idea itself :)

Wondering if someone has some initial design which I can start working with?

pouledodue commented Jul 8, 2017

so... what's best practice? use Jython?

Member

miguno commented Jul 10, 2017

Jython is one option, yes. And some users are actually running Jython-based Kafka Streams applications in production.

Also: There's an upcoming, community-driven Python implementation of Kafka Streams (a first MVP, so not all features are implemented yet) that will be presented at EuroPython later this month.

llawall commented Jul 12, 2017

The code @miguno is referring to is now on GitHub: https://github.com/wintoncode/winton-kafka-streams

Check it out and get involved with the project!

pouledodue commented Aug 14, 2017

no updates for a month on winton, I hope they continue their good project

pouledodue commented Sep 16, 2017

seems dead unfortunately

pouledodue commented Oct 10, 2017

It would be great to have a bit of help from Confluent on this, given that Python is the most wanted language in 2017 according to Stack Overflow.

Member

miguno commented Oct 10, 2017

@pouledodue: I'd suggest bringing this up at https://github.com/wintoncode/winton-kafka-streams -- the last commit in that project was actually 5 days ago.

rdehouss commented Oct 29, 2017

+1 on this.
Question for the community about renaming the project to a more "standard name": wintoncode/winton-kafka-streams#8

pouledodue commented Feb 21, 2018

At this point I decided to learn the Java ecosystem instead of using a half-baked Python solution.

g-rd commented Jun 22, 2018

Are there any developments on this request? I was so excited about Kafka, but with no Streams API implementation in Python I am unsure now.

Contributor

rnpridgeon commented Jun 22, 2018

@g-rd, as of today we are still tracking interest but it doesn't currently have a place on the roadmap.

pouledodue commented Jun 23, 2018

@g-rd you may look into Apache Pulsar

Member

edenhill commented Jun 24, 2018

g-rd commented Jun 24, 2018

@edenhill I have looked at it already, but it looks to me that this project is either perfect with no further development needed or just not being developed. My guess is that it is not being actively developed.
I am now looking at Apache Pulsar, and I think Pulsar is a better fit for me.

vineetgoel commented Jul 31, 2018

Check out the Kafka Streams-inspired Python stream processing library we just open sourced: https://robinhood.engineering/faust-stream-processing-for-python-a66d3a51212d

dtheodor pushed a commit to dtheodor/confluent-kafka-python that referenced this issue Sep 4, 2018
