Contributing Spark TensorFlow connector to ecosystem #32
Comments
Looks really useful! I'll take a look after the dev summit tomorrow. Don't have cycles at the moment :)
Thanks @jhseu. Enjoy the dev summit :)
Hi @skavulya, this is nice work by you and your team. Could you please help me with some use cases where I can use this integration?
Would it make more sense to integrate/implement this into Spark directly?
Thanks, @dsblr. This library is a connector for importing and exporting data between TensorFlow and Spark. For example, you might do ETL in Spark and then export the data in a format that a TensorFlow program can process. If you would like to run TensorFlow programs using Spark, you can check out Yahoo's TensorFlow on Spark. @thesuperzapper We thought the library might be more applicable to the TensorFlow ecosystem since it builds upon the TensorFlow Hadoop input/output format, but we are open to suggestions. @jhseu Did you get a chance to look at our repo? Please let us know what you think.
@skavulya Thanks for your patience, been busy with other things. Yeah, I took a look and it definitely makes sense to merge here. Please make a pull request and put it under a separate spark directory. We can do code review in the pull request.
Merged
Our team has been working on a Spark TensorFlow connector that we would like to contribute back to the TensorFlow ecosystem. The connector uses the TensorFlow Hadoop input/output format and simplifies importing TFRecords into Spark DataFrames and exporting DataFrames back to TFRecords.
@jhseu, please let us know if this is something you are interested in. We would need some guidance on where to place the library before creating the pull request. We were not sure whether to create a new spark directory at the root of the repo or a new sub-directory under hadoop.
Here is a snippet of code that demonstrates the usage of the library: