Add abstractions for parsing TFRecord Files using tf.Example and tf.io ops #58

Open
dhruvrajan opened this issue May 18, 2020 · 3 comments

Comments

@dhruvrajan
Contributor

System information

  • TensorFlow version (you are using): Latest master of TensorFlow Java
  • Are you willing to contribute it (Yes/No): No (working on other things at the moment)

Describe the feature and the current behavior/state.
Currently in Java, we have access to the core tf.io ops such as tf.parseExample, tf.parseSingleExample, tf.decodeRaw, etc. To serialize TFRecord datasets, or to read in datasets from the tensorflow_datasets buckets, for example, these ops need to be easy to use.

In Python, the relevant abstractions built on top of tf.io are defined in parsing_config.py. Specifically, it would be very helpful to have abstractions such as:

  • The various feature types: FixedLenFeature, SparseFeature, FixedLenSequenceFeature, etc.
  • The _ParseOpParams class, which wraps the parameters to tf.parseExample
  • A standard flow for defining the features in a TFRecord file
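
As a rough illustration of what such abstractions might look like in Java, here is a minimal, dependency-free sketch. Everything in it is hypothetical: `FeatureSpecSketch`, `DType`, `Feature`, and the Java versions of `FixedLenFeature`, `VarLenFeature`, and `ParseOpParams` do not exist in TensorFlow Java, and the constructor argument order is chosen for Java varargs rather than copied from Python. The idea is only that a per-key feature map gets split into the parallel dense/sparse lists that the parse-example ops expect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only: none of these types exist in TensorFlow Java. */
public class FeatureSpecSketch {

  /** Stand-in for a TensorFlow dtype, so the sketch needs no dependencies. */
  enum DType { INT64, FLOAT, STRING }

  /** Marker interface for the kinds of feature a TFRecord schema can declare. */
  interface Feature {}

  /** Dense feature with a known shape (an empty shape means a scalar). */
  static final class FixedLenFeature implements Feature {
    final DType dtype;
    final long[] shape;

    FixedLenFeature(DType dtype, long... shape) {
      this.dtype = dtype;
      this.shape = shape.clone();
    }
  }

  /** Variable-length feature, to be parsed as a sparse tensor. */
  static final class VarLenFeature implements Feature {
    final DType dtype;

    VarLenFeature(DType dtype) {
      this.dtype = dtype;
    }
  }

  /** Splits a feature map into parallel dense and sparse parameter lists. */
  static final class ParseOpParams {
    final List<String> denseKeys = new ArrayList<>();
    final List<DType> denseTypes = new ArrayList<>();
    final List<long[]> denseShapes = new ArrayList<>();
    final List<String> sparseKeys = new ArrayList<>();
    final List<DType> sparseTypes = new ArrayList<>();

    static ParseOpParams fromFeatures(Map<String, Feature> features) {
      ParseOpParams params = new ParseOpParams();
      for (Map.Entry<String, Feature> entry : features.entrySet()) {
        Feature feature = entry.getValue();
        if (feature instanceof FixedLenFeature) {
          FixedLenFeature dense = (FixedLenFeature) feature;
          params.denseKeys.add(entry.getKey());
          params.denseTypes.add(dense.dtype);
          params.denseShapes.add(dense.shape);
        } else if (feature instanceof VarLenFeature) {
          params.sparseKeys.add(entry.getKey());
          params.sparseTypes.add(((VarLenFeature) feature).dtype);
        } else {
          throw new IllegalArgumentException("Unsupported feature: " + entry.getKey());
        }
      }
      return params;
    }
  }

  public static void main(String[] args) {
    // Feature names here are made up for the example.
    Map<String, Feature> features = new LinkedHashMap<>();
    features.put("image", new FixedLenFeature(DType.STRING));
    features.put("label", new FixedLenFeature(DType.INT64));
    features.put("tags", new VarLenFeature(DType.STRING));

    ParseOpParams params = ParseOpParams.fromFeatures(features);
    System.out.println("dense: " + params.denseKeys + ", sparse: " + params.sparseKeys);
  }
}
```

A real version would pass these lists on to the parse-example op and would also need to cover SparseFeature, FixedLenSequenceFeature, and default values, which this sketch leaves out.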

See these examples which relate to using the parse-example ops, and reading TFRecord files

Will this change the current api? How?

This will add APIs for serializing examples to, and parsing examples from, TFRecord files.
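
For context on the serialization side, a TFRecord file is a sequence of framed records: a little-endian uint64 length, a masked CRC32C of the length, the payload bytes, and a masked CRC32C of the payload. The sketch below frames and unframes a single record using only the JDK (`java.util.zip.CRC32C`, Java 9+). The class and method names (`TfRecordFramingSketch`, `frame`, `unframe`) are made up for illustration and are not a TensorFlow Java API; in practice the payload would be a serialized tf.Example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32C;

/** Hypothetical sketch of TFRecord framing; not part of TensorFlow Java. */
public class TfRecordFramingSketch {

  /** CRC32C of the bytes, rotated and offset as the TFRecord format requires. */
  static int maskedCrc(byte[] bytes) {
    CRC32C crc = new CRC32C();
    crc.update(bytes, 0, bytes.length);
    int value = (int) crc.getValue();
    // int arithmetic wraps modulo 2^32, which matches the uint32 definition.
    return ((value >>> 15) | (value << 17)) + 0xa282ead8;
  }

  /** Wraps a payload as: length (8 bytes), length CRC, payload, payload CRC. */
  static byte[] frame(byte[] payload) {
    byte[] length = ByteBuffer.allocate(8)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(payload.length)
        .array();
    return ByteBuffer.allocate(8 + 4 + payload.length + 4)
        .order(ByteOrder.LITTLE_ENDIAN)
        .put(length)
        .putInt(maskedCrc(length))
        .put(payload)
        .putInt(maskedCrc(payload))
        .array();
  }

  /** Reads one framed record back, checking both checksums. */
  static byte[] unframe(byte[] record) {
    ByteBuffer buffer = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
    byte[] length = new byte[8];
    buffer.get(length);
    if (buffer.getInt() != maskedCrc(length)) {
      throw new IllegalArgumentException("Corrupt record: bad length checksum");
    }
    long size = ByteBuffer.wrap(length).order(ByteOrder.LITTLE_ENDIAN).getLong();
    byte[] payload = new byte[(int) size];
    buffer.get(payload);
    if (buffer.getInt() != maskedCrc(payload)) {
      throw new IllegalArgumentException("Corrupt record: bad payload checksum");
    }
    return payload;
  }

  public static void main(String[] args) {
    byte[] payload = "stand-in for a serialized tf.Example".getBytes();
    byte[] record = frame(payload);
    System.out.println("framed " + payload.length + " payload bytes into " + record.length);
    System.out.println("round trip ok: " + java.util.Arrays.equals(payload, unframe(record)));
  }
}
```

Writing a file would then amount to appending one framed record per example; a higher-level API could hide this behind a writer and reader pair.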

Who will benefit with this feature?

Anyone using datasets stored as TFRecord files from TensorFlow Java (for example, to load datasets from the tensorflow_datasets GCP bucket).

Any Other info.

Feel free to get in touch with me anytime to discuss! Happy to help.

@karllessard
Collaborator

Thanks @dhruvrajan ,

Before we begin, can you please take a look at this old example of mine, based on TF1.x, where I use tf.io.parseExample? Here I was using raw ops, so yes, having a higher-level API to wrap them up would be interesting for sure. But I just wanted to show you how I used to do it and to check whether you ended up doing something similar at this stage.

@karllessard
Collaborator

Hey @dhruvrajan, I didn't hear back from you on this point. Are you unblocked, and do you think the example I provided is enough? (If so, I'll move it to our example repository.)

@karllessard
Collaborator

Hi @dhruvrajan, just checking in for a quick update on this: do you have any plans to add something that helps with building a TFRecord? If so, will it be part of the framework or the Keras layer?
