This repository has been archived by the owner on Jul 31, 2023. It is now read-only.

Feature/structured data tutorial #45

Merged: 2 commits merged into dev on Oct 16, 2020

cfezequiel (Contributor)

Description

Add a Jupyter notebook tutorial on converting structured data to TFRecords.
This also changes types.FloatInput to use tf.float32 for its feature_spec
attribute, addressing a potential incompatibility with the tf.float64 type
in TensorFlow Transform.
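
As a rough illustration of what the float32 choice means for stored values (this is not code from the PR): tf.train.Example keeps float features in a FloatList, which holds 32-bit values, so a 64-bit Python float is rounded when it is written to a TFRecord. The sketch below reproduces that rounding using only the standard library. The helper to_float32 and the sample row are hypothetical names, and the PR itself does not say which TensorFlow Transform incompatibility prompted the change.

```python
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float.

    Packing with the "f" format stores the value in 4 bytes, which is the
    same width a float feature has inside a tf.train.Example FloatList.
    """
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Hypothetical structured-data row, e.g. one parsed CSV record.
row = {"price": 0.1, "weight": 0.5}

# Values as they would read back after a 32-bit round trip.
stored = {name: to_float32(value) for name, value in row.items()}

for name in row:
    print(name, row[name], stored[name], row[name] == stored[name])
```

Values that are exactly representable in 32 bits (such as 0.5) come back unchanged, while others (such as 0.1) come back slightly different, so declaring the feature as tf.float32 matches what is actually stored.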

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

Please delete options that are not relevant.

  • My code adheres to the Google Python Style Guide
  • I ran make pylint and code is rated 10/10
  • I have added tests that prove my fix is effective or that my feature works
  • I ran make test and all tests pass
  • I ran the tool and verified the change works
  • I have adequately commented my code, particularly in hard-to-understand areas
  • I have made relevant changes to the documentation, if needed
  • My changes generate no new warnings
  • I have corrected any misspellings in my code

mbernico and others added 2 commits October 16, 2020 13:12
This changes types.FloatInput to use tf.float32 for its feature_spec
attribute to address potential incompatibility with using tf.float64
type in TensorFlow Transform.
mbernico merged commit 2af364d into dev on Oct 16, 2020
mbernico added a commit that referenced this pull request on Nov 4, 2020
* Update check_tfrecords to use new dataset load function.

* Add tfrecord_dir to create_tfrecords output.

* Restructure test image directory to match expected format.

* Feature/dataclass (#44)

* Added data classes for types.

* Checking in progress.

* Checking in more changes.

* Converted types to classes and refactored schema into OO pattern.

* Changed OrderedDict import to support py3.6.

* Updated setup.py for version.

* Fixing setup.py.

* Patched requirements and setup.

* Addressed comments in code review.

* Addressed code comments round 2.

* Refactored IMAGE_CSV_SCHEMA.

* Merged check_test.py from dev

Co-authored-by: Carlos Ezequiel <cezequiel@google.com>

* Feature/structured data tutorial (#45)

* Converted types to classes and refactored schema into OO pattern.

* Add tutorial on structured data conversion.

This changes types.FloatInput to use tf.float32 for its feature_spec
attribute to address potential incompatibility with using tf.float64
type in TensorFlow Transform.

Co-authored-by: Mike Bernico <mikebernico@google.com>

* Update structured data tutorial to use output dir.

* Clarify need for proper header when using create_tfrecords. Fixes #47.

* Clean up README and update image directory notebook.

* Feature/test image dir (#49)

* Restructure test image directory to match expected format.

* Clean up README and update image directory notebook.

* Fix minor issues

* Add an explicit error message for missing train split

* Configure automated tests for Jupyter notebooks.

* Add convert_and_load function.

Also refactor create_tfrecords to convert.

* Refactor check and common modules to utils.

* Add test targets for py files and notebooks.

* Feature/convert and load (#55)

* Add convert_and_load function.

Also refactor create_tfrecords to convert.

* Refactor check and common modules to utils.

* Add test targets for py files and notebooks.

* Update version in setup.py and release notes.

* Fix issues with GCS path parsing.

Co-authored-by: Mike Bernico <mikebernico@google.com>
Co-authored-by: Sergii Khomenko <khomenko@brainscode.com>