@wronk and I think it will be important for the ffda-poi work, as well as for future label-maker users, to have an optional step that converts the NPZ files generated by the label-maker package step to TFRecords.
To address this, we can open a PR that adds an optional step 6 to the label-maker workflow, exposed as a CLI command, label-maker tfrecords. It would write out a .tfrecords file for train, test, and, if applicable, val, using the numpy array representations of the images and labels from label-maker package.
The implementation I'm currently working on requires TensorFlow 2, which has some conflicts with label-maker's requirements.txt. Would it be okay to upgrade these requirements to work with tf2, or do we want to avoid having tensorflow as a dependency?
Happy to approach this a different way, if people have thoughts of something else to try.
@martham93, thinking more about this, I'm not sure we want to add TF as a dependency. I think the options are:
Code this functionality up separately and link to a public example in label-maker's documentation. However, don't explicitly support it in this codebase.
Include the code in label-maker, make requirements updates (maybe in a separate PR), and make the TF functionality optional. We can do this by importing TF only within your data->TFRecord file and listing tf only in the requirements-dev.txt file.
Feel free to continue coding this separately and we can drop in the functionality later depending on what we decide to do.
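For reference, option 2's "optional TF" idea could look roughly like the sketch below. The helper name require_optional and the write_tfrecords entry point are hypothetical and not part of label-maker; the only point is that tensorflow is imported inside the function that needs it, so the rest of the package still works without it.

```python
import importlib


def require_optional(module_name, install_hint):
    """Import an optional dependency on demand, with a helpful error if absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is needed for this step but is not installed. "
            f"{install_hint}"
        ) from err


def write_tfrecords(npz_path, out_dir):
    # Hypothetical entry point for a `label-maker tfrecords` step.
    # TensorFlow is imported here, not at module top level, so every other
    # command keeps working when TensorFlow is not installed.
    tf = require_optional("tensorflow", "Try: pip install -r requirements-dev.txt")
    raise NotImplementedError("the NPZ -> TFRecord conversion would go here")
```

With this layout, only users who actually run the tfrecords step need TF installed.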
Right now the code to convert data.npz is in the ffda-poi repo. To move forward with option 1 above, I can work on a public example (once I've made more progress on other ffda-poi work) and link to it from label-maker's documentation.
A few things to note:
Converting a ~1.8GB data.npz file takes about 3.5 hours to run. The corresponding .tfrecords file will be ~1GB.
Kudos to @wronk for figuring out that the image array needs to be converted to a tensor and then run through the tf.image.encode_png function to reduce its size before being written to a tfrecord. Without this step the data size is huge!
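A minimal sketch of that encode-then-write step, not the actual ffda-poi code. It assumes data.npz stores splits under keys like x_train / y_train (and x_val / y_val when present), that images are uint8 arrays of shape height x width x channels, and that TensorFlow 2 is installed. The function names and the "image" / "label" feature keys are made up for illustration.

```python
import numpy as np


def iter_split(npz_path, split):
    """Yield (image, label) pairs for one split, e.g. 'train', 'test' or 'val'."""
    with np.load(npz_path) as data:
        x_key, y_key = f"x_{split}", f"y_{split}"  # assumed key naming
        if x_key not in data.files:
            return  # split not present (e.g. no validation set)
        images, labels = data[x_key], data[y_key]
    for image, label in zip(images, labels):
        yield image, label


def write_split(npz_path, split, out_path):
    """Write one split to a .tfrecords file and return the number of records."""
    import tensorflow as tf  # imported lazily; needs TF2 eager mode for .numpy()

    count = 0
    with tf.io.TFRecordWriter(out_path) as writer:
        for image, label in iter_split(npz_path, split):
            # Convert to a tensor, then PNG-encode so each record holds
            # compressed bytes instead of the raw array (assumes uint8 input).
            png = tf.io.encode_png(tf.convert_to_tensor(image)).numpy()
            label_values = np.ravel(label).astype(int).tolist()
            feature = {
                "image": tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[png])
                ),
                "label": tf.train.Feature(
                    int64_list=tf.train.Int64List(value=label_values)
                ),
            }
            example = tf.train.Example(features=tf.train.Features(feature=feature))
            writer.write(example.SerializeToString())
            count += 1
    return count
```

Something like write_split("data.npz", "train", "train.tfrecords") would then be called once per split. Note that a missing split still produces an empty output file in this sketch.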
Initially this PR will just address adding tfrecords for classification data, to stay within the ffda-poi scope. I do see that there is already some existing work around tfrecords creation for object detection data: https://github.com/developmentseed/label-maker/blob/master/examples/utils/tf_records_generation.py