Added Caltech10k Web Faces dataset #324
base: master
Conversation
```python
        "landmarks": {name: tf.int64 for name in LANDMARK_HEADINGS}
    }),
    urls=['http://www.vision.caltech.edu/Image_Datasets/Caltech_10K_WebFaces/'],
    citation=""
```
Datasets must have a citation. Please add one.
Added
```python
"""

class Caltech10K_WebFaces(tfds.core.GeneratorBasedBuilder):

  VERSION = tfds.core.Version("0.1.0")
```
Indentation should be 2 spaces, as in the other files.
Done
```python
        "landmarks": {name: tf.int64 for name in LANDMARK_HEADINGS}
    }),
    urls=['http://www.vision.caltech.edu/Image_Datasets/Caltech_10K_WebFaces/'],
    citation=""
```
I think you forgot that
Yes, fixed now
```python
    name=tfds.Split.TRAIN,
    num_shards=10,
    gen_kwargs={
        "file_id": 0,
```
You didn't use `file_id`, so why pass it here?
Forgot to remove it after updating the logic.
```python
    name=tfds.Split.VALIDATION,
    num_shards=4,
    gen_kwargs={
        "extracted_dirs": extracted_dirs,
```
I don't understand. It seems that you are passing the same `gen_kwargs` to your validation and test sets, so all train/valid/test splits will contain the same examples, no?
@Conchylicultor: some incorrect replication of logic on my end. This dataset doesn't come with any original splits, so it will return a single `tfds.Split.TRAIN` split.
Fixed now.
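For reference, a minimal plain-Python sketch of that fix (a stand-in for the tfds API, with assumed names, not the PR's actual code): when a dataset ships no official splits, `_split_generators` should emit just one TRAIN entry.

```python
def split_generators(extracted_dirs):
    """Stand-in for _split_generators (names assumed for illustration).

    Each entry mimics a tfds.core.SplitGenerator as a (name, gen_kwargs)
    pair; with no official splits, only a single TRAIN split is returned.
    """
    return [("train", {"extracted_dirs": extracted_dirs})]
```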
Thanks. Unfortunately your tests are not working (e.g. you don't have landmarks inside the fake data, ...). Please fix, and make sure the tests run correctly.
```python
from __future__ import print_function

import os
import tensorflow as tf
```
`import tensorflow.compat.v2 as tf`
```python
return [
    tfds.core.SplitGenerator(
        name=tfds.Split.TRAIN,
        num_shards=5,
```
Remove this: `num_shards` is no longer needed.
```python
for file_name in sorted(files):
  path = os.path.join(filedir, file_name)

  yield {
```
S3 is the default now, so you have to yield a tuple of (file_name, everything else).
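A minimal sketch of what that looks like, assuming `files` is a list of image file names and `landmarks` maps file name to coordinates (both names are hypothetical, not from the PR):

```python
def generate_examples(files, landmarks):
    # With S3, the builder must yield (key, example) pairs; the file name
    # serves as a unique key for each example.
    for file_name in files:
        yield file_name, {
            "image": file_name,
            "landmarks": landmarks.get(file_name, []),
        }
```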
```python
landmarks = self._process_caltech10k_config_file(landmarks_path)

for file_name in sorted(files):
```
nit: doesn't have to be sorted.
```python
row_values = line.strip().split()
# Each row starts with the 'file_name' and then space-separated values.
values[row_values[0]] = [v for v in row_values[1:]]
return keys, values
```
Please explain more what the "keys" and "values" are here.
The keys should be the same as `LANDMARK_HEADINGS`, right?
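For illustration, a hedged sketch of the parsing being discussed, assuming each row of the ground-truth file looks like `file_name x1 y1 x2 y2 ...` (the heading names and float parsing are assumptions, not taken from the PR):

```python
LANDMARK_HEADINGS = ("lefteye_x", "lefteye_y", "righteye_x", "righteye_y",
                     "nose_x", "nose_y", "mouth_x", "mouth_y")

def parse_landmarks(lines):
    values = {}
    for line in lines:
        row_values = line.strip().split()
        # Each row starts with the file name, then one value per heading.
        values[row_values[0]] = [float(v) for v in row_values[1:]]
    # The keys are the landmark headings; the values map each file name
    # to its list of coordinates, in heading order.
    return LANDMARK_HEADINGS, values
```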
```python
SPLITS = {
    "train": 10,
}
```
Does the test pass? The CSV file is not present in the fake_examples dir.
Dataset Request #318
@rsepassi @cyfra @Conchylicultor