Maths! Tensors! Sciences! #44

lusbenjamin · 2017-09-21T02:39:29Z

Adds the Python math and science dependencies that the cool kids use to do Deep Learning these days.
Adds tensor utilities in anticipation of deep learning:
- pymoji.tensors.one_hot for evaluating Softmax outputs
- pymoji.tensors.random_mini_batches for mini-batch gradient descent
Utilities to read/write large datasets: pymoji.save_dataset and pymoji.load_dataset
Helper functions and CLI command to harvest all of the heads from a directory of previous runs!
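
For readers who have not seen these helpers before, here is a minimal NumPy sketch of what one-hot encoding and mini-batch splitting usually look like. It is not the code in pymoji.tensors: the signatures, the `seed` parameter, the default batch size, and the one-example-per-row layout are all assumptions made for illustration.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels of shape (m,) into an (m, num_classes) 0/1 matrix."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes))
    # Set exactly one column per row: the column named by that row's label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


def random_mini_batches(X, Y, mini_batch_size=64, seed=None):
    """Shuffle X and Y together, then split them into mini-batches.

    Assumes one example per row (an assumption, not the PR's layout).
    The last batch may be smaller than mini_batch_size.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    # Index both arrays with the same permutation so pairs stay aligned.
    X_shuffled, Y_shuffled = X[order], Y[order]
    return [
        (X_shuffled[start:start + mini_batch_size],
         Y_shuffled[start:start + mini_batch_size])
        for start in range(0, X.shape[0], mini_batch_size)
    ]
```

The key detail is that features and labels are shuffled with the same permutation, so each example keeps its label.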

Head Hunting!

- swallows errors file-by-file
- processes all "prior runs" in a directory
- for now, a prior run is an image file that has a corresponding JSON metadata file
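
The behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `find_prior_runs`, `harvest`, the `process` callback, the image suffix set, and the "same file name with a .json extension" pairing rule are all hypothetical.

```python
import json
import logging
from pathlib import Path

# Assumption: which extensions count as images is not stated in the PR.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_prior_runs(directory):
    """Yield (image_path, json_path) for every image with a sibling JSON file."""
    for image_path in sorted(Path(directory).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Assumption: metadata shares the image's name, with a .json suffix.
        json_path = image_path.with_suffix(".json")
        if json_path.is_file():
            yield image_path, json_path


def harvest(directory, process):
    """Call process(image_path, metadata) per prior run; log and skip failures."""
    results, failures = [], []
    for image_path, json_path in find_prior_runs(directory):
        try:
            metadata = json.loads(json_path.read_text())
            results.append(process(image_path, metadata))
        except Exception:  # deliberately broad: one bad file must not stop the run
            logging.exception("skipping %s", image_path)
            failures.append(image_path)
    return results, failures
```

Returning the failed paths alongside the results keeps the "swallow errors" behavior from hiding how many files were skipped.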

lusbenjamin · 2017-09-21T02:46:52Z

pymoji/utils.py

@@ -170,6 +171,36 @@ def load_json(json_stream):
    return result.data


+def json_to_object(name, json_node):
+    """Quick-and-dirty recursive conversion from JSON dictionary data to
+    an object built out of namedtuples.


Mebbe about a 5kyu on codewars? I had to write this to be able to re-use the logic in pymoji.emoji. That logic all relies on face annotation objects with attributes, e.g. face.bounding_box, and I was kinda annoyed to learn there was no way to get Marshmallow to give us back an object instead of a dictionary. There may be a better way to do this?
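
Only the signature and the first docstring lines are visible in the hunk above, so the following is a guess at the shape of such a conversion rather than the PR's actual body. It assumes every dictionary key is a valid Python identifier, and the sample payload in the usage note is made up.

```python
from collections import namedtuple


def json_to_object(name, json_node):
    """Recursively convert parsed JSON (dicts, lists, scalars) into namedtuples."""
    if isinstance(json_node, dict):
        # Each key becomes a field; nested dicts get a type named after their key.
        fields = {key: json_to_object(key, value) for key, value in json_node.items()}
        return namedtuple(name, list(fields))(**fields)
    if isinstance(json_node, list):
        # List items reuse the parent's name for their namedtuple type.
        return [json_to_object(name, item) for item in json_node]
    # Strings, numbers, booleans, and None pass through unchanged.
    return json_node
```

With this sketch, `json_to_object("face", {"bounding_box": {"vertices": [{"x": 1, "y": 2}]}})` supports attribute access such as `face.bounding_box.vertices[0].x`. On the "better way" question: marshmallow's `post_load` hook is the documented route for building objects during deserialization and may be worth a look.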

lusbenjamin · 2017-09-21T02:50:46Z

pymoji/tensors.py

+    labels = []
+
+    def load_heads(input_path):
+        """Helper for iteratively loading input feature data."""


@melodylu this relates to your earlier tussle with nonlocal. This helper also "gets away with it" because it mutates features and labels in place instead of reassigning them.
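
A small standalone illustration of the difference, with made-up function names and toy data rather than the PR's load_heads: a nested function can mutate an enclosing list freely, but rebinding an enclosing name needs `nonlocal`.

```python
def collect_mutating(paths):
    features, labels = [], []

    def load(path):
        # Mutation: append() changes the existing lists, so no declaration is needed.
        features.append(path.upper())
        labels.append(len(path))

    for path in paths:
        load(path)
    return features, labels


def count_rebinding(paths):
    total = 0

    def load(path):
        # Rebinding: without this line, `total += 1` raises UnboundLocalError.
        nonlocal total
        total += 1

    for path in paths:
        load(path)
    return total
```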

melodylu · 2017-09-21T03:04:19Z

pymoji/tensors.py

+
+    for face in faces:
+        # compute label Y
+        code = get_emoji_code(None, face, use_big_guns=False)


update use_big_guns plz 🔫

melodylu

@lusbenjamin
Thanks for the explanation of all the Tensors! Sorry I can't give more helpful ML feedback.

@dnewburger ?

dnewburger

Cool stuff! Can't wait to see how your training goes!

One note is that TF has great tools for image manipulation, and these tools help prevent mistakes when converting from images to tensor representations. I remember hitting a couple of bugs when feeding tensors into my models because I was doing the normalization and flattening myself when creating the datasets, then using TF image tools when training the models. When you start working on the model, I recommend either sticking with PIL throughout or converting all the image processing methods to TF.
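
One way to guard against that kind of mismatch, whichever library ends up doing the work, is to route dataset creation and training through a single preprocessing function. The sketch below uses plain NumPy; the `preprocess` name, the [0, 1] scaling, and the dtype check are illustrative assumptions, not what pymoji.tensors does.

```python
import numpy as np


def preprocess(pixels):
    """Scale raw uint8 pixels of shape (H, W, C) to [0, 1] floats and flatten to 1-D."""
    pixels = np.asarray(pixels)
    if pixels.dtype != np.uint8:
        # Refusing non-uint8 input catches accidental double normalization.
        raise TypeError(
            "expected raw uint8 pixels; got %s (already normalized?)" % pixels.dtype
        )
    return (pixels.astype(np.float32) / 255.0).reshape(-1)
```

Calling the same function from both code paths means the normalization and flattening cannot silently drift apart.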

dnewburger · 2017-09-21T03:39:24Z

pymoji/tensors.py

+# pylint: disable=invalid-name
+
+
+def head_to_ndarray(input_stream, size=HEAD_SIZE):


TensorFlow has some nice image encoding, decoding, and manipulation methods that could save you some headaches. I found that manipulating the images myself led to unexpected bugs, and when it comes to adding more advanced training techniques like changing the image orientation or saturation, you'll probably want to use the TF libraries anyway.

https://www.tensorflow.org/api_guides/python/image

dnewburger · 2017-09-21T03:54:57Z

pymoji/tensors.py

+    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig
+
+
+def random_mini_batches(X, Y, mini_batch_size=MINI_BATCH_SIZE):


You may want to migrate to TensorFlow's queuing system later on, which is one of the worst and best parts of TensorFlow (at least when I was using it). Best because of the convenience, worst because the way it works under the hood is kind of inscrutable.

https://www.tensorflow.org/programmers_guide/threading_and_queues

Advantages include an easy framework for training in parallel and convenience methods for shuffling, combining, and weighting different datasets.

lusbenjamin · 2017-09-22T18:01:35Z

Going to leave this around as a reference for a few days, but will most likely close it in favor of the functionality built into the TensorFlow libraries.

lusbenjamin added 9 commits September 20, 2017 13:21

- f9348cd pylintrc ignores no-member errors in google.cloud, numpy, tensorflow
- 1953404 lint
- c58f241 add python math and science dependencies
- 311be00 utils.json_to_object
- a5acdb8 emoji.extract_head
- 5591f64 constants.TEST_DATASET and TRAIN_DATASET: mostly for convenience
- 8c5ff9c pymoji.tensors: utility module to do lots of the tensor-related data processing for DL
- 2fd8d87 CLI makedata command: uses pymoji.tensors to process a directory and produce a data file
- d7a04d8 lint and test case fix

lusbenjamin commented Sep 21, 2017

melodylu reviewed Sep 21, 2017

melodylu approved these changes Sep 21, 2017

lusbenjamin added 2 commits September 20, 2017 20:25

- 7f505ae Merge branch 'master' into tensors
- de3b1ae bugfix changes from master, feedback from review

dnewburger approved these changes Sep 21, 2017
