Maths! Tensors! Sciences! #44
base: master
Conversation
- mostly for convenience
- utility module to do lots of the tensor-related data processing for DL
- uses pymoji.tensors to process a directory and produce a data file
@@ -170,6 +171,36 @@ def load_json(json_stream):
    return result.data


def json_to_object(name, json_node):
    """Quick-and-dirty recursive conversion from JSON dictionary data to
    an object built out of namedtuples."""
Mebbe about a 5kyu on codewars? I had to write this to be able to re-use the logic in pymoji.emoji. It all relies on face annotation objects with attributes, e.g. face.bounding_box, whereas I was kinda annoyed to learn there was no way to get Marshmallow to give us back an object instead of a dictionary. There may be a better way to do this?
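For readers following along, the conversion described here can be sketched roughly like this — a hypothetical re-implementation, not the actual pymoji code, which may differ in details:

```python
from collections import namedtuple

def json_to_object(name, json_node):
    """Recursively convert parsed JSON into nested namedtuple objects.

    Dicts become namedtuples (so keys turn into attributes like
    face.bounding_box), lists are converted element-wise, and
    primitives pass through unchanged.
    """
    if isinstance(json_node, dict):
        fields = sorted(json_node.keys())
        cls = namedtuple(name, fields)
        return cls(**{key: json_to_object(key, json_node[key]) for key in fields})
    if isinstance(json_node, list):
        return [json_to_object(name, item) for item in json_node]
    return json_node

face = json_to_object('Face', {'bounding_box': {'x': 1, 'y': 2}, 'joy': 0.9})
```

Note that this only works when every JSON key is a valid Python identifier; keys with dashes or leading digits would need sanitizing first.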
labels = []


def load_heads(input_path):
    """Helper for iteratively loading input feature data."""
@melodylu this relates to your earlier tussle with nonlocal. This helper also "gets away with it" because it mutates features and labels instead of setting them.
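To spell out the distinction for anyone skimming: mutating an object reached through an enclosing-scope name never rebinds that name, so no `nonlocal` declaration is needed. A minimal illustration (made-up names, not the pymoji code):

```python
def build_dataset():
    features = []
    labels = []

    def load_heads(items):
        # append() mutates the list objects the enclosing names point at;
        # the names themselves are never rebound, so `nonlocal` is not needed.
        for item in items:
            features.append(item * 2)
            labels.append(item % 2)
        # Rebinding instead (e.g. `features = features + [...]`) would raise
        # UnboundLocalError unless `nonlocal features` were declared first.

    load_heads([1, 2, 3])
    return features, labels
```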
pymoji/tensors.py
Outdated
for face in faces:
    # compute label Y
    code = get_emoji_code(None, face, use_big_guns=False)
update use_big_guns plz 🔫
@lusbenjamin Thanks for the explanation of all the Tensors! Sorry I can't give more helpful ML feedback.
Cool stuff! Can't wait to see how your training goes!
One note is that TF has great tools for image manipulation, and these tools help prevent mistakes when converting from images to tensor representations. I remember having a couple of bugs when feeding tensors into my models because I was doing the normalization and flattening myself when creating the datasets, then using TF image tools when training the models. I recommend that when you start working on the model, you either keep using PIL or convert all the image processing methods to TF.
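The bug class being warned about comes down to consistency: whatever dtype, scaling, and flattening order the dataset builder uses must match what the model sees at training time. A NumPy sketch of the manual pipeline (illustrative only, not the pymoji code):

```python
import numpy as np

def image_to_flat_tensor(image):
    """Normalize a uint8 (H, W, C) image to [0, 1] floats and flatten it.

    The key point: the same dtype cast, the same /255 scaling, and the
    same (row-major) flattening order must be applied both when building
    datasets and when feeding the trained model. Mixing this manual path
    with TF's own image ops is exactly where mismatches creep in.
    """
    scaled = image.astype(np.float32) / 255.0  # [0, 255] -> [0.0, 1.0]
    return scaled.reshape(-1)                  # row-major (C-order) flatten

img = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
flat = image_to_flat_tensor(img)
```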
# pylint: disable=invalid-name


def head_to_ndarray(input_stream, size=HEAD_SIZE):
Tensorflow has some nice image encoding, decoding, and manipulation methods that could save you some headache. I found manipulating the images myself led to unexpected bugs, and, when it comes to adding more advanced training techniques like changing the image orientation or saturation, you'll probably want to use the tf libraries anyway.
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig


def random_mini_batches(X, Y, mini_batch_size=MINI_BATCH_SIZE):
def random_mini_batches(X, Y, mini_batch_size=MINI_BATCH_SIZE): |
You may want to migrate to TensorFlow's queuing system later on, which is one of the worst and best parts of TensorFlow (at least when I was using it). Best because of the convenience, worst because the way it works under the hood is kind of inscrutable.
https://www.tensorflow.org/programmers_guide/threading_and_queues
Advantages include an easy framework for training in parallel and convenience methods for shuffling, combining, and weighting different datasets.
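For reference while this helper is still around, the conventional shape of a random_mini_batches implementation looks something like the following — a sketch in the columns-are-examples convention, which may not match the PR's actual code:

```python
import numpy as np

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    """Shuffle (X, Y) in unison, then split into mini-batches.

    X has shape (features, m) and Y shape (labels, m), with one example
    per column. The same permutation is applied to both so each example
    keeps its label; the last batch may be smaller than mini_batch_size.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[1]                       # number of examples
    perm = rng.permutation(m)            # one shared shuffle for X and Y
    X_shuf, Y_shuf = X[:, perm], Y[:, perm]
    return [
        (X_shuf[:, k:k + mini_batch_size], Y_shuf[:, k:k + mini_batch_size])
        for k in range(0, m, mini_batch_size)
    ]

X = np.arange(10).reshape(1, 10)
Y = np.arange(10).reshape(1, 10)
batches = random_mini_batches(X, Y, mini_batch_size=4)
```

TF's input pipeline utilities subsume exactly this shuffle-and-batch logic, which is part of why closing the PR in their favor makes sense.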
Going to leave this around as a reference for a few days, but most likely going to ultimately close this in favor of using functionality built in to the TensorFlow libraries.
- pymoji.tensors.one_hot for evaluating Softmax outputs
- pymoji.tensors.random_mini_batches for batch gradient descent
- pymoji.save_dataset and pymoji.load_dataset

Head Hunting!
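The one_hot helper mentioned above presumably maps integer class labels to one-hot columns for comparison against Softmax outputs. A minimal NumPy sketch of that idea (hypothetical signature, not necessarily the pymoji.tensors.one_hot API):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Convert integer class labels to a one-hot matrix.

    Returns shape (num_classes, len(labels)): column j is the standard
    basis vector for labels[j], the usual layout when comparing against
    Softmax output columns.
    """
    return np.eye(num_classes)[:, labels]

encoded = one_hot(np.array([0, 2, 1]), 3)
```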