RFC: TensorFlow API symbols and namespaces #16
Conversation
…o, mentioning that we want to move TensorFlow debugger to tf.debugging.
rfcs/20180827-api-names.md
Outdated
We plan to add the following additional namespaces:

**tf.probability** - replaces `tf.distributions` namespace and will contain other ops related to statistics.
Can you share any examples of "other ops related to statistics" that would fit in here? So far, the list below only contains functionality currently in tf.distributions.
I think "distributions" is slightly more descriptive than "probability" as a namespace for distribution objects. I'm also concerned about potential confusion with the TensorFlow probability project (https://github.com/tensorflow/probability), which currently uses the tensorflow_probability namespace.
Keeping these symbols in tf.distributions sounds good to me. I updated the doc accordingly.
I don't remember seeing any that are not in tf.distributions. This definition just included any other future ops we might add.
…t move to tf.probability.*
rfcs/20180827-api-names.md
Outdated
| tf.newaxis | tf.manip.newaxis |
| tf.nn.in_top_k | tf.math.in_top_k |
| tf.nn.l2_normalize | tf.math.l2_normalize, tf.linalg.l2_normalize |
| tf.nn.log_poisson_loss | tf.losses.log_poisson_loss |
log_poisson_loss feels like a slightly strange fit for tf.losses, given that it's a low-level function that doesn't adhere to the tf.losses interface (e.g., with the loss_collection and reduction arguments).
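For context on why this is a "low-level" function: it computes an element-wise Poisson negative log-likelihood, with no reduction or collection bookkeeping. A minimal pure-Python sketch of that computation (ignoring the optional Stirling-approximation term; this is an illustration, not TensorFlow's actual implementation):

```python
import math

def log_poisson_loss(targets, log_input):
    """Element-wise Poisson negative log-likelihood.

    For c = log_input (the log of the Poisson rate) and z = targets:
        loss = exp(c) - z * c
    which is -log P(z | rate=exp(c)) up to the constant log(z!).
    """
    return [math.exp(c) - z * c for z, c in zip(targets, log_input)]

# One loss value per element; no reduction is applied.
print(log_poisson_loss([1.0, 2.0], [0.0, 0.0]))  # [1.0, 1.0]
```

Note the result is a per-element tensor, not a scalar, which is part of why it sits awkwardly next to the reduced, collection-aware functions in tf.losses.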
| tf.nn.log_uniform_candidate_sampler | tf.random.log_uniform_candidate_sampler |
| tf.nn.sigmoid | tf.math.sigmoid |
| tf.nn.softmax | tf.math.softmax |
| tf.nn.top_k | tf.math.top_k |
If tf.argmin and tf.argmax are in the top level namespace it might make sense to have top_k there as well.
I removed deprecation of tf.argmin from the doc since tf.argmax is not deprecated. Makes sense to keep both of them in root if one is in root.
However, tf.nn.top_k was never added to root namespace before. We want to avoid growing the root namespace (unless we are adding a very common op). The canonical endpoints for these symbols are: tf.math.top_k, tf.math.argmin, tf.math.argmax. (i.e. all under same module - tf.math)
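The "one canonical endpoint plus aliases" scheme described here can be pictured with a small registry sketch. This is a hypothetical toy (the names `export` and `ENDPOINTS` are invented for illustration and this is not TensorFlow's actual export machinery), but it shows the intent: a single function object reachable under several dotted names, with the canonical one listed first:

```python
# Hypothetical endpoint registry: one function object, several dotted names.
ENDPOINTS = {}

def export(*names):
    """Register a symbol under a canonical name plus optional aliases."""
    def decorator(fn):
        for name in names:
            ENDPOINTS[name] = fn
        return fn
    return decorator

@export("math.top_k", "nn.top_k")  # canonical first, legacy alias second
def top_k(values, k=1):
    """Toy stand-in: return the k largest values in descending order."""
    return sorted(values, reverse=True)[:k]

# Both endpoints resolve to the very same function object.
assert ENDPOINTS["math.top_k"] is ENDPOINTS["nn.top_k"]
print(ENDPOINTS["math.top_k"]([3, 1, 4, 1, 5], k=2))  # [5, 4]
```

Keeping `tf.math.top_k`, `tf.math.argmin`, and `tf.math.argmax` in one module then amounts to choosing which registered name is documented as canonical.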
Is this RFC only about Python? I'd like TF 2.0 to version libtensorflow.so and _cc.so etc. properly (e.g., they need to be named libtensorflow.so.2.0), and the linker scripts should set versions on the symbols so that ABI compatibility is easier in the future. Should that be discussed somewhere else?
Let's keep this about Python, and possibly start a discussion about .so on build@.
# Appendix 2: Deprecated Endpoints
Should we deprecate the tf.logging module?
We clearly need some facility that TF can use internally, but I'm not sure that other people need to use the same. Maybe we keep a single endpoint with which users can get the logger, and then they use it via their own import of Python's logging?
Just a suggestion, triggered by tensorflow/tensorflow#21930, which made clear to me that we're just providing a wrapper around logging without adding any real value.
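The suggestion above amounts to users configuring TensorFlow's logger through the stdlib directly. A minimal sketch, assuming "tensorflow" is the logger name TF registers (a plausible assumption for illustration, not something this thread confirms):

```python
import logging

# Grab a named logger; "tensorflow" is an assumed logger name here.
logger = logging.getLogger("tensorflow")
logger.setLevel(logging.INFO)

# Attach a handler with a custom format, all through stdlib logging.
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)

logger.info("no tf.logging wrapper needed")
```

Everything here is vanilla `logging`; the only TF-specific piece a user would need is the logger name, which supports the point that the wrapper adds little value.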
Just talked about this with @tensorflow/api-owners and we do want to remove tf.logging from the public API (we will continue to use it for TF itself, but it's unclear what value, if any, it provides to users over just using Python's logging module).
I added a note about deprecating tf.logging under "Deprecated namespaces"
| tf.saved_model.tag_constants.GPU | tf.saved_model.GPU |
| tf.saved_model.tag_constants.SERVING | tf.saved_model.SERVING |
| tf.saved_model.tag_constants.TPU | tf.saved_model.TPU |
| tf.saved_model.tag_constants.TRAINING | tf.saved_model.TRAINING |
The lack of tag_constants here makes the meaning of these less clear. Is the primary goal reducing verbosity? Maybe .tags., or GPU_TAG even? Not sure.
Pasting @martinwicke's response here to preserve the coherence of the thread: The primary reason is to reduce incidental complexity. This is particularly important for signature_def_utils and signature_constants, which are very redundant. For tag_constants, how about making it a class, not a module? tf.saved_model.Tags.GPU (and yeah, shorter, as you suggest as well)?
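The "class, not a module" idea could look like the following minimal sketch. It is hypothetical (the shipped API ultimately took a different shape), and the string values are the ones I believe `tag_constants` uses, asserted here only as an assumption:

```python
class Tags:
    """SavedModel tag constants grouped in a class instead of a module.

    String values assumed to mirror tf.saved_model.tag_constants.
    """
    GPU = "gpu"
    TPU = "tpu"
    SERVING = "serve"
    TRAINING = "train"

# Usage would read tf.saved_model.Tags.SERVING rather than
# tf.saved_model.tag_constants.SERVING.
print(Tags.SERVING)
```

A class namespace keeps the constants grouped and discoverable (one name in `dir(tf.saved_model)`) without the verbosity of a whole `tag_constants` submodule.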
To be perfectly honest, I am one of the world-experts in these sets of constants at this point, and I still forget what the difference between tag_constants and signature_constants is, and which is used where. Let me think about these a little, and see if I have a better suggestion.
Note also that I will not be upset if we remove either (signature_constants seems the most likely) or even both. Not 100% sure what we would lose.
…metrics, tf.keras.layers and deprecated symbols under tf.layers and tf.logging. I also removed tf.log_poisson_loss changes according to the comment as well as tf.global_norm - still need to decide where it needs to go.
I added some big changes to the doc, specifically:
@annarev: is there any particular reason why Is the rationale to make users switch to
…f.bitcast in tf.dtypes.bitcast to match tf.cast changes
…g that signatures of tf.layers, tf.losses and tf.metrics will likely change when we add additional endpoints under tf.keras.layers to match existing functions under tf.keras better
@omoindrot I updated the doc just now to keep layers under tf.layers which would be aliases for tf.keras.layers. Right now we have layers under tf.layers and tf.keras.layers. We want to keep one consistent set of layers. The plan is to add them to tf.keras.layers, but we can keep aliases in tf.layers as well. However, signatures will likely change since we want all layers under tf.keras.layers to have a consistent interface.
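The aliasing plan described above means both dotted paths would resolve to the identical class objects, not two parallel implementations. A toy sketch with stand-in modules (the module and class names here are placeholders, not TF's real objects):

```python
import types

# Stand-in modules for tf.keras.layers and tf.layers.
keras_layers = types.ModuleType("tf.keras.layers")
layers = types.ModuleType("tf.layers")

class Dense:
    """Placeholder for a layer class defined once under the Keras namespace."""
    def __init__(self, units):
        self.units = units

keras_layers.Dense = Dense
layers.Dense = keras_layers.Dense  # an alias, not a copy

# One set of layers, two endpoints: identity, not mere equality.
assert layers.Dense is keras_layers.Dense
print(layers.Dense(units=8).units)  # 8
```

Because the alias is the same object, any signature change made under tf.keras.layers is automatically visible through tf.layers as well, which is exactly the behavior the comment describes.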
I added 2 more new endpoints:

Given these recent changes, I am extending review by 2 more days (until Friday).
Not sure if it is related to API and namespace, though I am wondering if we have a plan for data format types? In the existing APIs, there are several ways of representing the data format:
@perfinion let's talk about that on sig-build

@yongtang that's a good point. This RFC is mainly about the top-level symbols, but we should talk about this. If 'channels-*' is sufficient I would prefer that, but it's not clear to me that it is. What we can definitely do is align these wherever we can. If HW vs WH is actually used, we should likely give up on the simpler symbols and just use that. @fchollet FYI.
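One way to "align these wherever we can" would be a small helper that normalizes the various spellings onto one canonical 'channels_*' form. This is a hypothetical helper written for illustration, not an existing TF API:

```python
# Hypothetical mapping from spellings found across the API
# to one canonical 'channels_*' form.
_CANONICAL = {
    "NHWC": "channels_last",
    "NCHW": "channels_first",
    "channels_last": "channels_last",
    "channels_first": "channels_first",
}

def normalize_data_format(fmt):
    """Map any accepted data-format spelling to its canonical form."""
    try:
        return _CANONICAL[fmt]
    except KeyError:
        raise ValueError(f"unknown data format: {fmt!r}")

print(normalize_data_format("NHWC"))  # channels_last
```

As the thread notes, this only works if 'channels-*' is expressive enough; formats that need an explicit axis order (HW vs WH) would not collapse cleanly into this two-way mapping.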
rfcs/20180827-api-names.md
Outdated
* Remove all endpoints that have been moved to `tf.quantization` namespace.
* Remove all endpoints that have been moved to `tf.random` namespace.
* Remove all endpoints from `tf.logging`.
* Keep most symbols in `tf.manip` in root namespace.
Should we bother making manip? It's a mixture of set_ops and array_ops, and since we're keeping all of them (most of them?) in the root anyway, how about this: tf.sets.* (only) for the set ops currently in manip, and tf.* (only) for the tensor manipulation ops currently in manip?
There is also manip_ops:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/manip_ops.cc
It currently only has 1 op (tf.manip.roll).
We also have tf.manip already:
https://www.tensorflow.org/api_docs/python/tf/manip
That being said, tf.manip.roll is not that frequently used and I think most references to other ops in tf.manip still refer to these ops in root. So, we can probably deprecate tf.manip. It is just a bit strange since we only created it recently.
Yes, I know... I was looking through the list though, and it's a weird module. I think removing it would be an improvement. I don't think moving roll to tf.roll would be a problem.
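For reference, roll shifts elements forward along an axis and wraps them around. A pure-Python sketch of the 1-D case (an illustration of the semantics, not TF's implementation):

```python
def roll(values, shift):
    """Roll a 1-D sequence: element i moves to index (i + shift) % n."""
    n = len(values)
    if n == 0:
        return list(values)
    shift %= n
    return list(values[-shift:]) + list(values[:-shift])

print(roll([1, 2, 3, 4, 5], shift=2))  # [4, 5, 1, 2, 3]
```

Since this is the module's only op, promoting it to a root-level tf.roll is a small change, which is why removing tf.manip looks cheap.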
+1 for not bothering with tf.manip
…milar to tf.sparse_tensor_to_dense except the latter takes SparseTensor as argument.
I updated the document to deprecate the tf.manip namespace.
@annarev do you intend to have a design review meeting for this document?
…ange is according to tensorflow/community#16. PiperOrigin-RevId: 214680285
@ewilderj sorry for a late reply. We are not planning to have a design review meeting.
…ing to tensorflow/community#16. In addition to the changes in the doc, I made the following updates (these changes make sense to me and I didn't notice them when compiling the doc): * deprecate saved_model.builder.SavedModelBuilder - replaced with saved_model.SavedModelBuilder * deprecate python_io.tf_record_iterator - replaced with io.tf_record_iterator * deprecate python_io.TFRecordWriter - replaced with io.TFRecordWriter * move reduce_join to tf.string PiperOrigin-RevId: 215253944
…changes are made according to tensorflow/community#16. I am keeping a few symbols deprecated not mentioned in the doc: tf.diag - it seems best to keep it next to tf.linalg.diag, so that the two are easy to compare and decide which one to use. The plan is to rename tf.diag to tf.tensor_diag. tf.is_nan - similar to tf.is_inf, tf.is_finite, tf.is_numeric_tensor which are all getting deprecated and replaced by symbols in tf.debugging. tf.string_to_number - other string endpoints in root namespace are getting deprecated: for e.g. tf.substr, tf.string_join. tf.dequantize - all quantization ops should be under tf.quantize. I probably missed this one. tf.check_numerics - similar to other debugging ops that are getting moved to tf.debugging. tf.squared_difference - moved to tf.math namespace and not as popular as some other math ops such as tf.add to justify keeping endpoint in root. tf.decode_raw - similar to other ops such as tf.decode_csv that are getting moved to tf.io.decode_csv. PiperOrigin-RevId: 216278010
Hi @annarev, is this proposal finalized, and do you know when the final date is? Asking since there are pending PRs adding ops in TensorFlow.js
| tf.complex | tf.dtypes.complex |
| tf.complex128 | tf.dtypes.complex128 |
| tf.complex64 | tf.dtypes.complex64 |
| tf.confusion_matrix | tf.train.confusion_matrix |
Why move confusion_matrix under tf.train.*? Confusion matrix doesn't intrinsically have to do with training. One can evaluate a model's confusion matrix even without training the model. The tf.train.* namespace seems to be mainly about optimizers, checkpoints, etc., which are very different things from a confusion matrix.
I think confusion matrix belongs more under the tf.math.* namespace. Also, tf.metrics.* may seem appropriate at first glance, but keep in mind that a confusion matrix is not a single scalar, unlike all the things we have under tf.metrics.* right now.
+1
Thank you for suggestions! I already made changes in this doc, but we can definitely add more endpoints and deprecate tf.train.confusion_matrix in TensorFlow 2.0.
tf.metrics.* seems to be a good match. Let me confirm if adding tf.metrics.confusion_matrix would be ok.
fyi, I moved confusion_matrix to tf.math.confusion_matrix.
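What this op computes can be sketched in a few lines of pure Python: labels index rows, predictions index columns, and each (label, prediction) pair bumps one count. A minimal sketch of the semantics, not TF's implementation:

```python
def confusion_matrix(labels, predictions, num_classes):
    """Count matrix where entry [i][j] is how often label i was predicted as j."""
    m = [[0] * num_classes for _ in range(num_classes)]
    for y, p in zip(labels, predictions):
        m[y][p] += 1
    return m

print(confusion_matrix([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3))
# [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
```

The result is a num_classes x num_classes matrix rather than a scalar, which is the earlier comment's argument for placing it under tf.math.* instead of tf.metrics.*.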
@dsmilkov This document has been finalized, but I can still make changes if a different name makes sense.
We plan to add the following additional namespaces:

**tf.random** - will contain random sampling ops.
**tf.keras.layers** - will contain all symbols that are currently under `tf.layers`. Note that signatures of these symbols will likely change to match layers under tf.keras.layers better.
@martinwicke: Should this be rephrased (as should other references to tf.layers), given that we do plan to remove the functional layer calls and also the variable-scope related wrapping of tf.keras.Layers?
Yes. I do think though that in addition to this rename RFC, we'll need another to publish the more detailed changes we're discussing.
tf.train.confusion_matrix. This change is based on feedback at tensorflow/tfjs#771 (comment) tensorflow/community#16 PiperOrigin-RevId: 218216013
(tf.keras.layers.Foo should be used instead of tf.layers.Foo or tf.layers.foo as per tensorflow/community#16) PiperOrigin-RevId: 218273732
| tf.scalar_mul | tf.math.scalar_mul |
| tf.serialize_many_sparse | tf.io.serialize_many_sparse |
| tf.serialize_sparse | tf.io.serialize_sparse |
| tf.set_random_seed | tf.random.set_random_seed |
Shouldn't this be tf.random.set_seed?
Review open for comments until 2018-09-11
TensorFlow Namespaces
Objective
This document proposes organizing TensorFlow API symbols into corresponding logical subnamespaces. As the TensorFlow library grows, it is important to structure namespaces clearly for easier discoverability and usability.
We define an endpoint as a full name that can be used to access a TensorFlow symbol. For example, name_scope can be accessed using either tf.name_scope or tf.keras.backend.name_scope. Therefore, name_scope has 2 endpoints: tf.name_scope and tf.keras.backend.name_scope.

At a high level we have the following goals: