Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: TensorFlow API symbols and namespaces #16

Merged
merged 10 commits into from
Nov 8, 2018
Merged

RFC: TensorFlow API symbols and namespaces #16

merged 10 commits into from
Nov 8, 2018

Conversation

annarev
Copy link
Contributor

@annarev annarev commented Aug 27, 2018

Review open for comments until 2018-09-11

TensorFlow Namespaces

Status Accepted
Author(s) Anna Revinskaya (annarev@google.com), Andrew Selle (aselle@google.com)
Sponsor Martin Wicke (wicke@google.com)
Updated 2018-08-27

Objective

This document proposes organizing TensorFlow API symbols in corresponding logical subnamespaces. As TensorFlow library grows, it is important to structure namespaces in a clear way for easier discoverability and usability.

We define endpoint as full name that can be used to access a TensorFlow
symbol. For e.g. name_scope can be accessed using either tf.name_scope or tf.keras.backend.name_scope. Therefore, name_scope has 2 endpoints: tf.name_scope and tf.keras.backend.name_scope.

At a high level we have the following goals:

  • Add a few additional namespaces.
  • Add additional endpoints for TensorFlow symbols in relevant namespaces.
  • Remove some of the existing endpoints.

@ewilderj ewilderj changed the title RFC for TensorFlow API names changes RFC: TensorFlow API symbols and namespaces Aug 27, 2018
@ewilderj ewilderj added this to Needs attention in RFC management Aug 27, 2018
…o, mentioning that we want to move TensorFlow debugger to tf.debugging.
@martinwicke martinwicke added the 2.0 TensorFlow 2.0 development label Aug 27, 2018
@ewilderj ewilderj added the RFC: Proposed RFC Design Document label Aug 27, 2018

We plan to add the following additional namespaces:

**tf.probability** - replaces `tf.distributions` namespace and will contain other ops related to statistics.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share any examples of "other ops related to statistics" that would fit in here? So far, the list below only contains functionality currently in tf.distributions.

I think "distributions" is slightly more descriptive than "probability" as a namespace for distribution objects. I'm also concerned about potential confusion with the TensorFlow probability project (https://github.com/tensorflow/probability), which currently uses the tensorflow_probability namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping these symbols in tf.distributions sounds good to me. I updated the doc accordingly.

I don't remember seeing any that are not in tf.distributions. This definition just included any other future ops we might add.

@ewilderj ewilderj moved this from Needs attention to Open reviews in RFC management Aug 27, 2018
| tf.newaxis | tf.manip.newaxis |
| tf.nn.in_top_k | tf.math.in_top_k |
| tf.nn.l2_normalize | tf.math.l2_normalize, tf.linalg.l2_normalize |
| tf.nn.log_poisson_loss | tf.losses.log_poisson_loss |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log_poisson_losss feels like a slightly strange fit for tf.losses, given that it's a low-level function that doesn't adhere to the tf.losses interface (e.g., with the loss_collection and reduction arguments).

| tf.nn.log_uniform_candidate_sampler | tf.random.log_uniform_candidate_sampler |
| tf.nn.sigmoid | tf.math.sigmoid |
| tf.nn.softmax | tf.math.softmax |
| tf.nn.top_k | tf.math.top_k |
Copy link

@georgedahl georgedahl Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If tf.argmin and tf.argmax are in the top level namespace it might make sense to have top_k there as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed deprecation of tf.argmin from the doc since tf.argmax is not deprecated. Makes sense to keep both of them in root if one is in root.

However, tf.nn.top_k was never added to root namespace before. We want to avoid growing the root namespace (unless we are adding a very common op). The canonical endpoints for these symbols are: tf.math.top_k, tf.math.argmin, tf.math.argmax. (i.e. all under same module - tf.math)

@perfinion
Copy link
Member

Is this RFC only about python? I'd like TF 2.0 to version the libtensorflow.so and _cc.so etc properly. (eg they need to be named libtensorflow.so.2.0) and the linker scripts should set versions on the symbols so in future ABI compat is easier. Should that be discussed somewhere else?

@martinwicke
Copy link
Member

Let's keep this about Python, and possibly start a discussion about .so on build@.




# Appendix 2: Deprecated Endpoints
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we deprecate the tf.logging module?

We clearly need some facility that TF can use internally, but I'm not sure that other people need to use the same? Maybe we keep a single endpoint with which users can get the logger, and then use that using their own import of python's logging?

Just a suggestion, triggered by tensorflow/tensorflow#21930, which made clear to me that we're just providing a wrapper around logging without adding any real value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just talked about this with @tensorflow/api-owners and we do want to remove tf.logging from the public API (we will continue to use it for TF itself, but it's unclear what value, if any, it provides to users over just using Python's logging module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a note about deprecating tf.logging under "Deprecated namespaces"

| tf.saved_model.tag_constants.GPU | tf.saved_model.GPU |
| tf.saved_model.tag_constants.SERVING | tf.saved_model.SERVING |
| tf.saved_model.tag_constants.TPU | tf.saved_model.TPU |
| tf.saved_model.tag_constants.TRAINING | tf.saved_model.TRAINING |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lack of tag_constants here makes the meaning of these less clear. Is the primary goal reducing verbosity? Maybe .tags., or GPU_TAG even? Not sure.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pasting @martinwicke 's response here to preserve the coherence of the thread: The primary reason is to reduce incidental complexity. This is particular
important for signature_def_utils and signature_constants, which are
very redundant. For tag_constants, how about making it a class, not a
module? tf.saved_model.Tags.GPU (and yeah, shorter, as you suggest as well)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be perfectly honest, I am one of the world-experts in these sets of constants at this point, and I still forget what the difference between tag_constants and signature_constants is, and which is used where. Let me think about these a little, and see if I have a better suggestion.

@martinwicke
Copy link
Member

martinwicke commented Sep 5, 2018 via email

@martinwicke
Copy link
Member

martinwicke commented Sep 5, 2018 via email

…metrics, tf.keras.layers and deprecated symbols under tf.layers and tf.logging. I also removed tf.log_poisson_loss changes according to the comment as well as tf.global_norm - still need to decide where it needs to go.
@annarev
Copy link
Contributor Author

annarev commented Sep 7, 2018

I added some big changes to the doc, specifically:

  • All endpoints under tf.losses and tf.metrics will be available under tf.keras.losses and tf.keras.metrics as well.
  • tf.layers will be deprecated and replaced by tf.keras.layers.
  • tf.logging will be deprecated. Python logging module can be used instead.

@omoindrot
Copy link

@annarev : is there any particular reason why tf.layers is replaced by tf.keras.layers (and why we can't keep both) ?

Is the rationale to make users switch to tf.keras whenever they want to build a model?

…f.bitcast in tf.dtypes.bitcast to match tf.cast changes
…g that signatures of tf.layers, tf.losses and tf.metrics will likely change when we add additional endpoints under tf.keras.layers to match existing functions under tf.keras better
@annarev
Copy link
Contributor Author

annarev commented Sep 12, 2018

@omoindrot I updated the doc just now to keep layers under tf.layers which would be aliases for tf.keras.layers. Right now we have layers under tf.layers and tf.keras.layers. We want to keep one consistent set of layers. The plan is to add them to tf.keras.layers, but we can keep aliases in tf.layers as well.

However, signatures will likely change since we want all layers under tf.keras.layers to have consistent interface.

@annarev
Copy link
Contributor Author

annarev commented Sep 12, 2018

I added 2 more new endpoints:
tf.bitcast - proposed additional endpoint tf.dtypes.bitcast to match tf.dtypes.cast.
tf.global_norm - proposed additional endpoint tf.linalg.global_norm. We also want to deprecate tf.global_norm endpoint, so that tf.linalg.global_norm is the only available endpoint.

Given these recent changes, I am extending review by 2 more days (until Friday).

@yongtang
Copy link
Member

Not sure if it is related to api and namespace, though I am wondering if we have a plan for data format types? In the existing APIs, there are several ways of representing the data format: channels_first/channels_last/NCHW/NHWC/NCW/NWC. Sometimes they adds complexity to both implementation and to documentation. Would data format be covered as well?

@martinwicke
Copy link
Member

@perfinion let's talk about that on sig-build

@yongtang that's a good point. This RFC is mainly about the top-level symbols but we should talk about this. If 'channels-*' is sufficient I would prefer that, but it's not clear to me that it is. What we can definitely do is align these wherever we can. If HW vs WH is actually used, we should likely give up on the simpler symbols and just use that. @fchollet FYI.

* Remove all endpoints that have been moved to `tf.quantization` namespace.
* Remove all endpoints that have been moved to `tf.random` namespace.
* Remove all endpoints from `tf.logging`.
* Keeps most symbols in `tf.manip` in root namespace.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we bother making manip? It's a mixture of set_ops and array_ops, and since we're keeping all of them (most of them?) in the root anyway, how about this:

tf.sets.* (only) for the set ops currently in manip
tf.* (only) for the tensor manipulation ops currently in manip

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also manip_ops:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/manip_ops.cc
It currently only has 1 op (tf.manip.roll).
We also have tf.manip already:
https://www.tensorflow.org/api_docs/python/tf/manip

That being said, tf.manip.roll is not that frequently used and I think most references to other ops in tf.manip are still referring to these ops in root. So, we can probably deprecate tf.manip. It is just a bit strange since we just create it recently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know... I was looking through the list though, and it's a weird module. I think removing it would be an improvement. I don't think moving roll to tf.roll would be a problem.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for not bothering with tf.manipt

…milar to tf.sparse_tensor_to_dense except the latter takes SparseTensor as argument.
@annarev
Copy link
Contributor Author

annarev commented Sep 18, 2018

I updated the document to deprecate tf.manip namespace.
I also added deprecation for tf.sparse_to_dense. It is similar to tf.sparse_tensor_to_dense except the latter takes SparseTensor as input. So, calls to tf.sparse_to_dense can be replaced by creating SparseTensor and passing it to tf.sparse_tensor_to_dense.

@ewilderj ewilderj moved this from Open reviews to Awaiting Committee in RFC management Sep 18, 2018
@ewilderj
Copy link
Contributor

@annarev do you intend to have a design review meeting for this document?

tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Sep 26, 2018
@annarev
Copy link
Contributor Author

annarev commented Sep 27, 2018

@ewilderj sorry for a late reply. We are not planning to have a design review meeting.

tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 1, 2018
…ing to

tensorflow/community#16.
In addition to the changes in the doc, I made the following updates (these changes make sense to me and I didn't notice them when compiling the doc):
* deprecate saved_model.builder.SavedModelBuilder - replaced with saved_model.SavedModelBuilder
* deprecate python_io.tf_record_iterator - replaced with io.tf_record_iterator
* deprecate python_io.TFRecordWriter - replaced with io.TFRecordWriter
* move reduce_join to tf.string

PiperOrigin-RevId: 215253944
tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 9, 2018
…changes

are made according to tensorflow/community#16.

I am keeping a few symbols deprecated not mentioned in the doc:
tf.diag - it seems best to keep it next to tf.linalg.diag, so that the two are easy to compare and decide which one to use. The plan is to rename tf.diag to tf.tensor_diag.
tf.is_nan - similar to tf.is_inf, tf.is_finite, tf.is_numeric_tensor which are all getting deprecated and replaced by symbols in tf.debugging.
tf.string_to_number - other string endpoints in root namespace are getting deprecated: for e.g. tf.substr, tf.string_join.
tf.dequantize - all quantization ops should be under tf.quantize. I probably missed this one.
tf.check_numerics - similar to other debugging ops that are getting moved to tf.debugging.
tf.squared_difference - moved to tf.math namespace and not as popular as some other math ops such as tf.add to justify keeping endpoint in root.
tf.decode_raw - similar to other ops such as tf.decode_csv that are getting moved to tf.io.decode_csv.

PiperOrigin-RevId: 216278010
benjamintanweihao pushed a commit to benjamintanweihao/tensorflow that referenced this pull request Oct 12, 2018
…changes

are made according to tensorflow/community#16.

I am keeping a few symbols deprecated not mentioned in the doc:
tf.diag - it seems best to keep it next to tf.linalg.diag, so that the two are easy to compare and decide which one to use. The plan is to rename tf.diag to tf.tensor_diag.
tf.is_nan - similar to tf.is_inf, tf.is_finite, tf.is_numeric_tensor which are all getting deprecated and replaced by symbols in tf.debugging.
tf.string_to_number - other string endpoints in root namespace are getting deprecated: for e.g. tf.substr, tf.string_join.
tf.dequantize - all quantization ops should be under tf.quantize. I probably missed this one.
tf.check_numerics - similar to other debugging ops that are getting moved to tf.debugging.
tf.squared_difference - moved to tf.math namespace and not as popular as some other math ops such as tf.add to justify keeping endpoint in root.
tf.decode_raw - similar to other ops such as tf.decode_csv that are getting moved to tf.io.decode_csv.

PiperOrigin-RevId: 216278010
@dsmilkov
Copy link

Hi @annarev, is this proposal finalized, and do you know when is the final date? Asking since there is pending PRs adding ops in TensorFlow.js
(e.g. confusion_matrix) and we want to make sure we are aligned with the proposed namespaces here.

| tf.complex | tf.dtypes.complex |
| tf.complex128 | tf.dtypes.complex128 |
| tf.complex64 | tf.dtypes.complex64 |
| tf.confusion_matrix | tf.train.confusion_matrix |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why move confusion_matrix under tf.train.*? Confusion matrix doesn't have to do with training intrinsically. One can evaluate a model's confusion matrix even without training the model. The tf.train.* namespace seems to be mainly about optimizers and checkpoints, etc, which are very different things from confusion matrix.

I think confusion matrix belong more under the tf.math.* namespace. Also, tf.metrics.* may seem appropriate at first glance, but keep in mind that confusion matrix is not a single scalar, unlike all the things we have under `tf.metrics.* right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for suggestions! I already made changes in this doc, but we can definitely add more endpoints and deprecate tf.train.confusion_matrix in TensorFlow 2.0.

tf.metrics.* seems to be a good match. Let me confirm if adding tf.metrics.confusion_matrix would be ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi, I moved confusion_matrix to tf.math.confusion_matrix.

@annarev
Copy link
Contributor Author

annarev commented Oct 15, 2018

@dsmilkov This document has been finalized, but I can still make changes if a different name makes sense.

We plan to add the following additional namespaces:

**tf.random** - will contain random sampling ops.
**tf.keras.layers** - will contain all symbols that are currently under `tf.layers`. Note that signatures of these symbols will likely change to match layers under tf.keras.layers better.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinwicke : Should this be rephrased (as are other references to tf.layers, since we do plan to remove the functional layer calls and also the variable-scope related wrapping of tf.keras.Layers?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I do think though that in addition to this rename RFC, we'll need another to publish the more detailed changes we're discussing.

tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 22, 2018
tf.train.confusion_matrix.
This change is based on feedback at
tensorflow/tfjs#771 (comment)
tensorflow/community#16

PiperOrigin-RevId: 218216013
tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 23, 2018
(tf.keras.layers.Foo should be used instead of tf.layers.Foo or tf.layers.foo
as per tensorflow/community#16)

PiperOrigin-RevId: 218273732
@goldiegadde goldiegadde added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Nov 8, 2018
@goldiegadde goldiegadde merged commit c0494d4 into tensorflow:master Nov 8, 2018
RFC management automation moved this from Awaiting Committee Notes to Accepted RFCs Nov 8, 2018
benjamintanweihao pushed a commit to benjamintanweihao/tensorflow that referenced this pull request Dec 5, 2018
…changes

are made according to tensorflow/community#16.

I am keeping a few symbols deprecated not mentioned in the doc:
tf.diag - it seems best to keep it next to tf.linalg.diag, so that the two are easy to compare and decide which one to use. The plan is to rename tf.diag to tf.tensor_diag.
tf.is_nan - similar to tf.is_inf, tf.is_finite, tf.is_numeric_tensor which are all getting deprecated and replaced by symbols in tf.debugging.
tf.string_to_number - other string endpoints in root namespace are getting deprecated: for e.g. tf.substr, tf.string_join.
tf.dequantize - all quantization ops should be under tf.quantize. I probably missed this one.
tf.check_numerics - similar to other debugging ops that are getting moved to tf.debugging.
tf.squared_difference - moved to tf.math namespace and not as popular as some other math ops such as tf.add to justify keeping endpoint in root.
tf.decode_raw - similar to other ops such as tf.decode_csv that are getting moved to tf.io.decode_csv.

PiperOrigin-RevId: 216278010
| tf.scalar_mul | tf.math.scalar_mul |
| tf.serialize_many_sparse | tf.io.serialize_many_sparse |
| tf.serialize_sparse | tf.io.serialize_sparse |
| tf.set_random_seed | tf.random.set_random_seed |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be tf.random.set_seed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.0 TensorFlow 2.0 development RFC: Accepted RFC Design Document: Accepted by Review
Projects
RFC management
  
Accepted RFCs
Development

Successfully merging this pull request may close these issues.

None yet