API usage logging within TorchVision #5052

# Goal

To understand TorchVision usage within an organization (e.g. Meta).

The events give insight into how torchvision is used at the level of individual call sites, workflows, etc. The organization can also learn which APIs are trending, which can guide component development and deprecation.

# Policy
* Usage should be recorded only once per process for the same API.
* Events should be recorded as broadly as possible. Duplicated events (e.g. a module and a function logging the same thing) are OK and can be deduplicated in downstream pipelines.
* For modules, API usage should be recorded at the beginning of the constructor of the main class: for example, in `__init__` of `RegNet`, but not in those of its submodules (e.g. `ResBottleneckBlock`).
* For functions, API usage should be recorded at the beginning of the function.
* For `torchvision.io`, logging must be added on both the Python and the C++ side (using the `csrc` submodule described under Event Format).
* For `torchvision.ops`, calls should be added both to the main class of the operator (e.g. `StochasticDepth`) and to its functional equivalent (e.g. `stochastic_depth`), if available.
* For `torchvision.transforms`, calls should be placed in the constructors of the Transform classes and the Auto-Augment classes, and in the functional methods.
* For `torchvision.datasets`, the call is placed once in the constructor of `VisionDataset`, so it does not need to be added to each dataset individually.
* For `torchvision.utils`, a call should be added at the top of each public function.
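
The placement rules above can be sketched without torchvision at all. In the minimal example below, `Net`, `Block`, `scale`, and the in-memory `EVENTS` sink are hypothetical names used only for illustration, and the stand-in `_log_api_usage_once` is an assumption rather than the real helper. The example shows only where the calls go: first statement of the main class constructor, none in the submodule, first statement of the function.

```python
from types import FunctionType
from typing import Any, List

EVENTS: List[str] = []  # hypothetical in-memory sink, for illustration only


def _log_api_usage_once(obj: Any) -> None:
    """Stand-in logger: records each event name at most once per process."""
    name = obj.__name__ if isinstance(obj, FunctionType) else type(obj).__name__
    if name not in EVENTS:
        EVENTS.append(name)


class Block:
    """Submodule: no logging call here."""

    def __init__(self, width: int) -> None:
        self.width = width


class Net:
    """Main class: the logging call is the first statement of the constructor."""

    def __init__(self, depth: int) -> None:
        _log_api_usage_once(self)
        self.blocks = [Block(16) for _ in range(depth)]


def scale(x: float, factor: float) -> float:
    """Functional API: the logging call is the first statement of the function."""
    _log_api_usage_once(scale)
    return x * factor


Net(depth=3)
Net(depth=2)  # second construction does not add another event
scale(2.0, 0.5)
print(EVENTS)  # ['Net', 'scale']
```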

# Event Format
The fully qualified name of the component is logged as the event, for example `torchvision.models.resnet.ResNet`.

Note: for events from C++ APIs, `.csrc` should be added after `torchvision`, for example `torchvision.csrc.ops.nms.nms`.

# Usage Log API

* C++: `C10_LOG_API_USAGE_ONCE()`
* Python:
```python
from ..utils import _log_api_usage_once

# For a class: call at the top of __init__
_log_api_usage_once(self)

# For a function: skip logging while scripting or tracing
if not torch.jit.is_scripting() and not torch.jit.is_tracing():
    _log_api_usage_once(nms)
```

The above APIs are lightweight. By default, they are just a [no-op](https://github.com/pytorch/pytorch/blob/c236247826bbf49d4f491d97bac6df1da6c1abe8/c10/util/Logging.cpp#L102-L106). It’s guaranteed that the same event is recorded only once within a process. Because the guarantee is per process, a job running on 8 GPUs (typically one process per GPU) will still produce 8 events, so events should be deduplicated by a unique identifier such as the workflow job_id.
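
As an illustration of the downstream deduplication step, here is a minimal sketch. The record schema (`job_id`, `rank`, `name`) and the function `dedup_by_job` are assumptions made for this example, not part of any existing pipeline.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class UsageEvent(NamedTuple):
    """Hypothetical record emitted by one process of a job."""

    job_id: str
    rank: int
    name: str


def dedup_by_job(events: Iterable[UsageEvent]) -> List[Tuple[str, str]]:
    """Keep one (job_id, event name) pair per job, preserving first-seen order."""
    seen: Set[Tuple[str, str]] = set()
    unique: List[Tuple[str, str]] = []
    for event in events:
        key = (event.job_id, event.name)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# One job on 8 processes logs the same API 8 times; a second job logs it once.
events = [UsageEvent("job-1", rank, "torchvision.models.resnet.ResNet") for rank in range(8)]
events.append(UsageEvent("job-2", 0, "torchvision.models.resnet.ResNet"))

unique = dedup_by_job(events)
print(len(events), len(unique))  # 9 2
```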

# Implementation
```python
from types import FunctionType
from typing import Any

import torch


def _log_api_usage_once(obj: Any) -> None:
    # Only objects defined inside torchvision are logged.
    if not obj.__module__.startswith("torchvision"):
        return
    name = obj.__class__.__name__
    if isinstance(obj, FunctionType):
        name = obj.__name__
    torch._C._log_api_usage_once(f"{obj.__module__}.{name}")
```
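
To see which event names this logic produces without importing torch, the sketch below reuses the same naming rules but sends the result to a caller-supplied sink. `make_logger`, the fake `ResNet` class, and the overridden `__module__` values are assumptions for illustration only. Deduplication is not shown here, since in the real helper it is provided by `torch._C._log_api_usage_once`.

```python
from types import FunctionType
from typing import Any, Callable, List


def make_logger(sink: Callable[[str], None]) -> Callable[[Any], None]:
    """Build a logger with the same naming logic, sending event names to `sink`."""

    def _log(obj: Any) -> None:
        if not obj.__module__.startswith("torchvision"):
            return
        name = obj.__class__.__name__
        if isinstance(obj, FunctionType):
            name = obj.__name__
        sink(f"{obj.__module__}.{name}")

    return _log


class ResNet:
    # Pretend this class is defined in torchvision (illustration only).
    __module__ = "torchvision.models.resnet"


def nms() -> None:
    pass


def helper() -> None:
    pass


# Override the modules so the example does not depend on where it is run.
nms.__module__ = "torchvision.ops.boxes"
helper.__module__ = "my_project.utils"

logged: List[str] = []
log = make_logger(logged.append)

log(ResNet())  # instance: module of the class + class name
log(nms)       # function: module of the function + function name
log(helper)    # outside torchvision: ignored
print(logged)  # ['torchvision.models.resnet.ResNet', 'torchvision.ops.boxes.nms']
```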

# Also considered
* Log usage in a base class
  * Create a base class for all models, datasets and transforms, and log usage in the `__init__` of the base class.
  * Introducing an extra abstraction only for logging seems overkill. In #4569, we couldn't find any other features to add to a model base class. In addition, we would still need a way to log non-class usage.
* Use a decorator
  * For example: `@log_api_usage` in #4976
  * Doesn’t work with TorchScript, since the decorator needs to use kwargs, which are not supported in TorchScript.
* Use the function’s `__module__`
  * For example: `_log_api_usage(nms.__module__, "nms")`
  * Doesn’t work with TorchScript: attribute lookup is not defined on functions.
* Use a global constant for the module
  * For example: `_log_api_usage(MODULE, "nms")`
  * Doesn’t work with TorchScript.
* Use a flat namespace
  * For example: log events as "torchvision.{models|transforms|datasets}.{class or function name}"
  * There might be name collisions.
* Use the object or function as the parameter of the logging API
  * For example: `_log_api_usage_once(self)`
  * Doesn’t work with TorchScript.
* Log the fully qualified name, using `__qualname__` for classes and a string for functions
  * For example: #5096
  * Too cumbersome for functions.
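
For context on the decorator alternative, a generic logging decorator in plain Python might look like the sketch below. This is not the code from #4976; `log_api_usage`, `LOGGED`, and `area` are hypothetical names. The point is the `*args, **kwargs` wrapper signature, which is the part the list above identifies as incompatible with TorchScript.

```python
import functools
from typing import Any, Callable, List

LOGGED: List[str] = []  # hypothetical in-memory sink, for illustration only


def log_api_usage(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record the wrapped function's qualified name the first time it is called."""

    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        # The generic *args/**kwargs forwarding is what makes this approach
        # a poor fit for functions that must remain scriptable.
        name = f"{fn.__module__}.{fn.__name__}"
        if name not in LOGGED:
            LOGGED.append(name)
        return fn(*args, **kwargs)

    return wrapper


@log_api_usage
def area(width: float, height: float) -> float:
    return width * height


area(2.0, 3.0)
area(width=1.0, height=4.0)
print(len(LOGGED))  # 1
```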
