# Short intro to the SCT library of AutoGraph

**Work in progress, use with care and expect changes.**

The `pyct` module packages the source code transformation (SCT) APIs used by AutoGraph.

This tutorial is just a preview: there is no pip package yet, and the API has not been finalized, although most of the APIs shown here are quite stable.

[Run in Colab](https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/pyct_tutorial.ipynb)

Requires `tf-nightly`:

In [0]:
!pip install tf-nightly

### Writing a custom code generator

[transformer.CodeGenerator](https://github.com/tensorflow/tensorflow/blob/40802bcdb5c8a4379da2145441f51051402bd29b/tensorflow/python/autograph/pyct/transformer.py#L480) is an AST visitor that outputs a string. This makes it useful in the final stage of translating Python to another language.

Here's a toy C++ code generator written using a `transformer.CodeGenerator`, which is just a fancy subclass of [ast.NodeVisitor](https://docs.python.org/3/library/ast.html#ast.NodeVisitor):

In [0]:
import gast
from tensorflow.python.autograph.pyct import transformer

class BasicCppCodegen(transformer.CodeGenerator):

  def visit_Name(self, node):
    self.emit(node.id)

  def visit_arguments(self, node):
    # Emit a comma-separated parameter list; an empty list emits nothing.
    for i, arg in enumerate(node.args):
      if i > 0:
        self.emit(', ')
      self.visit(arg)

  def visit_FunctionDef(self, node):
    self.emit('void {}'.format(node.name))
    self.emit('(')
    self.visit(node.args)
    self.emit(') {\n')
    self.visit_block(node.body)
    self.emit('\n}')

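  def visit_Call(self, node):
    # Toy assumption: the callee is a plain name such as `print`.
    self.emit(node.func.id)
    self.emit('(')
    # Emit a comma-separated argument list; an empty list emits nothing.
    for i, arg in enumerate(node.args):
      if i > 0:
        self.emit(', ')
      self.visit(arg)
    self.emit(');')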


Let's try it on a simple function:

In [0]:
def f(x, y):
  print(x, y)

First, parse the Python code into an AST. This is easily done with standard libraries, but [parser.parse_entity](https://github.com/tensorflow/tensorflow/blob/40802bcdb5c8a4379da2145441f51051402bd29b/tensorflow/python/autograph/pyct/parser.py#L182) makes it a single call. It returns a [gast](https://github.com/serge-sans-paille/gast) AST, so you don't have to worry about the Python version:

In [0]:
from tensorflow.python.autograph.pyct import parser

node, source = parser.parse_entity(f, ())
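For comparison, here is a rough sketch of the "standard libraries" route mentioned above, using only `ast` and `textwrap`. It assumes the source is already available as a string. With a live function you would typically obtain it from `inspect.getsource`, which is omitted here so the snippet runs anywhere. It produces a plain `ast` tree, not a `gast` one.

```python
import ast
import textwrap

# Source text of the function. With a live function object this would
# usually come from inspect.getsource(f).
source = textwrap.dedent('''
    def f(x, y):
      print(x, y)
''')

# ast.parse returns a Module node; the function definition is its first child.
module = ast.parse(source)
fn_node = module.body[0]

print(type(fn_node).__name__, fn_node.name)
print([arg.arg for arg in fn_node.args.args])
```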

There are a couple of context objects that most transformer objects like `CodeGenerator` use.

Of note here is `EntityInfo.namespace`, which contains the runtime values for all the global and closure names that the function has access to. Inside a transformer object, this is available under `self.ctx.info.namespace`.

For example, if a function uses NumPy, its namespace will typically include `'np'`.

In [0]:
from tensorflow.python.autograph.pyct import inspect_utils

f_info = transformer.EntityInfo(
    name='f',
    source_code=source,
    source_file=None,
    future_features=(),
    namespace=inspect_utils.getnamespace(f))
ctx = transformer.Context(f_info, None, None)
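To make the idea of a namespace more concrete, here is a standard-library sketch that collects the global and closure names visible to a function. The helper `sketch_namespace` is hypothetical and only approximates what `inspect_utils.getnamespace` might do. It is not the library's implementation.

```python
import math


def sketch_namespace(fn):
    """Collects global and closure values visible to `fn` (illustrative only)."""
    namespace = dict(fn.__globals__)
    freevars = fn.__code__.co_freevars
    closure = fn.__closure__ or ()
    for name, cell in zip(freevars, closure):
        try:
            namespace[name] = cell.cell_contents
        except ValueError:
            # The closure cell exists but has not been assigned yet.
            pass
    return namespace


def make_scaler(factor):
    def scale(x):
        return math.floor(x * factor)
    return scale


ns = sketch_namespace(make_scaler(3))
print(ns['factor'], ns['math'] is math)
```

Here `factor` comes from the closure of `scale`, while `math` comes from the module globals, much like `'np'` would for a function that uses NumPy.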

Finally, it's just a matter of running the generator:

In [0]:
codegen = BasicCppCodegen(ctx)
codegen.visit(node)

print(codegen.code_buffer)
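If you don't have `tf-nightly` at hand, the same emit-into-a-buffer pattern can be sketched with the standard `ast` module alone. The class below is a hypothetical stand-in, not part of `pyct`. Its `emit` method and `code_buffer` attribute merely imitate the names used above.

```python
import ast


class TinyCppCodegen(ast.NodeVisitor):
    """Stdlib-only imitation of a string-emitting AST visitor."""

    def __init__(self):
        self.code_buffer = ''

    def emit(self, text):
        self.code_buffer += text

    def visit_FunctionDef(self, node):
        params = ', '.join(arg.arg for arg in node.args.args)
        self.emit('void {}({}) {{\n'.format(node.name, params))
        for stmt in node.body:
            self.visit(stmt)
        self.emit('\n}')

    def visit_Expr(self, node):
        # An expression statement becomes a C++ statement.
        self.visit(node.value)
        self.emit(';')

    def visit_Call(self, node):
        # Toy assumption: the callee is a plain name.
        self.emit(node.func.id + '(')
        for i, arg in enumerate(node.args):
            if i > 0:
                self.emit(', ')
            self.visit(arg)
        self.emit(')')

    def visit_Name(self, node):
        self.emit(node.id)


tree = ast.parse('def f(x, y):\n    print(x, y)\n')
codegen = TinyCppCodegen()
codegen.visit(tree)
cpp_source = codegen.code_buffer
print(cpp_source)
```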

### Helpful static analysis passes

The `static_analysis` module contains various helper passes for dataflow analysis.

All these passes annotate the AST. These annotations can be extracted using [anno.getanno](https://github.com/tensorflow/tensorflow/blob/40802bcdb5c8a4379da2145441f51051402bd29b/tensorflow/python/autograph/pyct/anno.py#L111). Most of them rely on the `qual_names` annotations, which just simplify the way more complex identifiers like `a.b.c` are accessed.

The most useful is the activity analysis, which inventories the symbols that are read, modified, etc.:

In [0]:
def get_node_and_ctx(f):
  node, source = parser.parse_entity(f, ())
  f_info = transformer.EntityInfo(
    name=f.__name__,
    source_code=source,
    source_file=None,
    future_features=(),
    namespace=None)
  ctx = transformer.Context(f_info, None, None)
  return node, ctx

In [0]:
from tensorflow.python.autograph.pyct import anno
from tensorflow.python.autograph.pyct import qual_names
from tensorflow.python.autograph.pyct.static_analysis import annos
from tensorflow.python.autograph.pyct.static_analysis import activity


def f(a):
  b = a + 1
  return b


node, ctx = get_node_and_ctx(f)

node = qual_names.resolve(node)
node = activity.resolve(node, ctx)

fn_scope = anno.getanno(node, annos.NodeAnno.BODY_SCOPE)  # Note: tag will be changed soon.


print('read:', fn_scope.read)
print('modified:', fn_scope.modified)
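As a rough illustration of what "inventorying symbols" means, the sketch below uses only the standard `ast` module and records simple names by their load or store context. It is a deliberately simplified stand-in: unlike the real activity analysis, it ignores qualified names such as `a.b.c`, function parameters, and nested scopes.

```python
import ast


class SymbolInventory(ast.NodeVisitor):
    """Records which simple names are read and which are modified."""

    def __init__(self):
        self.read = set()
        self.modified = set()

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load):
            self.read.add(node.id)
        elif isinstance(node.ctx, (ast.Store, ast.Del)):
            self.modified.add(node.id)


tree = ast.parse('def f(a):\n    b = a + 1\n    return b\n')
inventory = SymbolInventory()
inventory.visit(tree)

print('read:', sorted(inventory.read))
print('modified:', sorted(inventory.modified))
```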

Another useful utility is the control flow graph builder.

Of course, a CFG that fully accounts for all effects is impractical to build in a late-bound language like Python without creating an almost fully-connected graph. However, a reasonable one can be built if we ignore the potential for functions to raise arbitrary exceptions.

In [0]:
from tensorflow.python.autograph.pyct import cfg


def f(a):
  if a > 0:
    return a
  b = -a

node, ctx = get_node_and_ctx(f)

node = qual_names.resolve(node)
cfgs = cfg.build(node)
cfgs[node]
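To show the kind of graph such a builder produces, here is a toy sketch that handles only simple statements, `if`, and `return`, and ignores exceptions entirely. The functions `label` and `build_edges` are hypothetical and unrelated to the `cfg` module's API. Statements are identified by their unparsed text, so identical statements would collide, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

EXIT = '<exit>'


def label(stmt):
    # An `if` node is labelled by its test only, not by its whole body.
    if isinstance(stmt, ast.If):
        return 'if ' + ast.unparse(stmt.test)
    return ast.unparse(stmt)


def build_edges(stmts, follow, edges):
    """Adds successor edges for `stmts`; returns the label of the block entry.

    `follow` is the label reached after the block falls through.
    """
    nxt = follow
    for stmt in reversed(stmts):
        name = label(stmt)
        if isinstance(stmt, ast.Return):
            edges[name] = [EXIT]
        elif isinstance(stmt, ast.If):
            then_entry = build_edges(stmt.body, nxt, edges)
            else_entry = build_edges(stmt.orelse, nxt, edges)
            edges[name] = [then_entry, else_entry]
        else:
            edges[name] = [nxt]
        nxt = name
    return nxt


source = 'def f(a):\n    if a > 0:\n        return a\n    b = -a\n'
fn_node = ast.parse(source).body[0]
edges = {}
entry = build_edges(fn_node.body, EXIT, edges)

for name, successors in edges.items():
    print(name, '->', successors)
```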

Liveness analysis is another useful pass. Note that these analyses make simplifying assumptions, because in general the CFG of a Python program is a graph that's almost complete. The only robust assumption is that execution can't jump backwards.

In [0]:
from tensorflow.python.autograph.pyct import anno
from tensorflow.python.autograph.pyct import cfg
from tensorflow.python.autograph.pyct import qual_names
from tensorflow.python.autograph.pyct.static_analysis import activity
from tensorflow.python.autograph.pyct.static_analysis import annos
from tensorflow.python.autograph.pyct.static_analysis import liveness


def f(a):
  b = a + 1
  return b


node, ctx = get_node_and_ctx(f)

node = qual_names.resolve(node)
cfgs = cfg.build(node)
node = activity.resolve(node, ctx)
node = liveness.resolve(node, ctx, cfgs)

print('live into `b = a + 1`:', anno.getanno(node.body[0], anno.Static.LIVE_VARS_IN))
print('live into `return b`:', anno.getanno(node.body[1], anno.Static.LIVE_VARS_IN))
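The underlying idea can be sketched for straight-line code with the standard library alone: walk the statements backwards and apply `live_in = (live_out - defined) | used`. The helpers below are hypothetical and much simpler than the `liveness` module, which works over a CFG and so also handles branches and loops.

```python
import ast


def names(node, ctx_type):
    """Returns the simple names under `node` that have the given context."""
    return {
        n.id for n in ast.walk(node)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ctx_type)
    }


def straight_line_liveness(stmts):
    """Returns the set of live-in names for each statement, in order."""
    live = set()
    live_in = []
    for stmt in reversed(stmts):
        # Names defined here stop being live; names used here become live.
        live = (live - names(stmt, ast.Store)) | names(stmt, ast.Load)
        live_in.append(live)
    live_in.reverse()
    return live_in


fn_node = ast.parse('def f(a):\n    b = a + 1\n    return b\n').body[0]
live_in = straight_line_liveness(fn_node.body)

print('live into `b = a + 1`:', live_in[0])
print('live into `return b`:', live_in[1])
```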

### Writing a custom Python transpiler

`transpiler.FunctionTranspiler` is a generic class for a Python [source-to-source compiler](https://en.wikipedia.org/wiki/Source-to-source_compiler). It operates on Python ASTs. Subclasses override its [transform_ast](https://github.com/tensorflow/tensorflow/blob/95ea3404528afcb1a74dd5f0946ea8d17beda28b/tensorflow/python/autograph/pyct/transpiler.py#L261) method.

Unlike the `transformer` module, which has an AST as input and output, the `transpiler` APIs accept and return actual Python objects, handling the tasks associated with parsing, unparsing and loading of code.

Here's a transpiler that does nothing:

In [0]:
from tensorflow.python.autograph.pyct import transpiler


class NoopTranspiler(transpiler.FunctionTranspiler):

  def transform_ast(self, ast, transformer_context):
    return ast

tr = NoopTranspiler()

The main method is [transform_function](https://github.com/tensorflow/tensorflow/blob/95ea3404528afcb1a74dd5f0946ea8d17beda28b/tensorflow/python/autograph/pyct/transpiler.py#L384), which, as its name suggests, operates on functions.

In [0]:
def f(x, y):
  return x + y


new_f, _, _ = tr.transform_function(f, None, None, {})

print(new_f(1, 1))
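The parse, transform, unparse and load steps that the transpiler handles can be sketched with the standard library. The helper `transpile_source` below is hypothetical and takes source text rather than a function object, which keeps the sketch self-contained. It should not be read as how `FunctionTranspiler` is implemented. `ast.unparse` requires Python 3.9 or later.

```python
import ast
import textwrap


def transpile_source(source, fn_name, transform_ast):
    """Parses `source`, transforms the AST, and loads the resulting function."""
    tree = ast.parse(textwrap.dedent(source))
    tree = transform_ast(tree)
    # Newly created nodes need line and column info before compiling.
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, '<transpiled>', 'exec'), namespace)
    return namespace[fn_name], ast.unparse(tree)


# A no-op transformation returns the tree unchanged.
new_f, new_source = transpile_source(
    'def f(x, y):\n    return x + y\n', 'f', lambda tree: tree)

print(new_f(1, 1))
print(new_source)
```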

### Adding new variables to the transformed code

The transformed function has the same global and local variables as the original function. You can of course generate local imports to add any new references into the generated code, but an easier method is to use the `extra_locals` arg of `transform_function`:

In [0]:
from tensorflow.python.autograph.pyct import parser


class HelloTranspiler(transpiler.FunctionTranspiler):

  def transform_ast(self, ast, transformer_context):
    print_code = parser.parse('print("Hello", name)')
    ast.body = [print_code] + ast.body
    return ast


def f(x, y):
  pass


extra_locals = {'name': 'you'}
new_f, _, _ = HelloTranspiler().transform_function(f, None, None, extra_locals)

_ = new_f(1, 1)

In [0]:
import inspect

print(inspect.getsource(new_f))
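A comparable effect can be sketched with the standard library by executing the transformed code in a namespace that already holds the extra names. This is a simplification and an assumption: the names end up as globals of the new function, which is not necessarily how `extra_locals` is wired into the code that `pyct` generates. The class `PrependHello` is hypothetical.

```python
import ast
import contextlib
import io


class PrependHello(ast.NodeTransformer):
    """Inserts `print("Hello", name)` at the top of every function body."""

    def visit_FunctionDef(self, node):
        hello = ast.parse('print("Hello", name)').body[0]
        node.body = [hello] + node.body
        return node


tree = PrependHello().visit(ast.parse('def f(x, y):\n    pass\n'))
ast.fix_missing_locations(tree)

# The extra name is supplied through the namespace the code is loaded into.
namespace = {'name': 'you'}
exec(compile(tree, '<generated>', 'exec'), namespace)
new_f = namespace['f']

# Capture what the generated function prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    new_f(1, 1)
output = buffer.getvalue()

generated_source = ast.unparse(tree)  # Requires Python 3.9+.
print(output, end='')
print(generated_source)
```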