# OpenLineage Python Integration

This notebook demonstrates how to use a Python decorator to emit OpenLineage events for Python and pySpark functions. 

## Problem and Solutions

[OpenLineage](https://github.com/OpenLineage/OpenLineage) is an open standard and an open-source implementation supported by the Linux Foundation for data observability and data lineage tracking. It defines a standard for data lineage events (see schema at https://github.com/OpenLineage/OpenLineage/blob/main/spec/OpenLineage.json) that data processing pipelines send to a centralized RESTful service, backed by a Postgres database. This data can be queried and visualized, e.g., using the [open-source Marquez UI](https://github.com/MarquezProject/marquez). The goal is to log every data transformation and transaction, with versioning history. 

The challenge is to make this process unintrusive and ubiquitous. In our use case (Spark/pySpark, Delta Lake on Databricks), there are two approaches to integrate OpenLineage:

1. Using SparkListener, see https://openlineage.io/blog/openlineage-spark/. This involves adding a custom SparkListener to the Spark/Databricks environment, which automatically reports low-level Spark operations and file I/O. This is demonstrated in a separate notebook [OpenLineage-Spark Demo.ipynb](/notebooks/notebooks/OpenLineage-Spark%20Demo.ipynb).
    - The advantage is that once set up, it tracks everything it can track without any additional work.
    - The disadvantage is also the lack of control -- you can't refine what you want to track; e.g., instead of your pySpark code, it only has access to the low-level Spark execution plans. 
2. Using Python and the RESTful API. 
    - The advantage is that you can log anything you want.
    - The drawback is that you have to specify what you want to log.
    
The goal of this demo is to show how we can simplify the Python/API approach by using Python decorators. The idea is to pack all OpenLineage functions in a decorator function, so that the user only needs to do the following for transformations that require logging:

```python
from openlineage_decor import OpenLineageDecor

# set up OL config
ol_config = {
    # set up your OL URL, namespace, password, etc.
}

# defining the transformation using the decorator
@OpenLineageDecor(ol_config)
def my_transformation(df1, df2):
    # ...
    return output_df

# calling this function will generate OL events
df = my_transformation(df1, df2)
```

The `@OpenLineageDecor` decorator will then take care of all the logging. 

## Getting Started

This demo uses `docker-compose` to run a set of connected services:
- the Marquez OpenLineage API
- the Marquez web server/viz UI
- the Postgres database as the backend
- a Jupyter notebook service with Spark 3.1+

To start, clone my branch of `openlineage`, where I fixed https://github.com/OpenLineage/OpenLineage/issues/633. It's a very easy fix, so you can also just patch the official repo if you prefer; see the issue link.

```sh
git clone https://github.com/garyfeng/OpenLineage.git
cd OpenLineage/integration/spark
mkdir -p docker/notebooks
# copy these notebooks to the above folder if they do not exist there
docker-compose up
```

Start Jupyter at http://127.0.0.1:8888. You may need to look into the server logs to find the passcode to access the notebooks.

To view the web UI, run the following in a different terminal:

```sh
docker run --network spark_default -p 3000:3000 -e MARQUEZ_HOST=marquez-api -e MARQUEZ_PORT=5000 --link marquez-api:marquez-api marquezproject/marquez-web
```

Then open Marquez at http://127.0.0.1:3000/. Look for the `namespace` that your events were sent to (e.g., "gary-namespace" in the manual test later in this notebook).

# Best Practice for Development

Here are the best practices for using this pattern with OpenLineage:

OpenLineage code/config management:

- The OpenLineage decorator function should be part of a library that is imported into each notebook or work
- The OpenLineage code should be versioned
- The OpenLineage configurations (server URL, schemas, etc.) should be managed, with customization in each notebook/work

In each Notebook:
- Import OpenLineage decor lib 
- Initialize the instance of the OpenLineage client
  - with the Notebook Name or Work Name as the `namespace`
  - confirm the configuration is working
- Register global variables (typically limited to Spark/pandas Dataframes). This will emit an initial list of `dataset` records to the OL backend
- Use a function for all data manipulations, especially with dataframes
  - Use the `@openlineage` decorator for any function you want to track
  - Name your function descriptively; the function name will be the name of the `job`/`run` in OL
  - Use the `docstring` to briefly describe the purpose of the function; this will be tracked as well.
  - The source of the function will also be tracked, including all comments.
  - Keep your input and output simple, e.g., avoid returning a tuple or something too complex. Typically a function will take one or more dataframes as input (maybe with a few other parameters as options) and return a dataframe, series, or scalar. 
- I/O side effects (such as saving data to file) are NOT tracked with this method (YET; TO-DO)
  - As an interim solution, you can use `____` to send custom messages in the code 
  
```python
from openlineage_decor import openlineage, OpenLineage, OPENLINAGE_CONFIG

# change default config
OPENLINAGE_CONFIG["url"] = "http://host:port/api/v1/openlineage"
OPENLINAGE_CONFIG["namespace"] = "Notebook Name"
# set up the ol client
ol_client = OpenLineage(OPENLINAGE_CONFIG)
# test ol_client
pass
# pass the ol_client to the openlineage decor function
OPENLINAGE_CONFIG["client"] = ol_client
# register global dataframes; 
df_list = get_global_dfs()
ol_client.register_dfs(df_list)

```

# TO-DOs

- add decorators for register_data_sources: to register data sources and data frames with the metadata store. Not sure what we will use as the unique identifier -- for python/pyspark/pandas it seems `id()` is just fine, because it is unique within the session. Maybe `session+id()`?
  - so long as we use a unique ID to track data the DAG should connect all `datasets`/`jobs` within the `Namespace`
  - but the downside is that we will have too many data objects in the OL registry (duplicated for every run). The DAG is based on all runs of all objects. 
  - We need to distinguish between dataset `name` vs `version`, where the `id(df)` is like a version number. 
- add code version tracking
- add support for RDD, Pandas dataframe, series, etc. 
----
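
The `name` vs. `version` distinction from the TO-DO list can be sketched as follows. This is only an illustration under stated assumptions: `SESSION_ID` and `dataset_ref` are hypothetical names, not part of OpenLineage or of the code in this notebook, and the `session+id()` format is just one possible choice.

```python
import uuid

# Assumption: one random ID per Python session (e.g., per notebook kernel).
SESSION_ID = uuid.uuid4().hex[:8]

def dataset_ref(name, obj, session_id=SESSION_ID):
    """Return a hypothetical name/version pair for a data object.

    name:    the stable, human-readable dataset name (e.g., the variable name)
    version: identifies the concrete object within this session, via id()
    """
    return {"name": name, "version": "{}-{}".format(session_id, id(obj))}

data = [1, 2, 3]
alias = data         # same object, so same version
copy = list(data)    # equal content, but a different object

assert dataset_ref("data", data)["version"] == dataset_ref("data", alias)["version"]
assert dataset_ref("data", data)["version"] != dataset_ref("data", copy)["version"]
```

Keeping the name stable while the version changes per object would let the DAG group runs under one dataset instead of creating a new dataset for every run.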

# Start Spark Session

We will use it later.

In [15]:
from pyspark.sql import SparkSession
import urllib.request

# Create (or reuse) a local Spark session
spark = (SparkSession.builder.master('local').appName('openlineage_python_API')
             .getOrCreate())

22/04/02 01:59:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/04/02 01:59:47 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.


# OpenLineage Python: Manual Testing

In this demo we do not actually use the official Python client. For clarity, we use the Python `requests` library and call the OpenLineage API directly. We can consider using the official Python client in the future. 

We will do some manual testing. Run the following few cells, and check the Marquez UI at http://127.0.0.1:3000/. Look for `namespace` "gary-namespace". 

In [None]:
# !pip install openlineage-python

In [8]:
import requests
url = 'http://marquez_api:5000/api/v1/lineage'
headers = {"charset": "utf-8", "Content-Type": "application/json"}


In [10]:
data = """{
        "eventType": "START",
        "eventTime": "2020-12-28T19:52:00.001+10:00",
        "run": {
          "runId": "d46e465b-d358-4d32-83d4-df660ff614dd"
        },
        "job": {
          "namespace": "gary-namespace",
          "name": "my-job"
        },
        "inputs": [{
          "namespace": "gary-namespace",
          "name": "my-input"
        }],  
        "producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client"
      }"""
r = requests.post(url, data=data, headers=headers)
r.text

''

In [11]:
data2 = """{
        "eventType": "COMPLETE",
        "eventTime": "2020-12-28T20:52:00.001+10:00",
        "run": {
          "runId": "d46e465b-d358-4d32-83d4-df660ff614dd"
        },
        "job": {
          "namespace": "gary-namespace",
          "name": "my-job"
        },
        "outputs": [{
          "namespace": "gary-namespace",
          "name": "my-output",
          "facets": {
            "schema": {
              "_producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client",
              "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/spec/OpenLineage.json#/definitions/SchemaDatasetFacet",
              "fields": [
                { "name": "a", "type": "VARCHAR"},
                { "name": "b", "type": "VARCHAR"}
              ]
            }
          }
        }],     
        "producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client"
      }"""

r = requests.post(url, data=data2, headers=headers)
r.text

''
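
Instead of hand-writing JSON strings, the two events above could be built from a dict, which makes it easier to share one `runId` between START and COMPLETE and to use the current time. The following is a minimal sketch; `make_event` is a hypothetical helper introduced here for illustration, and it only covers the fields used in the manual test.

```python
import json
import uuid
from datetime import datetime, timezone

PRODUCER = "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client"

def make_event(event_type, namespace, job_name, run_id, inputs=None, outputs=None):
    """Build an OpenLineage run event as a JSON string (hypothetical helper)."""
    event = {
        "eventType": event_type,
        # timezone-aware timestamp in ISO 8601 format
        "eventTime": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "run": {"runId": run_id},
        "job": {"namespace": namespace, "name": job_name},
        "producer": PRODUCER,
    }
    if inputs is not None:
        event["inputs"] = inputs
    if outputs is not None:
        event["outputs"] = outputs
    return json.dumps(event)

run_id = str(uuid.uuid4())  # one runId shared by the START and COMPLETE events
start = make_event("START", "gary-namespace", "my-job", run_id,
                   inputs=[{"namespace": "gary-namespace", "name": "my-input"}])
complete = make_event("COMPLETE", "gary-namespace", "my-job", run_id,
                      outputs=[{"namespace": "gary-namespace", "name": "my-output"}])
# To send: requests.post(url, data=start, headers=headers), as in the cells above
```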

# Wrappers for OpenLineage Trackers

The idea is to use python decorators to log OpenLineage events when running pySpark transformations. 

Assuming major transformations are coded as pySpark functions (as part of a library), and further assuming that they follow the pattern of having an input df and an output df, we can use a decorator to 
- extract `run` info, with python source code of the function and all parameters/configurations, timing and start/completion
- extract `data` info, with input and output data frames
- post to OpenLineage server, with error logging if needed 

We still need to plan the granularity of the logs: probably not individual operations, but a block of code or a notebook. 
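
The "error logging" step can be sketched independently of the full decorator. In the sketch below, `track_with_failures` and the `send` callback are hypothetical names; `send` stands in for posting to the OpenLineage API, and the event dicts are deliberately simplified rather than schema-complete. `FAIL` is one of the event types defined by the OpenLineage spec.

```python
import functools
import uuid

def track_with_failures(send=print):
    """Decorator factory; `send` stands in for posting to the OpenLineage API."""
    def decor(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            run_id = str(uuid.uuid4())  # fresh runId for every call
            send({"eventType": "START", "job": func.__name__, "runId": run_id})
            try:
                value = func(*args, **kwargs)
            except Exception as e:
                # report the failure, then re-raise so callers still see the error
                send({"eventType": "FAIL", "job": func.__name__,
                      "runId": run_id, "error": repr(e)})
                raise
            send({"eventType": "COMPLETE", "job": func.__name__, "runId": run_id})
            return value
        return wrapper
    return decor

events = []

@track_with_failures(send=events.append)
def divide(a, b):
    return a / b

divide(4, 2)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass

print([e["eventType"] for e in events])
# ['START', 'COMPLETE', 'START', 'FAIL']
```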

In [12]:
import functools
import time, uuid, json
from datetime import datetime

import requests
import inspect

from pyspark.sql import DataFrame
from pyspark.rdd import RDD

# making this a decorator factory that takes the namespace as input
def openlineage(namespace):

    def openlineage_decor(func):
        """Report OpenLineage events"""
        ol_event_producer = "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client"
        ol_producer = "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client"
        ol_schemaURL = "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/spec/OpenLineage.json#/definitions/SchemaDatasetFacet"
        ol_url = 'http://marquez_api:5000/api/v1/lineage'
        ol_headers = {"charset": "utf-8", "Content-Type": "application/json"}
        ol_runId = str(uuid.uuid4())
        ol_namespace = namespace # from the decorator factory
        ol_job_name = func.__name__
        
        def ol_send_event(ol_data):
            """Posting the OpenLineage event"""
            # print(ol_data)
            r = requests.post(ol_url, data=ol_data, headers=ol_headers)
            if not r.ok:
                print(f"Failed to post event: {ol_data}")
            
        def ol_data_description(var):
            """Given a {var_name: var}, return an object containing data description for the object
            param var: a dict in the form of {v_name: v_obj}, where v_name is a string, and v_object is 
                        the actual reference to the object.
            returns: a dict in the form of the OpenLineage DataSchema 
            """
            assert isinstance(var, dict)
            assert len(var.keys())==1
            v_name = list(var.keys())[0]
            v_obj = list(var.values())[0]
            output = None

            # Detect if it's a DF, report the DF schema
            try:
                if isinstance(v_obj, DataFrame):
                    # Spark Dataframe
                    # name = "df_"+id(df), where id(df) is a reference to the memory allocation;
                    # the resulting name is the same for all references to the same df: df1 = df
                    # this helps to link the DAG in marquez to track data across jobs
                    data_fields = [ {"name":v, "type":t, 
                                     "description": ""} for v,t in v_obj.dtypes]
                    output = {
                      "namespace": ol_namespace,
                      "name": "df_{}".format(id(v_obj)), #switching to global id() of the input
                      "facets": {
                          "schema": {
                            "fields": data_fields,
                            "_producer": ol_producer,
                            "_schemaURL": ol_schemaURL,
                          }
                      }
                    }                    
                else:
                    # all other data types, assuming scalar
                    # name = v_name, field is for this var
                    output = {
                      "namespace": ol_namespace,
                      "name": v_name,
                      "facets": {
                          "schema": {
                            "fields": [{"name": v_name, "type": type(v_obj).__name__, 
                                        "description": ""}],
                            "_producer": ol_producer,
                            "_schemaURL": ol_schemaURL,
                          }
                      }
                    }
            except Exception as e:
                print(e) # in case of empty parameters, etc.

            return output

        def ol_run_description(func):
            """Given a function, generate an OpenLineage Run facet description.
            We use the name of the function as the run/job name; we include
            the input parameters, source code, docstring, and path of the python 
            file in the run record.
            """
            # Job/run info
            source_code=source_file=""
            try:
                source_code = inspect.getsource(func)
                source_file = inspect.getsourcefile(func)
            except (OSError, TypeError):
                # source is unavailable, e.g., for built-ins or dynamically defined functions
                pass
            run_info = {
              "runId": ol_runId,
              "description":inspect.getdoc(func),
              "facets": {# TO-DO: add git URL and version info
                  "function": {
                    "name": func.__name__,
                    "docstring": inspect.getdoc(func),
                    "source": source_code, 
                    "sourceFile": source_file,
                    "signature": list(inspect.signature(func).parameters),
                    #"params": param,
                    "_producer": ol_producer,
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/spec/OpenLineage.json#/definitions/JobFacet",
                  }
              }
            }         
            return run_info

        def ol_beforerun(*args, **kwargs):
            """Emitting the OpenLineage START event with info on the job/run and input"""
            # get parameter list to func
            param1 = [{k:v} for k,v in zip(inspect.signature(func).parameters, args)]
            param2 = [{key: value} for key, value in kwargs.items()]
            param = param1 + param2
            # make each para an input facet; this may not make sense when params are scalar
            # but when the inputs are dataframes we can better track the data
            input_list = [ol_data_description(p) for p in param]

            # get run info
            run_facet = ol_run_description(func)

            startEvent = json.dumps({
                "eventType": "START",
                "eventTime": datetime.now().isoformat(sep='T', timespec='milliseconds'),
                "run": run_facet,
                "job": {# TO-DO: what more do we need?
                  "namespace": ol_namespace,
                  "name": ol_job_name
                },
                "inputs": input_list,  
                "producer": ol_event_producer
            })
            ol_send_event(startEvent)
            return

        def ol_afterrun(value):
            """Emitting the OpenLineage COMPLETE event with info on the job/run and output"""
            endEvent = json.dumps({
                "eventType": "COMPLETE",
                "eventTime": datetime.now().isoformat(sep='T', timespec='milliseconds'),
                "run": {
                  "runId": ol_runId
                },
                "job": {
                  "namespace": ol_namespace,
                  "name": ol_job_name
                },
                "outputs": [ol_data_description({"{}_output".format(ol_job_name): value})],     
                "producer": ol_event_producer
            })
            ol_send_event(endEvent)
            return

        @functools.wraps(func)
        def emit_ol_event(*args, **kwargs):
            # generate a fresh runId for every call; otherwise all runs of this
            # function would share the runId created at decoration time
            nonlocal ol_runId
            ol_runId = str(uuid.uuid4())

            # before run
            ol_beforerun(*args, **kwargs)

            # now actually run the function and time it
            start_time = time.perf_counter()    # 1
            value = func(*args, **kwargs)
            end_time = time.perf_counter()      # 2

            # after run
            ol_afterrun(value)

            run_time = end_time - start_time    # 3
            # print(f"Finished {func.__name__!r} in {run_time:.4f} secs")
            return value

        # return the wrapped function
        return emit_ol_event
    
    return openlineage_decor



In [16]:
# Testing with a dummy function
@openlineage("test2_namespace")
def waste_some_time(num_times, dummy, **kwargs):
    """Just a dummy function
    
    param num_times: input for # of iterations
    param dummy: useless parameter
    returns: a dummy Spark DataFrame
    """
    for _ in range(num_times):
        sum([i**2 for i in range(10000)])
    return spark.sparkContext.parallelize([("foo", 1)]).toDF()

import numpy as np
import pandas as pd

@openlineage("test2_namespace")
def waste2(df:DataFrame, **kwargs):
    """Convert a Spark DF to Pandas and then back"""
    pdf = df.select("*").toPandas()
    return spark.createDataFrame(pdf)

In [27]:
# this will trigger the first part of the job
df = waste_some_time(10, 0, junk=2, other=None)

In [28]:
# this will trigger the second part of the job, which will be connected with the first part via the shared df
waste2(df)

DataFrame[_1: string, _2: bigint]

In [17]:
# you can chain them together. Still chained by the temp df that is the output of one and input for the other. 
# in OL, the named df and temp df are treated the same, all identified by their id()
waste2(waste_some_time(10, 0, anther="useless"))

                                                                                

DataFrame[_1: string, _2: bigint]

# Class-based approach

We create a class for the OpenLineage decorator, and instantiate one instance for each configuration.

The typical approach to using a Python `class` as a decorator relies on the `.__call__()` method of the class to do the decorating. This allows one to first define a decorator class:

```python
class OpenLineage(object):
    def __init__(self, param):
        ...
    def __call__(self, func):
        ...
        def emit_event(*args, **kwargs):
            # do before
            val= func(*args, **kwargs)
            # do after
            return val
        return emit_event
# define a function; `param` is passed to __init__, then __call__ receives `add`
@OpenLineage(param)
def add(a,b):
    return a+b
```

But keep in mind that every time `@OpenLineage(...)` is applied, it creates a new instance that will **not** be reused. So if you need to initialize the decorator class with parameters/configurations, you need to do that for every decorated function (`@OpenLineage(config)`).
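
A runnable toy version of this pattern makes the "one instance per decoration" behavior visible. The class name `Tracked`, the `instances` counter, and the `print` calls are illustrative assumptions only; they are not part of the OpenLineage code in this notebook.

```python
import functools

class Tracked:
    """Toy decorator class; the name and the print calls are illustrative only."""
    instances = 0

    def __init__(self, namespace):
        # runs once per `@Tracked(...)` line
        Tracked.instances += 1
        self.namespace = namespace

    def __call__(self, func):
        # runs right after __init__, receiving the function being decorated
        @functools.wraps(func)
        def emit_event(*args, **kwargs):
            print("before {} in {}".format(func.__name__, self.namespace))
            val = func(*args, **kwargs)
            print("after {}".format(func.__name__))
            return val
        return emit_event

@Tracked("ns")
def add(a, b):
    return a + b

@Tracked("ns")
def mul(a, b):
    return a * b

assert add(1, 2) == 3 and mul(2, 3) == 6
assert Tracked.instances == 2   # one instance per decorated function
```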

The following is one approach, where we define a generic class, instantiate an instance with configurations, and use the instance as decorator for subsequent func definitions.
- pros:
  - uses a class and instances; allows multiple decorators to be instantiated with different parameters, such as URLs and namespaces.
  - no need to include the config for every decorator (if we had a `class OpenLineage` as the decorator class that takes parameters, we'd need to do `@OpenLineage(config)` every time).

In [18]:
class OpenLineage(object):
    def __init__(self, url, namespace):
        self.url = url
        self.namespace = namespace
        # print at instance initiation
        print("Initiated with url={}, namespace={}".format(self.url, self.namespace))
        
    def ol_send_event(self, event):
        """Mock function for sending OpenLineage event data"""
        print("OL: {}".format(event))
        
    def track(self, func):
        """This is a decorator function. Given a function that does data transformation,
        this adds OpenLineage event tracking before and after the run."""
        # print message in decorator stage
        self.ol_send_event("JOB: job={}, url={}, namespace={}".format(func.__name__, self.url, self.namespace))
        
        @functools.wraps(func)
        def emit_ol_event(*args, **kwargs):
            # before run
            # print at runtime
            self.ol_send_event("RUN INIT: with url={}, namespace={}".format(self.url, self.namespace))
            self.ol_send_event("RUN START: calling {}{}".format(func.__name__, args))
            # now actually run the function and time it
            value = func(*args, **kwargs)
            # after run
            self.ol_send_event("RUN COMPLETE: returning data type {}".format(type(value).__name__))
            return value
        # return the wrapped function
        return emit_ol_event
    
    def register(self, *args):
        """Mock to register a dataset. Notes:
        1). This only registers data in the globals() scope.
        2). The registration is done by id(var), and will register all variables referring to this object.
        TO-DO: to take variable parameters, *arg """
        arg_ids = [id(x) for x in args]
        l = globals()
        dataset = [(x, type(l[x]).__name__, id(l[x])) for x in l.keys() 
                   if id(l[x]) in arg_ids and not x.startswith("_")]
        self.ol_send_event("DATASET: {}".format(dataset))

# instantiate with parameters
ol = OpenLineage(url="http://localhost:500", namespace="test_namespace")

Initiated with url=http://localhost:500, namespace=test_namespace


In [19]:
@ol.track
def add(a,b):
    print("running add({}, {})".format(a,b))
    return a+b


OL: JOB: job=add, url=http://localhost:500, namespace=test_namespace


In [20]:
add(1,2)

OL: RUN INIT: with url=http://localhost:500, namespace=test_namespace
OL: RUN START: calling add(1, 2)
running add(1, 2)
OL: RUN COMPLETE: returning data type int


3

## Register Global Data Objects

The Python decorator approach tracks data transformation functions, which correspond to `jobs` or `runs` in OpenLineage. How do we track data objects such as data frames?

This is implicitly done in the decorator code when we assign a dataset schema to the input and output of a job/run. Basically,
- for Spark/Pandas dataframes, we track the identity of the df (in this run) with a temporary name `"df_"+str(id(df))`. The reason for this, as opposed to using the "name" of the df, is that all dfs are held by reference. You can have multiple names pointing to the same data. You can also reassign a name to point to a different data object. The df name is therefore not a unique identifier. We want to track the actual data object. 
  - we have a field schema based on the column names and data types.
- for other data types, particularly built-in types, we use the variable name as the data name, and use the variable name and type as the field schema.

This does not necessarily cover all other data objects we may want to track -- e.g., ones that are not manipulated within a function that is tracked by the decorator, or ones created in an interactive session. 

We can't use a decorator because variables are not callable. A complementary way is to declare the intention to track a data object by `.register(data)`. We use the same OpenLineage instance but use a method call `.register()` for each variable we want to register. 

In [21]:
# interesting to see how even built-in vars are by reference.
a=12352523
ol.register(a)


OL: DATASET: [('a', 'int', 140142062703056)]


In [22]:
# in this case b and c point to the same object as a
b=a; c=b
ol.register(a)


OL: DATASET: [('a', 'int', 140142062703056), ('b', 'int', 140142062703056), ('c', 'int', 140142062703056)]


In [23]:
a=2
ol.register(a)
ol.register(a, b)

OL: DATASET: [('a', 'int', 140143782160720)]
OL: DATASET: [('a', 'int', 140143782160720), ('b', 'int', 140142062703056), ('c', 'int', 140142062703056)]


In [24]:
# does everything that points to int2 have the same id now?
b=2
RandomVar=int("2")
ol.register(b, RandomVar)

OL: DATASET: [('a', 'int', 140143782160720), ('b', 'int', 140143782160720), ('RandomVar', 'int', 140143782160720)]


In [25]:
# this does not hold for large numbers: CPython only caches small ints, so equal values can be distinct objects
z=2122344
RandomVar=int("2122344")
ol.register(z, RandomVar)

OL: DATASET: [('RandomVar', 'int', 140142062703472), ('z', 'int', 140142062703536)]


Now data frames are always by reference. 

In [29]:
ol.register(df)

OL: DATASET: [('df', 'DataFrame', 140142062575328)]


In [30]:
df1 = df
ol.register(df)

OL: DATASET: [('df', 'DataFrame', 140142062575328), ('df1', 'DataFrame', 140142062575328)]


In [31]:
# nothing is registered since the output of waste2(df) is a temp object that is not bound to any global name (other than names starting with _, which get filtered out)
ol.register(waste2(df))

OL: DATASET: []


In [32]:
# this named df works. 
df2 = waste2(df)
ol.register(df2)

OL: DATASET: [('df2', 'DataFrame', 140143128052352)]


## Monitoring global variables

The idea here is to automatically monitor variables/objects of interest in the relevant scope, and register them. If this is done at all, it should be done only for specific data types such as Spark/Pandas data frames and file handles, and for non-temp variables only. In addition, we should not rely on scanning `globals()` frequently. This should be triggered only after some relevant actions. 

In general, use `globals()`. Avoid `gc.get_objects()`, which runs the risk of interfering with the gc process and causing memory leaks. If you really want to get into gc, consider using `weakref` in reporting, but I haven't found a reason to.
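
A scan restricted to specific types and to non-temporary names could look like the sketch below. `find_tracked_objects` is a hypothetical helper, and `FakeFrame` is a stand-in class so that the example runs without Spark; in the notebook you would pass `globals()` and something like `(DataFrame,)` instead.

```python
def find_tracked_objects(namespace, tracked_types):
    """Return {name: id(obj)} for non-temporary names bound to tracked types.

    namespace:     a dict such as globals()
    tracked_types: tuple of classes to look for, e.g. (DataFrame,)
    """
    # copy first so the dict cannot change size while we iterate
    return {
        name: id(obj)
        for name, obj in dict(namespace).items()
        if not name.startswith("_") and isinstance(obj, tracked_types)
    }

class FakeFrame:   # stand-in for a Spark/pandas DataFrame in this sketch
    pass

ns = {"df": FakeFrame(), "_tmp": FakeFrame(), "n": 3}
ns["df1"] = ns["df"]   # alias of the same object

found = find_tracked_objects(ns, (FakeFrame,))
assert set(found) == {"df", "df1"}
assert found["df"] == found["df1"]
```

Because the result maps names to `id()` values, aliases of the same object show up as different names with the same ID.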


In [None]:
# examples to get built-in types, to identify non-built-in types such as data frames, etc.
import builtins
builtin_types = [getattr(builtins, d) for d in dir(builtins) if isinstance(getattr(builtins, d), type)]
[t for t in builtin_types if not t.__name__[0].isupper()]

In [None]:
# eliminate temporary vars that will be gc-ed. 
set([x for x in locals().keys() if not x.startswith("_")])

In [None]:
# depending on the gc process, this may return temporary dataframes that start with '_'
l = globals()
[(x, type(l[x])) for x in list(l.keys()) if id(l[x])==id(df)]

In [None]:
# generally not advised. But you can use gc to find an object by its id(), 
# which is related to the memory allocation. Know what you are doing before you go there. 

import gc

def deref(id_):
    f = {id(x):x for x in gc.get_objects()}
    return f[id_]

df1=df

# this gets you the object, but not the name reference, which may not be unique. 
# you can search through globals() for objects pointing to this id() to find all co-references
# but be careful that stuff in the gc may get erased any time; the same id() may not give you
# the same object. 
deref(id(df1))

# Using Notebook Magic to log OpenLineage cells

See custom cell magic for Jupyter notebooks https://ipython.readthedocs.io/en/stable/config/custommagics.html

The idea is to define `cell magic` commands to generate `OpenLineage` events before and after running the cell. 
- The `cell magic` function receives the content of the cell as a string. We can run the Python script in a separate Python process and capture the results to pass back. Not sure about security and access to the Spark session, etc. 


This only works when the user runs the notebook via the UI. Not sure it works when Databricks "runs" the notebook automatically.
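
As an alternative to a separate process, the cell could be executed in-process so that it still sees existing variables such as the Spark session. The following is a minimal sketch under stated assumptions: `run_tracked_cell`, the `%%openlineage_cell` magic name, and the simplified event dicts are all hypothetical, and `send` stands in for posting to the OpenLineage API.

```python
def run_tracked_cell(line, cell, user_ns, send=print):
    """Run `cell` (a string of Python code) with events before and after.

    line:    text after the magic name; used here as the job name
    user_ns: namespace to execute in, so the cell sees existing variables
    send:    stand-in for posting an event to the OpenLineage API
    """
    job = line.strip() or "unnamed-cell"
    send({"eventType": "START", "job": job})
    try:
        exec(cell, user_ns)
    except Exception:
        send({"eventType": "FAIL", "job": job})
        raise
    send({"eventType": "COMPLETE", "job": job})

# Register as %%openlineage_cell only when running inside IPython/Jupyter.
try:
    from IPython import get_ipython
    from IPython.core.magic import register_cell_magic
except ImportError:
    get_ipython = None

if get_ipython is not None and get_ipython() is not None:
    @register_cell_magic
    def openlineage_cell(line, cell):
        run_tracked_cell(line, cell, get_ipython().user_ns)
```

In a notebook, a cell starting with `%%openlineage_cell my-job` would then be wrapped in START and COMPLETE (or FAIL) events. Note that plain `exec` does not display the value of the last expression the way a normal cell does.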



In [None]:
from IPython.core.magic import (register_line_magic, register_cell_magic,
                                register_line_cell_magic)

@register_line_magic
def lmagic(line):
    print("my line magic")
    return line

@register_cell_magic
def cmagic(line, cell):
    "my cell magic"
    return line, cell

@register_line_cell_magic
def lcmagic(line, cell=None):
    "Magic that works both as %lcmagic and as %%lcmagic"
    if cell is None:
        print("Called as line magic")
        return line
    else:
        print("Called as cell magic")
        return line, cell
    
# In an interactive session, we need to delete these to avoid
# name conflicts for automagic to work on line magics.
# you can use %lmagic and %%lcmagic after this point
del lmagic, lcmagic    

In [None]:
%lmagic
1+1

In [None]:
%%lcmagic
1+1