# ODSC Ivy Demo

First, let's install Ivy and some dependencies 😄

In [None]:
!git clone https://github.com/unifyai/ivy.git
!cd ivy && git checkout f705efe7cb5d18df17ce6c1e20f04d0eb4933f48 && python3 -m pip install --user -e .
!pip install dm-haiku
!pip install kornia
!pip install timm
!pip install pyvis
!pip install transformers
exit()  # restart the runtime so the newly installed packages are picked up

## Ivy as a Framework

In this introduction, we will cover the fundamentals of using Ivy to write your own framework-independent and future-proof code!

If you are interested in exploring the theoretical aspects behind the contents of this notebook you can check out the [Design](https://lets-unify.ai/docs/ivy/overview/design.html) and the [Deep Dive](https://lets-unify.ai/docs/ivy/overview/deep_dive.html) sections of the documentation!

First of all, let's import Ivy:

In [None]:
import ivy

### Ivy Backend Handler

When used as an ML framework, Ivy is essentially an abstraction layer that supports multiple frameworks as the backend. This means that any code written in Ivy can be executed in any of the supported frameworks, with Ivy managing the framework-specific data structures, functions, optimizations, quirks and perks under the hood.

To switch the backend, we can use the `ivy.set_backend` function and pass the appropriate framework as a string. This is the easiest way to interact with the Backend Handler submodule, which manages the current backend and links Ivy’s objects and functions with the corresponding framework-specific ones.

For example:

In [None]:
ivy.set_backend("tensorflow")
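
To make the dispatch idea concrete, here is a minimal, framework-free sketch of how a backend handler could route one unified function name to per-backend implementations. It is an illustration only, not Ivy's actual implementation: the `_BACKENDS` registry, this toy `set_backend`, and `add` are hypothetical names, and plain Python lists and tuples stand in for framework arrays.

```python
# Toy backend handler: NOT Ivy's implementation, just the dispatch idea.
# Each "backend" maps a unified function name to its own implementation.
_BACKENDS = {
    "listy": {"add": lambda a, b: [x + y for x, y in zip(a, b)]},
    "tuply": {"add": lambda a, b: tuple(x + y for x, y in zip(a, b))},
}
_current = {"name": None}


def set_backend(name):
    """Select which registered backend later calls are routed to."""
    if name not in _BACKENDS:
        raise ValueError(f"unknown backend: {name!r}")
    _current["name"] = name


def add(a, b):
    """Unified entry point: look up the current backend's implementation."""
    if _current["name"] is None:
        raise RuntimeError("no backend set")
    return _BACKENDS[_current["name"]]["add"](a, b)


set_backend("listy")
list_result = add([1, 2], [3, 4])   # a list, produced by the "listy" backend

set_backend("tuply")
tuple_result = add([1, 2], [3, 4])  # a tuple, produced by the "tuply" backend
```

The calling code never changes; only the selected backend decides which implementation runs and which native type comes back.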

### Data Structures

The basic data structure in Ivy is the `ivy.Array`. This is an abstraction of the `array` classes of the supported frameworks. Likewise, we also have `ivy.NativeArray`, which is an alias for the `array` class of the selected backend.

Lastly, there is another structure called the `ivy.Container`. It's a subclass of dict that is optimized for recursive operations. If you want to learn more about it, you can refer to the following [link](https://lets-unify.ai/docs/ivy/overview/design/ivy_as_a_framework/ivy_container.html)!
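
As a rough intuition for what "recursive operations" means here, the sketch below applies a function to every leaf of a nested dict using only the standard library. `nested_map` is a hypothetical helper written for this illustration; it is not part of Ivy, and `ivy.Container` offers far more than this.

```python
def nested_map(fn, tree):
    """Apply fn to every leaf of a (possibly nested) dict, keeping its structure."""
    if isinstance(tree, dict):
        return {key: nested_map(fn, value) for key, value in tree.items()}
    return fn(tree)


# A nested dict of "weights", loosely like the parameters of a small network.
params = {"linear0": {"w": 2.0, "b": 1.0}, "linear1": {"w": 4.0, "b": 3.0}}
halved = nested_map(lambda leaf: leaf / 2, params)
```

The result has the same keys and nesting as `params`, with every leaf value halved.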

Let’s create an array using `ivy.array()`. Similarly, once the backend is set to `torch`, we can use `ivy.native_array()` to create a `torch.Tensor`.

In [None]:
ivy.set_backend("torch")

x = ivy.array([1, 2, 3])
print(type(x))

x = ivy.native_array([1, 2, 3])
print(type(x))

### Ivy Functional API

Ivy does not implement its own low-level (C++/CUDA) backend for its functions. Instead, it wraps the functional API of existing frameworks, unifying their fundamental functions under a common signature. For example, let’s take a look at `ivy.matmul()`:

In [None]:
ivy.set_backend("jax")
x1, x2 = ivy.array([[1.], [2.], [3.]]), ivy.array([[1., 2., 3.]])
output = ivy.matmul(x1, x2)
print(type(output.to_native()))

ivy.set_backend("tensorflow")
x1, x2 = ivy.array([[1.], [2.], [3.]]), ivy.array([[1., 2., 3.]])
output = ivy.matmul(x1, x2)
print(type(output.to_native()))

ivy.set_backend("torch")
x1, x2 = ivy.array([[1.], [2.], [3.]]), ivy.array([[1., 2., 3.]])
output = ivy.matmul(x1, x2)
print(type(output.to_native()))

The output arrays shown above are `ivy.Array` instances. To obtain the underlying native array, we need to use the `to_native()` method.

However, if you want the functions to return the native arrays directly, you can disable the `array_mode` of Ivy using `ivy.set_array_mode()`.

In [None]:
ivy.set_array_mode(False)

ivy.set_backend("jax")
x1, x2 = ivy.native_array([[1.], [2.], [3.]]), ivy.native_array([[1., 2., 3.]])
output = ivy.matmul(x1, x2)
print(type(output))

ivy.set_backend("tensorflow")
x1, x2 = ivy.native_array([[1.], [2.], [3.]]), ivy.native_array([[1., 2., 3.]])
output = ivy.matmul(x1, x2)
print(type(output))

ivy.set_backend("torch")
x1, x2 = ivy.native_array([[1.], [2.], [3.]]), ivy.native_array([[1., 2., 3.]])
output = ivy.matmul(x1, x2)
print(type(output))

ivy.set_array_mode(True)

Keeping this in mind, you can build any function you want as a composition of Ivy functions. When executed, this function will ultimately call the current backend functions from its functional API.

In [None]:
def sigmoid(z):
    return ivy.divide(1, (1 + ivy.exp(-z)))
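
For reference, the formula implemented by `sigmoid` is `1 / (1 + exp(-z))`. The standard-library sketch below evaluates the same formula on plain floats so the expected values are easy to check by hand. `sigmoid_reference` is a hypothetical name used only here, and it operates on scalars rather than arrays.

```python
import math


def sigmoid_reference(z):
    """Scalar version of the same formula: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


at_zero = sigmoid_reference(0.0)                             # exactly 0.5
symmetry = sigmoid_reference(2.0) + sigmoid_reference(-2.0)  # 1 up to rounding
```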

In essence, this means that by writing your code just once with Ivy, it becomes available for use within any project, regardless of the underlying framework being used!

### Ivy Stateful API

As we have seen in the slides, Ivy also has a stateful API which builds on its functional API and the `ivy.Container` class to provide high-level classes such as optimizers, network layers, or trainable modules.

The most important stateful class within Ivy is `ivy.Module`, which can be used to create trainable layers and entire networks. A very simple example of an `ivy.Module` could be:

In [None]:
class Regressor(ivy.Module):
    def __init__(self, input_dim, output_dim):
        self.linear0 = ivy.Linear(input_dim, 128)
        self.linear1 = ivy.Linear(128, output_dim)
        ivy.Module.__init__(self)

    def _forward(self, x):
        x = self.linear0(x)
        x = ivy.functional.relu(x)
        x = self.linear1(x)
        return x

To use this model, we would simply have to set a backend and instantiate the model:

In [None]:
ivy.set_backend('torch')  # set backend to PyTorch

model = Regressor(input_dim=1, output_dim=1)
optimizer = ivy.Adam(0.1)

Now we can generate some sample data and train the model using Ivy as well.

In [None]:
n_training_examples = 2000
noise = ivy.random.random_normal(shape=(n_training_examples, 1), mean=0, std=0.1)
x = ivy.linspace(-6, 3, n_training_examples).reshape((n_training_examples, 1))
y = 0.2 * x ** 2 + 0.5 * x + 0.1 + noise

In [None]:
def loss_fn(pred, target):
    return ivy.mean((pred - target)**2)

for epoch in range(50):
    # forward pass
    pred = model(x)

    # compute loss and gradients
    loss, grads = ivy.execute_with_gradients(lambda v: loss_fn(pred, y), model.v)

    # update parameters
    model.v = optimizer.step(model.v, grads)

    # print current loss
    print(f'Epoch: {epoch + 1:2d} --- Loss: {ivy.to_numpy(loss).item():.5f}')

print('Finished training!')
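
The loop above follows the usual forward pass, loss, gradient, update cycle. To see that cycle with nothing hidden, here is a deliberately tiny standard-library sketch that fits a single slope parameter. It is a simplification under stated assumptions: the gradient is derived by hand instead of using automatic differentiation, the update is plain gradient descent rather than Adam, and the data and names (`xs`, `ys`, `w`) are made up for this illustration.

```python
# Fit y = w * x with mean-squared error using a hand-derived gradient.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [3.0 * x for x in xs]          # targets generated with the true slope 3.0

w = 0.0                             # single trainable parameter
learning_rate = 0.1
losses = []

for epoch in range(50):
    # forward pass
    preds = [w * x for x in xs]

    # compute loss and its gradient d(loss)/dw = mean(2 * (pred - y) * x)
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)

    # update parameter (plain gradient descent, not Adam)
    w -= learning_rate * grad
    losses.append(loss)
```

After the loop, `w` has moved from `0.0` to very nearly the true slope and the recorded losses shrink from the first epoch to the last.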

### Graph Tracer

We have just explored how to create framework-agnostic functions and models with Ivy. Nonetheless, due to the wrapping Ivy performs on top of native functions, there is a slight performance overhead introduced with each function call. To address this, we can use Ivy's Graph Tracer.

The purpose of the Graph Tracer is to extract a fully functional, efficient graph composed only of functions from the corresponding functional APIs of the underlying framework (backend).

On top of using the Graph Tracer to remove the overhead introduced by Ivy, it can also be used with functions and modules written directly with a given framework. In this case, the Graph Tracer will decompose any high-level API into a fully-functional graph of functions from said framework.

As an example, let's write a simple `normalize` function using Ivy:

In [None]:
def normalize(x):
    mean = ivy.mean(x)
    std = ivy.std(x)
    return ivy.divide(ivy.subtract(x, mean), std)

To trace this function, simply call `ivy.trace_graph()`. To specify the underlying framework, you can pass the name of the framework as an argument using `to`. Otherwise, the current backend will be used by default.

In [None]:
import torch
x0 = torch.tensor([1., 2., 3.])
normalize_traced = ivy.trace_graph(normalize, to="torch", args=(x0,))

This results in the following graph:

In [None]:
from IPython.display import HTML
normalize_traced.show(fname="graph.html", notebook=True)
HTML(filename="graph.html")

As anticipated, the traced function, which uses native `torch` operations directly, is faster than the original function:

In [None]:
%%timeit
normalize(x0)

In [None]:
%%timeit
normalize_traced(x0)
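
`%%timeit` is an IPython cell magic, so it only works inside a notebook. In a plain script, a comparable measurement can be taken with the standard-library `timeit` module. The sketch below times a stand-in pure-Python function; `normalize_py` and `best_time_per_call` are hypothetical names for this illustration, and in practice you would pass `normalize` or `normalize_traced` instead.

```python
import statistics
import timeit


def normalize_py(values):
    """Stand-in workload: pure-Python normalization of a list of floats."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def best_time_per_call(fn, arg, number=1000, repeat=5):
    """Return the best observed time per call, in seconds."""
    totals = timeit.repeat(lambda: fn(arg), number=number, repeat=repeat)
    return min(totals) / number


seconds = best_time_per_call(normalize_py, [1.0, 2.0, 3.0])
print(f"{seconds * 1e6:.2f} us per call")
```

Taking the minimum over several repeats reduces the influence of unrelated system noise on the measurement.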

Additionally, we can set the `backend_compile` arg to `True` to apply the (native) target framework compilation function to Ivy's traced graph, making the resulting function even more efficient.

In [None]:
normalize_native_comp = ivy.trace_graph(normalize, backend_compile=True, to="torch", args=(x0,))

In [None]:
%%timeit
normalize_native_comp(x0)

In the example above, we traced the function eagerly: because we passed the arguments needed for tracing, the graph was built immediately. However, if we don't pass any arguments to the `trace_graph` function, tracing will occur lazily, and the graph will be built only when we call the traced function for the first time. To summarize:

In [None]:
import torch

x1 = torch.tensor([1., 2., 3.])

In [None]:
# Arguments are available -> tracing happens eagerly
eager_graph = ivy.trace_graph(normalize, to="torch", args=(x1,))

# eager_graph is now torch code and runs efficiently
ret = eager_graph(x1)

In [None]:
# Arguments are not available -> tracing happens lazily
lazy_graph = ivy.trace_graph(normalize, to="torch")

# The traced graph is initialized, tracing will happen here
ret = lazy_graph(x1)

# lazy_graph is now torch code and runs efficiently
ret = lazy_graph(x1)
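
The eager/lazy distinction can be pictured with a small, framework-free sketch: a wrapper that either does its one-off preparation immediately when example arguments are supplied, or defers it to the first call. This is only an analogy. `trace_sketch` and `TracedSketch` are hypothetical names and no real graph is extracted, so it says nothing about how Ivy implements tracing internally.

```python
class TracedSketch:
    """Wraps fn and performs a one-off 'tracing' step eagerly or on first call."""

    def __init__(self, fn, args=None):
        self.fn = fn
        self.trace_count = 0
        if args is not None:
            self._trace(*args)  # eager: example arguments are available now

    def _trace(self, *args):
        # Placeholder for the expensive one-off work a real tracer would do.
        self.trace_count += 1

    def __call__(self, *args):
        if self.trace_count == 0:
            self._trace(*args)  # lazy: first call triggers the one-off work
        return self.fn(*args)


def trace_sketch(fn, args=None):
    return TracedSketch(fn, args=args)


def double(x):
    return 2 * x


eager = trace_sketch(double, args=(3,))
lazy = trace_sketch(double)

eager_traced_before_call = eager.trace_count  # traced at construction
lazy_traced_before_call = lazy.trace_count    # nothing traced yet
first = lazy(3)
second = lazy(4)
lazy_traced_after_calls = lazy.trace_count    # traced once, on the first call
```

In both cases the one-off work happens exactly once; the only difference is whether it is paid up front or on the first call.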

## Ivy as a Transpiler

We have just learned how to write framework-agnostic code and trace it into an efficient graph. However, many codebases, libraries, and models have already been developed (and will continue to be!) using other frameworks.

To allow for speed-of-thought research and development, Ivy also allows you to use any code directly in your project, regardless of the framework it was written in. No matter what ML code you want to use, Ivy's Transpiler is the tool for the job 🛠️

### Any function

Let's start by transpiling a very simple `torch` function.

In [None]:
def normalize(x):
    mean = torch.mean(x)
    std = torch.std(x)
    return torch.div(torch.sub(x, mean), std)

jax_normalize = ivy.transpile(normalize, source="torch", to="jax")

Similar to `trace_graph`, the `transpile` function can be used eagerly or lazily. In this particular example, transpilation is being performed lazily, since we haven't passed any arguments or keyword arguments to `ivy.transpile`.

In [None]:
import jax
key = jax.random.PRNGKey(42)
jax.config.update('jax_enable_x64', True)
x = jax.random.uniform(key, shape=(10,))

jax_out = jax_normalize(x)
print(jax_out, type(jax_out))
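
One detail is worth keeping in mind when comparing this `torch` version of `normalize` with the Ivy version from the Graph Tracer section. Going by the documented defaults, `torch.std` applies Bessel's correction (dividing by N - 1), whereas `ivy.std` defaults to `correction=0` (dividing by N), so the two functions are not numerically identical on the same input. The standard-library sketch below shows the size of that difference on a three-element example; it uses the `statistics` module rather than either framework.

```python
import statistics

data = [1.0, 2.0, 3.0]
mean = statistics.mean(data)

# Population standard deviation (divide by N), i.e. correction=0.
pop_std = statistics.pstdev(data)
# Sample standard deviation (divide by N - 1), i.e. Bessel's correction.
sample_std = statistics.stdev(data)

normalized_pop = [(v - mean) / pop_std for v in data]
normalized_sample = [(v - mean) / sample_std for v in data]
```

With the sample standard deviation the normalized values are -1, 0 and 1, while the population standard deviation gives roughly -1.22, 0 and 1.22.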

That's pretty much it! You can now use any function you need in your projects regardless of the framework you're using 🚀

However, transpiling functions one by one is far from ideal. But don't worry, with `transpile`, you can transpile entire libraries at once and easily bring them into your projects. Let's see how this works by transpiling `kornia`, a widely used computer vision library written in `torch`:

### Any library

In [None]:
import kornia
import requests
import jax.numpy as jnp
import numpy as np
from PIL import Image

Let's get the transpiled library by calling `transpile`.

In [None]:
jax_kornia = ivy.transpile(kornia, source="torch", to="jax")

Now let's get a sample image and preprocess it so that it has the format kornia expects:

In [None]:
url = "http://images.cocodataset.org/train2017/000000000034.jpg"
raw_img = Image.open(requests.get(url, stream=True).raw)
img = jnp.transpose(jnp.array(raw_img), (2, 0, 1))
img = jnp.expand_dims(img, 0) / 255
display(raw_img)

And we can call any function from kornia in `jax`, as simple as that!

In [None]:
out = jax_kornia.enhance.sharpness(img, 10)
type(out)

Finally, let's see if the transformation has been applied correctly:

In [None]:
np_image = np.uint8(np.array(out[0])*255)
display(Image.fromarray(np.transpose(np_image, (1, 2, 0))))

It's worth noting that every operation in the transpiled functions is performed natively in the target framework, which means that gradients can be tracked and the resulting functions are fully differentiable. Even after transpilation, you can still take advantage of the powerful features of your chosen framework.

While transpiling functions and libraries is useful, trainable modules play a critical role in ML and DL. The good news is that Ivy makes it just as easy to transpile modules and models from one framework to another with just one line of code.

### Any model

For the purpose of this demonstration, let's define a very basic CNN block using the Sequential API of `keras`.

In [None]:
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax')
])
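
As a quick check on what this block computes, the output shapes and parameter counts can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults for `Conv2D` (stride 1, `'valid'` padding, bias enabled). `conv2d_valid_out` is a hypothetical helper for this illustration.

```python
def conv2d_valid_out(size, kernel, stride=1):
    """Spatial output size of a convolution with 'valid' (no) padding."""
    return (size - kernel) // stride + 1


height = width = 28
in_channels, filters, kernel = 3, 32, 3

out_h = conv2d_valid_out(height, kernel)   # 26
out_w = conv2d_valid_out(width, kernel)    # 26
flat_features = out_h * out_w * filters    # 26 * 26 * 32 = 21632

conv_params = kernel * kernel * in_channels * filters + filters  # 896
dense_params = flat_features * 10 + 10                           # 216330
total_params = conv_params + dense_params                        # 217226
```

Almost all of the parameters sit in the final dense layer, because flattening the 26 x 26 x 32 feature map yields 21632 inputs.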

The model we just defined is an instance of `tf.keras.Model`. Using `ivy.transpile`, we can effortlessly convert it into a `torch.nn.Module`, for instance.

In [None]:
input_array = tf.random.normal((1, 28, 28, 3))
torch_model = ivy.transpile(model, to="torch", args=(input_array,))

After transpilation, we can pass a `torch` tensor and obtain the expected output. As mentioned previously, all operations are now PyTorch native functions, making them differentiable. Additionally, Ivy automatically converts all parameters of the original model to the new one, allowing you to transpile pre-trained models and fine-tune them in your preferred framework.

In [None]:
isinstance(torch_model, torch.nn.Module)

In [None]:
input_array = torch.rand((1, 28, 28, 3)).to(ivy.default_device(as_native=True))
torch_model.to(ivy.default_device(as_native=True))
output_array = torch_model(input_array)
print(output_array)

While we have only transpiled a simple model for demonstration purposes, we can certainly transpile more complex models as well. Let's take a more complex model from `timm` and see how we can build upon transpiled modules.

In [None]:
import timm

We will only be using the encoder, so we can remove the unnecessary layers by setting `num_classes=0`, and then pass `pretrained=True` to download the pre-trained parameters.

In [None]:
mlp_encoder = timm.create_model("mixer_b16_224", pretrained=True, num_classes=0)

Let's transpile the model to tensorflow with `ivy.transpile` 🔀

In [None]:
noise = torch.randn(1, 3, 224, 224)
tf_mlp_encoder = ivy.transpile(mlp_encoder, to="tensorflow", args=(noise,))

And now let's build a model on top of our pretrained encoder!

In [None]:
class Classifier(tf.keras.Model):
    def __init__(self):
        super(Classifier, self).__init__()
        self.encoder = tf_mlp_encoder
        self.output_dense = tf.keras.layers.Dense(units=1000, activation="softmax")

    def call(self, x):
        x = self.encoder(x)
        return self.output_dense(x)

In [None]:
model = Classifier()

x = tf.random.normal(shape=(1, 3, 224, 224))
ret = model(x)
print(type(ret), ret.shape)

As the encoder now consists of `tensorflow` functions, we can extend the transpiled modules as much as we want, leveraging existing weights and the tools and infrastructure of all frameworks 🚀

Last but not least, let's see how easily we can improve the performance of a model by transpiling a ResNet from Hugging Face from PyTorch to JAX ⬇️

First we need to load the model and its corresponding feature extractor from the `transformers` library.

In [None]:
import jax
from transformers import AutoModel, AutoFeatureExtractor

jax.config.update("jax_enable_x64", False)

arch_name = "ResNet"
checkpoint_name = "microsoft/resnet-50"

feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint_name)
model = AutoModel.from_pretrained(checkpoint_name)

Now let's download a sample image from the COCO dataset and use the feature extractor we've just created to generate the torch tensors we'll be using during tracing.

In [None]:
import requests
from PIL import Image

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = feature_extractor(
    images=image, return_tensors="pt"
)

We can now convert the model from `torch` to `haiku` by simply calling `ivy.transpile()`!

In [None]:
transpiled_graph = ivy.transpile(model, to="haiku", kwargs=inputs)

After transpiling the model, let's see what the improvement in runtime efficiency looks like. For this, we'll compile the original PyTorch model using `torch.compile`:

In [None]:
import torch

inputs = feature_extractor(images=image, return_tensors="pt").to("cuda")

model.to("cuda")

def _f(**kwargs):
    return model(**kwargs)

comp_model = torch.compile(_f)
_ = comp_model(**inputs)


And here is the equivalent compilation of our `haiku` model with `jax.jit`:

In [None]:
import haiku as hk

inputs_jax = feature_extractor(images=image, return_tensors="jax")

def _forward(**kwargs):
    module = transpiled_graph()
    return module(**kwargs).last_hidden_state

_forward = jax.jit(_forward)
rng_key = jax.random.PRNGKey(42)
jax_forward = hk.transform(_forward)
params = jax_forward.init(rng=rng_key, **inputs_jax)

Now that both models are compiled in their corresponding frameworks, let's see how their runtime speeds compare to each other:

In [None]:
%%timeit
_ = comp_model(**inputs)

In [None]:
%%timeit
_ = jax_forward.apply(params, None, **inputs_jax)

As expected, we have made the model significantly faster with just one line of code, getting a ~2x increase in its execution speed 🚀

Finally, as a sanity check, let's load a different image and make sure that the results are the same in both models:

In [None]:
url = "http://images.cocodataset.org/train2017/000000283921.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = feature_extractor(images=image, return_tensors="pt").to("cuda")
inputs_jax = feature_extractor(images=image, return_tensors="jax")
out_torch = comp_model(**inputs)
out_jax = jax_forward.apply(params, None, **inputs_jax)

np.allclose(out_torch.last_hidden_state.detach().cpu().numpy(), out_jax, atol=1e-4)
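
For readers unfamiliar with `np.allclose`: it passes when every element satisfies `|a - b| <= atol + rtol * |b|`, where `rtol` defaults to `1e-5`. The plain-Python sketch below reproduces that criterion for flat lists of floats. `allclose_sketch` is a hypothetical helper for this illustration, not a NumPy function, and it ignores NaN/inf handling and broadcasting. The sample values are made up and are not outputs of the models above.

```python
def allclose_sketch(a, b, rtol=1e-5, atol=1e-8):
    """True if |x - y| <= atol + rtol * |y| for every pair of elements."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


reference = [0.5, 1.0, 2.0]
small_drift = [0.50005, 1.00005, 2.00005]   # off by about 5e-5 everywhere
large_drift = [0.5, 1.0, 2.001]             # last element off by about 1e-3

within_tol = allclose_sketch(small_drift, reference, atol=1e-4)
outside_tol = allclose_sketch(large_drift, reference, atol=1e-4)
```

With `atol=1e-4`, differences of about `5e-5` pass while a difference of about `1e-3` fails, which is the kind of small numerical drift the check above is meant to tolerate between two frameworks.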

That's pretty much it! The results from both models are the same, but we have achieved a solid speed up by using Ivy's transpiler to convert the model to JAX!