# Deploy and run inference with Jina Reranker V2 using an Azure managed application

This notebook demonstrates how to deploy a Jina Reranker V2 model-powered [Azure managed application](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/jinaai.jina-reranker-v2-base-multilingual?tab=Overview) and perform inference with this application.

## Deploy the managed application

To deploy your Azure managed application, start by consulting the [official deployment guide](https://learn.microsoft.com/en-us/azure/azure-resource-manager/managed-applications/deploy-marketplace-app-quickstart). This document provides comprehensive steps for the deployment process.

In the Basics tab of the deployment setup, you will need to provide several details about your deployment.

You can customize the VM used. For certain VM types, you might need to request a quota increase before you can deploy them. The recommended VM is [Standard_NC4as_T4_v3](https://learn.microsoft.com/en-us/azure/virtual-machines/nct4-v3-series), which has one NVIDIA T4 GPU with 16 GB of GPU memory.

<img src="images/deploy_reranking_v2_app.png" width="50%" height="50%">

Once the deployment of the managed application is complete, go to the resource group created for your deployment (for instance, `mrg-jina-reranker-v2-base-multilingual-preview-20240802152633`, as shown in the screenshot above) to verify the resources that were created.

Within this resource group, look for the `jina-inference-vm`. Here, you'll find the DNS Name through which you can access your application. In this example, the application is accessible via `testv2offer.eastus.cloudapp.azure.com`.

Note that the application cannot serve traffic immediately after the deployment finishes. Separate processes still run on the machine to install drivers and dependencies, and the VM reboots as part of this setup. Please allow at least 15 minutes before sending requests.
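The waiting period can be automated with a small polling loop. The sketch below is only an illustration: `wait_for_port` is a hypothetical helper (not part of the managed application or any Jina tooling), and it assumes the application listens on port 8080, the port used in the inference example in this notebook. An open TCP port only shows that something is listening; it does not guarantee that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port=8080, timeout_s=1200.0, interval_s=15.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: the default port and timing values are assumptions.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=5):
                return True
        except OSError:
            # Connection refused, DNS failure, or timeout: the VM is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call (commented out because it blocks until the VM is reachable):
# wait_for_port("testv2offer.eastus.cloudapp.azure.com")
```

If the function returns `False`, the VM may still be installing drivers or rebooting, so retrying later is a reasonable next step.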

## Perform inference with the managed application

The Python example below demonstrates how to perform real-time inference using the DNS of the deployed Jina Reranker V2 model-powered managed application.

```python
import requests


def invoke_endpoint():
    # Replace the placeholders with your DNS prefix and region.
    # With the example above, the URL is
    # "http://testv2offer.eastus.cloudapp.azure.com:8080/invocations".
    url = "http://<Insert here your DNS prefix>.<Insert here your region>.cloudapp.azure.com:8080/invocations"
    headers = {"Content-Type": "application/json"}
    json_data = {
        "data": {
            "documents": [
                {"text": "the dog is in my house"},
                {"text": "he likes dog"},
                {"text": "hello world"},
            ],
            "query": "where is the dog",
            "top_n": 2,
        }
    }

    # `json=` serializes the payload; the timeout avoids hanging on an unready VM.
    response = requests.post(url, headers=headers, json=json_data, timeout=60)
    response.raise_for_status()  # Surface HTTP errors before parsing the body
    print(response.json())


invoke_endpoint()
```
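To reuse the request with different queries and documents, the URL and payload can be built by small helpers. The sketch below assumes the URL pattern and payload shape shown in the example above; `build_invocation_url` and `build_rerank_payload` are hypothetical names introduced here for illustration and are not part of any Jina or Azure SDK.

```python
def build_invocation_url(dns_prefix, region, port=8080):
    """Build the endpoint URL from the VM's DNS prefix and Azure region."""
    return f"http://{dns_prefix}.{region}.cloudapp.azure.com:{port}/invocations"


def build_rerank_payload(query, texts, top_n):
    """Wrap plain strings in the request shape used by the example above."""
    return {
        "data": {
            "documents": [{"text": text} for text in texts],
            "query": query,
            "top_n": top_n,
        }
    }


url = build_invocation_url("testv2offer", "eastus")
payload = build_rerank_payload(
    "where is the dog",
    ["the dog is in my house", "he likes dog", "hello world"],
    top_n=2,
)
print(url)  # prints "http://testv2offer.eastus.cloudapp.azure.com:8080/invocations"
```

The resulting `url` and `payload` can then be passed to `requests.post(url, json=payload, timeout=60)` in the same way as in the example above.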