diff --git a/supporting-blog-content/re-ranking-elasticsearch-hosted/re-ranking-elasticsearch-hosted.ipynb b/supporting-blog-content/re-ranking-elasticsearch-hosted/re-ranking-elasticsearch-hosted.ipynb index 281f8a47..84dcc175 100644 --- a/supporting-blog-content/re-ranking-elasticsearch-hosted/re-ranking-elasticsearch-hosted.ipynb +++ b/supporting-blog-content/re-ranking-elasticsearch-hosted/re-ranking-elasticsearch-hosted.ipynb @@ -1,819 +1,821 @@ { - "nbformat": 4, - "nbformat_minor": 0, - "metadata": { - "colab": { - "provenance": [] - }, - "kernelspec": { - "name": "python3", - "display_name": "Python 3" - }, - "language_info": { - "name": "python" - } - }, - "cells": [ - { - "cell_type": "markdown", - "source": [ - "*Reranking with a locally hosted reranker model from HuggingFace*" - ], - "metadata": { - "id": "bW9q8qD_bPhY" - } - }, - { - "cell_type": "markdown", - "source": [ - "# Setup the notebook" - ], - "metadata": { - "id": "BecBOzyDbWik" - } - }, - { - "cell_type": "markdown", - "source": [ - "## Install required libs" - ], - "metadata": { - "id": "6ayhDP72bZAe" - } - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "2Xz9uWQFbNkH" - }, - "outputs": [], - "source": [ - "!pip install -qqU elasticsearch\n", - "!pip install -qqU eland[pytorch]\n", - "!pip install -qqU datasets" - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Import the required python libraries" - ], - "metadata": { - "id": "LgHQaJh0bmJQ" - } - }, - { - "cell_type": "code", - "source": [ - "import os\n", - "from elasticsearch import Elasticsearch, helpers, exceptions\n", - "from urllib.request import urlopen\n", - "from getpass import getpass\n", - "import json\n", - "import time\n", - "from datasets import load_dataset\n", - "import pandas as pd" - ], - "metadata": { - "id": "CsL466H0bjNX" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "## Create an Elasticsearch Python client\n", - "\n" - ], - "metadata": { - "id": "gsQ4XIpkbpd4" - } - }, - { - "cell_type": "markdown", - "source": [ - "## Free Trial\n", - "If you don't have an Elasticsearch cluster, or what one to test out. Head over to [cloud.elastic.co](https://cloud.elastic.co/registration?onboarding_token=search&cta=cloud-registration&tech=trial&plcmt=article%20content&pg=search-labs) and sign up. You can sign up and have a serverless project up and running in only a few mintues!" 
- ], - "metadata": { - "id": "M_JppGte1Wc5" - } - }, - { - "cell_type": "markdown", - "source": [ - "We are using an Elastic Cloud cloud_id and deployment (cluster) API key.\n", - "\n", - "[See this guide](https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud) for finding the `cloud_id` and creating an `api_key`" - ], - "metadata": { - "id": "gdsEjwhV1tr5" - } - }, - { - "cell_type": "code", - "source": [ - "cloud_id = getpass(prompt=\"Enter your Elasticsearch Cloud ID: \")\n", - "api_key = getpass(prompt=\"Enter your Elasticsearch API key: \")\n", - "\n", - "\n", - "es = Elasticsearch(cloud_id=cloud_id, api_key=api_key)\n", - "\n", - "try:\n", - " es.info()\n", - " print(\"Successfully connected to Elasticsearch!\")\n", - "except exceptions.ConnectionError as e:\n", - " print(f\"Error connecting to Elasticsearch: {e}\")" - ], - "metadata": { - "id": "UY5WCB0HUVTb" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "# Ready Elasticsearch" - ], - "metadata": { - "id": "jQdzhNbB_e3n" - } - }, - { - "cell_type": "markdown", - "source": [ - "## Hugging Face Reranking Model\n", - "Run this cell to:\n", - "- Use Eland's `eland_import_hub_model` command to upload the reranking model to Elasticsearch.\n", - "\n", - "For this example we've chosen the [`cross-encoder/ms-marco-MiniLM-L-6-v2`](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) text similarity model.\n", - "

\n", - "**Note**:\n", - "While we are importing the model for use as a reranker, Eland and Elasticsearch do not have a dedicated rerank task type, so we still use `text_similarity`" - ], - "metadata": { - "id": "5bsLLnqCfNKk" - } - }, - { - "cell_type": "code", - "source": [ - "model_id = \"cross-encoder/ms-marco-MiniLM-L-6-v2\"\n", - "\n", - "!eland_import_hub_model \\\n", - " --cloud-id $cloud_id \\\n", - " --es-api-key $api_key \\\n", - " --hub-model-id $model_id \\\n", - " --task-type text_similarity" - ], - "metadata": { - "id": "J2MTEYrUfk9R" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "## Create Inference Endpoint\n", - "Run this cell to:\n", - "- Create an inference Endpoint\n", - "- Deploy the reranking model we impoted in the previous section\n", - "We need to create an endpoint queries can use for reranking\n", - "\n", - "Key points about the `model_config`\n", - "- `service` - in this case `elasticsearch` will tell the inference API to use a locally hosted (in Elasticsearch) model\n", - "- `num_allocations` sets the number of allocations to 1\n", - " - Allocations are independent units of work for NLP tasks. Scaling this allows for an increase in concurrent throughput\n", - "- `num_threads` - sets the number of threads per allocation to 1\n", - " - Threads per allocation affect the number of threads used by each allocation during inference. Scaling this generally increased the speed of inference requests (to a point).\n", - "- `model_id` - This is the id of the model as it is named in Elasticsearch\n", - "\n" - ], - "metadata": { - "id": "-rrQV6SAgWz8" - } - }, - { - "cell_type": "code", - "source": [ - "model_config = {\n", - " \"service\": \"elasticsearch\",\n", - " \"service_settings\": {\n", - " \"num_allocations\": 1,\n", - " \"num_threads\": 1,\n", - " \"model_id\": \"cross-encoder__ms-marco-minilm-l-6-v2\",\n", - " },\n", - " \"task_settings\": {\"return_documents\": True},\n", - "}\n", - "\n", - "inference_id = \"semantic-reranking\"\n", - "\n", - "create_endpoint = es.inference.put(\n", - " inference_id=inference_id, task_type=\"rerank\", body=model_config\n", - ")\n", - "\n", - "create_endpoint.body" - ], - "metadata": { - "id": "Abu084BYgWCE", + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { "colab": { - "base_uri": "https://localhost:8080/" + "provenance": [] }, - "outputId": "d58cc940-281e-4d56-d6e6-040e4881e78a" - }, - "execution_count": null, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "{'inference_id': 'semantic-reranking',\n", - " 'task_type': 'rerank',\n", - " 'service': 'elasticsearch',\n", - " 'service_settings': {'num_allocations': 1,\n", - " 'num_threads': 1,\n", - " 'model_id': 'cross-encoder__ms-marco-minilm-l-6-v2'},\n", - " 'task_settings': {'return_documents': True}}" - ] - }, - "metadata": {}, - "execution_count": 6 + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" } - ] - }, - { - "cell_type": "markdown", - "source": [ - "### Verify it was created\n", - "\n", - "- Run the two cells in this section to verify:\n", - "- The Inference Endpoint has been completed\n", - "- The model has been deployed\n", - "\n", - "You should see JSON output with information about the semantic endpoint" - ], - "metadata": { - "id": "X8rQXMrHhMkS" - } }, - { - "cell_type": "code", - "source": [ - "check_endpoint = es.inference.get(\n", - " inference_id=inference_id,\n", - ")\n", - "\n", - 
"check_endpoint.body" - ], - "metadata": { - "id": "n3Yk7rgYhP-N", - "colab": { - "base_uri": "https://localhost:8080/" + "cells": [ + { + "cell_type": "markdown", + "source": [ + "*Reranking with a locally hosted reranker model from HuggingFace*" + ], + "metadata": { + "id": "bW9q8qD_bPhY" + } }, - "outputId": "d9e68225-5796-411e-964a-6db3be5541aa" - }, - "execution_count": null, - "outputs": [ { - "output_type": "execute_result", - "data": { - "text/plain": [ - "{'endpoints': [{'inference_id': 'semantic-reranking',\n", - " 'task_type': 'rerank',\n", - " 'service': 'elasticsearch',\n", - " 'service_settings': {'num_allocations': 1,\n", - " 'num_threads': 1,\n", - " 'model_id': 'cross-encoder__ms-marco-minilm-l-6-v2'},\n", - " 'task_settings': {'return_documents': True}}]}" + "cell_type": "markdown", + "source": [ + "# Setup the notebook" + ], + "metadata": { + "id": "BecBOzyDbWik" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Install required libs" + ], + "metadata": { + "id": "6ayhDP72bZAe" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "2Xz9uWQFbNkH" + }, + "outputs": [], + "source": [ + "!pip install -qqU elasticsearch\n", + "!pip install -qqU eland[pytorch]\n", + "!pip install -qqU datasets" ] - }, - "metadata": {}, - "execution_count": 7 - } - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Create the index mapping\n", - "\n", - "We are going to index the `title` and `abstract` from the dataset. " - ], - "metadata": { - "id": "4vqimyNWAhWb" - } - }, - { - "cell_type": "code", - "source": [ - "index_name = \"arxiv-papers\"\n", - "\n", - "index_mapping = {\n", - " \"mappings\": {\n", - " \"properties\": {\"title\": {\"type\": \"text\"}, \"abstract\": {\"type\": \"text\"}}\n", - " }\n", - "}\n", - "\n", - "\n", - "try:\n", - " es.indices.create(index=index_name, body=index_mapping)\n", - " print(f\"Index '{index_name}' created successfully.\")\n", - "except exceptions.RequestError as e:\n", - " if e.error == \"resource_already_exists_exception\":\n", - " print(f\"Index '{index_name}' already exists.\")\n", - " else:\n", - " print(f\"Error creating index '{index_name}': {e}\")" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" }, - "outputId": "4bc9ba0b-9be3-410a-d1d6-f1da04bbfec7", - "id": "DPADF_7ytTmR" - }, - "execution_count": null, - "outputs": [ { - "output_type": "stream", - "name": "stdout", - "text": [ - "Index 'arxiv-papers' created successfully.\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Ready the dataset\n", - "We are going to use the [CShorten/ML-ArXiv-Papers](https://huggingface.co/datasets/CShorten/ML-ArXiv-Papers) dataset." - ], - "metadata": { - "id": "FqQmaT5P-Nhx" - } - }, - { - "cell_type": "markdown", - "source": [ - "## Download Dataset\n", - "**Note** You may get a warning *The secret `HF_TOKEN` does not exist in your Colab secrets*.\n", - "\n", - "You can safely ignore this." 
- ], - "metadata": { - "id": "aN0dbYO7oB47" - } - }, - { - "cell_type": "code", - "source": [ - "dataset = load_dataset(\"CShorten/ML-ArXiv-Papers\")" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" + "cell_type": "markdown", + "source": [ + "## Import the required python libraries" + ], + "metadata": { + "id": "LgHQaJh0bmJQ" + } }, - "id": "IVnpj5bBoEBL", - "outputId": "bc6371d9-d66f-482c-95f8-8cdb89e4f0ef" - }, - "execution_count": null, - "outputs": [ { - "output_type": "stream", - "name": "stderr", - "text": [ - "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:89: UserWarning: \n", - "The secret `HF_TOKEN` does not exist in your Colab secrets.\n", - "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n", - "You will be able to reuse this secret in all of your notebooks.\n", - "Please note that authentication is recommended but still optional to access public models or datasets.\n", - " warnings.warn(\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "source": [ - "### Index into Elasticsearch\n", - "\n", - "We will loop through the dataset and send batches of rows to Elasticsearch\n", - "- This may take a couple minutes depending on your cluster sizing." - ], - "metadata": { - "id": "GQxDITCpAKWb" - } - }, - { - "cell_type": "code", - "source": [ - "def bulk_insert_elasticsearch(dataset, index_name, chunk_size=1000):\n", - " actions = []\n", - " for record in dataset:\n", - " action = {\n", - " \"_index\": index_name,\n", - " \"_source\": {\"title\": record[\"title\"], \"abstract\": record[\"abstract\"]},\n", - " }\n", - " actions.append(action)\n", - "\n", - " if len(actions) == chunk_size:\n", - " helpers.bulk(es, actions)\n", - " actions = []\n", - "\n", - " if actions:\n", - " helpers.bulk(es, actions)\n", - "\n", - "\n", - "bulk_insert_elasticsearch(dataset[\"train\"], index_name)" - ], - "metadata": { - "id": "tDZ0qEbW-ozW" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "# Query with Reranking\n", - "\n", - "This containes a `text_similarity_reranker` retriever which:\n", - "1. Uses a Standard Retriever to :\n", - " 1. Perform a lexical query against `title field\n", - "2. Perform a reranking:\n", - " 1. Taks as input the top 100 results from the previous search\n", - " - `\"rank_window_size\": 100`\n", - " 2. Taks as input the query\n", - " - `\"inference_text\": query`\n", - " 3. 
Uses our previously created reranking API and model\n" - ], - "metadata": { - "id": "2bwvzLfRjJ2n" - } - }, - { - "cell_type": "code", - "source": [ - "query = \"sparse vector embedding\"\n", - "\n", - "# Query scored from score\n", - "response_scored = es.search(\n", - " index=\"arxiv-papers\",\n", - " body={\n", - " \"size\": 10,\n", - " \"retriever\": {\"standard\": {\"query\": {\"match\": {\"title\": query}}}},\n", - " \"fields\": [\"title\", \"abstract\"],\n", - " \"_source\": False,\n", - " },\n", - ")\n", - "\n", - "# Query with Semantic Reranker\n", - "response_reranked = es.search(\n", - " index=\"arxiv-papers\",\n", - " body={\n", - " \"size\": 10,\n", - " \"retriever\": {\n", - " \"text_similarity_reranker\": {\n", - " \"retriever\": {\"standard\": {\"query\": {\"match\": {\"title\": query}}}},\n", - " \"field\": \"abstract\",\n", - " \"inference_id\": \"semantic-reranking\",\n", - " \"inference_text\": query,\n", - " \"rank_window_size\": 100,\n", - " }\n", - " },\n", - " \"fields\": [\"title\", \"abstract\"],\n", - " \"_source\": False,\n", - " },\n", - ")" - ], - "metadata": { - "id": "HWXQBS35jQ3n" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "## Print the table comparing the scored and reranked results" - ], - "metadata": { - "id": "Hnam80Irbj6a" - } - }, - { - "cell_type": "code", - "source": [ - "titles_scored = [\n", - " paper[\"fields\"][\"title\"][0] for paper in response_scored.body[\"hits\"][\"hits\"]\n", - "]\n", - "titles_reranked = [\n", - " paper[\"fields\"][\"title\"][0] for paper in response_reranked.body[\"hits\"][\"hits\"]\n", - "]\n", - "\n", - "# Creating a DataFrame\n", - "df = pd.DataFrame(\n", - " {\"Scored Results\": titles_scored, \"Reranked Results\": titles_reranked}\n", - ")\n", - "\n", - "df_styled = df.style.set_properties(**{\"text-align\": \"left\"}).set_caption(\n", - " f\"Comparison of Scored and Semantic Reranked Results - Query: '{query}'\"\n", - ")\n", - "\n", - "# Display the table\n", - "df_styled" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 415 + "cells": [ + { + "cell_type": "markdown", + "source": [ + "*Reranking with a locally hosted reranker model from HuggingFace*" + ], + "metadata": { + "id": "bW9q8qD_bPhY" + } + }, + { + "cell_type": "markdown", + "source": [ + "# Setup the notebook" + ], + "metadata": { + "id": "BecBOzyDbWik" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Install required libs" + ], + "metadata": { + "id": "6ayhDP72bZAe" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "2Xz9uWQFbNkH" + }, + "outputs": [], + "source": [ + "!pip install -qqU elasticsearch\n", + "!pip install -qqU eland[pytorch]\n", + "!pip install -qqU datasets" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Import the required python libraries" + ], + "metadata": { + "id": "LgHQaJh0bmJQ" + } + }, + { + "cell_type": "code", + "source": [ + "import os\n", + "from elasticsearch import Elasticsearch, helpers, exceptions\n", + "from urllib.request import urlopen\n", + "from getpass import getpass\n", + "import json\n", + "import time\n", + "from datasets import load_dataset\n", + "import pandas as pd" + ], + "metadata": { + "id": "CsL466H0bjNX" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Create an Elasticsearch Python client\n", + "\n" + ], + "metadata": { + "id": "gsQ4XIpkbpd4" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Free Trial\n", + "If you don't have an Elasticsearch cluster, or want to test one out, head over to [cloud.elastic.co](https://cloud.elastic.co/registration?onboarding_token=search&cta=cloud-registration&tech=trial&plcmt=article%20content&pg=search-labs) and sign up. You can have a serverless project up and running in only a few minutes!" 
+ ], + "metadata": { + "id": "M_JppGte1Wc5" + } + }, + { + "cell_type": "markdown", + "source": [ + "We are using an Elastic Cloud cloud_id and deployment (cluster) API key.\n", + "\n", + "[See this guide](https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud) for finding the `cloud_id` and creating an `api_key`" + ], + "metadata": { + "id": "gdsEjwhV1tr5" + } + }, + { + "cell_type": "code", + "source": [ + "cloud_id = getpass(prompt=\"Enter your Elasticsearch Cloud ID: \")\n", + "api_key = getpass(prompt=\"Enter your Elasticsearch API key: \")\n", + "\n", + "\n", + "es = Elasticsearch(cloud_id=cloud_id, api_key=api_key)\n", + "\n", + "try:\n", + " es.info()\n", + " print(\"Successfully connected to Elasticsearch!\")\n", + "except exceptions.ConnectionError as e:\n", + " print(f\"Error connecting to Elasticsearch: {e}\")" + ], + "metadata": { + "id": "UY5WCB0HUVTb" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# Ready Elasticsearch" + ], + "metadata": { + "id": "jQdzhNbB_e3n" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Hugging Face Reranking Model\n", + "Run this cell to:\n", + "- Use Eland's `eland_import_hub_model` command to upload the reranking model to Elasticsearch.\n", + "\n", + "For this example we've chosen the [`cross-encoder/ms-marco-MiniLM-L-6-v2`](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) text similarity model.\n", + "

\n", + "**Note**:\n", + "While we are importing the model for use as a reranker, Eland and Elasticsearch do not have a dedicated rerank task type, so we still use `text_similarity`" + ], + "metadata": { + "id": "5bsLLnqCfNKk" + } + }, + { + "cell_type": "code", + "source": [ + "model_id = \"cross-encoder/ms-marco-MiniLM-L-6-v2\"\n", + "\n", + "!eland_import_hub_model \\\n", + " --cloud-id $cloud_id \\\n", + " --es-api-key $api_key \\\n", + " --hub-model-id $model_id \\\n", + " --task-type text_similarity" + ], + "metadata": { + "id": "J2MTEYrUfk9R" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Create Inference Endpoint\n", + "Run this cell to:\n", + "- Create an inference Endpoint\n", + "- Deploy the reranking model we impoted in the previous section\n", + "We need to create an endpoint queries can use for reranking\n", + "\n", + "Key points about the `model_config`\n", + "- `service` - in this case `elasticsearch` will tell the inference API to use a locally hosted (in Elasticsearch) model\n", + "- `num_allocations` sets the number of allocations to 1\n", + " - Allocations are independent units of work for NLP tasks. Scaling this allows for an increase in concurrent throughput\n", + "- `num_threads` - sets the number of threads per allocation to 1\n", + " - Threads per allocation affect the number of threads used by each allocation during inference. Scaling this generally increased the speed of inference requests (to a point).\n", + "- `model_id` - This is the id of the model as it is named in Elasticsearch\n", + "\n" + ], + "metadata": { + "id": "-rrQV6SAgWz8" + } }, - "id": "yTTNYCYcBtll", - "outputId": "0f0af538-13fd-4e62-e3d2-1dd185689904" - }, - "execution_count": null, - "outputs": [ { - "output_type": "execute_result", - "data": { - "text/plain": [ - "" + "cell_type": "code", + "source": [ + "model_config = {\n", + " \"service\": \"elasticsearch\",\n", + " \"service_settings\": {\n", + " \"num_allocations\": 1,\n", + " \"num_threads\": 1,\n", + " \"model_id\": \"cross-encoder__ms-marco-minilm-l-6-v2\",\n", + " },\n", + " \"task_settings\": {\"return_documents\": True},\n", + "}\n", + "\n", + "inference_id = \"semantic-reranking\"\n", + "\n", + "create_endpoint = es.inference.put(\n", + " inference_id=inference_id, task_type=\"rerank\", body=model_config\n", + ")\n", + "\n", + "create_endpoint.body" ], - "text/html": [ - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
Comparison of Scored and Semantic Reranked Results - Query: 'sparse vector embedding'
 Scored ResultsReranked Results
0Compact Speaker Embedding: lrx-vectorScaling Up Sparse Support Vector Machines by Simultaneous Feature and\n", - " Sample Reduction
1Quantum Sparse Support Vector MachinesSpaceland Embedding of Sparse Stochastic Graphs
2Sparse Support Vector Infinite PushElliptical Ordinal Embedding
3The Sparse Vector Technique, RevisitedMinimum-Distortion Embedding
4L-Vector: Neural Label Embedding for Domain AdaptationFree Gap Information from the Differentially Private Sparse Vector and\n", - " Noisy Max Mechanisms
5Spaceland Embedding of Sparse Stochastic GraphsInterpolated Discretized Embedding of Single Vectors and Vector Pairs\n", - " for Classification, Metric Learning and Distance Approximation
6Sparse Signal Recovery in the Presence of Intra-Vector and Inter-Vector\n", - " CorrelationAttention Word Embedding
7Stable Sparse Subspace Embedding for Dimensionality ReductionBinary Speaker Embedding
8Auto-weighted Mutli-view Sparse Reconstructive EmbeddingNetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization
9Embedding Words in Non-Vector Space with Unsupervised Graph LearningEstimating Vector Fields on Manifolds and the Embedding of Directed\n", - " Graphs
\n" + "metadata": { + "id": "Abu084BYgWCE", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "d58cc940-281e-4d56-d6e6-040e4881e78a" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{'inference_id': 'semantic-reranking',\n", + " 'task_type': 'rerank',\n", + " 'service': 'elasticsearch',\n", + " 'service_settings': {'num_allocations': 1,\n", + " 'num_threads': 1,\n", + " 'model_id': 'cross-encoder__ms-marco-minilm-l-6-v2'},\n", + " 'task_settings': {'return_documents': True}}" + ] + }, + "metadata": {}, + "execution_count": 6 + } ] - }, - "metadata": {}, - "execution_count": 17 - } - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Print out Title and Abstract\n", - "This will print the title and the abstract for the top 10 results after semantic reranking." - ], - "metadata": { - "id": "A0HyNZoWyeun" - } - }, - { - "cell_type": "code", - "source": [ - "for paper in response_scored.body[\"hits\"][\"hits\"]:\n", - " print(\n", - " f\"Title {paper['fields']['title'][0]} \\n Abstract: {paper['fields']['abstract'][0]}\"\n", - " )" - ], - "metadata": { - "id": "4ZEx-46rn3in", - "colab": { - "base_uri": "https://localhost:8080/" }, - "outputId": "e83318ad-ca42-4aa7-98d4-37c4428eb70a" - }, - "execution_count": null, - "outputs": [ { - "output_type": "stream", - "name": "stdout", - "text": [ - "Title Compact Speaker Embedding: lrx-vector \n", - " Abstract: Deep neural networks (DNN) have recently been widely used in speaker\n", - "recognition systems, achieving state-of-the-art performance on various\n", - "benchmarks. The x-vector architecture is especially popular in this research\n", - "community, due to its excellent performance and manageable computational\n", - "complexity. In this paper, we present the lrx-vector system, which is the\n", - "low-rank factorized version of the x-vector embedding network. The primary\n", - "objective of this topology is to further reduce the memory requirement of the\n", - "speaker recognition system. We discuss the deployment of knowledge distillation\n", - "for training the lrx-vector system and compare against low-rank factorization\n", - "with SVD. On the VOiCES 2019 far-field corpus we were able to reduce the\n", - "weights by 28% compared to the full-rank x-vector system while keeping the\n", - "recognition rate constant (1.83% EER).\n", - "\n", - "Title Quantum Sparse Support Vector Machines \n", - " Abstract: We analyze the computational complexity of Quantum Sparse Support Vector\n", - "Machine, a linear classifier that minimizes the hinge loss and the $L_1$ norm\n", - "of the feature weights vector and relies on a quantum linear programming solver\n", - "instead of a classical solver. Sparse SVM leads to sparse models that use only\n", - "a small fraction of the input features in making decisions, and is especially\n", - "useful when the total number of features, $p$, approaches or exceeds the number\n", - "of training samples, $m$. We prove a $\\Omega(m)$ worst-case lower bound for\n", - "computational complexity of any quantum training algorithm relying on black-box\n", - "access to training samples; quantum sparse SVM has at least linear worst-case\n", - "complexity. 
However, we prove that there are realistic scenarios in which a\n", - "sparse linear classifier is expected to have high accuracy, and can be trained\n", - "in sublinear time in terms of both the number of training samples and the\n", - "number of features.\n", - "\n", - "Title Sparse Support Vector Infinite Push \n", - " Abstract: In this paper, we address the problem of embedded feature selection for\n", - "ranking on top of the list problems. We pose this problem as a regularized\n", - "empirical risk minimization with $p$-norm push loss function ($p=\\infty$) and\n", - "sparsity inducing regularizers. We leverage the issues related to this\n", - "challenging optimization problem by considering an alternating direction method\n", - "of multipliers algorithm which is built upon proximal operators of the loss\n", - "function and the regularizer. Our main technical contribution is thus to\n", - "provide a numerical scheme for computing the infinite push loss function\n", - "proximal operator. Experimental results on toy, DNA microarray and BCI problems\n", - "show how our novel algorithm compares favorably to competitors for ranking on\n", - "top while using fewer variables in the scoring function.\n", - "\n", - "Title The Sparse Vector Technique, Revisited \n", - " Abstract: We revisit one of the most basic and widely applicable techniques in the\n", - "literature of differential privacy - the sparse vector technique [Dwork et al.,\n", - "STOC 2009]. This simple algorithm privately tests whether the value of a given\n", - "query on a database is close to what we expect it to be. It allows to ask an\n", - "unbounded number of queries as long as the answer is close to what we expect,\n", - "and halts following the first query for which this is not the case.\n", - " We suggest an alternative, equally simple, algorithm that can continue\n", - "testing queries as long as any single individual does not contribute to the\n", - "answer of too many queries whose answer deviates substantially form what we\n", - "expect. Our analysis is subtle and some of its ingredients may be more widely\n", - "applicable. In some cases our new algorithm allows to privately extract much\n", - "more information from the database than the original.\n", - " We demonstrate this by applying our algorithm to the shifting heavy-hitters\n", - "problem: On every time step, each of $n$ users gets a new input, and the task\n", - "is to privately identify all the current heavy-hitters. That is, on time step\n", - "$i$, the goal is to identify all data elements $x$ such that many of the users\n", - "have $x$ as their current input. We present an algorithm for this problem with\n", - "improved error guarantees over what can be obtained using existing techniques.\n", - "Specifically, the error of our algorithm depends on the maximal number of times\n", - "that a single user holds a heavy-hitter as input, rather than the total number\n", - "of times in which a heavy-hitter exists.\n", - "\n", - "Title L-Vector: Neural Label Embedding for Domain Adaptation \n", - " Abstract: We propose a novel neural label embedding (NLE) scheme for the domain\n", - "adaptation of a deep neural network (DNN) acoustic model with unpaired data\n", - "samples from source and target domains. With NLE method, we distill the\n", - "knowledge from a powerful source-domain DNN into a dictionary of label\n", - "embeddings, or l-vectors, one for each senone class. 
Each l-vector is a\n", - "representation of the senone-specific output distributions of the source-domain\n", - "DNN and is learned to minimize the average L2, Kullback-Leibler (KL) or\n", - "symmetric KL distance to the output vectors with the same label through simple\n", - "averaging or standard back-propagation. During adaptation, the l-vectors serve\n", - "as the soft targets to train the target-domain model with cross-entropy loss.\n", - "Without parallel data constraint as in the teacher-student learning, NLE is\n", - "specially suited for the situation where the paired target-domain data cannot\n", - "be simulated from the source-domain data. We adapt a 6400 hours\n", - "multi-conditional US English acoustic model to each of the 9 accented English\n", - "(80 to 830 hours) and kids' speech (80 hours). NLE achieves up to 14.1%\n", - "relative word error rate reduction over direct re-training with one-hot labels.\n", - "\n", - "Title Spaceland Embedding of Sparse Stochastic Graphs \n", - " Abstract: We introduce a nonlinear method for directly embedding large, sparse,\n", - "stochastic graphs into low-dimensional spaces, without requiring vertex\n", - "features to reside in, or be transformed into, a metric space. Graph data and\n", - "models are prevalent in real-world applications. Direct graph embedding is\n", - "fundamental to many graph analysis tasks, in addition to graph visualization.\n", - "We name the novel approach SG-t-SNE, as it is inspired by and builds upon the\n", - "core principle of t-SNE, a widely used method for nonlinear dimensionality\n", - "reduction and data visualization. We also introduce t-SNE-$\\Pi$, a\n", - "high-performance software for 2D, 3D embedding of large sparse graphs on\n", - "personal computers with superior efficiency. It empowers SG-t-SNE with modern\n", - "computing techniques for exploiting in tandem both matrix structures and memory\n", - "architectures. We present elucidating embedding results on one synthetic graph\n", - "and four real-world networks.\n", - "\n", - "Title Sparse Signal Recovery in the Presence of Intra-Vector and Inter-Vector\n", - " Correlation \n", - " Abstract: This work discusses the problem of sparse signal recovery when there is\n", - "correlation among the values of non-zero entries. We examine intra-vector\n", - "correlation in the context of the block sparse model and inter-vector\n", - "correlation in the context of the multiple measurement vector model, as well as\n", - "their combination. Algorithms based on the sparse Bayesian learning are\n", - "presented and the benefits of incorporating correlation at the algorithm level\n", - "are discussed. The impact of correlation on the limits of support recovery is\n", - "also discussed highlighting the different impact intra-vector and inter-vector\n", - "correlations have on such limits.\n", - "\n", - "Title Stable Sparse Subspace Embedding for Dimensionality Reduction \n", - " Abstract: Sparse random projection (RP) is a popular tool for dimensionality reduction\n", - "that shows promising performance with low computational complexity. However, in\n", - "the existing sparse RP matrices, the positions of non-zero entries are usually\n", - "randomly selected. Although they adopt uniform sampling with replacement, due\n", - "to large sampling variance, the number of non-zeros is uneven among rows of the\n", - "projection matrix which is generated in one trial, and more data information\n", - "may be lost after dimension reduction. 
To break this bottleneck, based on\n", - "random sampling without replacement in statistics, this paper builds a stable\n", - "sparse subspace embedded matrix (S-SSE), in which non-zeros are uniformly\n", - "distributed. It is proved that the S-SSE is stabler than the existing matrix,\n", - "and it can maintain Euclidean distance between points well after dimension\n", - "reduction. Our empirical studies corroborate our theoretical findings and\n", - "demonstrate that our approach can indeed achieve satisfactory performance.\n", - "\n", - "Title Auto-weighted Mutli-view Sparse Reconstructive Embedding \n", - " Abstract: With the development of multimedia era, multi-view data is generated in\n", - "various fields. Contrast with those single-view data, multi-view data brings\n", - "more useful information and should be carefully excavated. Therefore, it is\n", - "essential to fully exploit the complementary information embedded in multiple\n", - "views to enhance the performances of many tasks. Especially for those\n", - "high-dimensional data, how to develop a multi-view dimension reduction\n", - "algorithm to obtain the low-dimensional representations is of vital importance\n", - "but chanllenging. In this paper, we propose a novel multi-view dimensional\n", - "reduction algorithm named Auto-weighted Mutli-view Sparse Reconstructive\n", - "Embedding (AMSRE) to deal with this problem. AMSRE fully exploits the sparse\n", - "reconstructive correlations between features from multiple views. Furthermore,\n", - "it is equipped with an auto-weighted technique to treat multiple views\n", - "discriminatively according to their contributions. Various experiments have\n", - "verified the excellent performances of the proposed AMSRE.\n", - "\n", - "Title Embedding Words in Non-Vector Space with Unsupervised Graph Learning \n", - " Abstract: It has become a de-facto standard to represent words as elements of a vector\n", - "space (word2vec, GloVe). While this approach is convenient, it is unnatural for\n", - "language: words form a graph with a latent hierarchical structure, and this\n", - "structure has to be revealed and encoded by word embeddings. We introduce\n", - "GraphGlove: unsupervised graph word representations which are learned\n", - "end-to-end. In our setting, each word is a node in a weighted graph and the\n", - "distance between words is the shortest path distance between the corresponding\n", - "nodes. We adopt a recent method learning a representation of data in the form\n", - "of a differentiable weighted graph and use it to modify the GloVe training\n", - "algorithm. We show that our graph-based representations substantially\n", - "outperform vector-based methods on word similarity and analogy tasks. 
Our\n", - "analysis reveals that the structure of the learned graphs is hierarchical and\n", - "similar to that of WordNet, the geometry is highly non-trivial and contains\n", - "subgraphs with different local topology.\n", - "\n" - ] + "cell_type": "markdown", + "source": [ + "### Verify it was created\n", + "\n", + "- Run the two cells in this section to verify:\n", + "- The Inference Endpoint has been completed\n", + "- The model has been deployed\n", + "\n", + "You should see JSON output with information about the semantic endpoint" + ], + "metadata": { + "id": "X8rQXMrHhMkS" + } + }, + { + "cell_type": "code", + "source": [ + "check_endpoint = es.inference.get(\n", + " inference_id=inference_id,\n", + ")\n", + "\n", + "check_endpoint.body" + ], + "metadata": { + "id": "n3Yk7rgYhP-N", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "d9e68225-5796-411e-964a-6db3be5541aa" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "{'endpoints': [{'inference_id': 'semantic-reranking',\n", + " 'task_type': 'rerank',\n", + " 'service': 'elasticsearch',\n", + " 'service_settings': {'num_allocations': 1,\n", + " 'num_threads': 1,\n", + " 'model_id': 'cross-encoder__ms-marco-minilm-l-6-v2'},\n", + " 'task_settings': {'return_documents': True}}]}" + ] + }, + "metadata": {}, + "execution_count": 7 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Create the index mapping\n", + "\n", + "We are going to index the `title` and `abstract` from the dataset. " + ], + "metadata": { + "id": "4vqimyNWAhWb" + } + }, + { + "cell_type": "code", + "source": [ + "index_name = \"arxiv-papers\"\n", + "\n", + "index_mapping = {\n", + " \"mappings\": {\n", + " \"properties\": {\"title\": {\"type\": \"text\"}, \"abstract\": {\"type\": \"text\"}}\n", + " }\n", + "}\n", + "\n", + "\n", + "try:\n", + " es.indices.create(index=index_name, body=index_mapping)\n", + " print(f\"Index '{index_name}' created successfully.\")\n", + "except exceptions.RequestError as e:\n", + " if e.error == \"resource_already_exists_exception\":\n", + " print(f\"Index '{index_name}' already exists.\")\n", + " else:\n", + " print(f\"Error creating index '{index_name}': {e}\")" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "4bc9ba0b-9be3-410a-d1d6-f1da04bbfec7", + "id": "DPADF_7ytTmR" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Index 'arxiv-papers' created successfully.\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Ready the dataset\n", + "We are going to use the [CShorten/ML-ArXiv-Papers](https://huggingface.co/datasets/CShorten/ML-ArXiv-Papers) dataset." + ], + "metadata": { + "id": "FqQmaT5P-Nhx" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Download Dataset\n", + "**Note** You may get a warning *The secret `HF_TOKEN` does not exist in your Colab secrets*.\n", + "\n", + "You can safely ignore this." 
+ ], + "metadata": { + "id": "aN0dbYO7oB47" + } + }, + { + "cell_type": "code", + "source": [ + "dataset = load_dataset(\"CShorten/ML-ArXiv-Papers\")" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "IVnpj5bBoEBL", + "outputId": "bc6371d9-d66f-482c-95f8-8cdb89e4f0ef" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:89: UserWarning: \n", + "The secret `HF_TOKEN` does not exist in your Colab secrets.\n", + "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n", + "You will be able to reuse this secret in all of your notebooks.\n", + "Please note that authentication is recommended but still optional to access public models or datasets.\n", + " warnings.warn(\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "### Index into Elasticsearch\n", + "\n", + "We will loop through the dataset and send batches of rows to Elasticsearch.\n", + "- This may take a couple of minutes depending on your cluster sizing." + ], + "metadata": { + "id": "GQxDITCpAKWb" + } + }, + { + "cell_type": "code", + "source": [ + "def bulk_insert_elasticsearch(dataset, index_name, chunk_size=1000):\n", + " actions = []\n", + " for record in dataset:\n", + " action = {\n", + " \"_index\": index_name,\n", + " \"_source\": {\"title\": record[\"title\"], \"abstract\": record[\"abstract\"]},\n", + " }\n", + " actions.append(action)\n", + "\n", + " if len(actions) == chunk_size:\n", + " helpers.bulk(es, actions)\n", + " actions = []\n", + "\n", + " if actions:\n", + " helpers.bulk(es, actions)\n", + "\n", + "\n", + "bulk_insert_elasticsearch(dataset[\"train\"], index_name)" + ], + "metadata": { + "id": "tDZ0qEbW-ozW" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# Query with Reranking\n", + "\n", + "This contains a `text_similarity_reranker` retriever, which:\n", + "\n", + "- Uses a standard retriever to:\n", + " - Perform a lexical query against the `title` field\n", + "- Performs a reranking:\n", + " - Takes as input the top 100 results from the previous search\n", + " - `rank_window_size`: 100\n", + " - Takes as input the query\n", + " - `inference_text`: query\n", + "- Uses our previously created reranking API and model\n", + "\n" + ], + "metadata": { + "id": "2bwvzLfRjJ2n" + } + }, + { + "cell_type": "code", + "source": [ + "query = \"sparse vector embedding\"\n", + "\n", + "# Lexical query, ranked by BM25 score\n", + "response_scored = es.search(\n", + " index=\"arxiv-papers\",\n", + " body={\n", + " \"size\": 10,\n", + " \"retriever\": {\"standard\": {\"query\": {\"match\": {\"title\": query}}}},\n", + " \"fields\": [\"title\", \"abstract\"],\n", + " \"_source\": False,\n", + " },\n", + ")\n", + "\n", + "# Query with Semantic Reranker\n", + "response_reranked = es.search(\n", + " index=\"arxiv-papers\",\n", + " body={\n", + " \"size\": 10,\n", + " \"retriever\": {\n", + " \"text_similarity_reranker\": {\n", + " \"retriever\": {\"standard\": {\"query\": {\"match\": {\"title\": query}}}},\n", + " \"field\": \"abstract\",\n", + " \"inference_id\": \"semantic-reranking\",\n", + " \"inference_text\": query,\n", + " \"rank_window_size\": 100,\n", + " }\n", + " },\n", + " \"fields\": [\"title\", \"abstract\"],\n", + " \"_source\": False,\n", + " },\n", + ")" + ], + 
"metadata": { + "id": "HWXQBS35jQ3n" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Print the table comparing the scored and reranked results" + ], + "metadata": { + "id": "Hnam80Irbj6a" + } + }, + { + "cell_type": "code", + "source": [ + "titles_scored = [\n", + " paper[\"fields\"][\"title\"][0] for paper in response_scored.body[\"hits\"][\"hits\"]\n", + "]\n", + "titles_reranked = [\n", + " paper[\"fields\"][\"title\"][0] for paper in response_reranked.body[\"hits\"][\"hits\"]\n", + "]\n", + "\n", + "# Creating a DataFrame\n", + "df = pd.DataFrame(\n", + " {\"Scored Results\": titles_scored, \"Reranked Results\": titles_reranked}\n", + ")\n", + "\n", + "df_styled = df.style.set_properties(**{\"text-align\": \"left\"}).set_caption(\n", + " f\"Comparison of Scored and Semantic Reranked Results - Query: '{query}'\"\n", + ")\n", + "\n", + "# Display the table\n", + "df_styled" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 415 + }, + "id": "yTTNYCYcBtll", + "outputId": "0f0af538-13fd-4e62-e3d2-1dd185689904" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ], + "text/html": [ + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Comparison of Scored and Semantic Reranked Results - Query: 'sparse vector embedding'
 Scored ResultsReranked Results
0Compact Speaker Embedding: lrx-vectorScaling Up Sparse Support Vector Machines by Simultaneous Feature and\n", + " Sample Reduction
1Quantum Sparse Support Vector MachinesSpaceland Embedding of Sparse Stochastic Graphs
2Sparse Support Vector Infinite PushElliptical Ordinal Embedding
3The Sparse Vector Technique, RevisitedMinimum-Distortion Embedding
4L-Vector: Neural Label Embedding for Domain AdaptationFree Gap Information from the Differentially Private Sparse Vector and\n", + " Noisy Max Mechanisms
5Spaceland Embedding of Sparse Stochastic GraphsInterpolated Discretized Embedding of Single Vectors and Vector Pairs\n", + " for Classification, Metric Learning and Distance Approximation
6Sparse Signal Recovery in the Presence of Intra-Vector and Inter-Vector\n", + " CorrelationAttention Word Embedding
7Stable Sparse Subspace Embedding for Dimensionality ReductionBinary Speaker Embedding
8Auto-weighted Mutli-view Sparse Reconstructive EmbeddingNetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization
9Embedding Words in Non-Vector Space with Unsupervised Graph LearningEstimating Vector Fields on Manifolds and the Embedding of Directed\n", + " Graphs
\n" + ] + }, + "metadata": {}, + "execution_count": 17 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Print out Title and Abstract\n", + "This will print the title and the abstract for the top 10 results after semantic reranking." + ], + "metadata": { + "id": "A0HyNZoWyeun" + } + }, + { + "cell_type": "code", + "source": [ + "for paper in response_reranked.body[\"hits\"][\"hits\"]:\n", + " print(\n", + " f\"Title {paper['fields']['title'][0]} \\n Abstract: {paper['fields']['abstract'][0]}\"\n", + " )" + ], + "metadata": { + "id": "4ZEx-46rn3in", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "e83318ad-ca42-4aa7-98d4-37c4428eb70a" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Title Compact Speaker Embedding: lrx-vector \n", + " Abstract: Deep neural networks (DNN) have recently been widely used in speaker\n", + "recognition systems, achieving state-of-the-art performance on various\n", + "benchmarks. The x-vector architecture is especially popular in this research\n", + "community, due to its excellent performance and manageable computational\n", + "complexity. In this paper, we present the lrx-vector system, which is the\n", + "low-rank factorized version of the x-vector embedding network. The primary\n", + "objective of this topology is to further reduce the memory requirement of the\n", + "speaker recognition system. We discuss the deployment of knowledge distillation\n", + "for training the lrx-vector system and compare against low-rank factorization\n", + "with SVD. On the VOiCES 2019 far-field corpus we were able to reduce the\n", + "weights by 28% compared to the full-rank x-vector system while keeping the\n", + "recognition rate constant (1.83% EER).\n", + "\n", + "Title Quantum Sparse Support Vector Machines \n", + " Abstract: We analyze the computational complexity of Quantum Sparse Support Vector\n", + "Machine, a linear classifier that minimizes the hinge loss and the $L_1$ norm\n", + "of the feature weights vector and relies on a quantum linear programming solver\n", + "instead of a classical solver. Sparse SVM leads to sparse models that use only\n", + "a small fraction of the input features in making decisions, and is especially\n", + "useful when the total number of features, $p$, approaches or exceeds the number\n", + "of training samples, $m$. We prove a $\\Omega(m)$ worst-case lower bound for\n", + "computational complexity of any quantum training algorithm relying on black-box\n", + "access to training samples; quantum sparse SVM has at least linear worst-case\n", + "complexity. However, we prove that there are realistic scenarios in which a\n", + "sparse linear classifier is expected to have high accuracy, and can be trained\n", + "in sublinear time in terms of both the number of training samples and the\n", + "number of features.\n", + "\n", + "Title Sparse Support Vector Infinite Push \n", + " Abstract: In this paper, we address the problem of embedded feature selection for\n", + "ranking on top of the list problems. We pose this problem as a regularized\n", + "empirical risk minimization with $p$-norm push loss function ($p=\\infty$) and\n", + "sparsity inducing regularizers. We leverage the issues related to this\n", + "challenging optimization problem by considering an alternating direction method\n", + "of multipliers algorithm which is built upon proximal operators of the loss\n", + "function and the regularizer. 
Our main technical contribution is thus to\n", + "provide a numerical scheme for computing the infinite push loss function\n", + "proximal operator. Experimental results on toy, DNA microarray and BCI problems\n", + "show how our novel algorithm compares favorably to competitors for ranking on\n", + "top while using fewer variables in the scoring function.\n", + "\n", + "Title The Sparse Vector Technique, Revisited \n", + " Abstract: We revisit one of the most basic and widely applicable techniques in the\n", + "literature of differential privacy - the sparse vector technique [Dwork et al.,\n", + "STOC 2009]. This simple algorithm privately tests whether the value of a given\n", + "query on a database is close to what we expect it to be. It allows to ask an\n", + "unbounded number of queries as long as the answer is close to what we expect,\n", + "and halts following the first query for which this is not the case.\n", + " We suggest an alternative, equally simple, algorithm that can continue\n", + "testing queries as long as any single individual does not contribute to the\n", + "answer of too many queries whose answer deviates substantially form what we\n", + "expect. Our analysis is subtle and some of its ingredients may be more widely\n", + "applicable. In some cases our new algorithm allows to privately extract much\n", + "more information from the database than the original.\n", + " We demonstrate this by applying our algorithm to the shifting heavy-hitters\n", + "problem: On every time step, each of $n$ users gets a new input, and the task\n", + "is to privately identify all the current heavy-hitters. That is, on time step\n", + "$i$, the goal is to identify all data elements $x$ such that many of the users\n", + "have $x$ as their current input. We present an algorithm for this problem with\n", + "improved error guarantees over what can be obtained using existing techniques.\n", + "Specifically, the error of our algorithm depends on the maximal number of times\n", + "that a single user holds a heavy-hitter as input, rather than the total number\n", + "of times in which a heavy-hitter exists.\n", + "\n", + "Title L-Vector: Neural Label Embedding for Domain Adaptation \n", + " Abstract: We propose a novel neural label embedding (NLE) scheme for the domain\n", + "adaptation of a deep neural network (DNN) acoustic model with unpaired data\n", + "samples from source and target domains. With NLE method, we distill the\n", + "knowledge from a powerful source-domain DNN into a dictionary of label\n", + "embeddings, or l-vectors, one for each senone class. Each l-vector is a\n", + "representation of the senone-specific output distributions of the source-domain\n", + "DNN and is learned to minimize the average L2, Kullback-Leibler (KL) or\n", + "symmetric KL distance to the output vectors with the same label through simple\n", + "averaging or standard back-propagation. During adaptation, the l-vectors serve\n", + "as the soft targets to train the target-domain model with cross-entropy loss.\n", + "Without parallel data constraint as in the teacher-student learning, NLE is\n", + "specially suited for the situation where the paired target-domain data cannot\n", + "be simulated from the source-domain data. We adapt a 6400 hours\n", + "multi-conditional US English acoustic model to each of the 9 accented English\n", + "(80 to 830 hours) and kids' speech (80 hours). 
NLE achieves up to 14.1%\n", + "relative word error rate reduction over direct re-training with one-hot labels.\n", + "\n", + "Title Spaceland Embedding of Sparse Stochastic Graphs \n", + " Abstract: We introduce a nonlinear method for directly embedding large, sparse,\n", + "stochastic graphs into low-dimensional spaces, without requiring vertex\n", + "features to reside in, or be transformed into, a metric space. Graph data and\n", + "models are prevalent in real-world applications. Direct graph embedding is\n", + "fundamental to many graph analysis tasks, in addition to graph visualization.\n", + "We name the novel approach SG-t-SNE, as it is inspired by and builds upon the\n", + "core principle of t-SNE, a widely used method for nonlinear dimensionality\n", + "reduction and data visualization. We also introduce t-SNE-$\\Pi$, a\n", + "high-performance software for 2D, 3D embedding of large sparse graphs on\n", + "personal computers with superior efficiency. It empowers SG-t-SNE with modern\n", + "computing techniques for exploiting in tandem both matrix structures and memory\n", + "architectures. We present elucidating embedding results on one synthetic graph\n", + "and four real-world networks.\n", + "\n", + "Title Sparse Signal Recovery in the Presence of Intra-Vector and Inter-Vector\n", + " Correlation \n", + " Abstract: This work discusses the problem of sparse signal recovery when there is\n", + "correlation among the values of non-zero entries. We examine intra-vector\n", + "correlation in the context of the block sparse model and inter-vector\n", + "correlation in the context of the multiple measurement vector model, as well as\n", + "their combination. Algorithms based on the sparse Bayesian learning are\n", + "presented and the benefits of incorporating correlation at the algorithm level\n", + "are discussed. The impact of correlation on the limits of support recovery is\n", + "also discussed highlighting the different impact intra-vector and inter-vector\n", + "correlations have on such limits.\n", + "\n", + "Title Stable Sparse Subspace Embedding for Dimensionality Reduction \n", + " Abstract: Sparse random projection (RP) is a popular tool for dimensionality reduction\n", + "that shows promising performance with low computational complexity. However, in\n", + "the existing sparse RP matrices, the positions of non-zero entries are usually\n", + "randomly selected. Although they adopt uniform sampling with replacement, due\n", + "to large sampling variance, the number of non-zeros is uneven among rows of the\n", + "projection matrix which is generated in one trial, and more data information\n", + "may be lost after dimension reduction. To break this bottleneck, based on\n", + "random sampling without replacement in statistics, this paper builds a stable\n", + "sparse subspace embedded matrix (S-SSE), in which non-zeros are uniformly\n", + "distributed. It is proved that the S-SSE is stabler than the existing matrix,\n", + "and it can maintain Euclidean distance between points well after dimension\n", + "reduction. Our empirical studies corroborate our theoretical findings and\n", + "demonstrate that our approach can indeed achieve satisfactory performance.\n", + "\n", + "Title Auto-weighted Mutli-view Sparse Reconstructive Embedding \n", + " Abstract: With the development of multimedia era, multi-view data is generated in\n", + "various fields. Contrast with those single-view data, multi-view data brings\n", + "more useful information and should be carefully excavated. 
Therefore, it is\n", + "essential to fully exploit the complementary information embedded in multiple\n", + "views to enhance the performances of many tasks. Especially for those\n", + "high-dimensional data, how to develop a multi-view dimension reduction\n", + "algorithm to obtain the low-dimensional representations is of vital importance\n", + "but chanllenging. In this paper, we propose a novel multi-view dimensional\n", + "reduction algorithm named Auto-weighted Mutli-view Sparse Reconstructive\n", + "Embedding (AMSRE) to deal with this problem. AMSRE fully exploits the sparse\n", + "reconstructive correlations between features from multiple views. Furthermore,\n", + "it is equipped with an auto-weighted technique to treat multiple views\n", + "discriminatively according to their contributions. Various experiments have\n", + "verified the excellent performances of the proposed AMSRE.\n", + "\n", + "Title Embedding Words in Non-Vector Space with Unsupervised Graph Learning \n", + " Abstract: It has become a de-facto standard to represent words as elements of a vector\n", + "space (word2vec, GloVe). While this approach is convenient, it is unnatural for\n", + "language: words form a graph with a latent hierarchical structure, and this\n", + "structure has to be revealed and encoded by word embeddings. We introduce\n", + "GraphGlove: unsupervised graph word representations which are learned\n", + "end-to-end. In our setting, each word is a node in a weighted graph and the\n", + "distance between words is the shortest path distance between the corresponding\n", + "nodes. We adopt a recent method learning a representation of data in the form\n", + "of a differentiable weighted graph and use it to modify the GloVe training\n", + "algorithm. We show that our graph-based representations substantially\n", + "outperform vector-based methods on word similarity and analogy tasks. Our\n", + "analysis reveals that the structure of the learned graphs is hierarchical and\n", + "similar to that of WordNet, the geometry is highly non-trivial and contains\n", + "subgraphs with different local topology.\n", + "\n" + ] + } + ] } - ] - } - ] + ] } \ No newline at end of file