# Connecting to an LLM hosted at a Hugging Face endpoint

For documentation see https://python.langchain.com/docs/integrations/llms/huggingface_pipelines


In [None]:
# We will be using LangChain
!pip install --quiet langchain langchain_community
import langchain_community
from langchain_community.llms import HuggingFaceEndpoint

# Using Hugging Face endpoints

From https://python.langchain.com/docs/integrations/llms/huggingface_endpoint

Hugging Face serverless inference is described here:

https://huggingface.co/docs/api-inference/index

This explains how to get an API token from your account, and how to construct the model URL. It also gives code for using a model, but we will instead use LangChain to interface with the Hugging Face API.
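
For orientation only, here is a minimal sketch of what a direct request to the serverless API is made of, without LangChain. The function name `build_request` and the model ID `gpt2` are hypothetical, and the URL pattern is an assumption based on the serverless inference documentation linked above, so check it against the current docs before relying on it.

```python
def build_request(model_id: str, token: str, prompt: str) -> dict:
    """Assemble the URL, headers and JSON body for a text-generation request.

    Assumption: the model URL follows the pattern
    https://api-inference.huggingface.co/models/<model_id>.
    """
    return {
        "url": f"https://api-inference.huggingface.co/models/{model_id}",
        # The API token is sent as a bearer token in the Authorization header
        "headers": {"Authorization": f"Bearer {token}"},
        # The prompt is sent under the "inputs" key
        "json": {"inputs": prompt},
    }


# Illustrative values only; "hf_xxx" stands in for a real token
request = build_request("gpt2", "hf_xxx", "Why did the chicken cross the road?")
print(request["url"])
```

The resulting dictionary could be passed to an HTTP client, for example `requests.post(**request)`. In the rest of this notebook, LangChain takes care of this step.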

You can also set up your own dedicated endpoint and deploy a model to it. This gives better availability than the public endpoints, but it has a cost:

https://huggingface.co/inference-endpoints/dedicated

The UI for starting, stopping, and configuring endpoints is here:

https://ui.endpoints.huggingface.co/angusroberts/endpoints


A protected endpoint seems to need your own token. A public endpoint still needs a token, but it can be any valid Hugging Face token.




## Create a Hugging Face account and API token

We will be using an LLM hosted on a Hugging Face endpoint. To use it, you will need a Hugging Face account and an API token.

### Hugging Face account




1.   Open https://huggingface.co/
1.   Click the "Sign Up" button at top right
1.   Fill in your email address and a new password and click "Next"
1.   Enter details on your user profile page. You will need to enter a username and full name; the rest is optional
1.   Select the terms and conditions check box
1.   Click "Create Account"
1.   You will be sent a verification email with a link to click, which completes account creation

### Hugging Face API token

1. Log in to Hugging Face at https://huggingface.co/
1. Click your account icon at top right
1. Select the "Settings" option towards the bottom of the menu
1. From the Settings menu on the left of the window, select "Access Tokens"
1. Click the "New token" button
1. In the dialog box, give your token a name and choose the "Read" token type, then click "Generate token"
1. Your token will appear as a string hidden by asterisks
1. Copy the token to your clipboard by clicking the copy icon just to its right
1. In the next code cell, you will paste it into a variable and make it available to your local environment








In [None]:
# get a token: https://huggingface.co/docs/api-inference/quicktour#get-your-api-token

from getpass import getpass
import os

HUGGINGFACEHUB_API_TOKEN = getpass(prompt="Paste in your Hugging Face API token and press enter: ")

# Store the token in an environment variable, where LangChain will look for it when needed
os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACEHUB_API_TOKEN
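
Because `getpass` hides what you paste, it can be useful to confirm that a token really was stored before going further. The sketch below is one way to do that. The helper names `get_hf_token` and `mask_token` are hypothetical and not part of LangChain or Hugging Face; the only thing taken from this notebook is the environment variable name.

```python
import os


def get_hf_token(env_var: str = "HUGGINGFACEHUB_API_TOKEN") -> str:
    """Return the token stored in the environment, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set; run the token cell again")
    return token


def mask_token(token: str) -> str:
    """Hide most of a token so it can be printed without exposing it."""
    if len(token) <= 7:
        return "*" * len(token)
    # Keep the first 3 and last 4 characters, and mask everything in between
    return token[:3] + "*" * (len(token) - 7) + token[-4:]
```

After the token cell has run, `print(mask_token(get_hf_token()))` shows just enough of the token to recognise it.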

In [None]:
# This is where our endpoint is hosted
endpoint_url = "https://j278zkynwm0b3dky.eu-west-1.aws.endpoints.huggingface.cloud"

In [None]:
# Now we will create a LangChain LLM from a HuggingFaceEndpoint
# at our endpoint URL
llm = HuggingFaceEndpoint(
    endpoint_url=endpoint_url,
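    max_new_tokens=512,       # upper limit on the number of tokens generated
    top_k=10,                 # sample only from the 10 most likely next tokens
    top_p=0.95,               # nucleus sampling: keep the smallest token set with cumulative probability >= 0.95
    typical_p=0.95,           # probability mass kept by typical decoding
    temperature=0.01,         # very low temperature, so sampling is close to greedy
    repetition_penalty=1.03,  # values above 1.0 discourage repeated tokens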
)
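
To give a feel for what `temperature`, `top_k` and `top_p` do, here is a small pure-Python illustration that applies them to a toy next-token distribution. It is a simplified sketch, not the code that runs on the endpoint: the function name `filter_top_k_top_p` and the toy logits are made up for illustration, the order of operations on the server may differ, and `typical_p` and `repetition_penalty` are not covered.

```python
import math


def filter_top_k_top_p(logits: dict[str, float], top_k: int, top_p: float,
                       temperature: float = 1.0) -> dict[str, float]:
    """Return the renormalised probabilities of the tokens that survive filtering."""
    if not logits or top_k < 1 or temperature <= 0:
        raise ValueError("need non-empty logits, top_k >= 1 and temperature > 0")
    # Temperature scaling: low values sharpen the distribution
    scaled = {tok: value / temperature for tok, value in logits.items()}
    # Numerically stable softmax
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # top_k: keep only the k most likely tokens
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise so the surviving probabilities sum to 1
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


toy_logits = {"road": 2.0, "street": 1.0, "river": 0.0, "moon": -1.0}
print(filter_top_k_top_p(toy_logits, top_k=10, top_p=0.95, temperature=0.01))
```

With the settings used in this notebook (`temperature=0.01`), almost all of the probability mass falls on the single most likely token, so only that token survives and the output is close to deterministic.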

In [None]:
# We should now be able to interact with the endpoint, e.g. by sending a prompt
# and getting back a response
llm.invoke("Why did the chicken cross the road?")
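
An endpoint that has been stopped, or is still starting up, may reject requests for a while. The sketch below shows one way to retry a call a few times before giving up. The helper `invoke_with_retry` is hypothetical and not part of LangChain. It assumes that failures are temporary, and it catches every exception, which is a simplification.

```python
import time
from typing import Callable


def invoke_with_retry(invoke: Callable[[str], str], prompt: str,
                      attempts: int = 3, wait_seconds: float = 5.0) -> str:
    """Call invoke(prompt), retrying after a pause if it raises an exception."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return invoke(prompt)
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            if attempt < attempts:
                time.sleep(wait_seconds)
    raise RuntimeError(f"no response after {attempts} attempts") from last_error
```

With the `llm` object created above, this would be called as `invoke_with_retry(llm.invoke, "Why did the chicken cross the road?")`.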