# LLM Challenges

You can call the models with either the InferenceClient or a direct HTTP API invocation.

https://huggingface.co/docs/huggingface_hub/package_reference/inference_client#huggingface_hub.InferenceClient.text_generation

**Note**
* An HTTP status code of 503 indicates that the model is in a cold state and is still loading
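One way to cope with a cold model is to retry while the status code is 503. The sketch below is not part of the course utilities: `call_with_retry` and its `send` argument are hypothetical names, and `send` stands for any function that performs the request and returns a `(status_code, body)` pair. The attempt count and wait time are arbitrary defaults.

```python
import time


def call_with_retry(send, max_attempts=5, wait_seconds=10.0, sleep=time.sleep):
    """Call send() until it returns a status other than 503 or attempts run out.

    send is a hypothetical zero-argument callable returning (status_code, body).
    sleep is injectable so the waiting can be skipped in tests.
    """
    status, body = None, None
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 503:
            break  # the model answered (successfully or with another error)
        if attempt < max_attempts:
            sleep(wait_seconds)  # give the model time to finish loading
    return status, body
```

If every attempt returns 503, the function hands back the last 503 response so the caller can decide what to do next.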

#### Google Colab
If you are running the code in Google Colab, install the packages by uncommenting and running the cell below.

* The API key file will not be available
* You will be prompted to provide the HF API Token

In [1]:
## The script is downloaded and run to set up the utils folder

!curl -H "Accept: application/vnd.github.VERSION.raw" https://raw.githubusercontent.com/acloudfan/gen-ai-app-dev/main/Setup/gcsetup.sh  > gcsetup.sh
!chmod u+x gcsetup.sh
!./gcsetup.sh
!rm ./gcsetup.sh



## Set up the environment variables

In [2]:
from dotenv import load_dotenv
import os
import sys
import warnings

warnings.filterwarnings("ignore")

# Load the file that contains the API keys
load_dotenv('C:\\Users\\raj\\.jupyter\\.env')

# Sets up keys : HUGGINGFACEHUB_API_TOKEN, OPENAI_API_KEY, ...

# setting path for utils package
sys.path.append('../')
sys.path.append('./')

In [3]:
# HUGGINGFACEHUB_API_TOKEN=os.getenv('HUGGINGFACEHUB_API_TOKEN')
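
When the API key file is not available (for example in Google Colab), the token can be requested interactively instead. The following is a minimal sketch and an assumption about how that prompt could work, not the course's setup script; `get_hf_token` is a hypothetical helper name.

```python
import os
from getpass import getpass


def get_hf_token(env_var="HUGGINGFACEHUB_API_TOKEN", prompt_fn=getpass):
    """Return the token from the environment, prompting for it if it is missing."""
    token = os.getenv(env_var)
    if not token:
        # getpass hides the typed value so the token does not appear in the output
        token = prompt_fn("Enter your Hugging Face API token: ")
        os.environ[env_var] = token  # make it visible to later cells
    return token
```

Calling `get_hf_token()` after `load_dotenv(...)` returns the value from the key file when one was loaded and only prompts otherwise.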

## Create LLM for experimentation

In [4]:
from huggingface_hub import InferenceClient
from utils.hf_post_api import hf_rest_client

hugging_face_model_ids = [
    'google/gemma-2-2b-it',
    'tiiuae/falcon-7b-instruct',
    'mistralai/Mistral-7B-Instruct-v0.2',
    'openlm-research/open_llama_3b_v2',
    'meta-llama/Meta-Llama-3.1-8B-Instruct'
]


## 1. Hallucination

Some models hallucinate more than others. Try the prompt on a couple of models to find out which ones are more prone to hallucination.
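
To compare several models side by side, the same prompt can be sent to each one in a loop. This is a sketch under assumptions: `compare_models` is a hypothetical helper, and `invoke` is any callable you supply that takes a model id and a prompt and returns the generated text.

```python
def compare_models(model_ids, prompt, invoke):
    """Run one prompt against several models and collect the answers.

    invoke(model_id, prompt) is a hypothetical callable supplied by the caller.
    """
    results = {}
    for model_id in model_ids:
        try:
            results[model_id] = invoke(model_id, prompt)
        except Exception as err:
            # A model may be unavailable or still loading; record it and move on
            results[model_id] = f"ERROR: {err}"
    return results


# Possible usage with the InferenceClient (requires network access and a token):
# from huggingface_hub import InferenceClient
# answers = compare_models(
#     hugging_face_model_ids,
#     "define LLM in the context of biology",
#     lambda model_id, prompt: InferenceClient(model_id).text_generation(prompt, max_new_tokens=120),
# )
```

Catching the exception per model keeps one failing model from stopping the whole comparison.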

In [5]:
text = "define LLM in the context of biology"

# Change the index to try out different models
# llm = InferenceClient(hugging_face_model_ids[0])
# llm.text_generation(text, max_new_tokens=120)

llm_client = hf_rest_client(hugging_face_model_ids[1])
llm_client.invoke(text)

[{'generated_text': "define LLM in the context of biology\nLLM stands for 'Lysin-like motif'. It is a type of protein motif that is found in many different types of proteins, including those involved in the processing and transport of proteins within cells."}]
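
The response above repeats the prompt at the start of `generated_text`. When only the model's answer is of interest, the echoed prompt can be removed. The helper below is a hypothetical sketch that assumes the response has the list-of-dicts shape shown above and that the echo matches the prompt exactly.

```python
def strip_prompt(response, prompt):
    """Return the generated text without the echoed prompt.

    Assumes response looks like [{'generated_text': '...'}], as in the output above.
    """
    text = response[0]["generated_text"]
    if text.startswith(prompt):
        text = text[len(prompt):]  # drop the echoed prompt
    return text.strip()


sample = [{"generated_text": "define LLM in the context of biology\nLLM stands for 'Lysin-like motif'."}]
answer = strip_prompt(sample, "define LLM in the context of biology")
```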

## 2. Dated knowledge

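**Note:**
A model only knows what was in its training data, so answers about current events may be out of date. You will also observe hallucinations.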

In [6]:
# Try out the models & your own prompts
# text = "who won the 2022 super bowl?"  # LA Rams vs Cincinnati Bengals  https://en.wikipedia.org/wiki/Super_Bowl_LVI
text = "as of today, who is the prime minister of UK"

# Change the index to try out different models
# llm = InferenceClient(model=hugging_face_model_ids[0])
# llm.text_generation(text, max_new_tokens=120)

llm_client = hf_rest_client(hugging_face_model_ids[1])
llm_client.invoke(text)

[{'generated_text': 'as of today, who is the prime minister of UK?\nAs of today, the Prime Minister of the United Kingdom is Boris Johnson.'}]

## 3. Missing context

The model is not aware of your enterprise's business domain or business model.

In [7]:
# Try out the models and your own prompts
text = "what is your return policy?"

# Change the index to try out different models
# llm = InferenceClient(model=hugging_face_model_ids[1])  # 1 = tiiuae/falcon-7b-instruct
# llm.text_generation(text, max_new_tokens=120)

llm_client = hf_rest_client(hugging_face_model_ids[1])
llm_client.invoke(text)

[{'generated_text': 'what is your return policy?\nWe offer a 30-day return policy on all products. If you are not satisfied with your purchase, you can return the product for a full refund.'}]
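
The model answered with a return policy even though it was never given one. A common way to supply the missing context is to place the relevant text in the prompt itself. The sketch below only builds such a prompt; `build_prompt_with_context` is a hypothetical helper, and the policy text is made up for illustration.

```python
def build_prompt_with_context(question, context):
    """Combine enterprise-specific context with the user's question."""
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical policy text, used only for illustration
policy = "Items may be returned within 14 days with a receipt."
prompt = build_prompt_with_context("what is your return policy?", policy)
# The resulting prompt can be passed to llm_client.invoke(prompt) as in the cell above
```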

## 4. Bias

Some models are trained to avoid biased output. Try out the model **index 1, 'tiiuae/falcon-7b-instruct'** to see its behavior. The cell below runs index 2, 'mistralai/Mistral-7B-Instruct-v0.2'.

In [8]:
text = "for the engineering jobs we should only hire men because "

# Change the index to try out different models
# llm = InferenceClient(model=hugging_face_model_ids[1])  # 1 = tiiuae/falcon-7b-instruct
# llm.text_generation(text, max_new_tokens=75)

llm_client = hf_rest_client(hugging_face_model_ids[2])
llm_client.invoke(text)

[{'generated_text': 'for the engineering jobs we should only hire men because 1) they are stronger and 2) they are more logical and less emotional.\n\n1. Strength is not a requirement for engineering jobs. In fact, many engineering jobs require a high degree of dexterity and fine motor skills, which women often possess in greater numbers than men. Additionally, the use of machinery and power tools has been largely automated, reducing the need for brute strength in engineering roles.\n2. The idea that men are more logical and less emotional than'}]