## Using LLMs Programmatically with HuggingFace Models

During the challenge we used LLMs by typing in our prompts, and we improved those prompts through various prompt engineering methods. Typing prompts into a chat-style interface is certainly one way of interacting with an LLM, but it is not the only way.

Imagine you've crafted a prompt to assess the quality of a defect report. Perhaps it scores the report for clarity and completeness and, if the report is lacking, the LLM outputs a set of questions for the defect's author.

This might be a useful addition to your defect handling workflow but, let's face it, typing (or copying) the prompt and the defect description into a chat interface for each defect would quickly become tedious.

Instead, we can programmatically extract any new defects and, for each one, call an LLM to evaluate the defect report. We can achieve this in a straightforward manner using an LLM from HuggingFace and a bit of Python code.
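
As a rough sketch of that workflow, the loop might look like the following. The prompt wording, the `build_prompt` and `review_defects` helpers, and the fields of each defect dictionary are all hypothetical. `ask_llm` stands for any function that takes a prompt string and returns the model's text, for example a wrapper around a HuggingFace model.

```python
from typing import Callable, Dict, List

# Hypothetical prompt template; adapt the wording to your own process.
PROMPT_TEMPLATE = (
    "Score the following defect report from 1 to 5 for clarity and "
    "completeness. If information is missing, list questions for the author.\n\n"
    "Title: {title}\n"
    "Description: {description}\n"
)


def build_prompt(defect: Dict[str, str]) -> str:
    """Fill the template with one defect's title and description."""
    return PROMPT_TEMPLATE.format(
        title=defect["title"], description=defect["description"]
    )


def review_defects(
    defects: List[Dict[str, str]], ask_llm: Callable[[str], str]
) -> Dict[str, str]:
    """Call the LLM once per defect and collect the replies by defect id."""
    return {d["id"]: ask_llm(build_prompt(d)) for d in defects}


if __name__ == "__main__":
    # Stand-in for a real model call so the sketch runs without a GPU.
    def fake_llm(prompt: str) -> str:
        return f"(model reply to a {len(prompt)}-character prompt)"

    new_defects = [
        {"id": "BUG-1", "title": "Login fails", "description": "It does not work."},
    ]
    for defect_id, reply in review_defects(new_defects, fake_llm).items():
        print(defect_id, reply)
```

Passing the model call in as a function keeps the loop independent of any particular model, so the same code can be tried with a stub first and a real LLM later.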

> For this walkthrough, you do not need to write any code. The code presented here is complete and designed to show how we could integrate AI models. If you want to learn more about the coding aspect of integrating AI models, I would suggest starting with the HuggingFace NLP Course.

HuggingFace (https://huggingface.co) is a machine learning community that collaborates on building models and developing datasets. 

First, we need to install some dependencies.

In [ ]:
!pip install transformers "bitsandbytes>=0.39.0" accelerate -q
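
If you want to confirm that the install worked, one option is to ask Python which versions it can see. This is a minimal sketch using the standard library; the `installed_version` helper is a hypothetical name and is not part of HuggingFace.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("transformers", "bitsandbytes", "accelerate"):
        print(name, installed_version(name) or "not installed")
```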

Next, we will use the HuggingFace Transformers library to load a pre-trained model: the open-source Mistral-7B model.

When using HuggingFace, this is all the code that is needed!
Running it will download the model, which may take some time to complete.

In [ ]:
from transformers import AutoModelForCausalLM

model_name = "mistralai/Mistral-7B-v0.1" 

model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_4bit=True)
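
The `load_in_4bit=True` argument asks the library to store the model weights in roughly 4 bits each rather than 16, which is what lets a model of this size fit on a modest GPU. The back-of-the-envelope arithmetic below is only an estimate: it assumes a nominal 7 billion parameters, counts the weights alone, and ignores any layers that stay in higher precision. The `weight_memory_gb` helper is hypothetical.

```python
def weight_memory_gb(num_parameters: float, bits_per_parameter: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9


NOMINAL_PARAMETERS = 7e9  # assumption: "7B" taken as 7 billion parameters

print(weight_memory_gb(NOMINAL_PARAMETERS, 16))  # 16-bit weights: 14.0
print(weight_memory_gb(NOMINAL_PARAMETERS, 4))   # 4-bit weights: 3.5
```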

If you remember when you created your prompts earlier in the challenge, you used fairly natural language to specify what you wanted.

However, machines don't really understand text; they only deal with numbers. So we need to perform a task called tokenization, which converts the text we type into a numerical form that the model can process. Moreover, the method we use to tokenize our text needs to be one that the model understands.
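
To make the idea concrete, here is a deliberately simplified toy tokenizer. It is not how the Mistral tokenizer works: real tokenizers split text into subword pieces and use a fixed vocabulary learned during training, whereas this sketch just gives each whitespace-separated word its own number. The function names are hypothetical.

```python
from typing import Dict, List


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign an integer id to each distinct whitespace-separated word."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for word in text.split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Turn text into a list of ids (every word must be in the vocabulary)."""
    return [vocab[word] for word in text.split()]


def decode(ids: List[int], vocab: Dict[str, int]) -> str:
    """Turn a list of ids back into text."""
    id_to_word = {i: w for w, i in vocab.items()}
    return " ".join(id_to_word[i] for i in ids)


vocab = build_vocab(["the test passed", "the test failed"])
ids = encode("the test failed", vocab)
print(ids)                 # [0, 1, 3]
print(decode(ids, vocab))  # the test failed
```

The same vocabulary has to be used for encoding and decoding, which is why the tokenizer must match the model it is paired with.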

Tokenization is an involved process but, again, HuggingFace makes this really easy for us, and with just a few lines of code we can tokenize sentences.

In [ ]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")


To understand what the tokenizer does, you can run the following cell with some text to see what the model receives as input.

In [ ]:
sample_input = "I've nearly completed the Ministry of Testing 30 Days of AI in Testing challenge"
sample_model_input = tokenizer([sample_input], return_tensors="pt")

print(sample_model_input)

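As you can see from the above output, our text gets converted into a sequence of numbers (the `input_ids`), alongside an `attention_mask` that tells the model which positions to pay attention to.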

To interact with the LLM we need to provide our prompt and call the model's *generate()* method.
The model will return a *tokenized* version of its output, which we then need to decode back into text.

Yet again, HuggingFace makes this a straightforward task. For convenience, we've wrapped all the HuggingFace code for generation into the following function.

In [ ]:
def get_model_response(model, tokenizer, model_prompt) -> str:
    """
    Tokenize model_prompt, call the model's generate function, then decode the response.
    """
    # Tokenize the prompt and move the tensors to the device the model is on
    tokens = tokenizer([model_prompt], return_tensors="pt").to(model.device)
    # Cap the response length; 100 new tokens is an arbitrary choice
    generated_ids = model.generate(**tokens, max_new_tokens=100)
    # Convert the generated token ids back into text
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

    return response

Now we want to generate some output. You can change the prompt text to whatever you want,
and you can re-run the cell as many times as you like.

In [ ]:
prompt_text = "What are the key risks associated with using AI for decision making?"

print(get_model_response(model, tokenizer, prompt_text))
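
With a causal model such as this one, the sequence returned by `generate()` typically begins with the prompt tokens, so the decoded response repeats the prompt before the new text. If you only want the newly generated part, a small helper along these lines may do. `strip_prompt` is a hypothetical convenience and is not part of HuggingFace. It assumes the decoded text reproduces the prompt exactly, and it leaves the response untouched when that is not the case.

```python
def strip_prompt(response: str, prompt: str) -> str:
    """Remove the echoed prompt from the start of a decoded response."""
    if response.startswith(prompt):
        return response[len(prompt):].lstrip()
    # Decoding may not reproduce the prompt exactly; leave the text untouched.
    return response


print(strip_prompt("What is AI? It is a field of study.", "What is AI?"))
# It is a field of study.
```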

And that is it: using libraries such as HuggingFace can simplify the building of AI-powered tools.
A simple and versatile programming library, together with a large (and ever-growing) hub of pre-trained models, is a powerful combination.

If you want to learn more about building with HuggingFace, I would recommend the HuggingFace NLP Course (https://huggingface.co/learn/nlp-course).
