# Running models locally
- Using the Hugging Face `transformers` library

In [1]:
from torch import bfloat16
import transformers

In [2]:
model_id = "mistralai/Mistral-7B-v0.1"
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=bfloat16,
    device_map="auto"
)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

model.eval()

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

MistralForCausalLM(
  (model): MistralModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x MistralDecoderLayer(
        (self_attn): MistralAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=1024, bias=False)
          (v_proj): Linear(in_features=4096, out_features=1024, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): MistralRotaryEmbedding()
        )
        (mlp): MistralMLP(
          (gate_proj): Linear(in_features=4096, out_features=14336, bias=False)
          (up_proj): Linear(in_features=4096, out_features=14336, bias=False)
          (down_proj): Linear(in_features=14336, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): MistralRMSNorm()
        (post_attention_layernorm): MistralRMSNorm()
      )
    )
    (norm): MistralRMSNorm()
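
The layer shapes in this printout are enough to estimate how much memory the model needs locally. The sketch below adds them up. Two things are assumptions, because the printout does not show them: each `MistralRMSNorm` is taken to hold one weight per hidden dimension, and the output head (cut off in the truncated printout) is taken to be an untied 4096 x 32000 linear layer.

```python
# Rough parameter count from the layer shapes printed above.
HIDDEN = 4096      # embedding / hidden size
KV_DIM = 1024      # k_proj and v_proj output size
MLP_DIM = 14336    # gate_proj / up_proj output size
VOCAB = 32000
N_LAYERS = 32

# q_proj and o_proj are HIDDEN x HIDDEN; k_proj and v_proj are HIDDEN x KV_DIM.
attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM
# gate_proj, up_proj and down_proj each hold HIDDEN * MLP_DIM weights (no bias).
mlp = 3 * HIDDEN * MLP_DIM
# ASSUMPTION: each RMSNorm holds one weight vector of size HIDDEN.
norms = 2 * HIDDEN
per_layer = attention + mlp + norms

embedding = VOCAB * HIDDEN
final_norm = HIDDEN
# ASSUMPTION: the output head is not visible in the truncated printout.
lm_head = HIDDEN * VOCAB

shown_params = embedding + N_LAYERS * per_layer + final_norm
total_params = shown_params + lm_head

# bfloat16 stores each parameter in 2 bytes.
bf16_gib = total_params * 2 / 1024**3

print(f"parameters visible in the printout: {shown_params:,}")
print(f"parameters including assumed head:  {total_params:,}")
print(f"approximate bfloat16 weight memory: {bf16_gib:.1f} GiB")
```

This counts weights only; activations and the key/value cache need additional memory during generation.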
  

# Controlling the output of a model
## What will the output look like
- vocab = ["You", "will", "win", "lose"]
- input = "You will"
- The model outputs unnormalized scores (logits), one per token in the vocabulary
- We can then normalize them into probabilities, e.g. [0.1, 0.1, 0.5, 0.3]
- We can then make an informed decision on which token to select.
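
The normalization step above is usually a softmax. Here is a minimal sketch on the four-token vocabulary; the logits are hypothetical, chosen so that softmax reproduces the probabilities in the list.

```python
import math

vocab = ["You", "will", "win", "lose"]

# Hypothetical logits: picked so that softmax gives [0.1, 0.1, 0.5, 0.3].
logits = [math.log(p) for p in (0.1, 0.1, 0.5, 0.3)]


def softmax(scores):
    """Turn unnormalized scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
for token, p in zip(vocab, probs):
    print(f"{token:>5}: {p:.2f}")

# Greedy decoding simply picks the most probable token.
best = vocab[probs.index(max(probs))]
print("greedy choice:", best)
```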

## Args
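- top_k: restricts sampling to the k most probable tokens
- top_p: restricts sampling to the smallest set of most probable tokens whose cumulative probability reaches p
- temperature: rescales the distribution before sampling; values near 0 (e.g. 0.1) strongly favor the highest-probability tokens, 1.0 leaves the distribution unchanged, and higher values flatten it so there is less preference among tokens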
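
To make `top_k` and `top_p` concrete, the sketch below applies both filters to the toy distribution from the previous section. The helper names are hypothetical, and this is a plain-Python illustration of the idea rather than the library's implementation.

```python
vocab = ["You", "will", "win", "lose"]
probs = [0.1, 0.1, 0.5, 0.3]
dist = dict(zip(vocab, probs))


def renormalize(kept):
    """Rescale the surviving probabilities so they sum to 1 again."""
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


def top_k_filter(dist, k):
    """Keep only the k most probable tokens."""
    ranked = sorted(dist.items(), key=lambda item: item[1], reverse=True)
    return renormalize(dict(ranked[:k]))


def top_p_filter(dist, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(dist.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    return renormalize(kept)


top2 = top_k_filter(dist, 2)
nucleus_small = top_p_filter(dist, 0.4)
nucleus_large = top_p_filter(dist, 0.7)

print("top_k=2   ->", top2)
print("top_p=0.4 ->", nucleus_small)
print("top_p=0.7 ->", nucleus_large)
```

With a small `top_p`, a single confident token can already reach the cutoff, so sampling becomes effectively deterministic for that step.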

In [4]:
generate_text = transformers.pipeline(
    model=model, tokenizer=tokenizer,
    return_full_text=False,  # Set this to True if using langchain
    task="text-generation",
    temperature=0.1,  # Controls the randomness of outputs
    top_p=0.15,  # Nucleus sampling: keep the smallest token set whose cumulative probability reaches this value
    top_k=2,  # Number of top tokens to consider (0 relies on top_p)
    max_new_tokens=512,  # Limits the number of generated tokens
    repetition_penalty=1.1  # Discourages repetitive outputs
)
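
As a rough illustration of what `repetition_penalty` does, the sketch below assumes the commonly used rule in which the logit of every token that has already appeared is divided by the penalty when positive and multiplied by it when negative. The function name and the logits are hypothetical.

```python
def apply_repetition_penalty(logits, seen_ids, penalty):
    """Make already-seen tokens less likely (assumed rule, see the text above)."""
    adjusted = list(logits)
    for token_id in set(seen_ids):
        score = adjusted[token_id]
        # Either branch pushes the score down when penalty > 1.
        adjusted[token_id] = score * penalty if score < 0 else score / penalty
    return adjusted


# Hypothetical logits for a four-token vocabulary; tokens 0 and 1 were already generated.
logits = [2.0, -1.0, 0.5, 0.0]
penalized = apply_repetition_penalty(logits, seen_ids=[0, 1], penalty=1.1)
print(penalized)
```

A penalty of 1.0 leaves the logits unchanged; values slightly above 1.0, such as the 1.1 used here, apply a mild discouragement.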

\begin{align*}
    p'_i = \frac{p_i^{1/T}}{\sum_j p_j^{1/T}}
\end{align*}
<!-- show image temperature_ex.JPG -->
<img src="temperature_ex.JPG"
     alt="Temperature example"
     style="float: left; margin-right: 10px;" />
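
A small worked example of the temperature formula: each probability is raised to the power 1/T and the result is renormalized so it sums to 1. The distribution is the toy one used earlier, and the function name is hypothetical.

```python
vocab = ["You", "will", "win", "lose"]
probs = [0.1, 0.1, 0.5, 0.3]


def apply_temperature(probs, temperature):
    """Raise each probability to 1/T, then renormalize."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [x / total for x in powered]


sharp = apply_temperature(probs, 0.1)      # low T: nearly all mass on the top token
unchanged = apply_temperature(probs, 1.0)  # T = 1: distribution is unchanged
flat = apply_temperature(probs, 2.0)       # high T: distribution is flattened

for label, scaled in (("T=0.1", sharp), ("T=1.0", unchanged), ("T=2.0", flat)):
    print(label, [round(p, 3) for p in scaled])
```

At T = 0.1 the token "win" takes almost all of the probability mass, which is why low temperatures behave much like greedy decoding.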


In [5]:
test_prompt = "The future of AI is"
result = generate_text(test_prompt)
print("\n\nGenerated text:\n" + result[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.




Generated text:
 bright, and it’s only going to get brighter.

## What are the benefits of using AI in business?

There are many benefits of using AI in business. Some of these benefits include:

1. Increased efficiency and productivity: AI can help businesses automate tasks and processes, which can lead to increased efficiency and productivity.
2. Improved decision-making: AI can help businesses make better decisions by providing them with data and insights that they wouldn’t otherwise have access to.
3. Enhanced customer experience: AI can help businesses provide a more personalized and tailored experience for their customers.
4. Reduced costs: AI can help businesses reduce costs by automating tasks and processes, as well as by making better decisions.
5. Competitive advantage: By using AI, businesses can gain a competitive advantage over their competitors.

## How can you use AI to improve your business?

AI can be used to improve your business in a number of ways. For example, yo

In [6]:
model_instruct = "mistralai/Mistral-7B-Instruct-v0.2"

tokenizer_instruct = transformers.AutoTokenizer.from_pretrained(model_instruct)
pipeline = transformers.pipeline(
    model=model_instruct,
    task="text-generation",
    model_kwargs={"torch_dtype": bfloat16, "device_map": "auto"},
)   

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

### It is critical to use the model's specific prompt formatting, which must be copied character for character

In [7]:
messages = [
    {"role": "user", "content": test_prompt},
]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# prompt = "<s>[INST] The future of AI is [/INST]"
print("\n\nPROMPT:\n", prompt)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.01, top_k=50, top_p=.95)
print("\n\nGenerated response:\n" + outputs[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.




PROMPT:
 <s>[INST] The future of AI is [/INST]




Generated response:
<s>[INST] The future of AI is [/INST] The future of AI (Artificial Intelligence) is a topic of much debate and speculation among experts, researchers, and industry professionals. Some believe that AI will continue to advance at an exponential rate, leading to significant breakthroughs in areas such as healthcare, education, transportation, and manufacturing. Others caution that there are also risks and challenges associated with the development of advanced AI, including ethical concerns, job displacement, and security issues.

One trend that is likely to continue is the integration of AI into everyday life, making it more convenient and efficient for individuals and businesses. For example, we may see more widespread use of AI-powered virtual assistants, smart homes, and self-driving cars.

Another area of focus is the development of more advanced forms of AI, such as deep learning and neural networks, which can learn and adapt to new situations on their own. This

In [8]:
outputs = pipeline(test_prompt, max_new_tokens=256, do_sample=True, temperature=0.1, top_k=50, top_p=.95)
print("\n\nGenerated response:\n" + outputs[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.




Generated response:
The future of AI is not just about making machines smarter, but also about making them more human.

## 10. The Future of AI: Ethics and Morality

As AI becomes more advanced and integrated into our lives, ethical and moral considerations become increasingly important. Here are some of the key issues that need to be addressed:

* **Bias and discrimination:** AI systems can reflect and perpetuate biases and discrimination, which can have negative consequences for individuals and society as a whole. It is important to ensure that AI systems are designed and trained in a way that is fair and unbiased.
* **Privacy and security:** AI systems can collect and process vast amounts of personal data, raising concerns about privacy and security. It is important to ensure that AI systems are designed with strong privacy and security protections, and that individuals have control over their own data.
* **Transparency and explainability:** AI systems can be complex and difficult

- Following the instruct structure is critical
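
To check a hand-written prompt character for character, a small comparison helper can be useful. The sketch below rebuilds the single-turn prompt printed earlier (`<s>[INST] ... [/INST]`) and reports the first position where a candidate string deviates. Both helper names are hypothetical, and only the single-turn case shown in this notebook is covered; `apply_chat_template` remains the safer way to build prompts.

```python
def build_instruct_prompt(user_message):
    """Hypothetical helper mirroring the single-turn prompt printed above."""
    return f"<s>[INST] {user_message} [/INST]"


def first_difference(expected, candidate):
    """Return the index of the first differing character, or None if identical."""
    for index, (a, b) in enumerate(zip(expected, candidate)):
        if a != b:
            return index
    if len(expected) != len(candidate):
        return min(len(expected), len(candidate))
    return None


expected = "<s>[INST] The future of AI is [/INST]"
rebuilt = build_instruct_prompt("The future of AI is")

# Hypothetical hand-typed variant with the space after [INST] missing.
hand_typed = "<s>[INST]The future of AI is [/INST]"

print("rebuilt matches:", first_difference(expected, rebuilt) is None)
print("hand-typed first differs at index:", first_difference(expected, hand_typed))
```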

In [9]:
fail_test = "<s>  [INST] The future of AI is [/INST]"
outputs = pipeline(fail_test, max_new_tokens=256, do_sample=True, temperature=0.01, top_k=1)
print(outputs[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


<s>  [INST] The future of AI is [/INST] a topic of great interest and debate among experts in the field of artificial intelligence (AI). Some believe that AI will continue to advance at an exponential rate, leading to significant breakthroughs in areas such as autonomous vehicles, healthcare diagnostics, and scientific research. Others are more cautious, raising concerns about the ethical implications of advanced AI, such as job displacement, privacy invasion, and potential misuse.

One thing is certain: AI is already having a profound impact on our lives, and its role will only continue to grow in the future. Some possible developments include:

* Increased automation of repetitive and mundane tasks, freeing up humans to focus on more creative and complex problem-solving.
* Improved accuracy and efficiency in areas such as healthcare diagnostics, financial analysis, and customer service.
* Advances in machine learning and deep learning, enabling AI systems to learn and adapt to new si

In [10]:
# Show how the malformed prompt is tokenized
tokenized_prompt = tokenizer_instruct(fail_test, return_tensors="pt")
# Print each token next to its id
for token_id in tokenized_prompt["input_ids"][0]:
    print(tokenizer_instruct.decode(token_id.item()), token_id.item())
print("\n\n")

# Tokenize the correctly formatted prompt for comparison
tokenized_prompt = tokenizer_instruct(prompt, return_tensors="pt")
# Print each token next to its id
for token_id in tokenized_prompt["input_ids"][0]:
    print(tokenizer_instruct.decode(token_id.item()), token_id.item())

<s> 1
<s> 1
  259
[ 733
INST 16289
] 28793
The 415
future 3437
of 302
AI 16107
is 349
[ 733
/ 28748
INST 16289
] 28793



<s> 1
<s> 1
[ 733
INST 16289
] 28793
The 415
future 3437
of 302
AI 16107
is 349
[ 733
/ 28748
INST 16289
] 28793
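
The two listings are easier to compare programmatically. The sketch below copies the token ids from the output above and uses the standard-library `difflib` to report which ids appear only in the malformed prompt.

```python
from difflib import SequenceMatcher

# Token ids copied from the two listings above.
malformed_ids = [1, 1, 259, 733, 16289, 28793, 415, 3437, 302, 16107, 349,
                 733, 28748, 16289, 28793]
template_ids = [1, 1, 733, 16289, 28793, 415, 3437, 302, 16107, 349,
                733, 28748, 16289, 28793]

matcher = SequenceMatcher(a=template_ids, b=malformed_ids, autojunk=False)

# Collect every id that was inserted relative to the template prompt.
extra_ids = [
    malformed_ids[j]
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
    if tag == "insert"
    for j in range(j1, j2)
]

print("extra token ids in the malformed prompt:", extra_ids)
```

The only difference is token 259, the whitespace token produced by the two extra spaces after `<s>`. Both sequences also start with two `<s>` tokens (id 1), presumably because the tokenizer prepends its own beginning-of-sequence token to a string that already contains `<s>`.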
