
### Using [Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) - [Meta](https://www.meta.com/at/en/)  

##### Running on g4dn.xlarge

_launched on 25.09.24_


The release includes:

•	small and medium-sized vision LLMs: 11B and 90B

•	lightweight, text-only models: 1B and **3B**, intended for edge devices

The lightweight models support a context length of 128,000 tokens.


Vision models can be used to understand documents, answer questions based on visual content or caption images. 


Multimodal models are not accessible from the European Union, so we will only test text-based models.


In [None]:
!pip install --upgrade transformers
dbutils.library.restartPython()

In [None]:
from huggingface_hub import login

huggingface_token = "<your token>"

login(token=huggingface_token)
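
Hard-coding the token in a notebook makes it easy to leak. A minimal sketch of an alternative, assuming the token has been exported in an environment variable (the helper name `read_hf_token` is hypothetical, and `HF_TOKEN` is only the default name assumed here):

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the Hugging Face token stored in an environment variable.

    The variable name is an assumption; use whatever name your cluster defines.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before logging in.")
    return token


# Usage in the notebook (needs huggingface_hub and network access):
# from huggingface_hub import login
# login(token=read_hf_token())
```

On Databricks, a secret scope read with `dbutils.secrets.get` would be another option, depending on how the workspace is set up.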

In [None]:
import torch
from transformers import pipeline

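# Base (pretrained) checkpoint; the instruction-tuned variant is published as meta-llama/Llama-3.2-1B-Instruct
model_id = "meta-llama/Llama-3.2-1B"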

instruct = pipeline(
    "text-generation", 
    model=model_id, 
    torch_dtype=torch.bfloat16, 
    device_map="auto"
)

#### Text Generation

In [None]:
import time

prompt_generation = "Write me all about Arthur Schopenhauer."

# Time only the generation call, so that printing does not inflate the latency
start_time = time.time()
response = instruct(prompt_generation, max_new_tokens=200)
end_time = time.time()
latency = end_time - start_time
print(response[0]['generated_text'] if isinstance(response, list) else response['generated_text'])

_Note that the model hallucinates in its answer._

In [None]:
print('Latency of the model is', latency)
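
The latency measurement can be wrapped in a small reusable helper. The sketch below is one possible way to do it, not part of the original notebook: `timed_call` and `tokens_per_second` are hypothetical names, and the throughput figure is only an upper-bound estimate because `max_new_tokens` is a limit, not the number of tokens actually produced.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()  # monotonic clock, suited to short intervals
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def tokens_per_second(new_tokens: int, seconds: float) -> float:
    """Rough throughput, assuming exactly new_tokens tokens were generated."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return new_tokens / seconds


# Usage in the notebook (assumes the `instruct` pipeline defined earlier):
# response, latency = timed_call(instruct, prompt_generation, max_new_tokens=200)
# print("Latency:", latency, "s, at most", tokens_per_second(200, latency), "tokens/s")
```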

#### Text Summarization

In [None]:
template = """Write a short summary of this article for a business expert:

{article}
"""

cameroon = """The economic fallout from the COVID-19 pandemic and the subsequent global shocks provoked by the war in Ukraine have hit African countries hard, denting economic growth and aggravating their sovereign debt positions. The International Monetary Fund (IMF) forecasts that Cameroon, a Central African oil producer, will record 4.3% economic growth this year after it slumped to 0.5% in 2020. The Fund has classified Cameroon as being at high risk of debt distress, though in its most recent review of the country's loan programme it stated that, with active fiscal reforms and management, the debt could be sustainable. "Our debt service coverage from exports needs to be improved. That's the reason why we are ranked in a high risk debt distress position," said Alamine Ousmane Mey, Cameroon's minister of economy, planning and regional development. He was speaking at an event organised by the Atlantic Council think tank on the sidelines of the IMF and World Bank's Spring Meetings in Washington. "We're working to be able to improve our exports through import substitution policies to reduce imports, produce more and export more. This will give us better room for debt service coverage," he said. Cameroon has also relaunched talks with the U.S. to end its suspension from the Africa Growth and Opportunities Act (AGOA) initiative, which grants qualifying African countries tariff-free access to the U.S. market. Former President Donald Trump suspended Cameroon from the programme in late 2019 over "persistent gross violations of internationally recognised human rights" by Cameroonian security forces. Since 2017, factions of secessionist militias have been battling government troops in the majority Francophone country's two English-speaking regions. The conflict has killed thousands and displaced nearly 800,000 people. 
"All the issues that have been raised, we're working on in a very transparent open manner to be able to iron them out and solve the problems," Mey said, referring to the talks with U.S. officials to rejoin AGOA. Our Standards: The Thomson Reuters Trust Principles.
"""

response = instruct(template.format(article=cameroon), max_new_tokens=120)
print(response[0]['generated_text'] if isinstance(response, list) else response['generated_text'])
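
By default the text-generation pipeline returns the prompt followed by the continuation, so the printed output starts with the whole article. A minimal sketch of how the prompt could be removed afterwards (the helper name `strip_prompt` is hypothetical):

```python
def strip_prompt(prompt: str, generated_text: str) -> str:
    """Return only the continuation if generated_text begins with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # Leave the text unchanged when the prompt is not echoed verbatim
    return generated_text


# Usage in the notebook (assumes `instruct`, `template` and `cameroon` from above):
# prompt = template.format(article=cameroon)
# summary = strip_prompt(prompt, instruct(prompt, max_new_tokens=120)[0]["generated_text"])
```

Passing `return_full_text=False` to the pipeline call should achieve the same result without post-processing.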

#### Coding Task

_Coding challenge from:_ https://edabit.com/challenge/ZdnwC3PsXPQTdTiKf

In [None]:
coding_template = """Write a code in Python to solve the following task:

{task}

Starter: 

{starter}
"""

coding_test = """Create a Python function that takes two numbers and a mathematical operator + - / * and will perform a calculation with the given numbers. If the input tries to divide by 0, return: Can't divide by 0! """

# examples = """calculator(2, "+", 2) ➞ 4

# calculator(2, "*", 2) ➞ 4

# calculator(4, "/", 2) ➞ 2"""

starter = """def calculator(num1, operator, num2):"""


response = instruct(coding_template.format(task=coding_test, starter=starter), max_new_tokens=1000)
print(response[0]['generated_text'] if isinstance(response, list) else response['generated_text'])
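
To judge the generated code rather than just read it, the function the model produces can be run against the examples from the challenge. The sketch below is an illustration under assumptions: `reference_calculator` is one possible solution written for this purpose, not the model's output or the official answer, and `passes_examples` is a hypothetical helper.

```python
import operator as op

DIVIDE_BY_ZERO_MESSAGE = "Can't divide by 0!"

# The three examples from the challenge, plus the divide-by-zero rule
CASES = [
    ((2, "+", 2), 4),
    ((2, "*", 2), 4),
    ((4, "/", 2), 2),
    ((4, "/", 0), DIVIDE_BY_ZERO_MESSAGE),
]


def reference_calculator(num1, operator, num2):
    """One possible solution, used only as a baseline for comparison."""
    if operator == "/" and num2 == 0:
        return DIVIDE_BY_ZERO_MESSAGE
    operations = {"+": op.add, "-": op.sub, "*": op.mul, "/": op.truediv}
    return operations[operator](num1, num2)


def passes_examples(candidate) -> bool:
    """Return True if candidate gives the expected result for every case."""
    for args, expected in CASES:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            return False
    return True
```

A function extracted from the model's answer could then be passed to `passes_examples` in place of the baseline.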

_Coding challenge from:_ https://edabit.com/challenge/3A3mHS5B3NNZddQL2

In [None]:
coding_template = """Write a code in Python to solve the following task:

{task}

Starter:

{starter}

"""

coding_test = """Create a function to check if a candidate is qualified in an imaginary coding interview of an imaginary tech startup.

The criteria for a candidate to be qualified in the coding interview is:
The candidate should have completed all the questions.
The maximum time given to complete the interview is 120 minutes.
The maximum time given for very easy questions is 5 minutes each.
The maximum time given for easy questions is 10 minutes each.
The maximum time given for medium questions is 15 minutes each.
The maximum time given for hard questions is 20 minutes each.
If all the above conditions are satisfied, return "qualified", else return "disqualified".

You will be given a list of time taken by a candidate to solve a particular question and the total time taken by the candidate to complete the interview.

The given list, in a true condition, will always be in the format [very easy, very easy, easy, easy, medium, medium, hard, hard].

The maximum time to complete the interview includes a buffer time of 20 minutes."""

starter = """def interview(lst, tot):"""


response = instruct(coding_template.format(task=coding_test, starter=starter), max_new_tokens=500)
print(response[0]['generated_text'] if isinstance(response, list) else response['generated_text'])
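
The rules in this task are mechanical enough to write down as a baseline for checking the model's answer. The sketch below reflects one reading of the task statement: `interview_reference` and `LIMITS` are hypothetical names, and treating a list with fewer than eight entries as "not all questions completed" is an assumption.

```python
# Per-question limits in minutes:
# [very easy, very easy, easy, easy, medium, medium, hard, hard]
LIMITS = [5, 5, 10, 10, 15, 15, 20, 20]
MAX_TOTAL_MINUTES = 120  # already includes the 20-minute buffer


def interview_reference(lst, tot):
    """Return "qualified" if every rule in the task statement is met."""
    all_answered = len(lst) == len(LIMITS)
    within_limits = all(taken <= limit for taken, limit in zip(lst, LIMITS))
    within_total = tot <= MAX_TOTAL_MINUTES
    if all_answered and within_limits and within_total:
        return "qualified"
    return "disqualified"
```

Comparing the output of the generated `interview` function with this baseline on a few inputs gives a quick correctness check.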

The minimum cluster configuration needed to run the model is **g4dn.xlarge[T4]** - 16 GB memory, 1 GPU.

Cost: 0.71 DBU/h.

Latency of the model: 6 s.
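
These figures can be put into perspective with some back-of-the-envelope arithmetic. The sketch below makes several assumptions that are not in the notebook: a parameter count of roughly 1.23 billion for the 1B model, 2 bytes per parameter for bfloat16 weights, and a cluster that serves requests one after another with no idle time. The function names are hypothetical.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory taken by the weights alone, in GiB.

    Excludes activations, the KV cache and framework overhead.
    """
    return n_params * bytes_per_param / 1024**3


def dbu_per_request(dbu_per_hour: float, latency_s: float) -> float:
    """DBUs consumed by one request, assuming back-to-back sequential requests."""
    return dbu_per_hour * latency_s / 3600


ASSUMED_PARAMS = 1.23e9  # assumption: approximate size of the 1B checkpoint

print(f"Weights in bfloat16: ~{weight_memory_gib(ASSUMED_PARAMS):.1f} GiB")
print(f"DBU per 6 s request at 0.71 DBU/h: ~{dbu_per_request(0.71, 6):.4f}")
```

Under these assumptions the weights take only a small part of the T4's memory, which is consistent with the model fitting on a single g4dn.xlarge.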