<a href="https://colab.research.google.com/github/soberbichler/Workshop_QualitativeDataResearch_LLM/blob/main/Analyze_Dataset_TogetherAI_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setting up the Requirements to Use Models via Together.AI

**Together.ai** is an API service that hosts open-source large language models and exposes them through a unified interface modeled on OpenAI's API format. Instead of running models yourself or relying only on proprietary options like GPT-4, you make API calls to models such as Llama or Mixtral that Together.ai runs on its own infrastructure. The service acts as a middleman: it handles compute and model hosting while you pay per token for inference, much like OpenAI but with different models and typically lower costs. This is useful when you want to use open-source models without setting up your own GPU infrastructure, though you remain dependent on the service's availability and pricing.

## Model

We will use the `meta-llama/Llama-3.3-70B-Instruct-Turbo` version of the Llama 3.3 70B model.

In [None]:
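import os
from getpass import getpass
if not os.environ.get("TOGETHER_API_KEY"):  # prompt only if the key is not set yet
    os.environ["TOGETHER_API_KEY"] = getpass("Together API key: ")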


In [None]:
!pip install together
from together import Together

## Load the Dataset

You can find it here: https://github.com/soberbichler/Workshop_QualitativeDataResearch_LLM/blob/main/data/SummerSchool_dataset.xlsx

In [None]:
import pandas as pd
df = pd.read_excel('SummerSchool_dataset.xlsx')
df.head()
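
Before sending anything to the API, it can help to check that the text column is usable. The sketch below assumes the texts are in a column named `extracted_articles` (the name used in the analysis cell); `summarize_texts` is a hypothetical helper for illustration, not part of pandas, and the sample values are made up.

```python
def summarize_texts(texts):
    """Count usable, empty, and non-string entries in a list of cell values."""
    usable = [t for t in texts if isinstance(t, str) and t.strip()]
    empty = sum(1 for t in texts if isinstance(t, str) and not t.strip())
    # Empty Excel cells are read by pandas as NaN, which is a float, not a string
    non_string = sum(1 for t in texts if not isinstance(t, str))
    longest = max((len(t) for t in usable), default=0)
    return {
        "usable": len(usable),
        "empty": empty,
        "non_string": non_string,
        "longest_chars": longest,
    }

# Stand-in values; with the real data one could call
# summarize_texts(df['extracted_articles'].tolist())
sample = ["Ein Artikel über das Erdbeben.", "   ", float("nan"), "Kurz."]
print(summarize_texts(sample))
```

Rows counted as `empty` or `non_string` would still be sent to the model by the analysis cell, so it may be worth filtering them out first.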

## Analyze the Dataset


> ***After running the cell, fill in the model documentation while waiting (and continue after saving the results)!***



Model documentation: https://seafile.rlp.net/seafhttp/f/a5b34ec61267408da431/?op=view

In [None]:
from together import Together
import pandas as pd

SYSTEM_PROMPT = """You are an expert at analyzing historical texts and you dislike to summarize.
Your task: find argumentative units in historical articles.
Arguments in newspapers are often implicit but should contain a clear premise (with an inclusive claim).

OUTPUT FORMAT - EXACTLY these 4 XML tags and NOTHING else:
<argument>Original argument text OR "NA"</argument>
<claim>Core claim (implication) in one sentence OR "NA"</claim>
<explanation>Why this is an argument OR "NA"</explanation>
<human_verification_needed>True OR False</human_verification_needed>

EXAMPLE WITH ARGUMENT:
<argument>Es sind furchtbare Bilder, die sich dabei entrollen. Unter den Trümmern des einen Hauses, so erzählt Luigi Barsini im Corriere della Sera, findet man die Leichen von Unglücklichen, die in anderen Häusern gewohnt haben und die in der Verwirrung des schrecklichen Augenblickes instinktiv bei Fremden Hilfe und Unterschlupf suchten. Niemand erkennt jetzt diese armen Eindringlinge, ihre Leichen werden nicht reklamiert, und man trägt sie hinunter an den Strand, wo sie in langer Reihe einer neben den anderen hingebettet werden, in denselben Tüchern und Decken, in denen sie ihren Tod gefunden.</argument>
<claim>The earthquake's chaos led to unidentified victims dying in unfamiliar places.</claim>
<explanation>Describes how people fled to other houses seeking help during the disaster, died there, and now cannot be identified or claimed by relatives. Shows cause (panic/confusion) and effect (anonymous deaths).</explanation>
<human_verification_needed>False</human_verification_needed>

EXAMPLE WITHOUT ARGUMENT:
<argument>NA</argument>
<claim>NA</claim>
<explanation>NA</explanation>
<human_verification_needed>False</human_verification_needed>

RULES:
- NO SUMMARY; ONLY ORIGINAL EXCERPT FROM THE TEXT; don't extract anything that is not in the text. Only extract word by word.
- ONLY output these 4 XML tags.
- Factual reportings such as "Dem Vulkanausbruch folgten drei Sturzwellen in etwa 10 Meter Höhe" are NO arguments.
- Extract only original text without changes or use NA when you did not find an argument.
- The claim is not a translation or summary of the argument. It should say what the (implicit) argument implies.
- In cases of uncertainty or ambiguity, set <human_verification_needed>True</human_verification_needed>.
- If no argument exists, use NA for all fields except <human_verification_needed>.
- More than one argumentative unit possible per article; one unit has one clear claim and all four XML fields.
"""

def ask_llama_api(text, temperature=0.1, max_tokens=1000, random_seed=1):
    """Simple Llama API call using Together API with system + user roles"""

    client = Together()  # auth defaults to os.environ.get("TOGETHER_API_KEY")

    try:
        response = client.chat.completions.create(
            model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"Extract arguments from this text:\n{text}"}
            ],
            temperature=temperature,
            max_tokens=max_tokens,
            seed=random_seed  # the Together chat API names this parameter "seed"
        )
        return response.choices[0].message.content
    except Exception as e:
        print(f"Error: {e}")
        return "Error"


df['model_answer_llama'] = df['extracted_articles'].apply(ask_llama_api)
df
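
Because `ask_llama_api` returns the string "Error" when a request fails, a transient network or rate-limit problem leaves that row without an answer. One option is to retry failed calls with a growing pause between attempts. The sketch below is a generic wrapper; `with_retries`, its parameters, and the stand-in function `flaky` are hypothetical names for illustration and are not part of the Together SDK.

```python
import time

def with_retries(func, max_attempts=3, base_delay=1.0, failure_value="Error", sleep=time.sleep):
    """Wrap func so that calls returning failure_value are retried with exponential backoff."""
    def wrapper(*args, **kwargs):
        result = failure_value
        for attempt in range(max_attempts):
            result = func(*args, **kwargs)
            if result != failure_value:
                return result
            if attempt < max_attempts - 1:
                # Pause base_delay, then 2x, then 4x, ... before the next attempt
                sleep(base_delay * 2 ** attempt)
        return result  # still failure_value after the last attempt
    return wrapper

# Stand-in for an API call that fails twice before succeeding
calls = {"n": 0}

def flaky(text):
    calls["n"] += 1
    return "Error" if calls["n"] < 3 else f"answer for: {text}"

safe_call = with_retries(flaky, base_delay=0.01)
print(safe_call("some article"))
```

In this notebook the wrapper could be applied as `df['extracted_articles'].apply(with_retries(ask_llama_api))`. Rows that still contain "Error" afterwards would need to be rerun or checked by hand.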


## Export the Dataset Under a Name Other Than "results"

In [None]:
df.to_excel('results.xlsx', index=False)  # change 'results' to a file name of your own
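
The `model_answer_llama` column stores each answer as raw text containing the four XML tags requested in the system prompt. For later analysis it may be easier to split these into one record per argumentative unit. The sketch below uses a regular expression rather than an XML parser, since model output is not guaranteed to be well-formed XML. `parse_units` and `is_verbatim` are hypothetical helper names, and the sample answer is made up for illustration. Answers that deviate from the requested format may be skipped or split incorrectly and should be checked by hand.

```python
import re

TAGS = ["argument", "claim", "explanation", "human_verification_needed"]

# One unit = the four tags in the order given in the prompt,
# with optional whitespace between them
UNIT_PATTERN = re.compile(
    r"\s*".join(rf"<{tag}>(?P<{tag}>.*?)</{tag}>" for tag in TAGS),
    re.DOTALL,
)

def parse_units(answer):
    """Return a list of dicts, one per four-tag unit matched in the answer."""
    if not isinstance(answer, str):
        return []
    return [
        {tag: match.group(tag).strip() for tag in TAGS}
        for match in UNIT_PATTERN.finditer(answer)
    ]

def _normalize(text):
    """Collapse all runs of whitespace (including line breaks) to single spaces."""
    return " ".join(text.split())

def is_verbatim(argument, source):
    """Check that the extracted argument occurs in the source, ignoring whitespace differences."""
    return _normalize(argument) in _normalize(source)

sample_answer = """<argument>Niemand erkennt jetzt diese armen Eindringlinge.</argument>
<claim>The victims cannot be identified.</claim>
<explanation>Links the chaos to anonymous deaths.</explanation>
<human_verification_needed>False</human_verification_needed>"""

units = parse_units(sample_answer)
print(units[0]["claim"])
print(is_verbatim(units[0]["argument"], "Niemand erkennt jetzt\ndiese armen Eindringlinge. Ihre Leichen ..."))
```

With the DataFrame from this notebook, `df['model_answer_llama'].apply(parse_units)` would give one list of units per article. Since the prompt asks for word-by-word excerpts, `is_verbatim` offers a quick way to flag units whose `argument` text does not appear in the original article.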