# Testing OpenRouter models

We will test how well various models available on OpenRouter extract transport-related information from text.

We will use an OpenAI-style API to access the models.

In [44]:
# first we need to import the basic libraries
# date
from datetime import datetime
now = datetime.now()
print(f"Date: {now}")
# python version
import sys
print(f"Python version: {sys.version}")
from pathlib import Path
import json
# import time for delay
import time
import requests
# print version
print(f"Requests version: {requests.__version__}")
# import OpenAI
# let's try using OpenAI API
from openai import OpenAI
# import openai version
from openai import __version__ as openai_version
print(f"OpenAI version: {openai_version}")

# load OPENROUTER_API key from system environment
import os
openrouter_api_key = os.getenv('OPENROUTER_API_KEY')
# assert key is not None
assert openrouter_api_key is not None, "OPENROUTER_API_KEY is not set"
print("We have the OPENROUTER_API_KEY should be good to go")

Date: 2025-01-24 14:44:58.356330
Python version: 3.12.6 (tags/v3.12.6:a4a2d2b, Sep  6 2024, 20:11:23) [MSC v.1940 64 bit (AMD64)]
Requests version: 2.32.3
OpenAI version: 1.59.9
We have the OPENROUTER_API_KEY should be good to go


In [7]:
# let's make a request to the OpenRouter API
response = requests.post(
  url="https://openrouter.ai/api/v1/chat/completions",
  headers={
    "Authorization": f"Bearer {openrouter_api_key}",
    "HTTP-Referer": "VSAU", # Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "VSAU", # Optional. Site title for rankings on openrouter.ai.
  },
  data=json.dumps({
    "model": "openai/gpt-3.5-turbo", # Optional
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
    
  })
)
# print the response
print(response.json())

{'id': 'gen-1737619944-weNIM6vfs4oTmlWvCRGb', 'provider': 'OpenAI', 'model': 'openai/gpt-3.5-turbo', 'object': 'chat.completion', 'created': 1737619944, 'choices': [{'logprobs': None, 'finish_reason': 'stop', 'native_finish_reason': 'stop', 'index': 0, 'message': {'role': 'assistant', 'content': 'The meaning of life is a deeply philosophical and personal question that has been debated and contemplated by humans for centuries. Different individuals and cultures may have their own interpretations and beliefs about the purpose and significance of life. Some may believe that the meaning of life is to find happiness and fulfillment, to serve others, to seek spiritual enlightenment, to pursue personal growth and self-improvement, or to contribute to the greater good of society. Ultimately, the meaning of life is subjective and can vary greatly from person to person.', 'refusal': None}}], 'system_fingerprint': None, 'usage': {'prompt_tokens': 14, 'completion_tokens': 102, 'total_tokens': 116}

In [4]:
# print actual content
print(response.json()['choices'][0]['message']['content'])

The meaning of life is a philosophical question that has been debated for centuries. Some believe that the meaning of life is to seek happiness and fulfillment, others believe it is to serve a higher purpose or follow a specific set of beliefs or values. Ultimately, the meaning of life is subjective and can vary from person to person. It is up to each individual to determine their own meaning and purpose in life.
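
Indexing straight into `response.json()['choices'][0]` raises a bare `KeyError` when the request fails. A minimal sketch of a more defensive accessor follows. `extract_content` is a hypothetical helper, not part of the notebook, and it assumes that a failed request comes back as a JSON body with a top-level `error` key.

```python
def extract_content(payload: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion payload."""
    # Assumption: failures are reported as a JSON body with an "error" key.
    if "error" in payload:
        raise RuntimeError(f"API error: {payload['error']}")
    choices = payload.get("choices") or []
    if not choices:
        raise RuntimeError("Response contains no choices")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise RuntimeError("First choice has no message content")
    return content


# Trimmed-down payload with the same shape as the response printed above.
sample = {
    "choices": [{"message": {"role": "assistant", "content": "42"}}],
    "usage": {"prompt_tokens": 14, "completion_tokens": 102, "total_tokens": 116},
}
print(extract_content(sample))
```

With the `requests` call above this would be used as `extract_content(response.json())`.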


In [8]:
# let's try using OpenAI API
# from openai import OpenAI

client = OpenAI(
  base_url="https://openrouter.ai/api/v1",
  api_key=openrouter_api_key,
)

completion = client.chat.completions.create(
  extra_headers={
    "HTTP-Referer": "VSAU", # Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "VSAU", # Optional. Site title for rankings on openrouter.ai.
  },
  model="openai/gpt-3.5-turbo",
  messages=[
    {
      "role": "user",
      "content": "What is the meaning of life?"
    }
  ]
)
print(completion.choices[0].message.content)

The meaning of life is a deeply philosophical question with many different interpretations. Some believe that the meaning of life is to seek happiness and fulfillment, others believe it is to fulfill a specific purpose or destiny, and some believe that life has no inherent meaning and it is up to each individual to create their own meaning. Ultimately, the meaning of life is subjective and can vary greatly from person to person.


In [45]:
# let's make a function that calls the OpenAI-style chat API; its parameters are:
# user_prompt, system_prompt, base_url, api_key, model, headers, verbose, max_retries, delay
base_url = "https://openrouter.ai/api/v1"
api_key = openrouter_api_key
model = "google/gemini-flash-1.5"  # earlier choice, overridden on the next line
model = "google/gemini-2.0-flash-exp:free"
print(f"Using model {model}")
role = "user"
headers = {
  "HTTP-Referer": "VSAU", # Optional. Site URL for rankings on openrouter.ai.
  "X-Title": "VSAU", # Optional. Site title for rankings on openrouter.ai.
}

system_prompt = """Please extract a comprehensive list of all mentioned land transportation vehicles from the given texts.
Include specific types (e.g., cars, trains, bicycles) as well as broader categories (e.g., public transportation).
Provide the terms in Latvian, preserving their original transcription variants as they appear in the text. Text:"""


def openai_chat(user_prompt, 
                system_prompt=system_prompt, 
                base_url=base_url,
                api_key=api_key, 
                model=model, 
                headers=headers,
                verbose=True,
                max_retries=10,
                delay=0.5):
  client = OpenAI(
    base_url=base_url,
    api_key=api_key,
  )
  messages = [
        {
        "role": "system",
        "content": system_prompt
        },
        {
        "role": "user",
        "content": user_prompt
        }
    ]
  # if verbose let's show model, system_prompt and start time for request
  if verbose:
    print(f"Using model {model}")
    print(f"System prompt: {system_prompt}")
    start_time = datetime.now()
    print(f"Start time: {start_time}")
  completion = None
  tries = 0
  while completion is None and tries < max_retries:
    try:
      completion = client.chat.completions.create(
        extra_headers=headers,
        model=model,
        messages=messages
      )
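    except Exception as e:
      tries += 1
      print(f"Error: {e}")
      print(f"Trying again in {delay} seconds...")
      # wait before the next attempt
      time.sleep(delay)
  if verbose:
    print(f"Time taken: {datetime.now() - start_time}")
  # all retries failed: raise a clear error instead of an AttributeError on None
  if completion is None:
    raise RuntimeError(f"No completion received after {max_retries} tries")
  return completion.choices[0].message.content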


Using model google/gemini-2.0-flash-exp:free
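
The retry loop in `openai_chat` waits a fixed `delay` between attempts. When the failures are rate limits, a growing wait is a common alternative. The sketch below shows exponential backoff in isolation. `call_with_retries` and its parameters are hypothetical names, and the `sleep` argument exists only so the wait can be swapped out in tests.

```python
import time


def call_with_retries(fn, max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Call fn() until it succeeds, doubling the wait after each failure."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:  # broad on purpose, mirroring openai_chat
            last_error = e
            # no point in waiting after the final attempt
            if attempt < max_retries - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"Giving up after {max_retries} attempts") from last_error
```

Inside `openai_chat` the wrapped call would be something like `call_with_retries(lambda: client.chat.completions.create(...))`.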


In [10]:
# let's test the function with a prompt
prompt = """Pa ceļu gāja cilvēks ar suni, bet viņš bija ļoti noguris, tāpēc viņš nolēma izmantot sabiedrisko transportu. 
Vēlāk viņš sēdēja autobusā un skatījās uz logu.
Šis cilvēks sapņoja par jūru un pludmali, bet viņam bija jābrauc uz darbu.
Cik labī būtu bijis sēdēt laiva, makšķerēt zivis un baudīt sauli!"""
response = openai_chat(prompt)
print(response)

* sabiedriskais transports
* autobuss



In [13]:
# let's list all text files in data/docs folder
data_folder = Path("../data/docs")
# list all files
files = list(data_folder.glob("*.txt"))
# print all files
print(*files, sep="\n")


..\data\docs\JaunJ_Ziema_417458.txt
..\data\docs\PaulM_ProfS_1051757.txt
..\data\docs\UpitA_Sievi_771386.txt
..\data\docs\UpitA_Siev_part_1.txt


In [15]:
# let's open last file and read the content using utf-8 encoding
file = files[-1]
with open(file, encoding="utf-8") as f:
  text = f.read()
# first 300 characters
print(text[:300])

Sieviete


Pirmā daļa

Guberņas rentejas ierēdnis Apse pamodās, kad pulkstens aiz sienas ēdamistabā sāka sist. Pusmiegā viņš skaitīja... un nevarēja saskaitīt. Bet tomēr beigās zināja, ka četri. Četri... viņš vairāk reižu domās atkārtoja šo skaitli. Tad vienreiz izrunāja balsī tā, ka pats varēja dzi


In [17]:
# let's test the function with the text
response = openai_chat(text)
print(response)

Using model google/gemini-2.0-flash-exp:free
System prompt: Please extract a comprehensive list of all mentioned land transportation vehicles from the given texts.
Include specific types (e.g., cars, trains, bicycles) as well as broader categories (e.g., public transportation).
Provide the terms in Latvian, preserving their original transcription variants as they appear in the text. Text:
Start time: 2025-01-23 10:22:22.422112
Time taken: 0:00:11.144166
Pēc dotā teksta, zemes transporta līdzekļi ir:

*  zābaki (nav gan transportlīdzeklis, bet ir saistīts ar pārvietošanos pa zemi; zābaki, un arī kājas, bija bieži pieminēti)
*  kamanas (gan vienjūga, gan divjūga kamanas)
*  tramvaja vagons
*  dzelzceļa vilciens (minēts caur sajūtām - "vai sēd dzelzceļa vilcienā")
*  ekipaža (minēts kā nākotnes sapnis: "brauks sava paša eķipažā")
*  automobiļi (nav tieši minēti, bet netieši pieņemti, ka tie varētu pastāvēt, kad runāts par "pazīstamās automobiļu skaņām")
*  riteņi (saistīti ar velobraucēji

In [20]:
print(system_prompt)

Please extract a comprehensive list of all mentioned land transportation vehicles from the given texts.
Include specific types (e.g., cars, trains, bicycles) as well as broader categories (e.g., public transportation).
Provide the terms in Latvian, preserving their original transcription variants as they appear in the text. Text:


In [29]:
system_prompt = """You are an expert on Latvian language and transportation.
Please extract a comprehensive list of all mentioned land transportation vehicles from the given texts.
Show all specific land transportation types found in the text as a list.
No other information is needed, just the transportation terms as they appear in the text.
Provide the terms only in Latvian, preserving their original transcription variants as they appear in the text. Latvian Text follows: """

In [27]:
# let's test the function with the text
response = openai_chat(text, system_prompt=system_prompt)
print(response)

Using model google/gemini-2.0-flash-exp:free
System prompt: You are an expert on Latvian language and transporation.
Please extract a comprehensive list of all mentioned land transportation vehicles from the given texts.
Show all specific land transportation types found in the text as a list.
No other information is needed, just the transportation terms as they appear in the text.
Provide the terms only in Latvian, preserving their original transcription variants as they appear in the text. Latvian Text follows: 
Start time: 2025-01-24 14:23:16.215333
Time taken: 0:00:12.244052
* zābakiem
* gājēji
* braucēji
*  klavieres
* krēsla
*  vilciens
* kamanu
*  šoseju
*  tramvaja
* vagons
* eķipažā
*  dzelzceļa
* slidotavu
* pajūgu
*  pulktiens
*  ormani
* autobuss
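
The replies so far mix bullet markers, uneven spacing, bold markup and parenthetical remarks, so comparing models requires the bare terms first. A rough sketch of such a parser is below. `parse_terms` is a hypothetical helper, and it assumes every term sits on its own bulleted or numbered line. Nested parentheses are not handled.

```python
import re


def parse_terms(response: str) -> list[str]:
    """Collect the term from every bulleted or numbered line of a model reply."""
    terms = []
    for line in response.splitlines():
        match = re.match(r"\s*(?:[*\-•]|\d+\.)\s+(.*)", line)
        if not match:
            continue  # skip headings, chatter and blank lines
        term = re.sub(r"\(.*?\)", "", match.group(1))  # drop parenthetical glosses
        term = term.replace("**", "").strip()          # drop bold markers
        if term and term not in terms:
            terms.append(term)
    return terms


reply = """Okay, here is the list:
**1. Keywords from the list found in the text:**
*   kamanas
*    mašīna
* **tramvaja vagons** (tram)
1. **vilciens** (train)
*   kamanas
"""
print(parse_terms(reply))
```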



In [28]:
transport_lemmas = sorted("""zirgs
dzelzceļš
ormanis
pajūgs
rati
kariete
kamanas
laiva
plosts
burinieks
kuģis
tvaikonis
velosipēds
automašīna
mašīna
vilciens
lokomotīve
tramvajs
bānītis
vāģi
stīmeris
zēģele
fords
Mercedes
""".split())
transport_lemmas

['Mercedes',
 'automašīna',
 'burinieks',
 'bānītis',
 'dzelzceļš',
 'fords',
 'kamanas',
 'kariete',
 'kuģis',
 'laiva',
 'lokomotīve',
 'mašīna',
 'ormanis',
 'pajūgs',
 'plosts',
 'rati',
 'stīmeris',
 'tramvajs',
 'tvaikonis',
 'velosipēds',
 'vilciens',
 'vāģi',
 'zirgs',
 'zēģele']
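
Before asking a model which of these lemmas occur in a text, a plain string search can provide a baseline to compare against. Latvian nouns are inflected, so the sketch below matches on a crude stem rather than the full lemma. `crude_stem` and `lemmas_in_text` are hypothetical helpers, and the ending list is an illustrative heuristic, not real Latvian morphology. It will over-match on short stems and miss forms with consonant changes.

```python
import re


def crude_stem(lemma: str) -> str:
    """Strip a likely nominative ending (heuristic, not a real lemmatizer)."""
    for ending in ("as", "is", "us", "s", "š", "a", "e", "i"):
        if lemma.endswith(ending) and len(lemma) - len(ending) >= 3:
            return lemma[: -len(ending)]
    return lemma


def lemmas_in_text(lemmas, text):
    """Map each lemma to the distinct word forms in text that start with its stem."""
    words = re.findall(r"\w+", text.lower())
    hits = {}
    for lemma in lemmas:
        stem = crude_stem(lemma.lower())
        forms = sorted({w for w in words if w.startswith(stem)})
        if forms:
            hits[lemma] = forms
    return hits


sample = "Viņš sēdēja kamanās, bet zirgi jau gaidīja pie tramvaja."
print(lemmas_in_text(["kamanas", "zirgs", "tramvajs", "laiva"], sample))
```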

In [30]:
system_prompt_2 = f"""You are an expert on Latvian language and transportation.
1. Please identify which keywords from the given list of land transportaton vehicles are present in the given text.
2. List of keywords: {" ".join(transport_lemmas)}
3. Identify all other land transportation vehicles in the given text.
4. Provide terms in Latvian as a list, preserving their original spelling variants as they appear in the text."""

In [38]:
# let's test the function with the text
response = openai_chat(text, system_prompt=system_prompt_2)
print(response)

Using model google/gemini-2.0-flash-exp:free
System prompt: You are an expert on Latvian language and transportation.
1. Please identify which keywords from the given list of land transportaton vehicles are present in the given text.
2. List of keywords: Mercedes automašīna burinieks bānītis dzelzceļš fords kamanas kariete kuģis laiva lokomotīve mašīna ormanis pajūgs plosts rati stīmeris tramvajs tvaikonis velosipēds vilciens vāģi zirgs zēģele
3. Identify all other land transportation vehicles in the given text.
4. Provide terms in Latvian as a list, preserving their original spelling variants as they appear in the text.
Start time: 2025-01-24 14:33:04.955773
Time taken: 0:00:06.156933
Okay, I can help with that! Let's break down the text as you requested.

**1. Keywords from the list found in the text:**

From the provided keyword list, the following terms are present in the text:

*   kamanas
*    mašīna
*   zirgs 
*   tramvajs
*   vilciens
*   vāģi

**2. Additional Land Transportati

In [39]:
files

[WindowsPath('../data/docs/JaunJ_Ziema_417458.txt'),
 WindowsPath('../data/docs/PaulM_ProfS_1051757.txt'),
 WindowsPath('../data/docs/UpitA_Sievi_771386.txt'),
 WindowsPath('../data/docs/UpitA_Siev_part_1.txt')]

In [40]:
# let's load the second text; profesors holds the file path
profesors = files[1]
with open(profesors, encoding="utf-8") as f:
  text = f.read()
# first 300 characters
print(text[:300])

title: Profesora Sūnas brīnišķīgais eleksīrs
isPartOf:
creator: Paulockis, Miķelis
dateIssued: 1938
publisher: Senatne
publicationPlace: Rīga
firstPublished:
firstEdition: 1938
dateModified:
copyright:
uri: http://dom.lndb.lv/data/obj/1051757
normalization:
ocrCorrection:
verified:



Profesora Sūna


In [41]:
# now let's use pandas to load last csv from csv subfolder
import pandas as pd

csv_folder = Path("../csv")
# assert csv_folder exists
assert csv_folder.exists(), "csv folder does not exist"
# list all files
csv_files = list(csv_folder.glob("*.csv"))
# sort by last modified
csv_files = sorted(csv_files, key=lambda x: x.stat().st_mtime)
# get latest file
csv_file = csv_files[-1]
# read csv file
df = pd.read_csv(csv_file)
# print first 5 rows
df.head()

Unnamed: 0,original,term,score
0,zirgs,vezums,0.813284
1,zirgs,rats,0.804218
2,zirgs,kumeļš,0.764619
3,zirgs,bēris,0.740647
4,zirgs,zirdziņš,0.732735


In [42]:
# get unique values from first column
unique_values = df.iloc[:, 0].unique()
# print unique values
print(unique_values)

['zirgs' 'dzelzceļš' 'ormanis' 'pajūgs' 'kariete' 'kamanas' 'laiva'
 'plosts' 'burinieks' 'kuģis' 'tvaikonis' 'velosipēds' 'automašīna'
 'mašīna' 'vilciens' 'lokomotīve' 'tramvajs' 'bānītis' 'stīmeris' 'zēģele']


In [48]:
land_transportation = """'zirgs' 'dzelzceļš' 'ormanis' 'pajūgs' 'kariete' 'kamanas'
'velosipēds' 'automašīna' 'mašīna' 'vilciens' 'lokomotīve' 'tramvajs' 'bānītis'"""
# get rid of '
land_transportation = land_transportation.replace("'", "")
# split by space
land_transportation = land_transportation.split()
# print land_transportation
print(land_transportation)

['zirgs', 'dzelzceļš', 'ormanis', 'pajūgs', 'kariete', 'kamanas', 'velosipēds', 'automašīna', 'mašīna', 'vilciens', 'lokomotīve', 'tramvajs', 'bānītis']
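
The same list can be derived from the CSV values directly instead of editing a pasted string. A small sketch is below. The `water_transport` set is inferred by comparing the two lists above, and `unique_values` is written out as a plain list here, whereas the notebook holds it as a NumPy array.

```python
# Values copied from the unique_values output above.
unique_values = [
    "zirgs", "dzelzceļš", "ormanis", "pajūgs", "kariete", "kamanas", "laiva",
    "plosts", "burinieks", "kuģis", "tvaikonis", "velosipēds", "automašīna",
    "mašīna", "vilciens", "lokomotīve", "tramvajs", "bānītis", "stīmeris", "zēģele",
]

# Assumption: these are the terms that were dropped by hand as water transport.
water_transport = {"laiva", "plosts", "burinieks", "kuģis", "tvaikonis", "stīmeris", "zēģele"}

# Keep the original order, skipping the water terms.
land_transportation = [term for term in unique_values if term not in water_transport]
print(land_transportation)
```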


In [49]:
# now let's recreate system_prompt_2 as system_prompt_3, using only the land transportation keywords
system_prompt_3 = f"""You are an expert on Latvian language and transportation.
1. Please identify which keywords from the given list of land transportaton vehicles are present in the given text.
2. List of keywords: {" ".join(land_transportation)}
3. Identify all other land transportation vehicles in the given text.
4. Provide terms in Latvian as a list, preserving their original spelling variants as they appear in the text."""


In [50]:
# let's load the first file as text
ziema_file = files[0]
with open(ziema_file, encoding="utf-8") as f:
  text = f.read()
# first 300 characters
print(text[:300])

title: Ziema
isPartOf:
creator: Jaunsudrabiņš, Jānis, 1877-1962
dateIssued: 1925
publisher: Valters un Rapa
publicationPlace: Rīga
firstPublished:
firstEdition: 1925
dateModified: 26.06.2024
copyright:
uri: http://dom.lndb.lv/data/obj/417458
normalization:
ocrCorrection: Jā
verified: 26.06.2024, And
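
Three of the four files start with a block of `key: value` metadata, which is sent to the model along with the novel itself. A minimal sketch for separating the two is below. `split_header` is a hypothetical helper, and it assumes the header is a run of single-word keys followed by a colon and ends at the first line that does not fit that pattern.

```python
def split_header(raw: str):
    """Split leading 'key: value' lines from the body of a document."""
    lines = raw.splitlines()
    meta = {}
    i = 0
    while i < len(lines):
        key, sep, value = lines[i].partition(":")
        # stop at the first line that is not "<single word>:<anything>"
        if not sep or not key.isidentifier():
            break
        meta[key] = value.strip()
        i += 1
    body = "\n".join(lines[i:]).lstrip("\n")
    return meta, body


raw = (
    "title: Ziema\n"
    "isPartOf:\n"
    "uri: http://dom.lndb.lv/data/obj/417458\n"
    "verified: 26.06.2024, And\n"
    "\n\n"
    "Ziema\n\nTeksts"
)
meta, body = split_header(raw)
print(meta["title"], "|", body[:5])
```

A file without a header, such as the one that starts with `Sieviete`, comes back with an empty `meta` and an unchanged body.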


In [51]:
# let's test the function with the Ziema text and system_prompt_3
response = openai_chat(text, system_prompt=system_prompt_3)
print(response)

Using model google/gemini-2.0-flash-exp:free
System prompt: You are an expert on Latvian language and transportation.
1. Please identify which keywords from the given list of land transportaton vehicles are present in the given text.
2. List of keywords: zirgs dzelzceļš ormanis pajūgs kariete kamanas velosipēds automašīna mašīna vilciens lokomotīve tramvajs bānītis
3. Identify all other land transportation vehicles in the given text.
4. Provide terms in Latvian as a list, preserving their original spelling variants as they appear in the text.
Start time: 2025-01-24 14:57:33.166346
Time taken: 0:00:11.862424
Okay, here's the breakdown of land transportation terms from the provided text, based on your criteria:

**1. Keywords present in the text:**

Based on the provided list of keywords, the following are present in the text:
*   zirgs
*   velosipēds (not directly, but implied when referencing Pakalns and his method for obtaining water)
*   automašīna (not directly, but implied when ref

In [9]:
# now let's try the same prompt but with a different model
# we will use the following model meta-llama/llama-3.3-70b-instruct
model = "meta-llama/llama-3.3-70b-instruct"
response = openai_chat(prompt, model=model)
print(response)

Here's a list of transportation terms in the Latvian language:

1. Auto (Car) - Auto
2. Autoceļš (Highway) - Autoceļš
3. Velosipēds (Bicycle) - Velosipēds
4. Riteņbraucējs (Cyclist) - Riteņbraucējs
5. Būve (Construction) - Būve
6. Dzelzceļš (Railway) - Dzelzceļš
7. Vilciens (Train) - Vilciens
8. Lidmašīna (Airplane) - Lidmašīna
9. Osta (Port) - Osta
10. Kuģis (Ship) - Kuģis
11. Autobuss (Bus) - Autobuss
12. Trolejbuss (Trolleybus) - Trolejbuss
13. Tramvajs (Tram) - Tramvajs
14. Taksi (Taxi) - Taksi
15. Stacija (Station) - Stacija
16. Pietura (Stop) - Pietura
17. Brauciens (Trip) - Brauciens
18. Reiss (Journey) - Ceļojums (more common term for journey)
19. Maršruts (Route) - Maršruts
20. Satiksme (Transportation) - Satiksme

Additionally, here are some common phrases related to transportation:

- "Kā piesāpināt?" (How to get there?) - Kā piesāpināt?
- "Kā braukt?" (How to travel?) - Kā braukt?
- "Cik ilgi?" (How long does it take?) - Cik ilgi?
- "Cik daudz maksā?" (How much does it cost

In [10]:
# let's try the same prompt but with a different model
# we will use the following model meta-llama/llama-3.2-90b-vision-instruct:free 
model = "meta-llama/llama-3.2-90b-vision-instruct:free"
response = openai_chat(prompt, model=model)
print(response)

Here's a list of common transportation terms in Latvian:

1. Transport - transports (transportation)
2. Autobuss - autobus (bus)
3. Viena - vilciens (train)
4. Lidmašīna - lidmašīna (airplane)
5. Kuģis - kuģis (ship)
6. Laiva - laiva (boat)
7. Riteņbraucējs - riteņbraucējs (bicyclist)
8. Velosipēds - velosipēds (bicycle)
9. Motocikls - motocikls (motorcycle)
10. Automobiļi - mašīna (car)
11. Taksi - taksometrs (taxi)
12. Metro - metro (subway)
13. Tramvajs - tramvajs (tram)
14. Elektrovilciens - elektrības vilciens (electric train)
15. Starptautiskā lidosta - starptautiskā lidosta (international airport)
16. Tuksneša mašīna - kutermašīna (jeep)
17. Gājējs - gājējs (pedestrian)
18. Ceļš - ceļš (road)
19. Automaģistrāle - automaģistrāle (highway)
20. Tilts - tilts (bridge).

Note: Latvian language uses its specific characters, but they were not used here for easier search. Example of those would be:
ā (in transportation - transports), 
ē (in ceļš - road),
ī (in lidmašīna - airplane),
ū (

In [11]:
# now let's try with latest GPT-4o model
model = "openai/gpt-4o-2024-11-20"
response = openai_chat(prompt, model=model)
print(response)

Here is a list of transportation-related terms in Latvian:

1. **Automašīna** – Car  
2. **Autobuss** – Bus  
3. **Vilciens** – Train  
4. **Lidmašīna** – Airplane  
5. **Kuģis** – Ship  
6. **Velosipēds** – Bicycle  
7. **Motocikls** – Motorcycle  
8. **Skrejritenis** – Scooter  
9. **Taksometrs (Taksi)** – Taxi  
10. **Kravas automašīna** – Truck  
11. **Laiva** – Boat  
12. **Tramvajs** – Tram  
13. **Trolejbuss** – Trolleybus  
14. **Helikopters** – Helicopter  
15. **Metro** – Subway  
16. **Automaģistrāle** – Highway  
17. **Satiksmes līdzeklis** – Means of transport  
18. **Ceļš** – Road  
19. **Maršruts** – Route  
20. **Stacija** – Station  
21. **Pietura** – Stop  
22. **Autoosta** – Bus station  
23. **Osta** – Port  
24. **Lidosta** – Airport  
25. **Ceļojums** – Journey  
26. **Biļete** – Ticket  
27. **Ātrums** – Speed  
28. **Degviela** – Fuel  
29. **Pasažieris** – Passenger  
30. **Vadītājs** – Driver  
31. **Stūre** – Steering wheel  
32. **Bagāža** – Luggage  
33. **

In [12]:
prompt

'Provide a list of transportation terms in Latvian language'

In [21]:
model

'google/gemini-flash-1.5'

In [25]:
# let's get list of all txt files in data docs folder
from pathlib import Path
text_files = list(Path("../data/docs").rglob("*.txt"))
print(f"We have {len(text_files)} text files in the data/docs folder")
# first one
print(f"Name of first file: {text_files[0]}")
# 2nd one
print(f"Name of second file: {text_files[1]}")
# last one 
print(f"Name of last file: {text_files[-1]}")

We have 4 text files in the data/docs folder
Name of first file: ..\data\docs\JaunJ_Ziema_417458.txt
Name of second file: ..\data\docs\PaulM_ProfS_1051757.txt
Name of last file: ..\data\docs\UpitA_Siev_part_1.txt


In [18]:
# let's read text of last file using utf-8 encoding
with open(text_files[-1], "r", encoding="utf-8") as file:
  text = file.read()
# how many characters in the text
print(f"Number of characters in the text: {len(text)}")

Number of characters in the text: 163835
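
At this size the whole novel goes into a single request. If a model with a smaller context window is tested, the text could be split at paragraph boundaries and sent piece by piece. The sketch below is one way to do that. `chunk_text` and the `max_chars` default are hypothetical, and the limit is counted in characters, not tokens.

```python
def chunk_text(text: str, max_chars: int = 20000) -> list[str]:
    """Group paragraphs into chunks of at most max_chars characters where possible."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate  # a single oversized paragraph is kept whole
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


demo = "\n\n".join(["a" * 10, "b" * 10, "c" * 10])
for piece in chunk_text(demo, max_chars=25):
    print(len(piece))
```

Each chunk could then be passed to `openai_chat` in turn and the returned term lists merged.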


In [23]:
print(system_prompt)

Please extract a comprehensive list of all mentioned land transportation vehicles from the given texts.
Include specific types (e.g., cars, trains, bicycles) as well as broader categories (e.g., public transportation).
Provide the terms in Latvian, preserving their original transcription variants as they appear in the text. Text:


In [24]:
# let's use text as a prompt
response = openai_chat(text)
print(response)

The provided text mentions the following land transportation vehicles in Latvian:

* **kamanas** (sleds)
* **tramvaja vagons** (tram)
* **ormanis** (sleigh, specifically referring to a horse-drawn sleigh in this context)
* **kamanu slieces** (sled runners)
* **dzelzceļa vilciens** (train)


There is no mention of cars or bicycles.  The broader category of "public transportation" is not explicitly mentioned although "tramvajs" falls under this category.



In [29]:
elixir_text = Path(r"..\data\docs\PaulM_ProfS_1051757.txt").read_text(encoding="utf-8")
# how many characters in the text
print(f"Number of characters in the text: {len(elixir_text)}")
# first 400 characters
print(elixir_text[:400])

Number of characters in the text: 188657
title: Profesora Sūnas brīnišķīgais eleksīrs
isPartOf:
creator: Paulockis, Miķelis
dateIssued: 1938
publisher: Senatne
publicationPlace: Rīga
firstPublished:
firstEdition: 1938
dateModified:
copyright:
uri: http://dom.lndb.lv/data/obj/1051757
normalization:
ocrCorrection:
verified:



Profesora Sūnas brīnišķīgais eleksīrs

M. Paulockis

Romāns

IZDEVNIECĪBA „SENĀTNE*, RĪGĀ

Sp. „STAR* Rīgā,
Jumara


In [30]:
response = openai_chat(elixir_text)
print(response)

The provided text mentions the following land transportation vehicles in Latvian:

* auto-limuzīna
* tramvaji
* taksometrs
* autobuss
* vilcienos (plural - trains)
* lokomotīves (plural - locomotives)
* auto (plural - cars)
* elektrovāģis


Note that "automobīļus" (plural - automobiles) appears in the text within a sentence describing how people adapted to a changing world, not as a specific event in the narrative.  Therefore it's not included above, as a direct mention of a vehicle type in function within the story is required to be included in the list.



In [32]:
print(system_prompt)

Please extract a comprehensive list of all mentioned land transportation vehicles from the given texts.
Include specific types (e.g., cars, trains, bicycles) as well as broader categories (e.g., public transportation).
Provide the terms in Latvian, preserving their original transcription variants as they appear in the text. Text:


In [35]:
# # let's try different model - google/gemini-2.0-flash-exp:free
# model = "google/gemini-2.0-flash-exp:free"
# model = "google/gemini-exp-1206:free"

# response = openai_chat(elixir_text, model=model)
# print(response)

In [37]:
model = "openai/gpt-4o-mini"
response = openai_chat(elixir_text, model=model)
print(response)

Based on the provided text, here is a comprehensive list of mentioned land transportation vehicles, both specific types and broader categories in Latvian:

1. **Automobiļi** (Cars)
2. **Tramvaji** (Trams)
3. **Taksometri** (Taxis)
4. **Dzelzceļš** (Railway/Train)
5. **Vilciens** (Train)
6. **Centrālapkures katls** (Central Heating Boiler) - indirectly related as part of infrastructure
7. **Gaisa dzelzceļš** (Aerial Cableway) - mentioned in the context of transportation infrastructure

Broader categories and terms:
8. **Sabiedriskā transporta** (Public Transportation)
9. **Līdzekļi** (Vehicles/Means of Transportation)

This list includes various forms of land transportation cited directly in the text and captures the essence of the broader transportation context as discussed.


In [38]:
system_prompt = """You are an expert on Latvian language and transportation.
Please extract a comprehensive list of all mentioned land transportation vehicles from the given Latvian document.
Include all specific types (e.g., vilciens, tramvajs, divritenis, ormanis) as well as broader categories (e.g., dzelzceļš, sabiedriskais transports).
Provide the terms solely in Latvian, preserving their original spelling as they appear in the text. Text: 
"""
print(f"Using model {model}")
response = openai_chat(elixir_text, system_prompt=system_prompt, model=model)
print(response)

Using model openai/gpt-4o-mini
Here is a comprehensive list of the land transportation vehicles mentioned in the provided text:

1. **vilciens** (train)
2. **tramvajs** (tram)
3. **taksometrs** (taxi)
4. **divritenis** (bicycle)
5. **auto** (car)
6. **gaisa dzelzceļš** (aerial tramway)
7. **dzelzceļš** (railway)
   
Broader categories:
1. **sabiedriskais transports** (public transportation) 

Please note that the terms are preserved in their original spelling as they appear in the text.


In [39]:
# let's try with Gemini model
model = "google/gemini-2.0-flash-exp:free"
print(f"Using model {model}")
response = openai_chat(elixir_text, system_prompt=system_prompt, model=model)
print(response)

Using model google/gemini-2.0-flash-exp:free
Okay, here's the comprehensive list of land transportation vehicles mentioned in the provided Latvian text, presented solely in Latvian:

*   sabiedriskais transports
*   auto-limuzīnā
*   taksometram
*   taksometra
*   tramvaja
*   tramvaji
*   dzelzceļpārbrauktuvi
*   dzelzceļš
*   vilciens
*   sliedēm
*   auto

