- create a virtual environment in `.venv` (e.g. `python -m venv .venv`) and activate it
- run `pip install -r requirements.txt`
- get an API key from https://kisski.gwdg.de/leistungen/2-02-llm-service/
- create a `.env` file with:
  > OPENAI_API_KEY = "YOUR-API-KEY"  
  > KISSKI_URL = "https://chat-ai.academiccloud.de/v1"
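
Before running the cells, it can help to confirm that both values are actually visible to Python. The following is a minimal sketch, not part of the notebook: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, and the check assumes the variables have already been loaded into the process environment (for example by `load_dotenv()`).

```python
import os

# Variable names taken from the `.env` template above.
REQUIRED_VARS = ("OPENAI_API_KEY", "KISSKI_URL")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# An empty list means both settings are available.
print(missing_env_vars())
```

Passing a plain dict as `env` makes the helper easy to try out without touching the real environment.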

In [None]:
from enum import Enum

index_dir = "./faiss_index"
tokens_per_chunk = 1024
embed_model = "sentence-transformers/all-MiniLM-L6-v2"

topic = "climate change and its effects on islands"

class Prompt(Enum):
    BASIC = f'You are an expert on the topic of {topic}. You are explaining it to someone with basic knowledge of the topic.'
    ADVANCED = f'You are an expert on the topic of {topic}. You are explaining it to someone with advanced knowledge of the topic.'

class Model(Enum):
    LLAMA = 'meta-llama-3.1-8b-instruct'
    MISTRAL = 'Mistral-Large-Instruct-2407'
    GEMMA = 'gemma-3-27b-it'

llm_model = Model.LLAMA.value
answer_level = Prompt.ADVANCED.value

In [53]:
import faiss
import os
from llama_index.vector_stores.faiss import FaissVectorStore
from llama_index.core import VectorStoreIndex, StorageContext, load_index_from_storage, SimpleDirectoryReader

# Load HTML file(s)
documents = SimpleDirectoryReader(input_dir="html").load_data()
print(f"Loaded {len(documents)} document(s).")

# Chunk with SentenceSplitter
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=tokens_per_chunk, chunk_overlap=200) 
# Rule of thumb: 1 token is roughly 4 characters of English text
nodes = splitter.get_nodes_from_documents(documents)

# # show chunks
# for i, node in enumerate(nodes):
#     print(f"Chunk {i}:\n{node.get_content()}\n{'='*40}")
    
print(f"Generated {len(nodes)} chunks.") 


# Embed Chunks with HuggingFace
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

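# Note: this rebinds `embed_model` from the model name (str) to the embedding object,
# so re-run the configuration cell before re-running this cell.
embed_model = HuggingFaceEmbedding(model_name=embed_model)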

if os.path.exists(index_dir) and os.listdir(index_dir):
  vector_store = FaissVectorStore.from_persist_dir(index_dir)
  storage_context = StorageContext.from_defaults(
      vector_store=vector_store, persist_dir=index_dir
  )
  index = load_index_from_storage(storage_context=storage_context, embed_model=embed_model)
  print("Using stored index.")
  
else:
  # Create Index
  faiss_index = faiss.IndexFlatL2(384)  # 384 = embedding dimension of all-MiniLM-L6-v2
  vector_store = FaissVectorStore(faiss_index=faiss_index)
  storage_context = StorageContext.from_defaults(vector_store=vector_store)

  index = VectorStoreIndex(
      nodes,
      embed_model=embed_model,
      storage_context=storage_context,
  )

  # Save index
  index.storage_context.persist(persist_dir=index_dir)
  print(f"Index stored in {index_dir}")


Loaded 1 document(s).
Generated 326 chunks.
Using stored index.
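
The `chunk_size` and `chunk_overlap` settings used above can be illustrated with a plain sliding window. This is only a sketch under simplifying assumptions: it works on a ready-made token list and ignores sentence boundaries, whereas `SentenceSplitter` also tries to keep sentences intact, so its real chunks will differ. `chunk_with_overlap` is a hypothetical helper, not a LlamaIndex function.

```python
def chunk_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split `tokens` into windows of `chunk_size` that share `chunk_overlap` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


words = "a b c d e f g h i j".split()
chunks = chunk_with_overlap(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last item of the previous one, which is the role the 200-token overlap plays for the 1024-token chunks in this notebook.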


In [None]:
# LLM

from openai import OpenAI
import os
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")
base_url = os.getenv("KISSKI_URL")

if not api_key or not base_url:
    raise ValueError("Missing OPENAI_API_KEY or KISSKI_URL; check your .env file.")

client = OpenAI(
    api_key=api_key,
    base_url=base_url
)

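def ask_openai_llm(prompt: str) -> str:
    """Send `prompt` as the user message, with `answer_level` as the system prompt, and return the reply text."""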
    response = client.chat.completions.create(
        model=llm_model,
        messages=[
            {"role": "system", "content": answer_level},
            {"role": "user", "content": prompt}
        ]
    )
    return response.choices[0].message.content
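
The request body is a list of role-tagged messages: the system prompt sets the answer level, and the user message carries the retrieved context plus the question. A small sketch of that assembly, kept free of any API call so it can be inspected offline. `build_messages` and its `separator` parameter are hypothetical names, and the layout only approximates the prompt used in this notebook.

```python
def build_messages(system_prompt, context_chunks, question, separator="\n---\n"):
    """Combine a system prompt, retrieved chunks and a question into chat messages."""
    context = separator.join(context_chunks)
    user_content = f"Context:\n{context}\n\nQuestion:\n{question}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    "You are an expert on the topic.",
    ["First retrieved chunk.", "Second retrieved chunk."],
    "Is the sea getting more acidic?",
)
for message in messages:
    print(message["role"], "->", message["content"])
```

Keeping the assembly in a pure function makes it easy to print and review exactly what the model will receive.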


In [55]:
# Question
import textwrap
from IPython.display import Markdown, display

while True:
    query = input("Enter your question (or type 'q'): ").strip()
    if query.lower() == 'q':
        print("Session ended.")
        break

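    # Retrieve the chunks most similar to the query (kept separate from the `nodes` built during indexing)
    retrieved_nodes = index.as_retriever().retrieve(query)
    context = "\n---\n".join(n.get_content() for n in retrieved_nodes)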
    full_prompt = f"""
Context:
{context}

Question:
{query}"""

    answer = ask_openai_llm(full_prompt)
    print("\nQ:")
    display(Markdown(textwrap.dedent(query)))
    print("\nA:")
    display(Markdown(textwrap.dedent(answer)))
    print("___\n")


Q:


Is the sea getting more carbonated?


A:


The question of whether the sea is getting more carbonated is a complex one, but I'll try to provide a concise answer.

The key concept here is ocean acidification, which is a consequence of the increasing amount of carbon dioxide (CO2) in the atmosphere due to human activities, such as burning fossil fuels and deforestation. When CO2 dissolves in seawater, it forms carbonic acid, which increases the acidity of the ocean. This is known as ocean acidification.

The main driver of ocean acidification is the uptake of CO2 by the ocean, which has increased by about 30% since the Industrial Revolution. This increase in CO2 has led to a decrease in the pH of the ocean by about 0.1 units, which may not seem significant, but it has a profound impact on marine life, particularly organisms with calcium carbonate shells, such as coral, shellfish, and some plankton.

The increased acidity of the ocean can lead to a range of consequences, including:

1. Reduced growth rates and increased mortality of marine organisms with calcium carbonate shells.
2. Changes in the composition and structure of marine ecosystems.
3. Impacts on the global carbon cycle, as the ocean acts as a sink for CO2.

In the context of small island developing states, ocean acidification can exacerbate the already significant challenges they face, such as erosion, flooding, and loss of livelihoods.

So, to answer your question, the sea is indeed getting more acidic, but not in the classical sense of "carbonation." The term "carbonation" usually refers to the process of dissolving CO2 in water to create a fizzy or carbonated beverage. In the context of ocean acidification, the term "acidification" is more accurate, as it describes the increase in acidity due to the dissolution of CO2 in seawater.

___

Session ended.
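
The retrieval step in the question loop relies on the FAISS `IndexFlatL2` index, which performs an exact, brute-force nearest-neighbour search using squared L2 distance. The idea can be sketched in pure Python on toy 3-dimensional vectors. This is an illustration only: `squared_l2` and `top_k` are hypothetical helpers, and the real index works on 384-dimensional embeddings.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec, vectors, k=2):
    """Return (index, distance) pairs for the k vectors closest to `query_vec`."""
    scored = sorted((squared_l2(query_vec, v), i) for i, v in enumerate(vectors))
    return [(i, d) for d, i in scored[:k]]


# Toy "embeddings": vector 1 equals the query, vector 2 is close, vector 0 is far away.
vectors = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
hits = top_k([1.0, 0.0, 0.0], vectors, k=2)
print(hits)
```

Smaller distances mean more similar chunks, so the closest vectors are the ones whose text ends up in the prompt context.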
