In [1]:
from langchain_community.chat_models import ChatOllama
from langchain_community.document_loaders import DirectoryLoader, UnstructuredMarkdownLoader
from langchain_community.embeddings.ollama import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [2]:
# Load every file ending in .md under data/polaris (recursively)
loader = DirectoryLoader('data/polaris', glob="**/*.md", show_progress=True, loader_cls=UnstructuredMarkdownLoader)
# Break documents into smaller chunks with RecursiveCharacterTextSplitter (chunk_size=1024, chunk_overlap=100)
docs = loader.load_and_split(RecursiveCharacterTextSplitter(chunk_size = 1024, chunk_overlap = 100))
# Embed the chunks and store them in a local Chroma vector database
db  = Chroma.from_documents(docs, OllamaEmbeddings(model="mxbai-embed-large",show_progress=True))

100%|██████████| 49/49 [00:08<00:00,  5.88it/s]
OllamaEmbeddings: 100%|██████████| 240/240 [09:25<00:00,  2.36s/it]
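
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal pure-Python sketch of fixed-width chunking with overlap. It is a simplification for illustration only: `split_with_overlap` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself prefers to break on separators such as paragraphs and lines rather than at fixed offsets.

```python
def split_with_overlap(text: str, chunk_size: int = 1024, chunk_overlap: int = 100) -> list[str]:
    """Hypothetical helper: cut text into fixed-width chunks that share an overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy input: 2500 characters cycling through the lowercase alphabet.
sample = "".join(chr(97 + i % 26) for i in range(2500))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])
```

Each chunk repeats the last 100 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.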


In [3]:
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 4}) # Turn VectorStore into a Retriever
local_model = "llama3" # alternatives: Phi 3, Mistral, Gemma
llm = ChatOllama(model=local_model, temperature=0) # temperature 0 gives near-deterministic output; higher values make sampling more random

# Prompt template using Llama 3 chat markers (system instructions, then the user turn)
prompt = PromptTemplate(
    template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an assistant for question-answering tasks. 
    Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. 
    Keep the answer concise <|eot_id|><|start_header_id|>user<|end_header_id|>
    Question: {question} 
    Context: {context} 
    Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
    input_variables=["question", "context"],
)

def format_docs(docs):
    # Join the text of the retrieved chunks into one context string
    return "\n\n".join(doc.page_content for doc in docs)

# Basic chain: retrieve context for the question, fill the prompt, call the model, return a string
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Same pipeline, but expects a dict that already holds the retrieved documents under "context"
rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)

# Variant that keeps the retrieved documents next to the answer so sources can be inspected
rag_chain_with_source = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
).assign(answer=rag_chain_from_docs)
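
The retriever above is configured with `search_type="similarity"` and `k=4`. Conceptually, that ranks the stored chunk vectors by closeness to the query embedding and keeps the best four. The following is a toy sketch of that idea, assuming cosine similarity and hand-made 2-D vectors; the names `cosine`, `top_k`, and `toy_store` are hypothetical, and the metric and index Chroma actually uses depend on how the collection is configured.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=4):
    """Return the texts of the k stored items most similar to query_vec."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Hypothetical (text, embedding) pairs standing in for embedded chunks.
toy_store = [
    ("job submission", [1.0, 0.0]),
    ("login", [0.0, 1.0]),
    ("queues", [0.9, 0.1]),
    ("gromacs", [-1.0, 0.0]),
    ("modules", [0.5, 0.5]),
]

hits = top_k([1.0, 0.0], toy_store, k=4)
print(hits)
```

Note that the top `k` items are always returned, even when none of them is actually relevant to the question.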


In [4]:
question = "How can I run a job on Polaris? Write an example job submission script?"
print(question)
print(rag_chain.invoke(question))

How can I run a job on Polaris? Write an example job submission script?


OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  4.69it/s]


To run a job on Polaris, you can use a submission script. Here's an example:

```bash
#!/bin/bash
#PBS -l walltime=24:00:00,nodes=10:ppn=16
#PBS -A your_account_name
#PBS -q small
cd /path/to/your/script
./your_script.sh
```

This script specifies the job requirements (walltime, nodes, and queue), sets the working directory to `/path/to/your/script`, and runs the `your_script.sh` script.

Note that you should replace `your_account_name` with your actual account name on Polaris.


In [6]:
question = "How to login to polaris"
print(question)
print(rag_chain.invoke(question))

How to login to polaris


OllamaEmbeddings: 100%|██████████| 1/1 [00:01<00:00,  1.34s/it]


To login to Polaris, you can follow these steps:

1. Go to the Polaris website and click on the "Login" button.
2. Enter your username and password in the respective fields.
3. Click on the "Log In" button to access your account.

Note: The context provided seems to be related to reviewing Nsight Compute data via GUI, but it does not provide any information about logging into Polaris. If you have any specific questions or concerns about using Polaris, I'll do my best to help.
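
The answer above admits that the retrieved context was unrelated to logging in, yet the model still produced generic steps. One possible guard is to drop low-scoring hits before building the context and to decline when nothing remains. The sketch below is pure Python with made-up scores; `build_context` and the `0.5` threshold are hypothetical. In LangChain, a comparable effect may be available by creating the retriever with `search_type="similarity_score_threshold"` and a `score_threshold` entry in `search_kwargs`, though a suitable threshold would need tuning for this embedding model.

```python
def build_context(scored_hits, min_score=0.5):
    """Keep only hits at or above min_score; return None if none qualify."""
    kept = [text for text, score in scored_hits if score >= min_score]
    return "\n\n".join(kept) if kept else None


# Hypothetical (chunk text, similarity score) pairs for an off-topic retrieval.
scored_hits = [
    ("Reviewing Nsight Compute data via GUI", 0.31),
    ("Profiling a kernel", 0.28),
]

context = build_context(scored_hits)
if context is None:
    answer = "I don't know."  # skip the LLM call when nothing relevant was retrieved
else:
    answer = f"(would call the LLM with {len(context)} characters of context)"
print(answer)
```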


In [10]:
question = "What is Gromacs"
print(question)
print(rag_chain.invoke(question))

What is Gromacs


OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.47s/it]


GROMACS is not explicitly mentioned in the provided context, but it appears to be a software package or library used for computational tasks, possibly related to molecular dynamics simulations. The code snippet shows usage of GROMACS-related functions and variables, such as `cblas_dgemm` and `C_cblas`, which suggests that GROMACS is being utilized for matrix operations and linear algebra calculations.


In [11]:
question = "How to use GROMACS on Polaris"
print(question)
print(rag_chain.invoke(question))

How to use GROMACS on Polaris


OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.29s/it]


Based on the provided context, it seems that GROMACS is not specifically mentioned as a tool for use on Polaris. However, I can provide some general information on how to run jobs on Polaris.

To submit and run jobs on Polaris, you can follow the instructions provided in the "Submitting and Running Jobs" section of the context. This includes reading through the Running Jobs with PBS at the ALCF page for information on using the PBS scheduler and preparing job submission scripts.

Additionally, if you are trying to use GROMACS specifically, you may want to consider using a job submission script similar to the example provided in the "Running VASP in Polaris" section. This would involve loading necessary modules, setting environment variables, and specifying the desired resources (e.g., nodes, walltime) for your job.

If you have any further questions or need more specific guidance on running GROMACS on Polaris, I recommend reaching out to the ALCF support team or reviewing the documenta