# Git LLM Training Prototype

A retrieval-based QA prototype for personalized Git training using LlamaIndex and Library Carpentry content.


## Introduction

This notebook demonstrates a prototype for a personalized, LLM-assisted Git training system. 
It uses the [LlamaIndex](https://www.llamaindex.ai/) framework and a small dataset of questions and answers based on the [Library Carpentry Git Lesson](https://librarycarpentry.github.io/lc-git/instructor/aio.html).
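
The loading cell reads `git_llm_training_data.json` and accesses a `prompt` and a `response` key on each entry. The file itself is not shown in this notebook, so the following is only a minimal sketch of that assumed layout, built from two question/answer pairs that appear in the retrieval output. The helper name `to_document_text` is hypothetical and exists only for illustration.

```python
import json

# Assumed layout of git_llm_training_data.json: a JSON list of objects,
# each with a "prompt" and a "response" key (the keys the loading cell reads).
sample_json = """
[
  {
    "prompt": "Why do I need to use 'git add' before 'git commit'?",
    "response": "'git add' tells Git which changes you want to include in the next commit. This allows you to control what goes into the project history."
  },
  {
    "prompt": "What is Git and why is it useful in research?",
    "response": "Git is a version control system that helps track changes in files. In research, it ensures reproducibility, documents development history, and facilitates collaboration."
  }
]
"""


def to_document_text(item):
    """Format one Q/A pair the same way the notebook builds its Document text."""
    return f"Q: {item['prompt']}\nA: {item['response']}"


qa_pairs = json.loads(sample_json)
texts = [to_document_text(item) for item in qa_pairs]
print(texts[0])
```

Keeping the question and the answer in one text means both are embedded together, so a user query can match on wording from either half.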


In [1]:
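# Install if needed (llama-index-embeddings-huggingface provides the HuggingFaceEmbedding import used below)
# !pip install llama-index llama-index-embeddings-huggingface sentence-transformers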

In [None]:
from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
import json

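# Load training data: a JSON list of objects with "prompt" and "response" keys
with open("git_llm_training_data.json", "r", encoding="utf-8") as f:
    qa_pairs = json.load(f)

# Convert each Q/A pair into one Document so question and answer are embedded together
documents = [
    Document(text=f"Q: {item['prompt']}\nA: {item['response']}")
    for item in qa_pairs
]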

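# Set up the embedding model globally via Settings
embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
Settings.embed_model = embed_model
# Disable the LLM (retrieval only). LlamaIndex falls back to MockLLM, which
# returns the filled-in prompt, so each "answer" printed below is the retrieved context.
Settings.llm = None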

# Create index and query engine
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Test query
response = query_engine.query("Why do I need to use git add before committing?")
print("Answer:", response.response.strip())

# Interactive query loop (stop with 'exit')
while True:
    user_input = input("Ask a Git question (or type 'exit'): ").strip()
    if user_input.lower() == "exit":
        break
    if not user_input:
        continue  # Skip empty input instead of querying the index with it
    response = query_engine.query(user_input)
    print("Answer:", response.response.strip())

LLM is explicitly disabled. Using MockLLM.
Answer: Context information is below.
---------------------
Q: Why do I need to use 'git add' before 'git commit'?
A: 'git add' tells Git which changes you want to include in the next commit. This allows you to control what goes into the project history.

Q: What is Git and why is it useful in research?
A: Git is a version control system that helps track changes in files. In research, it ensures reproducibility, documents development history, and facilitates collaboration.
---------------------
Given the context information and not prior knowledge, answer the query.
Query: Why do I need to use git add before committing?
Answer:
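
Two Q/A pairs appear in the context because the query engine retrieves the highest-scoring documents for the query and pastes them into the prompt (judging by this output, two documents per query). As a rough illustration of that ranking step, the sketch below scores documents with cosine similarity over plain word counts. This is a stand-in technique, not what the notebook runs: the real index compares dense `all-MiniLM-L6-v2` embeddings. The names `bow`, `cosine`, and `top_k`, and the third document in `docs`, are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words vector: lowercase word -> count."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text matches nothing
    return dot / (norm_a * norm_b)


def top_k(query, docs, k=2):
    """Return the k (score, doc) pairs most similar to the query, best first."""
    q = bow(query)
    scored = [(cosine(q, bow(doc)), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


docs = [
    "Q: Why do I need to use 'git add' before 'git commit'?",
    "Q: What is Git and why is it useful in research?",
    "Q: What is a remote repository?",  # hypothetical entry, not from the dataset
]
ranked = top_k("Why do I need to use git add before committing?", docs, k=2)
for score, doc in ranked:
    print(f"{score:.2f}  {doc}")
```

Word counts only reward exact token overlap, whereas sentence embeddings can also match paraphrases, which is the reason the notebook uses an embedding model.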


Ask a Git question (or type 'exit'):  What does git commit do


Answer: Context information is below.
---------------------
Q: Why do I need to use 'git add' before 'git commit'?
A: 'git add' tells Git which changes you want to include in the next commit. This allows you to control what goes into the project history.

Q: What is Git and why is it useful in research?
A: Git is a version control system that helps track changes in files. In research, it ensures reproducibility, documents development history, and facilitates collaboration.
---------------------
Given the context information and not prior knowledge, answer the query.
Query: What does git commit do
Answer:


Ask a Git question (or type 'exit'):  What does git commit do?


Answer: Context information is below.
---------------------
Q: What is Git and why is it useful in research?
A: Git is a version control system that helps track changes in files. In research, it ensures reproducibility, documents development history, and facilitates collaboration.

Q: Why do I need to use 'git add' before 'git commit'?
A: 'git add' tells Git which changes you want to include in the next commit. This allows you to control what goes into the project history.
---------------------
Given the context information and not prior knowledge, answer the query.
Query: What does git commit do?
Answer:
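
In every run above, the printed "Answer" is the prompt template itself, because with `Settings.llm = None` LlamaIndex substitutes a MockLLM that echoes its input. For a retrieval-only prototype, one option is to skip the query engine and return the answer half of the best-matching document directly. The sketch below assumes a retriever such as the one returned by `index.as_retriever(similarity_top_k=1)`, whose `retrieve()` results expose `.score` and `.node.get_content()`. `top_answer` is a hypothetical helper, not part of the notebook or of LlamaIndex.

```python
def top_answer(retriever, question):
    """Return the 'A:' part of the best-matching Q/A document, or None if nothing matched."""
    results = retriever.retrieve(question)
    if not results:
        return None
    # Pick the highest-scoring hit; treat a missing score as the lowest possible.
    best = max(
        results,
        key=lambda r: r.score if r.score is not None else float("-inf"),
    )
    text = best.node.get_content()
    # Documents were built as "Q: ...\nA: ...", so split on the answer marker.
    _, _, answer = text.partition("\nA: ")
    # Fall back to the full text if the marker is absent.
    return answer.strip() or text.strip()


# Usage in the notebook (assumes `index` was built as in the cell above):
# retriever = index.as_retriever(similarity_top_k=1)
# print("Answer:", top_answer(retriever, "Why do I need to use git add before committing?"))
```

Returning the stored answer avoids printing the template, but it still returns the nearest match even when the dataset has no relevant entry, as with the `git commit` queries above, so a similarity threshold may be worth adding.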
