## Ask User Query
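
The cell below keeps only FAISS hits whose distance is at most 0.7. As a rough guide to what that number means, here is a minimal sketch. It assumes the index is a flat L2 index, which reports squared Euclidean distances, and that the sentence embeddings are unit-normalized; this notebook confirms neither. Under those assumptions the squared distance d and the cosine similarity are related by cos = 1 - d/2. The helper name `squared_l2_to_cosine` is hypothetical and is not used elsewhere in the notebook.

```python
import numpy as np

def squared_l2_to_cosine(squared_distance):
    """Convert a squared L2 distance between unit vectors to cosine similarity.

    Assumes both vectors have norm 1, so ||a - b||^2 = 2 - 2 * cos(a, b).
    """
    return 1.0 - np.asarray(squared_distance, dtype=np.float64) / 2.0

# Sanity check on two random unit vectors (illustrative data, not lecture embeddings)
rng = np.random.default_rng(0)
a = rng.normal(size=384)
b = rng.normal(size=384)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

squared_distance = float(np.sum((a - b) ** 2))
cosine = float(a @ b)
assert np.isclose(squared_l2_to_cosine(squared_distance), cosine)

# The threshold used in the cell below, expressed as a cosine similarity
threshold_as_cosine = float(squared_l2_to_cosine(0.7))
```

If the assumptions hold, a threshold of 0.7 corresponds to a cosine similarity of 0.65. If the index uses a different metric (for example inner product), the conversion does not apply and the threshold would need to be reinterpreted.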

In [1]:
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
import csv

# Load the model and FAISS index
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
faiss_index = faiss.read_index("Data/sentence_embeddings.index")

# Load lecture sentences, skipping blank lines
with open('Data/sentences.txt', 'r', encoding='utf-8') as file:
    lecture_sentences = [line.strip() for line in file if line.strip()]

# Load per-sentence metadata rows: (filename, timestamp, sentence)
lecture_data = []
with open('Data/srt-embedding-metadata.tsv', 'r', encoding='utf-8') as file:
    tsv_reader = csv.reader(file, delimiter='\t')
    for row in tsv_reader:
        if len(row) == 3:
            filename, timestamp, sentence = row
            lecture_data.append((filename.strip(), timestamp.strip(), sentence.strip()))

# Get student's question
student_question = input("Enter your question: ")
question_embedding = np.array(model.encode([student_question])).astype('float32')

# Search against every sentence (k is set to the total number of sentences)
distances, indices = faiss_index.search(question_embedding, len(lecture_sentences))

# Define a distance threshold (lower means more similar)
distance_threshold = 0.7

related_sentences = []
related_results = []
for i, distance in zip(indices[0], distances[0]):
    sentence = lecture_sentences[i]

    # Keep sentences within the distance threshold, skipping exact matches
    # (distance 0) and sentences that are themselves questions
    if 0 < distance <= distance_threshold and not sentence.endswith('?'):
        related_sentences.append((sentence, distance))
        # The i + 1 offset presumably skips a header row in the TSV
        filename, timestamp, _ = lecture_data[i + 1]
        related_results.append((filename, timestamp, sentence, distance))


# Display related sentences with distances
print("\n Related Sentences:")
for sentence, distance in related_sentences:
    print(f"- {sentence} - {distance:.4f}")

print("\nRelated Sentences with Metadata:")
for filename, timestamp, sentence, distance in related_results:
    print(f"- [{filename}] [{timestamp}] {sentence} - Distance: {distance:.4f}")

 Related Sentences:
- As you can see, machine learning is a top skill in the jobs that involves AI skills. - 0.5102
- Machine learning consists of different types of learning, such as supervised learning, unsupervised learning, or reinforcement learning. - 0.5319
- Many machine learning models, they are coming from statistical learning. - 0.5381
- This video will talk about introduction to machine learning. - 0.5936
- So machine learning is part of data science and it is also a subfield of artificial intelligence. - 0.6000
- And here is the Google trend on the term on machine learning and software engineering. - 0.6220
- So machine learning extends the statistical learning by including more complex algorithms, which deal with more complex data and bigger data, and more efficient algorithms. - 0.6400
- Machine learning, we mentioned that machine learning several times during the talk about data science. - 0.6611
- Here are some few examples of machine learning tasks. - 0.6639
- It is o