# MindVault Model Builder (Definitive Version)

This notebook walks through the full process for building the question-answering model for the MindVault project, from installing libraries to testing the trained model.

### Instructions:
1.  **Upload Your Files:** In the file browser on the left, upload your three `.txt` files (`ITEC116-Lecture-1-_1_.txt`, `lesson2_Version2.txt`, `index_delayed_ui_Version2.txt`).
2.  **Run All Cells:** In the menu, click **Runtime -> Run all**.
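
Before running everything, it can help to confirm that the uploads landed where the notebook expects them. The sketch below is only an illustration: the helper name `missing_uploads` is hypothetical, and it assumes the three files sit in the current working directory.

```python
from pathlib import Path

# File names taken from the upload instruction above.
EXPECTED_FILES = [
    "ITEC116-Lecture-1-_1_.txt",
    "lesson2_Version2.txt",
    "index_delayed_ui_Version2.txt",
]

def missing_uploads(directory="."):
    """Return the expected file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]

missing = missing_uploads()
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All three lecture files found.")
```

An empty list means every file is present; any names returned are the ones still to upload.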

The notebook will perform all steps automatically:
- Install pinned versions of the required libraries (`transformers`, `datasets`, `accelerate`, `onnx`, `onnxruntime`).
- Load the lecture text and the question-answer training data.
- Pre-process the data for question answering.
- Train the `distilbert` model.
- Save the final model and convert it to ONNX format.
- Start an interactive testing session at the end.
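
The training data in Step 3 stores each example as a `question`, an `answer`, and the full lecture text as `context`. Question-answering models of this kind are usually trained on the character position of the answer inside the context, so a pre-processing step typically has to locate each answer first. The sketch below shows one way this could look. The function name `locate_answer` and the SQuAD-style output layout are assumptions for illustration, not necessarily what the later cells do.

```python
def locate_answer(example):
    """Build a SQuAD-style record with the character offset of the answer.

    Assumes the answer appears verbatim in the context; raises otherwise.
    """
    context = example["context"]
    answer = example["answer"]
    start = context.find(answer)  # first occurrence, or -1 if absent
    if start == -1:
        raise ValueError(f"Answer not found verbatim in context: {answer[:40]!r}")
    return {
        "question": example["question"],
        "context": context,
        "answers": {"text": [answer], "answer_start": [start]},
    }

# Tiny stand-in example; the real contexts are the full lecture strings.
sample = {
    "question": "What is SYSTEM INTEGRATION?",
    "answer": "inter-related elements",
    "context": "Is the combination of inter-related elements to achieve a common objective(s).",
}
record = locate_answer(sample)
start = record["answers"]["answer_start"][0]
print(record["context"][start:start + len(sample["answer"])])
```

Because the lookup is an exact string match, any difference in spacing, punctuation, or line breaks between an answer and its context makes the example unusable, which is why the answers in the dataset copy the lecture text character for character.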

In [None]:
print("--- Step 1: Installing correct libraries... ---")
!pip uninstall -y transformers accelerate sentence-transformers peft && pip install -q transformers==4.38.2 datasets==2.18.0 accelerate==0.28.0 onnx onnxruntime

In [None]:
print("\n--- Step 2: Importing libraries... ---")
import json
import os
import textwrap
from pathlib import Path
import numpy as np
from datasets import Dataset
import onnxruntime as ort
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, TrainingArguments, Trainer
from transformers.onnx import export, FeaturesManager
from onnxruntime.quantization import quantize_dynamic, QuantType
print("Libraries imported successfully.")

In [None]:
print("\n--- Step 3: Defining the training data... ---")

# Define the full text of each lecture first
lecture1_full_text = """
# ITEC 116: SYSTEM INTEGRATION AND ARCHITECTURE

## INTRODUCTION

-   Many systems are built to ease, improve, and transform organizations.
-   Some organizations have many departments which run systems that are independent of each other.
-   And systems built sometimes may not have an abstract view (architecture), which leads to the failure of system interoperability.
-   There is a need to have an architectural view of the system as a priority to help in the design to avoid the likeliness of system failure.
-   Besides, after the system has been designed and developed in consideration of the size of the organization (i.e., most especially when the organization is large), a need is required to integrate such systems to ensure flexibility, speed, cost, standardization, data integrity, reliability, and robustness.
-   This can help Information Technology (IT), energy, and financial services industries, among others, to have an easy-to-use integrated system.

## LEARNING OUTCOMES

On completion of this course, the students will be able to:

-   Identify integration issues upfront in the process of System Integration and should be able to identify the best practices that ensure successful System Integration.
-   Have an understanding of the technical and business process issues involved in systems integration.

## INDICATIVE CONTENT

-   The System of Systems Integration Problem
    -   Human, Organizational, Societal Cultural, Economic, and Technological aspects;
    -   Processes, approaches, drivers, tools, and techniques required for successful SI, critical success factors, and best practices in Systems Integration;
    -   The Role of Architectures in Systems Integration;
        -   Integration in a System of Systems and a Federation of Systems;
    -   Model-Based Architecture, Design, and Integration;
-   The theory and practice of business process integration, legacy integration, new systems integration, business-to-business integration, integration of commercial-off-the-shelf (COTS) products, interface control and management, testing, integrated program management, integrated Business Continuity Planning (BCP). Specific focus will be given to issues of interface integration and interoperability of systems.

## KEY TERMINOLOGIES IN THIS COURSE

Various key terminologies shall be used throughout this course as follows:
-   System
-   Systems thinking
-   System Integration
-   System Architecture
-   Project

### SYSTEM

-   An array of components designed to accomplish a particular objective according to plan.
-   Many sub-systems may be designed which later on are combined together to form a system which is intended to achieve a specific objective which may be set by the Project manager.

### SYSTEMS THINKING

Is a way of understanding an entity in terms of its purpose, as three steps. The three major steps followed in systems thinking:

1.  Identify a containing whole (system), of which the thing to be explained is a part.
2.  Explain the behavior or properties of the containing whole.
3.  Explain the behavior or properties of the thing to be explained in terms of its role(s) or function(s) within its containing whole (Ackoff, 1981).

### SYSTEM INTEGRATION

Is the combination of inter-related elements to achieve a common objective(s).

#### System Architecture

-   The architecture of a system defines its high-level structure, exposing its gross organization as a collection of interacting components.
-   Elements needed to model a software architecture include: Components, Connectors, Systems, Properties, and Styles.

## WHAT IS A PROJECT?

From the key terms described above, a system developer and architects cannot do anything without first establishing various projects. These projects may be new or existing.

So it is inevitable to first understand what a project is, factors that influence the project, who the owners are, and many more as discussed below.

### WHERE DO INFORMATION SYSTEMS PROJECTS ORIGINATE (SOURCES OF PROJECTS)?

New or changed IS development projects come from problems, opportunities, and directives and are always subject to one or more constraints.

1.  **Problems** – may either be current, suspected, or anticipated. Problems are undesirable situations that prevent the business from fully achieving its purpose, goals, and objectives (users discovering real problems with existing IS).
2.  **An Opportunity** – is a chance to improve the business even in the absence of specific problems. This means that the business is hoping to create a system that will help it with increasing its revenue, profit, or services, or decreasing its costs.
3.  **A Directive** – is a new requirement that is imposed by management, government, or some external influence i.e. are mandates that come from either an internal or external source of the business.

### PROJECTS CANNOT BE RUN IN ISOLATION

-   Projects must operate in a broad organizational environment.
-   Project managers need to take a holistic or systems view of a project and understand how it is situated within the larger organization.

### STAKEHOLDERS

Stakeholders are the people involved in or affected by project activities. Stakeholders include:
-   the project sponsor and project team
-   support staff
-   customers
-   users
-   suppliers
-   opponents to the project

### IMPORTANCE OF STAKEHOLDERS

-   Project managers must take time to identify, understand, and manage relationships with all project stakeholders.
-   Using the four frames of organizations can help meet stakeholder needs and expectations.
-   Senior executives are very important stakeholders.

### WHAT HELPS PROJECTS SUCCEED?

According to the Standish Group’s report “CHAOS 2001: A Recipe for Success,” the following items help IT projects succeed, in order of importance:
-   Executive support
-   User involvement
-   Experienced project manager
-   Clear business objectives
-   Minimized scope
-   Standard software infrastructure
-   Firm basic requirements
-   Formal methodology
-   Reliable estimates

### UNDERSTANDING ORGANIZATIONS

We can analyze a formal organization using the following 4 (four) frames:

-   **Structural Frame:** Focuses on roles and responsibilities, coordination and control. Organizational charts help define this frame.
-   **Political Frame:** Assumes organizations are coalitions composed of varied individuals and interest groups. Conflict and power are key issues.
-   **Human Resources Frame:** Focuses on providing harmony between needs of the organization and needs of people.
-   **Symbolic Frame:** Focuses on symbols and meanings related to events. Culture is important.

Most people understand what organizational charts are. Many new managers try to change organizational structure when other changes are needed.

4 basic organizational structures:
-   Functional
-   Project-based
-   Matrix
-   Divisional

### BASIC ORGANIZATIONAL STRUCTURES

Organizational structure depends on the company and/or the project. The structure helps define the roles and responsibilities of the members of the department, work group, or organization. It is generally a system of tasks and reporting policies in place to give members of the group a direction when completing projects. A good organizational structure will allow people and groups to work effectively together while developing hard work ethics and attitudes. The four general types of organizational structure are functional, divisional, matrix and project-based.

#### FUNCTIONAL STRUCTURE

People who do similar tasks, have similar skills and/or jobs in an organization are grouped into a functional structure. The advantages of this kind of structure include quick decision making because the group members are able to communicate easily with each other. People in functional structures can learn from each other easier because they already possess similar skill sets and interests.

#### DIVISIONAL STRUCTURE

In a divisional structure, the company will coordinate inter-group relationships to create a work team that can readily meet the needs of a certain customer or group of customers. The division of labor in this kind of structure will ensure greater output of varieties of similar products. An example of a divisional structure is geographical, where divisions are set up in regions to work with each other to produce similar products that meet the needs of the individual regions.

#### MATRIX STRUCTURE

Matrix structures are more complex in that they group people in two different ways:
-   by the function they perform and
-   by the product team they are working with.

In a matrix structure the team members are given more autonomy and expected to take more responsibility for their work. This increases the productivity of the team, fosters greater innovation and creativity, and allows managers to cooperatively solve decision-making problems through group interaction.

#### PROJECT ORGANIZATION STRUCTURE

In a project-organizational structure, the teams are put together based on the number of members needed to produce the product or complete the project. The numbers of significantly different kinds of tasks are taken into account when structuring a project in this manner, assuring that the right members are chosen to participate in the project.

---

## QUESTIONS?
"""

lecture2_full_text = """
# Lesson 2: OPERATING SYSTEM

## What is an operating system?

An operating system is the most important software that runs on a computer. It manages the computer's memory and processes, as well as all of its software and hardware. It also allows you to communicate with the computer without knowing how to speak the computer's language. Without an operating system, a computer is useless.

## The operating system's job

Your computer's operating system (OS) manages all of the software and hardware on the computer. Most of the time, there are several different computer programs running at the same time, and they all need to access your computer's central processing unit (CPU), memory, and storage. The operating system coordinates all of this to make sure each program gets what it needs.

> **According to Microsoft**
> Support for Windows 8.1 ended on January 10, 2023. We recommend you move to a Windows 11 PC to continue to receive security updates from Microsoft. Windows 8 has reached end of support, which means Windows 8 devices no longer receive important security updates.

## Types of operating systems

Operating systems usually come pre-loaded on any computer you buy. Most people use the operating system that comes with their computer, but it's possible to upgrade or even change operating systems. The three most common operating systems for personal computers are Microsoft Windows, macOS, and Linux.

### macOS

macOS (previously called OS X) is a line of operating systems created by Apple. It comes preloaded on all Macintosh computers, or Macs. Some of the specific versions include Mojave (released in 2018), High Sierra (2017), and Sierra (2016).

According to StatCounter Global Stats, macOS users account for less than 10% of global operating systems - much lower than the percentage of Windows users (more than 80%). One reason for this is that Apple computers tend to be more expensive. However, many people do prefer the look and feel of macOS over Windows.

**Pros:**
1. Simple and Powerful user interface
2. Fewer Virus attacks
3. World class Integration between hardware and software
4. Integration of Apple Products

**Cons:**
1. Expensive
2. Harder to upgrade
3. No Games


### Linux

Linux (pronounced LINN-ux) is a family of open-source operating systems, which means they can be modified and distributed by anyone around the world. This is different from proprietary software like Windows, which can only be modified by the company that owns it.

The advantages of Linux are that it is free, and there are many different distributions-or versions-you can choose from.

According to StatCounter Global Stats, Linux users account for less than 2% of global operating systems. However, most servers run Linux because it's relatively easy to customize.

**Pros:**
1. Low cost
2. Stability
3. Flexibility
4. Performance
5. Choice

**Cons:**
1. Understanding
2. Software
3. Ease
4. Hardware


### Microsoft Windows

Microsoft created the Windows operating system in the mid-1980s. There have been many different versions of Windows, but the most recent ones are Windows 10 (released in 2015), Windows 8 (2012), Windows 7 (2009), and Windows Vista (2007). Windows comes pre-loaded on most new PCs, which helps to make it the most popular operating system in the world.

**Pros:**
1. Cheap
2. Variety Options
3. Software availability
4. Customizable
5. Games
6. Upgradable

**Cons:**
1. Malware and Virus Attacks
2. Less Reliable
3. Becomes Laggy overtime

**Several versions of Windows 11 are available, including:**
- Windows 11 Home: intended for home use. It contains all the basic functions an average user needs.
- Windows 11 Pro: intended for professionals and businesses. It includes additional features for business use, such as advanced security features and comprehensive management tools.
- Windows 11 Enterprise: intended for large organisations and offers the most comprehensive set of features, including advanced security and management tools.
- Windows 11 Education: specially designed for education and includes features useful for schools and universities.
- Windows 11 IoT Enterprise: a version of Enterprise with the same functionality, but with different mechanisms for licensing and distribution.

### Windows 10 - User Management

Like most Windows versions since XP, Windows 10 allows you to log in to different user accounts when using your computer.
Step 1 - Open the Start Menu.
Step 2 - Click on Settings.
Step 3 - From the SETTINGS window, choose Accounts option.
Step 4 - In the ACCOUNTS window, choose the account setting you want to configure.

If you want to change your sign-in options, like your password, select Sign-in options. Under Sign-in options, Windows 10 lets you change your password. It also lets you choose when the computer will ask you to sign in.

### Windows 10 - Backup & Recovery

**File History**
File History will perform a back-up of the files located in your libraries (Documents, Pictures, Music, etc.) It allows you to choose a drive, where you can back-up your files and then asks you when to do it.
To configure the File History backup, follow these steps:
Step 1 - Go to SETTINGS and select Update & security.
Step 2 - In the UPDATE & SECURITY window, select Backup.
Step 3 - Click \"Add a drive\" to choose where to store your backup.

**Windows 7 - Backup & Recovery**
This tool, which was removed in Windows 8 and 8.1, was brought back allowing you to perform back-ups and restore data from old Windows 7 backups. However, it also lets to back-up your regular documents on Windows 10.
To open the Back-up & Restore, follow these steps:
Step 1 - Open the Control Panel by searching for it in the Search bar.
Step 2 - After the Control Panel is open, choose Backup and Restore (Windows 7).
Step 3 - In the Backup and Restore window, you can choose to \"Set up backup\".
Step 4 - In the Set up backup window, choose where you want to store your backup.
Step 5 - In the next window, you can choose what files you want to backup.
Step 6 - In the last window, you can review the settings of your backup and establish the schedule in which you want to perform it.
Step 7 - In the end, click Save settings and run backup. The backup will perform at the scheduled time.

**Creating a System Image**
In case your computer failing, Windows 10 offers you some alternatives to restore it to a specific state. One of these alternatives is creating what is called a system image. A system image is a copy of all your system and program files needed for your computer to run properly. You can use this option to store an image of your computer at a specific moment, and use it to restore your computer to that state later.
The option to create a system image is in the same Backup and Restore window we discussed before.
Step 1 - Open the Backup and Restore window from the Control Panel.
Step 2 - On the Backup and Restore window, choose the \"Create a system image\" option on the left.
Step 3 - On the Create a system image window, you can choose where to store the backup from among three places: your hard disk, on DVD's, or in the network.
Step 4 - In the next window, just confirm your image settings and click Start backup.

**Resetting the PC**
Another alternative Windows 10 offers you for system recovery is simply called \"Reset this PC\". This option will allow you to return your computer to its default factory settings. It will also give you the option of keeping your files or removing everything.
To reset your PC, follow these steps:
Step 1 - Go to SETTINGS and select Update & security.
Step 2 - On the UPDATE & SECURITY window, select Recovery.
Step 3 - On the Recovery window, you can click the \"Get started\" button under Reset this PC.
Step 4 - The next window, will ask you whether you want to reset your settings and applications, but still keep your personal files, or just remove everything and return your computer to its default state.

**Advanced Options**
Windows 10 features several advanced options to restore your PC. Although these are meant for advanced users, you can access them from the same Update & Security window that we have discussed before.
Step 1 - Open the Settings window and select UPDATE & SECURITY.
Step 2 - On the UPDATE & SECURITY window, select Recovery. Under Advanced startup, click the Restart now button.
Step 3 - When Windows 10 restarts, it will present you a menu of options to select from. The same menu will appear whenever Windows tries to boot unsuccessfully.

### References

- https://www.tutorialspoint.com/
- https://edu.gcfglobal.org/en/computerbasics/
- https://techcult.com/how-to-use-performance-monitor-windows-10/
- https://techworm.net/programming/macos-windows-linux-better/
"""

lecture3_full_text = """
DEFINE SYSTEMS ADMINISTRATION AND IDENTIFY RESPONSIBILITIES


## What is System Administration

System administration refers to the management of one or more hardware and software systems.

The task is performed by a system administrator who monitors system health, monitors and allocates system resources like disk space, performs backups, provides user access, manages user accounts, monitors system security and performs many other functions.

## What is a sysadmin?

Short for \"system administrator\", sysadmins are responsible for administration, management, and support activities associated with the IT infrastructure at a multi-user organization.

Responsibilities within the domain of IT systems administration can be distributed across different job titles depending on size of the organization or the scope of work required.


## Sysadmin role & responsibilities

### User administration

The primary responsibility of a sysadmin is to support reliable and effective use of complex IT systems by end users, whether internal employees or external customers. Activities range from managing identities and access to providing dedicated technical support to individual users.

### System maintenance

Sysadmins are responsible for dependable access and availability to IT systems. Sysadmins are therefore required to troubleshoot and fix issues that compromise system performance or access to an IT service.

### Documentation

Sysadmins are required to maintain records of IT assets usage. To plan for future IT investments and upgrades, you will document:
- End-user requests
- Business requirements
- IT issues

Documentation also underpins regulatory compliance.

### System health monitoring

Most IT issues go unnoticed until the impact reaches end users. Sysadmins therefore monitor system health and identify anomalous network behavior, which may include security-sensitive activities such as unauthorized network access and data transfer.

### Backup & disaster recovery

Sysadmins implement data backup and disaster recovery strategies for different IT systems and SDLC environments. You'll also facilitate end-users in accessing data that may have been deleted or unavailable.

Activities may involve:
- Implementing automated software solutions
- Replacing hardware and software components

### Application compatibility

Sysadmins support various IT teams to ensure that software systems and feature releases are compatible with the IT infrastructure. For example, as sysadmin you may:
- Testing server load performance
- Install/upgrade hardware components

### Web service administration & configurations

Sysadmins regularly perform web service administration and configuration management activities, including ensuring that configuration changes are documented and follow organizational policies associated with access and cybersecurity.

### Network administration

To maintain network integrity, sysadmins ensure that network interactions follow organizational policies and protocols. (A background in network engineering may be required to perform mission-critical network administration activities.)

### Security administration

Security responsibilities are centered on infrastructure and network security, with activities including:
- Network monitoring and analysis
- Identity and access management
- Maintaining security of hardware components
- Managing software licensing, updates, and patching

### Database administration

Sysadmins may be responsible for maintaining the integrity, performance, and efficiency of database systems. Database management activities may include migration, design, configuration, installation and security of the organization's data assets.

### Installation & patching

Sysadmins are responsible for managing, troubleshooting, licensing, and updating hardware and software assets. You will ensure that appropriate measures are proactively followed in response to unforeseen issues such as IT downtime or zero-day exploits. Then, you'll documented these activities and follow a strategic approach, per organizational policy.

### User training

Usually sysadmins communicate directly with end users to solve technical issues. Sometimes, you may also conduct training programs to bring users up to pace with new software installations or IT system changes.

## Common skills for sysadmins

Sysadmin positions may not require engineering know-how, but a strong background in IT is necessary to perform sysadmin duties. You'll also want to boast effective communication skills, both written and verbal.

### Subject matter expertise

Most organizations employ multiple individuals specializing in specific system administration domains, so you'll want to be an expert in one or more of the following:
- Computer systems
- Networks
- Hardware and software troubleshooting
- Databases
- Web services

### Problem solving

You're often the first person called upon to deal with a problem, so troubleshooting and understanding key systems are essential. Strong interpersonal and communication skills, both written and verbal, as you'll deal with technically-minded employees and non-technical colleagues alike.

## Sysadmin certifications & education

While a Bachelor's degree in computer science can be helpful, some industry-leading certifications and the right hands-on experience can easily supplant formal academic learning. Here are some top certifications for sysadmins:
- Microsoft Windows Server and Desktop certifications
- CompTIA Network+ and A+
- Unix
- Linux

## Sysadmin salary trends

System Administration is hard work and forms the foundation of every engineering activity in IT-driven enterprises. RedHat indicates that sysadmins in the U.S. average:
- An annual salary of $68,000 at the beginning of their career
- $81,500 for average experience and job complexity
- $115,750 for the most commanding sysadmin positions

"""

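# "Hard Mode" dataset: each answer is copied verbatim from its lecture context,
# so it can be located as an exact substring during pre-processing.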
training_data = [
    {"question": "What does IT stand for?","answer": "Information Technology", "context": lecture1_full_text},
    {"question": "What does COTS stand for?","answer": "commercial-off-the-shelf", "context": lecture1_full_text},
    {"question": "What does BCP stand for?","answer": "Business Continuity Planning", "context": lecture1_full_text},
    {"question": "What is a SYSTEM?","answer": "An array of components designed to accomplish a particular objective according to plan.", "context": lecture1_full_text},
    {"question": "What is SYSTEMS THINKING?","answer": "Is a way of understanding an entity in terms of its purpose, as three steps.", "context": lecture1_full_text},
    {"question": "What are the three major steps in systems thinking?","answer": "1.  Identify a containing whole (system), of which the thing to be explained is a part.\n2.  Explain the behavior or properties of the containing whole.\n3.  Explain the behavior or properties of the thing to be explained in terms of its role(s) or function(s) within its containing whole (Ackoff, 1981).", "context": lecture1_full_text},
    {"question": "What is SYSTEM INTEGRATION?","answer": "Is the combination of inter-related elements to achieve a common objective(s).", "context": lecture1_full_text},
    {"question": "What is System Architecture?","answer": "The architecture of a system defines its high-level structure, exposing its gross organization as a collection of interacting components.", "context": lecture1_full_text},
    {"question": "What elements are needed to model a software architecture?","answer": "Components, Connectors, Systems, Properties, and Styles", "context": lecture1_full_text},
    {"question": "What are the three sources of IS projects?","answer": "problems, opportunities, and directives", "context": lecture1_full_text},
    {"question": "What are Problems in the context of IS projects?","answer": "undesirable situations that prevent the business from fully achieving its purpose, goals, and objectives", "context": lecture1_full_text},
    {"question": "What is an Opportunity in the context of IS projects?","answer": "a chance to improve the business even in the absence of specific problems", "context": lecture1_full_text},
    {"question": "What is a Directive in the context of IS projects?","answer": "a new requirement that is imposed by management, government, or some external influence", "context": lecture1_full_text},
    {"question": "Who are Stakeholders?","answer": "the people involved in or affected by project activities", "context": lecture1_full_text},
    {"question": "What are the 4 frames for analyzing a formal organization?","answer": "-   **Structural Frame:** Focuses on roles and responsibilities, coordination and control. Organizational charts help define this frame.\n-   **Political Frame:** Assumes organizations are coalitions composed of varied individuals and interest groups. Conflict and power are key issues.\n-   **Human Resources Frame:** Focuses on providing harmony between needs of the organization and needs of people.\n-   **Symbolic Frame:** Focuses on symbols and meanings related to events. Culture is important.", "context": lecture1_full_text},
    {"question": "Which organizational frame focuses on roles and responsibilities?","answer": "Structural Frame", "context": lecture1_full_text},
    {"question": "Which organizational frame deals with conflict and power?","answer": "Political Frame", "context": lecture1_full_text},
    {"question": "Which organizational frame focuses on harmony between the organization and its people?","answer": "Human Resources Frame", "context": lecture1_full_text},
    {"question": "Which organizational frame focuses on symbols and culture?","answer": "Symbolic Frame", "context": lecture1_full_text},
    {"question": "What are the 4 basic organizational structures?","answer": "4 basic organizational structures:\n-   Functional\n-   Project-based\n-   Matrix\n-   Divisional", "context": lecture1_full_text},
    {"question": "What is a FUNCTIONAL STRUCTURE?","answer": "People who do similar tasks, have similar skills and/or jobs in an organization are grouped into a functional structure.", "context": lecture1_full_text},
    {"question": "What is a DIVISIONAL STRUCTURE?","answer": "In a divisional structure, the company will coordinate inter-group relationships to create a work team that can readily meet the needs of a certain customer or group of customers.", "context": lecture1_full_text},
    {"question": "What is a MATRIX STRUCTURE?","answer": "Matrix structures are more complex in that they group people in two different ways:\n-   by the function they perform and\n-   by the product team they are working with.", "context": lecture1_full_text},
    {"question": "What is a PROJECT ORGANIZATION STRUCTURE?","answer": "In a project-organizational structure, the teams are put together based on the number of members needed to produce the product or complete the project.", "context": lecture1_full_text},
    {"question": "What helps IT projects succeed?","answer": "-   Executive support\n-   User involvement\n-   Experienced project manager\n-   Clear business objectives\n-   Minimized scope\n-   Standard software infrastructure\n-   Firm basic requirements\n-   Formal methodology\n-   Reliable estimates", "context": lecture1_full_text},
    {"question": "What manages a computer's memory, processes, software, and hardware?","answer": "An operating system", "context": lecture2_full_text},
    {"question": "What does OS stand for?","answer": "operating system", "context": lecture2_full_text},
    {"question": "What are the three most common operating systems?","answer": "Microsoft Windows, macOS, and Linux", "context": lecture2_full_text},
    {"question": "What is the name of Apple's operating system?","answer": "macOS", "context": lecture2_full_text},
    {"question": "What was macOS previously called?","answer": "OS X", "context": lecture2_full_text},
    {"question": "What are the pros of macOS?","answer": "**Pros:**\n1. Simple and Powerful user interface\n2. Fewer Virus attacks\n3. World class Integration between hardware and software\n4. Integration of Apple Products", "context": lecture2_full_text},
    {"question": "What are the cons of macOS?","answer": "**Cons:**\n1. Expensive\n2. Harder to upgrade\n3. No Games", "context": lecture2_full_text},
    {"question": "What is a family of open-source operating systems?","answer": "Linux", "context": lecture2_full_text},
    {"question": "What are the main advantages of Linux?","answer": "it is free, and there are many different distributions-or versions-you can choose from", "context": lecture2_full_text},
    {"question": "What are the pros of Linux?","answer": "**Pros:**\n1. Low cost\n2. Stability\n3. Flexibility\n4. Performance\n5. Choice", "context": lecture2_full_text},
    {"question": "What are the cons of Linux?","answer": "**Cons:**\n1. Understanding\n2. Software\n3. Ease\n4. Hardware", "context": lecture2_full_text},
    {"question": "Who created the Windows operating system?","answer": "Microsoft", "context": lecture2_full_text},
    {"question": "What are the pros of Windows?","answer": "**Pros:**\n1. Cheap\n2. Variety Options\n3. Software availability\n4. Customizable\n5. Games\n6. Upgradable", "context": lecture2_full_text},
    {"question": "What are the cons of Microsoft Windows?","answer": "**Cons:**\n1. Malware and Virus Attacks\n2. Less Reliable\n3. Becomes Laggy overtime", "context": lecture2_full_text},
    {"question": "What are the different versions of Windows 11?","answer": "- Windows 11 Home: intended for home use. It contains all the basic functions an average user needs.\n- Windows 11 Pro: intended for professionals and businesses. It includes additional features for business use, such as advanced security features and comprehensive management tools.\n- Windows 11 Enterprise: intended for large organisations and offers the most comprehensive set of features, including advanced security and management tools.\n- Windows 11 Education: specially designed for education and includes features useful for schools and universities.\n- Windows 11 IoT Enterprise: a version of Enterprise with the same functionality, but with different mechanisms for licensing and distribution.", "context": lecture2_full_text},
    {"question": "Which Windows 11 version is for professionals and businesses?","answer": "Windows 11 Pro", "context": lecture2_full_text},
    {"question": "Which version of Windows 11 is specially designed for education?","answer": "Windows 11 Education", "context": lecture2_full_text},
    {"question": "Which version of Windows 11 is intended for home use?","answer": "Windows 11 Home", "context": lecture2_full_text},
    {"question": "What Windows feature performs a back-up of files in your libraries?","answer": "File History", "context": lecture2_full_text},
    {"question": "What is a system image?","answer": "a copy of all your system and program files needed for your computer to run properly", "context": lecture2_full_text},
    {"question": "What option returns your computer to its default factory settings?","answer": "Reset this PC", "context": lecture2_full_text},
    {"question": "What is System Administration?","answer": "the management of one or more hardware and software systems", "context": lecture3_full_text},
    {"question": "What are some tasks performed by a system administrator?","answer": "monitors system health, monitors and allocates system resources like disk space, performs backups, provides user access, manages user accounts, monitors system security and performs many other functions", "context": lecture3_full_text},
    {"question": "What is the short term for system administrator?","answer": "sysadmins", "context": lecture3_full_text},
    {"question": "What is the primary responsibility of a sysadmin?","answer": "to support reliable and effective use of complex IT systems by end users", "context": lecture3_full_text},
    {"question": "Which sysadmin role involves managing identities and access?","answer": "User administration", "context": lecture3_full_text},
    {"question": "Which sysadmin role ensures dependable access and availability?","answer": "System maintenance", "context": lecture3_full_text},
    {"question": "Which sysadmin role involves maintaining records of IT assets?","answer": "Documentation", "context": lecture3_full_text},
    {"question": "Which sysadmin role involves monitoring system health?","answer": "System health monitoring", "context": lecture3_full_text},
    {"question": "Which sysadmin role involves implementing data backup strategies?","answer": "Backup & disaster recovery", "context": lecture3_full_text},
    {"question": "Which sysadmin role ensures software is compatible with IT infrastructure?","answer": "Application compatibility", "context": lecture3_full_text},
    {"question": "Which sysadmin role involves web service administration?","answer": "Web service administration & configurations", "context": lecture3_full_text},
    {"question": "Which sysadmin role maintains network integrity?","answer": "Network administration", "context": lecture3_full_text},
    {"question": "Which sysadmin role is centered on infrastructure and network security?","answer": "Security administration", "context": lecture3_full_text},
    {"question": "Which sysadmin role maintains database integrity and performance?","answer": "Database administration", "context": lecture3_full_text},
    {"question": "Which sysadmin role involves updating hardware and software?","answer": "Installation & patching", "context": lecture3_full_text},
    {"question": "Which sysadmin role involves training users on new software?","answer": "User training", "context": lecture3_full_text},
    {"question": "What are common skills for sysadmins?", "answer": "a strong background in IT is necessary to perform sysadmin duties. You'll also want to boast effective communication skills, both written and verbal.", "context": lecture3_full_text},
    {"question": "What are some domains for sysadmin expertise?", "answer": "- Computer systems\n- Networks\n- Hardware and software troubleshooting\n- Databases\n- Web services", "context": lecture3_full_text},
    {"question": "Why is problem solving an essential skill for sysadmins?", "answer": "You're often the first person called upon to deal with a problem, so troubleshooting and understanding key systems are essential", "context": lecture3_full_text},
    {"question": "What are some top certifications for sysadmins?", "answer": "- Microsoft Windows Server and Desktop certifications\n- CompTIA Network+ and A+\n- Unix\n- Linux", "context": lecture3_full_text}
]

# Convert to Hugging Face's Dataset format
raw_dataset = Dataset.from_list(training_data)
print(f"Dataset loaded with {len(raw_dataset)} entries.")
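
Step 4 locates each answer with `str.find` and falls back to the `[CLS]` label when the answer is not a verbatim substring of its context. A single mismatched character (different bullet spacing, a missing newline) therefore turns an entry into a "no answer" example without any warning. Below is a minimal sketch of a check that could be run on `training_data` before tokenization. `find_unmatched_answers` is a hypothetical helper, and the two records are toy stand-ins, not the real lecture text.

```python
def find_unmatched_answers(records):
    """Return the questions whose answer is not a verbatim substring of its context."""
    return [r["question"] for r in records if r["answer"] not in r["context"]]

# Toy records that mimic the shape of `training_data` (not the real lecture text)
toy_context = "An operating system manages memory.\n-   Functional\n-   Matrix"
toy_records = [
    {"question": "What manages memory?", "answer": "An operating system", "context": toy_context},
    # Bullet spacing differs from the context ("- " vs "-   "), so this entry is flagged
    {"question": "Which structures?", "answer": "- Functional\n- Matrix", "context": toy_context},
]

unmatched = find_unmatched_answers(toy_records)
print(f"{len(unmatched)} unmatched answer(s): {unmatched}")
```

Any question reported here would be trained as unanswerable, so it is worth correcting the answer text until the list is empty.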

In [None]:
print("\n--- Step 4: Pre-processing data with the final robust method... ---")

# DistilBERT checkpoint that is already fine-tuned on SQuAD v1.1
MODEL_NAME = "distilbert-base-cased-distilled-squad"
# A fast tokenizer is required: offset mappings are only available from fast tokenizers
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)

max_length = 384 # Tokens per window (question + context chunk); the model itself accepts up to 512
stride = 128     # Tokens of overlap between consecutive windows

# Tokenize each question/context pair into overlapping windows and label the answer span in each window.
def preprocess_training_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=max_length,
        truncation="only_second",
        stride=stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    offset_mapping = inputs.pop("offset_mapping")
    sample_map = inputs.pop("overflow_to_sample_mapping")
    
    answers = examples["answer"]
    start_positions = []
    end_positions = []

    for i, offset in enumerate(offset_mapping):
        sample_idx = sample_map[i]
        answer = answers[sample_idx]
        
        # Find the character start and end of the answer in the original context
        start_char = examples["context"][sample_idx].find(answer)
        if start_char == -1:
             start_positions.append(0) # CLS index
             end_positions.append(0)   # CLS index
             continue
        
        end_char = start_char + len(answer)
        sequence_ids = inputs.sequence_ids(i)

        # Find the start and end of the context in the chunk
        idx = 0
        while sequence_ids[idx] != 1:
            idx += 1
        context_start = idx
        while idx < len(sequence_ids) and sequence_ids[idx] == 1:
            idx += 1
        context_end = idx - 1

        # If the answer is not in this chunk, label it with the CLS token
        if offset[context_start][0] > start_char or offset[context_end][1] < end_char:
            start_positions.append(0) # CLS index
            end_positions.append(0)   # CLS index
        else:
            # Otherwise, find the token start and end that correspond to the character start and end
            idx = context_start
            while idx <= context_end and offset[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)

            idx = context_end
            while idx >= context_start and offset[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

# Apply the robust pre-processing function
processed_dataset = raw_dataset.map(
    preprocess_training_examples,
    batched=True,
    remove_columns=raw_dataset.column_names,
    num_proc=os.cpu_count() or 4
)

print(f"Data pre-processing complete. The {len(raw_dataset)} questions generated {len(processed_dataset)} training chunks.")
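
The chunk count printed above is larger than the question count because every lecture is longer than one 384-token window. A rough sketch of the arithmetic follows, assuming the usual `[CLS] question [SEP] context [SEP]` layout (three special tokens) and that consecutive windows overlap by `stride` context tokens. `count_windows` and the token counts are hypothetical, and the estimate may differ slightly from what the tokenizer actually produces.

```python
import math

def count_windows(context_tokens, question_tokens, max_length=384, stride=128, special_tokens=3):
    """Estimate how many overlapping windows one question/context pair produces."""
    # Context tokens that fit in one window next to the question and special tokens
    capacity = max_length - question_tokens - special_tokens
    if capacity <= stride:
        raise ValueError("stride must be smaller than the per-window context capacity")
    if context_tokens <= capacity:
        return 1
    # Each additional window advances by this many new context tokens
    step = capacity - stride
    return math.ceil((context_tokens - capacity) / step) + 1

# Hypothetical sizes: a 3,000-token lecture and a 10-token question
n_windows = count_windows(context_tokens=3000, question_tokens=10)
print(n_windows)
```

Only the windows that fully contain the answer receive a real span label; the rest are labeled with the `[CLS]` index, so most chunks for a long lecture are "no answer" examples.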

In [None]:
print("\n--- Step 5: Fine-tuning the new, more powerful model... ---")

model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)
OUTPUT_MODEL_DIR = "./mindvault-qa-final-expert"

training_args = TrainingArguments(
    output_dir=OUTPUT_MODEL_DIR,
    evaluation_strategy="no",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5, # The checkpoint is already fine-tuned on SQuAD, so only a few epochs are used to adapt it
    weight_decay=0.01,
    logging_steps=100,
)

trainer = Trainer(
    model=model, 
    args=training_args, 
    train_dataset=processed_dataset, 
    tokenizer=tokenizer,
)

trainer.train()
print("\n--- Fine-tuning of the final model is complete! ---")
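
With `per_device_train_batch_size=16`, `num_train_epochs=5` and `logging_steps=100`, a small dataset can finish training before the first intermediate loss value is logged. The sketch below estimates the number of optimizer steps, assuming a single device and no gradient accumulation. `total_optimizer_steps` is a hypothetical helper and the chunk count is made up; the real count is printed at the end of Step 4.

```python
import math

def total_optimizer_steps(num_chunks, batch_size=16, epochs=5):
    """Optimizer steps for a single-device run without gradient accumulation."""
    return math.ceil(num_chunks / batch_size) * epochs

# Hypothetical chunk count, for illustration only
steps = total_optimizer_steps(num_chunks=250)
print(steps)
if steps < 100:
    print("Fewer steps than logging_steps=100: no intermediate loss will be logged.")
```

If the estimate falls below 100, lowering `logging_steps` would make the loss curve visible during training.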

In [None]:
print(f"\n--- Step 6: Saving the final model to '{OUTPUT_MODEL_DIR}'... ---")
trainer.save_model(OUTPUT_MODEL_DIR)
tokenizer.save_pretrained(OUTPUT_MODEL_DIR)
print("Final model saved successfully.")

print("\n--- Step 7: Converting the final model to ONNX format... ---")

ONNX_OUTPUT_DIR = "./mindvault-qa-onnx-final-expert"
os.makedirs(ONNX_OUTPUT_DIR, exist_ok=True)

onnx_tokenizer = AutoTokenizer.from_pretrained(OUTPUT_MODEL_DIR)
onnx_model = AutoModelForQuestionAnswering.from_pretrained(OUTPUT_MODEL_DIR)

feature = "question-answering"
model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(onnx_model, feature=feature)
onnx_config = model_onnx_config(onnx_model.config)

output_path = Path(ONNX_OUTPUT_DIR) / "model.onnx"
quantized_output_path = Path(ONNX_OUTPUT_DIR) / "model.quant.onnx"

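# Custom ONNX config: DistilBERT takes only input_ids and attention_mask (no token_type_ids).
# This replaces the onnx_config obtained from FeaturesManager above.
from collections import OrderedDict
from transformers.onnx import OnnxConfig

class DistilBertOnnxConfig(OnnxConfig):
    @property
    def inputs(self) -> 'OrderedDict[str, OrderedDict[int, str]]':
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
                ("attention_mask", {0: "batch", 1: "sequence"}),
            ]
        )

# task="question-answering" names the exported outputs start_logits and end_logits
onnx_config = DistilBertOnnxConfig(onnx_model.config, task="question-answering")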

export(
    preprocessor=onnx_tokenizer, 
    model=onnx_model, 
    config=onnx_config,
    opset=13, # Use opset 13 for better compatibility
    output=output_path
)

print(f"ONNX model saved to {output_path}")
print("Quantizing the ONNX model...")
quantize_dynamic(
    model_input=output_path, 
    model_output=quantized_output_path, 
    weight_type=QuantType.QInt8
)
print(f"Quantized ONNX model saved to {quantized_output_path}")

print("\n\n=======================================================")
print("✅✅✅ FINAL, WORKING EXPERT MODEL CREATED! ✅✅✅")
print("=======================================================")
print("Your final, most powerful AI model is located in the folder:")
print(f"'{ONNX_OUTPUT_DIR}'")
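
Dynamic quantization stores the weights as 8-bit integers, so the quantized file should be noticeably smaller than the original export. The sketch below shows one way to report the difference. `size_mb` and `report_compression` are hypothetical helpers, and the demo uses dummy files so that it runs without a trained model.

```python
import os
import tempfile

def size_mb(path):
    """File size in mebibytes."""
    return os.path.getsize(path) / (1024 * 1024)

def report_compression(original_path, quantized_path):
    """Summarize how much smaller the quantized file is."""
    original, quantized = size_mb(original_path), size_mb(quantized_path)
    return f"{original:.1f} MB -> {quantized:.1f} MB ({quantized / original:.0%} of original)"

# Dummy files standing in for model.onnx and model.quant.onnx
with tempfile.TemporaryDirectory() as tmp:
    fp32_path = os.path.join(tmp, "model.onnx")
    int8_path = os.path.join(tmp, "model.quant.onnx")
    with open(fp32_path, "wb") as f:
        f.write(b"\0" * (4 * 1024 * 1024))
    with open(int8_path, "wb") as f:
        f.write(b"\0" * (1 * 1024 * 1024))
    summary = report_compression(fp32_path, int8_path)

print(summary)
```

In this notebook the call would be `report_compression(output_path, quantized_output_path)` at the end of Step 7.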

In [None]:
print("\n--- Step 8: Starting Interactive Test Session ---")

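def get_answer_interactive(question, context, tokenizer, ort_session, max_length=384, stride=128):
    # Split the context into overlapping windows (same settings as Step 4).
    # A single 512-token pass would discard everything after the start of the first file.
    inputs = tokenizer(
        question, context, return_tensors="np", truncation="only_second",
        max_length=max_length, stride=stride,
        return_overflowing_tokens=True, padding="max_length",
    )
    ort_inputs = {'input_ids': inputs['input_ids'], 'attention_mask': inputs['attention_mask']}
    
    try:
        ort_outs = ort_session.run(None, ort_inputs)
    except Exception as e:
        return f"[ERROR: ONNX runtime failed - {e}]"

    # Ignore padding positions when picking the answer span
    mask = inputs['attention_mask'].astype(bool)
    start_logits = np.where(mask, ort_outs[0], -np.inf)
    end_logits = np.where(mask, ort_outs[1], -np.inf)

    best_score, best_answer = -np.inf, None
    for i in range(start_logits.shape[0]):
        start_index = int(np.argmax(start_logits[i]))
        end_index = int(np.argmax(end_logits[i]))
        # Skip windows with an inverted span or a [CLS] ("no answer") prediction
        if start_index > end_index or (start_index == 0 and end_index == 0):
            continue
        score = start_logits[i][start_index] + end_logits[i][end_index]
        if score > best_score:
            best_score = score
            answer_tokens = inputs['input_ids'][i][start_index:end_index+1]
            best_answer = tokenizer.decode(answer_tokens, clean_up_tokenization_spaces=False)

    if best_answer is None:
        return "[I could not find a specific answer in the document.]"
    return best_answer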

try:
    interactive_tokenizer = AutoTokenizer.from_pretrained("./mindvault-qa-final-expert")
    interactive_session = ort.InferenceSession("./mindvault-qa-onnx-final-expert/model.quant.onnx")
    print("Final model loaded for testing.")
    model_ready = True
except Exception as e:
    print(f"\n❌ ERROR: Could not load the final model for testing. Details: {e}")
    model_ready = False

if model_ready:
    print("\n--- Loading knowledge base from .txt files... ---")
    full_knowledge_base = ""
    file_names = ['ITEC116-Lecture-1-_1_.txt', 'lesson2_Version2.txt', 'index_delayed_ui_Version2.txt']
    files_loaded_count = 0

    for file_name in file_names:
        try:
            with open(file_name, 'r', encoding='utf-8') as f:
                full_knowledge_base += f.read() + "\n\n---\n\n" # Add separators
                print(f"  - Successfully loaded {file_name}")
                files_loaded_count += 1
        except FileNotFoundError:
            print(f"  - ⚠️ WARNING: Could not find file: {file_name}")

    if files_loaded_count == 0:
        print("\n❌ ERROR: No text files were loaded. Please upload your .txt files to the Colab session.")
    else:
        print(f"✅ Knowledge base created from {files_loaded_count} file(s).")
        print("\n=============================================")
        print("🧠 MindVault AI is ready.")
        print("Ask any question about the content of your files.")
        print("Type 'exit' when you are finished.")
        print("=============================================")

        while True:
            try:
                question = input("\nYour Question: ")
                if question.lower().strip() == 'exit':
                    print("\nExiting MindVault AI. Goodbye!")
                    break
                
                print("🤖 Thinking...")
                
                answer = get_answer_interactive(question, full_knowledge_base, interactive_tokenizer, interactive_session)
                
                print("\n💡 Answer:")
                print(textwrap.fill(answer, width=80))

            except Exception as e:
                print(f"\nAn unexpected error occurred: {e}")
                break
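
Step 8 picks the start and end tokens independently with `argmax`, which can yield an inverted span (start after end) that is then reported as "no answer". A common alternative in SQuAD-style post-processing scores the top start/end candidates jointly and keeps the best valid pair. The sketch below assumes the logits for a single window are 1-D arrays. `best_span`, `top_k` and `max_answer_tokens` are hypothetical names and defaults, not part of the notebook.

```python
import numpy as np

def best_span(start_logits, end_logits, top_k=20, max_answer_tokens=30):
    """Return (start, end, score) of the highest-scoring valid span, or None."""
    start_candidates = np.argsort(start_logits)[::-1][:top_k]
    end_candidates = np.argsort(end_logits)[::-1][:top_k]
    best = None
    for s in start_candidates:
        for e in end_candidates:
            # A valid span ends at or after its start and is not too long
            if e < s or e - s + 1 > max_answer_tokens:
                continue
            score = float(start_logits[s] + end_logits[e])
            if best is None or score > best[2]:
                best = (int(s), int(e), score)
    return best

# Toy logits where independent argmax gives an inverted span (start=4, end=2)
toy_start = np.array([0.1, 0.2, 3.0, 0.1, 5.0, 0.3])
toy_end = np.array([0.1, 0.2, 6.0, 0.1, 0.2, 4.5])
print(best_span(toy_start, toy_end))
```

Several answers in Step 3 are multi-line lists, so `max_answer_tokens` would likely need to be larger than the default shown here.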