<h1>The LLM</h1>

In [1]:
!huggingface-cli login

In [2]:
import os
import logging
import sys
from IPython.display import Markdown, display
from llama_index.llms import LlamaCPP
from llama_index import ServiceContext, set_global_tokenizer, set_global_service_context
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.prompts import PromptTemplate
from llama_index.embeddings import HuggingFaceEmbedding
from transformers import AutoTokenizer
import torch

In [3]:
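# Send INFO-level logs to stdout. basicConfig already attaches a stdout
# handler to the root logger, so adding a second StreamHandler on top of it
# would print every log record twice.
logging.basicConfig(
    stream = sys.stdout,
    level = logging.INFO
)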

In [4]:
model_name = 'Llama2-7b-chat'
model_path = r"C:\0-VARAD-DESHMUKH\models\llama-2-7b-chat.Q6_K.gguf"
max_new_tokens = 2048
context_window = 4096

llm = LlamaCPP(
    model_path = model_path,
    temperature = 0,
    max_new_tokens = max_new_tokens,
    context_window = context_window,
    generate_kwargs = {},
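    # The GGUF file is already quantized (Q6_K). 'load_in_8bit' is a
    # transformers/bitsandbytes option, not a llama.cpp setting, so it is
    # dropped here.
    model_kwargs = {}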
)

tokenizer_model = r'meta-llama/Llama-2-7b-chat-hf'
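# Read the access token from the environment instead of hardcoding it.
# If HF_TOKEN is unset, the credentials saved by `huggingface-cli login` are used.
token = os.environ.get('HF_TOKEN')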
set_global_tokenizer(
    AutoTokenizer.from_pretrained(
        tokenizer_model,
        token = token
    ).encode
)

AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | 
Model metadata: {'general.name': 'LLaMA v2', 'general.architecture': 'llama', 'llama.context_length': '4096', 'llama.rope.dimension_count': '128', 'llama.embedding_length': '4096', 'llama.block_count': '32', 'llama.feed_forward_length': '11008', 'llama.attention.head_count': '32', 'tokenizer.ggml.eos_token_id': '2', 'general.file_type': '18', 'llama.attention.head_count_kv': '32', 'llama.attention.layer_norm_rms_epsilon': '0.000001', 'tokenizer.ggml.model': 'llama', 'general.quantization_version': '2', 'tokenizer.ggml.bos_token_id': '1', 'tokenizer.ggml.unknown_token_id': '0'}
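
With `context_window = 4096` and `max_new_tokens = 2048`, half of the window is reserved for the answer, which leaves roughly the other half for the prompt (instructions, retrieved chunks, and the question). The sketch below only illustrates that arithmetic. The helper names and the whitespace tokenizer are hypothetical stand-ins, not part of llama_index; in the notebook the real count would come from the Llama 2 tokenizer registered with `set_global_tokenizer`.

```python
from typing import Callable, List


def prompt_token_budget(context_window: int, max_new_tokens: int) -> int:
    """Tokens left for the prompt once the output budget is reserved."""
    budget = context_window - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than context_window")
    return budget


def fits_in_context(
    prompt: str,
    tokenize: Callable[[str], List],
    context_window: int = 4096,
    max_new_tokens: int = 2048,
) -> bool:
    """Check whether a prompt fits in the space not reserved for generation."""
    return len(tokenize(prompt)) <= prompt_token_budget(context_window, max_new_tokens)


# Hypothetical stand-in tokenizer: one token per whitespace-separated word.
whitespace_tokenize = str.split

print(prompt_token_budget(4096, 2048))
print(fits_in_context("What is global warming?", whitespace_tokenize))
```

If retrieved context is regularly cut short, lowering `max_new_tokens` is one way to give the prompt more room.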


In [5]:
embed_model_path = r"C:\Users\rck05\.cache\huggingface\hub\models--WhereIsAI--UAE-Large-V1\snapshots\82f6ace7a8954c012dd2ae05e2604fbc9007205b"
embed_model_name = 'WhereIsAI/UAE-Large-V1'

if not os.path.exists(embed_model_path):
    embed_model = HuggingFaceEmbedding(
        embed_model_name
    )
    print('Embedding model not found in cache. Downloaded it from the Hugging Face Hub.')
else:
    embed_model = HuggingFaceEmbedding(
        embed_model_path
    ) 
    print('Embedding model found in cache.')

print('Details -\nModel name: ', embed_model_name, '\nModel Directory: ', embed_model_path)

Embedding model found in cache.
Details -
Model name:  WhereIsAI/UAE-Large-V1 
Model Directory:  C:\Users\rck05\.cache\huggingface\hub\models--WhereIsAI--UAE-Large-V1\snapshots\82f6ace7a8954c012dd2ae05e2604fbc9007205b
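
The cache check above depends on a hardcoded snapshot hash, so it silently falls back to downloading if the model revision changes. A minimal sketch of looking the snapshot up instead, assuming the usual Hugging Face Hub cache layout (`models--<org>--<name>/snapshots/<revision>`). `find_cached_snapshot` is a hypothetical helper, not a library function, and the default cache location used in the example can be overridden by environment variables on a given machine.

```python
from pathlib import Path
from typing import Optional


def find_cached_snapshot(repo_id: str, hub_cache: Path) -> Optional[Path]:
    """Return the most recently modified snapshot directory for repo_id, or None."""
    repo_dir = hub_cache / ("models--" + repo_id.replace("/", "--"))
    snapshots_dir = repo_dir / "snapshots"
    if not snapshots_dir.is_dir():
        return None
    snapshots = [p for p in snapshots_dir.iterdir() if p.is_dir()]
    if not snapshots:
        return None
    # If several revisions are cached, prefer the one touched most recently.
    return max(snapshots, key=lambda p: p.stat().st_mtime)


# Assumed default cache location; adjust if the cache lives elsewhere.
default_cache = Path.home() / ".cache" / "huggingface" / "hub"
snapshot = find_cached_snapshot("WhereIsAI/UAE-Large-V1", default_cache)
print("Cached snapshot:", snapshot)
```

The returned path (when not `None`) could be passed to `HuggingFaceEmbedding` in place of the hardcoded `embed_model_path`.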


In [6]:
service_context = ServiceContext.from_defaults(
    llm = llm,
    embed_model = embed_model
)

set_global_service_context(service_context)
print('Global context set.')
print('Foundational model: ', model_name)
print('Embedding model: ', embed_model_name)

Global context set.
Foundational model:  Llama2-7b-chat
Embedding model:  WhereIsAI/UAE-Large-V1


In [7]:
#response_iter = llm.stream_complete("What is global warming? Give a technical answer.")
#for response in response_iter:
#    print(response.delta, end="", flush=True)

In [8]:
documents = SimpleDirectoryReader(
    r"C:\0-VARAD-DESHMUKH\Files\data"
).load_data()
# create vector store index
#index = VectorStoreIndex.from_documents(documents)
# set up query engine
#query_engine = index.as_query_engine(streaming=True)
#response = query_engine.query("What services does Axiom Space offer?")
#print(response)

In [9]:
from llama_index.node_parser import SemanticSplitterNodeParser
from llama_index.extractors import (
    SummaryExtractor,
    TitleExtractor,
    EntityExtractor
)
from llama_index.ingestion import IngestionPipeline

In [10]:
# Disabled: running the LLM-based extractors takes too long on this machine.
# splitter = SemanticSplitterNodeParser(
#     buffer_size=1,
#     breakpoint_percentile_threshold=95,
#     embed_model=embed_model
# )
#
# extractor = [
#     SummaryExtractor(
#         summaries=['prev', 'self', 'next'],
#         llm=llm
#     ),
#     TitleExtractor(
#         nodes=5,
#         llm=llm
#     ),
#     EntityExtractor(
#         prediction_threshold=0.5,
#         label_entities=False,
#         device='cpu'
#     )
# ]
#
# pipeline = IngestionPipeline(
#     transformations=[splitter, *extractor]
# )


In [11]:
# Disabled: running the extractor pipeline takes too long on this machine.
# nodes = pipeline.run(
#     documents=documents,
#     in_place=False,
#     show_progress=True
# )

In [12]:

splitter = SemanticSplitterNodeParser(
    buffer_size=1,
    breakpoint_percentile_threshold=95,
    embed_model=embed_model
)

# Reuse the embedding model loaded earlier instead of loading it a second time.
embedding = embed_model

pipeline = IngestionPipeline(
    transformations=[splitter, embedding]
)
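
To make `breakpoint_percentile_threshold=95` more concrete: the idea behind semantic splitting is to embed consecutive sentences, measure how far each one is from the next, and start a new chunk wherever that distance is unusually large. The following is a simplified sketch of that idea with toy 2-D vectors, not the library's implementation. It ignores `buffer_size`, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def split_points(embeddings: Sequence[Sequence[float]], threshold_pct: float = 95.0) -> List[int]:
    """Indices i where a chunk boundary falls between sentence i and sentence i + 1."""
    distances = [
        cosine_distance(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    if not distances:
        return []
    cutoff = percentile(distances, threshold_pct)
    return [i for i, d in enumerate(distances) if d > cutoff]


# Toy "sentence embeddings": three sentences on one topic, then two on another.
toy_embeddings = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
print(split_points(toy_embeddings))
```

A higher percentile produces fewer, larger chunks, while a lower one splits more aggressively.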


In [13]:
import nest_asyncio 
nest_asyncio.apply()

nodes = pipeline.run(
    documents=documents,
    in_place=False,
    show_progress=True
)

Parsing nodes:   0%|          | 0/22 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/23 [00:00<?, ?it/s]

[... 21 similar "Generating embeddings" progress bars omitted ...]

Generating embeddings:   0%|          | 0/46 [00:00<?, ?it/s]

In [14]:
len(nodes)

46

In [15]:
index = VectorStoreIndex(nodes)

In [17]:
query_engine = index.as_query_engine(streaming=True)
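
Conceptually, the query engine embeds the question, ranks the stored node embeddings by similarity to it, and passes the best matches to the LLM as context. A minimal sketch of that ranking step with toy vectors follows. It is an illustration of top-k cosine retrieval in general, not the `VectorStoreIndex` internals, and the function names and `k = 2` are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    node_vecs: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (node index, score) pairs for the k most similar nodes, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(node_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: the query points along the first axis.
toy_query = (1.0, 0.0)
toy_nodes = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
for node_index, score in top_k(toy_query, toy_nodes):
    print(node_index, round(score, 3))
```

The retrieved nodes are what later appear in `response.source_nodes`, which makes them useful for checking where an answer came from.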

In [18]:
response = query_engine.query("Explain in detail how Axiom Space is looking to construct the world's first commercial space station.")
response.print_response_stream()

 Axiom Space aims to construct the world's first commercial space station by leveraging its strategic partnership with NASA and Thales Alenia Space. The company plans to build the station in phases, starting with the initial assembly of Hab One, which is scheduled to go to space in early 2026. Axiom will then continue to expand the station through additional modules, with the goal of completing the entire structure by 2028.
To construct the station, Axiom is utilizing cutting-edge technologies and techniques, including 3D printing and robotic assembly. The company is also working closely with its partners at Thales Alenia Space to ensure that the modules are designed and built to meet the highest standards of safety and reliability.
Axiom's approach to constructing the world's first commercial space station is unique in several ways. First, the company is focusing on creating a modular design that can be easily expanded and modified as needed. This will allow Axiom to adapt to changing

In [21]:
print(response.source_nodes[0].get_content())

Axiom Space 
 
 
 
Intro-act.com | frank@intro-act.com | 617-454- 1088 
 13 space stations in the coming years. Commercial space stations have business models that accommodate 
commercial for-profit recurring activities in LEO. 
 Axiom Space is working on the world’s next breakthrough innovation platform.  Axiom Space differentiates 
itself from its competitors as the only company having the advantage of connecting its modules to the International 
Space Station. Through this strategic partnership, Axiom Space will smoothly continue its research and 
manufacturing activities through the effective adoption of the hefty multinational user base of the ISS National 
Laboratory. Leveraging techniques available only in microgravity, Axiom Station will introduce people, research, and 
manufacturing, thus proliferating the growth of key industries. In order to provide a high-quality accessible platform 
to private companies and governments, Axiom is making efforts to ensure that the space sta