In [1]:
import openai


import os

openai.api_key = os.getenv("OPENAI_API_KEY")



## Get some RFC data

Picked an RFC from 2023 (RFC 9340) so that it is not included in the training data.

In [4]:
# read the RFC file that's not in the training data

rfc_file = 'ietf.org_rfc_rfc9340.txt'

with open(rfc_file, 'r') as file:
    data = file.read().replace('\n', '')

# count words

words = data.split()
print('Number of words in text file :', len(words))



Number of words in text file : 14664


## Naive chunking

Each chunk is 100 lines of the RFC document.

In [5]:
# Create a list of documents, each containing up to 100 lines

docs = []

with open(rfc_file, 'r') as file:
    doc = []
    for line in file:
        doc.append(line.strip())
        if len(doc) == 100:
            docs.append(doc)
            doc = []
    if len(doc) > 0:
        docs.append(doc)

len(docs)

# for each doc, join the lines to create a single string adding a \n between each line

for i in range(len(docs)):
    docs[i] = '\n'.join(docs[i])
    

# print the first 100 characters of the first doc

print(docs[0][:100])





Internet Research Task Force (IRTF)                         W. Kozlowski
Request for Comments: 
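
The fixed 100-line split above can be wrapped in a small reusable function, which makes it easier to compare chunk sizes later. This is only a sketch: the name `chunk_lines` and the `overlap` parameter are assumptions added for illustration and are not used anywhere else in this notebook. With `overlap=0` it produces the same grouping as the loop above.

```python
def chunk_lines(lines, size=100, overlap=0):
    """Split a list of lines into chunks of at most `size` lines.

    `overlap` is a hypothetical extra: it repeats the last `overlap` lines
    of one chunk at the start of the next, so a passage that straddles a
    chunk boundary still appears whole in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        # join the lines of this window into a single string
        chunks.append("\n".join(lines[start:start + size]))
        # stop once the window has reached the end of the input
        if start + size >= len(lines):
            break
    return chunks


# Toy input standing in for the stripped lines of the RFC file
toy_lines = [str(i) for i in range(250)]
print(len(chunk_lines(toy_lines, size=100, overlap=0)))
print(len(chunk_lines(toy_lines, size=100, overlap=20)))
```

To apply it to the RFC, the input would be the stripped lines of `rfc_file`, as in the cell above.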


In [6]:
# create the metadata as a list of dictionaries: {"source": <doc index>}

metadata = []
for i in range(len(docs)):
    metadata.append({"source": i})

print(metadata[0:10])

[{'source': 0}, {'source': 1}, {'source': 2}, {'source': 3}, {'source': 4}, {'source': 5}, {'source': 6}, {'source': 7}, {'source': 8}, {'source': 9}]


In [7]:
# create the list of ids as strings (Chroma expects string ids)

ids = []
for i in range(len(docs)):
    ids.append(str(i))

# print the first 10 ids
print(ids[:10])

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']


## Basic Storing of RFCs

Store the chunks in a persistent ChromaDB collection.

In [8]:
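import chromadb
from chromadb.config import Settings

# Legacy (pre-0.4) client configuration. In chromadb >= 0.4 the equivalent
# is chromadb.PersistentClient(path="./chromadb/").
client = chromadb.Client(Settings(
    chroma_db_impl="duckdb+parquet",
    persist_directory="./chromadb/"
))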

In [9]:
collection = client.create_collection(name="naive_rfc_chunking")

In [10]:
collection.add(
    documents=docs,
    metadatas=metadata,
    ids=ids
)

In [12]:
from pprint import pprint

results = collection.query(
    query_texts=["What is a quantum network stack?"],
    n_results=10
)

pprint(results["distances"])
pprint(results["ids"])

[[0.7572208046913147,
  0.8892778754234314,
  0.9584118127822876,
  0.9923790097236633,
  0.9929220080375671,
  1.008371114730835,
  1.0142896175384521,
  1.0535603761672974,
  1.0790119171142578,
  1.0914597511291504]]
[['0', '4', '10', '12', '1', '2', '13', '16', '5', '20']]


In [15]:
# print the most relevant doc as a string

print(docs[0])





Internet Research Task Force (IRTF)                         W. Kozlowski
Request for Comments: 9340                                     S. Wehner
Category: Informational                                           QuTech
ISSN: 2070-1721                                             R. Van Meter
Keio University
B. Rijsman
Individual
A. S. Cacciapuoti
M. Caleffi
University of Naples Federico II
S. Nagayama
Mercari, Inc.
March 2023


Architectural Principles for a Quantum Internet

Abstract

The vision of a quantum internet is to enhance existing Internet
technology by enabling quantum communication between any two points
on Earth.  To achieve this goal, a quantum network stack should be
built from the ground up to account for the fundamentally new
properties of quantum entanglement.  The first quantum entanglement
networks have been realised, but there is no practical proposal for
how to organise, utilise, and manage such networks.  In this
document, we attempt to lay down the framework 

## Prepare some questions based on the RFC with Claude

In [16]:
json_questions = [
  {
    "question": "What is the key distinction between a classical network packet and a qubit?",
    "answer": "In contrast, entanglement is a phenomenon in which two or more qubits exist in a physically distributed state. Operations on one of the qubits change the mutual state of the pair. Since the owner of a particular qubit cannot just read out its state, it must coordinate all its actions with the owner of the pair's other qubit."
  },
  {
    "question": "How is error correction different in quantum networks versus classical networks?",
    "answer": "Therefore, we cannot use the same methods known from classical computing for the purposes of error detection and correction. Nevertheless, quantum error detection and correction schemes exist that take this problem into account, and how a network chooses to manage errors will have an impact on its architecture."
  },
  {
    "question": "What are the key differences between first, second, and third generation quantum networks?",
    "answer": "Generations are defined by the directions of classical signalling required in their distributed protocols for loss tolerance and error tolerance. Classical signalling carries the classical bits, incurring round-trip delays... The three \"generations\" summarised: 1) First-generation quantum networks use heralding for loss tolerance and entanglement distillation for error tolerance. 2) Second-generation quantum networks improve upon the first generation with QEC codes for error tolerance (but not loss tolerance). 3) Third-generation quantum networks directly transmit QEC-encoded qubits to adjacent nodes."
  },
  {
    "question": "Why can't qubits be amplified like classical signals?",
    "answer": "The no-cloning theorem states that it is impossible to create an identical copy of an arbitrary, unknown quantum state. Therefore, it is also impossible to use the same mechanisms that worked for classical networks for signal amplification, retransmission, and so on, as they all rely on the ability to copy the underlying data."
  },
  {
    "question": "What are the two meanings of \"qubit\"?",
    "answer": "In the first meaning, \"qubit\" refers to a physical quantum system whose quantum state can be expressed as a superposition of two basis states, which we often label |0⟩ and |1⟩. Here, \"qubit\" refers to a physical implementation akin to what a flip-flop, switch, voltage, or current would be for a classical bit. In the second meaning, \"qubit\" refers to the abstract quantum state of a quantum system with such two basis states. In this case, the meaning of \"qubit\" is akin to the logical value of a bit, from classical computing, i.e., \"logical 0\" or \"logical 1\"."
  },
  {
    "question": "How do quantum repeaters extend entanglement over long distances?",
    "answer": "The solution is entanglement swapping. A Bell pair between any two nodes in the network can be constructed by combining the pairs generated along each individual link on a path between the two end-points. Each node along the path can consume the two pairs on the two links to which it is connected, in order to produce a new entangled pair between the two remote ends. This process is known as entanglement swapping."
  },
  {
    "question": "What classical communication is necessary in a quantum network?",
    "answer": "Classical communication is a crucial building block of any quantum network. All nodes in a quantum network are assumed to have classical connectivity with each other (within typical administrative domain limits)."
  },
  {
    "question": "What are the key elements of a quantum network architecture?",
    "answer": "We have identified quantum repeaters as the core building block of a quantum network. However, a quantum repeater will have to do more than just entanglement swapping in a functional quantum network. Its key responsibilities will include the following: 1) Creating link-local entanglement between neighbouring nodes. 2) Extending entanglement from link-local pairs to long-range pairs through entanglement swapping. 3) Performing distillation to manage the fidelity of the produced pairs. 4) Participating in the management of the network (routing, etc.)."
  },
  {
    "question": "What are the key differences between the control and data planes in quantum versus classical networks?",
    "answer": "In quantum networks, control plane traffic (routing and signalling messages) is exchanged over a classical channel, whereas data plane traffic (the actual Bell pair qubits) is exchanged over a separate quantum channel."
  },
  {
    "question": "What are the key challenges in monitoring a quantum network?",
    "answer": "The fundamental unit of quantum information, the qubit, cannot be actively monitored, as any readout irreversibly destroys its contents. One of the implications of this fact is that measuring an individual pair's fidelity is impossible. Fidelity is meaningful only as a statistical quantity that requires constant monitoring of generated Bell pairs, achieved by sacrificing some Bell pairs for use in tomography or other methods."
  }
]

## Work on chunking strategies

In [25]:
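def query(question):
    """Return the distances of the top 10 hits and the text of the best-matching chunk."""
    results = collection.query(
        query_texts=[question],
        n_results=10
    )

    # ids are the stringified chunk indices, so the top id maps back into docs
    return results["distances"][0], docs[int(results["ids"][0][0])]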

In [26]:
distance, answer = query("What are the key challenges in monitoring a quantum network?")

print(distance, answer)

[0.6471181511878967, 0.6924597024917603, 0.7476683855056763, 0.8544892072677612, 0.8706136345863342, 0.8881561756134033, 0.9051676988601685, 0.9195541143417358, 0.9431092739105225, 0.9590636491775513] it is the transmission of qubits that draws the line between a
genuine quantum network and a collection of quantum computers
connected over a classical network.

A quantum network is defined as a collection of nodes that is able to
exchange qubits and distribute entangled states amongst themselves.
A quantum node that is able only to communicate classically with
another quantum node is not a member of a quantum network.

Services and applications that are more complex can be built on top
of entangled states distributed by the network; for example, see
[ZOO].

4.  Achieving Quantum Connectivity

This section explains the meaning of quantum connectivity and the
necessary physical processes at an abstract level.

4.1.  Challenges

A quantum network cannot be built by simply extrapolating all

## First observation

The chunk retrieved by the default Chroma setup is not the right one.

The correct part of the RFC is:

```
   5.  Make them easy to monitor.

       In order to manage, evaluate the performance of, or debug a
       network, it is necessary to have the ability to monitor the
       network while ensuring that there will be mechanisms in place to
       protect the confidentiality and integrity of the devices
       connected to it.  Quantum networks bring new challenges in this
       area, so it should be a goal of a quantum network architecture to
       make this task easy.

       The fundamental unit of quantum information, the qubit, cannot be
       actively monitored, as any readout irreversibly destroys its
       contents.  One of the implications of this fact is that measuring
       an individual pair's fidelity is impossible.  Fidelity is
       meaningful only as a statistical quantity that requires constant
       monitoring of generated Bell pairs, achieved by sacrificing some
       Bell pairs for use in tomography or other methods.

       Furthermore, given one end of an entangled pair, it is impossible
       to tell where the other qubit is without any additional classical
       metadata.  It is impossible to extract this information from the
       qubits themselves.  This implies that tracking entangled pairs
       necessitates some exchange of classical information.  This
       information might include (i) a reference to the entangled pair
       that allows distributed applications to coordinate actions on
       qubits of the same pair and (ii) the two bits from each
       entanglement swap necessary to identify the final state of the
       Bell pair (Section 4.4.2).
```

The retrieved chunk:
```
[...]
it is the transmission of qubits that draws the line between a
genuine quantum network and a collection of quantum computers
connected over a classical network.

A quantum network is defined as a collection of nodes that is able to
exchange qubits and distribute entangled states amongst themselves.
A quantum node that is able only to communicate classically with
another quantum node is not a member of a quantum network.

Services and applications that are more complex can be built on top
of entangled states distributed by the network; for example, see
[ZOO].

4.  Achieving Quantum Connectivity

This section explains the meaning of quantum connectivity and the
necessary physical processes at an abstract level.

4.1.  Challenges

A quantum network cannot be built by simply extrapolating all the
classical models to their quantum analogues.  Sending qubits over a
wire like we send classical bits is simply not as easy to do.  There
are several technological as well as fundamental challenges that make
classical approaches unsuitable in a quantum context.

[...]

4.1.4.  Inadequacy of Direct Transmission

Conceptually, the most straightforward way to distribute an entangled
state is to simply transmit one of the qubits directly to the other
end across a series of nodes while performing sufficient forward
Quantum Error Correction (QEC) (Section 4.4.3.2) to bring losses down
to an acceptable level.  Despite the no-cloning theorem and the
inability to directly measure a quantum state, error-correcting
mechanisms for quantum communication exist [Jiang09] [Fowler10]
[Devitt13] [Mural16].  However, QEC makes very high demands on both
resources (physical qubits needed) and their initial fidelity.

[...]
```
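
The manual comparison above could be automated across all the prepared questions. The sketch below is one possible way to do it, not something this notebook runs: it checks whether a sentence from the expected answer appears in any of the retrieved chunks, ignoring line wrapping and indentation. The helper names `normalize` and `hit_rank` are hypothetical, and the chunks are toy strings standing in for real query results.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so RFC line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def hit_rank(expected_sentence, retrieved_chunks):
    """Return the 0-based rank of the first chunk containing the sentence, or None."""
    needle = normalize(expected_sentence)
    for rank, chunk in enumerate(retrieved_chunks):
        if needle in normalize(chunk):
            return rank
    return None


# Toy chunks standing in for the documents returned by a query, best match first
toy_chunks = [
    "4.1.  Challenges\n\nA quantum network cannot be built by simply\n"
    "extrapolating all the classical models to their quantum analogues.",
    "The fundamental unit of quantum information, the qubit, cannot be\n"
    "       actively monitored, as any readout irreversibly destroys its\n"
    "       contents.",
]
expected = ("the qubit, cannot be actively monitored, as any readout "
            "irreversibly destroys its contents.")

print(hit_rank(expected, toy_chunks))
```

Matching on a single sentence rather than the whole answer is deliberate: some of the prepared answers contain elisions, and a sentence that straddles two chunks would not match either of them.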

## Thoughts

Is the relative distance an indication of relevance?
The distances don't seem to be normalized.
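
One possible explanation for the scale, offered as an assumption rather than something verified in this notebook: if the collection uses a squared-L2 distance and the embedding vectors are unit length, then the distance `d` and the cosine similarity are related by `d = 2 - 2 * cos`, so `d` ranges from 0 to 4 rather than 0 to 1. The sketch below checks that identity on toy vectors in plain Python; `l2sq_to_cosine` is a hypothetical helper name.

```python
import math


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def l2sq_to_cosine(d):
    """Convert squared-L2 distance to cosine similarity (valid for unit vectors only)."""
    return 1.0 - d / 2.0


# Toy vectors, not real embeddings
a = unit([1.0, 2.0, 3.0])
b = unit([2.0, 0.5, 1.0])
d = squared_l2(a, b)

# The two values printed here should agree up to floating-point error
print(l2sq_to_cosine(d), cosine(a, b))
```

If both assumptions hold for this collection, the best distance of about 0.65 seen above would correspond to a cosine similarity of roughly 0.68. Whether they hold depends on the collection's configured distance function and on the embedding model, which would need to be checked.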

