# Synthetic Data Generation

## Introduction

We generate three types of dataset:
* Duplicates: a selected number of documents are paraphrases of others, very similar in content without being copy-pasted.
* Synergy: two documents are both required to answer the question. If either one is missing, the LLM cannot answer.
* Complementary: two documents each provide part of the answer to the question.

Each dataset has 10 samples, each composed of 1 question, 10 documents (the context), and 1 answer. The "positive" documents are identified by the letters A and B, which serve as document ids.
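The layout described above can be checked mechanically before a dataset is used. The following is a minimal sketch, not part of the original notebook: the helper name `validate_dataset` and the constant `N_DOCS` are hypothetical, and the check assumes the dataset is held in a dict of three parallel lists (`question`, `context`, `answer`).

```python
from typing import Dict, List

N_DOCS = 10  # documents per sample, as stated in the introduction (assumed constant)


def validate_dataset(data: Dict[str, List]) -> None:
    """Raise ValueError if `data` does not follow the expected layout."""
    n = len(data["question"])
    if not (len(data["context"]) == n and len(data["answer"]) == n):
        raise ValueError("question, context and answer must have the same length")
    for i, docs in enumerate(data["context"]):
        if len(docs) != N_DOCS:
            raise ValueError(f"sample {i} has {len(docs)} documents, expected {N_DOCS}")
        # Paraphrases are allowed, verbatim copies are not.
        if len(set(docs)) != len(docs):
            raise ValueError(f"sample {i} contains verbatim duplicate documents")
```

Calling `validate_dataset(data)` once all samples have been appended would surface a sample with a missing or repeated document early.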

## Imports and Setup

In [18]:
import pandas as pd
import string
import numpy as np

### Utils

In [22]:
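def create_id(df: pd.DataFrame) -> pd.DataFrame:
    """Add an "id" column with one letter (A, B, C, ...) per context document."""
    # Assumes every row holds as many documents as the first row.
    letters = list(string.ascii_uppercase[: len(df["context"].iloc[0])])
    # Repeat the same letter list for every row of the DataFrame.
    df["id"] = np.vstack([letters] * df.shape[0]).tolist()
    return df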
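
As an illustration of what `create_id` produces, here is a small sketch on a toy DataFrame with three documents per row. The toy data is made up for the example, and a compact restatement of `create_id` is included only so the snippet runs on its own. The multi-column `explode` call assumes pandas 1.3 or later.

```python
import string

import numpy as np
import pandas as pd


def create_id(df: pd.DataFrame) -> pd.DataFrame:
    # Restated from the Utils cell so this snippet is self-contained.
    letters = list(string.ascii_uppercase[: len(df["context"].iloc[0])])
    df["id"] = np.vstack([letters] * df.shape[0]).tolist()
    return df


# Toy data (hypothetical): two samples with three documents each.
toy = pd.DataFrame({
    "question": ["q1", "q2"],
    "context": [["doc a", "doc b", "doc c"], ["doc d", "doc e", "doc f"]],
    "answer": ["a1", "a2"],
})
toy = create_id(toy)
print(toy["id"].tolist())  # [['A', 'B', 'C'], ['A', 'B', 'C']]

# One row per (document, id) pair; requires pandas >= 1.3.
long = toy.explode(["context", "id"], ignore_index=True)
```

With the positives always placed first in each context list, the ids A and B then point at the two "positive" documents.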

## Data Generation

### Duplicated Dataset

#### Generation

In [24]:
data = {
    "question": [],
    "context": [],
    "answer": []
}

# --- Sample 1 ---
data["question"].append("What is the primary component of a Xylotian 'Glimmer-sail'?")
data["context"].append([
    "The Glimmer-sails of Xylos are renowned for their ethereal glow, primarily due to interwoven strands of light-sensitive 'Aether-fiber'.", # Golden 1
    "Xylotian sky-ships utilize sails woven from Aether-fiber, a material that reacts to ambient stellar radiation to provide gentle propulsion.", # Golden 2
    "The most common pet on Xylos is the six-legged 'Fuzznugget'.", # Hard Negative
    "Xylotian cuisine often features the bioluminescent 'Star-Kelp'.", # Soft Negative
    "The atmospheric pressure on Xylos is significantly lower than on Earth.", # Soft Negative
    "Aether-fiber is also used in Xylotian ceremonial robes for its unique shimmer.", # Soft Negative
    "Xylotian navigators use crystal charts to plot courses through the nebulae.", # Soft Negative
    "The 'Sky-Lamps' of Xylotian cities are powered by captured solar winds.", # Soft Negative
    "Learning to weave Aether-fiber is a traditional skill passed down through generations.", # Soft Negative
    "Xylotian dwellings are often built from solidified volcanic glass." # Soft Negative
])
data["answer"].append("The primary component of a Xylotian 'Glimmer-sail' is 'Aether-fiber'.")

# --- Sample 2 ---
data["question"].append("How do Xylotians communicate over long distances on Xylos?")
data["context"].append([
    "Xylotians employ networks of 'Resonance Towers' that amplify thought-patterns for long-range intra-planetary messaging.", # Golden 1
    "Long-distance communication across Xylos is facilitated by a system of Resonance Towers, which transmit amplified mental signals.", # Golden 2
    "The annual 'Festival of Lights' on Xylos celebrates the alignment of its twin moons.", # Hard Negative
    "For interplanetary communication, Xylotians use 'Echo-Crystals' which resonate with psychic imprints.", # Soft Negative (interplanetary, not on-planet)
    "The 'Whisper-Winds' of Xylos carry sounds for many miles, but are unreliable for direct communication.", # Soft Negative
    "The official language of Xylos has over five thousand unique pictograms.", # Soft Negative
    "Xylotian musical instruments are often carved from resonant 'Singing Woods'.", # Soft Negative
    "Short-range Xylotian communication often involves subtle color shifts in their skin patterns.", # Soft Negative
    "Resonance Towers require periodic recalibration by 'Crystal-Tuners'.", # Soft Negative
    "The geology of Xylos is rich in conductive minerals." # Soft Negative
])
data["answer"].append("Xylotians use 'Resonance Towers' to amplify thought-patterns for long-distance communication on Xylos.")

# --- Sample 3 ---
data["question"].append("What is the main energy source for Xylotian 'Hover-platforms'?")
data["context"].append([
    "The personal Hover-platforms used by Xylotians are typically powered by 'Kineti-Gems', which store and release kinetic energy.", # Golden 1
    "Kineti-Gems provide the necessary lift for Xylotian Hover-platforms by converting stored motional energy.", # Golden 2
    "The primary export of Xylos is refined 'Luma-Crystals'.", # Hard Negative
    "Xylotian starships use 'Void-Core' engines for faster-than-light travel.", # Soft Negative
    "Kineti-Gems need to be 'recharged' by physical movement, like walking or running.", # Soft Negative
    "The stability of Hover-platforms is maintained by gyroscopic balancers.", # Soft Negative
    "Xylotian architecture often incorporates anti-gravity elements for aesthetic purposes.", # Soft Negative
    "The lifespan of a Kineti-Gem is approximately ten Xylotian cycles.", # Soft Negative
    "Xylos has three suns, leading to complex day-night cycles.", # Soft Negative
    "Hover-platforms are restricted to altitudes below 500 Xylo-feet for safety." # Soft Negative
])
data["answer"].append("The main energy source for Xylotian 'Hover-platforms' is 'Kineti-Gems'.")

# --- Sample 4 ---
data["question"].append("What unique property does 'Chrono-Dust' possess according to Xylotian lore?")
data["context"].append([
    "Xylotian legends speak of Chrono-Dust, a rare substance said to temporarily crystallize moments in time where it settles.", # Golden 1
    "It is believed by Xylotian mystics that Chrono-Dust has the ability to solidify a fleeting moment, making it briefly observable as a static crystal.", # Golden 2
    "The 'Great Xylotian Library' contains records dating back millennia.", # Hard Negative
    "Xylotian healers use 'Bio-Resonant Frequencies' to mend injuries.", # Soft Negative
    "Chrono-Dust is rumored to be found only in the 'Echoing Caves' during a temporal anomaly.", # Soft Negative
    "The concept of linear time is debated among Xylotian philosophers.", # Soft Negative
    "Xylotian artists are known for their intricate sculptures made from 'Shadow-Glass'.", # Soft Negative
    "Many try to find Chrono-Dust, but its existence is unconfirmed by Xylotian science.", # Soft Negative
    "The effects of Chrono-Dust are said to be very short-lived and localized.", # Soft Negative
    "Xylotian children play a game called 'Star-Hop' among the floating islands." # Soft Negative
])
data["answer"].append("According to Xylotian lore, 'Chrono-Dust' possesses the property of temporarily crystallizing moments in time.")

# --- Sample 5 ---
data["question"].append("What are 'Dream-Weavers' used for in Xylotian society?")
data["context"].append([
    "Xylotian society values communal well-being, and 'Dream-Weavers' are devices used to harmonize collective subconscious states during designated rest periods.", # Golden 1
    "To foster empathy and shared understanding, Xylotians utilize 'Dream-Weavers' to link and gently guide collective dream experiences.", # Golden 2
    "Xylotian agriculture relies on 'Hydro-Synth' units for water recycling.", # Hard Negative
    "The 'Night-Orbs' of Xylos provide gentle illumination after sunset.", # Soft Negative
    "Xylotian education involves 'Morphic Learning Crystals' that adapt to the student's pace.", # Soft Negative
    "Dream interpretation is a respected skill among Xylotian elders.", # Soft Negative
    "The patterns generated by Dream-Weavers are often incorporated into Xylotian art.", # Soft Negative
    "Access to Dream-Weavers is typically managed by community 'Mind-Harmonizers'.", # Soft Negative
    "Individual dream recall can be enhanced by consuming 'Nocta-Berries'.", # Soft Negative
    "Xylos has a unique flora that blooms only under the light of its twin moons." # Soft Negative
])
data["answer"].append("In Xylotian society, 'Dream-Weavers' are used to harmonize collective subconscious states or guide collective dream experiences.")

# --- Sample 6 ---
data["question"].append("What is the purpose of the 'Aqua-Harmonics' system in Xylotian underwater domes?")
data["context"].append([
    "The 'Aqua-Harmonics' system installed in Xylotian sub-aquatic habitats generates specific sonic frequencies to gently repel aggressive marine megafauna.", # Golden 1
    "To ensure the safety of their underwater domes, Xylotians employ 'Aqua-Harmonics', which use sound waves as a deterrent against large, hostile sea creatures.", # Golden 2
    "The 'Glimmering Caves' of Xylos are a popular tourist destination for off-worlders.", # Hard Negative
    "Xylotian marine biologists study the 'Coral-Song' of the sentient reefs.", # Soft Negative
    "The domes themselves are constructed from transparent 'Plasteel-Alloy'.", # Soft Negative
    "Internal atmospheric pressure within the domes is carefully regulated.", # Soft Negative
    "Aqua-Harmonics also incidentally promotes the growth of certain beneficial algae.", # Soft Negative
    "The energy for Aqua-Harmonics is drawn from tidal generators.", # Soft Negative
    "Xylotian diet heavily features cultivated sea-vegetables from these domes.", # Soft Negative
    "Communication between domes is achieved via light-pulse cables." # Soft Negative
])
data["answer"].append("The 'Aqua-Harmonics' system in Xylotian underwater domes is used to repel aggressive marine megafauna using sonic frequencies.")

# --- Sample 7 ---
data["question"].append("What is 'Solaris Silk' primarily used for by the Xylotians?")
data["context"].append([
    "Xylotians craft their high-altitude thermal cloaks from 'Solaris Silk', a material that efficiently traps and radiates solar energy.", # Golden 1
    "The primary application of 'Solaris Silk' among Xylotians is in the creation of thermal cloaks for protection against the cold of Xylos's upper atmosphere, due to its solar absorption properties.", # Golden 2
    "Xylotian currency is based on polished 'Geo-Stones'.", # Hard Negative
    "'Luna-Weave' is another Xylotian fabric, known for its reflective properties.", # Soft Negative
    "Xylotian astronomers use 'Star-Gazer' telescopes to observe distant galaxies.", # Soft Negative
    "The production of Solaris Silk involves cultivating 'Sun-Moths' in specialized bio-domes.", # Soft Negative
    "Solaris Silk changes color slightly depending on the intensity of absorbed light.", # Soft Negative
    "These thermal cloaks are essential for Xylotians who pilot 'Strato-Gliders'.", # Soft Negative
    "The weaving patterns of Solaris Silk often depict celestial constellations.", # Soft Negative
    "Xylos experiences extreme temperature variations between its shadowed and sunlit sides." # Soft Negative
])
data["answer"].append("'Solaris Silk' is primarily used by Xylotians for crafting high-altitude thermal cloaks that trap solar energy.")

# --- Sample 8 ---
data["question"].append("How is 'Mind-Sculpting' primarily utilized in Xylotian education?")
data["context"].append([
    "In Xylotian advanced education, 'Mind-Sculpting' is a technique used to help students visualize and internalize complex abstract concepts by shaping mental constructs.", # Golden 1
    "The primary use of 'Mind-Sculpting' within the Xylotian educational system is to aid in the comprehension of intricate, non-physical ideas through guided mental visualization.", # Golden 2
    "Xylotian cuisine often incorporates 'Flavor-Crystals' that change taste based on temperature.", # Hard Negative
    "'Memory-Crystals' are used by Xylotians for long-term information storage.", # Soft Negative
    "Xylotian children learn basic arithmetic using 'Abacus-Beads' made of luminous stone.", # Soft Negative
    "The process of Mind-Sculpting requires a trained 'Cognitive Guide'.", # Soft Negative
    "Ethical guidelines strictly regulate the application of Mind-Sculpting.", # Soft Negative
    "Mind-Sculpting is not used for altering memories, only for conceptual understanding.", # Soft Negative
    "Students practice Mind-Sculpting in 'Meditation Chambers' to enhance focus.", # Soft Negative
    "The Xylotian alphabet is phonetic and relatively easy to learn." # Soft Negative
])
data["answer"].append("'Mind-Sculpting' is primarily utilized in Xylotian education to help students visualize and internalize complex abstract concepts.")

# --- Sample 9 ---
data["question"].append("What is the function of 'Geo-Stabilizers' in Xylotian floating cities?")
data["context"].append([
    "Xylotian floating cities rely on massive 'Geo-Stabilizers' embedded deep within their foundational platforms to counteract atmospheric turbulence and maintain altitude.", # Golden 1
    "The 'Geo-Stabilizers' are crucial for the Xylotian sky-cities, as their function is to provide stability against strong winds and ensure the city remains at its designated elevation.", # Golden 2
    "Xylotian traditional music often features the 'Wind-Harp', an instrument played by atmospheric currents.", # Hard Negative
    "Power for the floating cities is primarily drawn from 'Atmo-Capacitors'.", # Soft Negative
    "The 'Sky-Gardens' of these cities cultivate rare, high-altitude flora.", # Soft Negative
    "Inter-city transport is managed by a network of 'Aerial Ferries'.", # Soft Negative
    "Geo-Stabilizers require constant monitoring and fine-tuning by 'Altitude Engineers'.", # Soft Negative
    "The design of Xylotian floating cities incorporates principles of 'Aero-Harmony'.", # Soft Negative
    "The largest floating city, 'Aeria Prime', houses the Xylotian Council.", # Soft Negative
    "Early prototypes of Geo-Stabilizers were much less reliable." # Soft Negative
])
data["answer"].append("The function of 'Geo-Stabilizers' in Xylotian floating cities is to counteract atmospheric turbulence and maintain altitude.")

# --- Sample 10 ---
data["question"].append("What are 'Spirit-Stones' believed to store according to Xylotian spiritual beliefs?")
data["context"].append([
    "Xylotian spiritual traditions hold that 'Spirit-Stones', often passed down through generations, are capable of storing the ancestral memories and emotional essences of their former keepers.", # Golden 1
    "According to Xylotian mysticism, 'Spirit-Stones' serve as repositories for the life experiences and core emotions of ancestors who once possessed them.", # Golden 2
    "Xylotian clothing often incorporates 'Symbiont-Fibers' that react to the wearer's mood.", # Hard Negative
    "'Focus-Crystals' are used by Xylotian artisans to channel creative energy.", # Soft Negative
    "The 'Temple of Whispers' on Xylos is said to amplify psychic energies.", # Soft Negative
    "Spirit-Stones are typically kept in ornate 'Memory-Shrines' within Xylotian homes.", # Soft Negative
    "The 'Xylotian Book of Origins' details their creation myths.", # Soft Negative
    "Only 'Stone-Seers' are believed to be able to fully interpret the contents of a Spirit-Stone.", # Soft Negative
    "The color and clarity of a Spirit-Stone are thought to reflect the nature of the stored essences.", # Soft Negative
    "Xylotian funeral rites involve a 'Return to the Stars' ceremony." # Soft Negative
])
data["answer"].append("According to Xylotian spiritual beliefs, 'Spirit-Stones' are believed to store ancestral memories and emotional essences.")
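
In the duplicates dataset, the two golden documents should be paraphrases rather than copies. One rough way to sanity-check this is a token-overlap score. The sketch below is an assumption-laden illustration, not the method used in this notebook: the helper names and the `low`/`high` thresholds are hypothetical and would need tuning.

```python
import re


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two strings."""
    ta = set(re.findall(r"[a-z0-9]+", a.lower()))
    tb = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def check_paraphrase_pair(doc_a: str, doc_b: str,
                          low: float = 0.1, high: float = 0.9) -> bool:
    """True if the pair shares vocabulary but is not near-verbatim.

    The thresholds are illustrative defaults, not calibrated values.
    """
    if doc_a.strip() == doc_b.strip():
        return False
    return low <= token_jaccard(doc_a, doc_b) <= high
```

Lexical overlap is only a proxy: two paraphrases can share very few words, so a low score does not by itself mean the pair is unrelated.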

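# Second batch: general-knowledge samples.
# Note: `data` is re-initialised here, so the Xylos samples above are dropped
# unless they were stored (e.g. in a DataFrame) beforehand.
data = {
    "question": [],
    "context": [],
    "answer": []
}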

# --- Entry 1: Solar System (Mars) ---
data["question"].append("What is the name of the largest volcano on Mars?")
data["answer"].append("Olympus Mons.")
data["context"].append([
    # Golden Documents
    "Mars, the fourth planet from the Sun, is home to Olympus Mons, an enormous shield volcano which is the largest volcano and highest known mountain in our Solar System.",
    "The colossal volcano Olympus Mons on Mars stands nearly three times the height of Mount Everest, making it a dominant feature of the Martian landscape.",
    # Distractor Documents
    "Mars is often called the 'Red Planet' due to iron oxide prevalent on its surface.",
    "The atmosphere of Mars is very thin, primarily composed of carbon dioxide.",
    "NASA's Perseverance rover is currently exploring Jezero Crater on Mars.",
    "Phobos and Deimos are the two small, irregularly shaped moons of Mars.",
    "Evidence suggests that liquid water once flowed on the surface of Mars.",
    "A Martian day (sol) is just slightly longer than an Earth day.",
    "Mars has polar ice caps that grow and recede with the seasons.",
    "The average temperature on Mars is about -80 degrees Fahrenheit (-62 degrees Celsius)."
])

# --- Entry 2: Literature (Shakespeare) ---
data["question"].append("In which Shakespearean play does the line 'To be, or not to be' appear?")
data["answer"].append("Hamlet.")
data["context"].append([
    # Golden Documents
    "The famous soliloquy 'To be, or not to be, that is the question' is delivered by Prince Hamlet in Act 3, Scene 1 of William Shakespeare's tragedy, *Hamlet*.",
    "William Shakespeare's renowned play, *Hamlet*, features the iconic line 'To be, or not to be' where the protagonist contemplates life and death.",
    # Distractor Documents
    "William Shakespeare was an English playwright, poet, and actor, widely regarded as the greatest writer in the English language.",
    "*Romeo and Juliet* is another famous tragedy by Shakespeare, telling the story of two young star-crossed lovers whose deaths ultimately reconcile their feuding families.",
    "Shakespeare's plays are typically categorized into three genres: comedy, history, and tragedy.",
    "The Globe Theatre in London is famously associated with William Shakespeare, where many of his plays were first performed.",
    "Shakespeare wrote approximately 39 plays, 154 sonnets, and two long narrative poems.",
    "Common themes in Shakespeare's works include love, loss, ambition, revenge, and fate.",
    "The language used in Shakespeare's time, Early Modern English, can be challenging for modern readers but is rich in metaphor and imagery.",
    "*Macbeth* is a tragedy by William Shakespeare about a Scottish general who, spurred by a prophecy, murders the king to seize the throne."
])

# --- Entry 3: Biology (Photosynthesis) ---
data["question"].append("What are the main products of photosynthesis?")
data["answer"].append("Glucose and oxygen.")
data["context"].append([
    # Golden Documents
    "Photosynthesis is the process used by plants, algae, and some bacteria to convert light energy into chemical energy, producing glucose (a sugar) for food and releasing oxygen as a byproduct.",
    "The two primary outputs of the photosynthetic process are oxygen, which is released into the atmosphere, and glucose, which serves as the plant's energy source.",
    # Distractor Documents
    "Chlorophyll is the green pigment found in chloroplasts that absorbs light energy for photosynthesis.",
    "Plants require carbon dioxide, water, and sunlight for photosynthesis to occur.",
    "Cellular respiration is the process by which organisms break down glucose to release energy, often consuming oxygen.",
    "Stomata are small pores on the surface of leaves that allow for gas exchange (CO2 in, O2 out).",
    "The Calvin cycle is a part of photosynthesis where carbon dioxide is converted into sugar.",
    "Different types of plants have adapted photosynthesis to various environmental conditions, such as C4 and CAM photosynthesis.",
    "Light-dependent reactions in photosynthesis capture energy from sunlight and store it in ATP and NADPH.",
    "Autotrophs are organisms that can produce their own food, primarily through photosynthesis."
])

# --- Entry 4: History (World War II) ---
data["question"].append("Which event is generally considered the start of World War II in Europe?")
data["answer"].append("Germany's invasion of Poland.")
data["context"].append([
    # Golden Documents
    "World War II in Europe is widely regarded to have begun on September 1, 1939, when Germany, under Adolf Hitler, invaded Poland.",
    "The invasion of Poland by German forces on September 1, 1939, triggered declarations of war by France and the United Kingdom, marking the start of World War II in Europe.",
    # Distractor Documents
    "The Treaty of Versailles, which ended World War I, imposed heavy reparations on Germany and is often cited as a long-term cause of World War II.",
    "The attack on Pearl Harbor by Japan on December 7, 1941, brought the United States into World War II.",
    "The Battle of Stalingrad was a major turning point on the Eastern Front during World War II.",
    "The D-Day landings in Normandy on June 6, 1944, marked the beginning of the liberation of Western Europe.",
    "The Axis powers primarily consisted of Germany, Italy, and Japan.",
    "The Allied powers included Great Britain, the United States, the Soviet Union, and China, among others.",
    "The Holocaust was the genocide of approximately six million Jews by the Nazi regime and its collaborators.",
    "World War II ended in 1945 with the surrender of Germany in May and Japan in September."
])

# --- Entry 5: Technology (Internet) ---
data["question"].append("What does the acronym 'HTTP' stand for?")
data["answer"].append("Hypertext Transfer Protocol.")
data["context"].append([
    # Golden Documents
    "HTTP, which stands for Hypertext Transfer Protocol, is the foundation of data communication for the World Wide Web.",
    "The acronym HTTP refers to Hypertext Transfer Protocol, the protocol used to request and transmit files, especially web pages and web page components, over the internet.",
    # Distractor Documents
    "The internet is a global network of interconnected computers.",
    "Tim Berners-Lee is credited with inventing the World Wide Web in 1989.",
    "An IP address is a unique numerical label assigned to each device connected to a computer network.",
    "TCP/IP is the suite of communication protocols used to interconnect network devices on the internet.",
    "A URL (Uniform Resource Locator) is another name for a web address.",
    "HTML (Hypertext Markup Language) is the standard markup language for creating web pages.",
    "A web browser is a software application for accessing information on the World Wide Web.",
    "DNS (Domain Name System) translates human-readable domain names (like www.google.com) into machine-readable IP addresses."
])

# --- Entry 6: Geography (Rivers) ---
data["question"].append("What is the longest river in the world?")
# The Nile vs. Amazon ranking is sometimes disputed; the Nile is used as the reference answer here.
data["answer"].append("The Nile River.")
data["context"].append([
    # Golden Documents
    "The Nile River, flowing through northeastern Africa, is traditionally considered the longest river in the world, stretching approximately 6,650 kilometers (4,132 miles).",
    "Spanning over 4,100 miles, the Nile is recognized globally as the longest river, providing a vital water source for countries like Egypt and Sudan.",
    # Distractor Documents
    "The Amazon River in South America has the largest discharge volume of any river in the world.",
    "The Mississippi River is the second-longest river in North America.",
    "Rivers play a crucial role in ecosystems, providing water for drinking, agriculture, and transportation.",
    "A river's delta is a landform created by deposition of sediment carried by the river as the flow leaves its mouth.",
    "The source of a river is the original point from which the river flows.",
    "Major rivers around the world include the Yangtze, Mekong, Congo, and Danube.",
    "Floodplains are areas of land adjacent to a river which are subject to flooding.",
    "Erosion and deposition are key geological processes associated with rivers shaping the landscape."
])

# --- Entry 7: Art (Impressionism) ---
data["question"].append("Who is often considered the 'father' of Impressionism?")
data["answer"].append("Claude Monet.")
data["context"].append([
    # Golden Documents
    "Claude Monet, known for works like 'Impression, soleil levant,' is widely regarded as a founder and the most consistent practitioner of the Impressionist movement, often dubbed its 'father'.",
    "Many art historians identify Claude Monet as the leading figure who pioneered the Impressionist style, earning him the informal title 'father of Impressionism'.",
    # Distractor Documents
    "Impressionism was a 19th-century art movement characterized by relatively small, thin, yet visible brush strokes and an emphasis on accurate depiction of light.",
    "Pierre-Auguste Renoir was another prominent Impressionist painter, known for his vibrant depictions of people and social life.",
    "Edgar Degas, famous for his paintings of dancers, was also a key figure in the Impressionist circle.",
    "Post-Impressionism was a diverse art movement that emerged in France around 1886 as a reaction against Impressionism's concern for the naturalistic depiction of light and colour.",
    "Artists like Vincent van Gogh and Paul Cézanne are considered Post-Impressionists.",
    "The term 'Impressionism' was initially coined derisively by a critic reviewing Monet's 'Impression, soleil levant'.",
    "Impressionist painters often painted en plein air (outdoors) to capture the fleeting effects of light and atmosphere.",
    "The invention of pre-mixed paint tubes allowed Impressionist artists more freedom to paint outside the studio."
])

# --- Entry 8: Music (Classical Composers) ---
data["question"].append("Which classical composer became deaf later in his life but continued to compose influential music?")
data["answer"].append("Ludwig van Beethoven.")
data["context"].append([
    # Golden Documents
    "Ludwig van Beethoven, a German composer and pianist, is one of the most revered figures in Western music, famously continuing to compose, conduct, and perform even after becoming completely deaf.",
    "Despite his increasing deafness, which began in his late twenties, Ludwig van Beethoven produced some of his most important works, including his late string quartets and the Ninth Symphony.",
    # Distractor Documents
    "Wolfgang Amadeus Mozart was a prolific and influential composer of the Classical period, who died at a young age.",
    "Johann Sebastian Bach was a German composer of the Baroque period, known for his complex counterpoint.",
    "The Classical period in music roughly spanned from 1750 to 1820.",
    "A symphony is an extended musical composition, typically for orchestra, in several movements.",
    "Common instruments in a classical orchestra include strings, woodwinds, brass, and percussion.",
    "Franz Schubert was an Austrian composer known for his Lieder (songs) and symphonies.",
    "Chamber music is composed for a small group of instruments, traditionally one that could fit in a palace chamber.",
    "Opera combines music, drama, and spectacle, with singers performing theatrical roles."
])

# --- Entry 9: Physics (Gravity) ---
data["question"].append("What fundamental force is responsible for keeping planets in orbit around stars?")
data["answer"].append("Gravity.")
data["context"].append([
    # Golden Documents
    "Gravity, or gravitation, is the fundamental force of attraction that acts between all objects with mass, and it is this force that keeps planets in orbit around their stars.",
    "The orbits of planets, moons, and satellites are primarily governed by the force of gravity, which pulls celestial bodies towards each other.",
    # Distractor Documents
    "Isaac Newton formulated the law of universal gravitation in the 17th century.",
    "Albert Einstein's theory of general relativity provides a more modern description of gravity as a curvature of spacetime.",
    "The electromagnetic force is responsible for interactions between electrically charged particles.",
    "The strong nuclear force binds protons and neutrons together in an atom's nucleus.",
    "The weak nuclear force is responsible for radioactive decay.",
    "Mass is a measure of the amount of matter in an object, while weight is the force of gravity acting on that mass.",
    "Black holes are regions of spacetime where gravity is so strong that nothing, not even light, can escape.",
    "Escape velocity is the minimum speed needed for an object to break free from the gravitational attraction of a celestial body."
])

# --- Entry 10: Chemistry (Atoms) ---
data["question"].append("What are the three main subatomic particles that make up an atom?")
data["answer"].append("Protons, neutrons, and electrons.")
data["context"].append([
    # Golden Documents
    "An atom is composed of three primary types of subatomic particles: protons and neutrons, which form the nucleus, and electrons, which orbit the nucleus.",
    "The fundamental building blocks of an atom are its subatomic particles, which include positively charged protons, neutral neutrons, and negatively charged electrons.",
    # Distractor Documents
    "The atomic number of an element is determined by the number of protons in its nucleus.",
    "Isotopes of an element have the same number of protons but different numbers of neutrons.",
    "Electrons occupy specific energy levels or shells around the nucleus.",
    "The periodic table organizes elements based on their atomic number and chemical properties.",
    "A molecule is formed when two or more atoms are held together by chemical bonds.",
    "Ions are atoms or molecules that have gained or lost electrons, resulting in a net electrical charge.",
    "Nuclear fission is the process where the nucleus of an atom splits into smaller parts.",
    "Valence electrons are the electrons in the outermost shell of an atom, involved in chemical bonding."
])

# --- Entry 11: Economics (Supply and Demand) ---
data["question"].append("What happens to the price of a good when demand exceeds supply?")
data["answer"].append("The price tends to increase.")
data["context"].append([
    # Golden Documents
    "In economics, when the demand for a particular good or service outstrips its available supply, there is upward pressure on its price, leading to an increase.",
    "The law of supply and demand dictates that if demand for a product is higher than the supply available, its price will typically rise as consumers compete for limited stock.",
    # Distractor Documents
    "Supply is the amount of a good or service that producers are willing and able to offer for sale at a given price.",
    "Demand is the quantity of a good or service that consumers are willing and able to purchase at various prices.",
    "Market equilibrium occurs when the quantity supplied equals the quantity demanded, resulting in a stable price.",
    "Inflation is a general increase in prices and fall in the purchasing value of money.",
    "A monopoly exists when a single company or group owns all or nearly all of the market for a given type of product or service.",
    "GDP (Gross Domestic Product) is the total monetary or market value of all the finished goods and services produced within a country's borders in a specific time period.",
    "Elasticity of demand measures how responsive the quantity demanded is to a change in price.",
    "Scarcity is the fundamental economic problem of having seemingly unlimited human wants in a world of limited resources."
])

# --- Entry 12: Mythology (Greek Gods) ---
data["question"].append("Who is the king of the gods in Greek mythology?")
data["answer"].append("Zeus.")
data["context"].append([
    # Golden Documents
    "In the Greek pantheon, Zeus is the supreme deity, ruling as the king of the gods and the god of sky, thunder, and lightning, residing on Mount Olympus.",
    "Zeus, son of Cronus and Rhea, wielded the thunderbolt and was recognized as the sovereign leader among the Olympian gods in ancient Greek mythology.",
    # Distractor Documents
    "Hera was Zeus's wife and sister, and the goddess of marriage and childbirth.",
    "Poseidon, Zeus's brother, was the god of the sea, earthquakes, and horses.",
    "Hades, another brother of Zeus, ruled the underworld.",
    "Athena was the goddess of wisdom, warfare, and crafts, and was one of Zeus's daughters.",
    "Apollo was the god of music, arts, knowledge, healing, plague, prophecy, poetry, and archery.",
    "Roman mythology largely adopted Greek gods, giving them Roman names; for example, Zeus became Jupiter.",
    "Mount Olympus was believed to be the home of the twelve Olympian gods.",
    "Greek mythology is a collection of myths and legends concerning the gods, heroes, and the nature of the ancient Greek world."
])

# --- Entry 13: Computer Science (Algorithms) ---
data["question"].append("What type of algorithm is Quicksort?")
data["answer"].append("A divide and conquer sorting algorithm.")
data["context"].append([
    # Golden Documents
    "Quicksort is an efficient sorting algorithm that follows the divide and conquer paradigm, picking an element as a pivot and partitioning the array around the pivot.",
    "As a prominent example of a divide and conquer algorithm, Quicksort works by recursively breaking down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly.",
    # Distractor Documents
    "An algorithm is a step-by-step procedure or formula for solving a problem or accomplishing a task.",
    "Bubble sort is a simple sorting algorithm that repeatedly steps through the list, compares adjacent elements and swaps them if they are in the wrong order.",
    "Big O notation is used to describe the performance or complexity of an algorithm.",
    "A data structure is a particular way of organizing and storing data in a computer so that it can be accessed and modified efficiently.",
    "Binary search is an efficient algorithm for finding an item from a sorted list of items.",
    "Greedy algorithms make locally optimal choices at each step with the hope of finding a global optimum.",
    "Dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems.",
    "Merge sort is another efficient, comparison-based, divide and conquer sorting algorithm."
])

# --- Entry 14: Film (Star Wars) ---
data["question"].append("What is the name of Luke Skywalker's father in the Star Wars saga?")
data["answer"].append("Anakin Skywalker (also known as Darth Vader).")
data["context"].append([
    # Golden Documents
    "In a pivotal moment in *Star Wars: The Empire Strikes Back*, Darth Vader, formerly known as Anakin Skywalker, reveals to Luke Skywalker that he is Luke's father.",
    "Luke Skywalker's parentage is a central theme in Star Wars, with his father being Anakin Skywalker, who later fell to the dark side and became Darth Vader.",
    # Distractor Documents
    "Star Wars is an American epic space-opera media franchise created by George Lucas.",
    "The Jedi are a fictional order of protectors in the Star Wars galaxy, known for their ability to use the Force.",
    "The Sith are the main antagonists in the Star Wars universe, depicted as the ancient enemies of the Jedi.",
    "Han Solo is a smuggler and captain of the Millennium Falcon, a close friend of Luke Skywalker.",
    "Princess Leia Organa is Luke Skywalker's twin sister and a leader in the Rebel Alliance.",
    "The Force is a metaphysical and ubiquitous energy field in the Star Wars fictional universe.",
    "Lightsabers are energy swords wielded by Jedi and Sith.",
    "The original Star Wars trilogy consists of *A New Hope* (1977), *The Empire Strikes Back* (1980), and *Return of the Jedi* (1983)."
])

# --- Entry 15: Health (Vitamins) ---
data["question"].append("Which vitamin is primarily obtained from sunlight exposure?")
data["answer"].append("Vitamin D.")
data["context"].append([
    # Golden Documents
    "Vitamin D, often called the 'sunshine vitamin,' is unique because it can be synthesized by the human body when the skin is exposed to sunlight.",
    "The primary natural source for humans to obtain Vitamin D is through skin exposure to ultraviolet B (UVB) rays from the sun, which triggers its synthesis.",
    # Distractor Documents
    "Vitamin C is an essential nutrient known for its role in supporting the immune system and is found in citrus fruits.",
    "B vitamins are a group of water-soluble vitamins that play important roles in cell metabolism.",
    "Vitamin A is important for vision, growth, cell division, reproduction, and immunity.",
    "A balanced diet typically provides most of the essential vitamins and minerals needed by the body.",
    "Vitamin K is crucial for blood clotting and bone health.",
    "Dietary sources of Vitamin D include fatty fish, egg yolks, and fortified foods like milk and cereals.",
    "Deficiency in certain vitamins can lead to various health problems.",
    "Minerals like calcium, iron, and zinc are also essential for bodily functions."
])

# --- Entry 16: Environment (Climate Change) ---
data["question"].append("What is the primary greenhouse gas responsible for current climate change?")
data["answer"].append("Carbon dioxide (CO2).")
data["context"].append([
    # Golden Documents
    "Carbon dioxide (CO2) is the most significant long-lived greenhouse gas in Earth's atmosphere, and its increasing concentration, primarily from burning fossil fuels, is the main driver of current climate change.",
    "While several greenhouse gases contribute to global warming, carbon dioxide (CO2) is considered the primary one due to its abundance and persistence in the atmosphere, largely resulting from human activities.",
    # Distractor Documents
    "Greenhouse gases trap heat in the Earth's atmosphere, leading to the greenhouse effect.",
    "Methane (CH4) is another potent greenhouse gas, produced by agriculture and natural gas systems.",
    "Nitrous oxide (N2O) is also a significant greenhouse gas, emitted from agricultural and industrial activities.",
    "Renewable energy sources like solar and wind power can help reduce greenhouse gas emissions.",
    "Deforestation contributes to climate change by reducing the number of trees that absorb CO2.",
    "The Paris Agreement is an international treaty aimed at limiting global warming.",
    "Sea level rise is one of the major consequences of climate change.",
    "Climate change can lead to more frequent and intense extreme weather events, such as heatwaves and hurricanes."
])

# --- Entry 17: Language (Etymology) ---
data["question"].append("What language family does English primarily belong to?")
data["answer"].append("The Germanic language family.")
data["context"].append([
    # Golden Documents
    "English is a West Germanic language that originated from Anglo-Frisian dialects brought to Britain in the mid-5th to 7th centuries AD by Anglo-Saxon migrants from what is now northwest Germany, southern Denmark and the Netherlands.",
    "As part of the Germanic branch of the Indo-European language family, English shares common ancestry with languages like German, Dutch, and Swedish.",
    # Distractor Documents
    "Many English words, especially in vocabulary related to law, government, and cuisine, have been borrowed from French due to the Norman Conquest.",
    "Latin and Ancient Greek have also significantly influenced English vocabulary, particularly in scientific and technical terms.",
    "A language family is a group of languages related through descent from a common ancestral language or parental language.",
    "Indo-European is a large language family comprising most of the languages of Europe as well as many in South Asia and West Asia.",
    "Romance languages, such as French, Spanish, Italian, and Portuguese, evolved from Vulgar Latin.",
    "Slavic languages include Russian, Polish, Czech, and Serbian.",
    "Syntax refers to the arrangement of words and phrases to create well-formed sentences in a language.",
    "Phonetics is the study of the sounds of human speech."
])

# --- Entry 18: Philosophy (Existentialism) ---
data["question"].append("Which philosopher is most famously associated with the phrase 'existence precedes essence'?")
data["answer"].append("Jean-Paul Sartre.")
data["context"].append([
    # Golden Documents
    "Jean-Paul Sartre, a key figure in existentialist philosophy, famously argued that 'existence precedes essence,' meaning individuals are born without a predetermined purpose and must define themselves through their actions.",
    "The core tenet of existentialism, 'existence precedes essence,' is most notably attributed to the French philosopher Jean-Paul Sartre, emphasizing radical freedom and responsibility.",
    # Distractor Documents
    "Existentialism is a philosophical movement that emphasizes individual existence, freedom, and choice.",
    "Albert Camus, another existentialist thinker, explored themes of absurdity and rebellion in works like *The Myth of Sisyphus*.",
    "Søren Kierkegaard is often considered the first existentialist philosopher, focusing on subjective truth and the leap of faith.",
    "Friedrich Nietzsche proclaimed 'God is dead' and explored concepts like the will to power and the Übermensch.",
    "Existentialist themes often include angst, despair, freedom, responsibility, and the meaninglessness of life.",
    "Simone de Beauvoir, a close associate of Sartre, contributed significantly to existentialist feminism with *The Second Sex*.",
    "Phenomenology, associated with Edmund Husserl, influenced many existentialist thinkers.",
    "Stoicism is an ancient Greek school of philosophy emphasizing virtue, reason, and living in accordance with nature."
])

# --- Entry 19: Space Exploration (Moon Landing) ---
data["question"].append("Who was the first human to walk on the Moon?")
data["answer"].append("Neil Armstrong.")
data["context"].append([
    # Golden Documents
    "On July 20, 1969, during the Apollo 11 mission, American astronaut Neil Armstrong became the first person to step onto the lunar surface.",
    "Neil Armstrong's historic first step on the Moon was a monumental achievement for humanity, famously accompanied by his words, 'That's one small step for [a] man, one giant leap for mankind.'",
    # Distractor Documents
    "The Apollo program was a NASA initiative designed to land humans on the Moon and return them safely to Earth.",
    "Buzz Aldrin was the second person to walk on the Moon, joining Armstrong shortly after.",
    "Michael Collins was the command module pilot for Apollo 11, orbiting the Moon while Armstrong and Aldrin were on the surface.",
    "The Space Race was a 20th-century competition between the United States and the Soviet Union for supremacy in spaceflight capability.",
    "Yuri Gagarin, a Soviet cosmonaut, was the first human to journey into outer space in 1961.",
    "The Saturn V rocket was the launch vehicle used for the Apollo missions to the Moon.",
    "Several more Apollo missions successfully landed humans on the Moon after Apollo 11.",
    "Future lunar missions aim to establish a more permanent human presence on the Moon."
])

# --- Entry 20: Business (Startups) ---
data["question"].append("What does 'MVP' stand for in the context of startups and product development?")
data["answer"].append("Minimum Viable Product.")
data["context"].append([
    # Golden Documents
    "In the startup world, MVP stands for Minimum Viable Product, which is a version of a new product that allows a team to collect the maximum amount of validated learning about customers with the least effort.",
    "The concept of a Minimum Viable Product (MVP) is central to the lean startup methodology, focusing on creating a basic product version to test market hypotheses and gather user feedback quickly.",
    # Distractor Documents
    "Venture capital (VC) is a form of private equity financing that is provided by venture capital firms or funds to startups and small businesses with perceived long-term growth potential.",
    "A startup is a young company founded by one or more entrepreneurs to develop a unique product or service and bring it to market.",
    "Pivoting in a startup context means changing a fundamental aspect of the business strategy after receiving feedback.",
    "Seed funding is the first official equity funding stage for a new company.",
    "Bootstrapping refers to building a company from the ground up with only personal savings and, possibly, the cash coming in from the first sales.",
    "A pitch deck is a brief presentation used to provide an audience with a quick overview of a business plan.",
    "User acquisition is the process of gaining new users for an app, platform, or other service.",
    "Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth."
])

# --- Entry 21: Astronomy (Black Holes) ---
data["question"].append("What is the boundary of a black hole beyond which nothing, not even light, can escape?")
data["answer"].append("The event horizon.")
data["context"].append([
    # Golden Documents
    "The event horizon of a black hole is defined as the 'point of no return,' the boundary around a black hole from which no light or any other radiation can escape because the gravitational pull is too strong.",
    "Known as the event horizon, the surface demarcating the region around a black hole from which it's impossible to escape represents the critical threshold of its immense gravity.",
    # Distractor Documents
    "Black holes are regions in spacetime where gravity is so strong that they absorb all light that hits them.",
    "Stephen Hawking proposed that black holes emit Hawking radiation, slowly losing mass over time.",
    "Supermassive black holes are believed to exist at the center of most galaxies, including our own Milky Way.",
    "A singularity is the theoretical point of infinite density at the center of a black hole.",
    "Black holes can be formed from the collapse of massive stars at the end of their lifecycle.",
    "Gravitational lensing is an effect where light from a distant source is bent around a massive object, like a black hole.",
    "Accretion disks are formed by matter spiraling into a black hole, often emitting intense X-rays.",
    "The first image of a black hole was captured by the Event Horizon Telescope collaboration in 2019."
])

# --- Entry 22: World Geography (Capitals) ---
data["question"].append("What is the capital city of Australia?")
data["answer"].append("Canberra.")
data["context"].append([
    # Golden Documents
    "Canberra, located in the Australian Capital Territory (ACT), was chosen as the capital city of Australia in 1908 as a compromise between rivals Sydney and Melbourne.",
    "The official capital city of the Commonwealth of Australia is Canberra, a purpose-built city designed by Walter Burley Griffin.",
    # Distractor Documents
    "Sydney is the most populous city in Australia and the state capital of New South Wales.",
    "Melbourne is the second-most populous city in Australia and the state capital of Victoria.",
    "Australia is both a continent and a country, part of the Oceania region.",
    "The Great Barrier Reef, off the coast of Queensland, is the world's largest coral reef system.",
    "The Australian Outback refers to the vast, remote, arid interior of Australia.",
    "The official currency of Australia is the Australian dollar (AUD).",
    "Indigenous Australians are the original inhabitants of the Australian continent and nearby islands.",
    "Uluru (Ayers Rock) is a large sandstone monolith in the southern part of the Northern Territory in central Australia."
])

# --- Entry 23: Sports (Olympics) ---
data["question"].append("In which city were the first modern Olympic Games held in 1896?")
data["answer"].append("Athens, Greece.")
data["context"].append([
    # Golden Documents
    "The inaugural Games of the modern Olympiad were held in Athens, Greece, in 1896, reviving the ancient Greek tradition.",
    "Athens, the capital of Greece, was the host city for the first international Olympic Games held in modern history, from April 6 to 15, 1896.",
    # Distractor Documents
    "The ancient Olympic Games were held in Olympia, Greece, from the 8th century BC to the 4th century AD.",
    "Pierre de Coubertin is considered the father of the modern Olympic Games.",
    "The Summer and Winter Olympic Games are each held every four years, staggered so that an Olympic Games takes place every two years.",
    "The five Olympic rings represent the five inhabited continents of the world.",
    "The traditional Olympic motto is 'Citius, Altius, Fortius,' which is Latin for 'Faster, Higher, Stronger'.",
    "The International Olympic Committee (IOC) is the governing body of the Olympic Movement.",
    "The lighting of the Olympic flame is a symbolic tradition that begins in Olympia, Greece.",
    "Paralympic Games are held shortly after the Olympic Games for athletes with disabilities."
])

# --- Entry 24: Medicine (Antibiotics) ---
data["question"].append("Who is credited with the discovery of penicillin, the first widely used antibiotic?")
data["answer"].append("Alexander Fleming.")
data["context"].append([
    # Golden Documents
    "Sir Alexander Fleming, a Scottish physician and microbiologist, is renowned for his discovery of penicillin in 1928, for which he shared the Nobel Prize in Physiology or Medicine.",
    "The accidental discovery of the antibiotic properties of penicillin by Alexander Fleming in 1928 revolutionized medicine by providing an effective treatment for many bacterial infections.",
    # Distractor Documents
    "Antibiotics are antimicrobial substances active against bacteria.",
    "Antibiotic resistance is a major public health concern where bacteria evolve to resist the effects of antibiotics.",
    "Penicillin works by interfering with the bacterial cell wall synthesis.",
    "Howard Florey and Ernst Chain were instrumental in developing penicillin for medical use on a large scale.",
    "Before antibiotics, bacterial infections like pneumonia and tuberculosis were often fatal.",
    "It is important to complete the full course of prescribed antibiotics to prevent resistance.",
    "Antiviral drugs are used to treat viral infections, not bacterial ones.",
    "Vaccines help prevent infections by stimulating the immune system to recognize and fight specific pathogens."
])

# --- Entry 25: Mathematics (Pi) ---
data["question"].append("What is the approximate numerical value of Pi (π) to two decimal places?")
data["answer"].append("3.14.")
data["context"].append([
    # Golden Documents
    "Pi (π) is a mathematical constant representing the ratio of a circle's circumference to its diameter, commonly approximated as 3.14.",
    "The numerical value of Pi (π), when rounded to two decimal places, is 3.14, though its decimal representation never ends and never enters a permanently repeating pattern.",
    # Distractor Documents
    "Pi is an irrational number, meaning it cannot be expressed as a simple fraction of two integers.",
    "Pi Day is celebrated on March 14th (3/14) due to the first three significant digits of π.",
    "Archimedes of Syracuse was one of the first mathematicians to calculate an accurate approximation of Pi.",
    "Pi is used in many formulas in geometry and trigonometry.",
    "The symbol π was first adopted by Welsh mathematician William Jones in 1706.",
    "Calculating Pi to an increasing number of digits has been a challenge for mathematicians and computer scientists.",
    "Tau (τ) is another mathematical constant related to circles, equal to 2π, approximately 6.28.",
    "Euler's number 'e' is another fundamental mathematical constant, approximately 2.718."
])

# --- Entry 26: Psychology (Classical Conditioning) ---
data["question"].append("Which psychologist is most famously associated with experiments on classical conditioning involving dogs salivating to a bell?")
data["answer"].append("Ivan Pavlov.")
data["context"].append([
    # Golden Documents
    "Ivan Pavlov, a Russian physiologist, is renowned for his experiments in classical conditioning, where he demonstrated that dogs could be conditioned to salivate at the sound of a bell if it was repeatedly paired with food.",
    "The foundational work on classical conditioning, notably the salivating dog experiments using a bell as a conditioned stimulus, was conducted by Ivan Pavlov.",
    # Distractor Documents
    "Classical conditioning is a learning process in which an association is made between a naturally existing stimulus and a previously neutral one.",
    "B.F. Skinner is known for his work on operant conditioning, which involves learning through rewards and punishments.",
    "John B. Watson was a pioneer of behaviorism, applying principles of classical conditioning to human emotions.",
    "The 'Little Albert' experiment is a famous example of classical conditioning in humans.",
    "In classical conditioning, an unconditioned stimulus (UCS) naturally elicits an unconditioned response (UCR).",
    "A conditioned stimulus (CS) is a previously neutral stimulus that, after association with a UCS, triggers a conditioned response (CR).",
    "Extinction in classical conditioning occurs when the conditioned stimulus is repeatedly presented without the unconditioned stimulus.",
    "Stimulus generalization is the tendency for a conditioned response to occur in response to stimuli that are similar to the conditioned stimulus."
])

# --- Entry 27: Cuisine (Pasta Origin) ---
data["question"].append("While many cultures have noodle dishes, which country is most famously associated with the origin and popularization of pasta as we know it today?")
data["answer"].append("Italy.")
data["context"].append([
    # Golden Documents
    "Although noodles exist in many cultures, Italy is universally recognized as the birthplace and popularizer of pasta in its diverse forms and culinary traditions.",
    "Italy is renowned for developing and popularizing a vast array of pasta shapes and dishes, making it central to the global understanding and appreciation of pasta.",
    # Distractor Documents
    "Noodles have a long history in China, with archaeological evidence suggesting they were made there thousands of years ago.",
    "Marco Polo is often anecdotally (and likely incorrectly) credited with bringing pasta to Italy from China.",
    "Durum wheat semolina is the preferred flour for making high-quality dried pasta.",
    "Pasta can be broadly categorized into dried (pasta secca) and fresh (pasta fresca).",
    "Popular pasta dishes include spaghetti bolognese, carbonara, lasagna, and fettuccine alfredo.",
    "Different regions of Italy are known for specific pasta shapes and sauces.",
    "Al dente is an Italian term describing pasta that is cooked to be firm to the bite.",
    "The word 'pasta' itself is Italian for 'paste', referring to the dough made from flour and water or eggs."
])

# --- Entry 28: Automotive (First Car) ---
data["question"].append("Who is generally credited with inventing the first practical gasoline-powered automobile?")
data["answer"].append("Karl Benz.")
data["context"].append([
    # Golden Documents
    "Karl Benz, a German engineer, is widely acknowledged for inventing the first practical automobile powered by an internal combustion gasoline engine, the Benz Patent-Motorwagen, in 1886.",
    "The pioneering work of Karl Benz led to the creation of the Benz Patent-Motorwagen in 1886, often cited as the world's first true gasoline-powered car.",
    # Distractor Documents
    "Henry Ford revolutionized automobile manufacturing with the introduction of the assembly line for the Ford Model T.",
    "Gottlieb Daimler, a contemporary of Benz, also developed early gasoline engines and automobiles.",
    "Early automobiles were often expensive and unreliable, more of a novelty than practical transportation.",
    "The internal combustion engine converts chemical energy from fuel into mechanical energy.",
    "Electric cars actually predate gasoline cars in some respects but faced limitations in battery technology.",
    "Bertha Benz, Karl's wife, undertook the first long-distance automobile journey, significantly publicizing the invention.",
    "The development of the automobile had a profound impact on society, leading to changes in urban planning, commerce, and personal freedom.",
    "Modern cars incorporate numerous safety features, such as airbags, anti-lock brakes (ABS), and electronic stability control (ESC)."
])

# --- Entry 29: Geology (Plate Tectonics) ---
data["question"].append("What is the theory that describes the large-scale motions of Earth's lithosphere, explaining phenomena like earthquakes and continental drift?")
data["answer"].append("Plate tectonics.")
data["context"].append([
    # Golden Documents
    "Plate tectonics is the scientific theory that Earth's outer shell is divided into several large and small plates that glide over the mantle, explaining earthquakes, volcanic activity, mountain-building, and the formation of ocean trenches.",
    "The theory of plate tectonics provides a comprehensive model for understanding the large-scale movements of Earth's lithospheric plates and their role in shaping geological features.",
    # Distractor Documents
    "Alfred Wegener first proposed the theory of continental drift in the early 20th century, which was a precursor to plate tectonics.",
    "The lithosphere is composed of the Earth's crust and the upper part of the mantle.",
    "There are three main types of plate boundaries: convergent (colliding), divergent (spreading), and transform (sliding past).",
    "Earthquakes often occur along fault lines, which are fractures in rock where movement has occurred, typically at plate boundaries.",
    "Subduction zones occur at convergent boundaries where one tectonic plate moves under another.",
    "The Ring of Fire is a major area in the basin of the Pacific Ocean where many earthquakes and volcanic eruptions occur.",
    "Seafloor spreading at mid-ocean ridges is a key mechanism driving plate tectonics.",
    "Pangaea was a supercontinent that existed during the late Paleozoic and early Mesozoic eras."
])

# --- Entry 30: Linguistics (Noam Chomsky) ---
data["question"].append("What groundbreaking linguistic theory, proposing an innate grammatical structure in humans, is Noam Chomsky known for?")
data["answer"].append("Universal Grammar (or Generative Grammar).")
data["context"].append([
    # Golden Documents
    "Noam Chomsky revolutionized linguistics with his theory of Universal Grammar, which posits that humans are born with an innate capacity and underlying grammatical framework for language acquisition.",
    "The concept of generative grammar, developed by Noam Chomsky, suggests that there's an inherent biological endowment in humans that allows them to learn and use complex language structures.",
    # Distractor Documents
    "Noam Chomsky is also a prominent political activist, philosopher, and social critic.",
    "Linguistics is the scientific study of language, its structure, and its use.",
    "Phonology studies the sound systems of languages.",
    "Syntax is concerned with the rules governing sentence structure.",
    "Semantics deals with the meaning of words and sentences.",
    "The 'poverty of the stimulus' argument by Chomsky suggests that children are not exposed to enough linguistic data to learn language purely through imitation.",
    "Before Chomsky, behaviorist theories suggested language was learned primarily through reinforcement and imitation.",
    "Sociolinguistics examines the relationship between language and society."
])

# --- Entry 31: Energy (Fossil Fuels) ---
data["question"].append("What are the three main types of fossil fuels?")
data["answer"].append("Coal, oil (petroleum), and natural gas.")
data["context"].append([
    # Golden Documents
    "The three principal categories of fossil fuels, formed from ancient organic matter over millions of years, are coal, petroleum (commonly known as oil), and natural gas.",
    "Coal, oil, and natural gas constitute the primary fossil fuels, which are carbon-rich deposits that are extracted and burned to produce energy.",
    # Distractor Documents
    "Fossil fuels are non-renewable energy sources, meaning their supplies are finite.",
    "The burning of fossil fuels is a major contributor to greenhouse gas emissions and climate change.",
    "Coal is a combustible black or brownish-black sedimentary rock, primarily used for electricity generation and industrial processes.",
    "Petroleum (oil) is a naturally occurring, yellowish-black liquid found beneath Earth's surface, which can be refined into various fuels like gasoline and diesel.",
    "Natural gas is a gaseous fossil fuel consisting primarily of methane, used for heating, cooking, and electricity generation.",
    "Renewable energy sources like solar, wind, and hydropower are alternatives to fossil fuels.",
    "Fracking (hydraulic fracturing) is a technique used to extract oil and natural gas from shale rock.",
    "Carbon capture and storage (CCS) technologies aim to trap CO2 emissions from burning fossil fuels."
])

# --- Entry 32: Ancient Civilizations (Egypt - Pyramids) ---
data["question"].append("For what primary purpose were the great pyramids of Giza built in ancient Egypt?")
data["answer"].append("As tombs for pharaohs.")
data["context"].append([
    # Golden Documents
    "The great pyramids of Giza, built during Egypt's Old Kingdom, served as elaborate tombs for pharaohs, designed to protect their bodies and possessions for the afterlife.",
    "Ancient Egyptians constructed the magnificent pyramids at Giza primarily as monumental burial places for their divine rulers, the pharaohs, ensuring their passage to the next world.",
    # Distractor Documents
    "The Great Pyramid of Giza was built for Pharaoh Khufu.",
    "The Sphinx, a colossal limestone statue with the body of a lion and a human head, is located near the Giza pyramids.",
    "Hieroglyphs were the formal writing system used in ancient Egypt.",
    "The Nile River was crucial to the development of ancient Egyptian civilization, providing water, fertile soil, and transportation.",
    "Mummification was a complex process used by ancient Egyptians to preserve bodies for the afterlife.",
    "Ancient Egyptian religion was polytheistic, with a pantheon of gods and goddesses like Ra, Osiris, and Isis.",
    "The Rosetta Stone was key to deciphering ancient Egyptian hieroglyphs.",
    "Pharaohs were considered divine rulers in ancient Egypt, intermediaries between the gods and the people."
])

# --- Entry 33: Modern Art (Pop Art - Warhol) ---
data["question"].append("Which American artist was a leading figure in the Pop Art movement, famous for his Campbell's Soup Cans and Marilyn Monroe screen prints?")
data["answer"].append("Andy Warhol.")
data["context"].append([
    # Golden Documents
    "Andy Warhol, a prominent American artist, director, and producer, was a central figure in the Pop Art movement, best known for his iconic works like the Campbell's Soup Cans and screen prints of Marilyn Monroe.",
    "With his distinctive screen-printed images of everyday objects like Campbell's Soup Cans and celebrities such as Marilyn Monroe, Andy Warhol became synonymous with the Pop Art visual art movement.",
    # Distractor Documents
    "Pop Art emerged in the mid-1950s in Britain and in the late 1950s in the United States.",
    "Pop Art often incorporates imagery from popular and mass culture, such as advertising, comic books, and mundane cultural objects.",
    "Roy Lichtenstein was another key Pop Art figure, known for his comic strip-inspired paintings.",
    "Claes Oldenburg was a Pop Art sculptor known for his large-scale replicas of everyday objects.",
    "The Factory was the name of Andy Warhol's New York City studio.",
    "Pop Art challenged traditions of fine art by including imagery from popular culture and employing techniques of mass production.",
    "Abstract Expressionism was a post–World War II art movement that preceded Pop Art and emphasized spontaneous, subconscious creation.",
    "Mass media and consumerism were common themes explored by Pop artists."
])

# --- Entry 34: Inventors (Light Bulb) ---
data["question"].append("Who is widely credited with developing the first commercially practical incandescent light bulb?")
data["answer"].append("Thomas Edison.")
data["context"].append([
    # Golden Documents
    "Thomas Alva Edison is renowned for developing a long-lasting, commercially practical incandescent light bulb in 1879, which significantly impacted modern life.",
    "While many inventors contributed to electric lighting, Thomas Edison's development of an effective and affordable incandescent light bulb in the late 19th century was a pivotal innovation for widespread use.",
    # Distractor Documents
    "An incandescent light bulb produces light by heating a wire filament until it glows.",
    "Edison also developed a complete electrical distribution system to power his light bulbs.",
    "Humphry Davy invented an early electric arc lamp in the early 1800s, long before Edison.",
    "Joseph Swan, a British inventor, also developed an early incandescent light bulb around the same time as Edison.",
    "Edison's laboratory in Menlo Park, New Jersey, was a pioneering industrial research facility.",
    "The 'war of currents' was a competition between Edison's direct current (DC) system and Westinghouse's alternating current (AC) system.",
    "Modern lighting technologies include fluorescent lamps, LEDs (Light Emitting Diodes), and halogen lamps.",
    "Thomas Edison held over 1,000 U.S. patents for his inventions."
])

# --- Entry 35: Marine Biology (Coral Reefs) ---
data["question"].append("What are the tiny marine animals, primarily responsible for building coral reefs, called?")
data["answer"].append("Coral polyps.")
data["context"].append([
    # Golden Documents
    "Coral reefs are intricate underwater ecosystems built by colonies of tiny marine animals known as coral polyps, which secrete calcium carbonate to form hard skeletons.",
    "The foundational builders of coral reef structures are coral polyps, small, colonial cnidarians that create vast formations from their exoskeletons.",
    # Distractor Documents
    "Coral reefs are often called 'rainforests of the sea' due to their high biodiversity.",
    "Zooxanthellae are symbiotic algae that live within the tissues of many coral polyps, providing them with nutrients through photosynthesis.",
    "Coral bleaching occurs when corals expel their zooxanthellae due to stress, often caused by rising water temperatures.",
    "The Great Barrier Reef in Australia is the world's largest coral reef system.",
    "Coral reefs provide important habitat for a wide variety of marine life.",
    "Ocean acidification, caused by increased CO2 absorption, poses a significant threat to coral reefs by making it harder for corals to build their skeletons.",
    "Coral reefs are found in warm, shallow, clear tropical and subtropical waters.",
    "Threats to coral reefs include climate change, pollution, overfishing, and coastal development."
])

# --- Entry 36: Botany (Trees - Sequoias) ---
data["question"].append("What is the name of the world's largest tree by volume, a giant sequoia located in California's Sequoia National Park?")
data["answer"].append("General Sherman.")
data["context"].append([
    # Golden Documents
    "The General Sherman tree, a magnificent giant sequoia (Sequoiadendron giganteum) found in California's Sequoia National Park, holds the title of the world's largest living tree by volume.",
    "By measure of sheer volume, the General Sherman is unparalleled among trees globally; this giant sequoia resides within Sequoia National Park in California.",
    # Distractor Documents
    "Giant sequoias are native to the western slopes of the Sierra Nevada mountains in California.",
    "Coast redwoods (Sequoia sempervirens) are the world's tallest trees, distinct from giant sequoias.",
    "Trees play a vital role in producing oxygen and absorbing carbon dioxide from the atmosphere.",
    "Dendrochronology is the science of dating tree rings to study past events and environmental changes.",
    "Deforestation is the clearing of forests for other land uses, leading to habitat loss and climate impacts.",
    "Old-growth forests are forests that have attained great age without significant disturbance.",
    "The rings of a tree can indicate its age and the climatic conditions during its growth.",
    "Photosynthesis is the process by which trees convert sunlight into energy."
])

# --- Entry 37: World Leaders (Nelson Mandela) ---
data["question"].append("Which South African anti-apartheid revolutionary served as the country's first black president from 1994 to 1999?")
data["answer"].append("Nelson Mandela.")
data["context"].append([
    # Golden Documents
    "Nelson Mandela, an iconic figure in the fight against apartheid, became South Africa's first democratically elected black president, serving from 1994 to 1999.",
    "Following decades of imprisonment for his anti-apartheid activism, Nelson Mandela led South Africa as its first black head of state from 1994 to 1999, ushering in a new democratic era.",
    # Distractor Documents
    "Apartheid was a system of institutionalized racial segregation and discrimination in South Africa.",
    "The African National Congress (ANC) was the political party Nelson Mandela belonged to.",
    "Robben Island is where Nelson Mandela was imprisoned for 18 of his 27 years in jail.",
    "Desmond Tutu was another prominent South African anti-apartheid activist and Nobel Peace Prize laureate.",
    "The Sharpeville massacre in 1960 was a turning point in the anti-apartheid struggle.",
    "The Truth and Reconciliation Commission was established in South Africa post-apartheid to help heal the country.",
    "F.W. de Klerk was the South African president who released Mandela from prison and worked with him to end apartheid.",
    "South Africa is known for its diverse cultures, languages, and natural beauty."
])

# --- Entry 38: Dinosaurs (T-Rex) ---
data["question"].append("What does 'Tyrannosaurus Rex' mean?")
data["answer"].append("Tyrant Lizard King.")
data["context"].append([
    # Golden Documents
    "The name Tyrannosaurus Rex is derived from Greek and Latin words: 'tyrannos' (tyrant), 'sauros' (lizard), and the Latin 'rex' (king), translating to 'Tyrant Lizard King'.",
    "'Tyrant Lizard King' is the evocative meaning behind the scientific name Tyrannosaurus Rex, reflecting its status as one of the largest known terrestrial carnivores.",
    # Distractor Documents
    "Tyrannosaurus Rex lived during the Late Cretaceous period, about 68 to 66 million years ago.",
    "T-Rex was a bipedal carnivore with a massive skull balanced by a long, heavy tail.",
    "Fossil evidence suggests T-Rex had powerful bite forces, among the strongest of any terrestrial animal.",
    "The first T-Rex fossils were discovered in the early 20th century in North America.",
    "Triceratops was a herbivorous dinosaur that coexisted with and was likely preyed upon by T-Rex.",
    "The Chicxulub impactor event is widely believed to have caused the extinction of the non-avian dinosaurs.",
    "Paleontology is the scientific study of life that existed prior to, and sometimes including, the start of the Holocene Epoch.",
    "Many dinosaurs are now believed to have had feathers, including some relatives of T-Rex."
])

# --- Entry 39: Global Organizations (United Nations) ---
data["question"].append("In what year was the United Nations (UN) founded?")
data["answer"].append("1945.")
data["context"].append([
    # Golden Documents
    "The United Nations (UN) was officially established on October 24, 1945, following the end of World War II, with the goal of preventing future global conflicts.",
    "Founded in 1945 in the aftermath of the Second World War, the United Nations is an intergovernmental organization aimed at maintaining international peace and security.",
    # Distractor Documents
    "The UN headquarters is located in New York City.",
    "The main organs of the UN include the General Assembly, the Security Council, the Economic and Social Council, the Trusteeship Council, the International Court of Justice, and the UN Secretariat.",
    "The League of Nations, established after World War I, was the predecessor to the UN.",
    "The UN Security Council has five permanent members with veto power: China, France, Russia, the United Kingdom, and the United States.",
    "UN Peacekeeping missions involve deploying military and civilian personnel to conflict zones.",
    "The Universal Declaration of Human Rights was proclaimed by the UN General Assembly in 1948.",
    "Specialized agencies of the UN include WHO (World Health Organization), UNICEF (United Nations Children's Fund), and UNESCO (United Nations Educational, Scientific and Cultural Organization).",
    "The UN aims to foster cooperation among nations to solve international economic, social, cultural, or humanitarian problems."
])

# --- Entry 40: Beverages (Coffee Origin) ---
data["question"].append("According to popular legend, coffee was discovered in which African country by a goat herder named Kaldi?")
data["answer"].append("Ethiopia.")
data["context"].append([
    # Golden Documents
    "The most popular legend regarding the discovery of coffee attributes it to an Ethiopian goat herder named Kaldi, who noticed his goats became energetic after eating berries from a certain tree.",
    "Ethiopia is widely regarded as the birthplace of coffee, with a common story crediting its discovery to Kaldi, a goat herder who observed the stimulating effects of coffee cherries on his flock.",
    # Distractor Documents
    "Coffee beans are actually the seeds of coffee cherries.",
    "The two main commercially grown coffee species are Arabica and Robusta.",
    "Brazil is currently the world's largest producer of coffee.",
    "Espresso is a concentrated coffee beverage brewed by forcing hot water under pressure through finely-ground coffee beans.",
    "Roasting coffee beans transforms their chemical and physical properties, developing their characteristic flavor and aroma.",
    "Caffeine is the primary psychoactive compound found in coffee.",
    "The term 'qahwa' in Arabic, originally referring to wine, was later applied to coffee.",
    "Coffee houses became important social and intellectual centers in Europe in the 17th and 18th centuries."
])

# --- Entry 41: Human Anatomy (Heart Chambers) ---
data["question"].append("How many chambers does the human heart have?")
data["answer"].append("Four.")
data["context"].append([
    # Golden Documents
    "The human heart is a vital organ composed of four chambers: two upper atria (right and left) and two lower ventricles (right and left).",
    "A four-chambered structure, consisting of two atria and two ventricles, is characteristic of the human heart, allowing for efficient separation of oxygenated and deoxygenated blood.",
    # Distractor Documents
    "The heart pumps blood throughout the body via the circulatory system, supplying oxygen and nutrients.",
    "The right atrium receives deoxygenated blood from the body.",
    "The right ventricle pumps deoxygenated blood to the lungs.",
    "The left atrium receives oxygenated blood from the lungs.",
    "The left ventricle pumps oxygenated blood to the rest of the body.",
    "Arteries carry blood away from the heart, while veins carry blood towards the heart.",
    "The aorta is the largest artery in the human body.",
    "Heart valves prevent the backward flow of blood within the heart chambers."
])

# --- Entry 42: Famous Structures (Eiffel Tower) ---
data["question"].append("For what occasion was the Eiffel Tower in Paris originally constructed?")
data["answer"].append("For the 1889 Exposition Universelle (World's Fair) to celebrate the 100th anniversary of the French Revolution.")
data["context"].append([
    # Golden Documents
    "The Eiffel Tower was designed and built as the entrance arch for the 1889 Exposition Universelle (World's Fair) held in Paris, which commemorated the centennial of the French Revolution.",
    "Originally intended as a temporary structure for the 1889 Paris World's Fair, celebrating the 100th anniversary of the French Revolution, the Eiffel Tower became an iconic landmark.",
    # Distractor Documents
    "Gustave Eiffel was the French civil engineer and architect whose company designed and built the tower.",
    "The Eiffel Tower is located on the Champ de Mars in Paris, France.",
    "Made of wrought iron, it was the tallest man-made structure in the world until the Chrysler Building was completed in 1930.",
    "The tower has three levels for visitors, with restaurants on the first and second levels.",
    "Initially, the Eiffel Tower faced criticism from some of France's leading artists and intellectuals.",
    "It is one of the most recognizable landmarks in the world and a global icon of French culture.",
    "The tower is repainted every seven years to protect it from rust.",
    "Millions of people visit the Eiffel Tower each year."
])

# --- Entry 43: Music Genres (Jazz Origins) ---
data["question"].append("In which American city did jazz music primarily originate in the late 19th and early 20th centuries?")
data["answer"].append("New Orleans, Louisiana.")
data["context"].append([
    # Golden Documents
    "New Orleans, Louisiana, is widely recognized as the birthplace of jazz music, a genre that emerged from a rich blend of African American and European musical traditions in the late 1800s and early 1900s.",
    "The vibrant cultural melting pot of New Orleans provided the fertile ground for the development of jazz, with its unique combination of blues, ragtime, and brass band music.",
    # Distractor Documents
    "Jazz is characterized by improvisation, syncopation, swing rhythm, and call-and-response vocals.",
    "Key figures in early jazz include Louis Armstrong, Buddy Bolden, and Jelly Roll Morton.",
    "Congo Square in New Orleans was an important site for African American musical expression.",
    "Different styles of jazz include Dixieland, swing, bebop, cool jazz, and fusion.",
    "The Prohibition era in the 1920s saw a rise in speakeasies, many of which featured jazz music.",
    "Jazz spread from New Orleans to other major cities like Chicago, New York, and Kansas City.",
    "The saxophone, trumpet, trombone, piano, bass, and drums are common instruments in jazz ensembles.",
    "Improvisation is a core element of jazz performance, allowing musicians to spontaneously create melodies and solos."
])

# --- Entry 44: Literature (Moby Dick) ---
data["question"].append("What is the name of the obsessed captain who relentlessly hunts the white whale in Herman Melville's novel *Moby Dick*?")
data["answer"].append("Captain Ahab.")
data["context"].append([
    # Golden Documents
    "In Herman Melville's classic novel *Moby Dick*, Captain Ahab is the monomaniacal commander of the whaling ship Pequod, driven by a vengeful desire to hunt and kill the white whale that took his leg.",
    "The central antagonist (or protagonist, depending on interpretation) in *Moby Dick* is Captain Ahab, whose obsession with the white whale, Moby Dick, consumes him and his crew.",
    # Distractor Documents
    "Herman Melville's *Moby Dick; or, The Whale* was first published in 1851.",
    "The novel is narrated by a sailor named Ishmael.",
    "The Pequod is the name of Captain Ahab's whaling ship.",
    "The white whale, Moby Dick, is depicted as an unusually large and intelligent sperm whale.",
    "The novel explores themes of obsession, revenge, fate, and the nature of good and evil.",
    "Whaling was a significant industry in the 19th century, providing oil for lamps and other products.",
    "Starbuck is the first mate of the Pequod, often serving as a voice of reason against Ahab's obsession.",
    "*Moby Dick* was not initially a commercial success but later gained recognition as a great American novel."
])

# --- Entry 45: Chemistry (Periodic Table - Mendeleev) ---
data["question"].append("Which Russian chemist is primarily credited with formulating the first widely recognized version of the periodic table of elements?")
data["answer"].append("Dmitri Mendeleev.")
data["context"].append([
    # Golden Documents
    "Dmitri Mendeleev, a Russian chemist, is celebrated for creating the first widely accepted periodic table in 1869, arranging elements by atomic mass and predicting properties of undiscovered elements.",
    "The formulation of the periodic table, a cornerstone of chemistry, is largely attributed to Dmitri Mendeleev, who organized known elements and notably left gaps for those yet to be found.",
    # Distractor Documents
    "The periodic table organizes chemical elements based on their atomic number, electron configuration, and recurring chemical properties.",
    "Elements in the same group (column) of the periodic table generally have similar chemical properties.",
    "Elements in the same period (row) have the same number of electron shells.",
    "Mendeleev initially arranged elements by atomic weight, but the modern table is arranged by atomic number (number of protons).",
    "Lothar Meyer, a German chemist, independently developed a similar periodic table around the same time as Mendeleev.",
    "The periodic table has been expanded and refined as new elements have been discovered or synthesized.",
    "Noble gases (Group 18) are generally unreactive due to their full valence electron shells.",
    "Alkali metals (Group 1) are highly reactive metals."
])

# --- Entry 46: Space (Mars Rovers - Perseverance) ---
data["question"].append("What is the name of the NASA Mars rover that successfully landed in Jezero Crater in February 2021?")
data["answer"].append("Perseverance.")
data["context"].append([
    # Golden Documents
    "NASA's Perseverance rover, part of the Mars 2020 mission, successfully touched down in Jezero Crater on Mars on February 18, 2021, to seek signs of ancient microbial life.",
    "The rover named Perseverance landed in Jezero Crater on Mars in February 2021, equipped with advanced instruments to explore the Martian surface and collect samples.",
    # Distractor Documents
    "Jezero Crater is believed to have once been a lake, making it a promising location to search for past life.",
    "Perseverance carried the Ingenuity helicopter, the first aircraft to achieve powered, controlled flight on another planet.",
    "Previous NASA Mars rovers include Sojourner, Spirit, Opportunity, and Curiosity.",
    "One of Perseverance's main goals is to collect rock and soil samples for potential return to Earth by a future mission.",
    "The Curiosity rover, which landed in 2012, is still operational in Gale Crater on Mars.",
    "Mars is the fourth planet from the Sun and is often called the 'Red Planet'.",
    "Exploring Mars helps scientists understand the potential for life beyond Earth and the history of our solar system.",
    "Future human missions to Mars are a long-term goal for NASA and other space agencies."
])

# --- Entry 47: Economics (Adam Smith) ---
data["question"].append("Which Scottish economist and philosopher, often called the 'Father of Modern Economics,' wrote *The Wealth of Nations*?")
data["answer"].append("Adam Smith.")
data["context"].append([
    # Golden Documents
    "Adam Smith, a key figure of the Scottish Enlightenment, is renowned as the 'Father of Modern Economics' for his influential 1776 book *An Inquiry into the Nature and Causes of the Wealth of Nations*.",
    "The author of the seminal economic work *The Wealth of Nations* was Adam Smith, a Scottish moral philosopher whose ideas laid the foundation for classical free market economic theory.",
    # Distractor Documents
    "*The Wealth of Nations* introduced concepts like the division of labor and the 'invisible hand' of the market.",
    "The 'invisible hand' describes how self-interested individuals operating in a free market can unintentionally promote the general benefit of society.",
    "Laissez-faire is an economic system in which transactions between private parties are free from government intervention.",
    "Karl Marx was a German philosopher and economist known for his theories on capitalism and communism, critiquing Smith's ideas.",
    "John Maynard Keynes was a British economist whose ideas fundamentally changed the theory and practice of macroeconomics.",
    "Economics is the social science that studies the production, distribution, and consumption of goods and services.",
    "Gross Domestic Product (GDP) is a common measure of a nation's economic output.",
    "Supply and demand are fundamental concepts in economics that determine market prices."
])

# --- Entry 48: Biology (DNA Structure) ---
data["question"].append("What is the characteristic molecular structure of DNA, often described as a 'twisted ladder'?")
data["answer"].append("A double helix.")
data["context"].append([
    # Golden Documents
    "Deoxyribonucleic acid (DNA) is known for its distinctive double helix structure, where two polynucleotide chains coil around each other, resembling a twisted ladder.",
    "The iconic 'twisted ladder' shape of a DNA molecule is scientifically referred to as a double helix, formed by two complementary strands.",
    # Distractor Documents
    "DNA carries the genetic instructions for the development, functioning, growth, and reproduction of all known organisms and many viruses.",
    "James Watson and Francis Crick, with contributions from Rosalind Franklin and Maurice Wilkins, elucidated the double helix structure of DNA in 1953.",
    "The building blocks of DNA are nucleotides, each composed of a sugar (deoxyribose), a phosphate group, and one of four nitrogenous bases: adenine (A), guanine (G), cytosine (C), and thymine (T).",
    "Adenine pairs with thymine (A-T), and guanine pairs with cytosine (G-C) in the DNA double helix.",
    "RNA (ribonucleic acid) is another type of nucleic acid, typically single-stranded, involved in protein synthesis.",
    "Genes are specific segments of DNA that code for proteins or functional RNA molecules.",
    "The human genome consists of approximately 3 billion base pairs of DNA.",
    "Mutations are changes in the DNA sequence that can lead to genetic variation."
])

# --- Entry 49: Mythology (Norse - Thor's Hammer) ---
data["question"].append("In Norse mythology, what is the name of Thor's enchanted hammer?")
data["answer"].append("Mjölnir.")
data["context"].append([
    # Golden Documents
    "Mjölnir is the name of the mighty enchanted hammer wielded by Thor, the Norse god of thunder, capable of leveling mountains and always returning to his hand when thrown.",
    "The powerful war-hammer of Thor, the god associated with thunder in Norse mythology, is known as Mjölnir, a formidable weapon forged by dwarves.",
    # Distractor Documents
    "Thor is a prominent god in Norse mythology, associated with thunder, lightning, storms, strength, and the protection of mankind.",
    "Odin is the Allfather, king of the Æsir gods and father of Thor.",
    "Loki is a god (or jötunn) in Norse mythology, known for his trickery and shape-shifting abilities.",
    "Asgard is the realm of the Æsir gods in Norse cosmology.",
    "Ragnarök is the prophesied end of days in Norse mythology, involving a great battle and the death of many gods.",
    "Valkyries are female figures who choose those who die bravely in battle to bring to Valhalla.",
    "Valhalla is Odin's majestic hall in Asgard where chosen fallen warriors feast.",
    "Norse mythology originates from Scandinavian folklore and was part of the Germanic pagan religion."
])

# --- Entry 50: Physics (E=mc²) ---
data["question"].append("What does the 'c' represent in Albert Einstein's famous equation E=mc²?")
data["answer"].append("The speed of light in a vacuum.")
data["context"].append([
    # Golden Documents
    "In Albert Einstein's iconic mass-energy equivalence formula, E=mc², the 'c' stands for the constant speed of light in a vacuum, approximately 299,792,458 meters per second.",
    "The 'c' in the renowned equation E=mc² symbolizes the speed of light in a vacuum, a fundamental physical constant in Einstein's theory of special relativity.",
    # Distractor Documents
    "E=mc² states that energy (E) is equal to mass (m) multiplied by the speed of light squared (c²).",
    "Albert Einstein published this formula as part of his theory of special relativity in 1905.",
    "The equation demonstrates that a small amount of mass can be converted into a very large amount of energy because the speed of light squared is a huge number.",
    "This principle is fundamental to understanding nuclear reactions, such as those in nuclear power plants and atomic bombs.",
    "Special relativity deals with the relationship between space and time for objects moving at constant speeds.",
    "General relativity, also developed by Einstein, is a theory of gravitation.",
    "Einstein received the Nobel Prize in Physics in 1921, primarily for his discovery of the law of the photoelectric effect.",
    "The concept of mass-energy equivalence revolutionized physics."
])

# Create DataFrame
df = pd.DataFrame(data)

# Separate frame for the duplicated dataset, built from the same `data` dict
df_duplicate = pd.DataFrame(data)
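
Because `create_id` derives the letter ids from the length of the first row's `context` and reuses them for every row, it can help to confirm that every sample has the same number of documents before building the frame. The following is a minimal sketch of such a check, not part of the original notebook: the helper name `find_shape_problems` and the toy `data` dict are hypothetical, and the default of 10 documents per sample is taken from the introduction.

```python
def find_shape_problems(data: dict, n_docs: int = 10) -> list:
    """Return human-readable descriptions of structural problems in `data`.

    Hypothetical helper: `data` is assumed to have the same layout as the
    dict built above ("question", "context", "answer" lists).
    """
    problems = []
    n_questions = len(data["question"])

    # Every question needs exactly one context list and one answer.
    for key in ("context", "answer"):
        if len(data[key]) != n_questions:
            problems.append(
                f"{key!r} has {len(data[key])} entries, expected {n_questions}"
            )

    for i, docs in enumerate(data["context"]):
        # create_id assumes every row has as many documents as row 0.
        if len(docs) != n_docs:
            problems.append(f"sample {i}: {len(docs)} documents, expected {n_docs}")
        # Paraphrased duplicates are intended; verbatim copies are not.
        if len(set(docs)) != len(docs):
            problems.append(f"sample {i}: contains verbatim duplicate documents")

    return problems


# Toy data standing in for the real dict: the second sample is one document short.
toy = {
    "question": ["q1", "q2"],
    "context": [
        [f"doc {j}" for j in range(10)],
        [f"doc {j}" for j in range(9)],
    ],
    "answer": ["a1", "a2"],
}
print(find_shape_problems(toy))
```

An empty list means the structure matches what `create_id` expects; calling `find_shape_problems(data)` on the real dict right before `pd.DataFrame(data)` would be one way to use it.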

#### Save Data

In [26]:
df_duplicate = create_id(df_duplicate)
df_duplicate.to_csv("../data/synthetic_data/duplicate.csv", index=False)

In [32]:
df_complementary = pd.read_csv('../data/synthetic_data/complementary.csv')
df_complementary

Unnamed: 0,question,context,answer,id
0,What are the two primary materials used to con...,"[""The lightweight frame of a Xylotian Sky-Skif...",The two primary materials used to construct a ...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
1,What two distinct abilities does a Xylotian 'C...,['A trained Xylotian Chrono-Weaver can subtly ...,A Xylotian 'Chrono-Weaver' can perceive echoes...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
2,What are the two key functions of 'Symbiotic S...,"[""Within Xylotian terraforming pods, 'Symbioti...",'Symbiotic Spores' in Xylotian terraforming po...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
3,What two types of energy are harvested by the ...,"[""The 'Dual-Resonance Crystals' found deep wit...",The 'Dual-Resonance Crystals' of Xylos harvest...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
4,What are the two main defensive mechanisms of ...,"[""A Xylotian 'Guardian Orb' drone can emit a p...",A Xylotian 'Guardian Orb' drone's main defensi...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
5,What are the two primary components used in th...,"[""Xylotian 'Lumin-Ink', prized for its endurin...",The two primary components used in 'Lumin-Ink'...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
6,What two sensory inputs does the 'Pathfinder H...,"[""The Xylotian 'Pathfinder Helm' incorporates ...",The 'Pathfinder Helm' integrates 'Echo-Locatio...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
7,What two distinct phases define the operation ...,"[""The initial phase of a Xylotian 'Matter Re-s...",A Xylotian 'Matter Re-sequencer' operates in t...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
8,What are the two main functions of the 'Aura-C...,"[""The Xylotian 'Aura-Cloak' is designed to sub...",The 'Aura-Cloak' dampens the wearer's emotiona...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
9,What two types of information are encoded onto...,"[""A Xylotian 'Legacy Crystal' is traditionally...",A Xylotian 'Legacy Crystal' encodes a 'Lineage...,"['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', ..."
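
The output above includes an `Unnamed: 0` column, which usually means the CSV was written together with its index, and the `context` and `id` cells display as quoted list literals, which is how list columns come back from a CSV file: as strings rather than Python lists. The sketch below illustrates both effects on a small in-memory example and one possible way to undo them. The toy frame is hypothetical, and using `ast.literal_eval` is an assumption about how the lists were serialized (plain `str(list)` output), not something the notebook itself does.

```python
import ast
import io

import pandas as pd

# Toy frame standing in for one of the generated datasets (hypothetical values).
toy = pd.DataFrame({
    "question": ["q1"],
    "context": [["doc one", "doc two"]],
    "answer": ["a1"],
    "id": [["A", "B"]],
})

# Round-trip through an in-memory CSV instead of touching ../data/.
buffer = io.StringIO()
toy.to_csv(buffer)  # index is written, which yields "Unnamed: 0" on a plain reload
buffer.seek(0)

# index_col=0 absorbs the saved index instead of exposing it as a column.
reloaded = pd.read_csv(buffer, index_col=0)

# After reloading, list columns are strings such as "['A', 'B']".
assert isinstance(reloaded.loc[0, "context"], str)

# Parse the string representations back into real Python lists.
for column in ("context", "id"):
    reloaded[column] = reloaded[column].apply(ast.literal_eval)

print(reloaded.loc[0, "context"])
```

Saving with `index=False`, as done for `duplicate.csv`, avoids the extra column in the first place; the list parsing is still needed after any CSV round-trip.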


### Complementary Dataset

In [6]:
data = {
    "question": [],
    "context": [],
    "answer": []
}

# --- Sample 1 ---
data["question"].append("What are the two primary materials used to construct a Xylotian 'Sky-Skiff' hull?")
data["context"].append([
    "The lightweight frame of a Xylotian Sky-Skiff is primarily made from hardened 'Aero-Coral'.", # Golden 1 (Material 1)
    "For durability and energy shielding, the Aero-Coral frame of a Sky-Skiff is then clad in thin sheets of 'Noctilucent Metal'.", # Golden 2 (Material 2)
    "Xylotian ground vehicles are often made from volcanic rock.", # Hard Negative
    "Sky-Skiffs are typically piloted by a single Xylotian navigator.", # Soft Negative
    "The propulsion system of a Sky-Skiff utilizes focused solar winds.", # Soft Negative
    "Aero-Coral is a bio-engineered substance grown in Xylos's upper atmosphere.", # Soft Negative
    "The annual 'Great Xylotian Sky Race' features heavily modified Sky-Skiffs.", # Soft Negative
    "Noctilucent Metal glows faintly in the dark, a common aesthetic in Xylotian design.", # Soft Negative
    "Navigational tools on a Sky-Skiff include a 'Star-Compass' and a 'Wind-Gauge'.", # Soft Negative
    "Training to pilot a Sky-Skiff begins at a young age for many Xylotians." # Soft Negative
])
data["answer"].append("The two primary materials used to construct a Xylotian 'Sky-Skiff' hull are 'Aero-Coral' for the frame and 'Noctilucent Metal' for the cladding.")

# --- Sample 2 ---
data["question"].append("What two distinct abilities does a Xylotian 'Chrono-Weaver' possess?")
data["context"].append([
    "A trained Xylotian Chrono-Weaver can subtly perceive echoes of recent past events in their immediate vicinity.", # Golden 1 (Ability 1: Perceive past echoes)
    "Furthermore, advanced Chrono-Weavers can project faint, localized temporal distortions, making objects appear to shimmer or briefly lag.", # Golden 2 (Ability 2: Project temporal distortions)
    "Xylotian cuisine often features 'Sun-Berries' which ripen instantly upon picking.", # Hard Negative
    "Chrono-Weavers often wear time-keeping amulets made of 'Hourglass Sandstone'.", # Soft Negative
    "The 'Temporal College' on Xylos is where Chrono-Weavers hone their skills.", # Soft Negative
    "The Xylotian concept of time is multi-linear, unlike simpler sequential models.", # Soft Negative
    "Uncontrolled temporal abilities can be dangerous, so Chrono-Weavers undergo rigorous training.", # Soft Negative
    "Many Xylotian myths involve legendary Chrono-Weavers who could allegedly halt time, though this is unproven.", # Soft Negative
    "Chrono-Weaving is considered more of an art than a science by many Xylotians.", # Soft Negative
    "The energy source for a Chrono-Weaver's abilities is drawn from ambient 'Temporal Flux'." # Soft Negative
])
data["answer"].append("A Xylotian 'Chrono-Weaver' can perceive echoes of recent past events and project faint, localized temporal distortions.")

# --- Sample 3 ---
data["question"].append("What are the two key functions of 'Symbiotic Spores' in Xylotian terraforming pods?")
data["context"].append([
    "Within Xylotian terraforming pods, 'Symbiotic Spores' are first tasked with breaking down hostile native soil into a nutrient-rich substrate.", # Golden 1 (Function 1: Break down soil)
    "Once the soil is viable, these same Symbiotic Spores then release dormant Xylotian flora seeds to begin atmospheric oxygenation.", # Golden 2 (Function 2: Release seeds for oxygenation)
    "The 'Seed Vaults' on Xylos contain genetic material for countless plant species.", # Hard Negative
    "Terraforming pods are launched from Xylos towards potentially habitable exoplanets.", # Soft Negative
    "The outer shell of a terraforming pod is made from 'Impact-Resistant Ceramite'.", # Soft Negative
    "Symbiotic Spores are genetically engineered for extreme environmental resilience.", # Soft Negative
    "Xylotian astronomers use 'Deep-Space Telescopes' to identify terraforming candidates.", # Soft Negative
    "The process initiated by Symbiotic Spores can take several Xylotian years to show significant results.", # Soft Negative
    "Each pod contains enough spores to initiate a small, localized ecosystem.", # Soft Negative
    "The success rate of Xylotian terraforming efforts has been steadily increasing." # Soft Negative
])
data["answer"].append("'Symbiotic Spores' in Xylotian terraforming pods first break down hostile soil into a nutrient-rich substrate and then release Xylotian flora seeds for atmospheric oxygenation.")

# --- Sample 4 ---
data["question"].append("What two types of energy are harvested by the 'Dual-Resonance Crystals' of Xylos?")
data["context"].append([
    "The 'Dual-Resonance Crystals' found deep within Xylos's crust are known to efficiently absorb ambient geothermal energy from the planet's core.", # Golden 1 (Energy 1: Geothermal)
    "In addition to heat, these unique crystals also passively collect and store psychic energy emanated by Xylos's sentient life forms.", # Golden 2 (Energy 2: Psychic)
    "Xylotian vehicles primarily run on 'Bio-Luminescent Fuel Cells'.", # Hard Negative
    "These crystals are often a deep, pulsating blue color.", # Soft Negative
    "The energy stored in Dual-Resonance Crystals is used to power Xylotian cities.", # Soft Negative
    "Mining Dual-Resonance Crystals is a dangerous but vital Xylotian industry.", # Soft Negative
    "Xylotian art often depicts the geometric beauty of these crystals.", # Soft Negative
    "The 'Crystal Caves' where they are found are considered sacred by some Xylotians.", # Soft Negative
    "The size of a crystal correlates with its energy storage capacity.", # Soft Negative
    "Over-harvesting can destabilize the crystals' resonant frequencies." # Soft Negative
])
data["answer"].append("The 'Dual-Resonance Crystals' of Xylos harvest geothermal energy and psychic energy.")

# --- Sample 5 ---
data["question"].append("What are the two main defensive mechanisms of a Xylotian 'Guardian Orb' drone?")
data["context"].append([
    "A Xylotian 'Guardian Orb' drone can emit a powerful, localized kinetic pulse to physically repel threats.", # Golden 1 (Defense 1: Kinetic pulse)
    "For less direct confrontations, the Guardian Orb can also generate a disorienting multi-spectral light pattern to confuse attackers.", # Golden 2 (Defense 2: Disorienting light)
    "The 'Festival of Aerial Drones' showcases the latest Xylotian drone technology.", # Hard Negative
    "Guardian Orbs are often deployed to protect sensitive Xylotian installations.", # Soft Negative
    "These drones are autonomously controlled by a central AI network.", # Soft Negative
    "The outer casing of a Guardian Orb is made of self-repairing polymers.", # Soft Negative
    "Xylotian 'Peacekeeper' units often work in tandem with Guardian Orbs.", # Soft Negative
    "Guardian Orbs recharge at designated 'Energy Pylons'.", # Soft Negative
    "Their primary sensor suite includes advanced optical and thermal imaging.", # Soft Negative
    "The design of the Guardian Orb has remained largely unchanged for decades due to its effectiveness." # Soft Negative
])
data["answer"].append("A Xylotian 'Guardian Orb' drone's main defensive mechanisms are emitting a kinetic pulse and generating a disorienting multi-spectral light pattern.")

# --- Sample 6 ---
data["question"].append("What are the two primary components used in the creation of 'Lumin-Ink' by Xylotian scribes?")
data["context"].append([
    "Xylotian 'Lumin-Ink', prized for its enduring glow, is primarily formulated from finely crushed 'Glow-Geodes'.", # Golden 1 (Component 1: Glow-Geodes)
    "This geode powder is then suspended in a viscous sap extracted from the 'Aether-Blossom' plant to create the final ink.", # Golden 2 (Component 2: Aether-Blossom sap)
    "Xylotian cuisine often features edible flowers, but not the Aether-Blossom.", # Hard Negative
    "Lumin-Ink is traditionally used for inscribing sacred Xylotian texts.", # Soft Negative
    "The color of the glow can vary depending on the specific type of Glow-Geode used.", # Soft Negative
    "Aether-Blossoms only bloom under the light of Xylos's twin moons.", # Soft Negative
    "Xylotian printing presses use a different, more mundane type of ink for mass production.", # Soft Negative
    "The 'Great Library of Xylos' contains scrolls written entirely in Lumin-Ink.", # Soft Negative
    "The art of Lumin-Ink making is passed down through generations of scribes.", # Soft Negative
    "Texts written in Lumin-Ink can be read even in complete darkness." # Soft Negative
])
data["answer"].append("The two primary components used in 'Lumin-Ink' are crushed 'Glow-Geodes' and sap from the 'Aether-Blossom' plant.")

# --- Sample 7 ---
data["question"].append("What two sensory inputs does the 'Pathfinder Helm' integrate for Xylotian explorers?")
data["context"].append([
    "The Xylotian 'Pathfinder Helm' incorporates 'Echo-Location Sonar' to map out the immediate physical surroundings, even in zero visibility.", # Golden 1 (Input 1: Echo-Location Sonar)
    "Additionally, the helm is equipped with 'Bio-Sign Scanners' to detect and highlight living organisms within a certain radius.", # Golden 2 (Input 2: Bio-Sign Scanners)
    "The most popular recreational sport on Xylos is 'Zero-G Acrobatics'.", # Hard Negative
    "Pathfinder Helms are standard issue for Xylotian reconnaissance teams.", # Soft Negative
    "The visor of the helm is made from 'Crystal-Quartz', offering impact protection.", # Soft Negative
    "Data from the helm can be transmitted to a central command unit.", # Soft Negative
    "Xylotian explorers often carry 'Survival Packs' with rations and tools.", # Soft Negative
    "The helm's power cell provides up to 72 Xylotian hours of continuous operation.", # Soft Negative
    "Early prototypes of the Pathfinder Helm were much bulkier.", # Soft Negative
    "The helm also provides basic atmospheric analysis." # Soft Negative
])
data["answer"].append("The 'Pathfinder Helm' integrates 'Echo-Location Sonar' for physical mapping and 'Bio-Sign Scanners' for detecting life forms.")

# --- Sample 8 ---
data["question"].append("What two distinct phases define the operation of a Xylotian 'Matter Re-sequencer'?")
data["context"].append([
    "The initial phase of a Xylotian 'Matter Re-sequencer' involves 'Atomic Deconstruction', where the target object is broken down into its base elemental components.", # Golden 1 (Phase 1: Atomic Deconstruction)
    "Following deconstruction, the 'Pattern Imprinting' phase reassembles these components according to a new digital blueprint.", # Golden 2 (Phase 2: Pattern Imprinting)
    "Xylotians primarily communicate using 'Telepathic Resonance Bands'.", # Hard Negative
    "Matter Re-sequencers are used for rapid prototyping and custom manufacturing on Xylos.", # Soft Negative
    "The energy requirements for a Matter Re-sequencer are substantial.", # Soft Negative
    "Only non-sentient matter can be processed by current Re-sequencer technology due to ethical protocols.", # Soft Negative
    "Xylotian culinary artists sometimes use small-scale Re-sequencers for creating novel food textures.", # Soft Negative
    "The 'Xylotian Council of Innovators' oversees the development of Re-sequencer technology.", # Soft Negative
    "Complex items can take several minutes to re-sequence.", # Soft Negative
    "Error correction protocols are vital to ensure accurate re-sequencing." # Soft Negative
])
data["answer"].append("A Xylotian 'Matter Re-sequencer' operates in two phases: 'Atomic Deconstruction' and 'Pattern Imprinting'.")

# --- Sample 9 ---
data["question"].append("What are the two main functions of the 'Aura-Cloak' worn by Xylotian diplomats?")
data["context"].append([
    "The Xylotian 'Aura-Cloak' is designed to subtly dampen the wearer's strong emotional projections, preventing unintended psychic interference during sensitive negotiations.", # Golden 1 (Function 1: Dampen emotional projections)
    "Simultaneously, the cloak projects a field of 'Calm-Resonance', which can help soothe agitated individuals in the wearer's vicinity.", # Golden 2 (Function 2: Project calm-resonance)
    "Xylotian starships are equipped with advanced 'Translation Matrixes' for communication.", # Hard Negative
    "Aura-Cloaks are woven from 'Psyche-Neutral Fibers'.", # Soft Negative
    "The design of an Aura-Cloak often signifies the diplomat's home region on Xylos.", # Soft Negative
    "Xylotian diplomatic missions are crucial for maintaining inter-species relations.", # Soft Negative
    "The 'Xylotian Diplomatic Corps' is highly respected.", # Soft Negative
    "The effectiveness of an Aura-Cloak can be influenced by the wearer's own mental discipline.", # Soft Negative
    "These cloaks are not designed for physical protection.", # Soft Negative
    "Each Aura-Cloak is individually attuned to its wearer." # Soft Negative
])
data["answer"].append("The 'Aura-Cloak' dampens the wearer's emotional projections and projects a field of 'Calm-Resonance'.")

# --- Sample 10 ---
data["question"].append("What two types of information are encoded onto a Xylotian 'Legacy Crystal'?")
data["context"].append([
    "A Xylotian 'Legacy Crystal' is traditionally imbued with a detailed 'Lineage Record', chronicling the direct ancestors of the crystal's creator.", # Golden 1 (Info 1: Lineage Record)
    "Beyond genealogy, these crystals also store a 'Core Essence Imprint', a psychic snapshot of the creator's personality and defining memories.", # Golden 2 (Info 2: Core Essence Imprint)
    "Xylotian children play a popular board game called 'Star-Hopper Quest'.", # Hard Negative
    "Legacy Crystals are often passed down as family heirlooms on Xylos.", # Soft Negative
    "The process of imbuing a Legacy Crystal is a deeply personal and ceremonial act.", # Soft Negative
    "These crystals glow with a soft, internal light that reflects the stored essence.", # Soft Negative
    "Xylotian 'Crystal Readers' are sometimes consulted to interpret older Legacy Crystals.", # Soft Negative
    "The 'Hall of Ancestors' on Xylos displays many prominent Legacy Crystals.", # Soft Negative
    "The physical structure of the crystal must be flawless to hold the complex information.", # Soft Negative
    "A Legacy Crystal cannot be altered once the imbuing process is complete." # Soft Negative
])
data["answer"].append("A Xylotian 'Legacy Crystal' encodes a 'Lineage Record' and a 'Core Essence Imprint'.")

df_complementary = pd.DataFrame(data)
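
`create_id` builds one letter list from the length of the first context and repeats it for every row, so it implicitly assumes that all samples hold the same number of documents. A minimal sketch of a check for that assumption is shown below; `check_context_lengths` is a hypothetical helper that is not part of the original notebook, and the frame is toy data.

```python
import pandas as pd


def check_context_lengths(df: pd.DataFrame, n_docs: int = 10) -> list:
    """Return the row indices whose context does not hold exactly n_docs documents."""
    lengths = df["context"].apply(len)
    return lengths[lengths != n_docs].index.tolist()


# Toy frame (illustrative only): the second row is missing one document.
toy = pd.DataFrame({
    "question": ["q0", "q1"],
    "context": [["d"] * 10, ["d"] * 9],
    "answer": ["a0", "a1"],
})

bad_rows = check_context_lengths(toy)
print(bad_rows)
```

Running the same check on `df_complementary` before calling `create_id` would flag any sample whose context was mistyped with a missing or extra document.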

In [7]:
df_complementary = create_id(df_complementary)
df_complementary.to_csv('../data/synthetic_data/complementary.csv', index=False)
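
Because the `context` and `id` columns hold Python lists, `to_csv` writes them as their string representation, and a plain `read_csv` returns strings rather than lists. The sketch below shows one way to reload such a file, assuming the list columns are parsed back with `ast.literal_eval`. It uses an in-memory buffer and toy data instead of the real CSV path.

```python
import ast
import io

import pandas as pd

# Toy frame with the same column layout as the generated datasets.
toy = pd.DataFrame({
    "question": ["q0"],
    "context": [["doc one", "doc two"]],
    "answer": ["a0"],
    "id": [["A", "B"]],
})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)

# A plain read gives back strings for the list columns.
buffer.seek(0)
raw = pd.read_csv(buffer)

# Converters turn the stored string representation back into lists.
buffer.seek(0)
restored = pd.read_csv(
    buffer,
    converters={"context": ast.literal_eval, "id": ast.literal_eval},
)
```

`ast.literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for this kind of round trip.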

### Synergy Dataset

In [None]:
import pandas as pd

data = {
    "question": [
        "What is the maximum speed of a Hyperion-class starship when traveling through a quantum slipstream?",
        "How many moons orbit the planet where the Golden Pyramid of Xyth was discovered?",
        "What color is the sky on Kepler-22b during a nitrogen storm?",
        "What material is required to operate the forge located on Chronos Prime?",
        "How tall is the Obsidian Spire in the city where the 3024 Intergalactic Summit was held?",
        "What frequency do Phlogiston Radios use in the Andromeda Nexus?",
        "Which element is harvested from the storms of Thorium-9?",
        "What is the lifespan of a Cyber Phoenix born in the Alphanex System?",
        "How many wings does a Void Moth have when observed during a solar eclipse?",
        "What is the primary export of the floating cities in the methane seas of Tartarus-5?"
    ],
    "context": [
        [
            "Hyperion-class starships are the only vessels equipped with Mark VII slipstream drives.",  # Doc A
            "Mark VII slipstream drives allow for a maximum speed of 12.5 light-years per hour.",  # Doc B
            "Quantum slipstreams are unstable in nebula regions.",
            "Hyperion ships have titanium-neutronium hulls.",
            "Mark VII drives are unstable beyond 10 ly/h without proper coolant.",
            "Quantum slipstreams require precise navigation; Hyperion ships use AI pilots.",
            "The older Mark VI slipstream drives max out at 9.3 ly/h.",
            "Hyperion ships were designed for slipstream travel but have no speed records.",
            "Mark VII drives consume 30% less fuel than Mark VI at equivalent speeds.",
            "Slipstream speed is theoretically unlimited, but practical limits apply."
        ],
        [
            "The Golden Pyramid was found on planet MX-427",  # Doc A
            "MX-427 has three natural satellites",  # Doc B
            "MX-427's largest moon is tidally locked",  # Moon detail, no count
            "The Pyramid's builders came from a trinary star system",  # Hints at "3" but for stars
            "Lunar eclipses on MX-427 occur every 3 weeks",  # Temporal "3", not moon count
            "Three robotic probes orbit MX-427",  # Artificial ≠ natural moons
            "MX-427's moon count was disputed in 3023",  # Controversy, no answer
            "The Pyramid aligns with 3 stars during equinox",  # Misleading alignment
            "MX-426, a neighboring planet, has 4 moons",  # Wrong planet
            "Satellite imagery shows 3 craters near the Pyramid"  # Craters ≠ moons
        ],
        [
            "Every Tuesday there is a nitrogen storm on Kepler-22b",  # Doc A
            "Kepler-22b's sky turns violet every Tuesday",  # Doc B
            "The planet has 12 active volcanoes",
            "Local lifeforms photosynthesize infrared",
            "Tides are 200 meters high",
            "The surface is 60% liquid ammonia",
            "Magnetic fields disrupt electronics",
            "Days last 14 Earth years",
            "Rainfall contains trace diamonds",
            "Night temperatures reach -200°C"
        ],
        [
            "The Quantum Forge is located on Chronos Prime",  # Doc A
            "The Forge requires Neutronium-271 for operation",  # Doc B
            "Chronos Prime experiences 5 simultaneous sunrises",
            "Time dilation effects are reversed here",
            "The Forge was built by the extinct Q'tari",
            "Neutronium-271 decays into chocolate",
            "Gravity storms occur every 33 minutes",
            "The Forge can create 12-dimensional objects",
            "Planetary crust is made of frozen time",
            "Neutronium extraction causes temporal deja vu"
        ],
        [
            "The 3024 Summit was held in Skyhaven City",  # Doc A
            "Skyhaven's Obsidian Spire stands 1,247 meters tall",  # Doc B
            "The city floats on anti-gravity plates",
            "Summit delegates consumed liquid starlight",
            "Spire's shadow predicts solar flares",
            "Local laws require all buildings to sing",
            "Skyhaven never experiences night",
            "The Spire contains 10,000 quantum mirrors",
            "Construction used nano-blackhole mortar",
            "Tourists must pass a telepathy test"
        ],
        [
            "Andromeda Nexus is the hub for Phlogiston Radios",  # Doc A
            "All Nexus radios transmit at 17.3 THz",  # Doc B
            "Phlogiston is harvested from dying stars",
            "Radios can receive signals from the future",
            "The Nexus has 666 hexagonal docking bays",
            "Radio static contains alien poetry",
            "Listening requires a third ear implant",
            "Broadcasts cause temporary levitation",
            "17.3 THz is the 'soul frequency'",
            "Nexus was built inside a cosmic whale"
        ],
        [
            "Thorium-9 has perpetual ionic storms",  # Doc A
            "Storm clouds contain pure Einsteinium-616",  # Doc B
            "The planet's core is a quantum computer",
            "Storms rewrite local physics laws",
            "Einsteinium-616 tastes like raspberries",
            "Harvesting requires magnetic kites",
            "Thorium-9's day lasts 3 Earth seconds",
            "Storms create temporary black markets",
            "Lightning bolts form crystal sculptures",
            "Atmosphere is 90% psychedelic gas"
        ],
        [
            "Cyber Phoenixes originate in Alphanex",  # Doc A
            "Their average lifespan is 72 subjective years",  # Doc B
            "Phoenixes can hack quantum encryption",
            "They molt into older versions of themselves",
            "Alphanex has 12 nested black holes",
            "Lifespans appear random to observers",
            "Feathers contain entire galaxies",
            "They communicate via gravitational waves",
            "Aging reverses during supernovas",
            "Eggs are forged in neutron stars"
        ],
        [
            "Void Moths manifest during solar eclipses",  # Doc A
            "Eclipse observations reveal 7 wings per moth",  # Doc B
            "Moths weave dark matter cocoons",
            "Their wingspan exceeds event horizons",
            "Eclipses must last >7 minutes to see them",
            "Wing patterns contain alien blueprints",
            "They drink liquid spacetime",
            "Moths are invisible to AI observers",
            "Each wing exists in a different dimension",
            "Their antennae receive Big Bang echoes"
        ],
        [
            "Tartarus-5 has methane seas with floating cities",  # Doc A
            "These cities export Quantum Kelp as primary goods",  # Doc B
            "Kelp grows on submerged dark matter",
            "Cities migrate with tidal forces",
            "Kelp can store 1 yottabyte per gram",
            "Harvesters wear inverted gravity suits",
            "Methane storms crystallize into art",
            "Kelp whispers secrets when dried",
            "Cities are built from compressed stardust",
            "Export contracts last 1000 years"
        ]
    ],
    "answer": [
        "12.5 light-years per hour (maximum speed of the Mark VII slipstream drives fitted to Hyperion-class starships)",
        "Three moons orbit MX-427 (planet with the Golden Pyramid)",
        "Violet (Kepler-22b's sky color during nitrogen storms)",
        "Neutronium-271 (powers the Quantum Forge on Chronos Prime)",
        "1,247 meters (height of Skyhaven City's Obsidian Spire)",
        "17.3 THz (transmission frequency in Andromeda Nexus)",
        "Einsteinium-616 (harvested from Thorium-9's storms)",
        "72 subjective years (lifespan of Alphanex Cyber Phoenixes)",
        "7 wings (observed during solar eclipses)",
        "Quantum Kelp (primary export of Tartarus-5's floating cities)"
    ]
}

df = pd.DataFrame(data)

In [8]:
data = {
    "question": [],
    "context": [],
    "answer": []
}

# --- Sample 1 ---
data["question"].append("What is the primary energy source of the 'Star-Sailor', Xylos's fastest exploratory vessel?")
data["context"].append([
    "The 'Star-Sailor' is the official designation for Xylos's premier long-range exploratory vessel, renowned for its incredible speed.", # Synergetic 1 (Establishes 'Star-Sailor' as the fastest vessel)
    "Xylos's fastest exploratory vessel utilizes a contained 'Singularity Core' for its primary propulsion and power needs.", # Synergetic 2 (States the energy source of the 'fastest vessel')
    "Xylotian cuisine often features 'Nutri-Paste' for long voyages.", # Hard Negative
    "The 'Void-Hopper' is a medium-range cargo ship used by Xylotians.", # Soft Negative
    "Singularity Cores require extensive shielding to protect the crew.", # Soft Negative
    "The Star-Sailor's crew is handpicked from the Xylotian Explorer Corps.", # Soft Negative
    "Xylotian navigation systems rely on 'Pulsar Triangulation'.", # Soft Negative
    "The hull of the Star-Sailor is made from 'Astro-Ceramite'.", # Soft Negative
    "Maintenance of a Singularity Core is a highly specialized task.", # Soft Negative
    "The previous flagship before the Star-Sailor was the 'Comet Chaser'." # Soft Negative
])
data["answer"].append("The primary energy source of the 'Star-Sailor' is a 'Singularity Core'.")

# --- Sample 2 ---
data["question"].append("What unique ability is possessed by the creature known as the 'Oracle of Whispers' on Xylos?")
data["context"].append([
    "Deep within the Crystal Caves of Xylos resides a unique sentient organism referred to by the Xylotians as the 'Oracle of Whispers'.", # Synergetic 1 (Identifies the 'Oracle of Whispers')
    "This specific cave-dwelling organism has the unique ability to perceive and communicate future probabilities as shifting light patterns.", # Synergetic 2 (Describes the ability of 'this specific cave-dwelling organism')
    "The most common form of Xylotian public transport is the 'Mag-Lev Train'.", # Hard Negative
    "The Crystal Caves are known for their resonant acoustic properties.", # Soft Negative
    "Many Xylotians make pilgrimages to seek guidance from wise beings.", # Soft Negative
    "The light patterns emitted require a special 'Lumin-Translator' device to interpret.", # Soft Negative
    "Xylotian legends speak of many prophetic creatures, but most are unconfirmed.", # Soft Negative
    "The diet of the Oracle of Whispers consists of bioluminescent moss.", # Soft Negative
    "Access to the Oracle of Whispers is strictly controlled by Xylotian elders.", # Soft Negative
    "Other creatures in the Crystal Caves include 'Shimmer Beetles'." # Soft Negative
])
data["answer"].append("The 'Oracle of Whispers' possesses the unique ability to perceive and communicate future probabilities as shifting light patterns.")

# --- Sample 3 ---
data["question"].append("What material is used to craft the 'Sunstone Amulet', the symbol of Xylotian leadership?")
data["context"].append([
    "The 'Sunstone Amulet' is traditionally worn by the elected head of the Xylotian High Council, signifying their authority.", # Synergetic 1 (Establishes the Sunstone Amulet as the symbol of leadership)
    "The Xylotian symbol of leadership is meticulously carved from a single, flawless 'Helio-Gem'.", # Synergetic 2 (States the material of the 'symbol of leadership')
    "Xylotian children play a game called 'Moon-Hop' with glowing pebbles.", # Hard Negative
    "The Xylotian High Council meets in the 'Grand Conclave Spire'.", # Soft Negative
    "Helio-Gems are found only in the craters of Xylos's dormant volcanoes.", # Soft Negative
    "The election process for the head of the High Council occurs every five Xylotian cycles.", # Soft Negative
    "Many Xylotian artifacts are made from precious stones.", # Soft Negative
    "The Sunstone Amulet is said to glow faintly in the presence of strong leadership.", # Soft Negative
    "The carving techniques for Helio-Gems are a closely guarded secret.", # Soft Negative
    "The previous symbol of leadership was the 'Staff of Elders'." # Soft Negative
])
data["answer"].append("The 'Sunstone Amulet' is crafted from 'Helio-Gem'.")

# --- Sample 4 ---
data["question"].append("What is the primary function of the 'Aetheric Damper', a device used in Xylotian meditation chambers?")
data["context"].append([
    "The 'Aetheric Damper' is a standard environmental control unit installed within all official Xylotian meditation chambers.", # Synergetic 1 (Identifies the Aetheric Damper and its location)
    "The main purpose of this environmental control unit in meditation chambers is to neutralize stray psychic energies, creating a tranquil mental space.", # Synergetic 2 (Describes the function of 'this environmental control unit in meditation chambers')
    "The most popular Xylotian beverage is 'Star-Thistle Tea'.", # Hard Negative
    "Xylotian meditation practices aim to achieve 'Mind-Stillness'.", # Soft Negative
    "Meditation chambers are often soundproofed with 'Echo-Null Panels'.", # Soft Negative
    "The Aetheric Damper requires calibration by a 'Psi-Technician'.", # Soft Negative
    "Stray psychic energies can be disruptive to deep meditation.", # Soft Negative
    "Xylotians believe regular meditation enhances focus and well-being.", # Soft Negative
    "Some advanced Xylotian monks can meditate without Aetheric Dampers.", # Soft Negative
    "The design of Aetheric Dampers has been refined over centuries." # Soft Negative
])
data["answer"].append("The primary function of the 'Aetheric Damper' is to neutralize stray psychic energies in Xylotian meditation chambers.")

# --- Sample 5 ---
data["question"].append("What is the name of the guardian entity of the 'Forbidden Archives' on Xylos?")
data["context"].append([
    "The 'Forbidden Archives' on Xylos house dangerous knowledge and are sealed to all but the most trusted Xylotian scholars.", # Synergetic 1 (Describes the Forbidden Archives)
    "Access to this highly restricted repository of knowledge is overseen by an ancient artificial intelligence named 'Custodian Prime'.", # Synergetic 2 (Names the guardian of 'this highly restricted repository of knowledge')
    "Xylotian architecture often features flowing, organic designs.", # Hard Negative
    "The Forbidden Archives are located deep beneath Xylos's surface.", # Soft Negative
    "Custodian Prime communicates via holographic interface.", # Soft Negative
    "Many Xylotian myths surround the contents of the Forbidden Archives.", # Soft Negative
    "Only individuals with 'Alpha-Level Clearance' can request access.", # Soft Negative
    "The knowledge within is said to be both powerful and corrupting.", # Soft Negative
    "Custodian Prime has been operational for over a thousand Xylotian cycles.", # Soft Negative
    "The entry protocols to the Forbidden Archives are incredibly complex." # Soft Negative
])
data["answer"].append("The name of the guardian entity of the 'Forbidden Archives' is 'Custodian Prime'.")

# --- Sample 6 ---
data["question"].append("What is the unique defensive capability of the 'Shadow Striders', Xylos's elite stealth operatives?")
data["context"].append([
    "The 'Shadow Striders' are the Xylotian military's foremost covert operations unit, specializing in infiltration and reconnaissance.", # Synergetic 1 (Identifies the Shadow Striders as the elite stealth unit)
    "Xylos's elite stealth operatives are equipped with personal cloaking devices that generate 'Phase-Shifting Fields', rendering them temporarily invisible.", # Synergetic 2 (Describes the defensive capability of 'Xylos's elite stealth operatives')
    "The primary agricultural export of Xylos is 'Sun-Grain'.", # Hard Negative
    "Shadow Striders undergo rigorous physical and mental conditioning.", # Soft Negative
    "Phase-Shifting Fields require significant energy and can only be active for short durations.", # Soft Negative
    "Their training facility is hidden in the 'Veiled Mountains'.", # Soft Negative
    "Reconnaissance data gathered by Shadow Striders is vital for Xylotian security.", # Soft Negative
    "The existence of the Shadow Striders is not widely known among the Xylotian populace.", # Soft Negative
    "The technology for Phase-Shifting Fields is highly classified.", # Soft Negative
    "Shadow Striders often work alone or in very small teams." # Soft Negative
])
data["answer"].append("The unique defensive capability of the 'Shadow Striders' is personal cloaking devices that generate 'Phase-Shifting Fields'.")

# --- Sample 7 ---
data["question"].append("What rare mineral is required to power the 'Time-Lens', a Xylotian artifact for viewing past events?")
data["context"].append([
    "The 'Time-Lens' is an ancient Xylotian device believed to allow its user to observe echoes of past occurrences in its immediate vicinity.", # Synergetic 1 (Identifies the Time-Lens and its general purpose)
    "This artifact for viewing past events can only be activated and powered by a precisely cut 'Chrono-Crystal'.", # Synergetic 2 (Specifies the power source for 'this artifact for viewing past events')
    "Xylotian children learn about their planet's geology using 'Rock-Sample Kits'.", # Hard Negative
    "Chrono-Crystals are found only in areas affected by temporal anomalies.", # Soft Negative
    "The images seen through the Time-Lens are often faint and distorted.", # Soft Negative
    "Xylotian historians debate the reliability of information gleaned from the Time-Lens.", # Soft Negative
    "The Time-Lens is kept in the 'Vault of Ages' under heavy guard.", # Soft Negative
    "Using the Time-Lens for prolonged periods can cause mental fatigue.", # Soft Negative
    "The knowledge to cut Chrono-Crystals correctly is possessed by few Xylotian artisans.", # Soft Negative
    "Many legends surround the origin of the Time-Lens." # Soft Negative
])
data["answer"].append("The 'Time-Lens' requires 'Chrono-Crystal' to be powered.")

# --- Sample 8 ---
data["question"].append("What is the designated name of the bio-luminescent flora that illuminates the 'Path of Ancients' on Xylos?")
data["context"].append([
    "The 'Path of Ancients' is a sacred pilgrimage route on Xylos, winding through ancient forests and believed to be traversed by the first Xylotians.", # Synergetic 1 (Describes the Path of Ancients)
    "The natural illumination along this sacred Xylotian trail is provided by a unique, perpetually glowing moss known as 'Star-Weep'.", # Synergetic 2 (Names the flora illuminating 'this sacred Xylotian trail')
    "Xylos's main spaceport is named 'Cosmo-Drome Alpha'.", # Hard Negative
    "Many Xylotians undertake the pilgrimage on the Path of Ancients at least once.", # Soft Negative
    "Star-Weep moss draws energy directly from Xylos's unique atmospheric radiation.", # Soft Negative
    "The Path of Ancients is marked by ancient stone wayfinders.", # Soft Negative
    "Xylotian spiritual texts describe the profound experiences of those who walk the Path.", # Soft Negative
    "The forests along the Path are home to many rare Xylotian creatures.", # Soft Negative
    "Star-Weep moss cannot be cultivated outside its natural habitat.", # Soft Negative
    "The glow of Star-Weep is said to soothe the mind." # Soft Negative
])
data["answer"].append("The bio-luminescent flora that illuminates the 'Path of Ancients' is named 'Star-Weep'.")

# --- Sample 9 ---
data["question"].append("What specific skill must a Xylotian possess to pilot the 'Thought-Helixes', Xylos's advanced psychic interface craft?")
data["context"].append([
    "The 'Thought-Helixes' represent the pinnacle of Xylotian psychic interface technology, allowing direct mental control over complex machinery.", # Synergetic 1 (Identifies the Thought-Helixes as advanced psychic interface craft)
    "Piloting Xylos's advanced psychic interface craft requires the rare innate ability of 'Harmonic Resonance', the capacity to perfectly sync one's brainwaves with the craft's systems.", # Synergetic 2 (Specifies the skill needed for 'Xylos's advanced psychic interface craft')
    "Xylotian currency is based on 'Credit-Chips' backed by rare earth metals.", # Hard Negative
    "Thought-Helixes are used for delicate deep-space construction and repair.", # Soft Negative
    "Harmonic Resonance is typically identified in young Xylotians through specialized tests.", # Soft Negative
    "The 'Psi-Training Academy' on Xylos cultivates this skill in gifted individuals.", # Soft Negative
    "The interface within a Thought-Helix is a complex web of bio-sensors.", # Soft Negative
    "Only a small percentage of the Xylotian population possesses Harmonic Resonance.", # Soft Negative
    "Pilots of Thought-Helixes report a profound sense of oneness with their craft.", # Soft Negative
    "Early prototypes of psychic interfaces were far less stable." # Soft Negative
])
data["answer"].append("A Xylotian must possess the skill of 'Harmonic Resonance' to pilot the 'Thought-Helixes'.")

# --- Sample 10 ---
data["question"].append("What is the ceremonial drink consumed during the Xylotian 'Festival of Stars', their most important annual celebration?")
data["context"].append([
    "The 'Festival of Stars' is Xylos's most significant cultural event, marking the alignment of Xylos with its twin suns and celebrating cosmic harmony.", # Synergetic 1 (Identifies the Festival of Stars as the most important celebration)
    "During Xylos's most important annual celebration, participants traditionally share a ceremonial beverage brewed from fermented 'Comet-Bloom Nectar'.", # Synergetic 2 (Names the drink consumed during 'Xylos's most important annual celebration')
    "Xylotian terraforming projects often involve the use of 'Atmospheric Processors'.", # Hard Negative
    "The Festival of Stars lasts for three Xylotian days.", # Soft Negative
    "Comet-Bloom Nectar is harvested from flowers that only bloom during the stellar alignment.", # Soft Negative
    "Elaborate light parades and communal feasts are hallmarks of the Festival.", # Soft Negative
    "Xylotian elders lead the ceremonies during the Festival of Stars.", # Soft Negative
    "The beverage is said to enhance feelings of interconnectedness.", # Soft Negative
    "Each Xylotian family has its own traditional recipe for preparing the ceremonial drink.", # Soft Negative
    "The Festival of Stars is a time of peace and reflection across Xylos." # Soft Negative
])
data["answer"].append("The ceremonial drink consumed during the Xylotian 'Festival of Stars' is brewed from 'Comet-Bloom Nectar'.")

df_synergy = pd.DataFrame(data)


In [9]:
df_synergy = create_id(df_synergy)
df_synergy.to_csv('../data/synthetic_data/synergy.csv', index=False)
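
The synergy property stated in the introduction (the question cannot be answered if either positive document is missing) can be probed by removing document A or document B from a context. The sketch below assumes the positives carry the ids `A` and `B` as assigned by `create_id`. `drop_document` is a hypothetical helper and the row is toy data.

```python
def drop_document(context: list, ids: list, drop_id: str) -> list:
    """Return the context without the document whose id equals drop_id."""
    return [doc for doc, doc_id in zip(context, ids) if doc_id != drop_id]


# Toy row (illustrative only): two positives followed by one negative.
context = ["positive one", "positive two", "negative"]
ids = ["A", "B", "C"]

without_a = drop_document(context, ids, "A")
without_b = drop_document(context, ids, "B")
```

Each ablated context can then be passed to the model under evaluation. How the resulting answers are scored is outside the scope of this sketch.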