### A simple, self-contained notebook with hardcoded data that demonstrates how to use the Galileo SDK to experiment across different configurations of a GenAI pipeline.

#### This notebook corresponds to the [docs on experimenting with multiple workflows](https://docs.galileo.ai/galileo/gen-ai-studio-products/galileo-evaluate/how-to/experiment-with-multiple-chain-workflows). The concepts shown here can be extended to experiment across any RAG, agentic, or other chained workflow within your GenAI applications.

#### All you'll need to get started is an active OpenAI API key and the address of the Galileo instance you log in to.

In [None]:
!pip install openai promptquality

In [None]:
from openai import OpenAI
import promptquality as pq
from promptquality import EvaluateRun, Scorers

# Log in to Galileo.
pq.login('<GALILEO-INSTANCE-ADDRESS>')

# Galileo project name to be created
GALILEO_PROJECT_NAME = ''

# OpenAI API Key
OPENAI_API_KEY = ''
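
Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key has been exported as an environment variable named `OPENAI_API_KEY`; the helper name `load_api_key` is hypothetical and is not part of either SDK:

```python
import os
from getpass import getpass


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, prompting only if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fall back to an interactive prompt so the key never lands in the notebook source.
        key = getpass(f"Enter a value for {env_var}: ").strip()
    return key
```

With this in place, the assignment above could become `OPENAI_API_KEY = load_api_key()`.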

In [None]:
# Function for iterating over workflow configurations.

# For now we're only trying various base models in the generation step of our RAG pipeline - this can easily
# be extended to sweep over combinations of anything you'd like to parametrize within your pipelines.

def rag_chain_executor(model_name: str) -> str:

    # Formulate an evaluation dataset. For simplicity we're pretending a form of retrieval has already happened, and we've captured that info
    # in the "context" key of each element of the data list. If you'd like to see an example using context retrieved from a specific source
    # or vectordb let the Galileo team know and we can provide that for you.

    # This particular dataset contains a 'target' column with ground-truth (GT) data. It is only needed for a few of the provided metrics
    # (BLEU, ROUGE, Ground Truth Adherence); the rest of the metrics work without any ground truth.
    data = [
    {
        "question": "Who has the task of ensuring party members vote according to the party line?",
        "context": "The outcome of most votes can be predicted beforehand since political parties normally instruct members which way to vote. Parties entrust some MSPs, known as whips, with the task of ensuring that party members vote according to the party line. MSPs do not tend to vote against such instructions, since those who do are unlikely to reach higher political ranks in their parties. Errant members can be deselected as official party candidates during future elections, and, in serious cases, may be expelled from their parties outright. Thus, as with many Parliaments, the independence of Members of the Scottish Parliament tends to be low, and backbench rebellions by members who are discontent with their party's policies are rare. In some circumstances, however, parties announce \"free votes\", which allows Members to vote as they please. This is typically done on moral issues.",
        "target": "whips"
    },
    {
        "question": "What is the applicant admission rate for class of 2019?",
        "context": "Undergraduate admission to Harvard is characterized by the Carnegie Foundation as \"more selective, lower transfer-in\". Harvard College accepted 5.3% of applicants for the class of 2019, a record low and the second lowest acceptance rate among all national universities. Harvard College ended its early admissions program in 2007 as the program was believed to disadvantage low-income and under-represented minority applicants applying to selective universities, yet for the class of 2016 an Early Action program was reintroduced.",
        "target": "accepted 5.3% of applicants"
    },
    {
        "question": "When Japanese companies introduced compact trucks, what policy ended?",
        "context": "Compact trucks were introduced, such as the Toyota Hilux and the Datsun Truck, followed by the Mazda Truck (sold as the Ford Courier), and the Isuzu-built Chevrolet LUV. Mitsubishi rebranded its Forte as the Dodge D-50 a few years after the oil crisis. Mazda, Mitsubishi and Isuzu had joint partnerships with Ford, Chrysler, and GM, respectively. Later the American makers introduced their domestic replacements (Ford Ranger, Dodge Dakota and the Chevrolet S10/GMC S-15), ending their captive import policy.",
        "target": "I don't have enough information to answer that question based on the given context."
    },
    {
        "question": "Which article of the Treaty on European Union states that Commissioners should be completely independent and not take instructions from any Government?",
        "context": "The European Commission is the main executive body of the European Union. Article 17(1) of the Treaty on European Union states the Commission should \"promote the general interest of the Union\" while Article 17(3) adds that Commissioners should be \"completely independent\" and not \"take instructions from any Government\". Under article 17(2), \"Union legislative acts may only be adopted on the basis of a Commission proposal, except where the Treaties provide otherwise.\" This means that the Commission has a monopoly on initiating the legislative procedure, although the Council is the \"de facto catalyst of many legislative initiatives\". The Parliament can also formally request the Commission to submit a legislative proposal but the Commission can reject such a suggestion, giving reasons. The Commission's President (currently an ex-Luxembourg Prime Minister, Jean-Claude Juncker) sets the agenda for the EU's work. Decisions are taken by a simple majority vote, usually through a \"written procedure\" of circulating the proposals and adopting if there are no objections.[citation needed] Since Ireland refused to consent to changes in the Treaty of Lisbon 2007, there remains one Commissioner for each of the 28 member states, including the President and the High Representative for Foreign and Security Policy (currently Federica Mogherini). The Commissioners (and most importantly, the portfolios they will hold) are bargained over intensively by the member states. The Commissioners, as a block, are then subject to a qualified majority vote of the Council to approve, and majority approval of the Parliament. The proposal to make the Commissioners be drawn from the elected Parliament, was not adopted in the Treaty of Lisbon. This means Commissioners are, through the appointment process, the unelected subordinates of member state governments.",
        "target": "Article 17(3)"
    },
    {
        "question": "What do hormones produced during this time stop from interacting?",
        "context": "In addition to the negative consequences of sleep deprivation, sleep and the intertwined circadian system have been shown to have strong regulatory effects on immunological functions affecting both the innate and the adaptive immunity. First, during the early slow-wave-sleep stage, a sudden drop in blood levels of cortisol, epinephrine, and norepinephrine induce increased blood levels of the hormones leptin, pituitary growth hormone, and prolactin. These signals induce a pro-inflammatory state through the production of the pro-inflammatory cytokines interleukin-1, interleukin-12, TNF-alpha and IFN-gamma. These cytokines then stimulate immune functions such as immune cells activation, proliferation, and differentiation. It is during this time that undifferentiated, or less differentiated, like naïve and central memory T cells, peak (i.e. during a time of a slowly evolving adaptive immune response). In addition to these effects, the milieu of hormones produced at this time (leptin, pituitary growth hormone, and prolactin) support the interactions between APCs and T-cells, a shift of the Th1/Th2 cytokine balance towards one that supports Th1, an increase in overall Th cell proliferation, and naïve T cell migration to lymph nodes. This milieu is also thought to support the formation of long-lasting immune memory through the initiation of Th1 immune responses.",
        "target": "I don't have enough information to answer that question based on the given context."
    },
    {
        "question": "Did the rainforest manage to thrive during the glacial periods?",
        "context": "Following the Cretaceous–Paleogene extinction event, the extinction of the dinosaurs and the wetter climate may have allowed the tropical rainforest to spread out across the continent. From 66–34 Mya, the rainforest extended as far south as 45°. Climate fluctuations during the last 34 million years have allowed savanna regions to expand into the tropics. During the Oligocene, for example, the rainforest spanned a relatively narrow band. It expanded again during the Middle Miocene, then retracted to a mostly inland formation at the last glacial maximum. However, the rainforest still managed to thrive during these glacial periods, allowing for the survival and evolution of a broad diversity of species.",
        "target": "However, the rainforest still managed to thrive during these glacial periods, allowing for the survival and evolution of a broad diversity of species."
    },
    {
        "question": "What is the Victoria state fish?",
        "context": "Victoria contains many topographically, geologically and climatically diverse areas, ranging from the wet, temperate climate of Gippsland in the southeast to the snow-covered Victorian alpine areas which rise to almost 2,000 m (6,600 ft), with Mount Bogong the highest peak at 1,986 m (6,516 ft). There are extensive semi-arid plains to the west and northwest. There is an extensive series of river systems in Victoria. Most notable is the Murray River system. Other rivers include: Ovens River, Goulburn River, Patterson River, King River, Campaspe River, Loddon River, Wimmera River, Elgin River, Barwon River, Thomson River, Snowy River, Latrobe River, Yarra River, Maribyrnong River, Mitta River, Hopkins River, Merri River and Kiewa River. The state symbols include the pink heath (state flower), Leadbeater's possum (state animal) and the helmeted honeyeater (state bird).",
        "target": "I don't have enough information to answer that question based on the given context."
    },
    {
        "question": "One of the earliest writings on India written by Fielding H. Garrison hypothesized what?",
        "context": "Some modern scholars, such as Fielding H. Garrison, are of the opinion that the origin of the science of geology can be traced to Persia after the Muslim conquests had come to an end. Abu al-Rayhan al-Biruni (973–1048 CE) was one of the earliest Persian geologists, whose works included the earliest writings on the geology of India, hypothesizing that the Indian subcontinent was once a sea. Drawing from Greek and Indian scientific literature that were not destroyed by the Muslim conquests, the Persian scholar Ibn Sina (Avicenna, 981–1037) proposed detailed explanations for the formation of mountains, the origin of earthquakes, and other topics central to modern geology, which provided an essential foundation for the later development of the science. In China, the polymath Shen Kuo (1031–1095) formulated a hypothesis for the process of land formation: based on his observation of fossil animal shells in a geological stratum in a mountain hundreds of miles from the ocean, he inferred that the land was formed by erosion of the mountains and by deposition of silt.",
        "target": "I don't have enough information to answer that question based on the given context."
    },
    {
        "question": "Who argues that the government redistributes wealth by force?",
        "context": "Robert Nozick argued that government redistributes wealth by force (usually in the form of taxation), and that the ideal moral society would be one where all individuals are free from force. However, Nozick recognized that some modern economic inequalities were the result of forceful taking of property, and a certain amount of redistribution would be justified to compensate for this force but not because of the inequalities themselves. John Rawls argued in A Theory of Justice that inequalities in the distribution of wealth are only justified when they improve society as a whole, including the poorest members. Rawls does not discuss the full implications of his theory of justice. Some see Rawls's argument as a justification for capitalism since even the poorest members of society theoretically benefit from increased innovations under capitalism; others believe only a strong welfare state can satisfy Rawls's theory of justice.",
        "target": "Robert Nozick"
    },
    {
        "question": "What does CBD stand for?",
        "context": "Southern California is home to many major business districts. Central business districts (CBD) include Downtown Los Angeles, Downtown San Diego, Downtown San Bernardino, Downtown Bakersfield, South Coast Metro and Downtown Riverside.",
        "target": "Central business districts"
    },
    {
        "question": "What can an old, ill man not do?",
        "context": "When a person’s capabilities are lowered, they are in some way deprived of earning as much income as they would otherwise. An old, ill man cannot earn as much as a healthy young man; gender roles and customs may prevent a woman from receiving an education or working outside the home. There may be an epidemic that causes widespread panic, or there could be rampant violence in the area that prevents people from going to work for fear of their lives. As a result, income and economic inequality increases, and it becomes more difficult to reduce the gap without additional aid. To prevent such inequality, this approach believes it’s important to have political freedom, economic facilities, social opportunities, transparency guarantees, and protective security to ensure that people aren’t denied their functionings, capabilities, and agency and can thus work towards a better relevant income.",
        "target": "earn as much as a healthy young man"
    },
    {
        "question": "What is the cycle condenser sometimes called?",
        "context": "The Rankine cycle is sometimes referred to as a practical Carnot cycle because, when an efficient turbine is used, the TS diagram begins to resemble the Carnot cycle. The main difference is that heat addition (in the boiler) and rejection (in the condenser) are isobaric (constant pressure) processes in the Rankine cycle and isothermal (constant temperature) processes in the theoretical Carnot cycle. In this cycle a pump is used to pressurize the working fluid which is received from the condenser as a liquid not as a gas. Pumping the working fluid in liquid form during the cycle requires a small fraction of the energy to transport it compared to the energy needed to compress the working fluid in gaseous form in a compressor (as in the Carnot cycle). The cycle of a reciprocating steam engine differs from that of turbines because of condensation and re-evaporation occurring in the cylinder or in the steam inlet passages.",
        "target": "I don't have enough information to answer that question based on the given context."
    },
    {
        "question": "What was the least notable publication of Tugh's academy?",
        "context": "Due to the fact that the bureaucracy was dominated by El Temür, Tugh Temür is known for his cultural contribution instead. He adopted many measures honoring Confucianism and promoting Chinese cultural values. His most concrete effort to patronize Chinese learning was founding the Academy of the Pavilion of the Star of Literature (Chinese: 奎章閣學士院), first established in the spring of 1329 and designed to undertake \"a number of tasks relating to the transmission of Confucian high culture to the Mongolian imperial establishment\". The academy was responsible for compiling and publishing a number of books, but its most important achievement was its compilation of a vast institutional compendium named Jingshi Dadian (Chinese: 經世大典). Tugh Temür supported Zhu Xi's Neo-Confucianism and also devoted himself in Buddhism.",
        "target": "I don't have enough information to answer that question based on the given context."
    },
    {
        "question": "On what railroad was Darlington used?",
        "context": "Trevithick continued his own experiments using a trio of locomotives, concluding with the Catch Me Who Can in 1808. Only four years later, the successful twin-cylinder locomotive Salamanca by Matthew Murray was used by the edge railed rack and pinion Middleton Railway. In 1825 George Stephenson built the Locomotion for the Stockton and Darlington Railway. This was the first public steam railway in the world and then in 1829, he built The Rocket which was entered in and won the Rainhill Trials. The Liverpool and Manchester Railway opened in 1830 making exclusive use of steam power for both passenger and freight trains.",
        "target": "I don't have enough information to answer that question based on the given context."
    },
    {
        "question": "Where were British defeated in Canada?",
        "context": "After the disastrous 1757 British campaigns (resulting in a failed expedition against Louisbourg and the Siege of Fort William Henry, which was followed by Indian torture and massacres of British victims), the British government fell. William Pitt came to power and significantly increased British military resources in the colonies at a time when France was unwilling to risk large convoys to aid the limited forces it had in New France. France concentrated its forces against Prussia and its allies in the European theatre of the war. Between 1758 and 1760, the British military launched a campaign to capture the Colony of Canada. They succeeded in capturing territory in surrounding colonies and ultimately Quebec. Though the British were later defeated at Sainte Foy in Quebec, the French ceded Canada in accordance with the 1763 treaty.",
        "target": "Sainte Foy in Quebec"
    }
    ]

    # Create an EvaluateRun object, and select the Galileo metrics to use for scoring this run.
    # The run will contain individual traces, or workflows, one for each question in the eval set.
    evaluate_run = EvaluateRun(
        scorers=[
            Scorers.sexist,
            Scorers.pii,
            Scorers.toxicity,
            Scorers.context_adherence_plus,
            Scorers.ground_truth_adherence_plus,
            Scorers.chunk_attribution_utilization_plus,
        ],
        project_name=GALILEO_PROJECT_NAME
    )

    # Log a workflow for each question in your evaluation set.
    # Each individual workflow here corresponds to a single trace within your newly created Run.
    for sample in data:

        template = "Given the following context answer the question. \n Context: {context} \n Question: {question}"
        question = sample["question"]
        target = sample["target"]

        # Initialize a new workflow. Ground-truth-dependent metrics (BLEU, ROUGE,
        # Ground Truth Adherence) require the GT value to be passed here.
        wf = evaluate_run.add_workflow(input=question, ground_truth=target)

        # Fetch documents from your "retriever". Each sample carries a single context
        # string, so wrap it in a list; joining a bare string would split it into characters.
        documents = [sample["context"]]

        # Log retriever step to workflow
        wf.add_retriever(input=question, documents=documents)

        # Get response from OpenAI
        prompt = template.format(context="\n".join(documents), question=question)
        client = OpenAI(api_key=OPENAI_API_KEY)
        llm_raw_response = client.chat.completions.create(
            model=model_name,
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt}
            ],
            max_tokens=200,
            temperature=0.7
        )

        # Extract the text from the response
        llm_response = llm_raw_response.choices[0].message.content

        # Log llm step to workflow
        wf.add_llm(input=prompt, output=llm_response, model=model_name)

        # Conclude the workflow and add the final output.
        wf.conclude(output=llm_response)

    # Once we've gotten through all questions in the eval set, close the Run.
    evaluate_run.finish()

    # Optionally return the response to the last question in the eval set
    return llm_response
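
The prompt-assembly step in the loop above can be pulled out into a small pure function, which makes it easy to check without calling OpenAI. A sketch, assuming the retrieved documents arrive as a list of strings; the names `build_prompt` and `PROMPT_TEMPLATE` are hypothetical. Note that `"\n".join(...)` expects a list of strings: given a bare string, it joins the individual characters instead.

```python
from typing import List

# Same template string the notebook uses inside the loop.
PROMPT_TEMPLATE = "Given the following context answer the question. \n Context: {context} \n Question: {question}"


def build_prompt(documents: List[str], question: str) -> str:
    """Join the retrieved documents and fill in the prompt template."""
    if isinstance(documents, str):
        # Guard against passing one string instead of a list of strings.
        documents = [documents]
    return PROMPT_TEMPLATE.format(context="\n".join(documents), question=question)


prompt = build_prompt(
    ["Central business districts (CBD) include Downtown Los Angeles."],
    "What does CBD stand for?",
)
print(prompt)
```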


In [None]:
# Kick off sweep across params of your pipeline. We only
# parametrized the base model here, but you can use the different workflow
# node types to wrap any pipeline steps you would like for your own use cases.

pq.sweep(
    rag_chain_executor,
    {
        "model_name": ["gpt-3.5-turbo", "gpt-4o", "gpt-4o-mini"],
    },
)
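
When more than one parameter is swept, the number of runs grows as the product of the value lists. The sketch below only illustrates how such a grid expands into individual configurations using `itertools.product`; it is not the implementation of `pq.sweep`. The helper `expand_grid` and the use of `temperature` as a second swept parameter are assumptions for illustration.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def expand_grid(grid: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one keyword-argument dict per combination of parameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical two-parameter grid: 3 models x 2 temperatures = 6 configurations.
configs = list(expand_grid({
    "model_name": ["gpt-3.5-turbo", "gpt-4o", "gpt-4o-mini"],
    "temperature": [0.0, 0.7],
}))
for config in configs:
    print(config)
```

Listing the configurations before launching a sweep is a cheap way to estimate how many runs, and how many OpenAI calls, it will trigger.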